xjdr
@xjdr.bsky.social
hot takes, linear Algebra, JAX apologist, Raconteur
I have become radicalized
November 26, 2024 at 5:27 PM
so far the experience has been pretty good here but the default feeds are _terrible_. feels like it's going to take a few weeks to whip these feeds into shape with mutes and "show less like these" plus lots of likes. Following feed is good but i need to follow a lot more people
November 25, 2024 at 2:56 AM
very interesting work and it reminds me a bit of this paper. Tokenizers and ROPE must die. after samplers, i am on to those next ...
arxiv.org/abs/2407.036...
November 25, 2024 at 2:20 AM
i keep forgetting to include this cause i always assume people do this by default. Any time there is an exponent or a norm, you should be working in the highest practical precision
All softmaxes, also the output/vocab one. And the normalizations in f32 too.
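A minimal JAX sketch of the advice above, assuming activations arrive in bf16 and that float32 is the "highest practical precision". The function names `softmax_f32` and `rms_norm_f32` are hypothetical, chosen for illustration, and are not taken from any repo mentioned here.

```python
import jax
import jax.numpy as jnp


def softmax_f32(logits, axis=-1):
    """Softmax with the exponent and the sum done in float32.

    Hypothetical helper: upcasts, computes, then casts back to the input dtype.
    """
    x = logits.astype(jnp.float32)
    # Subtract the max so exp() cannot overflow.
    x = x - jnp.max(x, axis=axis, keepdims=True)
    e = jnp.exp(x)
    probs = e / jnp.sum(e, axis=axis, keepdims=True)
    return probs.astype(logits.dtype)


def rms_norm_f32(x, weight, eps=1e-6):
    """RMS normalization with the mean-of-squares and rsqrt done in float32."""
    x32 = x.astype(jnp.float32)
    mean_sq = jnp.mean(jnp.square(x32), axis=-1, keepdims=True)
    y = x32 * jax.lax.rsqrt(mean_sq + eps)
    return (y * weight.astype(jnp.float32)).astype(x.dtype)


# Usage: bf16 in, bf16 out, float32 in the middle.
logits = jnp.array([[1.0, 2.0, 3.0, 4.0]], dtype=jnp.bfloat16)
probs = softmax_f32(logits)
hidden = jnp.full((2, 8), 2.0, dtype=jnp.bfloat16)
normed = rms_norm_f32(hidden, jnp.ones((8,), dtype=jnp.bfloat16))
```

For the output/vocab softmax it may be preferable to skip the final cast and keep the probabilities in float32; the cast back is only there so the result can feed a bf16 matmul.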
November 24, 2024 at 8:05 PM
the BigVision repo is my current reference impl for gemma and ViT. such an underrated repo @giffmana.bsky.social and team are doing the lord's work

github.com/google-resea...

big_vision/big_vision/models/ppp/gemma.py at main · google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. - google-research/big_vision
November 24, 2024 at 5:25 PM
now that people are paying attention again, here is your periodic reminder. Always run in bf16. always apply RoPE and attention softmax in float32 (as shown here)

github.com/xjdr-alt/ent...
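For readers who can't open the link, here is a rough JAX sketch of the same idea: q, k, v stay in bf16, while the RoPE rotation and the attention softmax are upcast to float32. This is not the code from the linked repo. The names `rope_f32` and `causal_attention`, the `[seq, heads, head_dim]` layout, the split-halves rotation convention, and `theta=10000.0` are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def rope_f32(x, positions, theta=10000.0):
    """Rotary position embedding computed in float32 (hypothetical helper).

    x: [seq, heads, head_dim] with even head_dim; positions: [seq].
    """
    head_dim = x.shape[-1]
    exponents = jnp.arange(0, head_dim, 2, dtype=jnp.float32) / head_dim
    inv_freq = 1.0 / (theta ** exponents)
    angles = positions.astype(jnp.float32)[:, None] * inv_freq[None, :]
    cos = jnp.cos(angles)[:, None, :]  # broadcast over heads
    sin = jnp.sin(angles)[:, None, :]
    x1, x2 = jnp.split(x.astype(jnp.float32), 2, axis=-1)
    rotated = jnp.concatenate([x1 * cos - x2 * sin, x2 * cos + x1 * sin], axis=-1)
    return rotated.astype(x.dtype)


def causal_attention(q, k, v):
    """Causal self-attention: bf16 tensors, float32 RoPE and softmax."""
    seq, _, head_dim = q.shape
    positions = jnp.arange(seq)
    q = rope_f32(q, positions)
    k = rope_f32(k, positions)
    # Upcast the scores before scaling, masking, and the softmax.
    scores = jnp.einsum("qhd,khd->hqk", q, k).astype(jnp.float32)
    scores = scores * (head_dim ** -0.5)
    mask = jnp.tril(jnp.ones((seq, seq), dtype=bool))
    scores = jnp.where(mask[None, :, :], scores, jnp.finfo(jnp.float32).min)
    probs = jax.nn.softmax(scores, axis=-1).astype(v.dtype)
    return jnp.einsum("hqk,khd->qhd", probs, v)


# Usage with random bf16 inputs.
kq, kk, kv = jax.random.split(jax.random.PRNGKey(0), 3)
shape = (4, 2, 8)
q = jax.random.normal(kq, shape, dtype=jnp.bfloat16)
k = jax.random.normal(kk, shape, dtype=jnp.bfloat16)
v = jax.random.normal(kv, shape, dtype=jnp.bfloat16)
out = causal_attention(q, k, v)
```

The matmuls still take bf16 operands; only the trig in RoPE and the exponent in the softmax pay the float32 cost, which is the trade the post is arguing for.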
November 24, 2024 at 5:23 PM
Reposted by xjdr
So first version of an ml anon starter pack. go.bsky.app/VgWL5L Kept half-anons (like me and Vic). Not all anime pfp, but generally drawn.
November 24, 2024 at 4:55 PM
i'm trying to follow as many of my old moots as possible and new people as i find them. some of y'all changing your pfp is just mean spirited (i'm lazy and learned people's pfps not names)
November 24, 2024 at 5:08 PM
Well this looks shockingly professional. I may have to put on a tie to post here
November 22, 2024 at 6:53 PM