Lightnews — Scholar-powered news

bram

@bwasti.bsky.social

550 followers 420 following 6 posts

meta ai, focused on efficient inference (and reasoning for the next couple months)

bram

@bwasti.bsky.social

rough tensor core (TC) times

1 ns - 1000 TC operations
10 ns - load 1000 TCs from shared memory
100 ns - load a single TC from CPU

1 us - 1B float operations
10 us - launch a CUDA kernel
100 us - pytorch call

1 ms - move 1GB from GPU RAM
10 ms - move 1GB btwn 2 GPUs
100 ms - move 1GB from CPU

October 30, 2024 at 10:28 PM
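
The figures in the post above imply rough throughput numbers, which can be easier to compare than raw times. A minimal sketch of that arithmetic follows, assuming decimal gigabytes and taking the post's order-of-magnitude timings at face value. The names `transfers` and `implied_rate` are illustrative, not from the post.

```python
# Back-of-envelope rates implied by the rough timings in the post.
# These are the post's order-of-magnitude figures, not measurements.

GB = 1e9  # bytes; assumes decimal gigabytes

# label -> (bytes moved, seconds taken), straight from the post
transfers = {
    "from GPU RAM": (GB, 1e-3),
    "between 2 GPUs": (GB, 10e-3),
    "from CPU": (GB, 100e-3),
}


def implied_rate(amount: float, seconds: float) -> float:
    """Return the amount of work done per second."""
    return amount / seconds


# Implied bandwidth in GB/s for each kind of transfer
bandwidth_gb_per_s = {
    label: implied_rate(nbytes, secs) / GB
    for label, (nbytes, secs) in transfers.items()
}

# "1 us - 1B float operations" implies this many float operations per second
float_ops_per_s = implied_rate(1e9, 1e-6)

# Float operations that could have run during one 10 us kernel launch
ops_per_kernel_launch = float_ops_per_s * 10e-6

for label, bw in bandwidth_gb_per_s.items():
    print(f"move 1GB {label}: ~{bw:.0f} GB/s")
print(f"float throughput: ~{float_ops_per_s:.0e} ops/s")
print(f"ops forgone per kernel launch: ~{ops_per_kernel_launch:.0e}")
```

Read this way, each step down the last group of the list costs roughly a factor of ten in bandwidth, and a single kernel launch takes about as long as ten billion float operations.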

bram

@bwasti.bsky.social

bf16 is to fp16 as fp8-e5m2 is to fp8-e4m3

October 30, 2024 at 1:41 PM
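
One way to read the analogy above: in each pair, the first format spends more bits on the exponent (range) and the second spends more on the mantissa (precision). The sketch below makes that concrete using the standard bit layouts. It assumes an IEEE-style encoding where the top exponent code is reserved for inf/NaN. The widely used fp8-e4m3 variant does not reserve it for infinities and reaches 448 rather than the 240 computed here. The helper names are illustrative.

```python
# (exponent bits, mantissa bits) per format; each also has 1 sign bit.
FORMATS = {
    "bf16": (8, 7),
    "fp16": (5, 10),
    "fp8-e5m2": (5, 2),
    "fp8-e4m3": (4, 3),
}


def ieee_style_max(exp_bits: int, man_bits: int) -> float:
    """Largest finite value, assuming the top exponent code is reserved."""
    bias = 2 ** (exp_bits - 1) - 1
    # The largest usable exponent equals the bias under this assumption.
    return (2 - 2 ** -man_bits) * 2.0 ** bias


def epsilon(man_bits: int) -> float:
    """Gap between 1.0 and the next representable value."""
    return 2.0 ** -man_bits


for name, (e, m) in FORMATS.items():
    print(f"{name:9s} exp={e} man={m:2d} "
          f"max~{ieee_style_max(e, m):.3g} eps={epsilon(m):.3g}")

# The analogy: within each pair, the first format has the wider exponent
# and the second has the wider mantissa.
for wide_range, fine_precision in [("bf16", "fp16"), ("fp8-e5m2", "fp8-e4m3")]:
    e1, m1 = FORMATS[wide_range]
    e2, m2 = FORMATS[fine_precision]
    assert e1 > e2 and m1 < m2
```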

bram

@bwasti.bsky.social

when it comes to adding data to an LLM, i tend to follow this token count rule

one comma: put it in the prompt
two commas: use RAG
three commas: finetune
four commas: pretrain (i’ll be honest i’ve never actually hit this one 😜)

October 30, 2024 at 12:06 AM
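
The rule above can be sketched as a small function that counts the commas in a token count written with thousands separators. The post does not say what to do with zero commas or more than four, so the handling of those cases here is an assumption, as are the function names.

```python
def commas(token_count: int) -> int:
    """Commas in the count when written with thousands separators."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    return f"{token_count:,}".count(",")


def strategy(token_count: int) -> str:
    """Map a token count to the approach suggested by the comma rule."""
    n = commas(token_count)
    if n <= 1:
        # Assumption: counts under 1,000 (zero commas) also fit in the prompt.
        return "put it in the prompt"
    if n == 2:
        return "use RAG"
    if n == 3:
        return "finetune"
    # Assumption: anything with four or more commas is treated as pretraining.
    return "pretrain"


for count in (50_000, 2_000_000, 3_000_000_000, 10**12):
    print(f"{count:>20,} tokens -> {strategy(count)}")
```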
