galen
@nel.ag
learning
August 12, 2025 at 10:36 PM
model of self is important! not being able to count letters is whatever; not recognizing that it can't is an actual problem
August 8, 2025 at 4:27 PM
>"then I put it in our team chat"
do you use slack for work comms or smthn else? we use matrix but curious if signal itself is usable
August 2, 2025 at 2:39 PM
they ablate this and confirm the result only holds for same-model distillation
July 22, 2025 at 5:49 PM
they're also here
bsky.app/profile/metr...
metr.org METR @metr.org · Jul 14
METR previously estimated that the time horizon of AI agents on software tasks is doubling every 7 months.

We have now analyzed 9 other benchmarks for scientific reasoning, math, robotics, computer use, and self-driving; we observe generally similar rates of improvement.
July 21, 2025 at 5:39 PM
isn't that the Void infra?
July 9, 2025 at 8:39 PM
toronto mentioned!
June 28, 2025 at 11:44 PM
congrats on the release! any thoughts on full duplex? seems like with everything else so polished turn-taking is really holding it back
June 8, 2025 at 5:39 AM
woah evan post
June 2, 2025 at 3:42 AM
fwiw model costs are very directly proportional to energy costs. if videogen doesn't become more efficient, then the dozen samples cost >$100 and it won't see consumer usage
May 21, 2025 at 4:40 PM
this is a pretty reasonable mistake, and it's not even a crazy number to report for the cost of hobbyist usage, but it breaks the reference class for estimating commercial deployments
May 21, 2025 at 4:14 PM
the researcher probably ran the default code on a 5090 or something and reported what was logged there, but in the default config *most* of the energy is being used to shuffle the model between the CPU and GPU hundreds of times just because it doesn't have enough vram
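to get a feel for how much data that shuffling moves, here's a rough back-of-envelope sketch; the weight size, step count, and bandwidth are all assumptions for illustration, not measurements:

```python
# Back-of-envelope: host<->device traffic caused by sequential CPU offload.
# All numbers below are assumptions for illustration, not measurements.

weights_gb = 10.0   # assumed: ~5B params in fp16 (2 bytes each)
steps = 50          # assumed sampling steps; weights get re-staged every step

moved_gb = weights_gb * steps    # total host->device traffic over one generation
bandwidth_gbps = 6.0             # assumed: pageable-memory PCIe copies are slow

transfer_s = moved_gb / bandwidth_gbps
print(moved_gb)                  # 500.0
print(round(transfer_s, 1))      # 83.3
```

even under these generous assumptions the GPU sits near full board power while it waits on transfers, which is where the extra energy goes.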
May 21, 2025 at 4:12 PM
but it's also an order of magnitude off from the number reported! I think I found the reason though: there's a line in the code, `pipe.enable_sequential_cpu_offload()`, which helps fit the larger model on consumer graphics cards
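`enable_sequential_cpu_offload()` is a real diffusers pipeline method, but the helper below is purely hypothetical, just to illustrate the VRAM condition under which that fallback matters (the ~20 GB pipeline footprint is an assumed number):

```python
# Hypothetical helper: when does sequential CPU offload actually help?
# In diffusers you opt in explicitly via pipe.enable_sequential_cpu_offload();
# it only pays off when the pipeline doesn't fit in VRAM, otherwise it just
# adds PCIe traffic.

def needs_offload(pipeline_gb: float, vram_gb: float, headroom: float = 1.5) -> bool:
    """True if weights plus activation headroom won't fit (headroom is a guess)."""
    return pipeline_gb * headroom > vram_gb

# assumed ~20 GB of fp16 weights across transformer + text encoder + VAE
print(needs_offload(20, 80))  # False: fits on an 80 GB H100
print(needs_offload(20, 24))  # True: tight on a 24 GB consumer card
```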
May 21, 2025 at 4:10 PM
so I just benched cogx 1.5 5b on an h100: it took 6 min 48 s, avging close to max load on the chip at 685 watts, so 685 W * 0.113 h ≈ 0.078 kWh, or ~280 kJ. This is a lot higher than I expected! I'm not used to diffusion model pipelines.
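recomputing from the raw power and runtime figures, just as a sanity check:

```python
# Energy check from the benchmark figures: 685 W average over 6 min 48 s.
watts = 685
seconds = 6 * 60 + 48      # 6 min 48 s = 408 s

joules = watts * seconds   # energy = average power * time
kwh = joules / 3.6e6       # 1 kWh = 3.6 MJ

print(joules)              # 279480
print(round(kwh, 3))       # 0.078
```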
May 21, 2025 at 4:07 PM
huh, that sounds 10-100x higher than I'd expect for similar models; e.g. this implies something in the range of $15 per generation in server costs. is this amortizing training?
May 21, 2025 at 2:55 PM