Adi Mukherjee
@adim.in
Software Engineer, currently on sabbatical in Japan. Prev: Apple SRE. Working on something new.
I love this idea, thanks for sharing! Btw, in case you revise these, I noticed a typo.
December 24, 2024 at 3:51 AM
This shift from training to inference compute is good news for hyperscalers and Nvidia.
December 22, 2024 at 7:32 AM
In the ARC-AGI eval (linked article in the first post), the ‘high-compute’ mode results came from spending ~$350K in total on inference, giving the model more compute to search the solution tree.
December 22, 2024 at 7:27 AM
These models excel at reasoning-heavy tasks like coding and summarisation, and can work through PhD-level problems given sufficient test-time compute. Unlike their predecessors (4o / 3.5 Sonnet), these reasoning models get ‘smarter’ with more inference compute.
December 22, 2024 at 7:25 AM
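A minimal sketch of how ‘more inference compute’ can buy accuracy, assuming a simple best-of-N setup. This is only an illustration, not the actual o3 or ARC-AGI pipeline; `generate_candidate` and `score` are hypothetical stand-ins for a model call and a verifier.

```python
# Toy best-of-N sampling: one way to trade inference compute for accuracy.
# generate_candidate() and score() are placeholders, not a real model API.
import random

def generate_candidate(task):
    # Stand-in for one sampled solution attempt from a model.
    return {"answer": random.randint(0, 9)}

def score(task, candidate):
    # Stand-in verifier: higher is better (e.g. self-consistency, unit tests).
    return -abs(candidate["answer"] - task["target"])

def solve(task, n_samples):
    # More samples = more inference compute = a wider search over candidates.
    candidates = [generate_candidate(task) for _ in range(n_samples)]
    return max(candidates, key=lambda c: score(task, c))

task = {"target": 7}
for n in (1, 8, 64):
    print(n, solve(task, n)["answer"])
```

The point is just that n_samples is a dial: the same model, given a bigger search budget, gets more chances to land on a solution the verifier accepts.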