I love routine, but have to keep reminding myself to break it from time to time.
Both the English and Japanese editions have now found a home in the Sakana AI library ✨ @sakanaai.bsky.social
Links to more resources in the reply
colton.dev/blog/curing-...
Or... how to ensure your context takes advantage of the KV-cache to save cost. For example, cached input tokens cost 0.30 USD/MTok, while uncached ones cost 3 USD/MTok, a 10x difference.
manus.im/blog/Context...
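To make the arithmetic concrete, here is a minimal sketch using only the two prices quoted in the post. The function name `input_cost_usd` and the example token counts are hypothetical, chosen for illustration; actual pricing and caching rules vary by provider.

```python
# Prices quoted in the post (USD per million input tokens).
CACHED_USD_PER_MTOK = 0.30
UNCACHED_USD_PER_MTOK = 3.00


def input_cost_usd(total_tokens: int, cached_tokens: int) -> float:
    """Estimate input cost when `cached_tokens` of the prompt hit the KV-cache.

    Hypothetical helper: assumes a flat per-token price for each class of token.
    """
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    return (
        cached_tokens * CACHED_USD_PER_MTOK
        + uncached_tokens * UNCACHED_USD_PER_MTOK
    ) / 1_000_000


if __name__ == "__main__":
    # Illustrative numbers only: a 100k-token context with a 90k-token stable prefix.
    print(input_cost_usd(100_000, 0))       # nothing cached
    print(input_cost_usd(100_000, 90_000))  # stable prefix served from cache
```

Because caching typically applies to a shared prefix, the saving depends on how much of the context stays byte-for-byte identical between calls.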
"Persistence beats talent" is honestly quite inspiring
Probably also illegal
www.bondcap.com/report/tai/0
- understanding things deeply, reading the actual source
- being willing to help other people
- status doesn’t matter, good ideas come from anywhere
endler.dev/2025/best-pr...
expertofobsolescence.substack.com/p/the-hard-t...
wait it's the test
no no, the code
ummm, is it both?
no, definitely test
hang on what is this test even doing?
*gets up and makes tea*
📝 Blog post: www.anthropic.com/research/tra...
🧪 "Biology" paper: transformer-circuits.pub/2025/attribu...
⚙️ Methods paper: transformer-circuits.pub/2025/attribu...
Featuring basic multi-step reasoning, planning, introspection and more!
arxiv.org/abs/2503.14481
1 Two different clip hyperparams, so positive clipping can uplift more unexpected tokens
2 Dynamic sampling -- remove samples w flat reward in batch
3 Per token loss
4 Managing too long generations in loss
dapo-sia.github.io
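A rough sketch of points 1 to 3 in plain Python, to show how they fit together. This is not the authors' implementation: the function names are hypothetical, the default `eps_low` and `eps_high` values are illustrative assumptions, and point 4 (handling overlong generations) is not covered.

```python
def clipped_token_objective(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """Point 1: PPO-style clipped objective with separate lower and upper bounds.

    A larger eps_high lets low-probability tokens with positive advantage
    be pushed up further before clipping takes effect. Defaults are assumed.
    """
    clipped_ratio = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped_ratio * advantage)


def keep_group(rewards):
    """Point 2: dynamic sampling.

    Drop a prompt's group of samples when every reward is identical,
    since a flat reward gives zero group-relative advantage.
    """
    return len(set(rewards)) > 1


def token_level_loss(per_sample_objectives):
    """Point 3: average over all tokens in the batch.

    `per_sample_objectives` is a list of lists, one inner list of per-token
    objective values per sample. Longer samples contribute more tokens,
    instead of every sample being averaged first and weighted equally.
    """
    total = sum(sum(sample) for sample in per_sample_objectives)
    count = sum(len(sample) for sample in per_sample_objectives)
    return -total / count


if __name__ == "__main__":
    groups = [[1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]
    print([keep_group(g) for g in groups])
    print(clipped_token_objective(ratio=1.5, advantage=1.0))
```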