Julius Adebayo
@juliusad.bsky.social
ML researcher, building interpretable models at Guide Labs (guidelabs.bsky.social).
Looks like Tesla’s models sometimes confuse train tracks with road lanes.
January 4, 2025 at 9:23 PM
Reposted by Julius Adebayo
OLMo 2 tech report is out!

We get in the weeds with this one, with 50+ pages on 4 crucial components of the LLM development pipeline:
January 3, 2025 at 7:51 PM
Great to see the clarification comments. o3 is impressive nonetheless.

Played around with o1 and the ‘thinking’ Gemini model. The CoT output (for Gemini) can be confusing and convoluted, but it got 3/5 problems right and stalled on the remaining 2.

These models are an impressive interpretability test bed.
It seems that OpenAI's latest model, o3, can solve 25% of the problems on FrontierMath, a benchmark created by Epoch AI, where previous LLMs could solve only 2%. On Twitter I am quoted as saying, "Getting even one question right would be well beyond what we can do now, let alone saturating them."
December 21, 2024 at 7:12 PM
New paper. We show that the representations of LLMs, up to 3B params(!), can be engineered to encode biophysical factors that are meaningful to experts.

We don't have to hope Adam magically finds models that learn useful features; we can optimize directly for models that encode interpretable features!
🧵
[1/n] Does AlphaFold3 "know" biophysics and the physics of protein folding? Are protein language models (pLMs) learning coevolutionary patterns? You can try to guess the answer to these questions using mechanistic interpretability.
December 13, 2024 at 1:50 AM
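To make the idea in the post above concrete, here is a minimal, hypothetical sketch (not the paper's actual method): train a small transformer with an auxiliary linear-probe loss that forces its pooled hidden states to predict expert-provided feature targets, instead of hoping those features emerge on their own. All names (TinyEncoder, aux_head, feature_targets, lambda_aux) are illustrative, and the targets here are random stand-ins for real biophysical measurements.

```python
# Hedged sketch: jointly optimize a language-modeling loss and an auxiliary
# probe loss so hidden representations encode known, meaningful features.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    def __init__(self, vocab=32, d_model=64, n_feats=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.lm_head = nn.Linear(d_model, vocab)     # main next-token head
        self.aux_head = nn.Linear(d_model, n_feats)  # linear probe on hidden states

    def forward(self, tokens):
        h = self.encoder(self.embed(tokens))         # (B, T, d_model)
        return self.lm_head(h), self.aux_head(h.mean(dim=1))

model = TinyEncoder()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
lambda_aux = 0.5  # weight on the interpretability constraint (hypothetical)

tokens = torch.randint(0, 32, (8, 16))   # toy token batch
feature_targets = torch.randn(8, 4)      # stand-in for expert-labeled features

opt.zero_grad()
logits, feat_pred = model(tokens)
lm_loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, 32), tokens[:, 1:].reshape(-1)
)
aux_loss = nn.functional.mse_loss(feat_pred, feature_targets)
loss = lm_loss + lambda_aux * aux_loss   # optimize for encoded features directly
loss.backward()
opt.step()
```

The design choice is the point: the feature loss is part of training, so encoding the features is an optimization target rather than something you hope falls out of Adam.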
Pinging into the void.
November 18, 2024 at 3:31 AM