Nathan Lambert
@natolambert.bsky.social
An LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef
Writes http://interconnects.ai
At Ai2 via HuggingFace, Berkeley, and normal places
Pinned
First draft online version of The RLHF Book is DONE. Recently I've been creating the advanced discussion chapters on everything from Constitutional AI to evaluation and character training, but I also sneak in consistent improvements to the RL specific chapter.

rlhfbook.com
First time at CMU
February 13, 2026 at 3:35 PM
Fun to set up real analytics and learn that my RLHF Book pdf is downloaded 50-100 times a day from my site (doesn't include Arxiv downloads/views).

Thanks for reading!
February 12, 2026 at 2:51 PM
Codex app is nice.
I'm just a few minutes in and think it'll make some of the crazy things I was doing way easier to monitor.
February 11, 2026 at 11:38 PM
Poll: Do you see the famous METR plot holding true on Jan. 1st of 2027 (~20 hours), or 2028 (~50 hours)?

What would be the right way to measure tasks of that scope?
February 11, 2026 at 5:12 PM
Beautiful RL scaling plot from Cursor.
cursor.com/blog/compose...
February 10, 2026 at 12:26 AM
TLDR: codex is a very useful coding tool, claude is the first agent.
February 9, 2026 at 3:40 PM
I spent a long time testing the new Opus 4.6 and Codex 5.3 models, but the most striking thing was how many people are reacting to model releases in ways that no longer match how we actually use models. In my post-benchmark era.

Claude is still king, but codex is closer than ever
www.interconnects.ai/p/opus-46-vs...
Opus 4.6, Codex 5.3, and the post-benchmark era
On comparing models in 2026.
www.interconnects.ai
February 9, 2026 at 3:21 PM
People don't want to accept that the most-used open model families in 2026 are:

Overall:
1. Qwen
2. Llama
3. GPT-OSS

Big models:
1. DeepSeek
2. GPT-OSS/Qwen/everyone else

Llama's inertia says a lot about how the ecosystem works.
February 8, 2026 at 5:45 PM
I want there to be a nanoGPT style speedrunning setup for RL.
February 6, 2026 at 7:29 PM
The best compliment I can give OpenAI's Codex 5.3 is that it feels way more like Claude Code.
February 6, 2026 at 6:07 PM
GPT Codex 5.3 sounds like a much bigger change than Claude Opus 4.6; will be curious if this holds up in real testing.
February 5, 2026 at 6:31 PM
“Due to GPT‑5.3-Codex being so different from its predecessors, the data from alpha testing exhibited numerous unusual and counter-intuitive results”

Sounds worth giving a go. Big changes are good.
February 5, 2026 at 6:16 PM
Reposted by Nathan Lambert
Reward models (RMs) are supposed to represent human values. But RMs are NOT blank slates – they inherit measurable biases from their base models that stubbornly persist through preference training. #ICLR2026 🧵
February 4, 2026 at 4:30 PM
Ending your day at >99% Claude rate limit usage but not maxing out feels like a masterpiece.
February 5, 2026 at 3:32 AM
Nvidia’s Nemotron is the closest thing the U.S. has to a Qwen approach to open models, but most people don’t know it yet.
I’m very bullish on Nvidia’s open model efforts in 2026.
Interconnects interview #17 on the past, present, and future of the Nemotron project.
www.youtube.com/watch?v=Y3Vb...
Why NVIDIA builds their own open models | Nemotron w/ Bryan Catanzaro
NVIDIA releasing their best models as open weights isn't charity — it's a business decision. And honestly, it's one of the clearest explanations I've heard for why a company would invest heavily in…
www.youtube.com
February 4, 2026 at 6:05 PM
Qwen already dropping models for CNY
February 3, 2026 at 5:48 PM
Gemini not being in the conversation at all with Claude Code and Codex is the real “code red” emergency.
February 3, 2026 at 3:23 PM
It's documented! I did a full memory sweep. The training becomes FLOP-limited before memory saturates.
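
(Rough intuition for that kind of comparison, as a back-of-envelope sketch; every number below is an illustrative assumption, not from the actual sweep.)

    def training_budget(n_params, tokens_per_step, bytes_per_param=16,
                        peak_flops=1e15, mem_bytes=128e9):
        # bytes_per_param ~16 covers bf16 weights + grads and fp32 Adam moments.
        # peak_flops and mem_bytes are placeholder device specs, not real hardware.
        flops = 6 * n_params * tokens_per_step          # standard 6ND estimate
        step_seconds = flops / peak_flops               # time if fully FLOP-bound
        mem_fraction = (n_params * bytes_per_param) / mem_bytes  # ignores activations
        return step_seconds, mem_fraction

    # e.g. a 1B model at a ~512k-token batch: ~3s/step at peak FLOPs,
    # while weights + optimizer state fill only ~12% of a 128 GB device.
    print(training_budget(1e9, 2**19))

The point of the sketch: at small model sizes on a big-memory device, step time hits the compute ceiling long before the parameter and optimizer state come close to filling memory.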
February 2, 2026 at 8:36 PM
Latest open artifacts (#18): Arcee's 400B MoE, LiquidAI's underrated 1B model, new Kimi, and anticipation of a busy month
Tons of useful "niche" models and anticipation of big releases coming soon.
www.interconnects.ai/p/latest-ope...
Latest open artifacts (#18): Arcee, LiquidAI and Moonshot ...
Tons of useful "niche" models and anticipation of big releases coming soon.
www.interconnects.ai
February 2, 2026 at 3:23 PM
Despite HuggingFace being banned in China, Chinese users (likely via VPNs) are its top user group. They definitely have the most people *building* open models.
February 1, 2026 at 5:07 PM
Claude Code writing, Codex code review, and GPT Pro for planning made a working DPO (and related algorithms) repository from scratch for my RLHF book, and the curves are looking right.

On the DGX Spark, finetuning OLMo 2 1B SFT. Built by referencing the original repositories + TRL.
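
(For the curious: the core DPO objective is only a few lines. A minimal PyTorch sketch below; the function name and signature are illustrative, not taken from the repo.)

    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        # Inputs: summed per-token log-probs of each completion under the
        # trained policy and the frozen reference model (the SFT checkpoint).
        chosen_logratios = policy_chosen_logps - ref_chosen_logps
        rejected_logratios = policy_rejected_logps - ref_rejected_logps
        # DPO widens the chosen-vs-rejected margin relative to the
        # reference model, with beta controlling the KL-like penalty strength.
        return -F.logsigmoid(beta * (chosen_logratios - rejected_logratios)).mean()

Everything else in such a repo, batching, tokenization, and running the frozen reference model, is plumbing around this objective.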
February 1, 2026 at 3:41 PM
Recorded a podcast, think it’s pretty good and comprehensive, hope you like it ;) youtu.be/EV7WhVT270Q?...
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
YouTube video by Lex Fridman
youtu.be
January 31, 2026 at 11:06 PM
I'm visiting CMU for a talk at the Language Technologies Institute on Feb 12/13. Looking forward to chatting with folks about frontiers in RL and building agentic language models.

Email me with "CMU Visit" in the subject if you're interested in chatting & why!
January 31, 2026 at 8:03 PM