Michael Noukhovitch...🏄 NeurIPS 2025
@mnoukhov.bsky.social
PhD in AI @mila-quebec.bsky.social RLHF and language grounding, whatever that means. Whitespace aficionado. mnoukhov.github.io
Check out Olmo 3 RL-Zero: a clean and scientific setup to benchmark RLVR

Everyone is finetuning with Qwen, but it's hard to know whether your eval is contaminated and skewing your RLVR results. Olmo 3 has a solution.
We present Olmo 3, our next family of fully open, leading language models.
This family of 7B and 32B models represents:

1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.
November 20, 2025 at 8:38 PM
Reposted by Michael Noukhovitch...🏄 NeurIPS 2025
Preprint Alert 🚀

Multi-agent reinforcement learning (MARL) often assumes that agents know when other agents cooperate with them. But for humans, this isn't always the case. For example, Plains Indigenous groups used to leave resources for others to use at effigies called Manitokan.
1/8
June 5, 2025 at 3:32 PM
@dnllvy.bsky.social and @oumarkaba.bsky.social presenting cool work at #ICLR2025 on generative models for crystals leveraging symmetry ❄️🪞, repping @mila-quebec.bsky.social
April 24, 2025 at 7:07 AM
Reposted by Michael Noukhovitch...🏄 NeurIPS 2025
Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour.
🔗: mcgill-nlp.github.io/thoughtology/
April 1, 2025 at 8:07 PM
Llama 4 uses async RLHF and I would just like to announce that I called it: t.co/w9qJxr944C
April 7, 2025 at 7:39 PM
Our work on Asynchronous RLHF was accepted to #ICLR2025 ! (I was so excited to announce it, I forgot to say I was excited)

Used by @ai2.bsky.social for OLMo-2 32B 🔥
New results show ~70% speedups for LLM + RL math and reasoning 🧠

🧵 below, or hear my DLCT talk online on March 28!
March 18, 2025 at 8:45 PM
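For readers unfamiliar with the idea behind the post above: asynchronous RLHF overlaps rollout generation with policy optimization instead of alternating them, training on slightly off-policy samples. Here is a minimal toy sketch of that loop, assuming illustrative names and structure throughout (a thread-and-queue stand-in, not the paper's actual implementation):

```python
import queue
import threading

# Toy sketch of asynchronous RLHF. A generator thread keeps producing
# rollouts with a (possibly stale) copy of the policy while the learner
# trains on whatever rollouts are ready, so generation and optimization
# overlap instead of running in lockstep.

rollouts = queue.Queue(maxsize=8)   # bounded buffer of off-policy samples
policy_version = 0                  # stands in for the model weights
stop = threading.Event()

def generator():
    """Produce rollouts tagged with the policy version that made them."""
    while not stop.is_set():
        rollouts.put(("sample", policy_version))  # may lag the learner

def learner(num_updates):
    """Consume rollouts and 'update' the policy; tolerates staleness."""
    global policy_version
    for _ in range(num_updates):
        sample, version = rollouts.get()
        staleness = policy_version - version      # measure of off-policyness
        assert staleness >= 0
        policy_version += 1                       # one gradient step

gen = threading.Thread(target=generator, daemon=True)
gen.start()
learner(num_updates=100)
stop.set()
print(policy_version)  # 100 updates completed without pausing generation
```

The speedup comes from never idling the generation hardware while the learner takes gradient steps; the cost is that samples are a few versions stale, which the RL objective has to tolerate.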
Programming using an AI assistant in order to improve AI assistants is giving me strong sci-fi vibes. Specifically Isaac Asimov, who clearly invented vibe coding in 1956 users.ece.cmu.edu/~gamvrosi/th...
February 11, 2025 at 12:17 AM
I'm at #NeurIPS2024 this week if anyone wants to talk about RLHF while drinking an overpriced (but excellent) pourover coffee or tea!
December 11, 2024 at 2:19 AM