Lightnews — Scholar-powered news

Reposted by brendan chambers

Ai2

@ai2.bsky.social

⚠️ Update on Deep Research Tulu (DR Tulu), our post-training recipe for deep research agents: we’re releasing an upgraded version of our example agent, DR Tulu-8B (RL), that matches or beats systems like Gemini 3 Pro & Tongyi DeepResearch-30B-A3B on core benchmarks. 🧵

November 25, 2025 at 7:37 PM

Reposted by brendan chambers

Stella Li

@stellali.bsky.social

Test-time reasoning guidance: up to 66.7% improvement 💡

We scaffold cognitive structures from successful traces to guide reasoning.

Major gains on ill-structured problems🌟

Models possess latent capabilities—they just don't deploy them adaptively without explicit guidance.

November 25, 2025 at 6:26 PM

Reposted by brendan chambers

Stella Li

@stellali.bsky.social

We analyzed 1,598 LLM reasoning papers:

Research concentrates on easily quantifiable behaviors—sequential organization (55%), decomposition (60%)

Neglects meta-cognitive controls (8-16%) and alternative representations (10-27%) that correlate with success⚠️

November 25, 2025 at 6:26 PM

Reposted by brendan chambers

Stella Li

@stellali.bsky.social

Our taxonomy bridges cognitive science → LLM eval:

28 elements across 4 dimensions—reasoning invariants (compositionality, logical coherence), meta-cognitive controls (self-awareness), representations (hierarchical, causal), and operations (backtracking, verification)

November 25, 2025 at 6:26 PM

Reposted by brendan chambers

TMLR Published Papers

@tmlr-pub.bsky.social

Goal-Conditioned Data Augmentation for Offline Reinforcement Learning

Xingshuai Huang, Di Wu, Benoit Boulet

Action editor: Baoxiang Wang

https://openreview.net/forum?id=8K16dplpE0

#reinforcement #conditioning #learns

November 23, 2025 at 9:18 AM

Reposted by brendan chambers

Ai2

@ai2.bsky.social

🌊 Global Mangrove Watch is using OlmoEarth to refresh mangrove map baselines faster, with higher accuracy & less manual annotation—allowing orgs + governments to respond to threats more quickly.
Learn more → buff.ly/6xLHLk6

November 4, 2025 at 2:53 PM

Reposted by brendan chambers

Alexander Doria

@dorialexander.bsky.social

ml halloween costume concept

October 31, 2025 at 10:08 PM

Reposted by brendan chambers

Eugene Vinitsky 🍒

@eugenevinitsky.bsky.social

Yes!! A POMDP world model benchmark with controlled test environments. So excited to play with this

Ching-Lung Hsu @hiallen72.bsky.social · 27d

arxiv.org/abs/2510.19788

Benchmarking World-Model Learning

Model-learning agents should gather information to learn world models that support many downstream tasks and inferences, such as predicting unobserved states, estimating near- and far-term consequence...

arxiv.org

October 29, 2025 at 12:47 PM

Reposted by brendan chambers

Mariel Pettee

@marielpettee.bsky.social

Really proud of this project led by @kylecranmer.bsky.social and created alongside these awesome collaborators and friends. I believe this is important work that will help researchers make grounded decisions when building multimodal models with many diverse inputs. Add it to your paper pile!

Kyle Cranmer @kylecranmer.bsky.social · 28d

New paper, with @rkhashmani.me @marielpettee.bsky.social @garrettmerz.bsky.social Hellen Qu. We introduce a framework for generating realistic, highly multimodal datasets with explicitly calculable mutual information. This is helpful for studying self-supervised learning
arxiv.org/abs/2510.21686

October 28, 2025 at 8:13 PM

Reposted by brendan chambers

Angelina Wang

@angelinawang.bsky.social

Cornell (NYC and Ithaca) is recruiting AI postdocs, apply by Nov 20, 2025! If you're interested in working with me on technical approaches to responsible AI (e.g., personalization, fairness), please email me.

academicjobsonline.org/ajo/jobs/30971

Cornell University, Empire AI Fellows Program

Job #AJO30971, Postdoctoral Fellow, Empire AI Fellows Program, Cornell University, New York, New York, US

academicjobsonline.org

October 28, 2025 at 6:19 PM

brendan chambers

@societyoftrees.bsky.social

Here is the recipe from the latest Thinking Machines blogpost about late post-training:

- generate student rollouts
- query teacher distribution forced on student history
- update using the reverse KL divergence at each step

thinkingmachines.ai/blog/on-poli...

October 28, 2025 at 6:37 PM
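
The three steps in the post above can be sketched in a few lines. This is a toy illustration only, not the blog post's implementation: the "models" are hypothetical lookup tables (`student_w`, `teacher_w`) whose logits depend only on the last token, and the names `toy_logits` and `rollout_and_score` are made up for this example. It assumes the full reverse KL, KL(student || teacher), is computed over the whole vocabulary at each step; a real system might instead estimate it from the sampled token only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_probs, teacher_probs):
    """KL(student || teacher) at a single position."""
    return sum(
        p * math.log(p / q)
        for p, q in zip(student_probs, teacher_probs)
        if p > 0
    )


def toy_logits(weights, history, vocab_size):
    """Hypothetical stand-in for a language model: logits depend on the last token."""
    last = history[-1] if history else 0
    return [weights[last][v] for v in range(vocab_size)]


def rollout_and_score(student_w, teacher_w, vocab_size, steps, rng):
    """Sample a rollout from the student and score every step against the teacher."""
    history, per_step_kl = [], []
    for _ in range(steps):
        # Student distribution given its own history so far.
        s_probs = softmax(toy_logits(student_w, history, vocab_size))
        # Teacher distribution queried on the *student's* history.
        t_probs = softmax(toy_logits(teacher_w, history, vocab_size))
        # Per-step reverse KL: the quantity an update would minimize.
        per_step_kl.append(reverse_kl(s_probs, t_probs))
        # The next token comes from the student (on-policy), not the teacher.
        token = rng.choices(range(vocab_size), weights=s_probs)[0]
        history.append(token)
    return history, per_step_kl
```

In a real training loop the per-step values would be summed or averaged into a loss and differentiated with respect to the student's parameters using an autodiff framework; that update step is left out here.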

Reposted by brendan chambers

Mark J. Nelson

@mm-jj-nn.bsky.social

Did an intro to tokenization lecture today and worked in this thread.

Alexander Doria @dorialexander.bsky.social · Sep 15

> be a language model
> all you see is tokens
> you don't care, it's all abstracted away
> you live for a world of pure ideas, chain of concepts, reasoning streams
> tokens don't exist.

October 24, 2025 at 8:54 PM

Reposted by brendan chambers

Nathan Lambert

@natolambert.bsky.social

The first fantastic paper on scaling RL with LLMs just dropped. I strongly recommend taking a look and will be sharing more thoughts on the blog soon.

The Art of Scaling Reinforcement Learning Compute for LLMs
Khatri & Madaan et al.

buff.ly/olKwF3X

October 16, 2025 at 1:59 PM

Reposted by brendan chambers

Simon Willison

@simonwillison.net

NVIDIA sent me preview hardware of their new DGX Spark 128GB ARM64 4TB "AI supercomputer" - it's a very neat little device, here are my notes so far
simonwillison.net/2025/Oct/14/...

NVIDIA DGX Spark: great hardware, early days for the ecosystem

NVIDIA sent me a preview unit of their new DGX Spark desktop “AI supercomputer”. I’ve never had hardware to review before! You can consider this my first ever sponsored post …

simonwillison.net

October 14, 2025 at 11:38 PM

Reposted by brendan chambers

Sebastian Raschka (rasbt)

@sebastianraschka.com

Multi-Head Latent Attention
🔗 github.com/rasbt/LLMs-f...

October 12, 2025 at 1:57 PM

Reposted by brendan chambers

Grace

@gracekind.net

⚠️ You have marked yourself as an untrusted node in the epistemic network

October 11, 2025 at 1:55 PM

brendan chambers

@societyoftrees.bsky.social

improving pretrained LLMs by searching over iid-noised params, using a reward score (aka fitness criterion) for weight-merging

Mark J. Nelson @mm-jj-nn.bsky.social · Oct 7

Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning

Fine-tuning pre-trained large language models (LLMs) for down-stream tasks is a critical step in the AI deployment pipeline. Reinforcement learning (RL) is arguably the most prominent fine-tuning meth...

arxiv.org

October 7, 2025 at 5:02 PM
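
The summary in the post above (search over iid-noised parameters, then merge them using a reward score) matches the general shape of an evolution-strategies update. The sketch below is a generic textbook-style ES step on a flat parameter list, not the linked paper's algorithm: the function name `es_step`, the hyperparameters (`population`, `sigma`, `lr`), and the reward normalization are all assumptions chosen for illustration.

```python
import random


def es_step(params, reward_fn, rng, population=32, sigma=0.1, lr=0.02):
    """One evolution-strategies update on a flat list of parameters."""
    noises, rewards = [], []
    for _ in range(population):
        # Perturb every parameter with iid Gaussian noise.
        eps = [rng.gauss(0.0, 1.0) for _ in params]
        candidate = [p + sigma * e for p, e in zip(params, eps)]
        noises.append(eps)
        # Score the perturbed parameters with the reward (fitness) function.
        rewards.append(reward_fn(candidate))

    # Normalize rewards so the update scale does not depend on reward units.
    mean_r = sum(rewards) / population
    var = sum((r - mean_r) ** 2 for r in rewards) / population
    std = var ** 0.5 or 1.0  # fall back to 1.0 if all rewards are equal
    scores = [(r - mean_r) / std for r in rewards]

    # Merge: move along the reward-weighted average of the noise directions.
    scale = lr / (population * sigma)
    return [
        p + scale * sum(s * eps[i] for s, eps in zip(scores, noises))
        for i, p in enumerate(params)
    ]
```

As a usage example, repeatedly calling `es_step` with `reward_fn = lambda p: -sum((a - b) ** 2 for a, b in zip(p, target))` moves `params` toward `target` without computing any gradients; only reward evaluations are needed.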

Reposted by brendan chambers

Conference on Language Modeling

@colmweb.org

We are excited to announce 4 outstanding papers 🏆🏆🏆🏆 --> 🧵

October 7, 2025 at 1:23 PM

Reposted by brendan chambers

Michael Kirchhof (ICML)

@mkirchhof.bsky.social

LLMs are currently this one big parameter block that stores all sorts of facts. In our new preprint, we add context-specific memory parameters to the model, and pretrain the model along with a big bank of memories.

📑 arxiv.org/abs/2510.02375

[1/10]🧵

October 6, 2025 at 4:06 PM

brendan chambers

@societyoftrees.bsky.social

accepted papers, COLM 2025

colmweb.org/AcceptedPape...

October 6, 2025 at 3:39 PM

Reposted by brendan chambers

Ethan Mollick

@emollick.bsky.social

Paper: arxiv.org/pdf/2509.20328

arxiv.org

October 3, 2025 at 1:05 PM

Reposted by brendan chambers

TMLR Published Papers

@tmlr-pub.bsky.social

Spaced Scheduling for Large Language Model Training

Amine El hattami, Nicolas Chapados, Christopher Pal

Action editor: Colin Raffel

https://openreview.net/forum?id=p0KTYl2B9T

#scheduling #scheduled #training

October 2, 2025 at 4:18 AM

Reposted by brendan chambers

Naomi Saphra

@nsaphra.bsky.social

really neat, clear explainer for the new work on “centralizing flows” to theoretically model learning dynamics

Understanding Optimization in Deep Learning with Central Flows

centralflows.github.io

October 1, 2025 at 12:20 PM

Reposted by brendan chambers

Lenka Zdeborová

@zdeborova.bsky.social

Scaling laws don’t just show up in test error — they leave fingerprints in the weight spectrum.
In the feature learning regime, we map this connection: phase diagrams of scaling exponents <-> spectral signatures of trained weights. The paper is: arxiv.org/abs/2509.24882

September 30, 2025 at 11:02 AM

Reposted by brendan chambers

norvid_studies

@norvid-studies.bsky.social

latent space opera

September 28, 2025 at 4:26 PM
