Lightnews — Scholar-powered news

Reposted by brendan chambers

@tmlr-pub.bsky.social

Goal-Conditioned Data Augmentation for Offline Reinforcement Learning

Xingshuai Huang, Di Wu, Benoit Boulet

Action editor: Baoxiang Wang

https://openreview.net/forum?id=8K16dplpE0

#reinforcement #conditioning #learns

November 23, 2025 at 9:18 AM

Reposted by brendan chambers

Ai2

@ai2.bsky.social

🌊 Global Mangrove Watch is using OlmoEarth to refresh mangrove map baselines faster, with higher accuracy & less manual annotation—allowing orgs + governments to respond to threats more quickly.
Learn more → buff.ly/6xLHLk6

November 4, 2025 at 2:53 PM

Reposted by brendan chambers

Alexander Doria

@dorialexander.bsky.social

ml halloween costume concept

October 31, 2025 at 10:08 PM

Reposted by brendan chambers

Eugene Vinitsky 🍒

@eugenevinitsky.bsky.social

Yes!! A POMDP world model benchmark with controlled test environments. So excited to play with this

Ching-Lung Hsu @hiallen72.bsky.social · 27d

arxiv.org/abs/2510.19788

Benchmarking World-Model Learning

Model-learning agents should gather information to learn world models that support many downstream tasks and inferences, such as predicting unobserved states, estimating near- and far-term consequence...

arxiv.org

October 29, 2025 at 12:47 PM

Reposted by brendan chambers

Mariel Pettee

@marielpettee.bsky.social

Really proud of this project led by @kylecranmer.bsky.social and created alongside these awesome collaborators and friends. I believe this is important work that will help researchers make grounded decisions when building multimodal models with many diverse inputs. Add it to your paper pile!

Kyle Cranmer @kylecranmer.bsky.social · 28d

New paper, with @rkhashmani.me @marielpettee.bsky.social @garrettmerz.bsky.social Hellen Qu. We introduce a framework for generating realistic, highly multimodal datasets with explicitly calculable mutual information. This is helpful for studying self-supervised learning
arxiv.org/abs/2510.21686

October 28, 2025 at 8:13 PM

Reposted by brendan chambers

Angelina Wang

@angelinawang.bsky.social

Cornell (NYC and Ithaca) is recruiting AI postdocs, apply by Nov 20, 2025! If you're interested in working with me on technical approaches to responsible AI (e.g., personalization, fairness), please email me.

academicjobsonline.org/ajo/jobs/30971

Cornell University, Empire AI Fellows Program

Job #AJO30971, Postdoctoral Fellow, Empire AI Fellows Program, Cornell University, New York, New York, US

academicjobsonline.org

October 28, 2025 at 6:19 PM

brendan chambers

@societyoftrees.bsky.social

Here is the recipe from the latest Thinking Machines blogpost about late post-training:

- generate student rollouts
- query teacher distribution forced on student history
- update using the reverse KL divergence at each step

thinkingmachines.ai/blog/on-poli...

October 28, 2025 at 6:37 PM

Reposted by brendan chambers

Mark J. Nelson

@mm-jj-nn.bsky.social

Did an intro to tokenization lecture today and worked in this thread.

Alexander Doria @dorialexander.bsky.social · Sep 15

> be a language model
> all you see is tokens
> you don't care, it's all abstracted away
> you live for a world of pure ideas, chain of concepts, reasoning streams
> tokens don't exist.

October 24, 2025 at 8:54 PM

Reposted by brendan chambers

Nathan Lambert

@natolambert.bsky.social

The first fantastic paper on scaling RL with LLMs just dropped. I strongly recommend taking a look and will be sharing more thoughts on the blog soon.

The Art of Scaling Reinforcement Learning Compute for LLMs
Khatri & Madaan et al.

buff.ly/olKwF3X

October 16, 2025 at 1:59 PM

Reposted by brendan chambers

Simon Willison

@simonwillison.net

NVIDIA sent me preview hardware of their new DGX Spark 128GB ARM64 4TB "AI supercomputer" - it's a very neat little device, here are my notes so far
simonwillison.net/2025/Oct/14/...

NVIDIA DGX Spark: great hardware, early days for the ecosystem

NVIDIA sent me a preview unit of their new DGX Spark desktop “AI supercomputer”. I’ve never had hardware to review before! You can consider this my first ever sponsored post …

simonwillison.net

October 14, 2025 at 11:38 PM

Reposted by brendan chambers

Sebastian Raschka (rasbt)

@sebastianraschka.com

Multi-Head Latent Attention
🔗 github.com/rasbt/LLMs-f...

October 12, 2025 at 1:57 PM

Reposted by brendan chambers

Grace

@gracekind.net

⚠️ You have marked yourself as an untrusted node in the epistemic network

October 11, 2025 at 1:55 PM

brendan chambers

@societyoftrees.bsky.social

improving pretrained LLMs by searching over iid-noised params, using a reward score (aka fitness criterion) for weight-merging

Mark J. Nelson @mm-jj-nn.bsky.social · Oct 7

Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning

Fine-tuning pre-trained large language models (LLMs) for down-stream tasks is a critical step in the AI deployment pipeline. Reinforcement learning (RL) is arguably the most prominent fine-tuning meth...

arxiv.org

October 7, 2025 at 5:02 PM

Reposted by brendan chambers

Conference on Language Modeling

@colmweb.org

We are excited to announce 4 outstanding papers 🏆🏆🏆🏆 --> 🧵

October 7, 2025 at 1:23 PM

Reposted by brendan chambers

Michael Kirchhof (ICML)

@mkirchhof.bsky.social

LLMs are currently this one big parameter block that stores all sort of facts. In our new preprint, we add context-specific memory parameters to the model, and pretrain the model along with a big bank of memories.

📑 arxiv.org/abs/2510.02375

[1/10]🧵

October 6, 2025 at 4:06 PM

brendan chambers

@societyoftrees.bsky.social

accepted papers, COLM 2025

colmweb.org/AcceptedPape...

October 6, 2025 at 3:39 PM

Reposted by brendan chambers

Ethan Mollick

@emollick.bsky.social

Paper: arxiv.org/pdf/2509.20328

arxiv.org

October 3, 2025 at 1:05 PM

Reposted by brendan chambers

TMLR Published Papers

@tmlr-pub.bsky.social

Spaced Scheduling for Large Language Model Training

Amine El hattami, Nicolas Chapados, Christopher Pal

Action editor: Colin Raffel

https://openreview.net/forum?id=p0KTYl2B9T

#scheduling #scheduled #training

October 2, 2025 at 4:18 AM

Reposted by brendan chambers

Naomi Saphra

@nsaphra.bsky.social

really neat clear explainer for the new on “centralizing flows” to theoretically model learning dynamics

Understanding Optimization in Deep Learning with Central Flows

centralflows.github.io

October 1, 2025 at 12:20 PM

Reposted by brendan chambers

Lenka Zdeborová

@zdeborova.bsky.social

Scaling laws don’t just show up in test error — they leave fingerprints in the weight spectrum.
In the feature learning regime, we map this connection: phase diagrams of scaling exponents <-> spectral signatures of trained weights. The paper is: arxiv.org/abs/2509.24882

September 30, 2025 at 11:02 AM

Reposted by brendan chambers

norvid_studies

@norvid-studies.bsky.social

latent space opera

September 28, 2025 at 4:26 PM

Reposted by brendan chambers

Alexander Doria

@dorialexander.bsky.social

New technical post from Thinky on optimizers but this is the main catch: conditional learning rate per layers.

thinkingmachines.ai/blog/modular...

September 26, 2025 at 6:00 PM

brendan chambers

@societyoftrees.bsky.social

Isaac-01 multimodal model from Perceptron AI - pdf whitepaper

github.com/perceptron-a...

github.com

September 24, 2025 at 5:16 PM

Reposted by brendan chambers

David Marx

@digthatdata.bsky.social

New (March) Schmidhuber I missed where they use a carefully engineered layer to track the information gained by each (prediction) token for solving problems that require computation. Hidden state is predictive of (a? not necessarily minimal?) description length.

Measuring In-Context Computation Complexity via Hidden State Prediction

Detecting when a neural sequence model does "interesting" computation is an open problem. The next token prediction loss is a poor indicator: Low loss can stem from trivially predictable sequences tha...

arxiv.org

September 9, 2025 at 12:06 AM

brendan chambers

@societyoftrees.bsky.social

the Perceptron folks are sharing design specs of their approach to serialize multimodal data as interleaved events

www.perceptron.inc/blog/tensors...

TensorStream - Perceptron

A layer of intelligence for the physical world. We are a research company building the future of Physical AGI.

www.perceptron.inc

September 24, 2025 at 4:27 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news