Lightnews — Scholar-powered news

Jason Weston

@jasonweston.bsky.social

560 followers 340 following 5 posts

Senior Director, Research Scientist @ Meta FAIR + Visiting Prof @ NYU.
Pretrain+SFT: NLP from Scratch (2011). Multilayer attention+position encode+LLM: MemNet (2015). Recent (2024): Self-Rewarding LLMs & more!


Jason Weston

@jasonweston.bsky.social

Our new work on continuous chain of thought.

Tanishq Mathew Abraham @iscienceluvr.bsky.social · Dec 10

Training Large Language Models to Reason in a Continuous Latent Space

Introduces a new paradigm for LLM reasoning called Chain of Continuous Thought (COCONUT)

Directly feed the last hidden state (a continuous thought) as the input embedding for the next token.

arxiv.org/abs/2412.06769

December 10, 2024 at 4:51 PM
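
The idea in the quoted post, feeding the last hidden state back in as the next input embedding instead of first decoding it to a token, can be illustrated with a toy sketch. This is not the paper's implementation: `forward`, `EMBED`, `W`, and the dimensions are hypothetical stand-ins for a real model, chosen only to contrast the two feedback paths.

```python
import math
import random

DIM, VOCAB = 4, 6
rng = random.Random(0)
# Hypothetical toy parameters: an embedding table and one mixing matrix.
EMBED = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
W = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]

def forward(x):
    """Toy stand-in for a model pass: input embedding -> last hidden state."""
    return [math.tanh(sum(W[i][j] * x[j] for j in range(DIM))) for i in range(DIM)]

def decode_token(hidden):
    """Score the hidden state against every embedding and pick the best token id."""
    scores = [sum(h * e for h, e in zip(hidden, row)) for row in EMBED]
    return max(range(VOCAB), key=scores.__getitem__)

def discrete_step(x):
    """Ordinary chain of thought: hidden -> token id -> that token's embedding."""
    return EMBED[decode_token(forward(x))]

def continuous_step(x):
    """Continuous thought: the hidden state itself becomes the next input."""
    return forward(x)

def reason(start_token, n_latent_steps):
    """Run a few latent steps, then decode a single answer token."""
    x = EMBED[start_token]
    for _ in range(n_latent_steps):
        x = continuous_step(x)
    return decode_token(forward(x))
```

In `discrete_step` the hidden state is collapsed to one of `VOCAB` embeddings before the next step, while `continuous_step` passes the full vector through unchanged.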

Jason Weston

@jasonweston.bsky.social

🚨 Adaptive Decoding via Latent Preference Optimization 🚨
- New layer for Transformer, selects decoding params automatically *per token*
- Learnt via new method Latent Preference Optimization
- Outperforms any fixed temperature decoding, choosing creativity or factuality
arxiv.org/abs/2411.09661
🧵1/4

November 22, 2024 at 1:06 PM
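
To make "decoding params per token" concrete, here is a minimal sketch of choosing a temperature for each token and applying it to the logits. It is an illustration under assumptions, not the method from the paper: `pick_temperature`, the selector scores, and the candidate temperature list are hypothetical, and a real system would learn the selector rather than take its scores as given.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax; temperature must be positive."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_temperature(selector_scores, temperatures):
    """Hypothetical selector: return the candidate temperature with the highest score."""
    best = max(range(len(temperatures)), key=selector_scores.__getitem__)
    return temperatures[best]

def adaptive_distribution(logits, selector_scores, temperatures):
    """Choose a temperature for this token, then build its sampling distribution."""
    return softmax_with_temperature(logits, pick_temperature(selector_scores, temperatures))

# Two tokens with the same logits but different selector scores.
candidates = [0.2, 1.0]
logits = [2.0, 1.0, 0.0]
factual = adaptive_distribution(logits, [0.9, 0.1], candidates)   # selects 0.2
creative = adaptive_distribution(logits, [0.1, 0.9], candidates)  # selects 1.0
```

The lower temperature concentrates probability on the top-scoring token, while the higher one leaves more mass on the alternatives.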
