Lightnews — Scholar-powered news

Daniel Jiang

@danielrjiang.bsky.social

Research Scientist Meta, Adjunct Professor at University of Pittsburgh. Interested in reinforcement learning, approximate DP, adaptive experimentation, Bayesian optimization, & operations research. http://danielrjiang.github.io

📍Chicago, IL

Posts Replies Media Videos

Reposted by Daniel Jiang

Nathan Lambert

@natolambert.bsky.social

First draft online version of The RLHF Book is DONE. Recently I've been creating the advanced discussion chapters on everything from Constitutional AI to evaluation and character training, but I also sneak in consistent improvements to the RL specific chapter.

rlhfbook.com

April 16, 2025 at 7:01 PM

Daniel Jiang

@danielrjiang.bsky.social

At ICLR 2025 in Singapore, my co-authors and I presented two papers on RL. Feel free to let us know of any feedback and let me know if you'd like to chat!
- openreview.net/forum?id=AOl...
- openreview.net/forum?id=AOl...

April 26, 2025 at 1:51 AM

Daniel Jiang

@danielrjiang.bsky.social

Our team at Meta is hiring a postdoc researcher! Our group conducts both fundamental and applied research in reinforcement learning, with a focus on applications in Meta's advertising systems.

March 17, 2025 at 1:59 PM

Reposted by Daniel Jiang

Reinforcement Learning Conference

@rl-conference.bsky.social

Congrats to this year's Turing award winners! www.nytimes.com/2025/03/05/t...

Incidentally, if you'd like to hear from them, we know a place they've given / are giving keynotes

Turing Award Goes to A.I. Pioneers Andrew Barto and Richard Sutton

Andrew Barto and Richard Sutton developed reinforcement learning, a technique vital to chatbots like ChatGPT.

www.nytimes.com

March 7, 2025 at 2:38 AM

Reposted by Daniel Jiang

Dylan Foster 🐢

@djfoster.bsky.social

Given a high-quality verifier, language model accuracy can be improved by scaling inference-time compute (e.g., w/ repeated sampling). When can we expect similar gains without an external verifier?

New paper: Self-Improvement in Language Models: The Sharpening Mechanism

arxiv.org/abs/2412.01951

December 14, 2024 at 4:10 PM

Reposted by Daniel Jiang

Eugene Vinitsky 🍒

@eugenevinitsky.bsky.social

An updated intro to reinforcement learning by Kevin Murphy: arxiv.org/abs/2412.05265! Like their books, it covers a lot and is quite up to date with modern approaches. It also is pretty unique in coverage, I don't think a lot of this is synthesized anywhere else yet

Reinforcement Learning: An Overview

This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement learning and sequential decision making, covering value-based RL, policy-gradient methods, model-based met...

arxiv.org

December 9, 2024 at 2:27 PM

Reposted by Daniel Jiang

Cathy Wu

@cathywu.bsky.social

I collected some folk knowledge for RL and stuck them in my lecture slides a couple weeks back: web.mit.edu/6.7920/www/l... See Appendix B... sorry, I know, appendix of a lecture slide deck is not the best for discovery. Suggestions very welcome.

web.mit.edu

November 27, 2024 at 1:36 PM

Reposted by Daniel Jiang

Aviv Tamar

@aviv-tamar.bsky.social

Want to learn / teach RL?  
Check out new book draft:
Reinforcement Learning - Foundations sites.google.com/view/rlfound...
W/ Shie Mannor & Yishay Mansour
This is a rigorous first course in RL, based on our teaching at TAU CS and Technion ECE.

November 25, 2024 at 12:08 PM

Reposted by Daniel Jiang

Tiziano Piccardi

@tiziano.bsky.social

New paper: Do social media algorithms shape affective polarization?

We ran a field experiment on X/Twitter (N=1,256) using LLMs to rerank content in real-time, adjusting exposure to polarizing posts. Result: Algorithmic ranking impacts feelings toward the political outgroup! 🧵⬇️

November 25, 2024 at 8:32 PM

Reposted by Daniel Jiang

Eugene Vinitsky 🍒

@eugenevinitsky.bsky.social

The RL (and some non-RL folks) starter pack is almost full. Pretty clear that the academic move here has succeeded
go.bsky.app/3WPHcHg

November 18, 2024 at 8:30 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news