Jason Lee
@jasondeanlee.bsky.social

Associate Professor at Princeton
Machine Learning Researcher

Our new work on scaling laws accounts for compute, model size, and number of samples. It relies on an extremely fine-grained analysis of online SGD, built up over the last 8 years of understanding SGD on simple toy models (tensors, single-index models, multi-index models).
Excited to announce a new paper with Yunwei Ren, Denny Wu,
@jasondeanlee.bsky.social!

We prove a neural scaling law in the SGD learning of extensive width two-layer neural networks.

arxiv.org/abs/2504.19983

🧵below (1/10)
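For concreteness, here is a minimal, hypothetical sketch of one of the toy settings the post mentions: a single-index model (a ReLU applied to one hidden direction) trained with one-pass online SGD on fresh Gaussian samples, so the number of samples equals the number of steps. None of this is from the paper; the dimension, step size, and loss estimate are illustrative assumptions.

```python
# Minimal sketch: online SGD on a single-index model (illustrative only).
# Target: y = relu(<w_star, x>); student: y_hat = relu(<w, x>).
# Fresh Gaussian sample each step, so #samples == #SGD steps.
import numpy as np

rng = np.random.default_rng(0)
d = 100                               # input dimension (assumption)
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

w = rng.normal(size=d) / np.sqrt(d)   # student initialization (assumption)
lr = 0.05                             # hypothetical step size
n_steps = 50_000
log_every = 5_000

relu = lambda z: np.maximum(z, 0.0)

for t in range(1, n_steps + 1):
    x = rng.normal(size=d)            # fresh sample: online / one-pass SGD
    y = relu(x @ w_star)
    pred = x @ w
    err = relu(pred) - y
    grad = err * (pred > 0) * x       # gradient of 0.5 * err**2 w.r.t. w
    w -= (lr / d) * grad              # hypothetical step-size scaling
    if t % log_every == 0:
        # rough population loss, estimated on a held-out Gaussian batch
        X = rng.normal(size=(2_000, d))
        loss = 0.5 * np.mean((relu(X @ w) - relu(X @ w_star)) ** 2)
        print(f"samples={t:>6d}  est. loss={loss:.4e}")
```

Logging the estimated loss against the sample count (the quantity the scaling law is stated in) is the natural way to eyeball a power-law-like decay in this toy setup.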

Welcome to the Bluesky account for Stand Up for Science 2025!

Keep an eye on this space for updates, event information, and ways to get involved. We can't wait to see everyone #standupforscience2025 on March 7th, both in DC and locations nationwide!

#scienceforall #sciencenotsilence

Duck in Vancouver! Mott32

Reposted by Jason Lee

“On a log-log plot, my grandmother fits on a straight line.”
-Physicist Fritz Houtermans

There's a lot of truth to this. Log-log plots are often abused and can be very misleading.

1/5
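As an illustration of the point (my own hedged sketch, not from the thread): fit a straight line in log-log space to several curves that are not power laws and note how high the R² still comes out.

```python
# Many smooth, monotone curves look almost straight on log-log axes,
# so a high R^2 for a power-law fit is weak evidence on its own.
import numpy as np

x = np.logspace(0, 3, 200)          # x from 1 to 1000 (arbitrary range)
curves = {
    "true power law  y = x^0.7":      x ** 0.7,
    "logarithm       y = log(1+x)":   np.log1p(x),
    "saturating      y = x/(1+.01x)": x / (1 + 0.01 * x),
}

for name, y in curves.items():
    # Least-squares line in (log x, log y) space.
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    resid = np.log(y) - (slope * np.log(x) + intercept)
    r2 = 1 - resid.var() / np.log(y).var()
    print(f"{name:35s}  fitted slope={slope:+.2f}  R^2={r2:.3f}")
```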

Lool

Representative results:
Settling the sampling complexity of RL: arxiv.org/abs/2307.13586
Optimal Multi-Distribution Learning (solved a COLT 2023 open problem): arxiv.org/abs/2312.05134
Anytime Acceleration of Gradient Descent (solved a COLT 2024 open problem): arxiv.org/abs/2411.17668
Settling the Sample Complexity of Online Reinforcement Learning
A central issue lying at the heart of online reinforcement learning (RL) is data efficiency. While a number of recent works achieved asymptotically minimal regret in online RL, the optimality of these...
arxiv.org

Zihan Zhang (tinyurl.com/4nks7f9b) is a postdoc with Yuxin Chen, Simon Du, and me.

What's known about the 1.27 lower bound? Is it a guess, or is there a reason people believe it's fundamental?

Send your COLT open problems to Zihan; with high probability he will solve them!

What's the point of @perplexity_ai given that ChatGPT also does search?

Yo add me to your starter packs!

Reposted by Jason Lee

Assume that the nodes of a social network can choose between two alternative technologies: B and X.
A node using B receives an intrinsic benefit relative to X, but there is also a benefit to using the same tech as the majority of your neighbors.
Assume everyone uses X at time t=0. Will they switch to B?
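The question at the end invites a simulation. Below is a hedged toy sketch (my own, not from the post) of one standard way to model it: a coordination game on a ring lattice where B carries an intrinsic bonus, everyone starts on X except a small seed, and nodes best-respond to their neighbors each round. The payoffs, graph, and seed set are all assumptions.

```python
# Toy best-response dynamics for the B-vs-X adoption question above.
# Payoff of B: intrinsic bonus alpha + beta * (fraction of neighbors on B).
# Payoff of X: beta * (fraction of neighbors on X). Nodes best-respond each round.
import numpy as np

n, k = 200, 6                 # ring lattice: each node linked to its k nearest nodes
alpha, beta = 0.4, 1.0        # intrinsic advantage of B, value of coordinating

neighbors = [[(i + d) % n for d in range(-k // 2, k // 2 + 1) if d != 0]
             for i in range(n)]

state = np.zeros(n, dtype=bool)   # False = X, True = B; everyone starts on X
state[:5] = True                  # a small contiguous block of early B adopters

for step in range(1, n):
    frac_b = np.array([state[nb].mean() for nb in neighbors])
    # Switch to B iff alpha + beta*frac_b > beta*(1 - frac_b),
    # i.e. iff the fraction of neighbors on B exceeds (beta - alpha) / (2*beta).
    new_state = alpha + beta * frac_b > beta * (1 - frac_b)
    if np.array_equal(new_state, state):
        break
    state = new_state

print(f"after {step} rounds, {state.mean():.0%} of nodes use B")
```

With these particular numbers the threshold is 30% of neighbors, so a contiguous seed is enough to tip its boundary nodes and the cascade spreads; whether B takes over in general depends on alpha, beta, and the graph.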

Reposted by Jason Lee

Starter packs are helpful, as is the Twitter import tool: chromewebstore.google.com/detail/sky-f...
Sky Follower Bridge - Chrome Web Store
Instantly find and follow the same users from your Twitter follows on Bluesky.
chromewebstore.google.com

Takes too much clicking...

How do I bulk follow people?

Reposted by Jason Lee