https://papail.io
The fact that these "world models" are approximate/incomplete does not disqualify them.
In "A Mathematical Theory of Communication", almost as an afterthought, Shannon suggests the N-gram model for generating English, and notes that word-level tokenization works better than character-level tokenization.
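Shannon's word-level N-gram fits in a few lines. Here's a minimal bigram sketch in Python; the toy corpus and function names are illustrative (Shannon estimated frequencies from real English text):

```python
import random
from collections import defaultdict

# Toy corpus (illustrative stand-in for real English text).
corpus = ("the cat sat on the mat and the dog sat on the rug "
          "and the cat saw the dog").split()

# Word-level bigram table: successors[w] lists every word observed after w,
# so sampling uniformly from it matches the empirical P(next | current).
successors = defaultdict(list)
for w1, w2 in zip(corpus, corpus[1:]):
    successors[w1].append(w2)

def generate(start, n_words, seed=0):
    """Generate text by repeatedly sampling a successor of the last word."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < n_words:
        nxt = successors.get(out[-1])
        if not nxt:  # dead end: last word was never seen mid-corpus
            break
        out.append(rng.choice(nxt))
    return " ".join(out)

print(generate("the", 10))
```

The output is locally plausible but globally meaningless, which is exactly the point of Shannon's series of increasingly ordered approximations to English.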
I think 2-3 years ago I said I would not work on two ML sub-areas. RL was one of them. I am happy to say that I am not strongly attached to my beliefs.
Every eval chokes under hill climbing. If we're lucky, there's an early phase where *real* learning (for both the model and the community) can occur. I'd argue that a benchmark's value lies entirely in that window. So the real question is: what did we learn?
In Greek, συκοφάντης (sykophántēs) most typically refers to a malicious slanderer, someone spreading lies, not flattery!
Every time you use it, you’re technically using it wrong :D
We're hiring at the Senior Researcher level (e.g., post-PhD).
Please drop me a DM if you apply!
jobs.careers.microsoft.com/us/en/job/17...
But I think multiplication, addition, maze solving, and easy-to-hard generalization are actually solvable on standard transformers...
with recursive self-improvement
Below is the accuracy of a tiny model teaching itself how to add and multiply
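The self-improvement data loop can be sketched roughly like this. This is a generic version of the recipe with hypothetical names and a noisy stand-in "model", not the paper's exact implementation: a model trained on easier problems labels slightly harder ones, majority voting over samples filters out unreliable labels, and the surviving pairs become the next round's training data.

```python
import random
from collections import Counter

def majority_vote(samples):
    """Keep a label only if a strict majority of samples agree on it."""
    answer, count = Counter(samples).most_common(1)[0]
    return answer if count > len(samples) / 2 else None

def self_improve_round(model, digits, n_problems=100, n_samples=5, rng=None):
    """One round: self-label harder problems, keep only confident labels."""
    rng = rng or random.Random(0)
    new_data = []
    for _ in range(n_problems):
        a = rng.randrange(10 ** (digits - 1), 10 ** digits)
        b = rng.randrange(10 ** (digits - 1), 10 ** digits)
        votes = [model(a, b, rng) for _ in range(n_samples)]
        label = majority_vote(votes)
        if label is not None:  # confident self-label joins the training set
            new_data.append(((a, b), label))
    return new_data

# Stand-in for a model trained on easier problems: mostly right, sometimes off.
def noisy_adder(a, b, rng):
    return a + b if rng.random() < 0.8 else a + b + rng.choice([-1, 1])

data = self_improve_round(noisy_adder, digits=3)
accuracy = sum(label == a + b for (a, b), label in data) / len(data)
```

The point of the sketch: even though each individual self-label is noisy, majority filtering makes the retained training data much cleaner than the model's raw per-sample accuracy, which is what lets the next round learn from it.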
But he thinks multiplication, addition, maze solving, and easy-to-hard generalization are actually solvable on standard transformers...
with recursive self-improvement, as presented by @dimitrisp.bsky.social
Paper on arxiv coming on Monday.
Link to a talk I gave on this below 👇
Super excited about this work!
Talk: youtube.com/watch?v=szhE...
Slides: tinyurl.com/SelfImprovem...
👉 WebPage: tinyurl.com/ye2awe8m
🎥 YouTube: tinyurl.com/2wwjru6z
2018 ResNet: a more accurate model is trainable in half an hour on a single GPU.
What stops this from happening for LLMs?
What a world we're in where this well-trodden pattern rocks financial markets and escalates geopolitical conflict.
The increase in "yapping" in reasoning models, as they are trained for more rounds of RL, "emerges" (sorry :D) as models discover that verbose reasoning helps them achieve better rewards (e.g., higher accuracy).
Here's a new one for me: Drawing with Logo (yes the turtle)!
To be fair, drawing with Logo is hard. But... here are 8 examples with Sonnet 3.6 vs o1.
Example 1/8: Draw the letter G
arxiv.org/pdf/2501.09240
- be a good dad and partner
- do more nature stuff
- walk & run more
- spend more time in the water
- think deeper
- read good books, don’t feel bad not finishing all
- be a good mentor & colleague
- figure out what reasoning is
- don’t be reward hacking
- have fun
All doable