Lightnews — Scholar-powered news

Dhruv Batra

@dhruvbatra.bsky.social

Most recent checkpoint of n1 vs Opus 4.6!

On Navi-Bench and Westworld browser automation benchmarks:

- Same accuracy
- n1 is 2.5x faster
- n1 is 5.6x cheaper

Try it out via the Yutori API.

February 11, 2026 at 5:12 PM

Dhruv Batra

@dhruvbatra.bsky.social

Fun chat with Evan O'Donnell about the similarities between training robots and web agents, managing context for agents that run for months and years, the future of the AI-first web, and the ideal form factor for embodied AI.

www.thetimes.blog/p/agents-ne...

February 6, 2026 at 5:57 PM

Dhruv Batra

@dhruvbatra.bsky.social

Maybe coding is just amortized inference for LLMs.

Maybe the reason we write programs down to files is just to save inference costs.

January 21, 2026 at 6:06 PM

Dhruv Batra

@dhruvbatra.bsky.social

The bitter lesson for web agents

The last year has taught us a new bitter lesson that we think others are not yet grokking.

Agents that *look at the web like humans* (screenshots of sites) navigate and generalize better than agents that read code (HTML, DOM).

November 14, 2025 at 9:14 PM

Dhruv Batra

@dhruvbatra.bsky.social

As part of the award ceremony, the VQA team presented a recap of vision-and-language research over the last decade: solved problems, progress, and open challenges for multimodal LLMs.

October 23, 2025 at 5:18 PM

Dhruv Batra

@dhruvbatra.bsky.social

VQA challenge series won the Mark Everingham prize at #ICCV2025 for stimulating a new strand of vision-and-language research.

It's extra special because ICCV25 marks the 10-year anniversary of the VQA paper.

When we started, the idea of answering any question about any image seemed outlandish.

October 21, 2025 at 7:27 PM

Dhruv Batra

@dhruvbatra.bsky.social

The problem with “AI slop” isn’t the AI — it’s the slop.

People act like AI is the issue, when it’s actually part of the fix.

If we're honest: most of what we make, most of the time, is slop by our own standards.

That’s the generator–discriminator gap in creative work that Ira Glass talks about.

October 15, 2025 at 4:22 PM

Dhruv Batra

@dhruvbatra.bsky.social

It is so refreshing to see conferences innovate on the reviewing model and run actual experiments (!) as opposed to fighting change.

ICLR Conference @iclr-conf.bsky.social · Apr 16

For #ICLR2025, we piloted an LLM that provided optional feedback to some reviewers. Results are promising: over 12K suggestions were incorporated by reviewers to improve review quality. See our blog post for details and more analysis blog.iclr.cc/2025/04/15/l...

Leveraging LLM feedback to enhance review quality – ICLR Blog

blog.iclr.cc

April 16, 2025 at 4:43 AM

Dhruv Batra

@dhruvbatra.bsky.social

My entire robotics career has led to this.

Devi Parikh @deviparikh.bsky.social · Apr 1

Introducing API. A new era of agentic computer use begins today.

youtu.be/ACzGPGgc9BU

YouTube video by Yutori

youtu.be

April 1, 2025 at 4:05 PM

Dhruv Batra

@dhruvbatra.bsky.social

The answer to many "why X?" questions:

Because the laws of physics do not prohibit X and the forces of biology gave us curiosity.

March 28, 2025 at 3:43 PM

Dhruv Batra

@dhruvbatra.bsky.social

I started something new last year with a wonderful group of people. We showed a demo in Jan.

Today, we’re telling our story — show before you talk!

𝘞𝘦 𝘢𝘳𝘦 𝘳𝘦-𝘪𝘮𝘢𝘨𝘪𝘯𝘪𝘯𝘨 𝘩𝘰𝘸 𝘱𝘦𝘰𝘱𝘭𝘦 𝘪𝘯𝘵𝘦𝘳𝘢𝘤𝘵 𝘸𝘪𝘵𝘩 𝘵𝘩𝘦 𝘸𝘦𝘣, one of humanity’s greatest inventions and a mess overdue for an overhaul.

yutori.com

March 27, 2025 at 2:31 PM

Reposted by Dhruv Batra

VLMs4All - CVPR 2025 Workshop

@vlms4all.bsky.social

📢Excited to announce our upcoming workshop - Vision Language Models For All: Building Geo-Diverse and Culturally Aware Vision-Language Models (VLMs-4-All) @CVPR 2025!
🌐 sites.google.com/view/vlms4all

March 14, 2025 at 3:55 PM

Dhruv Batra

@dhruvbatra.bsky.social

Using a locally-running LLM to translate a review is explicitly prohibited by @iccv.bsky.social

Why? Whom does this possibly harm?

March 6, 2025 at 6:10 PM

Dhruv Batra

@dhruvbatra.bsky.social

Brilliant talk by Ilya, but he's wrong on one point.

We are NOT running out of data. We are running out of human-written text.

We have more videos than we know what to do with. We just haven't solved pre-training in vision.

Just go out and sense the world. Data is easy.

December 14, 2024 at 7:15 PM

Dhruv Batra

@dhruvbatra.bsky.social

3.2 —> 3.3

See, model naming isn't that hard.

December 7, 2024 at 2:31 AM

Dhruv Batra

@dhruvbatra.bsky.social

Looking forward to #NeurIPS2024 next week!

If you work on digital or physical AI agents, I'm scheduling chats (Dec 9-12). DMs open.

December 6, 2024 at 7:52 PM

Dhruv Batra

@dhruvbatra.bsky.social

Does the term "LLM" mean:

- a language model in the technical sense
- a "modern" AI system
- an auto-regressive symbol-sequence model, built with transformers, trained with SGD and self-supervised learning
- something else?

dhruvbatra.substack.com/p/the-term-l...

The term “LLM” is a misnomer.

Sometime last year, I noticed AI-adjacent (or “AI curious”) folks using the term “LLM” in odd ways:

dhruvbatra.substack.com

December 4, 2024 at 8:01 PM
