Yi-Hao Peng
@yihaopeng.bsky.social
Reposted by Yi-Hao Peng
Breaking: we're releasing a fully synthetic generalist dataset for pretraining, SYNTH, and two new SOTA reasoning models trained exclusively on it. Despite having seen only 200 billion tokens, Baguettotron is currently best-in-class in its size range. pleias.fr/blog/blogsyn...
November 10, 2025 at 5:30 PM
Reposted by Yi-Hao Peng
The first research on the fundamentals of character training -- i.e., applying modern post-training techniques to ingrain specific character traits into models.

All models, datasets, code etc released.
Really excited about this project! Sharan, the lead student author, was a joy to work with.
November 4, 2025 at 4:51 PM
Reposted by Yi-Hao Peng
Great overview of the workshops and tutorials.
My favorites:
1) CAD representation
2) synthetic data to help city-scale reconstruction
3) trends in 3D vision
4) visual chain-of-thoughts?
November 3, 2025 at 1:06 PM
Reposted by Yi-Hao Peng
Instance-Level Composed Image Retrieval
@billpsomas.bsky.social George Retsinas @nikos-efth.bsky.social Panagiotis Filntisis, Yannis Avrithis, Petros Maragos, Ondrej Chum, @gtolias.bsky.social

tl;dr: condition-based retrieval (+dataset) - old photo/sunset/night/aerial/model arxiv.org/abs/2510.25387
November 3, 2025 at 12:53 PM
Reposted by Yi-Hao Peng
💡Can we trust synthetic data for statistical inference?

We show that synthetic data (e.g., LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interaction between the moment residuals of the synthetic data and those of the real data.
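One way such residual interactions can be exploited, in the spirit of prediction-powered inference (a sketch of the general idea; `ppi_mean` and its arguments are my illustrative names, not the paper's exact estimator):

```python
def ppi_mean(real_labels, synth_on_real, synth_large):
    """Debiased mean estimate from a small real sample plus synthetic data.

    A large synthetic sample gives a low-variance (but possibly biased)
    estimate; the paired real-vs-synthetic residuals on the small real
    sample correct that bias.
    """
    n = len(real_labels)
    rectifier = sum(r - s for r, s in zip(real_labels, synth_on_real)) / n
    return sum(synth_large) / len(synth_large) + rectifier


# Toy usage: synthetic data is systematically biased upward by 0.5,
# and the residual term removes exactly that bias.
real = [1.0, 2.0, 3.0, 4.0]
synth_on_real = [x + 0.5 for x in real]   # synthetic predictions on real points
synth_large = [2.5] * 1000                # large biased synthetic sample
estimate = ppi_mean(real, synth_on_real, synth_large)
```

Here the naive synthetic mean would be 2.5, while the corrected estimate recovers the real mean of 2.0.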
October 10, 2025 at 4:12 PM
Reposted by Yi-Hao Peng
We have a new sequence model for robotics, which will be presented at #NeurIPS2025:

Kinaema: A recurrent sequence model for memory and pose in motion
arxiv.org/abs/2510.20261

By @mbsariyildiz.bsky.social, @weinzaepfelp.bsky.social, G. Bono, G. Monaci and myself
@naverlabseurope.bsky.social

1/9
October 24, 2025 at 7:18 AM
Reposted by Yi-Hao Peng
Minimal Solvers and Model Refinement by @visionviktor

#ICCV2025
danini.github.io/ransac-2025-...
October 20, 2025 at 11:49 PM
Reposted by Yi-Hao Peng
🚨 Does your LLM really understand code -- or is it just really good at remembering it?
We built **PLSemanticsBench** to find out.
The results: a wild mix.

✅The Brilliant:
Top reasoning models can execute complex, fuzzer-generated programs -- even with 5+ levels of nested loops! 🤯

❌The Brittle: 🧵
October 14, 2025 at 2:33 AM
Reposted by Yi-Hao Peng
Barroso-Laguna et al., "A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features"

When building context for your feed-forward 3D point-map estimator, don't use full image pairs -- just randomly subsample features! -> faster compute, more images.
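Schematically, the subsampling trick amounts to something like this (a minimal sketch, not the paper's code; the function name and budget are illustrative):

```python
import random


def subsample_features(features, k, seed=0):
    """Randomly keep at most k features per image instead of the full set.

    Dropping features shrinks the per-image context cost, so more images
    fit in the same compute budget.
    """
    rng = random.Random(seed)
    if len(features) <= k:
        return list(features)
    return rng.sample(features, k)


# Usage: keep 16 of 100 features for one image.
kept = subsample_features(list(range(100)), k=16)
```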
October 2, 2025 at 7:37 PM
Reposted by Yi-Hao Peng
Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields

Zhiting Mei, Ola Shorinwa, Anirudha Majumdar
tl;dr: who cares, look at those dino icons!
OK: distilling DINO into NeRF -> better object localization than VGGT.

arxiv.org/abs/2510.03104
October 6, 2025 at 10:48 AM
Reposted by Yi-Hao Peng
Can we crowdsource robot evaluations? lmsys/chatbot arena helped revolutionize LLM evaluations; is it possible to do similar things for robots?

Find out in a new episode of RoboPapers: robopapers.substack.com/p/ep34-roboa...
Ep#34: RoboArena
With Pranav Atreya and Karl Pertsch
robopapers.substack.com
October 3, 2025 at 2:04 PM
Reposted by Yi-Hao Peng
If you've been trying to figure out DSPy - the automatic prompt optimization system - this talk by @dbreunig.bsky.social is the clearest explanation I've seen yet, with a very useful real-world case study www.youtube.com/watch?v=I9Zt...

My notes here: simonwillison.net/2025/Oct/4/d...
Let the LLM Write the Prompts: An Intro to DSPy in Compound AI Pipelines
YouTube video by Databricks
www.youtube.com
October 4, 2025 at 11:05 PM
Reposted by Yi-Hao Peng
Announcing a broad expansion of the National Deep Inference Fabric.

This could be relevant to your research...
September 26, 2025 at 6:47 PM
Reposted by Yi-Hao Peng
New technical post from Thinking Machines on optimizers, but this is the main catch: conditional learning rates per layer.

thinkingmachines.ai/blog/modular...
September 26, 2025 at 6:00 PM
Reposted by Yi-Hao Peng
Tendency: render non-visual data/parameters into images fed to neural networks.

Joint angles: GENIMA genima-robot.github.io
Trajectories: VINT arxiv.org/abs/2306.14846
Object pose: MFOS arxiv.org/abs/2310.01897
Waypoints: PIVOT arxiv.org/abs/2402.07872

(Repost+update of a tweet from 2024)
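The common pattern across these works can be sketched as rasterizing non-visual signals into an image grid (my toy example, not any of the listed papers' code; names and sizes are illustrative):

```python
def render_trajectory(points, size=16):
    """Rasterize 2D waypoints (coordinates in [0,1]^2) into a binary image.

    Once rendered, a trajectory can be fed to a vision network like any
    other picture, which is the trick shared by GENIMA/VINT/MFOS/PIVOT.
    """
    img = [[0] * size for _ in range(size)]
    for x, y in points:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        img[row][col] = 1
    return img


# Usage: two waypoints at opposite corners light up two pixels.
img = render_trajectory([(0.0, 0.0), (0.99, 0.99)], size=4)
```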
September 22, 2025 at 2:07 PM
Reposted by Yi-Hao Peng
Large Vision Models Can Solve Mental Rotation Problems

Sebastian Ray Mason, Anders Gjølbye, Phillip Chavarria Højbjerg, Lenka Tětková, Lars Kai Hansen

tl;dr: DINOv3, CLIP know 3D geometry, but only in middle layers. MAE bad, ImageNet bad, cls token bad.
arxiv.org/abs/2509.15271
September 22, 2025 at 10:19 AM
Reposted by Yi-Hao Peng
Why does AI sometimes fail to generalize, and what might help? In a new paper (arxiv.org/abs/2509.16189), we highlight the latent learning gap — which unifies findings from language modeling to agent navigation — and suggest that episodic memory complements parametric learning to bridge it. Thread:
Latent learning: episodic memory complements parametric learning by enabling flexible reuse of experiences
When do machine learning systems fail to generalize, and what mechanisms could improve their generalization? Here, we draw inspiration from cognitive science to argue that one weakness of machine lear...
arxiv.org
September 22, 2025 at 4:21 AM
Reposted by Yi-Hao Peng
Explaining how to get the most out of CLI agents and why they're so important for understanding today's and tomorrow's AI progress. They're showing the new fundamentals of agents and how frontier labs will hill-climb on open-ended tasks.
buff.ly/jAAhHnQ
Coding as the epicenter of AI progress and the path to general agents
GPT-5-Codex, adoption, denial, peak performance, and everyday gains.
www.interconnects.ai
September 18, 2025 at 3:28 PM
Reposted by Yi-Hao Peng
Towards the Next Generation of 3D Reconstruction

@parskatt.bsky.social PhD Thesis.

tl;dr: would be useful in teaching image matching - nice explanations. (too) Fancy and stylish notation. Cool Ack section and cover image.

liu.diva-portal.org/smash/record...
September 18, 2025 at 6:25 AM
Reposted by Yi-Hao Peng
Can we use video diffusion to generate 3D scenes?

𝐖𝐨𝐫𝐥𝐝𝐄𝐱𝐩𝐥𝐨𝐫𝐞𝐫 (#SIGGRAPHAsia25) creates fully-navigable scenes via autoregressive video generation.

Text input -> 3DGS scene output & interactive rendering!

🌍http://mschneider456.github.io/world-explorer/
📽️https://youtu.be/N6NJsNyiv6I
September 17, 2025 at 12:08 PM
Reposted by Yi-Hao Peng
I finally got around to making a tool to compare completions from SFT vs. RLHF trained models. This is a mini site for the RLHF book that I've wanted for a while.

buff.ly/lqDL5wa

It's always been hard to say what RLHF does to a model within a more complex post-training pipeline.
September 17, 2025 at 2:20 PM
Reposted by Yi-Hao Peng
Vision-Language-Action models are the foundation of a new wave of generalist robots: networks that take in images from robot cameras (vision) and instructions (language) and produce robot trajectories. We are seeing a remarkable convergence in how these work; more: open.substack.com/pub/itcanthi...
Vision-Language-Action Models and the Search for a Generalist Robot Policy
VLAs are general-purpose robotics models. But how are VLAs doing in the real world, and which ones are people using?
open.substack.com
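The vision + language -> trajectory contract described above can be written down schematically (purely illustrative type names, not any particular model's API):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """What a VLA consumes: camera frames plus a language instruction."""
    images: List[list]   # raw camera frames (placeholder representation)
    instruction: str     # e.g. "pick up the cup"


def toy_policy(obs: Observation, horizon: int = 4) -> List[List[float]]:
    """Placeholder action head: emits a short trajectory of 7-DoF action
    vectors (e.g. end-effector deltas plus a gripper command)."""
    return [[0.0] * 7 for _ in range(horizon)]


# Usage: one observation in, a 4-step trajectory out.
traj = toy_policy(Observation(images=[], instruction="pick up the cup"))
```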
August 27, 2025 at 11:42 PM
Reposted by Yi-Hao Peng
I'll make a longer post at some point but the tl;dr is:

We take a simple (but underappreciated) closed-form solution for estimating H, and replace the homography part with whatever distortion model we have.

This gives us simpler and faster solutions across the board. Super cool work imo!
Radially Distorted Homographies, Revisited

Mårten Wadenbäck, Marcus Valtonen Örnhag, @parskatt.bsky.social

tl;dr: minimal solvers for one-sided/two-sided equal/two-sided independent radial distortion homography

arxiv.org/abs/2508.21190
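For reference, the undistorted baseline being generalized here is the classic closed-form (DLT-style) homography estimate from four correspondences (a self-contained sketch with h33 fixed to 1, not the paper's solver):

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def estimate_homography(src, dst):
    """Closed-form H from four point pairs (x,y) -> (u,v).

    With w = h7*x + h8*y + 1, each pair gives two linear equations:
      h1*x + h2*y + h3 - u*h7*x - u*h8*y = u
      h4*x + h5*y + h6 - v*h7*x - v*h8*y = v
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve_linear(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]
```

Swapping a radial distortion model into the projection step, as the paper does, changes the rows of this linear system but keeps the same closed-form flavor.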
September 1, 2025 at 9:59 AM
Reposted by Yi-Hao Peng
Our work on Detecting Suspense in Stories will appear at COLM'25 arxiv.org/abs/2508.15794

Do LLMs know when stories are suspenseful?

We ran LLMs through a bunch of classical psychology suspense studies

The answer: kinda, sorta? Honestly, better than I expected. But the results are nuanced...

1/
August 25, 2025 at 7:49 PM
Reposted by Yi-Hao Peng
🚨 New preprint!
How far can we go with ImageNet for Text-to-Image generation? w. @arrijitghosh.bsky.social @lucasdegeorge.bsky.social @nicolasdufour.bsky.social @vickykalogeiton.bsky.social
TL;DR: Train a text-to-image model using 1000x less data in 200 GPU hrs!

📜https://arxiv.org/abs/2502.21318
🧵👇
March 3, 2025 at 10:19 AM