Lightnews — Scholar-powered news

Koustuv Sinha

@koustuvsinha.com

310 followers 430 following 17 posts

🔬Research Scientist, Meta AI (FAIR).

🎓PhD from McGill University + Mila

🙇‍♂️I study Multimodal LLMs, Vision-Language Alignment, LLM Interpretability & I’m passionate about ML Reproducibility (@reproml.org)

🌎https://koustuvsinha.com/

Pinned

Koustuv Sinha @koustuvsinha.com · Dec 13

🚨 We are pleased to announce the first in-person event for the Machine Learning Reproducibility Challenge, MLRC 2025! Save the date: August 21st, 2025 at Princeton!

MLRC 2025 - The ML Reproducibility Challenge @reproml.org · Dec 13

We are excited to announce MLRC 2025, the eighth iteration of MLRC, which will also be its first in-person edition @ Princeton University, NJ, USA on August 21st, 2025 reproml.org/blog/announc... Thanks to Princeton AI Lab & Arvind Narayanan @randomwalker.bsky.social for facilitating the venue!

Announcing MLRC 2025, our first in-person conference | MLRC 2025

We are excited to announce the 8th iteration of the Machine Learning Reproducibility Challenge, MLRC 2025, which will also be the first, in-person conference, hosted at Princeton University, New Jerse...

reproml.org

Reposted by Koustuv Sinha

Adina Williams

@adinawilliams.bsky.social

Our team is hiring a postdoc in (mechanistic) interpretability! The ideal candidate will have research experience in interpretability for text and/or image generation models and be excited about open science!

Please consider applying or sharing with colleagues: metacareers.com/jobs/2223953961352324

July 15, 2025 at 8:11 PM

Reposted by Koustuv Sinha

Benno Krojer

@bennokrojer.bsky.social

Excited to share the results of my recent internship!

We ask 🤔
What subtle shortcuts are VideoLLMs taking on spatio-temporal questions?

And how can we instead curate shortcut-robust examples at a large-scale?

We release: MVPBench

Details 👇🔬

June 13, 2025 at 2:47 PM

Reposted by Koustuv Sinha

Alexander Doria

@dorialexander.bsky.social

The HuggingFace/Nanotron team just shipped an entire pretraining textbook in interactive format. huggingface.co/spaces/nanot...

It’s not just a great pedagogical resource: it also presents a lot of unprecedented data and experiments for the first time in a systematic way.

February 19, 2025 at 7:13 PM

Reposted by Koustuv Sinha

Kaitlyn Zhou

@kaitlynzhou.bsky.social

Excited to have two papers at #NAACL2025!
The first reveals how human over-reliance can be exacerbated by LLM friendliness. The second presents a novel computational method for concept tracing. Check them out!

arxiv.org/pdf/2407.07950

arxiv.org/pdf/2502.05704

February 19, 2025 at 9:58 PM

Reposted by Koustuv Sinha

fairseq2

@fairseq2.bsky.social

👋 Hello world! We’re thrilled to announce the v0.4 release of fairseq2 — an open-source library from FAIR powering many projects at Meta. pip install fairseq2 and explore our trainer API, instruction & preference finetuning (up to 70B), and native vLLM integration.

February 12, 2025 at 12:31 PM

Reposted by Koustuv Sinha

Anna Rogers

@annarogers.bsky.social

I am shocked by the death of Felix Hill. He was one of the brightest minds of my generation.

His last blog post on the stress of working in AI is very poignant. Apart from the emptiness of working mostly to make billionaires even richer, there's the intellectual emptiness of 'scale is all you need'

Guillaume Bellec @bellecguill.bsky.social · Jan 4

The blog post of the late Felix Hill is powerful. Stress for AI researchers today is real.

I did not know Felix Hill and I am sorry for those who did.
This story is perhaps a reminder for students, postdocs, founders and researchers to take care of their well being.

medium.com/@felixhill/2...

200bn Weights of Responsibility

The Stress of Working in Modern AI

medium.com

January 14, 2025 at 12:41 PM

Koustuv Sinha

@koustuvsinha.com

We posted our paper on arxiv recently, sharing this here too: arxiv.org/abs/2412.141... - work led by our amazing intern Peter Tong. Key findings:

- LLMs can be trained to generate visual embeddings!!
- VQA data appears to help a lot in generation!
- Better understanding = better generation!

December 26, 2024 at 8:01 PM

Koustuv Sinha

@koustuvsinha.com

🚨 We are pleased to announce the first in-person event for the Machine Learning Reproducibility Challenge, MLRC 2025! Save the date: August 21st, 2025 at Princeton!

MLRC 2025 - The ML Reproducibility Challenge @reproml.org · Dec 13

Announcing MLRC 2025, our first in-person conference | MLRC 2025

reproml.org

December 13, 2024 at 7:06 PM

Reposted by Koustuv Sinha

Adina Williams

@adinawilliams.bsky.social

Our paper PRISM alignment won a best paper award at #neurips2024!

All credits to @hannahrosekirk.bsky.social A.Whitefield, P.Röttger, A.M.Bean, K.Margatina, R.Mosquera-Gomez, J.Ciro, @maxbartolo.bsky.social H.He, B.Vidgen, S.Hale

Catch Hannah tomorrow at neurips.cc/virtual/2024/poster/97804

blog.neurips

December 11, 2024 at 4:20 PM

Koustuv Sinha

@koustuvsinha.com

Check out the MLRC 2023 posters at #NeurIPS 2024 this week: reproml.org/proceedings/ - do drop by these posters and say hi!

Online Proceedings | MLRC

Machine Learning Reproducibility Challenge

reproml.org

December 10, 2024 at 4:15 PM

Reposted by Koustuv Sinha

Andrei Bursuc

@abursuc.bsky.social

The return of the Autoregressive Image Model: AIMv2 now going multimodal.
Excellent work by @alaaelnouby.bsky.social & team with code and checkpoints already up:

arxiv.org/abs/2411.14402

November 22, 2024 at 9:44 AM

Reposted by Koustuv Sinha

Michael J. Black

@michael-j-black.bsky.social

For those who missed this post on the-network-that-is-not-to-be-named, I made public my "secrets" for writing a good CVPR paper (or any scientific paper). I've compiled these tips over many years. It's long but hopefully it helps people write better papers. perceiving-systems.blog/en/post/writ...

Writing a good scientific paper

perceiving-systems.blog

November 20, 2024 at 10:18 AM

Reposted by Koustuv Sinha

Laura

@lauraruis.bsky.social

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this:

Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢

🧵⬇️

November 20, 2024 at 4:35 PM

Koustuv Sinha

@koustuvsinha.com

When I first read this paper, I instinctively scoffed at the idea. But the more I look at the empirical results, the more I’m convinced this paper highlights something fundamentally amazing. Lots of exciting research in this direction will come very soon!

arxiv.org/abs/2405.07987

November 20, 2024 at 12:29 AM