Adhiraj Ghosh@ACL2025
@adhirajghosh.bsky.social
ELLIS PhD, University of Tübingen | Data-centric Vision and Language @bethgelab.bsky.social
Website: adhirajghosh.github.io
Twitter: https://x.com/adhiraj_ghosh98
Pinned
Excited to be in Vienna for #ACL2025 🇦🇹! You'll find @dziadzio.bsky.social and me by our ONEBench poster, so do drop by!
🗓️Wed, July 30, 11-12:30 CET
📍Hall 4/5
I’m also excited to talk about lifelong and personalised benchmarking, data curation and vision-language in general! Let’s connect!
July 27, 2025 at 10:26 PM
Stumbled upon this blog post recently and found some very useful tips for improving the Bluesky experience. It seemed almost tailored to me: I don't live in the USA, and the politics there don't affect me personally. Settings -> Moderation -> Muted Words & Tags cleaned up my feed - strongly recommend!
I wrote something up for AI people who want to get into bluesky and either couldn't assemble an exciting feed or gave up doomscrolling when their Following feed switched to talking politics 24/7.
The AI Researcher's Guide to a Non-Boring Bluesky Feed | Naomi Saphra
How to migrate to bsky without a boring feed.
nsaphra.net
June 25, 2025 at 4:14 PM
Reposted by Adhiraj Ghosh@ACL2025
Why More Researchers Should be Content Creators
Just trying something new! I recorded one of my recent talks, sharing what I learned from starting as a small content creator.
youtu.be/0W_7tJtGcMI
We all benefit when there are more content creators!
June 24, 2025 at 9:58 PM
Reposted by Adhiraj Ghosh@ACL2025
🏆ONEBench accepted to ACL main! ✨
Stay tuned for the official leaderboard and real-time personalised benchmarking release!
If you’re attending ACL or are generally interested in the future of foundation model benchmarking, happy to talk!
#ACL2025NLP #ACL2025
@aclmeeting.bsky.social
🚨Looking to test your foundation model on an arbitrary and open-ended set of capabilities, not explicitly captured by static benchmarks? 🚨
Check out ✨ONEBench✨, where we show how sample-level evaluation is the solution.
🔎 arxiv.org/abs/2412.06745
May 17, 2025 at 7:53 PM
Reposted by Adhiraj Ghosh@ACL2025
🧠 Keeping LLMs factually up to date is a common motivation for knowledge editing.
But what would it actually take to support this in practice at the scale and speed the real world demands?
We explore this question and really push the limits of lifelong knowledge editing in the wild.
👇
April 8, 2025 at 3:32 PM
Reposted by Adhiraj Ghosh@ACL2025
Check out our newest paper!
As always, it was super fun working on this with @prasannamayil.bsky.social
New preprint out! 🎉
How does LLM training loss translate to downstream performance?
We show that pretraining data and tokenizer shape loss-to-loss scaling, while architecture and other factors play a surprisingly minor role!
brendel-group.github.io/llm-line/ 🧵1/8
February 18, 2025 at 2:12 PM
Reposted by Adhiraj Ghosh@ACL2025
🚨Great Models Think Alike and this Undermines AI Oversight🚨
New paper quantifies LM similarity
(1) LLM-as-a-judge favors more similar models🤥
(2) Complementary knowledge benefits Weak-to-Strong Generalization☯️
(3) More capable models have more correlated failures 📈🙀
🧵👇
February 7, 2025 at 9:12 PM
Godsend
I started a blog! First post is everything I know about setting up (fast, reproducible, error-proof) Python project environments using the latest tools. These methods have saved me a lot of grief. Also a short guide to CUDA in appendix :)
blog.apoorvkh.com/posts/projec...
Managing Project Dependencies
blog.apoorvkh.com
February 7, 2025 at 4:38 PM
Reposted by Adhiraj Ghosh@ACL2025
Fuck it, today we're open-sourcing the codebase used to train SmolVLM from scratch on 256 H100s 🔥
Inspired by our team's effort to open-source DeepSeek's R1, we are releasing the training and evaluation code on top of the weights 🫡
Now you can train any SmolVLM—or create your own custom VLMs!
January 31, 2025 at 3:06 PM
Reposted by Adhiraj Ghosh@ACL2025
NLI Improves Compositionality in Vision-Language Models is accepted to #ICLR2025!
CECE enables interpretability and achieves significant improvements on hard compositional benchmarks (e.g., Winoground, EqBen) without fine-tuning, as well as on alignment benchmarks (e.g., DrawBench, EditBench). + info: cece-vlm.github.io
January 23, 2025 at 6:34 PM
Reposted by Adhiraj Ghosh@ACL2025
📄 New Paper: "How to Merge Your Multimodal Models Over Time?"
arxiv.org/abs/2412.06712
Model merging assumes all finetuned models are available at once. But what if they need to be created over time?
We study Temporal Model Merging through the TIME framework to find out!
🧵
December 11, 2024 at 6:00 PM
Reposted by Adhiraj Ghosh@ACL2025
How do we benchmark the vast capabilities of foundation models? Introducing ONEBench – a unifying benchmark to test them all, led by
@adhirajghosh.bsky.social and
@dziadzio.bsky.social!⬇️
Sample-level benchmarks could be the new generation: reusable, recombinable, and able to evaluate lots of capabilities!
🚨Looking to test your foundation model on an arbitrary and open-ended set of capabilities, not explicitly captured by static benchmarks? 🚨
Check out ✨ONEBench✨, where we show how sample-level evaluation is the solution.
🔎 arxiv.org/abs/2412.06745
December 10, 2024 at 6:39 PM
🚨Looking to test your foundation model on an arbitrary and open-ended set of capabilities, not explicitly captured by static benchmarks? 🚨
Check out ✨ONEBench✨, where we show how sample-level evaluation is the solution.
🔎 arxiv.org/abs/2412.06745
December 10, 2024 at 5:44 PM
Reposted by Adhiraj Ghosh@ACL2025
🚀New Paper: Active Data Curation Effectively Distills Multimodal Models
arxiv.org/abs/2411.18674
Smol models are all the rage these days & knowledge distillation (KD) is key for model compression!
We show how data curation can effectively distill to yield SoTA FLOP-efficient {C/Sig}LIPs!!
🧵👇
December 2, 2024 at 5:59 PM
Excited to test it out, could be a blessing for large-scale projects!
Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and token throughput.
SmolVLM can be fine-tuned in a Google Colab and run on a laptop! Or process millions of documents with a consumer GPU!
November 26, 2024 at 4:59 PM
I've found starter packs on NLP, vision, graphics, etc. But personally, I would love to know and hear from researchers working on vision-language. So, let me know if you'd like to join this starter pack, would be happy to add!
go.bsky.app/TENRRBb
November 19, 2024 at 9:56 PM