Koustuv Sinha
@koustuvsinha.com
🔬Research Scientist, Meta AI (FAIR).

🎓PhD from McGill University + Mila

🙇‍♂️I study Multimodal LLMs, Vision-Language Alignment, LLM Interpretability & I’m passionate about ML Reproducibility (@reproml.org)

🌎https://koustuvsinha.com/
Congrats, nice and refreshing papers, especially the word confusion idea! We need better similarity methods, good to see developments on this front! Curious whether the confusion similarity depends on the label set size of the classifier?
February 20, 2025 at 12:38 PM
Many many congratulations!! 🥳🎉🎉
February 11, 2025 at 1:40 AM
another factor that makes simple mlps work is visual token length. if you need shorter visual token sequences, you need a better mapper. these days most llms are capable of long context, which reduces the need to compress visual tokens.
February 2, 2025 at 5:58 AM
one hypothesis for why simple mappers work: 1. unfreezing the LLM provides enough parameters for mapping, 2. richer vision representations are closer to the llm's internal latent space arxiv.org/abs/2405.07987
February 2, 2025 at 5:58 AM
good questions! from what I see some folks still use complex mappers like Perceivers, but often a simple mlp works well enough. the variable that induces the biggest improvement is almost always the alignment data.
February 2, 2025 at 5:58 AM
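For readers curious what a "simple mlp" mapper means concretely, here is a minimal PyTorch sketch of a LLaVA-style projector; the dimensions, the frozen-encoder setup, and the MLPMapper name are illustrative assumptions, not any specific paper's implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch of a simple MLP mapper (LLaVA-style projector), assuming a
# ViT-style vision encoder with 1024-dim patch features and a 4096-dim LLM.
# All dimensions here are hypothetical.
class MLPMapper(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim) from a frozen encoder
        # returns visual tokens in the LLM's embedding space, one per patch
        return self.proj(patch_features)

# Unlike a Perceiver-style resampler, this keeps one visual token per patch,
# so the sequence is not compressed -- fine when the LLM handles long context.
vision_feats = torch.randn(2, 576, 1024)  # e.g. 24x24 patches from a ViT
visual_tokens = MLPMapper()(vision_feats)
print(visual_tokens.shape)  # torch.Size([2, 576, 4096])
```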
This is actually a cool result - token length serving as a rough heuristic for model confidence?
January 31, 2025 at 10:26 PM
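One hedged way to probe that hypothesis is to correlate generation length with mean per-token log-probability as a confidence proxy. The sketch below uses synthetic log-probs purely as placeholders; nothing here is real model output.

```python
import numpy as np

# Sketch: correlate answer length with mean per-token log-prob (a common
# confidence proxy). The log-probs below are synthetic placeholders.
rng = np.random.default_rng(0)
answers = [rng.normal(loc=-0.5, scale=0.3, size=rng.integers(5, 60))
           for _ in range(100)]  # hypothetical per-token log-probs

lengths = np.array([len(a) for a in answers])
confidence = np.array([a.mean() for a in answers])  # mean token log-prob

r = np.corrcoef(lengths, confidence)[0, 1]
print(f"corr(length, mean log-prob) = {r:.3f}")
```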
Lots of cool findings in our paper as well as on the website: tsb0601.github.io/metamorph/

Excited to see how the community "MetaMorph"'s existing LLMs!
December 26, 2024 at 8:02 PM
I wonder if veo-2 would be better at these prompts!
December 17, 2024 at 8:49 PM
Co-organized by @randomwalker.bsky.social @peterhenderson.bsky.social, @in4dmatics.bsky.social Naila Murray, @adinawilliams.bsky.social, Angela Fan, Mike Rabbat and Joelle Pineau. Check out our website for CFP and more details: reproml.org
December 13, 2024 at 7:06 PM
MLRC is now on 🦋 as well - do follow! :) @reproml.org
December 10, 2024 at 4:53 PM
Yes, that imo is one of the most exciting outcomes of this direction - learning a new modality with much less compute. We have some really nice results, can’t wait to share them with everyone, stay tuned!
November 21, 2024 at 5:47 AM
👋 hello! :)
November 20, 2024 at 9:52 PM