Benjamin Feuer
benjaminfeuer.bsky.social
PhD researcher at NYU, working on LLMs, VLMs, and tabular foundation models from a data-centric perspective. Father of two, NYC diehard.
June 18, 2025 at 2:27 PM
Check out:

Our Website: dcvlr-neurips.github.io

Our Starter Kit (Curate, Train, Eval): github.com/oumi-ai/oumi...

🧵 6 / n
DCVLR: Data Curation for Vision Language Reasoning - NeurIPS 2025 Competition
Join the DCVLR NeurIPS 2025 Competition. Advance visual reasoning in VLMs through data curation.
dcvlr-neurips.github.io
June 18, 2025 at 2:22 PM
* A submission = a curated reasoning dataset on @huggingface with 1k or 10k samples and a scalable, reproducible curation strategy you document in a write-up
* You don’t need to train a model
* You can submit with nothing more than a free Colab or Kaggle account for basic testing

🧵 5 / n
June 18, 2025 at 2:22 PM
💪anyone can compete for free 💪: Thanks to our sponsor @LambdaAPI we offer three free submissions for up to 500 teams. This is unprecedented in data-centric research, which tends to be very expensive because you have to train lots of models!

🧵 4 / n
June 18, 2025 at 2:21 PM
🤖 open-models 🤖: every model we present results for will have open weights, and one of those models will be Molmo-O from @allen_ai (a recent best paper honorable mention from @cvpr at #CVPR2025), trained on open data.

🧵 3 / n
June 18, 2025 at 2:20 PM
DCVLR is data-centric: we train an ~7B VLM on your dataset. The best performer (on benchmarks like MathVista, VMCBench and LiveXiv) will be eligible to win $1500 and a talk at #NeurIPS2025!

We also have a few twists compared to prior data-centric competitions –

🧵 2 / n
June 18, 2025 at 2:20 PM
Co-organizing with wonderful collaborators from MIT, NYU, Stanford and UW: @thaottn.bsky.social, @sewoong79.bsky.social, @sarameghanbeery.bsky.social, @yuhuiz.bsky.social!
May 1, 2025 at 5:04 PM
We are excited to be sponsored by @datologyai.com, who will be providing prizes for best paper awards 🏆
May 1, 2025 at 5:02 PM
🚀We welcome any submission that discusses domain-specific data curation pipelines and/or generalizable curation principles, getting us closer to building data-centric methods that are robust, efficient, and adaptable across domains.

Refer to our website for the call for papers!
May 1, 2025 at 5:02 PM
That's not what they did. They used GPT-4o for program synthesis, which is fundamentally different from asking the LLM to provide the correct response in the prompt.
December 22, 2024 at 11:06 AM
Thanks for sharing! FWIW, I sensed mostly optimism and excitement at NeurIPS -- the people I spoke to were eager to talk about their research and learn about mine. Let's meet up in the new year and compare notes @kyunghyuncho.bsky.social
December 22, 2024 at 11:02 AM
That does seem like a sound rule! Although, interestingly, they did not apply it to me. 😅
December 14, 2024 at 3:40 PM
baskargroup.github.io
December 8, 2024 at 9:16 PM
Statistics in LLMs - Schedule
Saturday, December 14th, 2024
sites.google.com
December 8, 2024 at 9:16 PM
This book helped me learn how to understand ideological and inconsistent intellectual stances (a bit)

www.amazon.com/Righteous-Mi...
The Righteous Mind: Why Good People Are Divided by Politics and Religion
www.amazon.com
November 28, 2024 at 2:00 PM