🌐 https://anhnguyen.me/
🐦 https://x.com/anh_ng8
✨ An Võ + Khải Nguyên Nguyễn ✨
w/ countless assists from Mohammad Taesiri, Tường Vy Đặng & Prof. Daeyoung Kim.
Code & data: vlmsarebiased.github.io
Paper: arxiv.org/abs/2505.23941
inspired by vlmsareblind.github.io
Thank you for any feedback 🙏
8/8
Q: Count the circles in cell C3.
🤖: 3 ❌
VLMs are only ~22% accurate and biased towards the surrounding cells.
7/8
But here we modify the Ebbinghaus pattern so that the two inner circles clearly differ in size. And...
o3: equal ❌
Sonnet 3.7: equal ❌
6/8
🟧 % of predictable, biased answers by VLMs.
5/8
4/8
e.g., when:
- an extra leg is added to 4-legged animals
- an extra stripe is added to the 3-striped Adidas logo
3/8
Hard to believe? 😅
Image to try yourself: 👇 http://s.anhnguyen.me/250602__zebra_original_image.png
More examples: github.com/anvo25/vlms-...
2/8