Roihn Run Peng 彭润
roihn.bsky.social
Ph.D. Student @SLED_AI; MSCS & BSE @Umich🎓; Working on #RL & #NLP oriented #EmbodiedAI 🤖
Reposted by Roihn Run Peng 彭润
Vision-Language Models are not yet pragmatically optimal.

We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: they (1) fail to uniquely identify the referent, (2) include excessive or irrelevant information, and (3) misalign with human pragmatic preferences.
April 23, 2025 at 5:55 PM