We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) cannot uniquely refer to the referent, (2) include excessive or irrelevant information, and (3) misalign with human pragmatic preferences.
We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) cannot uniquely refer to the referent, (2) include excessive or irrelevant information, and (3) misalign with human pragmatic preferences.