https://ryokamoi.github.io/
Ryo Kamoi, Yusen Zhang, Sarkar Snigdha Sarathi Das, Ranran Haoran Zhang, Rui Zhang
Paper: arxiv.org/abs/2412.00947
Data: huggingface.co/collections/...
Code: github.com/psunlpgroup/...
We conclude that we need to improve both the training data and model architecture of LVLMs for better visual perception. [4/n]
Recent benchmarks for LVLMs often involve reasoning or knowledge, putting less focus on visual perception. In contrast, VisOnlyQA is designed to evaluate visual perception directly. [2/n]
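For readers who want to try the benchmark, here is a minimal sketch of loading the evaluation data with the Hugging Face `datasets` library. The dataset ID and split name below are hypothetical, since the collection link above is truncated; substitute the identifiers from the actual Hugging Face collection.

```python
from datasets import load_dataset

# Hypothetical dataset ID and split -- replace with the real values
# from the VisOnlyQA collection on Hugging Face.
dataset = load_dataset("psunlpgroup/VisOnlyQA", split="test")

# Each example pairs a figure image with a question that requires only
# visual perception, not external knowledge or multi-step reasoning.
example = dataset[0]
print(example.keys())
```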
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs (TACL 2024)
arxiv.org/abs/2406.01297