arxiv.org/abs/2502.19842
- Challenges in long-text, high-image-density tasks
- The image ordering constraint remains an unsolved challenge
Disentangling CLIP Features for Enhanced Localized Understanding (arxiv.org/abs/2502.02977)
arxiv.org/abs/2502.01906
By the way, do you say LVLM or Multimodal Large Language Models (MLLM)? I don't think there's a clear naming convention 🤷
- SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Models (arxiv.org/html/2501.18...)
- RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning (arxiv.org/html/2502.00...)
1. Defect Detection: Evaluating whether the retrieval results contain misinformation.
2. Utility Extraction: Generating correct answers even from defective inputs.
arxiv.org/abs/2501.18365
arxiv.org/abs/2501.15470
- MuKA: aclanthology.org/2025.coling-...
- ImageRef-VL: arxiv.org/abs/2501.12418
arxiv.org/abs/2412.07619
arxiv.org/abs/2412.04616
arxiv.org/abs/2412.05243
arxiv.org/abs/2412.00440
Also, the reason image augmentations are not used is probably this:
arxiv.org/abs/2405.187...
arxiv.org/pdf/2411.16752