Oncel Tuzel
onceltuzel.bsky.social
AI researcher at Apple
Reposted by Oncel Tuzel
For more, check out our paper on arXiv: arxiv.org/abs/2412.13303

With the amazing people: @pavankumarvasu.bsky.social, Fartash Faghri, Chun-Liang Li, Hadi Pouransari, Nate True, Albert Antony, Gokul Santhanam, James Gabriel, Peter Grasch, and @onceltuzel.bsky.social
FastVLM: Efficient Vision Encoding for Vision Language Models
Scaling the input image resolution is essential for enhancing the performance of Vision Language Models (VLMs), particularly in text-rich image understanding tasks. However, popular visual encoders su...
December 19, 2024 at 7:22 PM