Gorka Azkune
gazkune.bsky.social
Associate professor at the University of the Basque Country (UPV/EHU) and researcher at HiTZ Zentroa. Mainly interested in multimodal and multilingual deep learning.
Reposted by Gorka Azkune
Key takeaway: Adding simple structure at inference-time, through image crops and text segments, is a powerful, training-free way to improve Vision-Language Compositionality performance.

Joint work with @Ander Salaberria @eagirre.bsky.social @gazkune.bsky.social @hitz-zentroa.bsky.social
June 18, 2025 at 11:28 AM
Reposted by Gorka Azkune
Our analysis shows that:
1. There is room to improve the quality of extracted text segments.
2. Our method achieves significant performance gains in Winoground's non-trivial instances.
3. Isolated image crops can lose size and quantity information, leaving room for improvement.
June 18, 2025 at 11:28 AM
Reposted by Gorka Azkune
Why are image crops crucial? 🤔 We found that simply adding text segments isn't enough. The biggest performance gains come when text segments are paired with image crops, proving the power of serial image computing.
June 18, 2025 at 11:28 AM
Reposted by Gorka Azkune
Our approach is straightforward yet effective:
1. Divide the image into smaller crops.
2. Extract text segments capturing objects, attributes and relations.
3. Use the VLM to find image crops that best fit the text segments.
4. Aggregate matching similarities for the final score.
June 18, 2025 at 11:28 AM
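The four steps in the post above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the grid-based cropping, the helper names `grid_crops` and `aggregate_score`, the choice of taking the best crop per segment, and averaging over segments are all assumptions. The similarity values would come from a VLM (step 3) and text segments from a separate extractor (step 2); neither is modeled here.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def grid_crops(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Step 1 (assumed strategy): split the image area into a rows x cols grid."""
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be positive")
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            top = r * height // rows
            right = (c + 1) * width // cols
            bottom = (r + 1) * height // rows
            boxes.append((left, top, right, bottom))
    return boxes


def aggregate_score(sim: Sequence[Sequence[float]]) -> float:
    """Steps 3-4: sim[i][j] is the similarity of text segment i and crop j.

    Assumption: each segment is matched to its best-fitting crop (max),
    and the final score is the mean of those best matches.
    """
    if not sim or any(len(row) == 0 for row in sim):
        raise ValueError("need at least one segment and one crop")
    best_per_segment = [max(row) for row in sim]
    return sum(best_per_segment) / len(best_per_segment)


# Illustrative values only: 2 text segments scored against 2 crops.
toy_similarities = [[0.1, 0.9], [0.5, 0.3]]
caption_score = aggregate_score(toy_similarities)
```

To rank candidate captions for one image, the same score would be computed per caption and the highest one selected. Other aggregation choices (for example, a weighted sum that also includes the full image and full caption similarity) fit the same structure.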
Reposted by Gorka Azkune
While the experiments were not complicated, they required the collaboration of amazing co-authors, many compute hours, and of course, the impressive involvement of the Basque community, who manually assessed the models in an arena-style evaluation.

Thank you!
June 11, 2025 at 6:01 PM
Reposted by Gorka Azkune
In this work we face the challenge of developing instruct models for Basque, a low-resource language.

Continuing to pretrain base models is intuitive, but what about instructed models? We systematically analyze the different approaches to find the best solution.

2/3
June 11, 2025 at 6:01 PM
Reposted by Gorka Azkune
Remember that you have until February 17 to evaluate Basque chatbots and help with research:

ebaluatoia.hitz.eus

Come in and take part: besides being easy and fun, there are prizes too!
February 12, 2025 at 8:57 AM