Srishti
srishtiy.bsky.social
ELLIS PhD Fellow @belongielab.org | @aicentre.dk | University of Copenhagen | @amsterdamnlp.bsky.social | @ellis.eu

Multi-modal ML | Alignment | Culture | Evaluations & Safety | AI & Society

Web: https://www.srishti.dev/
We then propose 5 frameworks to evaluate cultures in VLMs:
1️⃣ Processual Grounding - who defines culture?
2️⃣ Material Culture - what is represented?
3️⃣ Symbolic Encoding - how is meaning layered?
4️⃣ Contextual Interpretation - who understands and frames meaning?
5️⃣ Temporality - when is culture situated?
June 2, 2025 at 10:36 AM
Modern Vision-Language Models (VLMs) often fail at cultural understanding. But culture isn’t just recognizing things like food, clothes, and rituals. It's how meaning is made and understood; it is also about symbolism, context, and how these things evolve over time.
June 2, 2025 at 10:36 AM
I am excited to announce our latest work 🎉 "Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory". We review recent works on culture in VLMs and argue for deeper grounding in cultural theory to enable more inclusive evaluations.

Paper 🔗: arxiv.org/pdf/2505.22793
June 2, 2025 at 10:36 AM
Thanks! In one result we do compare image-text frames, focusing on whether the highlighted text frames align with the image frames in the article - e.g. political in text often aligns with policy/public-op in images (the diagonal shows there is often an alignment).
Your work sounds interesting—keen to chat more :) !
April 7, 2025 at 6:26 PM