Yu Fan
yu-fan-768.bsky.social
Reposted by Yu Fan
Accepted to EMNLP (and more to come 👀)! The camera ready version is now online---very happy with how this turned out

arxiv.org/abs/2507.01234
September 24, 2025 at 3:21 PM
Reposted by Yu Fan
New preprint! Have you ever tried to cluster text embeddings from different sources, but the clusters just reproduce the sources? Or attempted to retrieve similar documents across multiple languages, and even multilingual embeddings return items in the same language?

Turns out there's an easy fix🧵
July 17, 2025 at 10:53 AM