Danny To Eun Kim
teknology.bsky.social
PhD student @CMU LTI
NLP | IR | Evaluation | RAG
https://kimdanny.github.io
Excited to present at the #CLEF2025 #Touché Lab (Session 2) on the shared task "Advertisement in RAG" 🇪🇸!
@webis.de
🗓️Sept 9 (Tue)
⏲️5:20PM (CEST) / 11:20AM (EDT)
📍Florentino Sanz Room
🧠https://arxiv.org/abs/2507.00509
Join us for insights on #RAG + advertising!
September 9, 2025 at 12:02 AM
Reposted by Danny To Eun Kim
Some exciting news! 🤗 After 3 amazing years at TREC, the Tip-of-the-Tongue (ToT) shared task will be a core task at NTCIR-19 in 2026. The new track will focus on tip-of-the-tongue information needs in English and East Asian languages.

More details coming soon. See you all in Tokyo next year!
September 1, 2025 at 4:12 PM
Reposted by Danny To Eun Kim
Gentle reminder 📢
All run submissions for the Tip-of-the-Tongue (ToT) Track are due Wednesday of next week (Aug 27).

More info: trec-tot.github.io/guidelines
#TREC2025 #TRECToT #TREC2025ToT
August 19, 2025 at 4:45 PM
This year's TREC Tip of the Tongue (ToT) track will be amazing! Based on our rigorous experiments on synthetic ToT query generation presented at #SIGIR2025, we extended the track to open-domain ToT queries.
We provide code for baseline systems, and submissions are due by August 27th!
Important announcement: All run submissions for TREC'25 Tip-of-the-Tongue (TREC-ToT) Track are due by **August 27th**. The run submission form is now open. Please submit your runs before the deadline.

More information: trec-tot.github.io/guidelines
#TREC2025 #TRECToT #TREC2025ToT

Spread the word!
August 4, 2025 at 5:52 PM
Reposted by Danny To Eun Kim
To Eun Kim just presented the work on "Tip of the Tongue Query Elicitation for Simulated Evaluation" at #SIGIR2025. The approach will be used in the #TREC2025 Tip-of-the-Tongue track, and we had some sweets at the poster :)

The paper is available online: dl.acm.org/doi/10.1145/...
July 15, 2025 at 2:30 PM
Reposted by Danny To Eun Kim
Hello TREC-ToTers!

We have released the test queries for the TREC 2025 Tip-of-the-Tongue (TREC-ToT) Track. Please see the guidelines for more information: trec-tot.github.io/guidelines. The run submission deadline will tentatively be in August. #TREC2025 #TRECToT #TREC2025ToT

Please spread the word!
July 13, 2025 at 4:47 PM
❓How do LLMs respond to fair ranking in RAG?
🤩 See how fair ranking boosts downstream utility while promoting fairer attribution of cited sources.
Catch our oral presentation at #ICTIR2025!
#SIGIR2025 @841io.bsky.social
July 12, 2025 at 1:32 PM
Reposted by Danny To Eun Kim
Do not forget to participate in the #TREC2025 Tip-of-the-Tongue (ToT) Track :)

The corpus and baselines (with run files) are now available and easily accessible via the ir_datasets API and the HuggingFace Datasets API.

More details are available at: trec-tot.github.io/guidelines
June 27, 2025 at 2:46 PM
Reposted by Danny To Eun Kim
🖋️ Curious how writing differs across (research) cultures?
🚩 Tired of “cultural” evals that don't consult people?

We engaged with interdisciplinary researchers to identify & measure ✨cultural norms✨in scientific writing, and show that❗LLMs flatten them❗

📜 arxiv.org/abs/2506.00784

[1/11]
June 9, 2025 at 11:30 PM
Reposted by Danny To Eun Kim
Hello TREC-ToTers! 👋🏽

Excited to announce the release of the TREC 2025 Tip-of-the-Tongue (TREC-ToT) Track guidelines: trec-tot.github.io/guidelines. We will release test queries in July, and the run submission deadline will be in August. #TREC2025 #TRECToT #TREC2025ToT

Please register to participate:
TREC 2025 Tip-of-the-Tongue (ToT) Track
Tip of the tongue: The phenomenon of failing to retrieve something from memory, combined with partial recall and the feeling that retrieval is imminent.
trec-tot.github.io
May 9, 2025 at 9:02 PM
Reposted by Danny To Eun Kim
Ever trusted a metric that works great on average, only for it to fail in your specific use case?

In our #NAACL2025 paper (w/ @841io.bsky.social), we show why global evaluations are not enough and why context matters more than you think.

📄 aclanthology.org/2025.finding...
#NLP #Evaluation

(🧵1/9)
April 29, 2025 at 5:10 PM
Reposted by Danny To Eun Kim
If you're interested in OpenAI including shopping results, you might also be interested in @teknology.bsky.social's paper relating retrieval diversity/fairness and generation by downstream RAG models. This has implications for individuals selling products online.
arxiv.org/abs/2409.11598
Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
Modern language models frequently include retrieval components to improve their outputs, giving rise to a growing number of retrieval-augmented generation (RAG) systems. Yet, most existing work in RAG...
arxiv.org
April 28, 2025 at 7:34 PM
Reposted by Danny To Eun Kim
If you're working on a recall-oriented task or with ranking systems evaluated across varied users, content, or intents, check it out. 5/5

dl.acm.org/doi/10.1145/...
April 7, 2025 at 4:15 PM
Reposted by Danny To Eun Kim
📢 New Paper: "Recall, Robustness, and Lexicographic Evaluation" (ACM TORS)
F Diaz, M Ekstrand (@md.ekstrandom.net), B Mitra (@bmitra.bsky.social)

For IR, NLP, and ML researchers working on ranking systems evaluated for recall and robustness. 🧵 1/5 dl.acm.org/doi/10.1145/...
April 7, 2025 at 4:15 PM
🚨New Breakthrough in Tip-of-the-Tongue (TOT) Retrieval Research!

We address data limitations and offer a fresh evaluation method for these complex queries.

Curious how TREC TOT track test queries are created? Check out this thread 🧵 and our paper 📄: arxiv.org/abs/2502.17776
Tip of the Tongue Query Elicitation for Simulated Evaluation
Tip-of-the-tongue (TOT) search occurs when a user struggles to recall a specific identifier, such as a document title. While common, existing search systems often fail to effectively support TOT scena...
arxiv.org
March 5, 2025 at 1:32 AM
Reposted by Danny To Eun Kim
Did you know? Gestures used to express universal concepts—like wishing for luck—vary DRAMATICALLY across cultures?
🤞 means luck in the US but is deeply offensive in Vietnam 🚨

📣 We introduce MC-SIGNS, a test bed to evaluate how LLMs/VLMs/T2I handle such nonverbal behavior!

📜: arxiv.org/abs/2502.17710
February 26, 2025 at 4:23 PM
Heading to #NeurIPS2024 to present our ‘Fair RAG’ paper at the #AFME2024 workshop! Let's talk about RAG, Information Retrieval, and Fairness. Honored that our paper was selected as one of the Top 5 Spotlight Papers! 🎉 Let’s connect and chat!
Paper: arxiv.org/abs/2409.11598
Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
Many language models now enhance their responses with retrieval capabilities, leading to the widespread adoption of retrieval-augmented generation (RAG) systems. However, despite retrieval being a cor...
arxiv.org
December 9, 2024 at 9:19 PM
Reposted by Danny To Eun Kim
Slides are up! I presented on "Presentation & Consumption in the context of REML"

The full deck is here. There are a lot of gems if you're interested in this space!

retrieval-enhanced-ml.github.io/sigir-ap2024...
December 9, 2024 at 7:14 AM
Those who are attending #SIGIRAP2024, come by and learn how retrieval can enhance ML models!
Today we'll be presenting the Tutorial on Retrieval-Enhanced Machine Learning (REML). Come by to learn about the emerging design patterns in this space and see how to use retrieval beyond RAG.

In collaboration w/ the amazing @841io.bsky.social @teknology.bsky.social Alireza Salemi and Hamed Zamani.
December 9, 2024 at 1:51 AM
Reposted by Danny To Eun Kim
Creating a 🦋 starter pack for people working in IR/RAG: go.bsky.app/88ULgwY

I can’t seem to find everyone though, help definitely appreciated to fill this out (DM or comment)!
November 23, 2024 at 9:19 PM
Reposted by Danny To Eun Kim
Mat is not on 🦋—posting on his behalf!

It's time to revisit common assumptions in IR! Embeddings have improved drastically, but mainstream IR evals have stagnated since MSMARCO + BEIR.

We ask: on private or tricky IR tasks, are rerankers better? Surely, reranking many docs is best?
November 20, 2024 at 7:47 PM
Reposted by Danny To Eun Kim
Time for a starter pack on information retrieval: go.bsky.app/MXPJoTn
November 14, 2024 at 8:57 PM
Reposted by Danny To Eun Kim
Hey all! I started a second starter pack with people who didn't make the first one, please let me know if you'd like to be added:

go.bsky.app/JgneRQk
November 13, 2024 at 12:15 AM
Reposted by Danny To Eun Kim
I'm keeping track of people at the CMU Language Technologies Institute here: go.bsky.app/NhTwCVb. Follow along!
November 12, 2024 at 2:54 PM