This paper runs a simple test of how effective rerankers are on large numbers of documents. It's really important to think about if you use RAG a lot.
First, a disclaimer that I am writing this as an African who is a speaker of multiple African languages, NLP researcher of African languages, and HCI researcher focusing broadly on..
www.reuters.com/technology/a...
Featuring work w/ @williambrady.bsky.social @killianmcloughlin.bsky.social
🧵
Love love love this. Those who know me know that I am generally grumpy about evaluation metrics for LLMs and the culture of benchmark beating that has been building for a while now. More of this instead please.
A big issue with LLM reasoning is poisoning downstream tasks. In a reasoning chain, if one instruction is suboptimal, that has a huge impact on later steps, and the most popular frameworks don’t have a way to account for that.
arxiv.org/abs/2411.04118
“AI won’t take your job, but those who use it will” feels uncomfortably true.
arxiv.org/abs/2411.00492