Adi Simhi
@adisimhi.bsky.social
Ph.D. student at the Technion, working on NLProc and machine learning.
🔍Check out our paper "Trust Me, I’m Wrong: High-Certainty Hallucinations in LLMs" at arxiv.org/pdf/2502.12964, with code at github.com/technion-cs-...
February 19, 2025 at 3:50 PM
What do you think? 🤔
Could high-certainty hallucinations be a major roadblock to safe AI deployment? Let’s discuss! 👇
February 19, 2025 at 3:50 PM
🔮 Takeaway:
We need new approaches to understand hallucinations so we can mitigate them better.
This research moves us toward deeper insights into why LLMs hallucinate and how we can build more trustworthy AI.
February 19, 2025 at 3:50 PM
💡Why does this matter?
- Not all hallucinations stem from uncertainty or lack of knowledge.
- High-certainty hallucinations appear systematically across models & datasets.
- This challenges existing hallucination detection & mitigation strategies that rely on uncertainty signals.
February 19, 2025 at 3:50 PM
🛠️How did we test this?
We used knowledge detection & uncertainty measurement methods to analyze when and how hallucinations occur.
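Not the paper's actual pipeline, but here is a minimal sketch of what those two measurements can look like in practice, assuming a HuggingFace causal LM (GPT-2 as a stand-in) and a toy prompt: a knowledge check via the probability the model assigns to the correct answer, and a certainty score from the probabilities of the tokens it actually generates. See the linked repo for the real implementation.

```python
# Minimal sketch (assumptions: GPT-2 as a stand-in model, a toy fact;
# NOT the paper's code) of two ingredients mentioned above:
# (1) knowledge detection - does the model rank the correct answer first?
# (2) certainty measurement - how much probability does it put on the
#     answer it actually generates?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper studies larger LLMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

prompt = "The capital of France is"
correct_answer = " Paris"  # toy example fact

inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    # (1) Knowledge detection: probability of the correct next token
    # under the prompt, and whether it is the model's top choice.
    logits = model(**inputs).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    correct_id = tok(correct_answer, add_special_tokens=False)["input_ids"][0]
    knows_it = probs.argmax().item() == correct_id
    p_correct = probs[correct_id].item()

    # (2) Certainty measurement: greedy-decode an answer and take the
    # product of per-token probabilities as a simple certainty score.
    out = model.generate(
        **inputs,
        max_new_tokens=3,
        do_sample=False,
        return_dict_in_generate=True,
        output_scores=True,
        pad_token_id=tok.eos_token_id,
    )
    gen = out.sequences[0, inputs["input_ids"].shape[1]:]
    step_probs = [
        torch.softmax(score[0], dim=-1)[t].item()
        for score, t in zip(out.scores, gen)
    ]
    certainty = float(torch.tensor(step_probs).prod())

print(f"knows the answer: {knows_it} (p={p_correct:.3f})")
print(f"generated: {tok.decode(gen)!r}, certainty ≈ {certainty:.3f}")
```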
February 19, 2025 at 3:50 PM
🚨Key finding:
LLMs can produce hallucinations with high certainty—even when they possess the correct knowledge!
February 19, 2025 at 3:50 PM
🔍The problem:
LLMs sometimes generate hallucinations - factually incorrect outputs. A common assumption is that if the model is certain and does not lack knowledge, its output must be correct.
February 19, 2025 at 3:50 PM