Kaiser Sun
@kaiserwholearns.bsky.social
Ph.D. student at @jhuclsp, human LM that hallucinates. Formerly @MetaAI, @uwnlp, and @AWS. they/them 🏳️‍🌈 #NLProc #NLP Crossposting on X.
Congrats and welcome to the DMV area!!!
June 17, 2025 at 2:45 AM
🛠️ Interested in how your LLM behaves under knowledge conflict? We released the code to generate diagnostic data for your own LLM.
@mdredze @loadingfan
8/8
June 16, 2025 at 12:02 PM
🔗 Takeaways for practitioners
1. Check for knowledge conflict before prompting (a minimal check is sketched after this post).
2. Provide explicit explanations to guide the model toward following the context.
3. Monitor hallucinations even when context is supplied.
7/8
June 16, 2025 at 12:02 PM
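A minimal sketch of takeaway #1, assuming a generic `ask_model` wrapper around whatever LLM API you use; the helper names and the exact-match comparison are illustrative, not the released code.

```python
def ask_model(prompt: str) -> str:
    """Placeholder for your LLM call (chat API, local model, etc.)."""
    raise NotImplementedError("plug in your model client here")

def normalize(ans: str) -> str:
    # Crude normalization; swap in token-F1 or an NLI check for real use.
    return " ".join(ans.lower().strip().rstrip(".").split())

def has_knowledge_conflict(question: str, context_answer: str) -> bool:
    """Ask the model closed-book and compare against the answer the
    retrieved context supports; a mismatch signals parametric conflict."""
    closed_book = ask_model(f"Answer concisely: {question}")
    return normalize(closed_book) != normalize(context_answer)
```

If a conflict is flagged, takeaways #2 and #3 kick in: add explicit guidance to the prompt and watch the output for leakage from memory.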
📏 Implications:
⚡ When using an LLM as a judge, its parametric knowledge could lead to incorrect judgments :(
⚡ Retrieval systems need mechanisms to detect and resolve contradictions, not just shove text into the prompt (a toy check is sketched below). 6/8
June 16, 2025 at 12:02 PM
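A toy version of that contradiction check, using an off-the-shelf NLI model via Hugging Face `transformers`; the model choice and threshold are my assumptions, not something the thread prescribes.

```python
from transformers import pipeline

# roberta-large-mnli labels pairs as CONTRADICTION / NEUTRAL / ENTAILMENT.
nli = pipeline("text-classification", model="roberta-large-mnli")

def contradicts(premise: str, hypothesis: str, threshold: float = 0.8) -> bool:
    """True if the NLI model confidently labels the pair a contradiction,
    e.g., a retrieved passage vs. the model's closed-book answer."""
    out = nli({"text": premise, "text_pair": hypothesis})[0]
    return out["label"] == "CONTRADICTION" and out["score"] >= threshold

# contradicts("The tower is 300 m tall.", "The tower is 50 m tall.")  # likely True
```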
🧠 Key finding #3:
“Just give them more explanation?” Providing rationales helps—it pushes models to lean more on the context—but it still can’t fully silence the stubborn parametric knowledge. 5/8
June 16, 2025 at 12:02 PM
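One way to operationalize the rationale idea from this finding: prepend an explicit reason for trusting the document. The wording below is illustrative; the paper's actual templates may differ.

```python
RATIONALE_TEMPLATE = """\
The document below is the authoritative source for this task. If it
disagrees with anything you remember, follow the document and justify
your answer using only the document.

Document:
{context}

Question: {question}
Answer:"""

def build_prompt(context: str, question: str) -> str:
    # Rationale-augmented prompt; pair with a leakage check on the output.
    return RATIONALE_TEMPLATE.format(context=context, question=question)
```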
⚖️ Key finding #2:
Unsurprisingly, LLMs prefer their own memories. Even when we explicitly instruct them to rely on the provided document, traces of the “wrong” internal belief keep leaking into answers. 4/8
June 16, 2025 at 12:02 PM
⚠️ Key finding #1:
If the task doesn’t require external knowledge (e.g., pure copy), conflict barely matters. However, as soon as knowledge is needed, accuracy tanks when context and memory disagree.
3/8
June 16, 2025 at 12:02 PM
🛠️ We create diagnostic data that…
- Agrees or conflicts with the model's knowledge
- Contradicts it at varying levels of plausibility (a toy construction follows below)
- Covers tasks requiring different levels of knowledge
2/8
June 16, 2025 at 12:02 PM
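A toy illustration of that construction, with hypothetical facts and entity lists (the actual generator is the released code mentioned in post 8/8): conflicts are injected by swapping the object entity, where same-type swaps are plausible and type-violating swaps are implausible.

```python
import random

FACT = ("Marie Curie", "won the Nobel Prize in", "Physics")
SAME_TYPE = ["Chemistry", "Medicine", "Literature"]  # plausible alternatives

def make_context(conflict=None):
    """Build a context that agrees with the fact, or contradicts it at a
    chosen plausibility level."""
    subj, rel, obj = FACT
    if conflict == "plausible":      # same-type swap: hard to spot
        obj = random.choice(SAME_TYPE)
    elif conflict == "implausible":  # type-violating swap: easy to spot
        obj = "Basketball"
    return f"{subj} {rel} {obj}."

# Tasks then vary in how much knowledge they demand, e.g.:
#   copy: "Repeat the document verbatim."    (no external knowledge)
#   QA:   "Which prize did Marie Curie win?" (knowledge required)
```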
aclanthology.org
May 6, 2025 at 11:27 PM
It was quite encouraging to find that many friends share my concern about "minor details" obstructing us from drawing reliable conclusions. I really hope we can all provide well-documented experimental details and value the so-called "engineering contributions" more.
May 6, 2025 at 11:25 PM
Reposted by Kaiser Sun
with reasonable freedom, depending on the scale/focus of the business.

Case in point, we are looking to expand the research/foundation models team at Orby AI and are looking for highly motivated researchers and ML/Research engineers. Please reach out if you're interested in learning more!
/fin
January 8, 2025 at 7:39 PM
Agree. OTOH it might be helpful as a way to receive reports and doubts. One user reported that the authors of a paper I was reviewing violated the anonymity policy by posting their submission in public.
November 20, 2024 at 10:48 PM
🙋
November 20, 2024 at 6:57 AM