Monica M Reddy
monicamreddy.bsky.social
PhD student at @KhouryCollege. Working in Machine Learning for Healthcare. Previously: @StanfordMed, @allen_ai, @UmassAmherst
https://monicamunnangi.github.io/
FactEHR is both a benchmark and a training resource for improving clinical LLMs in key tasks like summarization, electronic phenotyping, and QA.
📂 Code & Data: github.com/som-shahlab/...
📄 Paper: arxiv.org/abs/2412.124...
August 11, 2025 at 5:25 PM
We observe wide variation across models — in both fact decomposition and entailment judgment. Some LLMs generate accurate, grounded outputs; others miss or misstate key facts.
FactEHR highlights these gaps and guides improvement.
August 11, 2025 at 5:25 PM
🧠 FactEHR is a large NLI dataset for evaluating entailment-based LLM-as-a-judge methods in clinical text
📄 2,168 notes | 🏥 4 note types, 3 health systems
🔗 987K entailment pairs + 3.4K expert labels
🤖 Full fact decompositions from GPT-4o, Gemini 1.5, LLaMA3 8B, and o1-mini
August 11, 2025 at 5:25 PM
Why is this so hard?
Clinical notes are long, messy, and inconsistent. Evaluating fine-grained factuality across diverse note types (e.g., discharge vs. radiology) is a major challenge — but essential for safe, trustworthy LLMs. ⚠️
August 11, 2025 at 5:25 PM
Reposted by Monica M Reddy
Was AC for one of the papers. Went to metareview and noticed that two reviews were basically paraphrases of each other (down to ordering of weaknesses) and LLM generated. Noticed the paper was also weirdly well-written garbage. Then I investigated the deadbeat reviewers, realized they don't exist.
February 13, 2025 at 4:41 AM