Fateme Hashemi Chaleshtori
@fatemehc.bsky.social
PhD student at Utah NLP, Mechanistic Interpretability, Trustworthy AI, Human-centered AI
7/ However, LLMs struggle with these complex tasks:
- Realistic argument completion: Llama-3.1-70B finds missing arguments only 18% of the time
- Case retrieval: Best method finds correct precedents in top-5 results just 31.4% of the time

Lots of room for improvement! 📈
June 20, 2025 at 10:07 PM
6/ Surprising finding: GPT-4o's headings are rated higher than human-written ones!
🤖 GPT-4o: 4.3/5 avg. LLM-as-judge rating for both argument summarization and completion
🤵 Lawyers: 4.0/5 (summarization) and 3.9/5 (completion) avg. rating
LLMs excel at summarization and guided completion tasks, with outputs requiring only minor edits.
June 20, 2025 at 10:07 PM
1/ 🚨NEW PAPER: "BriefMe: A Legal NLP Benchmark for Assisting with Legal Briefs", accepted to ACL Findings 2025!
We introduce the first benchmark specifically designed to help LLMs assist lawyers in writing legal briefs 🧑‍⚖️

📄 arxiv.org/abs/2506.06619
🗂️ huggingface.co/datasets/jw4...
June 20, 2025 at 10:07 PM