Working on scientific fact verification, LLM factuality, biomedical NLP. 🌐🧑🏻‍🎓🇭🇷
With a novel dataset of changed medical knowledge, we discover the alarming presence of obsolete advice in eight popular LLMs.⌛
📝: arxiv.org/abs/2509.04304 #NLP
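To give a feel for the probing setup, here is a minimal sketch, not the paper's code: the CSV of (question, outdated_answer, current_answer) triples, the model choice, and the prompt are all illustrative assumptions.

```python
# Minimal sketch: check whether an LLM echoes superseded medical advice.
# The dataset file and its columns are hypothetical placeholders.
import csv
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def llm_answer(question: str) -> str:
    """Ask the model for a short medical answer."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content.lower()

with open("changed_medical_knowledge.csv") as f:  # hypothetical file
    for row in csv.DictReader(f):
        answer = llm_answer(row["question"])
        # Flag answers that repeat the outdated recommendation.
        if row["outdated_answer"].lower() in answer:
            print("obsolete advice:", row["question"])
```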
We test how RAG performance on QA tasks changes (and plateaus) with increasing context size across different LLMs and retrievers.
📝 arxiv.org/abs/2502.14759
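A rough sketch of what such a context-size sweep looks like, assuming you already have a retriever, an LLM call, and a QA dev set; `retrieve` and `generate` below are placeholders, not the paper's implementation.

```python
# Sweep the number of retrieved passages k and track QA accuracy,
# to see where a given LLM/retriever pair plateaus.
from typing import Callable

def accuracy_at_k(dataset: list[tuple[str, str]],
                  retrieve: Callable[[str, int], list[str]],
                  generate: Callable[[str, list[str]], str],
                  k: int) -> float:
    correct = 0
    for question, gold in dataset:
        passages = retrieve(question, k)            # top-k retrieved contexts
        prediction = generate(question, passages)   # LLM answers from them
        correct += int(gold.lower() in prediction.lower())  # loose match
    return correct / len(dataset)

# Usage sketch:
#   for k in (1, 2, 5, 10, 20, 50):
#       print(k, accuracy_at_k(dev_set, retrieve, generate, k))
```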
This system iteratively collects new knowledge via generated Q&A pairs, making the verification process more robust and explainable.
📜 arxiv.org/abs/2502.14765 #NLP
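A simplified sketch of that iterative loop, under the assumption that the system generates a question about the claim, answers it from retrieved evidence, and stops once a verdict can be reached; `ask_llm` and `search` stand in for whatever LLM and retriever are used.

```python
# Iteratively collect Q&A pairs until the claim can be verified;
# the accumulated pairs double as an explanation of the verdict.
def verify_claim(claim, ask_llm, search, max_rounds=3):
    qa_pairs = []
    for _ in range(max_rounds):
        question = ask_llm(f"What should we check to verify: {claim}?\n"
                           f"Known so far: {qa_pairs}")
        evidence = search(question)                  # retrieve evidence
        answer = ask_llm(f"Answer '{question}' using: {evidence}")
        qa_pairs.append((question, answer))          # grow the knowledge base
        verdict = ask_llm(f"Given {qa_pairs}, is '{claim}' SUPPORTED, "
                          f"REFUTED, or UNVERIFIABLE?")
        if verdict != "UNVERIFIABLE":
            return verdict, qa_pairs
    return "UNVERIFIABLE", qa_pairs
```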
@aclmeeting.bsky.social #ACL2025 #ACL2025nlp #NLP
We trained 2 new models. Like BERT, but modern. ModernBERT.
Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.
It's much faster, more accurate, supports longer context, and is more useful. 🧵
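If you want to try it, the released base checkpoint works with a recent Hugging Face transformers release via the standard fill-mask pipeline:

```python
# Quick smoke test of ModernBERT as a masked language model.
from transformers import pipeline

fill = pipeline("fill-mask", model="answerdotai/ModernBERT-base")
print(fill("The capital of France is [MASK].")[0]["token_str"])
```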
Around 1000 participants with 3 days full of intense coding, new experiences, exciting sponsor challenges and workshops, fun side activities, tasty food, creative final solutions, and overall awesome fun! 😊
Join us next year 💙🧑‍💻🔜 hack.tum.de