IAMJB
iamjbd.bsky.social
🤗 ML at Hugging Face
🌲 Academic Staff at Stanford University (AIMI Center)
🦴 Radiology AI is my thing
This work builds on our recent study on Automated Structured Radiology Report Generation (x.com/IAMJBDEL/st...) which introduces the dataset and evaluation framework.
June 12, 2025 at 2:17 PM
Huge thanks to the amazing team at Stanford Center for Artificial Intelligence in Medicine and Imaging (AIMI): Johannes Moll, Louisa Fay, @asfandyar_azhar, @SophieOstmeier, Tim Lueth, Sergios Gatidis, @curtlanglotz
June 12, 2025 at 2:17 PM
📄 Paper: arxiv.org/abs/2506.00200
🌐 Project Page: stanford-aimi.github.io/structuring...
🤗 Models & Data: huggingface.co/collections...
All models and datasets are fully open-source — we hope this contributes to the broader medical AI community! 🤝
Structuring with Lightweight Models - a StanfordAIMI Collection
huggingface.co
June 12, 2025 at 2:17 PM
We benchmark lightweight models (<300M params) against state-of-the-art LLMs (up to 70B params), using human-reviewed test data and clinically grounded evaluation metrics. Our results highlight the strong potential of specialized, efficient models in clinical NLP applications.
June 12, 2025 at 2:17 PM
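One common way to ground report evaluation clinically is to score predicted observation labels against reference labels. A minimal sketch of label-level micro-F1 (the observation names below are illustrative, not the paper's ontology, and this is not necessarily the exact metric used in the study):

```python
# Hedged sketch: micro-averaged F1 over per-report observation labels.
# Observation names are illustrative examples only.

def micro_f1(gold: list[set], pred: list[set]) -> float:
    """Micro-F1 over sets of observation labels, one set per report."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels correctly predicted
        fp += len(p - g)   # labels predicted but absent from reference
        fn += len(g - p)   # reference labels the model missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = [{"Pleural Effusion", "Cardiomegaly"}, {"Pneumothorax"}]
pred = [{"Pleural Effusion"}, {"Pneumothorax", "Edema"}]
print(round(micro_f1(gold, pred), 3))  # → 0.667
```

Micro-averaging pools counts across reports, so frequent observations dominate; macro-averaging per label is the usual complement when rare but critical findings matter.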
Paper, soon to appear at #ACL2025 main: arxiv.org/pdf/2505.24223
Project page, with all resources (datasets, models, ontology) and usage notes: stanford-aimi.github.io/srrg.html
All models and datasets are publicly available as open-source:
huggingface.co/collections...
Structured Radiology Reports - a StanfordAIMI Collection
huggingface.co
June 9, 2025 at 3:13 PM
4) We conduct a reader study to create a radiologist-validated test set for both the automated structured radiology report generation task and the utterance-level disease labels from our new ontology.

Finally, external evaluation is conducted using out-of-institution data by @hopprai.
June 9, 2025 at 3:13 PM
3) We fine-tune popular RRG systems on the restructured findings and impressions, namely:
- Chexagent @StanfordAIMI
- MAIRA-2 @MSFTResearch
- RaDialog @TU_Muenchen
- Chexpert-plus @StanfordAIMI

As well as a BERT architecture for the disease classification system on our new ontology.
June 9, 2025 at 3:13 PM
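The BERT encoder itself is standard; the characteristic part of a disease classification system over an ontology like this is the multi-label decision on top. A minimal sketch of that step, assuming one logit per observation (logit values, names, and the 0.5 threshold are illustrative):

```python
import math

# Hedged sketch of the multi-label decision step a BERT-style disease
# classifier would apply on top of its per-observation logits.
# Logit values, label names, and threshold are illustrative.

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def decode_labels(logits: list[float], names: list[str],
                  threshold: float = 0.5) -> list[str]:
    # Independent sigmoid per observation: a report can carry
    # several diseases at once, so this is not a softmax.
    return [n for n, z in zip(names, logits) if sigmoid(z) >= threshold]

names = ["Pleural Effusion", "Cardiomegaly", "Pneumothorax"]
print(decode_labels([2.1, -0.4, 0.3], names))
# sigmoid ≈ 0.89, 0.40, 0.57 → ['Pleural Effusion', 'Pneumothorax']
```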
2) Since each reported observation, whether in the findings or impression sections, is expressed as a single utterance (1.5M unique utterances in total), we use a large language model to label each one according to a new ontology comprising 72 critical chest X-ray (CXR) observations.
June 9, 2025 at 3:13 PM
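The labeling step above can be sketched as a constrained prompt plus a parser that rejects anything outside the ontology. This is a hedged sketch only: the prompt template is hypothetical, the ontology shown is a tiny illustrative subset of the 72 observations, and the LLM call is stubbed out.

```python
# Hedged sketch of the utterance-labeling step: build a constrained
# prompt asking an LLM to map one report utterance to ontology labels.
# ONTOLOGY is an illustrative subset, not the paper's 72-observation list.

ONTOLOGY = ["Pleural Effusion", "Cardiomegaly", "Pneumothorax", "No Finding"]

def build_label_prompt(utterance: str) -> str:
    options = "\n".join(f"- {o}" for o in ONTOLOGY)
    return (
        "Label the chest X-ray report utterance with every applicable "
        "observation from this list, and nothing else:\n"
        f"{options}\n\n"
        f'Utterance: "{utterance}"\n'
        "Answer with a comma-separated list of labels."
    )

def parse_labels(llm_output: str) -> list[str]:
    # Keep only labels that exist in the ontology (guards against drift).
    picked = [s.strip() for s in llm_output.split(",")]
    return [l for l in picked if l in ONTOLOGY]

# Stand-in for a real LLM call:
fake_llm = lambda prompt: "Cardiomegaly, Mild cardiac enlargement"
labels = parse_labels(fake_llm(build_label_prompt("The heart is enlarged.")))
print(labels)  # → ['Cardiomegaly']; off-ontology text is filtered out
```

Validating parsed labels against the closed ontology is what makes LLM labeling usable at the scale of 1.5M utterances, since free-text answers inevitably drift.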
1) We leverage an LLM to restructure MIMIC-CXR and Chexpert-plus (180K Findings sections and 400K Impression sections) into reports categorized by organ system, under strict rules.
June 9, 2025 at 3:13 PM
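One "strict rule" a restructuring step like this can enforce is a fixed organ-system template: every section always present, in a fixed order, with a default for systems the report does not mention. A minimal sketch, assuming utterances have already been assigned an organ system (the section names are illustrative, not necessarily the paper's template):

```python
# Hedged sketch of assembling a structured findings section: group
# utterances under organ-system headers in a fixed order. Section
# names and the "Unremarkable." default are illustrative.

SECTION_ORDER = ["Lungs", "Pleura", "Cardiomediastinal", "Bones"]

def structure_findings(labeled: list[tuple[str, str]]) -> str:
    """labeled: (organ_system, utterance) pairs in reading order."""
    sections: dict[str, list[str]] = {s: [] for s in SECTION_ORDER}
    for system, utterance in labeled:
        sections.setdefault(system, []).append(utterance)
    lines = []
    for system in SECTION_ORDER:
        body = " ".join(sections[system]) or "Unremarkable."
        lines.append(f"{system}: {body}")
    return "\n".join(lines)

report = structure_findings([
    ("Cardiomediastinal", "The heart is mildly enlarged."),
    ("Lungs", "No focal consolidation."),
])
print(report)
```

A fixed template like this is what makes downstream comparison and labeling tractable: every report exposes the same sections regardless of the free-text order the radiologist dictated.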
𝟱. Working Memory: Compiles long-term and task memory to create the final prompt for the LLM.

Typically, 1–3 = Long-Term Memory; 5 = Short-Term Memory.

Thoughts on agent memory?👇
January 24, 2025 at 5:50 PM
𝟮. Semantic Memory: External/grounding knowledge or self-knowledge, similar to RAG context.
𝟯. Procedural Memory: System setup details like prompts, tools, and guardrails (stored in Git/registries).
𝟰. Task Memory: Info retrieved from long-term storage for immediate tasks.
January 24, 2025 at 5:50 PM
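The compilation step (5) over the memory types above can be sketched as a function that assembles the final prompt. This is a hedged sketch under my own naming: the field names, ordering, and separators are illustrative, not a standard agent-framework API.

```python
# Hedged sketch: "working memory" as the step that compiles the other
# memory types into the final LLM prompt. Field names are illustrative.

def compile_prompt(procedural: str, semantic: list[str],
                   task: list[str], user_message: str) -> str:
    parts = [
        procedural,                                      # 3. system setup (prompt, guardrails)
        "Relevant knowledge:\n" + "\n".join(semantic),   # 2. grounding context (RAG-like)
        "Task notes:\n" + "\n".join(task),               # 4. retrieved for the current task
        "User: " + user_message,
    ]
    return "\n\n".join(parts)                            # 5. the compiled working memory

prompt = compile_prompt(
    procedural="You are a careful assistant. Never reveal secrets.",
    semantic=["Doc A: refunds take 5 business days."],
    task=["User asked about order #123 yesterday."],
    user_message="Where is my refund?",
)
print(prompt)
```

The useful property of keeping this step explicit is that each memory type can be updated, versioned, or evicted independently of the others.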
Oops. Thanks!
January 16, 2025 at 9:37 PM