Shan Chen
@shan23chen.bsky.social
PhDing @AIM_Harvard @MassGenBrigham | PhD Fellow @Google | Previously @Bos_CHIP @BrandeisU

More robustness and explainability 🧐 for Health AI.
shanchen.dev
Reposted by Shan Chen
LLMs tend to prioritize helpfulness > reason. We show that safety-aware, compute-efficient fine-tuning helps models reason more critically in the healthcare domain and generalizes to improved safety alignment across other domains.
www.nature.com/articles/s41... @shan23chen.bsky.social
When helpfulness backfires: LLMs and the risk of false medical information due to sycophantic behavior - npj Digital Medicine
www.nature.com
October 18, 2025 at 2:18 PM
Reposted by Shan Chen
An overemphasis on helpfulness makes LLMs vulnerable.
Research shows models will comply with illogical medical requests, generating false information. This sycophantic tendency can be corrected with specific prompting and fine-tuning. #MedSky #MedAI #MLSky
When helpfulness backfires: LLMs and the risk of false medical information due to sycophantic behavior - npj Digital Medicine
www.nature.com
October 17, 2025 at 3:53 PM
Reposted by Shan Chen
[1/]💡New Paper
Large reasoning models (LRMs) are strong in English — but how well do they reason in your language?

Our latest work uncovers their limitation and a clear trade-off:
Controlling Thinking Trace Language Comes at the Cost of Accuracy

📄Link: arxiv.org/abs/2505.22888
May 30, 2025 at 1:09 PM
Reposted by Shan Chen
Agents are all the rage and we need to track their abilities in the medical domain. Enter MedBrowseComp, the 1st benchmark to assess agents' abilities to reason, navigate the web, and search for verifiable med info!

Preprint: arxiv.org/abs/2505.14963
Site: moreirap12.github.io/mbc-browse-a...
May 22, 2025 at 4:27 PM
Reposted by Shan Chen
✨ What if your face could tell something about how old your body really is?

Excited to share our latest paper just published in The Lancet Digital Health (open access!)

👉 www.thelancet.com/journals/lan...
FaceAge, a deep learning system to estimate biological age from face photographs to improve prognostication: a model development and validation study
Our results suggest that a deep learning model can estimate biological age from face photographs and thereby enhance survival prediction in patients with cancer. Further research, including validation...
www.thelancet.com
May 9, 2025 at 3:06 PM
CALL FOR REMOTE SPEAKERS: Science in the News Seminar Series, hosted by Harvard x Beacon Hill Seminars

scientists, engineers & doctors, from academic researchers to industry professionals! 🧑‍🔬🧑‍💻 

Email the organizers at scienceinthenews.bhs@gmail.com to sign up for a date! (First-come-first-served)
March 7, 2025 at 1:45 AM
Reposted by Shan Chen
We have a NEW PAPER in @naturemedicine.bsky.social on reporting recommendations for addressing the unique challenges of #largelanguagemodels (LLMs) in biomedical applications

www.nature.com/articles/s41...

#MLSky #StatsSky #medSky #AISky #artificialintelligence #generativeAI #transparency
January 8, 2025 at 10:24 AM
Reposted by Shan Chen
I am always worrying about Benzene (my cat)! www.nytimes.com/2024/12/05/w...

But please don't stop wearing sunscreen! Sun exposure is a known cancer risk, benzene risks unknown. This article has good tips if you want to minimize benzene exposure.

Obligatory Benzene (cat) pic ⬇️
Is It Time to Worry About Benzene in Personal Care Products?
The carcinogen has been found in sunscreen, deodorants, acne creams and other personal care products. Here’s what to know.
www.nytimes.com
December 6, 2024 at 11:12 PM
Team @AnthropicAI & @thesubhashk @joshengels.bsky.social show SAE features can be good for classification.

Good evidence from @arthurconmy.bsky.social & @neelnanda.bsky.social that SAE features are transferable across base and IT models.

🧐 How about LLaVA?

tiny.cc/sae1
Are SAE features from the Base Model still meaningful to LLaVA? — LessWrong
Shan Chen, Jack Gallifant, Kuleen Sasse, Danielle Bitterman. Please read this as a work in progress where we are colleagues sharing this in a lab (…
tiny.cc
December 5, 2024 at 8:16 PM
Crosscare is accepted at @neuripsconf.bsky.social 🎉 We showed LLMs are far from grounded in true prevalence, and their grounding is very inconsistent across languages!
Also, a dashboard for people to explore the prevalence data across diseases and racial groups: crosscare.net
#NeurIPS2024
Cross-Care Dataset
The Cross-Care Dataset provides comprehensive insights into co-occurrence patterns of various diseases. This dataset is invaluable for researchers and healthcare professionals seeking to understand co...
crosscare.net
November 27, 2024 at 3:09 PM
A million thanks to my wonderful advisor @daniellebitterman.bsky.social and all my colleagues and friends!
🎉 Incredibly proud of @shan23chen.bsky.social for being selected for the 2024 Google PhD Fellowship in Natural Language Processing: blog.google/technology/r... !!! So excited to see how Shan's contributions will continue shaping the future of clinical NLP
#HealthAI #NLP
🌟
blog.google
November 17, 2024 at 4:31 PM
Here are some reflections on many studies we did this year. Tons of progress has been made, but there are still safety concerns... 🧐

Poster 10:30 riverfront at EMNLP2024 🏖️
Happy to chat and connect!

📃 huggingface.co/blog/shanche...

🔊 tinyurl.com/aimpodcast24

@daniellebitterman.bsky.social
What We Learned About LLM/VLMs in Healthcare AI Evaluation:
A Blog post by Shan Chen on Hugging Face
huggingface.co
November 13, 2024 at 4:28 AM