Mohit Chandra
@mohit30.bsky.social
PhDing @GeorgiaTech | Previously: @msftresearch.bsky.social, @Microsoft @iiithyderabad | Research: NLP and Social Computing for Healthcare | Opinions are personal

Homepage: https://mohit3011.github.io/

#ResponsibleAI #Human-CenteredAI #NLPforMentalHealth
Congratulations! 🙌
May 30, 2025 at 4:35 PM
For more details:

Paper: shorturl.at/bldCb
Webpage: shorturl.at/bC1zn
Code: shorturl.at/H8xmp

Grateful for the efforts of my co-authors 🙌: Siddharth Sriraman, @verma22gaurav.bsky.social, Harneet Singh Khanuja, Jose Suarez Campayo, Zihang Li, Michael L. Birnbaum, Munmun De Choudhury

11/11
January 7, 2025 at 9:38 PM
Finding #6: We examined the actionability of mitigation advice. Expert responses scored the highest on overall actionability compared to all the LLMs.

While LLMs provide less practical and relevant advice, their advice is clearer and more specific.

10/11
January 7, 2025 at 9:38 PM
Finding #5: LLMs struggle to provide expert-aligned harm reduction strategies, with larger models producing less expert-aligned strategies than smaller ones.

The best medical model aligned with experts ~71% of the time (GPT-4o score).
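(Reading "GPT-4o score" as GPT-4o acting as an automatic judge of expert alignment, a minimal sketch of that kind of setup is below; the prompt wording and yes/no scoring are assumptions for illustration, not the paper's protocol.)

    # Sketch: GPT-4o as an automatic judge of whether a model's harm-reduction
    # strategy aligns with the expert's. Prompt wording and binary scoring are
    # assumptions, not the paper's evaluation protocol.
    from openai import OpenAI

    client = OpenAI()

    def judge_alignment(expert_strategy: str, model_strategy: str) -> bool:
        prompt = (
            "Expert harm-reduction strategy:\n" + expert_strategy +
            "\n\nModel harm-reduction strategy:\n" + model_strategy +
            "\n\nDoes the model strategy align with the expert strategy? Answer yes or no."
        )
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content.strip().lower().startswith("yes")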

9/11
January 7, 2025 at 9:38 PM
Using the ADRA framework, we evaluate LLM alignment with experts across expressed emotion, readability, harm reduction strategies, & actionable advice.

Finding #4: We find that LLMs express similar emotions and tones but produce responses that are significantly harder to read.
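(The thread doesn't name the readability metric, so here is a minimal sketch of how one could compare expert vs. LLM responses using the Flesch-Kincaid grade level from the textstat package; the metric choice and example texts are assumptions.)

    # Sketch: compare readability of an expert response vs. an LLM response.
    # Flesch-Kincaid grade is just one common choice; higher = harder to read.
    import textstat

    expert_response = "Try taking the medication with food and tell your prescriber about the nausea."
    llm_response = "Gastrointestinal discomfort may be attenuated via co-administration with meals."

    for name, text in [("expert", expert_response), ("LLM", llm_response)]:
        print(name, textstat.flesch_kincaid_grade(text))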

8/11
January 7, 2025 at 9:38 PM
Finding #3: In-context learning boosted performance for both ADR detection and multiclass classification (+23 F1 points for the latter). However, gains on the ADR detection task were limited to a few models.

The type of examples had a more pronounced impact on the ADR multiclass classification task.
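(A minimal sketch of what in-context learning looks like for the ADR detection task; the example posts, labels, and prompt format are invented for illustration.)

    # Sketch: build a few-shot (in-context learning) prompt for ADR detection.
    # Example posts and labels below are invented for illustration.
    few_shot_examples = [
        ("Started sertraline last week and the nausea is unbearable.", "ADR"),
        ("My therapist suggested journaling and it has helped a lot.", "No ADR"),
    ]

    def build_prompt(post: str) -> str:
        parts = ["Decide whether the Reddit post describes an adverse drug reaction (ADR)."]
        for text, label in few_shot_examples:
            parts.append(f"Post: {text}\nLabel: {label}")
        parts.append(f"Post: {post}\nLabel:")
        return "\n\n".join(parts)

    print(build_prompt("The new dose makes my hands shake all day."))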

7/11
January 7, 2025 at 9:38 PM
Finding #2: All LLMs showed "risk-averse" behavior, labeling no-ADR posts as ADR. Claude 3 Opus had a 42% false-positive rate for ADR detection, and GPT-4-Turbo misclassified over 50% of non-dose/time-related ADRs.

This highlights the lack of "lived experience" among models.
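(For reference, the false-positive rate here is the share of true no-ADR posts that get labeled as ADR; a toy sketch with invented labels:)

    # Sketch: false-positive rate = false positives / all true no-ADR posts.
    # Labels below are toy data, not the benchmark.
    y_true = ["No ADR", "No ADR", "ADR", "No ADR", "ADR"]
    y_pred = ["ADR",    "No ADR", "ADR", "ADR",    "ADR"]

    negatives = [(t, p) for t, p in zip(y_true, y_pred) if t == "No ADR"]
    false_positives = sum(1 for _, p in negatives if p == "ADR")
    print(f"FPR: {false_positives / len(negatives):.0%}")  # 2/3 ≈ 67% on this toy data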

6/11
January 7, 2025 at 9:38 PM
Finding #1: Larger models perform better on ADR detection (Claude 3 Opus led with an accuracy of 77.41%), but this trend does not hold for ADR multiclass classification. Additionally, distinguishing ADR types remains a significant challenge for all models.

5/11
January 7, 2025 at 9:38 PM
We introduce Psych-ADR, a benchmark of Reddit posts annotated for ADR presence/type and paired with expert-written responses, along with the ADRA framework to systematically evaluate long-form generations on detecting ADR expressions and delivering mitigation strategies.
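(Roughly, each benchmark entry pairs an annotated post with an expert response; the field names below are assumptions for illustration, not the released schema.)

    # Sketch: hypothetical shape of a Psych-ADR entry. Field names are
    # assumptions for illustration; see the released dataset for the real schema.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class PsychADREntry:
        post_text: str            # Reddit post
        has_adr: bool             # ADR presence label
        adr_type: Optional[str]   # ADR type label (None if no ADR)
        expert_response: str      # expert-written mitigation response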

4/11
January 7, 2025 at 9:38 PM
Broader Takeaway #2: To build reliable AI in healthcare, we must move beyond choice-based benchmarks toward tasks that capture the complexities of the real world (such as ADR mitigation), using nuanced frameworks and benchmarks. 📈

Below are some nuanced findings 👇

3/11
January 7, 2025 at 9:38 PM
Broader Takeaway #1: LLMs are tools to empower, not replace, mental health professionals. They offer clear and specific advice that can help address the global shortage of care providers, but contextually relevant, practical advice still requires human expertise. 👨‍⚕️👩‍⚕️

2/11
January 7, 2025 at 9:38 PM
Great work! 👏
December 13, 2024 at 1:13 AM
Yup! I joined recently along with a large number of folks and I guess it will become like academic twitter if people continue to engage on the platform.
November 25, 2024 at 11:35 PM
Really amazing work! Very insightful.
November 25, 2024 at 11:34 PM
Thank you so much!
November 25, 2024 at 6:16 PM
I would love to get added if possible!
November 25, 2024 at 8:00 AM
Congratulations!

It is certainly a good start, but I still feel we need more interdisciplinary reviewers (based on the reviews I have gotten). One issue is the requirement for reviewers to have at least 3 *CL papers in the past 5 years, which many researchers might not have.

Something ACs could look into?
November 24, 2024 at 5:43 AM
Thank you!
November 22, 2024 at 9:14 PM
Would love to get added to this!
November 22, 2024 at 8:41 PM