Hayoung Jung
hayoungjung.bsky.social
PhD student at @princetoncitp.bsky.social. Previously @uwcse.bsky.social

website: hayoungjung.me
Broadly interested in computational social science, AI safety & evaluation, NLP for social good & applications (in public health, science...)!

Happy to chat or grab coffee at the conference! Feel free to DM me :)
November 4, 2025 at 6:23 AM
Lastly, I would like to thank my awesome collaborators @shravika-mittal.bsky.social, Ananya Aatreya (my first mentee!), @navreetkaur.bsky.social, and faculty mentors who taught me a lot during this project @tanumitra.bsky.social @munmun10.bsky.social!
September 8, 2025 at 6:13 PM
🙌 We hope public health, platforms, & researchers build on MythTriage to scale OUD myth detection on video platforms.
To support this, we’re releasing everything:
🧠 Models: huggingface.co/SocialCompUW...
💻 Code: github.com/hayoungjungg...
📊 Data: github.com/hayoungjungg...
SocialCompUW/youtube-opioid-myth-detect-M1 · Hugging Face
September 8, 2025 at 6:13 PM
🤩 Lastly, we’re excited because this work shows how a simple, decade-old idea (model cascades) scales with LLM advancements to tackle real, high-stakes health issues like OUD myths.

Past work tested model cascades on standard benchmarks (e.g., SQuAD). We validate them in the wild!
September 8, 2025 at 6:13 PM
Our findings offer actionable insights in the context of the ongoing opioid crisis—showing the value of MythTriage:

👩‍⚕️Public health: Inform targeted interventions & debunk myths.
🛡️Platforms: Provide a scalable auditing pipeline to flag high-risk content & improve moderation.
September 8, 2025 at 6:13 PM
📊Finding #3: YouTube’s recommendations continued to surface myth-supporting content.

➡️12.7% of recs from myth videos led to more myths initially—rising to 22% at deeper levels.

⚠️ Moderation should target these rec pathways that reinforce harmful myths.
September 8, 2025 at 6:13 PM
📊 Finding #2: How you filter your search results matters! Switching from “Relevance” to “Upload Date” or “Rating” increases exposure to myths—echoing the same patterns seen in my COVID-19 misinformation audit: ojs.aaai.org/index.php/IC...

😬A few clicks can change your exposure to myths!
September 8, 2025 at 6:13 PM
🫶Thanks to MythTriage, we present the first large-scale study of OUD-related myths on YouTube!

📊 Finding #1: Nearly 20% of YouTube search results support OUD myths, while 30% oppose them.

😰Although more results oppose myths than support them, myth-supporting content is widespread and risks shaping how people understand treatment.
September 8, 2025 at 6:13 PM
⚙️So how does MythTriage perform?
📊 Achieves 0.68-0.86 macro F1 and defers only 5-67% of the examples to the costly LLM.

In practice, MythTriage:
💸 Cuts financial costs by 98% vs experts and by 94% vs LLM labeling
⏱️ Cuts time costs by 96% vs experts & by 76% vs LLM labeling
September 8, 2025 at 6:13 PM
🚀 Our solution: MythTriage
👉 Uses lightweight DeBERTa for routine cases
👉 Defers harder ones to GPT-4o (high-performing but costly)

The trick? We distilled DeBERTa on GPT-4o’s synthetic labels—achieving strong performance without massive expert-labeled data.
September 8, 2025 at 6:13 PM
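For readers curious how the cascade in the MythTriage post above might look in code, here is a minimal sketch. The thread does not say how deferral is decided, so the confidence-threshold rule, the `cascade_predict` function, the `threshold` value, the label strings, and both toy models are assumptions for illustration only. They stand in for the distilled DeBERTa classifier and GPT-4o and are not the released implementation.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not the released MythTriage API).
SmallModel = Callable[[str], Tuple[str, float]]  # returns (label, confidence)
LargeModel = Callable[[str], str]                # returns a label only


def cascade_predict(
    texts: List[str],
    small_model: SmallModel,
    large_model: LargeModel,
    threshold: float = 0.9,  # assumed deferral cutoff
) -> Tuple[List[str], float]:
    """Label each text with the small model, deferring low-confidence
    cases to the large model. Returns the labels and the deferral rate."""
    labels: List[str] = []
    deferred = 0
    for text in texts:
        label, confidence = small_model(text)
        if confidence < threshold:
            # Routine cases stop above; harder ones go to the costly model.
            label = large_model(text)
            deferred += 1
        labels.append(label)
    rate = deferred / len(texts) if texts else 0.0
    return labels, rate


def toy_small_model(text: str) -> Tuple[str, float]:
    # Toy stand-in for the lightweight classifier: confident on short text.
    if len(text) < 20:
        return "opposes", 0.95
    return "supports", 0.55


def toy_large_model(text: str) -> str:
    # Toy stand-in for the LLM labeler.
    return "neither"


labels, rate = cascade_predict(
    ["short clip", "a much longer and more ambiguous transcript"],
    toy_small_model,
    toy_large_model,
)
print(labels, rate)
```

In a sketch like this, raising `threshold` sends more examples to the large model, which trades cost for accuracy. That is the same trade-off behind the deferral percentages reported in the thread.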
💡Challenge: Detecting OUD myths on video platforms at *scale* is tough. Clinical expertise and labeling are essential, but they are slow and costly.

🤖LLMs show promise, but high compute & API costs—especially for long-form video—limit their practicality for large-scale detection.
September 8, 2025 at 6:13 PM
🩺 To rigorously detect OUD myths in our datasets, we collaborated closely with clinical experts to:

✅Validate eight pervasive myths on OUD (see examples below!)
✅Create and refine annotation guidelines
✅Build a gold-standard dataset: 310 videos labeled across 8 myths (~2.5K expert labels).
September 8, 2025 at 6:13 PM
To measure the scale and prevalence of myths on YouTube, we curated opioid and OUD search queries based on real-world search interests. Using these queries, we built two datasets on YouTube:

1️⃣ OUD Search Dataset: 2.9K search results
2️⃣ OUD Recs Dataset: 343K video recommendations
September 8, 2025 at 6:13 PM
🛜Facing offline stigma, many turn to online platforms (e.g., YouTube) for health info & recovery.

‼️But myths fuel treatment hesitancy, distrust in healthcare, & stigma.

🤔Understanding the scale of myths is crucial for health officials & platforms to design effective interventions.
September 8, 2025 at 6:13 PM
I would also love to be added!!
June 24, 2025 at 1:30 PM
Thank you for the shoutout, Joey! :)
January 16, 2025 at 5:16 AM