Lightnews — Scholar-powered news

Reposted by Vaidehi Patil

Peter Hall

@peterha2l.bsky.social

In any case, the work is featuring at an interesting-looking workshop this weekend, put on by @katherinelee.bsky.social, @vaidehipatil.bsky.social, and others. More info here: mugenworkshop.github.io

MUGen @ ICML 2025 - Workshop on Machine Unlearning for Generative AI

mugenworkshop.github.io

July 15, 2025 at 1:27 PM

Vaidehi Patil

@vaidehipatil.bsky.social

Thanks to my amazing collaborators Yi-Lin Sung, @peterbhase.bsky.social, Jie Peng, Tianlong Chen, and @mohitbansal.bsky.social for a wonderful collaboration!

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

📎 Check it out here!
📄 Paper: arxiv.org/abs/2505.01456
💻 Code and Dataset: github.com/Vaidehi99/Un...
huggingface.co/datasets/vai...
🤗 HuggingFace: huggingface.co/papers/2505....

Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation

LLMs trained on massive datasets may inadvertently acquire sensitive information such as personal details and potentially harmful content. This risk is further heightened in multimodal LLMs as they in...

arxiv.org

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

Key Findings
🔥 Multimodal attacks are the most effective
🛡️ Our strongest defense is deleting info from hidden states
📉 Larger models are more robust to extraction attacks post-editing than smaller ones
🎯 UnLOK-VQA enables targeted evaluations of unlearning defenses

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

⚔️ Benchmarking Multimodal Unlearning Defenses
Multimodal data opens up new attack vectors.
We benchmark 6 unlearning defenses against 7 attack strategies, including:
✅White-box attacks
✅Black-box paraphrased multimodal prompts

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

This enables two key types of evaluation:
✅Generalization Evaluation
✔️Rephrased questions
✔️Rephrased images

✅Specificity Evaluation
✔️Neighboring questions (same image, new question)
✔️Neighboring images (same concept, different image)

May 7, 2025 at 6:55 PM
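
A rough sketch of how the two evaluations in the post above could be scored. This is not the benchmark's official metric code: the `Probe` record, its field names, and both score definitions are assumptions made for illustration. Generalization here counts how often the deleted answer stays gone on rephrased inputs, and specificity counts how often answers to neighboring inputs survive the edit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    """One evaluation query against the edited model (hypothetical record)."""
    kind: str             # "rephrase" (generalization) or "neighbor" (specificity)
    answer_correct: bool  # True if the model still gives the original answer


def _rate(flags: List[bool]) -> float:
    """Fraction of True values; refuses an empty probe set."""
    if not flags:
        raise ValueError("no probes of the requested kind")
    return sum(flags) / len(flags)


def generalization_score(probes: List[Probe]) -> float:
    """Share of rephrased probes on which the deleted answer no longer appears."""
    return _rate([not p.answer_correct for p in probes if p.kind == "rephrase"])


def specificity_score(probes: List[Probe]) -> float:
    """Share of neighboring probes on which the original answer is still given."""
    return _rate([p.answer_correct for p in probes if p.kind == "neighbor"])


# Toy data: two rephrased probes (one still leaks) and two neighbor probes.
probes = [
    Probe("rephrase", answer_correct=False),
    Probe("rephrase", answer_correct=True),
    Probe("neighbor", answer_correct=True),
    Probe("neighbor", answer_correct=True),
]
print(generalization_score(probes), specificity_score(probes))
```

Under these assumed definitions, an unlearning method that simply breaks the model would score high on generalization and low on specificity, which is why the two are reported together.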

Vaidehi Patil

@vaidehipatil.bsky.social

📦 What Is UnLOK-VQA?
UnLOK-VQA focuses on unlearning pretrained knowledge and builds on OK-VQA, a visual QA dataset. We extend it w/ an automated question-answer generation and image generation pipeline:
✅Forget samples from OK-VQA
✅New samples at varying levels of proximity (easy, medium, hard)

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

This is essential for:
📜 Legal compliance (e.g., GDPR, CCPA, the right to be forgotten)
🔐 Multimodal Privacy (e.g., faces, locations, license plates)
📷 Trust in real-world image-grounded systems

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

🔍 Why Does Multimodal Unlearning Matter?
Existing unlearning benchmarks focus only on text.
But multimodal LLMs are trained on web-scale data—images + captions—making them highly vulnerable to leakage of sensitive or unwanted content.
Unlearning must hold across modalities, not just in language.

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

We study:
❓ How effectively can we erase multimodal knowledge?
❓ How should we measure forgetting in multimodal settings?
✅ We benchmark 6 unlearning defenses against 7 white-box and black-box attack strategies

May 7, 2025 at 6:55 PM

Vaidehi Patil

@vaidehipatil.bsky.social

Call for PC Members!
We’re looking for program committee members!
📝 Submit your Expression of Interest here: forms.gle/ZPEHeymJ4t5N...
#ICML2025

MUGen @ ICML '25 - PC Expression of Interest

We are currently recruiting reviewers for the Program Committee of MUGen (Machine Unlearning for Generative AI) @ ICML '25. If you are interested in participating, please fill out this form. We antici...

forms.gle

April 2, 2025 at 3:59 PM

Vaidehi Patil

@vaidehipatil.bsky.social

👩‍💻 Organizers:
Mantas Mazeika, Yang Liu, @katherinelee.bsky.social, @mohitbansal.bsky.social, Bo Li and myself (@vaidehipatil.bsky.social) 🙂

April 2, 2025 at 3:59 PM

Vaidehi Patil

@vaidehipatil.bsky.social

🔥 Speakers & Panelists:
We're lucky to have an incredible lineup of speakers and panelists covering diverse topics in our workshop:
Nicholas Carlini, Ling Liu, Shagufta Mehnaz, @peterbhase.bsky.social, Eleni Triantafillou, Sijia Liu, @afedercooper.bsky.social, Amy Cyphert

April 2, 2025 at 3:59 PM

Vaidehi Patil

@vaidehipatil.bsky.social

We invite contributions exploring key challenges and advancements at the intersection of machine unlearning and generative AI!

🔗 Full details & updates: mugenworkshop.github.io

📅 Key Dates:
📝 Submission Deadline: May 19
✅ Acceptance Notifications: June 9
🤝 Workshop Date: July 18 or 19

MUGen @ ICML 2025 - Workshop on Machine Unlearning for Generative AI

mugenworkshop.github.io

April 2, 2025 at 3:59 PM

Vaidehi Patil

@vaidehipatil.bsky.social

Huge thanks to my co-authors @esteng.bsky.social and @mohitbansal.bsky.social for a great collaboration!

🚀 Check it out here:
📄 Paper: arxiv.org/abs/2502.15082
💻 Code: github.com/Vaidehi99/UP...
🤗 @huggingface page: huggingface.co/papers/2502....

UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning

User specifications or legal frameworks often require information to be removed from pretrained models, including large language models (LLMs). This requires deleting or "forgetting" a set of data poi...

arxiv.org

February 25, 2025 at 2:23 AM

Vaidehi Patil

@vaidehipatil.bsky.social

UPCORE consistently outperforms baselines across all methods:

✔️ Less unintended degradation
✔️ Deletion transferred to pruned points

UPCORE provides a practical, method-agnostic approach that improves the reliability of unlearning techniques.

February 25, 2025 at 2:23 AM

Vaidehi Patil

@vaidehipatil.bsky.social

Instead of evaluating at a single training checkpoint, we introduce AUC (Area Under the Curve) across deletion effectiveness and utility.

This provides a complete picture of the trade-off between forgetting and knowledge retention over the unlearning trajectory.

February 25, 2025 at 2:23 AM
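
One way to read the AUC idea in the post above is as a trapezoidal area under a curve traced by checkpoints along the unlearning run. The sketch below assumes each checkpoint yields a (deletion effectiveness, utility) pair on a 0 to 1 scale; the function name, the axis choice, and the toy numbers are illustrative assumptions, not the paper's exact definition.

```python
from typing import Sequence, Tuple


def tradeoff_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a (deletion, utility) curve.

    Each point is one checkpoint: x = deletion effectiveness, y = utility.
    Both axes and their scaling are assumptions made for this sketch.
    """
    if len(points) < 2:
        raise ValueError("need at least two checkpoints")
    ordered = sorted(points)  # integrate left to right along the deletion axis
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy trajectory: utility drops as more of the target is deleted.
checkpoints = [(0.0, 1.0), (0.5, 0.9), (1.0, 0.6)]
print(tradeoff_auc(checkpoints))
```

A larger area means utility stays high across more of the deletion range, so two methods can be compared without picking a single checkpoint.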

Vaidehi Patil

@vaidehipatil.bsky.social

We apply UPCORE across three unlearning methods:
📉 Gradient Ascent
🚫 Refusal
🔄 Negative Preference Optimization (NPO)

We measure:
✔️ Deletion effectiveness – How well the target is removed
✔️ Unintended degradation – Impact on other abilities
✔️ Positive transfer – How well unlearning generalizes

February 25, 2025 at 2:23 AM
