Jaydeep Borkar
@jaydeepborkar.bsky.social
Visiting Researcher at Meta NYC🦙 and PhD student at Northeastern. Organizer at the Trustworthy ML Initiative (trustworthyml.org). Security & privacy in language models + mountain biking.

jaydeepborkar.github.io
Very excited to be joining Meta GenAI as a Visiting Researcher starting this June in New York City!🗽 I’ll be continuing my work on studying memorization and safety in language models.

If you’re in NYC and would like to hang out, please message me :)
May 15, 2025 at 3:18 AM
Really liked this slide by @afedercooper.bsky.social on categorizing extraction vs regurgitation vs memorization of training data at CS&Law today!
March 25, 2025 at 9:11 PM
*Takeaway*: these results underscore the need for more holistic memorization audits, where examples that aren’t extracted at a particular point in time are also evaluated for potential risks. E.g., we find that multiple models exhibit equal or greater levels of assisted memorization.
March 2, 2025 at 7:20 PM
—extends to LLMs: removing one layer of memorized PII exposes a 2nd layer, & so forth. We find this to be true even for random removals (which simulate opt-out requests). PII on the verge of memorization surfaces after others are removed.
March 2, 2025 at 7:20 PM
We find that removing extracted PII from the data & re-finetuning from scratch leads to the extraction of other PII. However, this phenomenon stops after a certain number of iterations. Our results confirm that this layered memorization, termed the Onion Effect (Carlini et al. 2022)…
March 2, 2025 at 7:20 PM
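To make the loop concrete, here is a minimal sketch of the removal-and-retrain probe the thread describes. `finetune` and `extract_pii` are hypothetical stand-ins for a full fine-tuning run and a PII-extraction attack; neither is from the paper.

```python
# Minimal sketch of an Onion Effect probe, not the paper's exact setup.
# `finetune` and `extract_pii` are hypothetical callables supplied by the user.

def onion_effect_probe(base_model, data, finetune, extract_pii, max_rounds=5):
    """Repeatedly remove extracted PII and re-finetune from scratch,
    recording which new PII surfaces in each round (each "layer")."""
    layers = []
    for _ in range(max_rounds):
        model = finetune(base_model, data)    # re-finetune from scratch
        extracted = extract_pii(model, data)  # PII strings extracted this round
        if not extracted:                     # the effect eventually stops
            break
        layers.append(extracted)
        # Drop every training example containing extracted PII; PII "on the
        # verge of memorization" can then surface in the next round.
        data = [ex for ex in data if not any(p in ex for p in extracted)]
    return layers
```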
We find that: 1) extraction increases substantially with the amount of PII contained in the model’s training set, & 2) inclusion of more PII leads to existing PII being at higher risk of extraction. This effect can increase extraction by over 7× in our setting.
March 2, 2025 at 7:20 PM
We run various tests to characterize the underlying cause of assisted memorization. Using our methodology, we causally remove the overlapping n-grams and find a strong correlation between n-gram overlap and assisted memorization.
March 2, 2025 at 7:20 PM
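A rough sketch of that ablation, under the simplifying assumption of whitespace tokenization (the paper's exact n-gram definition may differ): drop the fine-tuning examples that share n-grams with a target PII string, re-finetune, and compare extraction with vs. without them.

```python
# Sketch of the causal n-gram ablation. Whitespace tokenization is a
# simplification for illustration, not the paper's method.

def ngrams(text, n=3):
    """Set of word-level n-grams in a string."""
    toks = text.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def remove_overlapping(data, target_pii, n=3):
    """Drop training examples whose n-grams overlap the target PII's."""
    target = ngrams(target_pii, n)
    return [ex for ex in data if not (ngrams(ex, n) & target)]
```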
The literature so far lacks a clear understanding of the complete memorization landscape throughout training. In this work, we provide a complete taxonomy & uncover novel forms of memorization that arise during training.
March 2, 2025 at 7:20 PM
We observe a phenomenon we call assisted memorization: most PII (email addresses) isn’t extracted right after it is first seen, but further fine-tuning on data containing n-grams that overlap with this PII eventually leads to its extraction. This is a key factor: it accounts for up to 1/3 of extracted PII in our settings.
March 2, 2025 at 7:20 PM
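For reference, a common way to test whether a given PII string counts as "extracted" is to prompt the model with the training context that precedes it and check whether greedy decoding reproduces it. A minimal sketch with Hugging Face transformers; the model name and decoding settings here are placeholders, not the paper's configuration.

```python
# Sketch of a prefix-prompt extraction check, not the paper's exact setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

def is_extracted(prefix, pii, max_new_tokens=20):
    """True if greedy decoding from the prefix reproduces the PII string."""
    inputs = tok(prefix, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                         do_sample=False)               # greedy decoding
    completion = tok.decode(out[0][inputs["input_ids"].shape[1]:])
    return pii in completion
```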
What happens if we fine-tune an LLM on more PII? We find that PII that wasn’t previously extracted gets extracted after fine-tuning on *other* PII. This could have implications for earlier seen data (e.g. during post-training or further fine-tuning). 🧵

paper: arxiv.org/pdf/2502.15680
March 2, 2025 at 7:20 PM