Igor Shilov (➡️ ICML 🇨🇦)
@igorshilov.bsky.social
Anthropic AI Safety Fellow

PhD student at Imperial College London.
ML, interpretability, privacy, and stuff
🏳️‍🌈

https://igorshilov.com/
Arrived in beautiful Vancouver!
More conferences with mountain views please!

Ping me if you want to chat about privacy and security of LLMs!
July 16, 2025 at 1:43 PM
The best part? You can collect per-sample losses for free during training by simply changing the loss reduction:
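A minimal sketch of the idea, assuming a standard PyTorch training loop (the model, data, and hyperparameters here are toy stand-ins, not from the paper): setting reduction="none" gives one loss value per sample at no extra cost, and logging it each epoch builds per-sample loss traces.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)
X = torch.randn(256, 32)            # toy data standing in for a real dataset
y = torch.randint(0, 10, (256,))
model = nn.Linear(32, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

num_epochs, batch_size = 5, 64
loss_traces = torch.zeros(num_epochs, len(X))  # epoch x sample

for epoch in range(num_epochs):
    perm = torch.randperm(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        logits = model(X[idx])
        # reduction="none" keeps one loss per sample instead of a batch mean
        per_sample = F.cross_entropy(logits, y[idx], reduction="none")
        loss_traces[epoch, idx] = per_sample.detach()  # log before reducing
        opt.zero_grad()
        per_sample.mean().backward()  # same gradient as reduction="mean"
        opt.step()

# loss_traces[:, i] is now the training loss trajectory of sample i.
```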
June 24, 2025 at 3:17 PM
Our proposed loss trace aggregation methods achieve 92% Precision@k=1% in identifying samples vulnerable to the LiRA attack on CIFAR-10 (positives at FPR=0.001). Prior computationally efficient vulnerability detection methods (loss, gradient norm) perform barely better than random on the same task.
June 24, 2025 at 3:17 PM
🐸 Check out these CIFAR-10 frog examples:

Easy-to-fit outliers: Loss drops late but reaches near zero → most vulnerable

Hard-to-fit outliers: Loss drops slowly, stays relatively high → somewhat vulnerable

Average samples: Loss drops quickly and stays low → least vulnerable
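To make the distinction concrete, here is a toy heuristic (not the aggregation method from the paper) that separates those three trace shapes by two features: the epoch at which the loss first drops below half its starting value, and the final loss. The example trajectories are made up for illustration.

```python
import torch

def trace_features(trace: torch.Tensor, drop_frac: float = 0.5):
    """trace: (num_epochs,) loss trajectory of one sample.
    Returns (drop_epoch, final_loss): when the loss first falls below
    drop_frac of its initial value, and where it ends up."""
    threshold = drop_frac * trace[0].item()
    below = (trace < threshold).nonzero()
    drop_epoch = below[0].item() if len(below) else len(trace)  # never dropped
    return drop_epoch, trace[-1].item()

# Easy-to-fit outlier: drops late, reaches near zero -> most vulnerable
easy_outlier = torch.tensor([2.3, 2.2, 2.1, 1.9, 0.4, 0.05])
# Hard-to-fit outlier: drops slowly, stays relatively high -> somewhat vulnerable
hard_outlier = torch.tensor([2.3, 2.1, 1.9, 1.7, 1.5, 1.3])
# Average sample: drops quickly and stays low -> least vulnerable
average = torch.tensor([2.3, 0.9, 0.3, 0.1, 0.08, 0.05])

for name, tr in [("easy outlier", easy_outlier),
                 ("hard outlier", hard_outlier),
                 ("average", average)]:
    print(name, trace_features(tr))
```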
June 24, 2025 at 3:17 PM
The line-up for the evening:

- Graham Cormode (University of Warwick/Meta AI)
- Lukas Wutschitz (M365 Research, Microsoft)
- Jamie Hayes (Google DeepMind)
- Ilia Shumailov (Google DeepMind)
December 17, 2024 at 10:26 AM
Wow so we actually got to a point where Anthropic sponsors exhibitions at Tate Modern
November 20, 2024 at 11:06 AM
Low-stakes conspiracy of the day: the protestor throwing glitter at Starmer was a personal favour from Starmer himself. Because as a serious politician you don’t get to wear glitter in public anymore, and sometimes nothing hits quite like it
October 13, 2023 at 9:28 AM