Desi R Ivanova
@desirivanova.bsky.social
Research fellow @OxfordStats @OxCSML, spent time at FAIR and MSR
Former quant 📈 (@GoldmanSachs), former former gymnast 🤸‍♀️
My opinions are my own
🇧🇬-🇬🇧 sh/ssh
Lecture 6: Training practicalities
Deep learning's black magic
open.substack.com
March 17, 2025 at 2:33 PM
Lecture 5: Backpropagation and Autodifferentiation

Thank god the days of computing gradients by hand are over! Nevertheless, it’s good to know what backprop is and why we do it

open.substack.com/pub/probappr...
Lecture 5: Backprop and Autodiff
Order matters
open.substack.com
March 12, 2025 at 12:11 PM
The fourth post in the series: open.substack.com/pub/probappr...
Lecture 4: Neural network architectures
Attention!
open.substack.com
March 9, 2025 at 6:46 PM
Go read it on arXiv! Thanks to my co-authors @sambowyer.bsky.social and @laurenceai.bsky.social 💥
March 6, 2025 at 3:00 PM
Along with the lightweight library, we provide short code snippets in the paper.
March 6, 2025 at 3:00 PM
…and for constructing error bars on more complicated metrics, such as F1 score, that require the flexibility of Bayes.
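The nice thing about the Bayesian approach is that posterior samples can be pushed through any metric. As an illustration (my own sketch, not necessarily the paper's exact recipe), one can put a Dirichlet posterior on the confusion-matrix counts and compute F1 per sample; the counts below are made up.

```python
import numpy as np

# Hypothetical confusion-matrix counts: (TP, FP, FN, TN)
counts = np.array([40, 5, 8, 47])

rng = np.random.default_rng(0)
# Dirichlet posterior over the four cell probabilities (uniform prior)
probs = rng.dirichlet(counts + 1, size=100_000)
tp, fp, fn = probs[:, 0], probs[:, 1], probs[:, 2]
f1 = 2 * tp / (2 * tp + fp + fn)  # F1 for each posterior sample

lo, hi = np.percentile(f1, [2.5, 97.5])
print(f"95% credible interval for F1: [{lo:.3f}, {hi:.3f}]")
```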
March 6, 2025 at 3:00 PM
...and treated without an independence assumption (e.g. using the same eval questions on both LLMs)...
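For the paired case, one simple Bayesian treatment (a sketch under my own assumptions, not necessarily the paper's method) puts a Dirichlet posterior over the four paired outcomes per question; the accuracy gap then depends only on the two discordant cells.

```python
import numpy as np

# Hypothetical paired outcomes on the same 50 questions:
# (both correct, only A correct, only B correct, both wrong)
counts = np.array([30, 12, 6, 2])

rng = np.random.default_rng(0)
probs = rng.dirichlet(counts + 1, size=100_000)  # posterior, uniform prior
gap = probs[:, 1] - probs[:, 2]  # accuracy(A) - accuracy(B)

lo, hi = np.percentile(gap, [2.5, 97.5])
print(f"95% credible interval for the accuracy gap: [{lo:.3f}, {hi:.3f}]")
print(f"P(A better than B) ~ {np.mean(gap > 0):.3f}")
```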
March 6, 2025 at 3:00 PM
...for making comparisons between two LLMs treated independently...
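For the independent case, a minimal sketch (the counts and variable names here are hypothetical, not the library's API): sample from each model's Beta posterior and look at the difference.

```python
import numpy as np

# Hypothetical counts: model A scores 42/50, model B scores 36/50,
# evaluated on separate (independent) question sets
rng = np.random.default_rng(0)
post_a = rng.beta(1 + 42, 1 + 8, size=100_000)  # Beta posterior, uniform prior
post_b = rng.beta(1 + 36, 1 + 14, size=100_000)
diff = post_a - post_b

lo, hi = np.percentile(diff, [2.5, 97.5])
print(f"95% credible interval for accuracy(A) - accuracy(B): [{lo:.3f}, {hi:.3f}]")
```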
March 6, 2025 at 3:00 PM
We also suggest simple methods for the clustered-question setting (where we don't assume all questions are IID -- instead we have T groups of N/T IID questions)...
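The paper has the details; as a rough illustration of why clustering matters, here's a generic cluster-level t-interval that treats the T cluster means, rather than the N individual questions, as the approximately IID units. This is a simple baseline of my own, not necessarily the paper's estimator.

```python
import numpy as np
from scipy import stats

def clustered_interval(scores_by_cluster, alpha=0.05):
    """1 - alpha interval for mean accuracy with T clusters of dependent questions.

    scores_by_cluster: T arrays of 0/1 scores (N/T questions each). Cluster
    means are treated as approximately IID, so the standard error uses T, not N.
    """
    means = np.array([np.mean(c) for c in scores_by_cluster])
    T = len(means)
    se = means.std(ddof=1) / np.sqrt(T)
    t = stats.t.ppf(1 - alpha / 2, df=T - 1)
    return means.mean() - t * se, means.mean() + t * se

# e.g. 10 passages with 8 questions each (made-up data)
rng = np.random.default_rng(0)
clusters = [rng.integers(0, 2, size=8) for _ in range(10)]
print(clustered_interval(clusters))
```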
March 6, 2025 at 3:00 PM
Or, in this IID question setting, if you want to stay frequentist, you can use Wilson score intervals: en.wikipedia.org/wiki/Binomial_…
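For reference, the Wilson score interval is a few lines of NumPy/SciPy (statsmodels' proportion_confint with method='wilson' computes the same thing):

```python
import numpy as np
from scipy import stats

def wilson_interval(k, n, alpha=0.05):
    """Wilson score interval for a binomial proportion (k correct out of n)."""
    z = stats.norm.ppf(1 - alpha / 2)
    p_hat = k / n
    centre = (p_hat + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z / (1 + z**2 / n) * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

print(wilson_interval(47, 50))  # stays inside [0, 1] even near p = 1
```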
March 6, 2025 at 3:00 PM
We suggest using Bayesian credible intervals for your error bars instead, with a simple Beta-Binomial model. (The aim is for the methods to achieve nominal 1-alpha coverage, i.e. match the dotted line in the top row: a 95% confidence interval should be right 95% of the time.)
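A minimal sketch of such an interval, assuming a uniform Beta(1, 1) prior (the paper's library may expose this differently):

```python
from scipy import stats

def beta_credible_interval(k, n, alpha=0.05):
    """Equal-tailed 1 - alpha credible interval for eval accuracy.

    k correct answers out of n IID questions; a uniform Beta(1, 1) prior
    gives a Beta(1 + k, 1 + n - k) posterior over the accuracy.
    """
    return stats.beta.ppf([alpha / 2, 1 - alpha / 2], 1 + k, 1 + n - k)

lo, hi = beta_credible_interval(47, 50)
print(f"95% credible interval: [{lo:.3f}, {hi:.3f}]")  # always inside [0, 1]
```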
March 6, 2025 at 3:00 PM
This, along with the CLT's blindness to the typically binary eval data (correct/incorrect responses to an eval question), leads to poor error bars that collapse to zero width or extend past [0,1].
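To see the failure mode concretely, here's the naive CLT-based (Wald) interval on made-up counts:

```python
import numpy as np
from scipy import stats

def wald_interval(k, n, alpha=0.05):
    """Naive CLT (Wald) interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = stats.norm.ppf(1 - alpha / 2)
    p_hat = k / n
    half = z * np.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

print(wald_interval(50, 50))  # (1.0, 1.0): collapses to zero width
print(wald_interval(49, 50))  # upper endpoint exceeds 1
```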
March 6, 2025 at 3:00 PM
As LLMs get better, benchmarks to evaluate their capabilities are getting smaller (and harder). This starts to violate the CLT's large-N assumption. Meanwhile, we have lots of eval settings in which questions aren't IID (e.g. questions within a benchmark often aren't independent).
March 6, 2025 at 3:00 PM
The third in the teaching blogs series: Introduction to deep learning

open.substack.com/pub/probappr...
Lecture 3: Introduction to Deep Learning
aka neural networks aka differentiable programming
open.substack.com
March 3, 2025 at 5:03 PM
The NHS boss was sacked (well, “resigned”), so there’s some hope for major reforms and improvements in the health system (I hope 🤞)
February 26, 2025 at 11:36 AM
Nice. Are the materials publicly available?
February 21, 2025 at 3:54 PM
We currently do 2 lectures on GPs 😅 One could certainly do a whole course (BayesOpt, AutoML) - could be fun!
February 21, 2025 at 3:44 PM
Indeed, the course is already really quite tight, so if DPs are to be covered, something has to be dropped. For next year, I’m thinking of potentially dropping constrained optimisation/SVMs (done in the first half) and covering BNP more thoroughly
February 21, 2025 at 3:42 PM
It’s a mix: the first part was ERM, SVMs and kernels; the second part (the one I’m teaching) covers Bayesian ML (GPs), deep learning and VI
February 21, 2025 at 2:06 PM
Teaching is super undervalued by universities (at least in the UK), so there’s very little incentive to do it well. I think this is wrong: thoughtful pedagogy matters deeply. I hope this “teaching blogs” series will help me get up to speed and improve more quickly

open.substack.com/pub/probappr...
Lecture 1: Gaussian Processes and GP Regression
Nice and easy when everything is Gaussian
open.substack.com
February 21, 2025 at 1:10 PM