Aki Vehtari
avehtari.bsky.social
Professor in computational Bayesian modeling at Aalto University, Finland. Bayesian Data Analysis 3rd ed, Regression and Other Stories, and Active Statistics co-author. #mcmc_stan and #arviz developer.

Web page https://users.aalto.fi/~ave/
Sea view today close to my home
November 11, 2025 at 12:24 PM
October 29, 2025 at 6:08 PM
October 21, 2025 at 3:58 PM
I spent two weeks on vacation in sunny and warm Sardinia, Italy, played in a beach ultimate tournament (this year we were the 4th best team in the world, and the best non-USA team), learned to kite surf, snorkeled, and ate lots of delicious food and gelato
October 6, 2025 at 9:21 AM
And the discrete rootogram with white background
September 3, 2025 at 12:10 PM
bayesplot 1.14.0 CRAN release mc-stan.org/bayesplot/ with contributions from @tjmahr.com, Behram Ulukır, and @teemusailynoja.bsky.social

My favorite new feature is the discrete style ppc_rootogram() as proposed in teemusailynoja.github.io/visual-predi... and shown below

1/3
September 3, 2025 at 12:09 PM
We tested the accuracy of the MCSE with 41 posteriordb posteriors of varying complexity, plus one Birthdays posterior. The MCSE matches the variation in repeated runs of MCMC and bridge sampling well. Most of the variation in bridge sampling accuracy is explained by the number of dimensions.
August 21, 2025 at 5:04 PM
For categorical and ordinal data, a series of calibration plots can be used. The plots below show one of these calibration plots for Model 1 and Model 2 (the same as in the first post in this thread). The red line lying outside the blue envelope most of the time indicates that Model 1 is misspecified. 3/4
August 13, 2025 at 2:34 PM
Instead of PPC bar graphs, it is better to look at the calibration of the predictive probabilities with binned calibration plots or, even better, with a PAV-adjusted calibration plot. 2/4
August 13, 2025 at 2:34 PM
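A minimal sketch of the idea behind a PAV-adjusted calibration curve, assuming binary outcomes and one predicted probability per observation. The function name `pav_calibration` and the toy data are hypothetical, and this is not the implementation used in any particular package. It sorts the observations by predicted probability and applies the pool-adjacent-violators (PAV) algorithm to the outcomes, which gives a nondecreasing estimate of the observed event frequency. Ties in the predicted probabilities are not pooled first, which a careful implementation would do, and the consistency envelope shown in the plots is not computed.

```python
def pav_calibration(probs, outcomes):
    """Isotonic (nondecreasing) fit of binary outcomes ordered by predicted probability.

    Returns the sorted predicted probabilities and the PAV-calibrated
    event frequencies, one per observation.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    blocks = []  # each block holds [sum of outcomes, number of observations]
    for i in order:
        blocks.append([float(outcomes[i]), 1])
        # Pool adjacent blocks while the earlier block has the larger mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    calibrated = []
    for total, count in blocks:
        calibrated.extend([total / count] * count)
    return [probs[i] for i in order], calibrated


# Toy data (made up for illustration only).
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 0, 1, 1, 1]
sorted_probs, calibrated = pav_calibration(probs, outcomes)
```

Plotting `calibrated` against `sorted_probs` and comparing with the diagonal gives the calibration curve. Unlike a binned plot, no bin width has to be chosen.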
Posterior predictive checking of binary, categorical and many ordinal models with bar graphs is useless. Even the simplest models without covariates usually have intercept terms that let the category-specific probabilities be learned perfectly. Can you guess which model, 1 or 2, is misspecified? 1/4
August 13, 2025 at 2:34 PM
It's sometimes difficult to get the focus needed for book writing, but this place was perfect for me
August 4, 2025 at 11:20 AM
Based on this photo from the 1920s at Helsinki University of Technology (which was later merged into Aalto University), they were also teaching how to draw an owl! (cc @rmcelreath.bsky.social)
July 31, 2025 at 10:48 AM
A new revised version of "Uncertainty in Bayesian leave-one-out cross-validation based model comparison" with Sivula, @mansmag.bsky.social, and Matamoros. We have clarified the goal of the paper and made it clearer that the uncertainty is described by the posterior of the unknown elpd difference, 1/4
June 23, 2025 at 7:11 AM
The best gelato in Finland
June 4, 2025 at 12:37 PM
In the morning I gave a talk about Bayesian cross-validation at KU Leuven, and in the afternoon I got to wear a Belgian academic gown and hear David Spiegelhalter's honorary doctorate talk, which was great
May 28, 2025 at 3:49 PM
We went to Mordor and all we got were flowers and ice cream.

Bayesian workflow group was a runner-up in Aalto Open Science Award 2024. The current and past group members running-up in alphabetical order: Alejandro Catalina, Anna Riha, Asael Alonzo Matamoros, David Kohns, ...
May 20, 2025 at 12:29 PM
I worked part of the afternoon outside
May 16, 2025 at 2:54 PM
I'll talk about Bayesian workflow Thu 24th April 11-12 CEST in Learn Bayes seminar by Karolinska Institutet @ki.se learnbayes.se/events/bayes... (zoom available)

The focus will be different to my previous workflow talks (see users.aalto.fi/~ave/videos....). This time more flowcharts and shortcuts
April 23, 2025 at 10:18 AM
A new paper with Alex Cooper and Catherine Forbes "Joint leave-group-out cross-validation in Bayesian spatial models" arxiv.org/abs/2504.15586

(Alex did the hard work for this, and running many cross-validation simulations with spatial models is hard)
April 23, 2025 at 8:22 AM
I'm reading a few papers that use the notation $\angle\{F|X_1,X_2,\dots,X_n\}$, where F is a distribution function and X are random variables. What does the angle symbol denote? I've not been able to find it with search engines (and one LLM says the most likely explanation is a typo)
April 14, 2025 at 4:49 PM
The fast method gives a biased total estimate. The difference estimator corrects the bias using a few slow-to-compute estimates. In the case study, with N=407 or N=657, we get close to full brute-force LOO-CV accuracy using subsampling LOO-CV with M=50, which means 8 to 13 times faster computation. 7/
March 14, 2025 at 10:33 AM
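A minimal sketch of a difference estimator of the kind described in this thread, assuming simple random sampling without replacement and synthetic numbers. The function name `difference_estimate`, the callable `slow`, and all values are hypothetical; this is not the code of any package. The fast approximation is summed over all N observations, and its bias is corrected with the scaled sum of (slow minus fast) differences over a subsample of size M.

```python
import random


def difference_estimate(fast, slow, m, rng):
    """Estimate the total of the slow values over all N observations.

    fast: fast approximations for every observation
    slow: function returning the slow, accurate value for observation i
    m:    subsample size; the slow function is called only m times
    """
    n = len(fast)
    subsample = rng.sample(range(n), m)  # simple random sample, no replacement
    correction = sum(slow(i) - fast[i] for i in subsample)
    return sum(fast) + (n / m) * correction


# Synthetic illustration: the fast values are off by a constant 0.05 each,
# so the fast total alone is biased by about 20 when N=407.
N, M = 407, 50
exact = [-1.0 - 0.001 * i for i in range(N)]
fast = [e + 0.05 for e in exact]
estimate = difference_estimate(fast, lambda i: exact[i], M, random.Random(1))
```

In this toy case the differences are constant, so the estimate recovers the exact total up to rounding. In practice the differences vary, and the accuracy depends on how well the fast values track the slow ones. The ratio N/M is about 8.1 for N=407 and 13.1 for N=657, which is consistent with the quoted speed-up if the slow evaluations dominate the cost.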
With subsampling LOO-CV proceedings.mlr.press/v108/magnuss... using the difference estimator we can combine the fast PSIS-LOO-CV performance estimates and M
March 14, 2025 at 10:33 AM
Using Pareto smoothed importance sampling (PSIS) LOO-CV we can get a very fast predictive performance estimate along the forward selection search path, but that approach does not cross-validate the search itself, and thus gives slightly optimistic estimates. 3/
March 14, 2025 at 10:33 AM
For a projpred introduction, see doi.org/10.1214/24-S...

Use of a reference model and projection already reduces the variance in the model selection criterion so much that the amount of overfitting in forward selection of covariates is much smaller than with other approaches. 2/
March 14, 2025 at 10:33 AM
In case of more than two categories, we can look at the calibration plots for one-vs-others, and in case of ordinal data we can look at the calibration of cumulative probabilities
March 4, 2025 at 1:15 PM
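A minimal sketch of how the two reductions to binary calibration problems could be set up, assuming each observation has a vector of predicted category probabilities and categories are coded 0, 1, 2, ... The helper names `one_vs_others` and `cumulative` and the toy numbers are hypothetical.

```python
def one_vs_others(probs, y, k):
    """Predicted P(Y = k) and the indicator of y == k, per observation."""
    return [p[k] for p in probs], [1 if yi == k else 0 for yi in y]


def cumulative(probs, y, k):
    """Predicted P(Y <= k) and the indicator of y <= k (ordinal data)."""
    return [sum(p[: k + 1]) for p in probs], [1 if yi <= k else 0 for yi in y]


# Toy predictions for two observations and three categories (made up).
probs = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]
y = [1, 0]

p_one, z_one = one_vs_others(probs, y, k=1)
p_cum, z_cum = cumulative(probs, y, k=1)
```

Each (probability, indicator) pair can then be passed to any binary calibration plot, once per category for one-vs-others or once per cut point for cumulative probabilities.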