Matthias Kullowatz
mattyanselmo.bsky.social
Data scientist, pickup sportsballer, founder of americansocceranalysis.com.
And in the end, key variables like goal diff, possession length, etc. are still accounted for without overweighting certain teams.
April 26, 2025 at 6:40 PM
This assumes that the underlying xPV framework is fit with some sort of model (anything from OLS to a neural network). The weights can then be submitted and applied to the optimization function. This guarantees that the teams that get to “good” states are not overrepresented in those states.
April 26, 2025 at 6:39 PM
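To make "weights applied to the optimization function" concrete, here is a minimal sketch using the simplest case in that range, a one-feature weighted least squares fit. The function name and the toy data are hypothetical, and this is not the actual xPV model; it only shows how per-row weights enter the loss. Many ML libraries accept the same kind of per-row weights (for example, the `sample_weight` argument to `fit` in scikit-learn's `LinearRegression`).

```python
def weighted_linear_fit(x, y, w):
    """Minimise sum_i w[i] * (y[i] - (a + b * x[i]))**2 and return (a, b).

    Hypothetical helper for illustration: closed-form weighted least squares
    with a single feature, so the effect of the row weights is easy to see.
    """
    sw = sum(w)
    # Weighted means of the feature and the target.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted variance of x and weighted covariance of x and y.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


# Toy data: the third row is an outlier; giving it weight 0 removes its influence.
a, b = weighted_linear_fit([0, 1, 2], [1, 3, 100], [1, 1, 0])
```

Rows with larger weights pull the fit toward themselves, which is how down-weighting an overrepresented team's rows changes what the model learns.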
Team proxies = great callout. Instead of removing the proxy variables, I propose controlling for them. Create clusters based on team proxy variables at the row level. Within each cluster within each season, weight the rows such that each team is equally represented. Submit the weights to the ML optimization.
April 26, 2025 at 6:37 PM
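A minimal sketch of the weighting step described in this post, under some assumptions: each row already carries a cluster label (from whatever clustering of the team proxy variables is used), and the keys `season`, `cluster`, and `team` are hypothetical names. Scaling each cell so its total weight equals its row count is also an assumption; the post only asks that teams be equally represented within a cell.

```python
from collections import Counter


def team_balanced_weights(rows):
    """Return one weight per row so that, within each (season, cluster) cell,
    every team's weights sum to the same total.

    rows: list of dicts with hypothetical keys 'season', 'cluster', 'team'.
    Each cell's weights sum to its row count (an assumed normalisation).
    """
    # Rows per (season, cluster, team) and per (season, cluster) cell.
    team_counts = Counter((r["season"], r["cluster"], r["team"]) for r in rows)
    cell_rows = Counter((r["season"], r["cluster"]) for r in rows)
    # Number of distinct teams appearing in each cell.
    cell_teams = Counter((s, c) for (s, c, _team) in team_counts)

    weights = []
    for r in rows:
        cell = (r["season"], r["cluster"])
        n_team = team_counts[(r["season"], r["cluster"], r["team"])]
        # Each team gets an equal share of the cell, split evenly over its rows.
        weights.append(cell_rows[cell] / (cell_teams[cell] * n_team))
    return weights


# Toy example: team A has three rows in the cell, team B has one.
rows = [
    {"season": 2024, "cluster": 0, "team": "A"},
    {"season": 2024, "cluster": 0, "team": "A"},
    {"season": 2024, "cluster": 0, "team": "A"},
    {"season": 2024, "cluster": 0, "team": "B"},
]
weights = team_balanced_weights(rows)
```

In the toy example, team A's three rows and team B's single row end up with the same total weight, so neither team dominates that cluster.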
Using prior week's lineups, g+ saw WAS as +0.3 goals better than LOU on roster g+ alone. That said, WAS has allowed 1.4 xG in open play vs. LOU's 0.55, which mostly offsets that advantage; model now sees both teams as ~equal. Home effect for LOU + WAS injuries = 50% home win probability.
April 16, 2025 at 7:37 PM
?
March 3, 2025 at 10:26 PM