Martin Eastwood
martineastwood.co.uk
Somewhere in the middle of a Venn diagram of machine learning and football / soccer.

http://www.pena.lt/y/blog.html
Thanks Nils, it’s a good point, and it’s on my to-do list to dig further into how long it takes for something like FSAA to stabilise into something useful early in a player’s career. Even with a fairly wide HDI at that stage, there are still potential benefits to using it.
October 24, 2025 at 12:05 PM
Not yet, but good idea!
October 20, 2025 at 5:02 PM
Absolutely. Also, people started sharing big (for then anyway) pre-trained networks that made it much easier to get started. I built many models in my day job back then by fine-tuning BERT and ImageNet-pretrained networks that I would have struggled to train from scratch without investing massively in compute.
October 7, 2025 at 8:18 PM
We can also split Massey Ratings into attack & defence:

🔴 LFC: best attack in the league paired with a mid-table defence
⚪️ ARS: Elite at both ends. They have the #1 defence and the #3 attack
🌳 Forest: A disaster at both ends of the pitch
October 7, 2025 at 7:55 PM
Thanks to everyone who suggested features and reported issues. Your input shapes the package's development.

Questions or feedback welcome at pena.lt/y/contact

Install: pip install penaltyblog
GitHub: github.com/martineastwo...
September 23, 2025 at 7:47 PM
📚 Interactive Colab notebooks are available in the docs - experiment with real examples without any local setup.

I'll be steadily expanding these over the coming weeks to cover all functionality in the package.

Docs: penaltyblog.readthedocs.io
penaltyblog: Football Data & Modelling Made Easy — penaltyblog documentation
penaltyblog.readthedocs.io
September 23, 2025 at 7:46 PM
🔧 Improved implied odds module:

- New logarithmic overround removal method for better accuracy
- Structured results instead of raw arrays
- Better handling of edge cases

Making it easier to work with bookmaker probabilities in your analyses.
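
For context, the simplest way to strip a bookmaker's margin is multiplicative normalisation. The sketch below is plain Python and doesn't use penaltyblog: it shows that basic approach rather than the new logarithmic method, and the function name, the returned keys and the odds are all made up for illustration.

```python
def implied_probabilities(decimal_odds):
    """Remove the overround by scaling the inverse odds so they sum to 1.

    Illustrative helper only (hypothetical name, not the penaltyblog API).
    """
    if any(o <= 1.0 for o in decimal_odds):
        raise ValueError("decimal odds must be greater than 1")
    raw = [1.0 / o for o in decimal_odds]   # inverse odds still include the margin
    total = sum(raw)                        # greater than 1 when the book has a margin
    return {
        "probabilities": [r / total for r in raw],
        "margin": total - 1.0,
    }


# Made-up home / draw / away prices
result = implied_probabilities([2.10, 3.40, 3.60])
print(result["probabilities"], result["margin"])
```

Returning a small dictionary rather than a bare list mirrors the "structured results" idea: the margin travels with the probabilities instead of being recomputed later.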
September 23, 2025 at 7:46 PM
💰 Expanded betting utilities:

- Kelly Criterion for multiple outcomes
- Arbitrage opportunity detection
- Value bet identification
- Hedge bet calculations
- Odds format conversion (decimal/fractional/American)

All functions now return structured outputs for easier integration.
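
As a rough idea of the arithmetic behind a few of these utilities, here is a plain-Python sketch. It doesn't call penaltyblog, the function names are hypothetical, and the Kelly helper covers a single outcome only, not the multiple-outcome version in the package.

```python
def kelly_fraction(decimal_odds, probability):
    """Single-outcome Kelly stake as a fraction of bankroll (0 when there is no edge)."""
    net_odds = decimal_odds - 1.0                # profit per unit staked
    edge = probability * decimal_odds - 1.0      # expected profit per unit staked
    return max(edge / net_odds, 0.0)


def is_value_bet(decimal_odds, probability):
    """A bet has value when its expected return per unit staked exceeds 1."""
    return probability * decimal_odds > 1.0


def is_arbitrage(best_decimal_odds):
    """Arbitrage exists when the best available prices imply less than 100%."""
    return sum(1.0 / o for o in best_decimal_odds) < 1.0


def decimal_to_american(decimal_odds):
    """Convert decimal odds to American (moneyline) odds."""
    if decimal_odds >= 2.0:
        return round((decimal_odds - 1.0) * 100)
    return round(-100 / (decimal_odds - 1.0))


stake = kelly_fraction(2.5, 0.45)            # made-up price and model probability
arb = is_arbitrage([2.10, 3.80, 4.20])       # made-up best prices across books
print(stake, arb, decimal_to_american(2.5), decimal_to_american(1.5))
```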
September 23, 2025 at 7:45 PM
📚 Docs: penaltyblog.readthedocs.io/en/latest/in...
💻 GitHub: github.com/martineastwo...
🐍 pip install penaltyblog

Feedback welcome, let me know what you build!
August 15, 2025 at 7:04 PM
🔍 New: Flow Query DSL

Filter datasets with safe, Pythonic expressions:
- AST-parsed (no eval)
- Variables, regex, dates
- Access nested fields
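
To show what "AST-parsed (no eval)" can look like in general, here is a minimal sketch using Python's standard `ast` module. It is not the Flow Query DSL or its syntax: the `where` helper, the supported operators and the sample records are assumptions for illustration only.

```python
import ast
import operator

# Whitelisted comparison operators; anything else is rejected.
_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
}


def _evaluate(node, record):
    """Walk a parsed expression, allowing only a small set of node types."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return record[node.id]                            # bare name -> top-level field
    if isinstance(node, ast.Attribute):
        return _evaluate(node.value, record)[node.attr]   # a.b -> nested field
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = _COMPARE.get(type(node.ops[0]))
        if op is None:
            raise ValueError("unsupported comparison")
        return op(_evaluate(node.left, record),
                  _evaluate(node.comparators[0], record))
    if isinstance(node, ast.BoolOp):
        values = [_evaluate(v, record) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    raise ValueError(f"unsupported expression: {type(node).__name__}")


def where(records, expression):
    """Filter records with an expression that is parsed but never eval'd."""
    tree = ast.parse(expression, mode="eval").body
    return [r for r in records if _evaluate(tree, r)]


events = [  # made-up records
    {"type": "Shot", "xg": 0.31, "player": {"name": "Salah"}},
    {"type": "Pass", "xg": 0.0, "player": {"name": "Saka"}},
    {"type": "Shot", "xg": 0.05, "player": {"name": "Saka"}},
]
big_chances = where(events, "type == 'Shot' and xg > 0.1")
saka_events = where(events, "player.name == 'Saka'")
print(big_chances, len(saka_events))
```

Because only whitelisted node types are evaluated, something like `__import__('os')` is parsed as a function call and rejected with a `ValueError` instead of being executed.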
August 15, 2025 at 6:59 PM
📈 Goal models are now 5-10× faster

- Cython-powered analytical gradients for speed + stability
- Fine-tune with minimizer_options:
August 15, 2025 at 6:57 PM
⚽ New: Pitch Plotting API

Build interactive football visualisations with:
- Multiple layouts & themes
- Scatter, heatmaps, arrows, comets
- Custom hover tooltips
August 15, 2025 at 6:55 PM
There's also a new blog post here that explains more about the latest updates:

pena.lt/y/2025/06/10...
MatchFlow 1.4.0: Optimizing, Visualizing, and Validating your Data Pipelines
MatchFlow just got smarter, friendlier, and more powerful for optimizing your pipelines, visualizing your data flow, and keeping your data clean...
pena.lt
June 19, 2025 at 7:40 PM
If you're interested in finding out more:

Why I built it 👉 pena.lt/y/2025/05/25...

Docs 👉 penaltyblog.readthedocs.io/en/latest/in...
Introducing MatchFlow: a JSON-native query engine for football data.
MatchFlow is a JSON-native query engine for football data - no flattening, no fuss...
pena.lt
May 25, 2025 at 8:05 PM
At the heart of MatchFlow is the Flow class - a lazy, composable way to work with nested football data.

You can filter, group, and summarize JSON-like data without flattening or loading everything into memory.

👇 Example:
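
A minimal sketch of the lazy, composable idea in plain Python generators. `LazyFlow` is a hypothetical stand-in written for this illustration, not MatchFlow's Flow class or its API, and the events are made up.

```python
from collections import defaultdict


class LazyFlow:
    """Hypothetical stand-in for a lazy pipeline; not MatchFlow's Flow."""

    def __init__(self, source):
        self._source = source  # callable that returns a fresh iterator of records

    def filter(self, predicate):
        # Builds a new pipeline step; no records are read here.
        return LazyFlow(lambda: (r for r in self._source() if predicate(r)))

    def count_by(self, key):
        # Terminal step: streams records one at a time and keeps only the counts.
        counts = defaultdict(int)
        for record in self._source():
            counts[key(record)] += 1
        return dict(counts)


EVENTS = [  # made-up nested records
    {"id": 1, "type": "Shot", "team": {"name": "Liverpool"}},
    {"id": 2, "type": "Pass", "team": {"name": "Arsenal"}},
    {"id": 3, "type": "Shot", "team": {"name": "Arsenal"}},
    {"id": 4, "type": "Shot", "team": {"name": "Liverpool"}},
]

reads = []  # tracks when records are actually pulled from the source


def load_events():
    for event in EVENTS:
        reads.append(event["id"])
        yield event


shots = LazyFlow(load_events).filter(lambda e: e["type"] == "Shot")
reads_before = len(reads)   # still 0: building the pipeline reads nothing
shots_by_team = shots.count_by(lambda e: e["team"]["name"])
print(reads_before, shots_by_team)
```

The nested `team.name` field is used directly, with no flattening step, and records only flow through the pipeline once a terminal step asks for a result.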
May 25, 2025 at 8:02 PM
Thanks! It's just to remove another variable: by simulating the result, I know for certain which forecast is better, because the simulated result is sampled from that particular forecast. With the actual result, you could argue that the true probabilities it's sampled from are unknown.
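
A small sketch of that setup, assuming log loss as the scoring rule (the metric, the function names and both forecasts are illustrative assumptions, not the ones from the thread). Results are sampled from one forecast, so that forecast is known to be the correct one, and both forecasts are then scored against the same simulated results.

```python
import math
import random


def log_loss(forecast, outcome):
    """Negative log of the probability given to the outcome that occurred."""
    return -math.log(forecast[outcome])


def mean_scores(true_forecast, rival_forecast, n_sims=20000, seed=42):
    """Sample results from true_forecast, then score both forecasts on them."""
    rng = random.Random(seed)
    outcomes = rng.choices(range(len(true_forecast)),
                           weights=true_forecast, k=n_sims)
    true_score = sum(log_loss(true_forecast, o) for o in outcomes) / n_sims
    rival_score = sum(log_loss(rival_forecast, o) for o in outcomes) / n_sims
    return true_score, rival_score


# Made-up home / draw / away forecasts; results are drawn from the first one
true_score, rival_score = mean_scores([0.5, 0.3, 0.2], [0.3, 0.3, 0.4])
print(true_score, rival_score)  # lower is better for log loss
```

Over enough simulations the forecast that generated the results should come out with the lower mean log loss, which makes it a clean ground truth for checking whether a metric ranks forecasts correctly.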
May 2, 2025 at 10:06 AM