Michael Horrell
horrellmt.bsky.social
PhD in Statistics
Building https://github.com/mthorrell/gbnet
Credit to PyTorch, XGB and LightGBM btw. This was surprisingly easy to accomplish: (1) serialize GBMs to string (2) store the string in PyTorch state_dict.
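As a rough illustration of the two steps above, here is a minimal sketch, not GBNet's actual code. It assumes the serialized GBM is already available as a Python string (for LightGBM, `Booster.model_to_string()` returns one) and uses a stand-in string in its place. The class name `StringCarrier` is hypothetical; `get_extra_state` and `set_extra_state` are the hooks PyTorch provides for putting non-tensor data into a `state_dict`.

```python
import torch
from torch import nn


class StringCarrier(nn.Module):
    """Hypothetical module that carries a serialized model string in its state_dict."""

    def __init__(self, serialized_model: str = ""):
        super().__init__()
        # Stand-in for a GBM serialized to a string (assumption for this sketch).
        self.serialized_model = serialized_model
        # An ordinary parameter, to show that both kinds of state travel together.
        self.bias = nn.Parameter(torch.zeros(1))

    def get_extra_state(self):
        # Called by state_dict(); the return value is stored under "_extra_state".
        return {"serialized_model": self.serialized_model}

    def set_extra_state(self, state):
        # Called by load_state_dict() with whatever get_extra_state() returned.
        self.serialized_model = state["serialized_model"]


source = StringCarrier('{"trees": [1, 2, 3]}')
state = source.state_dict()

restored = StringCarrier()
restored.load_state_dict(state)
```

After `load_state_dict`, `restored.serialized_model` holds the same string as `source`, so a real implementation could rebuild the GBM from it at that point.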
July 21, 2025 at 3:16 AM
In my talk I plan to cover applications in Forecasting, Ordinal Regression, Mixture of Experts (MoE), even Matrix Factorization, all with XGBoost or LightGBM forming major model components.

If you're going to be at #SciPy2025, come find me, I'd love to chat!
June 24, 2025 at 2:21 PM
If you're a user of XGBoost or LightGBM and want to do more with those already amazing tools, then take a look at GBNet.

GBNet connects XGBoost and LightGBM to PyTorch to enable a wide range of new applications for gradient boosting models.
June 24, 2025 at 2:21 PM
Side note: if you extend this case to use different numbers of observations and different variances, you actually get a pretty nice-looking matplotlib plot
June 16, 2025 at 2:54 AM
The simplest version of this is:

- Model1 is some fancy model that may have some bad edge case behavior
- Model2 is a basic average

A straightforward GBNet PyTorch Module will allow one to fit these coefficients.
June 14, 2025 at 5:12 PM
This feels a lot like normal embeddings, though unlike PyTorch embeddings, these embeddings will be fixed at 0 unless there is an empirical reason to have them be non-zero.
May 26, 2025 at 2:48 PM
If you use Meta's Prophet package, give GBNet a look. It might provide significant lift with a one-line code change.

github.com/mthorrell/gb...
Release v0.5.0 · mthorrell/gbnet
What's Changed: added uncertainty ests by @mthorrell in #73. Full Changelog: v0.4.0...v0.5.0
May 22, 2025 at 1:29 AM
Next is adding estimates of uncertainty. I _think_ I can get away with having the GBM output a standard deviation and then I just use a Gaussian likelihood (rather than pure MSE), but we will see.
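For reference, here is a small sketch of what swapping MSE for a Gaussian likelihood means per observation. It is plain Python rather than GBNet code, and the function name is hypothetical. With the standard deviation held fixed, minimizing this quantity is the same as minimizing squared error; letting the model predict the standard deviation is what adds the uncertainty estimate.

```python
import math


def gaussian_nll(y, mean, std):
    """Negative log-likelihood of y under a Normal(mean, std**2) distribution."""
    variance = std ** 2
    return 0.5 * math.log(2 * math.pi * variance) + (y - mean) ** 2 / (2 * variance)


# A miss of 3 units is penalized less when the model also reports a wide std.
confident_miss = gaussian_nll(3.0, 0.0, 1.0)
hedged_miss = gaussian_nll(3.0, 0.0, 3.0)
```

The log term keeps the model from inflating the standard deviation everywhere, since a wide `std` is itself penalized when the prediction turns out to be accurate.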
April 12, 2025 at 8:11 PM
It's worth mentioning that changepoints, while they do improve performance, do not seem strictly necessary for many datasets. When the core model is Trend + GBM(t), the GBM by itself truly can handle a lot of non-stationary behavior and still give great performance.
April 12, 2025 at 8:11 PM
Explaining the equation: The slope of the (now non-linear) trend-line is a base slope + adjustments based on points in time. Those adjustments are determined by a GBDT, AND the outputs of the GBDT need to be summed to produce a continuous line. You'd have to use GBNet to do something like this.
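The summing step can be sketched in a few lines of plain Python. This is one reading of the description, not the equation from the post: it assumes evenly spaced time steps, treats each entry of `adjustments` as the GBDT output for that step (hard-coded here), and uses a hypothetical function name. Each segment's slope is the base slope plus its adjustment, and the trend is the running sum of those segment increments, which is what keeps the line continuous.

```python
from itertools import accumulate


def trend_line(intercept, base_slope, adjustments, step=1.0):
    """Trend values at t_0..t_n, one segment per slope adjustment."""
    # Increment over each segment: (base slope + adjustment) * step length.
    increments = [(base_slope + adj) * step for adj in adjustments]
    # Running sum starting at the intercept, so each segment begins where
    # the previous one ended.
    return list(accumulate(increments, initial=intercept))


# Slope is 1 for two steps, then the adjustments raise it to 2.
values = trend_line(0.0, 1.0, [0.0, 0.0, 1.0, 1.0])
# values == [0.0, 1.0, 2.0, 4.0, 6.0]
```

Because the adjustments enter only through the cumulative sum, the loss has to be differentiated through that sum before gradients reach the GBDT outputs.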
March 30, 2025 at 10:35 PM