Building https://github.com/mthorrell/gbnet
E.g.: when estimating a mean, how much should you weight the prior (call it Model 1) vs the sample average (Model 2)? Some Bayesian math gives the optimal answer given the number of observations.
A bit of GBNet & PyTorch empirically derives almost exactly the same answer.
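A minimal sketch of the experiment as I read it (hypothetical parameter values, plain PyTorch with a single learned blend weight standing in for GBNet's boosted modules): gradient descent on held-out prediction error recovers the Bayesian shrinkage weight n·τ² / (n·τ² + σ²).

```python
# Hedged sketch, not the exact GBNet setup: learn the prior-vs-sample-average blend
# weight by gradient descent and compare it to the Bayesian shrinkage weight.
import torch

torch.manual_seed(0)
n, sigma, tau, mu0 = 5, 1.0, 0.5, 0.0       # obs per dataset, noise sd, prior sd, prior mean
n_datasets = 20000

mu = mu0 + tau * torch.randn(n_datasets)     # true means drawn from the prior
y = mu[:, None] + sigma * torch.randn(n_datasets, n)
y_holdout = mu + sigma * torch.randn(n_datasets)

ybar = y.mean(dim=1)                         # Model 2: sample average
prior = torch.full_like(ybar, mu0)           # Model 1: the prior mean

w_raw = torch.zeros(1, requires_grad=True)   # blend weight, squashed to (0, 1)
opt = torch.optim.Adam([w_raw], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    w = torch.sigmoid(w_raw)
    loss = ((w * ybar + (1 - w) * prior - y_holdout) ** 2).mean()
    loss.backward()
    opt.step()

learned = torch.sigmoid(w_raw).item()
optimal = n * tau**2 / (n * tau**2 + sigma**2)   # Bayesian weight on the sample average
print(f"learned weight {learned:.3f} vs Bayesian weight {optimal:.3f}")
```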
Situation: you have predictions from several models.
Q: Is there a data-driven way to combine them? And, for convenience, can I use XGBoost to find the right averaging coefficients?
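A sketch of the stacking objective in plain PyTorch (the post's twist is fitting the coefficients with XGBoost via GBNet; this just shows what "find the right averaging coefficients" means):

```python
# Hedged sketch of the stacking idea: learn convex combination weights for
# several models' predictions on held-out data.
import torch

def fit_blend_weights(preds, y, steps=1000, lr=0.05):
    """preds: (n_samples, n_models) validation predictions; y: (n_samples,) targets."""
    logits = torch.zeros(preds.shape[1], requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        w = torch.softmax(logits, dim=0)          # weights are positive and sum to 1
        loss = ((preds @ w - y) ** 2).mean()
        loss.backward()
        opt.step()
    return torch.softmax(logits, dim=0).detach()

# Usage: preds = torch.stack([m1_val_preds, m2_val_preds], dim=1)
#        w = fit_blend_weights(preds, y_val)
```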
Categorical splitting is one interesting feature to play with using GBNet. To scratch the surface, I fit a basic Word2Vec model using XGBoost's categorical splitting.
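For reference, the enabling piece in plain XGBoost is splitting directly on a categorical token column without one-hot encoding. A toy sketch (toy data, not the Word2Vec setup from the post):

```python
# Hedged sketch of categorical splitting in XGBoost: the token column stays a
# pandas Categorical and the trees partition its levels directly.
import pandas as pd
import xgboost as xgb

df = pd.DataFrame({
    "token": pd.Categorical(["cat", "dog", "cat", "fish", "dog", "fish"]),
})
y = [1.0, 0.0, 1.0, 0.5, 0.0, 0.5]  # stand-in target, e.g. a context score

dtrain = xgb.DMatrix(df, label=y, enable_categorical=True)
booster = xgb.train({"tree_method": "hist", "max_cat_to_onehot": 1}, dtrain, num_boost_round=10)
print(booster.predict(dtrain))
```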
Beyond some usability improvements, uncertainty estimation for the Forecasting module got merged in. The GBNet forecasting module now:
✅ Runs faster
✅ Is more accurate than Prophet
✅ Provides uncertainty estimates
✅ Supports changepoints
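A usage sketch of the module (the import path, class name, and column conventions here are assumptions from memory; the repo README is the source of truth):

```python
# Hypothetical usage sketch -- names are assumptions, check the gbnet README
# for the exact forecasting API. The workflow is Prophet-like: a dataframe with
# a 'ds' timestamp column and a 'y' target.
import pandas as pd
from gbnet.models import forecasting   # assumed import path

df = pd.read_csv("my_timeseries.csv")   # columns: ds, y
train, test = df.iloc[:-30], df.iloc[-30:]

model = forecasting.Forecast()           # assumed estimator name
model.fit(train[["ds"]], train["y"])
pred = model.predict(test[["ds"]])       # assumed to include uncertainty columns
```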
Adding confidence intervals to the gbnet forecasting module: I can do a train/validation holdout for this and still be 3-4X faster.
Aiming for 80% coverage of the test data:
New method: 76% average coverage
Prophet: 55% average coverage
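"Coverage" here is prediction-interval coverage: the fraction of held-out points that land inside the nominal 80% interval. A quick sketch of that calculation:

```python
# Hedged sketch of how the coverage numbers above are typically computed.
import numpy as np

def empirical_coverage(y_true, lower, upper):
    """Fraction of held-out points inside the [lower, upper] interval."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return np.mean((y_true >= lower) & (y_true <= upper))

# e.g. empirical_coverage(y_test, pred_lower_80, pred_upper_80)
```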
I asked GBNet for a second prediction output. I slapped on torch.nn.GaussianNLLLoss and out comes a variance estimate that is well calibrated.
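A minimal sketch of the two-output idea in plain PyTorch (in the post the two outputs come from a GBNet boosted module rather than nn.Linear):

```python
# Hedged sketch: one head predicts the mean, the other a positive variance,
# trained with torch.nn.GaussianNLLLoss.
import torch
import torch.nn as nn

class MeanVarHead(nn.Module):
    def __init__(self, d_in):
        super().__init__()
        self.net = nn.Linear(d_in, 2)   # output 0: mean, output 1: raw variance

    def forward(self, x):
        out = self.net(x)
        mean = out[:, 0]
        var = nn.functional.softplus(out[:, 1]) + 1e-6   # keep variance positive
        return mean, var

model = MeanVarHead(d_in=10)
criterion = nn.GaussianNLLLoss()
x, y = torch.randn(64, 10), torch.randn(64)
mean, var = model(x)
loss = criterion(mean, y, var)           # NLL of y under N(mean, var)
loss.backward()
```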
Default forecast accuracy improved by 20%, along with a 5X speedup. Using the random-training, random-horizon benchmark, 9 of 9 example datasets now perform better with GBNet than with Prophet.
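My reading of that benchmark, as a sketch (the repo's actual benchmark code may differ): pick a random training cutoff and a random forecast horizon, score, repeat.

```python
# Hedged sketch of a "random training cutoff, random horizon" benchmark loop.
import numpy as np

def random_cutoff_benchmark(df, fit_predict, n_trials=20, min_train=100, max_horizon=60, seed=0):
    """df: time-ordered frame with 'ds' and 'y'; fit_predict(train, test) -> predictions."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_trials):
        cutoff = rng.integers(min_train, len(df) - max_horizon)
        horizon = rng.integers(1, max_horizon + 1)
        train, test = df.iloc[:cutoff], df.iloc[cutoff:cutoff + horizon]
        pred = fit_predict(train, test)
        errors.append(np.mean(np.abs(np.asarray(pred) - test["y"].to_numpy())))
    return float(np.mean(errors))
```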
The most promising method so far (see plot) asks a GBDT to both fit and find the changepoints. Another cool application of GBNet (see equation).
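The equation in the post is an image and isn't reproduced here; one plausible formulation of "let the GBDT find the changepoints" (not necessarily the post's exact equation) is a linear trend in $t$ plus a boosted piecewise-constant function of $t$, so the trees' split points play the role of changepoints:

$$
\hat{y}(t) \;=\; \beta_0 + \beta_1 t \;+\; \sum_{m=1}^{M} f_m(t),
\qquad f_m \in \{\text{trees that split only on } t\}.
$$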
Unfortunately it's back to the drawing board. PyTorch changepoints fit too slowly and GBLinear, I now realize, can't actually turn them off. It does work though!
I replaced PyTorch Linear with GBLinear (removing batchnorm) and...
1. improved accuracy (see table)
2. sped up fitting 10X by using fewer training rounds
3. improved worst-case performance (see plot)
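A before/after sketch of the swap described above (the GBLinear import path and constructor are assumptions from memory; the gbnet repo is the source of truth for the real signature):

```python
# Hedged sketch of swapping nn.Linear for a gradient-boosted linear layer.
import torch.nn as nn

# Before: a plain linear head, previously wrapped in batchnorm (now removed).
head = nn.Linear(8, 1)

# After: a boosted linear layer in its place, roughly:
#   from gbnet import gblinear                   # assumed module name
#   head = gblinear.GBLinear(n_rows, 8, 1)       # assumed signature
# The rest of the forward pass stays the same; fewer training rounds are needed.
```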
Training curves on the left (note the log scales). A sample from the dataset on the right... far from a crazy dataset.
This turned out great for my use case: I was using a pure least-squares solver, but I wanted ridge regression.
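For context, the distinction in question (a quick numpy sketch, nothing GBNet-specific): ridge just adds a scaled identity to the normal equations before solving.

```python
# Hedged sketch: ordinary least squares vs the ridge solution.
import numpy as np

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, lam=1.0):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```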
Ordinal loss is complex and has viable alternatives. But on a set of 19 ordinal datasets, ordinal regression using LightGBM came out on top. Maybe worth keeping in mind if your data is ordinal.
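For the curious, a sketch of a cumulative-logit ordinal loss in plain PyTorch; in the experiment above the score f(x) comes from LightGBM through GBNet, but here it's an nn.Linear so the loss runs on its own:

```python
# Hedged sketch of an ordinal (cumulative-logit) loss with learned ordered cutpoints.
import torch
import torch.nn as nn

class OrdinalLogit(nn.Module):
    def __init__(self, d_in, n_classes):
        super().__init__()
        self.score = nn.Linear(d_in, 1)
        # Ordered cutpoints theta_1 < ... < theta_{K-1}, built from positive gaps.
        self.raw_gaps = nn.Parameter(torch.zeros(n_classes - 1))

    def forward(self, x):
        theta = torch.cumsum(nn.functional.softplus(self.raw_gaps) + 1e-3, dim=0)
        return self.score(x).squeeze(-1), theta

def ordinal_nll(f, theta, y):
    # P(Y <= k) = sigmoid(theta_k - f); class prob is a difference of adjacent CDFs.
    cdf = torch.sigmoid(theta.unsqueeze(0) - f.unsqueeze(1))                 # (n, K-1)
    ones = torch.ones(f.shape[0], 1)
    zeros = torch.zeros(f.shape[0], 1)
    probs = torch.cat([cdf, ones], dim=1) - torch.cat([zeros, cdf], dim=1)   # (n, K)
    return -torch.log(probs.gather(1, y.unsqueeze(1)).clamp_min(1e-9)).mean()

model = OrdinalLogit(d_in=5, n_classes=4)
x, y = torch.randn(32, 5), torch.randint(0, 4, (32,))
f, theta = model(x)
loss = ordinal_nll(f, theta, y)
loss.backward()
```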
Plots are below. Which is the contrastive embedding? Which is the classification embedding?