Lightnews — Scholar-powered news

Jason Brownlee

@jason2brownlee.bsky.social

Research scientist & software engineer.
PhD in #AI #MachineLearning #DataScience
Authored 40+ tech books and 1500+ tutorials.
Home: JasonBrownlee.me

Stacking Ensemble With Dropout Regularization
jasonbrownlee.me/blog/posts/s...

I was thinking about stacking ensembles (stacked generalization) in the sauna. Stacked ensembles overfit, so we need to regularize. Generally, we use cross-validation to ensure that the meta model is ...

January 23, 2025 at 10:29 PM

Awesome AutoML Books
A curated list of books for engineers on development with Automated Machine Learning (#AutoML).
github.com/Jason2Brownl...

GitHub - Jason2Brownlee/Awesome-AutoML-Books: Awesome AutoML Books: Curated list of books on Automated Machine Learning

December 30, 2024 at 10:06 PM

Awesome LLM Books
This is a curated list of books for engineers on development with Large Language Models (LLMs).
github.com/Jason2Brownl...

GitHub - Jason2Brownlee/awesome-llm-books: Awesome LLM Books: Curated list of books on Large Language Models

December 26, 2024 at 10:10 PM

Is there evidence that model performance on the train and test sets follows the same distribution?

Use statistical tests to check whether the model performance distributions are equivalent.

Check Model Performance Distributions:
datasciencediagnostics.com/diagnostics/...

December 11, 2024 at 5:32 PM
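
One way to run this kind of check is sketched below. It is a minimal illustration, not the method from the linked diagnostic: it assumes you already have a list of scores from repeated resampling on each set, and it swaps in a two-sample Kolmogorov-Smirnov statistic with a permutation p-value so that only the standard library is needed. The score values and the function names `ks_statistic` and `permutation_p_value` are hypothetical.

```python
import random


def ks_statistic(a, b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    gap = 0.0
    for x in list(a) + list(b):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Share of random relabelings with a KS statistic at least as large as observed."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical accuracy scores from repeated resampling; replace with your own.
train_scores = [0.91, 0.93, 0.92, 0.94, 0.90, 0.92, 0.93, 0.91, 0.92, 0.94]
test_scores = [0.84, 0.86, 0.83, 0.85, 0.87, 0.82, 0.85, 0.84, 0.86, 0.83]

statistic = ks_statistic(train_scores, test_scores)
p_value = permutation_p_value(train_scores, test_scores)
print(f"KS statistic = {statistic:.2f}, permutation p-value = {p_value:.4f}")
```

A small p-value suggests the two score distributions differ. A large one only means no difference was detected, which is weaker than showing equivalence. If SciPy is available, `scipy.stats.ks_2samp` computes the same statistic with an analytic p-value.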

Is there evidence that your train and test sets have the same distributions?

Use statistical tests to confirm that numerical and categorical distributions are equivalent.

Train/Test Data Distributions:
datasciencediagnostics.com/diagnostics/...

December 10, 2024 at 8:44 PM
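
For a categorical column, the comparison could look roughly like the sketch below. It assumes a single feature given as two plain lists of labels and uses a Pearson chi-squared statistic with a permutation p-value in place of the usual asymptotic test, purely to stay within the standard library. The `colour`-style data and both function names are made up for illustration.

```python
import random
from collections import Counter


def chi_squared_statistic(train_values, test_values):
    """Pearson chi-squared statistic for a 2 x k table of category counts."""
    train_counts, test_counts = Counter(train_values), Counter(test_values)
    n_train, n_test = len(train_values), len(test_values)
    total = n_train + n_test
    statistic = 0.0
    # Sorted so the summation order is the same on every call.
    for category in sorted(set(train_counts) | set(test_counts)):
        category_total = train_counts[category] + test_counts[category]
        for observed, group_size in ((train_counts[category], n_train),
                                     (test_counts[category], n_test)):
            expected = category_total * group_size / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def permutation_p_value(train_values, test_values, n_permutations=2000, seed=0):
    """Share of random train/test relabelings at least as extreme as observed."""
    rng = random.Random(seed)
    observed = chi_squared_statistic(train_values, test_values)
    pooled = list(train_values) + list(test_values)
    n_train = len(train_values)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        permuted = chi_squared_statistic(pooled[:n_train], pooled[n_train:])
        if permuted >= observed - 1e-12:  # small tolerance for float ties
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical categorical feature with a visibly shifted test set.
train_feature = ["red"] * 40 + ["green"] * 35 + ["blue"] * 25
test_feature = ["red"] * 5 + ["green"] * 8 + ["blue"] * 17

statistic = chi_squared_statistic(train_feature, test_feature)
p_value = permutation_p_value(train_feature, test_feature)
print(f"chi-squared = {statistic:.2f}, permutation p-value = {p_value:.4f}")
```

Numerical columns can be handled the same way with a two-sample statistic such as Kolmogorov-Smirnov. With SciPy installed, `scipy.stats.chi2_contingency` and `scipy.stats.ks_2samp` are the usual shortcuts.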

Is there evidence that the Performance Gap is real or just statistical noise?

Carefully quantify the difference between train and test set performance.

Quantify the Performance Gap:
datasciencediagnostics.com/diagnostics/...

December 9, 2024 at 6:45 PM
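
A rough sketch of one way to quantify the gap: a percentile bootstrap interval for the difference between train and test accuracy. This is an assumed approach, not necessarily the one in the linked diagnostic. It expects per-example correctness flags (1 for a correct prediction, 0 otherwise), and the counts below are invented.

```python
import random


def bootstrap_gap_interval(train_correct, test_correct,
                           n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for train accuracy minus test accuracy."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_resamples):
        # Resample each set with replacement, keeping its original size.
        train_sample = rng.choices(train_correct, k=len(train_correct))
        test_sample = rng.choices(test_correct, k=len(test_correct))
        train_acc = sum(train_sample) / len(train_sample)
        test_acc = sum(test_sample) / len(test_sample)
        gaps.append(train_acc - test_acc)
    gaps.sort()
    lower = gaps[int((alpha / 2) * n_resamples)]
    upper = gaps[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical results: 470/500 correct on train, 170/200 correct on test.
train_correct = [1] * 470 + [0] * 30
test_correct = [1] * 170 + [0] * 30

observed_gap = (sum(train_correct) / len(train_correct)
                - sum(test_correct) / len(test_correct))
lower, upper = bootstrap_gap_interval(train_correct, test_correct)
print(f"gap = {observed_gap:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

If the interval excludes zero, the gap is unlikely to be resampling noise alone; if it straddles zero, the data do not clearly separate the two. `scipy.stats.bootstrap` offers a more complete implementation when SciPy is available.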

Data Science Diagnostics
Helpful checks for data scientists with urgent problems
DataScienceDiagnostics.com

#DataScience #MachineLearning

December 8, 2024 at 9:57 PM

Are you sure your train/test split percentage is well chosen?

Common split percentages are just heuristics; it is better to know how your data and model behave under different split scenarios.

Perform a split-size sensitivity analysis:
datasciencediagnostics.com/diagnostics/...
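
A split-size sensitivity analysis might be sketched as follows: repeat random splits at several test fractions and compare the mean and spread of the resulting scores. Everything here is an illustrative assumption, including the synthetic two-class dataset, the tiny nearest-centroid classifier, and the function names. In practice you would plug in your own data and model, for example with scikit-learn's `train_test_split`.

```python
import random
import statistics


def make_dataset(n_rows, rng):
    """Two overlapping 2-D Gaussian classes (synthetic, for illustration only)."""
    rows = []
    for _ in range(n_rows):
        label = rng.randint(0, 1)
        center = 1.5 * label
        rows.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return rows


def nearest_centroid_accuracy(train_rows, test_rows):
    """Fit class centroids on train_rows and score accuracy on test_rows."""
    centroids = {}
    for label in (0, 1):
        points = [x for x, y in train_rows if y == label]
        centroids[label] = tuple(sum(axis) / len(points) for axis in zip(*points))

    def predict(x):
        return min(centroids, key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(x, centroids[label])))

    return sum(predict(x) == y for x, y in test_rows) / len(test_rows)


def split_size_sensitivity(rows, test_fractions, n_repeats=30, seed=0):
    """Mean and standard deviation of test accuracy for each test fraction."""
    rng = random.Random(seed)
    results = {}
    for fraction in test_fractions:
        n_test = max(1, round(fraction * len(rows)))
        scores = []
        for _ in range(n_repeats):
            shuffled = rows[:]
            rng.shuffle(shuffled)
            scores.append(
                nearest_centroid_accuracy(shuffled[n_test:], shuffled[:n_test]))
        results[fraction] = (statistics.mean(scores), statistics.stdev(scores))
    return results


rows = make_dataset(300, random.Random(42))
results = split_size_sensitivity(rows, [0.1, 0.2, 0.3, 0.4, 0.5])
for fraction, (mean_acc, sd_acc) in results.items():
    print(f"test={fraction:.0%}  mean accuracy={mean_acc:.3f}  sd={sd_acc:.3f}")
```

Reading the table, a reasonable split is one where the mean score has stabilised and the spread across repeats is acceptably small for your purposes.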