Jason Brownlee
@jason2brownlee.bsky.social
Research scientist & software engineer.
PhD in #AI #MachineLearning #DataScience
Authored 40+ tech books and 1500+ tutorials.
Home: JasonBrownlee.me
Awesome AutoML Books
A curated list of books for engineers on development with Automated Machine Learning (#AutoML).
github.com/Jason2Brownl...
GitHub - Jason2Brownlee/Awesome-AutoML-Books: Awesome AutoML Books: Curated list of books on Automated Machine Learning
December 30, 2024 at 10:06 PM
Awesome LLM Books
This is a curated list of books for engineers on development with Large Language Models (LLMs).
github.com/Jason2Brownl...
GitHub - Jason2Brownlee/awesome-llm-books: Awesome LLM Books: Curated list of books on Large Language Models
December 26, 2024 at 10:10 PM
Is there evidence that model performance on the train and test sets follows the same distribution?

Use statistical tests to check whether the model performance distributions are equivalent.

Check Model Performance Distributions:
datasciencediagnostics.com/diagnostics/...
December 11, 2024 at 5:32 PM
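One way to run the performance-distribution check from the post above is a two-sample rank test on repeated train and test scores. The sketch below is not taken from the linked page. It assumes a Mann-Whitney U test with a normal approximation, and the score lists are hypothetical placeholders rather than real results.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test using the normal approximation.

    No tie or continuity correction is applied, so the p-value is only
    rough for small or heavily tied samples.
    """
    n_a, n_b = len(a), len(b)
    # U counts pairs where the value from `a` exceeds the value from `b`;
    # ties count as half a win.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    mean_u = n_a * n_b / 2
    std_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / std_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p_value


# Hypothetical accuracy scores from repeated evaluation (illustrative only).
train_scores = [0.91, 0.93, 0.92, 0.94, 0.92, 0.93, 0.91, 0.92, 0.93, 0.92]
test_scores = [0.85, 0.87, 0.84, 0.86, 0.88, 0.85, 0.86, 0.84, 0.87, 0.85]

u, p_value = mann_whitney_u(train_scores, test_scores)
print(f"U = {u:.1f}, p-value = {p_value:.5f}")
```

A small p-value suggests the two sets of scores do not come from the same distribution. A large p-value is only a failure to detect a difference, not proof of equivalence.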
Is there evidence that your train and test sets have the same distributions?

Use statistical tests to confirm that numerical and categorical distributions are equivalent.

Train/Test Data Distributions:
datasciencediagnostics.com/diagnostics/...
December 10, 2024 at 8:44 PM
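The train/test distribution check in the post above could be sketched as follows, assuming a Kolmogorov-Smirnov statistic for numerical features, a Pearson chi-squared statistic for categorical features, and a permutation test to turn either statistic into a p-value. The feature names and generated data are hypothetical, and the linked page may use different tests.

```python
import random
from bisect import bisect_right
from collections import Counter


def ks_statistic(a, b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def chi2_statistic(a, b):
    """Pearson chi-squared statistic for a 2 x k table of category counts."""
    ca, cb = Counter(a), Counter(b)
    n_a, n_b = len(a), len(b)
    stat = 0.0
    for cat in set(ca) | set(cb):
        col_total = ca[cat] + cb[cat]
        for observed, row_total in ((ca[cat], n_a), (cb[cat], n_b)):
            expected = row_total * col_total / (n_a + n_b)
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(a, b, statistic, n_permutations=1000, seed=0):
    """Share of random relabelings whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = statistic(a, b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical features; the test-set ages are deliberately shifted.
rng = random.Random(1)
train_age = [rng.gauss(40, 10) for _ in range(150)]
test_age = [rng.gauss(48, 10) for _ in range(100)]
train_city = rng.choices(["A", "B", "C"], weights=[5, 3, 2], k=150)
test_city = rng.choices(["A", "B", "C"], weights=[5, 3, 2], k=100)

p_age = permutation_p_value(train_age, test_age, ks_statistic)
p_city = permutation_p_value(train_city, test_city, chi2_statistic)
print(f"age p-value: {p_age:.4f}, city p-value: {p_city:.4f}")
```

If SciPy is available, `scipy.stats.ks_2samp` and `scipy.stats.chi2_contingency` provide analytic versions of the same two tests.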
Is there evidence that the Performance Gap is real or just statistical noise?

Carefully quantify the difference between train and test set performance.

Quantify the Performance Gap:
datasciencediagnostics.com/diagnostics/...
December 9, 2024 at 6:45 PM
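A minimal sketch of quantifying the performance gap from the post above, assuming accuracy as the metric and a percentile bootstrap for the confidence interval. The per-example outcomes are hypothetical, and the linked page may quantify the gap differently.

```python
import random
from statistics import fmean


def bootstrap_gap_ci(train_correct, test_correct, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for (train accuracy - test accuracy)."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_boot):
        # Resample each set with replacement, keeping its original size.
        train_sample = rng.choices(train_correct, k=len(train_correct))
        test_sample = rng.choices(test_correct, k=len(test_correct))
        gaps.append(fmean(train_sample) - fmean(test_sample))
    gaps.sort()
    lower = gaps[int((alpha / 2) * n_boot)]
    upper = gaps[int((1 - alpha / 2) * n_boot) - 1]
    observed = fmean(train_correct) - fmean(test_correct)
    return observed, lower, upper


# Hypothetical per-example outcomes: 1 = correct prediction, 0 = error.
train_correct = [1] * 460 + [0] * 40  # 92% train accuracy
test_correct = [1] * 160 + [0] * 40   # 80% test accuracy

gap, lower, upper = bootstrap_gap_ci(train_correct, test_correct)
print(f"gap = {gap:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

If the interval excludes zero, the gap is unlikely to be noise at the chosen level. An interval that straddles zero means the data cannot distinguish the gap from sampling variation.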
Data Science Diagnostics
Helpful checks for data scientists with urgent problems
DataScienceDiagnostics.com

#DataScience #MachineLearning
December 8, 2024 at 9:57 PM
Are you sure your train/test split percentage is well chosen?

Common split percentages are just heuristics; it is better to know how your data and model behave under different split scenarios.

Perform a split-size sensitivity analysis:
datasciencediagnostics.com/diagnostics/...
December 8, 2024 at 9:46 PM
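The split-size sensitivity analysis in the post above could look roughly like this: repeat random splits at several test fractions and compare the mean and spread of the test error. The synthetic data, the least-squares line standing in for a model, and the helper names are all assumptions for illustration.

```python
import random
from statistics import fmean, stdev


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def split_sensitivity(xs, ys, test_fractions, n_repeats=30, seed=0):
    """Mean and standard deviation of test MAE for each test fraction."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    results = {}
    for frac in test_fractions:
        n_test = max(1, round(frac * len(xs)))
        errors = []
        for _ in range(n_repeats):
            rng.shuffle(indices)
            test_idx, train_idx = indices[:n_test], indices[n_test:]
            slope, intercept = fit_line(
                [xs[i] for i in train_idx], [ys[i] for i in train_idx]
            )
            abs_errors = [abs(ys[i] - (slope * xs[i] + intercept)) for i in test_idx]
            errors.append(fmean(abs_errors))
        results[frac] = (fmean(errors), stdev(errors))
    return results


# Synthetic data (illustrative only): y = 3x + 2 plus Gaussian noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(200)]
ys = [3 * x + 2 + data_rng.gauss(0, 1) for x in xs]

results = split_sensitivity(xs, ys, [0.1, 0.2, 0.3, 0.5])
for frac, (mean_mae, sd_mae) in results.items():
    print(f"test fraction {frac:.1f}: MAE {mean_mae:.3f} +/- {sd_mae:.3f}")
```

A split size is a reasonable choice when the mean error is stable and the spread across repeats is acceptably small. Very small test sets tend to give noisy estimates, while very small training sets tend to give worse models.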
Data Science Diagnostic Checklist
(from 10+ years of consulting)
github.com/Jason2Brownl...
GitHub - Jason2Brownlee/DataScienceDiagnosticChecklist: Data Science Diagnostic Checklist
December 3, 2024 at 6:34 PM
Use code BLACKFRIDAY for 30% off the "Python Concurrency Boxed Set": superfastpython.com/python-jump-...
November 29, 2024 at 8:46 PM
XGBoost is all you need: XGBoosting.com
November 25, 2024 at 2:49 AM