Thomas Speidel
thomas-speidel.bsky.social
@timpmorris.bsky.social you may also be interested in the patently wrong bootstrap implementation in early sklearn. It has since been fixed. My memory is vague but more info here: www.reddit.com/r/statistics...
shaggorama's comment on "Is R better than Python at anything? I started learning R half a year ago and I wonder if I should switch."
August 13, 2025 at 3:36 PM
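For reference, a minimal sketch of what a plain nonparametric bootstrap does: draw n observations with replacement from the full sample of n, recompute the statistic, repeat, and read off percentiles. This is not a reconstruction of the old sklearn code; `bootstrap_ci`, its parameters, and the toy data are made up for illustration.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Each resample has the same size as the original sample and is
        # drawn with replacement from ALL observations.
        resample = rng.choices(data, k=n)
        estimates.append(stat(resample))
    estimates.sort()
    # Percentile interval: cut alpha/2 off each tail of the sorted estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Toy data, purely illustrative.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
low, high = bootstrap_ci(sample)
print(f"mean={statistics.mean(sample):.2f}, 95% CI=({low:.2f}, {high:.2f})")
```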
Besides everything that's been said, it leads to peaks and valleys, although it "looks" ok here. Natural splines, or better, restricted cubic splines (i.e. restricted to be linear at the extremes where there's less data) are the way to go here.
July 16, 2025 at 1:35 PM
I have been using FST for largish datasets with high compression for several yrs. What you select really depends on what you value, like size, speed, interoperability, preserving metadata etc. (e.g. FST won't keep some types of metadata like var labels).
February 20, 2025 at 3:19 PM
Then you need to check out @f2harrell.bsky.social's extensive notes and #rstat functions! It's surprising how ordinal models, esp. the proportional odds model, are not being used more. Robust, can get any quantiles, exceedance P, means, can deal with ceiling/flooring, clumping...
February 13, 2025 at 2:46 PM
📌
February 13, 2025 at 2:38 PM
📌
February 13, 2025 at 2:37 PM
This is lovely, thanks for sharing! Would be nice to also highlight how tidyverse influenced other technologies (cough...pandas...cough)
February 12, 2025 at 3:02 PM
📌
February 3, 2025 at 2:40 PM
📌
January 22, 2025 at 2:52 PM
📌
January 21, 2025 at 3:02 PM
They showed calibration of an AI model?😁
January 15, 2025 at 2:36 PM
📌
January 6, 2025 at 2:50 PM
📌
December 16, 2024 at 2:18 PM
📌
December 16, 2024 at 2:11 PM
I think caret has been largely replaced by tidymodels these days.
December 6, 2024 at 1:30 PM
True, but I feel it is somehow worse in ML because a lot of it is based on approaches that have long been shown to be ineffective or misleading, or that were developed for a different goal, and these practices are dogmatic in the ML camp. From data splitting to variable selection, calibration, class imbalance, etc.
November 17, 2024 at 3:56 PM