jacobpstein.bsky.social
@jacobpstein.bsky.social
Evidence-based data science, vibes-based basketball fan. Here for #tidytuesday, mostly. Code here: https://github.com/jacobpstein
Trying to get back into writing. In my latest, I look at NBA rookie minutes and the Wizards and cacti. Like and subscribe!🌵 open.substack.com/pub/wizardsp...
Let 'Em Cook?
Kicking off a series on rookie minutes and rookie growth
open.substack.com
August 1, 2025 at 9:09 PM
If you're on the data science job hunt and feeling discouraged, just know that there are terrible clustering algos out there, in production, and you can do much better. Like, look at these 'similar shoes' from DSW. If you're reading this, you can get better results. I believe in you!
July 10, 2025 at 6:10 PM
Still hurts to see the Wizards at 6 after seeing the Wizards at 18-64.
June 25, 2025 at 11:32 PM
Say what you will about Poole as a player, but he went from laughing stock to really winning fans over in DC this past season
NEWS: The Wizards and Pelicans have agreed on a trade, sources confirm to The Athletic.

NOP receives:
◻️ Jordan Poole
◻️ Saddiq Bey
◻️ 2025 No. 40 pick

WAS receives:
◻️ CJ McCollum
◻️ Kelly Olynyk
◻️ Future second-round pick
June 24, 2025 at 7:27 PM
In my old day job, I pushed for more simulations to inform causal design, check methods, and help us learn. I posted some code and did a little write up on LinkedIn with a colleague about a case of propensity score matching that crossed our desks. github.com/jacobpstein/...
GitHub - jacobpstein/psm_did_sim: Repo for PSM + DID bias from inclusion of treatment affected covariates in a matching model
Repo for PSM + DID bias from inclusion of treatment affected covariates in a matching model - jacobpstein/psm_did_sim
github.com
June 23, 2025 at 6:51 PM
Sometimes you accidentally write a recursive loop and that's when the fun really starts.
June 18, 2025 at 6:16 PM
What a difference a default makes: on viz buzz last week I used theme_minimal, which defaults to a clear background. My viz was 7% similar to the target viz due to transparency. @libbyheeren.bsky.social & @nickwan.bsky.social checked again with a white background and well… m.twitch.tv/nickwan_data...
nickwan_datasci - transparency matters
Watch nickwan_datasci's clip titled
m.twitch.tv
June 16, 2025 at 2:13 PM
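A toy sketch of why a transparent background can tank a pixel-match score. The actual viz buzz scoring method isn't described here, so the `similarity` function, the 10-pixel "plots", and the idea that a scorer drops the alpha channel are all assumptions for illustration only.

```python
WHITE = (255, 255, 255)


def drop_alpha(pixel):
    """Naively discard the alpha channel, so a clear pixel (0, 0, 0, 0) reads as black."""
    r, g, b, _a = pixel
    return (r, g, b)


def composite_over(pixel, background=WHITE):
    """Alpha-blend one RGBA pixel onto an opaque background color."""
    r, g, b, a = pixel
    alpha = a / 255
    return tuple(round(alpha * c + (1 - alpha) * bg) for c, bg in zip((r, g, b), background))


def similarity(img_a, img_b):
    """Hypothetical score: share of pixel positions whose RGB values match exactly."""
    matches = sum(pa == pb for pa, pb in zip(img_a, img_b))
    return matches / len(img_a)


# Toy "plots": one pixel of ink and nine pixels of background.
target = [(30, 60, 200)] + [WHITE] * 9                    # opaque white background
submission = [(30, 60, 200, 255)] + [(0, 0, 0, 0)] * 9    # same ink, clear background

naive = similarity([drop_alpha(p) for p in submission], target)
flattened = similarity([composite_over(p) for p in submission], target)
print(f"alpha dropped: {naive:.0%}, flattened on white: {flattened:.0%}")
```

Same ink, very different score: only the background handling changes between the two comparisons.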
Very cool data viz showing game flow in 3D for the NBA finals vsueiro.com/hoop-hills/
Hoop Hills: Peaks & Valleys of NBA Games
This 3D data visualization shows every moment a team was leading or trailing
vsueiro.com
June 7, 2025 at 5:17 PM
I barely had any time for #TidyTuesday this week and want to revisit these Gutenberg data sets with some LLM tools at some point. I looked at life spans but kept it to the period since the modern novel was born. This could be a good interactive if I were doing a quarto presentation
June 4, 2025 at 5:36 PM
I know it's lame to highlight a corporate-y Getty photo, but this is one of those cool basketball pics that highlights how these guys are so good at doing otherworldly stuff--like somehow shooting a ball while seemingly falling and being blocked
May 30, 2025 at 12:03 AM
I don’t like live coding—it’s kind of like when someone asks if you know any jokes and you can’t think of a single funny thing you’ve ever heard in your life. But it’s also probably good to live code occasionally so at least you know where you get stuck, what makes you nervous, etc.
May 29, 2025 at 1:49 AM
This week's #TidyTuesday was a tough one! Lots of correlated values, no domain knowledge, and small-n groups. I spent a long time flailing around trying to figure out what might be interesting. Predicting hit points based on the other data seemed like a good way to compare model types.
May 28, 2025 at 2:30 PM
I didn't get around to doing #TidyTuesday last week because I was hustling to finish slides for a presentation to the DC Data Viz meetup. Here are the slides--https://0196f5d5-dc61-3977-66b7-ccd1e7b9cead.share.connect.posit.cloud/#/title-slide
May 26, 2025 at 5:15 PM
@owenphillips.bsky.social don't quite know what to make of this, but the correlation between two-point attempts and shot quality went positive on average for the first time this season. Could be spurious, could be mid-range theory at play
May 22, 2025 at 2:48 AM
I have been re-reading Ferrante's Neapolitan Novels so this week's #TidyTuesday felt very much on theme. I started to go down a rabbit hole of spatial modeling, but decided that for getting this done while I have a little time, it's better just to make a nice descriptive plot.
May 16, 2025 at 5:54 PM
If the 1995 heist thriller “Heat” had been set in DC, it would have been called “Humidity”
May 16, 2025 at 1:04 PM
Just finished a volunteer data project where I was given a bunch of CRM data from different platforms with different column setups. There is a special satisfaction to cleaning a bunch of messy data, getting into a flow state, and emerging mostly sane.
May 15, 2025 at 11:41 AM
For this week's #TidyTuesday, I made a Shiny dash using querychat to filter cancelled NSF grants from Grant Watch. As a fired Fed, this was sadly relevant. I'm not the strongest when it comes to Shiny, but it works so that's a win 0196abac-e292-c6e2-6623-5ed2c685e4ea.share.connect.posit.cloud
Cancelled NSF Grants Dashboard
0196abac-e292-c6e2-6623-5ed2c685e4ea.share.connect.posit.cloud
May 7, 2025 at 5:27 PM
Trying to get my #TidyTuesday post up, but hit a snag where my Shiny dash won't load. If any folks on #rstats Bluesky are Shiny experts and have also used ellmer, my question is here; feel free to DM or reply on the forum! forum.posit.co/t/querychat-...

cc @jcheng5.bsky.social
QueryChat dashboard publishes, log doesn't contain errors, but the dash never fully loads
I just finished creating a Shiny dashboard that uses querychat to filter a data set in R and load a bunch of figures, as well as a table. The dash loads fine locally and I was able to publish without ...
forum.posit.co
May 7, 2025 at 1:18 PM
Nobody:
Me: and this is why it’s always good to present uncertainty around your probability estimates. Thanks, and I’ll see myself out.
AAAAAAHHHHHHHHHHH..............
May 7, 2025 at 2:09 AM
Trying to get better at #shiny and am amazed at how good some of the learning tools are. Like shinyWidgetsGallery() which is a function that opens a Shiny dash to help you with your Shiny dash. It’s shiny dashes all the way down #rstats
May 6, 2025 at 12:07 PM
My old man #NBA take is that if a player on a team already wears the number 0, you can’t give another player 00. It’s the same number representing the same numerical value! Anyway, the #Pacers are a fun team, all the best to Haliburton and Mathurin.
May 5, 2025 at 12:34 AM
I don’t think there’s data on this but I think we might be in a golden age of the reverse layup
May 1, 2025 at 12:04 AM
Oof, didn't have much time for #TidyTuesday today, but thought I'd look at the density of sessions at the #UseR conference by time slot. I always like these graphs even if I'm not totally sure they make sense. The colors are inspired by LaCroix and the excellent LaCroixR package.
April 30, 2025 at 12:55 AM
For this week's #TidyTuesday, I looked at how Poisson and OLS regression differ. I don't think I really learned about this in school, but you can run into all kinds of issues if you want to model count data, like auto fatalities in this week's data.
April 23, 2025 at 3:22 PM
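A rough sketch of one of those issues, using simulated counts rather than the actual #TidyTuesday fatalities data. Everything here is an assumption for illustration: the data-generating process, the coefficients (0.2 and 0.9), and the hand-rolled Newton fit standing in for whatever a real modeling package would do. The point is just that a straight line fit to counts can predict a negative count, while a Poisson model with a log link cannot.

```python
import math
import random


def draw_poisson(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def fit_ols(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_poisson(xs, ys, steps=25):
    """Poisson regression with a log link, fit by Newton's method on the log-likelihood."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(steps):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mus))
        # Entries of the 2x2 information matrix (negative Hessian)
        h00 = sum(mus)
        h01 = sum(m * x for x, m in zip(xs, mus))
        h11 = sum(m * x * x for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Simulated count data: log(expected count) = 0.2 + 0.9 * x (made-up coefficients)
rng = random.Random(42)
xs = [rng.uniform(-2, 2) for _ in range(500)]
ys = [draw_poisson(rng, math.exp(0.2 + 0.9 * x)) for x in xs]

ols_b0, ols_b1 = fit_ols(xs, ys)
pois_b0, pois_b1 = fit_poisson(xs, ys)

# Compare predictions at the low end of x, where expected counts are close to zero
ols_at_low_x = ols_b0 + ols_b1 * -2
pois_at_low_x = math.exp(pois_b0 + pois_b1 * -2)
print(f"OLS prediction at x = -2: {ols_at_low_x:.2f}")
print(f"Poisson prediction at x = -2: {pois_at_low_x:.2f}")
```

In a real analysis a GLM routine would do the fitting and also report standard errors; the hand-rolled version just keeps the sketch dependency-free.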