Dan
@danwalkerdatasci.bsky.social
Data person
Former fish squeezer
Python - R - Rust
Reposted by Dan
“Tens or hundreds of millions of dollars of taxpayer-funded NASA property and laboratories are at risk of either being discarded, mishandled, or out-of-commission for significant time periods.” 🔭🧪

www.gesta-goddard.org/blog/gestas-...
GESTA’s Summary of Goddard Building Closures Status
NASA Goddard Space Flight Center is the largest group of scientists, technicians and engineers in the US who develop Earth and space science flight missions. Below is GESTA's understanding of the...
www.gesta-goddard.org
November 6, 2025 at 2:50 AM
Reposted by Dan
Messy folders haunting your R projects? 👻

This Wednesday Oct 29 at 4:00 PM PDT, I'll lead a workshop on Efficient File Management in R with {fs} hosted by @r-ladies-stl.bsky.social. Let's clean and organize a spooky messy folder together!

Register at www.meetup.com/rladies-st-l...

#RStats #DataBS
October 27, 2025 at 9:51 PM
Reposted by Dan
You should read this wonderful little history of the #tidyverse, by @hadley.nz.

It reminded me about my early #rstats days as a PhD student (2011 - 2016), where I was constantly trying out the new things Hadley and crew were cooking up.

hadley.github.io/25-tidyverse...
A personal history of the tidyverse
hadley.github.io
October 22, 2025 at 11:30 AM
Reposted by Dan
A #Slurm user just confirmed that "yay it works. Pretty sick!"

Thanks to excellent feedback from several users, it'll soon be even easier to distribute #rstats code via #HPC job schedulers using future.batchtools

#parallel #futureverse
If anyone else is following this, we've moved over to github.com/futureverse/..., where progress has already been made
September 17, 2025 at 3:19 PM
Reposted by Dan
AI is powerful, but it's no free lunch - and again, it's no substitute for YOUR expertise.
August 17, 2025 at 4:21 PM
R and QGIS, name a better combo

#databs
“I used R the statistical programming language to analyse each of the 3-hourly netCDFs — a file format for storing multidimensional scientific data — and create a geoJSON file where the data was greater than 35C. These files were then loaded into Qgis and styled….” - @sdbernard.bsky.social #RStats
How we made it: Deadly heat domes https://on.ft.com/3TqLwud
July 7, 2025 at 1:46 PM
Reposted by Dan
“I used R the statistical programming language to analyse each of the 3-hourly netCDFs — a file format for storing multidimensional scientific data — and create a geoJSON file where the data was greater than 35C. These files were then loaded into Qgis and styled….” - @sdbernard.bsky.social #RStats
July 7, 2025 at 10:50 AM
Reposted by Dan
Data science junkies, get ready! 🚀 "The Test Set" #podcast trailer is here for your viewing pleasure.

Tune in July 1st and every Tuesday after for new episodes with hosts @mchow.com, @hadley.nz, and @wesmckinney.com as they welcome thought leaders in #DataScience.

Subscribe now: pos.it/thetestset
June 18, 2025 at 4:58 PM
Reposted by Dan
Bleeding edge update for the #tidyverse purrr package with even more seamless #rstats parallel maps.

Introducing our shiniest new adverb: `in_parallel()`. Just wrap your function to take advantage of blazing fast parallel processing via mirai.

pak::pak("tidyverse/purrr")

purrr.tidyverse.org/dev/
Functional Programming Tools
A complete and consistent functional programming toolkit for R.
purrr.tidyverse.org
June 13, 2025 at 3:32 PM
Reposted by Dan
Being able to productionize a ML model is often the goal; however, there are many things to keep track of when you do. The orbital package lets you translate your fitted scikit-learn or tidymodels model into SQL that, when run, produces predictions.

posit.co/blog/databri... #python #rstats
Posit
Accelerate model deployment with Databricks and Orbital for R and Python Scikit-learn/Tidymodels projects.
posit.co
June 9, 2025 at 10:08 PM
Claude 4 is pretty impressive 🤖
May 22, 2025 at 11:19 PM
Reposted by Dan
How reliable are LLMs at extracting data from pdfs? Inspired by @simonwillison.net's PyCon talk, I added extracting FEMA's daily operation briefing to my LLM evals suite.

Just one model extracted the data from the pdf correctly: Gemini 2.5 Pro Preview. Full results -> kschaul.com/llm-evals/ev...
May 16, 2025 at 7:10 PM
Reposted by Dan
☕ Coffee and Coding ☕

Do you have an interesting piece of code/work to showcase, an opportunity for collaboration or a code dilemma you would like help with?

We would love to hear from you at Coffee and Coding – join the NHS-R Community Slack for more info (postcard.nhsrcommunity.com)!

#rstats
NHS-R Community
postcard.nhsrcommunity.com
April 8, 2025 at 11:41 AM
Reposted by Dan
Statistical Rethinking with brms, ggplot2, and the tidyverse Second edition by A Solomon Kurz
#RStats
https://bigbookofr.com/chapters/statistics.html#statistical-rethinking-with-brms-ggplot2-and-the-tidyverse-second-edition
April 5, 2025 at 12:34 PM
Reposted by Dan
Trying something new:
A 🧵 on a topic I find many students struggle with: "why do their 📊 look more professional than my 📊?"

It's *lots* of tiny decisions that aren't the defaults in many libraries, so let's break down 1 simple graph by @jburnmurdoch.bsky.social

🔗 www.ft.com/content/73a1...
November 20, 2024 at 5:09 PM
Reposted by Dan
We're delighted to announce Jonathan McPherson – software architect at Posit – as keynote speaker at posit::conf(2025)!

If you're curious about how thoughtful design principles can improve the data science tools you use, you won't want to miss this!

Join us Sep 16-18 in Atlanta. pos.it/conf
March 20, 2025 at 7:02 PM
Reposted by Dan
R+Docker, we use an R pkg project structure (R/, man/, tests/, inst/) plus additional top-level folders like `exec/` for docker-executable scripts, `dev/` for devel/sandbox scripts, `reports/` for one-time reports, and `local/` for gitignored large files.

app.R, plumber.R etc go at the top level.
February 27, 2025 at 3:43 PM
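The layout described in the post above can be sketched in base R. This is a minimal illustration, not the poster's actual tooling: `scaffold_project()` and the `myproject` folder name are hypothetical, and the `.gitignore` entry is an assumption about how `local/` is kept out of version control.

```r
# Hypothetical helper (not from the post): create the folder layout it describes.
scaffold_project <- function(root) {
  pkg_dirs   <- c("R", "man", "tests", "inst")        # standard R package folders
  extra_dirs <- c("exec", "dev", "reports", "local")  # extra top-level folders named in the post

  for (d in c(pkg_dirs, extra_dirs)) {
    dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  }

  # Entry points such as app.R and plumber.R sit at the top level.
  file.create(file.path(root, c("app.R", "plumber.R")))

  # Assumption: local/ is ignored through a plain .gitignore entry.
  writeLines("local/", file.path(root, ".gitignore"))

  invisible(root)
}

# Build the skeleton in a temporary directory so nothing real is touched.
root <- file.path(tempdir(), "myproject")
scaffold_project(root)
list.files(root, all.files = TRUE, no.. = TRUE)
```

Running it against `tempdir()` keeps the sketch side-effect free; pointing `root` at a real repository would overwrite an existing `.gitignore`, so adapt that step before using it for real.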
Reposted by Dan
Skip splash screen in RStudio IDE 2024.12+ nanx.me/blog/post/rs... #rstats
Skip RStudio splash screen
Learn how to skip the RStudio IDE splash screen using simple automation scripts for macOS, Linux, and Windows.
nanx.me
December 17, 2024 at 6:15 AM
Reposted by Dan
So many of my #RStats checks got cleaner when I learned that nrow() always returns a number if you scream it loud enough:
```
nrow(NULL)
#> NULL
NROW(NULL)
#> [1] 0
```
February 19, 2025 at 7:48 PM
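A short base-R sketch of why the difference in the post above matters for input checks. `is_empty_input()` is a hypothetical helper, not something from the post; it relies only on `NROW()` treating `NULL` and plain vectors as one-column inputs.

```r
# Hypothetical helper (not from the post): TRUE when `x` has no rows or elements.
is_empty_input <- function(x) {
  NROW(x) == 0
}

is_empty_input(NULL)
#> [1] TRUE
is_empty_input(character(0))
#> [1] TRUE
is_empty_input(data.frame())
#> [1] TRUE
is_empty_input(1:3)
#> [1] FALSE

# nrow() returns NULL here, so the same comparison gives a zero-length
# logical, which makes `if ()` fail with "argument is of length zero".
nrow(NULL) == 0
#> logical(0)
```

Because `NROW()` always returns a single number, the check can go straight into an `if ()` condition without a separate `is.null()` guard.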
Reposted by Dan
People just now finding out that much of our digital infrastructure runs on COBOL. The SABRE airline/hotel reservation system e.g. runs on virtual 1980s mainframes executing 1970s COBOL. See also many banks.

The Old Ones dwell in the dark, undiminished. We cannot kill them without crippling ourselves.
February 17, 2025 at 9:36 AM
Success in wartime intelligence through statistics!

#databs
January 31, 2025 at 2:48 PM
Reposted by Dan
Illuminated contours of the Gulf of México.

#rayshader adventures, an #rstats tale
January 30, 2025 at 1:12 PM
Reposted by Dan
If you have used #ggplot2 in the last couple of years you owe a great deal to @teunbrand.bsky.social who is behind most of the new features and fixes.

Read about his journey to become a part of the ggplot2 core team here:
Joining the ggplot2 team - Tidyverse
I joined the ggplot2 team and would like to share the experience.
www.tidyverse.org
January 28, 2025 at 9:09 AM