luisbia.bsky.social
@luisbia.bsky.social
Reposted by luisbia.bsky.social
This week on Counting Stuff: log first and think later, because storage is cheap, right? A bit of a rant on how not thinking first makes things much more expensive and harder later. #dataBS

www.counting-stuff.com/storage-is-c...
Storage is cheap, but not thinking about logging is expensive
The bad habits of data over-collection run deep.
January 21, 2025 at 2:20 PM
Reposted by luisbia.bsky.social
Reproducible Data Science in R: Flexible functions using tidy evaluation. Improve your functions with helpful dataframe evaluation patterns! waterdata.usgs.gov/blog/rds-fun... #rstats
December 17, 2024 at 4:38 PM
Reposted by luisbia.bsky.social
Eleven quick tips for properly handling tabular data https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1012604 🧪 #Rstats
December 1, 2024 at 8:57 PM
Reposted by luisbia.bsky.social
Can DuckDB be the ultimate portable catalog for Data Lakes and Lakehouses? 🤔
With its great integrations (Parquet, JSON, Iceberg, Postgres, & more), it's a strong contender.
I tested this with an extreme demo use case.
motherduck.com/blog/from-da...
From Data Lake to Lakehouse: Can DuckDB be the best portable catalog? - MotherDuck Blog
Discover how the catalog became crucial for the Lakehouse and how DuckDB can serve as one | Reading time: 12 min
November 14, 2024 at 12:21 PM
Reposted by luisbia.bsky.social
We're delighted to announce that "Scaling Up with R and Arrow", by Nic Crane, Jonathan Keane and Neal Richardson is now available online at www.arrowrbook.com. In the book, we cover a lot of the practical details and theory behind working with Arrow in R. The paper version will be available soon!
Scaling Up With R and Arrow
August 13, 2024 at 10:33 PM