Lightnews — Scholar-powered news

Pete Bachant

@petebachant.me

210 followers 780 following 140 posts

Bicycles, fluid dynamics, Python, open source, open science, reproducibility. https://petebachant.me | https://calkit.org


Pete Bachant

@petebachant.me

Don't be ashamed of "messy" code. If it works, it's good. Share it.

#openscience #reproducibility

Messy reproducible code is better than "clean" but irreproducible code.

October 5, 2025 at 3:18 PM

Pete Bachant

@petebachant.me

Reading through some slides from 2013 titled "how to succeed in reproducible research without really trying". It's true we have all the tools researchers need to build their own reproducible workflows, but many still don't. Maybe the tools are still too hard to learn and use!

October 3, 2025 at 11:47 AM

Pete Bachant

@petebachant.me

Graphic design is not my passion. Anyone want to collaborate on an infographic to explain the value of fully automating and version controlling research projects?

docs.google.com/drawings/d/1...

#openscience #reproducibility #automation

A diagram showing an overview of how a Calkit project works.

August 16, 2025 at 3:01 PM

Pete Bachant

@petebachant.me

When you describe the computational methods in your paper without sharing the code and data:

#openscience #reproducibility

A cartoon showing how to draw an owl in two steps, which clearly doesn't provide enough information.

July 21, 2025 at 1:58 PM

Pete Bachant

@petebachant.me

"One button" reproducibility should be the standard

#openscience #reproducibility

A quotation from "Electronic Documents Give Reproducible Research a New Meaning" by Claerbout and Karrenbach, whereby they state the goal of allowing researchers to reproduce their work with a single button.

July 17, 2025 at 2:28 AM

Pete Bachant

@petebachant.me

Calkit now has its own pipeline syntax that forces you to define an environment for every stage, but manages those environments for you automatically. No more pip installs, Docker builds, etc. Your project will just be reproducible.

Docs: docs.calkit.org/pipeline/

#reproducibility #openscience

June 6, 2025 at 2:40 PM

Pete Bachant

@petebachant.me

www.reddit.com/r/PhD/s/HFNX...

May 3, 2025 at 1:26 PM

Pete Bachant

@petebachant.me

All research outputs are valuable. Publish them! #openscience

A cartoon of an iceberg showing "research article" as above the surface, with many other products of research below.

April 29, 2025 at 4:29 PM

Pete Bachant

@petebachant.me

Put together a little GUI app to help Windows users get set up to do open-source scientific computing, since that can be hard (though Windows has improved a lot over the years)

🔗 github.com/calkit/calki...

#openscience #python #reproducibility

A screenshot of the Calkit Assistant app.

April 7, 2025 at 4:24 PM

Pete Bachant

@petebachant.me

I thought MATLAB was supposed to be a convenient self-contained computational environment, but it requires that you manually install additional dependencies.

March 21, 2025 at 12:54 PM

Pete Bachant

@petebachant.me

This sort of data availability statement is pretty common. I understand not archiving GB or even TB of simulation data, but why not provide the case setups and scripts by default? Is it because the authors assume they aren't useful? Embarrassed by sloppy code? Too much effort to upload?

A data availability statement that links to code but doesn't cite version or case setups to reproduce results.

March 20, 2025 at 3:25 PM

Pete Bachant

@petebachant.me

The critical event that caused me to abandon MATLAB and get into Python was a license server failure at my university back in 2013. I am now trying to download MATLAB because a collaborator is using it and here's the result...

March 18, 2025 at 12:45 PM

Pete Bachant

@petebachant.me

🆕 Importing part of one project's dataset(s) into another with Calkit.

I wanted this feature because I have one project that ingests data from an API every day, and another that only needs a very specific subset. This method retains provenance and deduplicates cloud storage.

#opendata

Commands to import datasets with Calkit.

February 16, 2025 at 12:52 PM

Pete Bachant

@petebachant.me

I wanted to see weekly distributions of power and heart rate from cycling and running, but Strava doesn't show those, so I put together a little project. As mentioned in my last post, it was my first DuckDB experience 👍 First Polars experience too 👍

🔗 calkit.io/petebachant/strava-analysis

January 21, 2025 at 8:56 PM
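
For readers curious what a "weekly distribution" amounts to, here is a minimal sketch using only the Python standard library. It is not the code from the linked project, which uses DuckDB and Polars; the function name `weekly_distribution` and the sample values are hypothetical and purely illustrative. It groups timestamped samples by ISO week and summarizes each week with a count, median, and quartiles.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median, quantiles


def weekly_distribution(samples):
    """Group (timestamp, value) samples by ISO week and summarize each week.

    Returns a dict mapping (iso_year, iso_week) to a summary dict.
    """
    by_week = defaultdict(list)
    for timestamp, value in samples:
        iso = timestamp.isocalendar()
        # Index 0 is the ISO year, index 1 is the ISO week number.
        by_week[(iso[0], iso[1])].append(value)

    summary = {}
    for week, values in sorted(by_week.items()):
        # quantiles() needs at least two points; fall back to the single value.
        if len(values) >= 2:
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
        else:
            q1 = q3 = values[0]
        summary[week] = {
            "count": len(values),
            "median": median(values),
            "q1": q1,
            "q3": q3,
        }
    return summary


# Hypothetical power samples in watts; not real data from the project.
samples = [
    (datetime(2025, 1, 6, 7, 0), 180),
    (datetime(2025, 1, 7, 7, 0), 210),
    (datetime(2025, 1, 8, 7, 0), 195),
    (datetime(2025, 1, 14, 7, 0), 220),
    (datetime(2025, 1, 15, 7, 0), 240),
]

result = weekly_distribution(samples)
for week, stats in result.items():
    print(week, stats)
```

Keying on ISO year and week rather than the week number alone keeps weeks from different years apart. The same function works for heart rate or any other per-sample metric.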

Pete Bachant

@petebachant.me

First time really using DuckDB and I'm impressed. Querying and joining 1.7M rows from JSON and Parquet files (my own Strava time series data) in SQL and it feels instantaneous.

Code snapshot with SQL query and DataFrame.

January 18, 2025 at 2:28 PM

Pete Bachant

@petebachant.me

Just added a basic #reproducibility check for projects on calkit.io. It doesn't even use AI 😜

Also available via the CLI with `calkit check repro`

Research project reproducibility check results from calkit.io.

December 17, 2024 at 3:31 PM
