Pete Bachant
@petebachant.me
Bicycles, fluid dynamics, Python, open source, open science, reproducibility. https://petebachant.me | https://calkit.org
The Calkit Run GitHub Action now authenticates with OIDC tokens, so no secrets are required to push artifacts, e.g., the latest PDF of your paper, up to the cloud: github.com/calkit/run-a...
#automation #openscience #reproducibility
Release v2.0.0 · calkit/run-action
With this version we automatically fetch a DVC token from calkit.io using GitHub OIDC.
Full Changelog: v1...v2.0.0
github.com
November 8, 2025 at 3:18 PM
💥 First "real world" Calkit repro pack just dropped!
In this paper we did a bunch of benchmarking for a brand new astronomical alert brokering system designed to interface with the Rubin Observatory.
Check out the repo here: github.com/boom-astro/b...
#openscience #reproducibility #opensource
GitHub - boom-astro/boom-paper: The first paper about BOOM development.
The first paper about BOOM development. Contribute to boom-astro/boom-paper development by creating an account on GitHub.
github.com
November 7, 2025 at 5:09 PM
Reposted by Pete Bachant
Really love this kind of reality-check meta-research 👉“The struggle to make transparency mainstream: initial evidence for a slow uptake of open science practices in PhD theses”
royalsocietypublishing.org/doi/full/10....
November 5, 2025 at 3:05 AM
"8% of authors shared code and software openly..."
insights.taylorandfrancis.com/research-imp...
#openscience #opendata #reproducibility
Moving the needle on open data: A new study from Taylor & Francis
Analysis of Open Research activity on Taylor & Francis journals, supported by the AI-solution provider DataSeer.
insights.taylorandfrancis.com
October 24, 2025 at 11:02 PM
If you publish a "repro pack" with your paper, you're awesome, but there's about a 10% chance it will actually run on someone else's computer. In this post I explain why that isn't your fault, why it matters, and what we should do about it: petebachant.me/single-button
#openscience #reproducibility
Single-button reproducibility: The what, the why, and the how
petebachant.me
October 17, 2025 at 2:05 PM
Calkit projects can now incorporate Julia Jupyter notebooks into their pipelines: calkit.io/calkit/examp...
#julialang #reproducibility #openscience
Calkit
calkit.io
October 17, 2025 at 2:46 AM
Why number your notebooks/scripts and execute them manually when you could simply put them into a pipeline that automatically manages their environments and caches their outputs?
docs.calkit.org/pipeline/
#datascience #automation
The pipeline - Calkit
docs.calkit.org
October 14, 2025 at 3:01 PM
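As a rough illustration of what "caches their outputs" can mean in the post above, here is a minimal sketch of hash-based stage caching. It is a generic simplification and not Calkit's implementation: the function names, the `stage_cache.json` file, and the choice to track only input files are all assumptions made for this example.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_stage(name, inputs, action, cache_path="stage_cache.json"):
    """Run action() only if an input file changed since the last run.

    Returns True if the stage ran and False if it was skipped.
    """
    cache_file = Path(cache_path)
    # Load digests recorded by previous runs (hypothetical cache format)
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    digests = {str(p): file_digest(p) for p in inputs}
    if cache.get(name) == digests:
        return False  # Inputs unchanged, so the stage is up to date
    action()
    # Record the digests only after the action succeeds
    cache[name] = digests
    cache_file.write_text(json.dumps(cache, indent=2))
    return True
```

Calling `run_stage("plot", ["data.csv"], make_plot)` twice in a row would run the hypothetical `make_plot` once and skip it the second time. A real pipeline tool would typically also track commands, environments, and output files.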
1. Generate evidence to support some claims
2. Don't automate the creation of said evidence
Congratulations, you've just contributed to the reproducibility crisis!
#reproducibility #openscience
October 8, 2025 at 8:29 PM
Reposted by Pete Bachant
So much brilliant work never makes it into a paper.
The code, the data, the long nights helping others debug.
At pyOpenSci, we believe that code, data, and community are the pulse.
Research advances quickly when we build together & openly.
Join us. 💛 bit.ly/pyos-volunteer
#openscience #opensource
Get involved with pyOpenSci
pyOpenSci’s Website
bit.ly
October 8, 2025 at 5:20 PM
October 5, 2025 at 3:18 PM
Reading through some slides from 2013 titled "how to succeed in reproducible research without really trying". It's true we have all the tools needed for researchers to build their own reproducible workflows, but still many do not. Maybe the tools are still too hard to learn and use!
October 3, 2025 at 11:47 AM
Programming tip: Name classes after the data they encapsulate, not the actions they perform on that data. For example, instead of SchemaProcessor, just call it Schema:
processed_schema = Schema().process()
#programming #oop #softwareengineering
September 28, 2025 at 9:01 AM
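To make the class-naming tip above concrete, here is a minimal sketch. The `fields` attribute and the normalization done in `process()` are hypothetical, chosen only so the example runs; the post itself specifies nothing beyond `Schema().process()`.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Named after the data it holds: a mapping of field names to type names."""

    fields: dict[str, str] = field(default_factory=dict)

    def process(self) -> "Schema":
        """Return a new Schema with stripped, lowercased names and types."""
        normalized = {
            name.strip().lower(): type_name.strip().lower()
            for name, type_name in self.fields.items()
        }
        return Schema(normalized)


raw = Schema({" ID ": "Integer", "Name": " String"})
processed_schema = raw.process()
print(processed_schema.fields)  # {'id': 'integer', 'name': 'string'}
```

The processing lives on the class as a method, so there is no need for a separate `SchemaProcessor` that wraps a schema just to act on it.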
Hot take: Notebooks are fine in production as long as they're part of a reproducible pipeline
docs.calkit.org/notebooks/
#reproducibility #datascience #openscience
Notebooks - Calkit
docs.calkit.org
September 26, 2025 at 10:01 AM
Please don't number your scripts. Refer back to (2) and use a pipeline (like Calkit's of course)!
www.nature.com/articles/d41...
#reproducibility #automation #openscience
It’s a new term: here are 99 lab hacks
Nature asked contributors, editors and working researchers to share their best advice for scientists.
www.nature.com
September 26, 2025 at 8:37 AM
Reposted by Pete Bachant
In a newly released arXiv preprint, we explore how open science practices like sharing data, code, and preprints relate to citation impact in French-authored research over a 3-year period.
Thanks to @ouvrirlascience.bsky.social for highlighting its national importance.
🔗 Read more: plos.io/3Vmykrj
September 16, 2025 at 4:53 PM
Reproducibility tip: Any figure, dataset, ML model, etc., should not be shared until it is produced with an automated, version-controlled pipeline.
#reproducibility #openscience
September 16, 2025 at 2:28 PM
While profiling some CUDA code on a SLURM cluster I realized I was not working in a very reproducible way, which could become a problem down the road if I ever needed to know how a certain result was generated, so Calkit now has SLURM integration: docs.calkit.org/pipeline/slu...
SLURM integration - Calkit
docs.calkit.org
September 15, 2025 at 2:55 PM
Julia should have an option that automatically does the same thing as:
export JULIA_LOAD_PATH=@:@stdlib
julia --project=. -e 'using Pkg; Pkg.instantiate()'
before running any command.
Maybe an enhanced reproducibility mode option, like --repro?
#julialang #reproducibility
September 14, 2025 at 3:35 PM
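Until something like the `--repro` option proposed above exists, the same steps could be wrapped in a small helper. This is only a sketch: the function names are hypothetical, and running the follow-up command with `--project=.` is an assumption, since the post only says the setup should happen "before running any command".

```python
import os
import subprocess


def repro_env(base_env=None):
    """Copy the environment and restrict Julia's load path.

    "@:@stdlib" limits loading to the active project and the standard
    library, leaving out the default global environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["JULIA_LOAD_PATH"] = "@:@stdlib"
    return env


def repro_commands(julia_args):
    """Return the instantiate command followed by the requested command."""
    instantiate = ["julia", "--project=.", "-e", "using Pkg; Pkg.instantiate()"]
    # Assumption: the actual command also runs in the local project
    run = ["julia", "--project=.", *julia_args]
    return [instantiate, run]


def run_repro(julia_args):
    """Instantiate the project, then run Julia with the given arguments."""
    env = repro_env()
    for cmd in repro_commands(julia_args):
        subprocess.run(cmd, env=env, check=True)
```

With Julia installed, `run_repro(["script.jl"])` would instantiate the project in the current directory and then run the hypothetical `script.jl` with the restricted load path.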
I don't know why, but I always found it hard to remember the process for adding an SSH key to GitHub, so I made a wizard for it:
calkit config github-ssh
(might be buggy, but still an improvement over manually running commands from the docs)
September 12, 2025 at 3:23 PM
Moving pieces of code farther apart from each other (into different packages, modules, repos) doesn't guarantee you've decoupled them. In fact, if you haven't, you probably just made your life a whole lot harder.
#softwareengineering
September 10, 2025 at 2:33 PM
If you're a leader of knowledge workers you should be giving teams fewer, vaguer goals. Handing out well-defined projects and tasks to individuals is a waste of their abilities.
September 8, 2025 at 2:34 PM
How much "waste" do you have in your scientific workflow? For example, do you manually rerun plotting scripts/notebooks after updating processing logic? Do you then manually re-upload these figures to Overleaf? Want to automate this stuff away? Reach out and I will help!
September 8, 2025 at 2:20 PM
Code, data, config files, etc. all must be shared in order to describe computational methods with sufficient detail.
#reproducibility #openscience
September 4, 2025 at 7:13 PM
Anyone have any good references that examine the relationship between computational reproducibility and time to publication? I'd assume more automated, reproducible workflows will help studies get through peer review more quickly.
#openscience #reproducibility
August 27, 2025 at 3:07 PM
Calkit now has its own GitHub Action to run your project's pipeline and optionally commit and push results: docs.calkit.org/tutorials/gi...
Running Calkit in GitHub Actions - Calkit
docs.calkit.org
August 21, 2025 at 2:46 PM