Bacalhau
bacalhau.org
@bacalhau.org
Code runs. Results appear.

- No keys to manage
- No storage config to maintain
- No hunting for outputs

Bacalhau 1.8 makes storage feel like it should: invisible.

https://bac.al/180day4
June 26, 2025 at 3:35 PM
Nodes come and go. Your job shouldn’t care. 

We've supercharged Bacalhau's orchestration to make daemon jobs faster and more reliable than ever. They now deploy instantly to new nodes as they join your cluster.

✅ Automatic
✅ Resilient
✅ Zero manual intervention

Bacalhau v1.8.0 - Day 3: How Bacalhau Boosts Daemon Job Reliability
A deep dive into the new orchestration logic that makes daemon jobs smarter, faster, and more resilient to infrastructure changes
bac.al
June 25, 2025 at 4:04 PM
Tired of juggling job IDs like `j-f47ac...`?

We just shipped a major upgrade to job management in Bacalhau:

You can now name, rerun, version, and even `diff` your jobs before you run them.

It's like `git` for your compute jobs.
Full deep dive →
Bacalhau v1.8.0 - Day 2: Rerun, Update, and Version Your Bacalhau Jobs
How we're making job management less of a chore and more of a superpower
bac.al
June 24, 2025 at 4:04 PM
We’ve made distributed computing radically more usable and radically more cost-efficient: 

 • 📊 Native Splunk integration: Slash logging costs by up to 80%
 • 🏷️ Name-based jobs: No more cryptic UUIDs
 • ⚙️ Enhanced daemon reliability for services at scale

Read the full release →
Announcing Bacalhau v1.8.0: Intelligent Edge Computing Meets Enterprise Integration
Discover how Bacalhau v1.8.0 transforms distributed computing with a native Splunk integration, name-based job management, and enhanced daemon orchestration.
bac.al
June 23, 2025 at 4:25 PM
Reposted by Bacalhau
As Data + AI Summit rolls on—with more features, more data, and more spend—we’re focused on something else:

You should be paying less.

Expanso cuts data infrastructure costs by up to 80%, without disrupting your stack.

Try it on 10 servers. Or 10,000.
See how much you save in 60 days.
Expanso Launches Cost Optimization Platform to Cut Enterprise Data & AI Costs by up to 80%
As data and AI conferences ramp up with promises of more features, more data, and more spend, Expanso is launching a different message: you should be paying ...
www.businesswire.com
June 10, 2025 at 5:22 PM
We’re moving toward a world where data sovereignty is table stakes. If you’re building data infra, you need to think globally - and legally.

This Bacalhau tutorial walks through setting up multi-region compute and anonymization using Microsoft Presidio, while keeping compliance front and center.
Cross-border data flows are tricky.

We wanted to see if we could build a real compliant pipeline from scratch - w/ oss tools.

✅ Generate sensitive data in the EU
✅ Anonymize it
✅ Send to US for processing

Bacalhau handles orchestration. Presidio handles privacy.

open.substack.com/pub/bacalhau...
Cross-Border Data Processing with Privacy Compliance Through Bacalhau
Using Bacalhau to handle complex data pipelines that cross borders while preserving privacy
open.substack.com
May 22, 2025 at 4:10 PM
Streamlining Bacalhau Development with the Power of Docker-in-Docker.

Our latest blog explores a practical solution using Docker Compose and Docker-in-Docker to create a self-contained, local Bacalhau environment.

Learn how to get your local Bacalhau instance running quickly and efficiently.👇
May 15, 2025 at 2:53 PM
Want to set up an open-source, distributed ML pipeline that respects geographical and regulatory restrictions and runs compute in the same location as your data?

This post gets you started with Bacalhau: set up nodes in three different regions and analyze data, all while respecting data sovereignty.
May 9, 2025 at 3:08 PM
Want to set up an open source, distributed machine learning pipeline that runs compute in the same location as your data?

This post gets you started using Bacalhau to set up nodes in three different regions, send, process, and analyze data, all while respecting data sovereignty.
May 8, 2025 at 4:18 PM
How do you handle data from thousands of distributed sources before it hits a DB like Azure Cosmos DB?

We joined Azure Cosmos DB TV to discuss exactly that!

Read the breakdown & watch the full episode with a demo in the first comment! 👇
April 29, 2025 at 4:33 PM
Reposted by Bacalhau
Big news: Expanso has been selected for Plug and Play Seattle’s first startup batch!

Bacalhau is redefining what comes after the cloud:
✅ Run compute where data lives
✅ Cut latency & compliance risks
✅ Stay fast, sovereign & efficient

Honored to join this AI & infrastructure-focused cohort!
April 17, 2025 at 7:07 PM
We just won another Data Breakthrough Award - 2 years, 2 wins!
We just won Open Source Data Platform of the Year in the 2025 Data Breakthrough Awards. 2 years, 2 wins — last year: Data Processing Solution of the Year

🔗https://open.substack.com/pub/bacalhau/p/expanso-wins-2025-data-breakthrough?r=2ejbhv&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
April 3, 2025 at 4:51 PM
Reposted by Bacalhau
Huh... this looks pretty cool. Want to know what it is? Come by my talk at Kubecon at 12:10 GMT in ICC Capital Suite 1-3 tomorrow! Or check us out at the Intel booth! | Kubernetes Cross-Zone/Region Simplified: Harnessing Bacalhau for Efficient Distributed Compute

bit.ly/4bTRY5q
March 31, 2025 at 4:52 PM
New guide: Build a distributed data warehouse with Bacalhau + DuckDB.

Run SQL on regional data (EU + US) without moving it—ideal for privacy, compliance & edge use cases.

Covers partitioning, querying, trend analysis, anomaly detection & more.

Read it: blog.bacalhau.org/p/distribute...
March 28, 2025 at 7:36 PM
🎯 S3 Partitioning just got smarter!

Wrangling big datasets from S3? Bacalhau 1.7 auto-partitions + retries jobs with zero hassle.

→ Object, date, regex, substring? ✅
→ Shared + partitioned inputs? ✅

Process 1000s of files—no custom logic.

🔗 blog.bacalhau.org/p/bacalhau-v...
March 27, 2025 at 4:19 PM
Bacalhau 1.7 is here and the auth game just leveled up 💥

→ Basic Auth (bcrypt optional)
→ API Tokens
→ SSO via OAuth 2.0 (Okta, Google, Azure)

Full walkthrough, sample configs, curl examples, and why this matters 👇

🔗 blog.bacalhau.org/p/bacalhau-v...
March 26, 2025 at 4:27 PM
New blog drop: Partitioned Jobs in Bacalhau v1.7 🎉

Split your job across nodes. Retry failures. Speed things up.

Big data doesn’t have to mean big pain.

Check it out 👉 blog.bacalhau.org/p/bacalhau-v...
#distributedcomputing #bacalhau
Bacalhau v1.7.0 - Day 2: Scaling Your Compute Jobs with Bacalhau Partitioned Jobs
This post is part of the 5-days of Bacalhau 1.7 series.
blog.bacalhau.org
March 25, 2025 at 5:06 PM
Reposted by Bacalhau
Big news: Bacalhau 1.7 just dropped!

We’re talking a smoother dev experience, richer job feedback, and fixes that make a difference. If you’re building on decentralized compute, this one’s for you.

Read all about it: blog.bacalhau.org/p/announcing... #opensource #decentralizedcompute
Announcing Bacalhau 1.7: Empowering Enterprises with Enhanced Scalability, Job Management, and Support
(5:35) Bacalhau v1.7.0 makes distributed computing easier with new licensing, partitioned jobs, and simplified authentication.
blog.bacalhau.org
March 24, 2025 at 5:21 PM
🚀 New Bluesky bot!

@alt-text.bots.bacalhau.org auto-generates alt-text for images using Bacalhau + LLaVa LVMs!

📌 Reply to an image post mentioning @alt-text.bots.bacalhau.org
📌 The bot generates & replies with alt-text!

Try it out! 🌍 #AltText #Accessibility

bac.al/alt-text
Generating Automatic Alt-Text with the Bacalhau Bluesky Bot
(4:10) Using the latest in Large Vision Models (LLaVa), we've built a Bluesky Bot which can generate alt-text for any image in seconds with Bacalhau
bac.al
March 10, 2025 at 5:19 PM
Reposted by Bacalhau
So, a belated shout-out to the folks at CRN for considering us, and a congrats to Red Hat on the win. This only makes us hungrier to go even bigger in 2025!
February 19, 2025 at 4:14 PM
Reposted by Bacalhau
@bacalhau.org has been pushing the boundaries of distributed computing, and seeing it recognized on this level—even if we only just found out—is a big deal for us.
February 19, 2025 at 4:14 PM
Reposted by Bacalhau
Wait… what?! We just found out that we were a finalist for the 2024 CRN Tech Innovator Awards—and no one told us! 😅 Apparently, we made it all the way to the finals… and lost to Red Hat Device Edge. Honestly? If you’re going to lose, losing to Red Hat isn’t bad company to be in.
February 19, 2025 at 4:14 PM
Big data doesn't have to be a big problem! 💡

In our latest blog post, we show you how we processed 46,000,000(!) rows of data in next to no time (at a fraction of the cost of centralised processing) with @bacalhau.org and @duckdb.org 🚀

Check it out!

blog.bacalhau.org/p/how-we-pro...
How We Processed 46 Million Rows Across 20 Nodes Without Breaking the Bank
(03:30) With Bacalhau, you can process huge volumes of data in a fraction of the time it would take with other approaches. Find out about one approach that we've taken!
blog.bacalhau.org
February 6, 2025 at 7:06 PM
Have you ever wanted to know if a photo you're taking has a hotdog OR DOESN'T have a hotdog in it? 🌭📸🤔

Well, using cutting-edge machine learning models, that's something that @jobs.bacalhau.org can help you with! 🎉

Read our blog for the details, or try it out for yourself! 👇

bac.al/has-hotdog
Hotdog? Not Hotdog? The Bacalhau Bluesky Bot Knows!
(03:02) Sometimes, you just want to use cutting-edge AI to know whether or not you're looking at a hotdog. Y'know?
bac.al
February 3, 2025 at 5:28 PM