Hannes Mühleisen
@hannes.muehleisen.org
I like databases and boats. Co-creator of @duckdb.org, Co-Founder and CEO of DuckDB Labs. Professor of Data Engineering at Radboud Universiteit.
Reposted by Hannes Mühleisen
🚀 We released DuckDB v1.4.2, the second patch release of our LTS edition.

🔎 We are shipping new Iceberg features, improved logger/profiler integration and several bugfixes. The new DuckDB version can also read and write Vortex files.

📖 For more details, read
duckdb.org/2025/11/12/a...
November 12, 2025 at 1:22 PM
Reposted by Hannes Mühleisen
This profile in ‘Significance’ on DuckDB co-founder Hannes Mühleisen is quite interesting, and has helpful insights about data quality and the changing meaning of “big data.” Also some good professional advice in here for statisticians.

academic.oup.com/jrssig/artic...
Is big data dead?
Abstract. Data, ducks and statistics – Sandra Alba gathers dispatches from Amsterdam and Auckland
November 9, 2025 at 5:47 PM
Reposted by Hannes Mühleisen
We took Canada’s Spatial Access Measures dataset (big, clunky CSVs) → turned it into a single GeoParquet file.

Add DuckDB-WASM + deck.gl & you get
- instant queries
- smooth maps
- no backend

Public data, but actually usable.
developmentseed.org/blog/2025-10...

@saadiqmohiuddin.bsky.social
November 6, 2025 at 6:38 PM
Reposted by Hannes Mühleisen
pg_lake just went open source! (Apache 2.0)

pg_lake is a set of extensions (from Crunchy Data Warehouse) that add comprehensive Iceberg support and data lake access to Postgres, with @duckdb.org transparently integrated into the query engine.

Announcement blog: www.snowflake.com/en/engineeri...
November 4, 2025 at 4:04 PM
Reposted by Hannes Mühleisen
The PyData Amsterdam 2025 keynote “Minus Three Tier: Data Architecture Turned Upside Down” by @hannes.muehleisen.org is out now.

www.youtube.com/watch?v=DxwD...
KEYNOTE: Hannes Mühleisen - Data Architecture Turned Upside Down | PyData Amsterdam 2025
YouTube video by PyData
October 31, 2025 at 2:05 PM
Reposted by Hannes Mühleisen
🎞️ 𝘊𝘢𝘯 you store a movie in DuckDB?

In today's blog post, @hannes.muehleisen.org shows how to store a movie as a table encoding the RGB codes pixel-by-pixel, and how to process it: duckdb.org/2025/10/27/m...

Now, whether you 𝘴𝘩𝘰𝘶𝘭𝘥 store a movie in DuckDB... we'll leave that to your judgment.
October 27, 2025 at 3:43 PM
Reposted by Hannes Mühleisen
📣 New blog post by @dtenwolde.bsky.social.

🕸️ In this post, we show how to use DuckDB and the DuckPGQ community extension to analyze financial data for fraudulent patterns with the SQL/PGQ graph syntax that's part of SQL:2023.

📖 Visit duckdb.org/2025/10/22/d... to read the post.
October 22, 2025 at 7:23 PM
Reposted by Hannes Mühleisen
duckdb-mlpack 0.0.2: mlpack is now a duckdb community extension
Bringing mlpack machine learning to duckdb SQL
dirk.eddelbuettel.com/blog/2025/10...
October 26, 2025 at 1:58 PM
Reposted by Hannes Mühleisen
🇫🇮 We are hosting a pub session next week during the @helsinkidataweek.bsky.social, where you can chat with DuckDB's co-creator, @hannes.muehleisen.org, and have a drink with members of the DuckDB community.

🎟️ Sign up on Luma: luma.com/s5sl9qxx
October 20, 2025 at 2:29 PM
Reposted by Hannes Mühleisen
ML quacks: Combining duckdb and mlpack
dirk.eddelbuettel.com/blog/2025/10...

A 'minimally viable product / demo' of extending @duckdb.org with #mlpack
October 17, 2025 at 6:13 PM
Reposted by Hannes Mühleisen
I'm grateful that Jack Waudby gave me the chance to set the CTE record straight on his @disseminatepodcast.bsky.social. Hear us talk about what you can do with iterative queries in SQL, how efficient variants of recursion in SQL found their way into @duckdb.org, and how trampolines come into play.
October 17, 2025 at 9:16 AM
Reposted by Hannes Mühleisen
Learn how to build powerful yet lightweight #data workflows using #Python, #DuckDB, and #Smallpond with Valery C. Briz, #Pythonista, senior #dataengineer on the 23rd of October in our #online #workshop 18:00-19:30 CEST
Register here: www.meetup.com/pyladiesams/...
October 7, 2025 at 6:43 PM
Reposted by Hannes Mühleisen
🚀 We released DuckDB v1.4.1, the first bugfix release of our LTS edition.

🔎 We expect LTS users to be particularly curious about changes in the system, so we wrote up a short blog post highlighting the most important fixes and improvements.

duckdb.org/2025/10/07/a...
Announcing DuckDB 1.4.1 LTS
Today we are releasing DuckDB 1.4.1, the first bugfix release of our LTS edition.
October 7, 2025 at 12:09 PM
Reposted by Hannes Mühleisen
Today's Future Data Systems Seminar Speaker: Jordan Tigani (@jrdntgn.bsky.social) will present how @motherduck.com supports modern workloads with DuckLake. Zoom talk open to public at 4:30pm ET. YouTube video available after: db.cs.cmu.edu/events/futur...
[Future Data] DuckLake: Learning from Cloud Data Warehouses to Build a Robust "Lakehouse" - Carnegie Mellon Database Group
When building scalable data systems, it is easy to focus on the...
October 6, 2025 at 11:55 AM
Reposted by Hannes Mühleisen
✨ We launched a new installation page for DuckDB!

🚀 The new page lets you install the latest stable DuckDB release with just one or two clicks. If the defaults don't fit your use case, no worries: alternative download methods remain available for many clients.
October 1, 2025 at 12:22 PM
Reposted by Hannes Mühleisen
After trying @duckdb.org with terabytes of Parquet, I'm hardly going back to anything else for data exploration. Hell, I'm now spawning DuckDB to analyze even .csv and .json files because of how ergonomic its SQL is.
September 26, 2025 at 5:29 PM
Reposted by Hannes Mühleisen
We published a new deep dive by Laurens Kuiper, who recently redesigned DuckDB's sort.

One data point: ordering the TPC-H SF100 lineitem table with the memory limit set to 30 GB is 3× faster in DuckDB v1.4 than in v1.3.

Read more at duckdb.org/2025/09/24/s...
Redesigning DuckDB's Sort, Again
After four years, we've decided to redesign DuckDB's sort implementation, again. In this post, we present and evaluate the new design.
September 25, 2025 at 6:35 PM
Reposted by Hannes Mühleisen
🚀 We released version 0.3 of the DuckLake specification and the DuckDB ducklake extension today. It includes interoperability with Iceberg, support for geometry types and more.

Check the announcement blog for more details ducklake.select/2025/09/17/d...
September 18, 2025 at 9:20 AM
Reposted by Hannes Mühleisen
This is the most exciting time ever to be working in data, and I'm not talking about AI.

3 years ago, I wrote a database-centric guide in my book for analyzing the full 92 million record 1910 Census.

Now, with #rstats and @duckdb?

Analyze those 92 million rows in seconds.
September 17, 2025 at 5:17 PM
Reposted by Hannes Mühleisen
I'm speaking soon at #PositConf at the 2:40PM session "Get Your Ducks in a Row with Databases" in Regency VI! My talk is "Semantic Search for the Rest of Us with DuckDB (and Llama.cpp)"

#PositConf2025
September 17, 2025 at 6:10 PM
Reposted by Hannes Mühleisen
📈 DuckDB 1.4.0 is out! This is our first LTS release which comes with *one year of community support*. It also supports database encryption, the MERGE SQL statement and Iceberg writes.

For more details, read the announcement blog post at
duckdb.org/2025/09/16/a...
September 16, 2025 at 11:55 AM
We're testing a new distribution channel for @duckdb.org: #docker images. For now they live at `hfmuehleisen/duckdb`, feel free to test them out. And yes, hell got a little colder today.

hub.docker.com/r/hfmuehleis...
September 16, 2025 at 7:30 AM
Reposted by Hannes Mühleisen
Such a fun listen on ducklake and duckdb with @hannes.muehleisen.org and @markraasveldt.bsky.social!

Learned a lot, the future of ducklake looks very bright!

overcast.fm/+AAH1YOLrL6Q
Duck Lake: Simplifying the Lakehouse Ecosystem — Data Engineering Podcast
September 12, 2025 at 11:42 PM
Reposted by Hannes Mühleisen
We are holding the DuckDB Amsterdam Meetup next week, featuring talks by @rolandbouman.bsky.social, Tania Bogatsch and @qxip.bsky.social:

www.meetup.com/duckdb/event...

The event is already at capacity but consider joining the wait list because there are always last-minute RSVP cancellations.
September 10, 2025 at 1:41 PM
Excited to be a keynote speaker at PyData Amsterdam 2025 (September 24–26). My talk is titled 'Minus Three Tier: Data Architecture Turned Upside Down'.

Use code PYDATADB10 for 10% off tickets
amsterdam.pydata.org/conference
#PDAmsterdam2025 #10YearsPDAmsterdam
September 10, 2025 at 1:38 PM