bruceritchie.bsky.social
@bruceritchie.bsky.social
Reposted
Inside China's Mini PC Production: How Tiny Computers Are Made
youtu.be/ohwI3V207Ts
YouTube video by SatisFactory Process
November 14, 2025 at 9:22 AM
Continuing my habit of reading a paper a week, this week it's SQLStorm - db.in.tum.de/people/sites... I've been meaning to look into some of the failures reported against Apache DataFusion via github.com/2010YOUY01/d... for some time, though it might take a holiday vacation to find the time.
November 14, 2025 at 9:17 PM
Hey Google - can you stop with the Firefox bullshit and "making sure you're not a bot" captchas for links to YouTube? You are just pissing off a premium member with that shit.
November 7, 2025 at 4:50 PM
I wonder if anyone has done a cost analysis of Python code running in the wild compared to a language that actually is performant. I suspect companies are needlessly spending millions because of lazy developers.
October 17, 2025 at 5:54 PM
Reposted
Our SIGMOD paper with our friends at Tsinghua + @wesmckinney.com + @pateljm.bsky.social on creating a next generation open-source data file format is out. F3 is a future-proof file format that avoids the mistakes of Parquet.
📄 Paper: db.cs.cmu.edu/papers/2025/...
📁 Code: github.com/future-file-...
October 1, 2025 at 1:49 PM
Interesting read on what it takes to optimize a database for high core count machines - clickhouse.com/blog/optimiz...
Optimizing ClickHouse for Intel's ultra-high core count processors
Intel's latest processor generations are pushing the number of cores in a server to unprecedented levels. For analytical databases like ClickHouse, ultra-high core counts represent a huge opportunity ...
September 18, 2025 at 2:06 PM
I'm tempted to try out the Vortex file format (vortex.dev) in my project to see if it has an appreciable impact on performance.
Vortex | An extensible, SOTA columnar file format
Vortex is an extensible, state-of-the-art columnar file format, with associated tools for working with compressed Apache Arrow arrays in-memory, on-disk, and over-the-wire.
September 12, 2025 at 3:38 PM
ashtom.github.io/developers-r... ... so much absurdity in this it's crazy. Never trust a damn thing from someone whose job depends on selling you something.
August 6, 2025 at 6:38 PM
@apachedatafusion.bsky.social 49.0.0 released. Async UDFs, Parquet modular encryption, WITHIN GROUP support, Dynamic Filters and TopK pushdown, and much more ... datafusion.apache.org/blog/2025/07...
Apache DataFusion 49.0.0 Released - Apache DataFusion Blog
July 29, 2025 at 9:04 PM
Medium has turned into a wasteland of AI-generated or AI-augmented posts. I'd say less than 25% of the daily digest highlights are actual 'real' articles. Sad.
July 10, 2025 at 2:01 PM
A 200 OK response from S3 ... isn't always OK. Way to go AWS for making your service horrendous to support. repost.aws/knowledge-ce...
May 30, 2025 at 7:56 PM
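The S3 post above does not say which operation was involved, and the linked article's URL is truncated, so the following is only a sketch under an assumption: that it refers to the documented S3 behaviour where some calls (CopyObject and CompleteMultipartUpload are the usual examples) can return HTTP 200 and still carry an `<Error>` document in the body. The `S3Outcome` type and the `classify` and `tag_text` helpers are hypothetical names made up for illustration and are not part of any AWS SDK. The string matching is deliberately naive; a real client would use an XML parser.

```rust
/// Hypothetical result of inspecting an S3 response after the status is known.
#[derive(Debug, PartialEq)]
enum S3Outcome {
    Success,
    /// Status was 2xx, but the body's root element was `<Error>`.
    ErrorInBody { code: Option<String> },
    /// Status was outside the 2xx range.
    HttpError(u16),
}

/// Hypothetical helper: return the text between `<tag>` and `</tag>`, if present.
fn tag_text<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    let start = xml.find(&open)? + open.len();
    let end = xml[start..].find(&close)? + start;
    Some(xml[start..end].trim())
}

/// Classify a response by looking at the body as well as the status code.
fn classify(status: u16, body: &str) -> S3Outcome {
    if !(200..300).contains(&status) {
        return S3Outcome::HttpError(status);
    }
    // Skip an optional XML declaration so we can look at the root element.
    let trimmed = body.trim_start();
    let root = match trimmed.strip_prefix("<?xml") {
        Some(rest) => rest
            .find("?>")
            .map(|i| rest[i + 2..].trim_start())
            .unwrap_or(""),
        None => trimmed,
    };
    if root.starts_with("<Error>") {
        S3Outcome::ErrorInBody {
            code: tag_text(root, "Code").map(str::to_owned),
        }
    } else {
        S3Outcome::Success
    }
}

fn main() {
    // Illustrative body only; not captured from a real S3 response.
    let body = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\
                <Error><Code>InternalError</Code><Message>retry</Message></Error>";
    println!("{:?}", classify(200, body));
}
```

The point of the sketch is simply that the caller cannot stop at the status line: the body has to be inspected before a 200 is treated as success.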
I am unsure whether Google Summer of Code is a benefit or a hindrance to an open source project. Time will tell, I suppose, by the PRs submitted.
April 8, 2025 at 4:48 PM
It's been well over a year since I started the process of rewriting a large and very long-running job from Apache Spark/Scala to Apache DataFusion/Rust. We're now well into doing POCs to rewrite a few other expensive jobs the same way. It's a very nice feeling.
April 3, 2025 at 9:08 PM
This one was going around the office today and made me chuckle :)
March 28, 2025 at 1:30 PM
Coworker in chat: "... cluster is rebalancing and I'm trying to get the jello to stop shaking". Best explanation of rebalancing I've heard in a long time 🤣
February 27, 2025 at 2:43 PM
Thank you Doug Ford for the $200 vote bribe. I'll use it to contribute to another party and vote to get your kind out of office.
February 9, 2025 at 2:20 PM
Had a good chuckle this morning. Gemini was enabled on corporate accounts and lasted all of 2 days before it was disabled.
January 31, 2025 at 1:58 PM
Reposted
The latest paper from the #1 CMU-DB PhD student @samarchdb.bsky.social is wild compilation magic! He automatically makes UDFs run 300x faster on SQL Server and 1.3x faster on DuckDB.
Code: github.com/SamArch27/PR...
Paper: www.vldb.org/pvldb/vol18/...
December 6, 2024 at 2:56 PM
Working in Rust for the last year has really made me aware of just how useful some features in other languages really are.

- Variadic functions
- Default values for arguments
- Named arguments
- Enum variants as types

Rust is getting if let chains in the 2024 edition though, so that is something.
November 26, 2024 at 3:46 PM
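For the default-value and named-argument items in the list above, one common stand-in in Rust is an options struct with a `Default` impl plus struct update syntax. This is a minimal sketch of that workaround, not a recommendation from the post; `ScanOptions`, its fields, the default values, and `describe_scan` are all hypothetical names chosen for illustration.

```rust
/// Hypothetical options struct standing in for named, defaulted arguments.
#[derive(Debug, Clone, PartialEq)]
struct ScanOptions {
    batch_size: usize,
    parallelism: usize,
    pushdown_filters: bool,
}

impl Default for ScanOptions {
    // These defaults are arbitrary illustration values.
    fn default() -> Self {
        Self {
            batch_size: 8192,
            parallelism: 4,
            pushdown_filters: true,
        }
    }
}

/// Takes one required positional argument plus the "named" options.
fn describe_scan(path: &str, opts: ScanOptions) -> String {
    format!(
        "{path}: batch_size={}, parallelism={}, pushdown_filters={}",
        opts.batch_size, opts.parallelism, opts.pushdown_filters
    )
}

fn main() {
    // Override one field by name; everything else falls back to the default.
    let line = describe_scan(
        "data.parquet",
        ScanOptions {
            parallelism: 16,
            ..Default::default()
        },
    );
    println!("{line}");
}
```

The call site reads a little like named arguments, but every function that wants this needs its own struct, which is the kind of overhead the post is pointing at.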
Lately there are two things I've been wishing that #Rust had: variadic functions and enum variants as types. Using a builder or macro to work around the first is just that, a workaround. Having the second would make some things much nicer.
November 19, 2024 at 8:32 PM
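The two wishes in the post above can be illustrated with the workarounds it alludes to. This is a rough sketch under stated assumptions: the `join_all!` macro fakes a variadic call, and wrapping each enum variant's payload in its own struct gives a function something it can name as a type. All names here (`join_all`, `Circle`, `Rect`, `Shape`, `rect_area`) are hypothetical and exist only for this example.

```rust
/// Variadic-style call via a declarative macro: a separator, then one or
/// more expressions of any type that implements `ToString`.
macro_rules! join_all {
    ($sep:expr; $($part:expr),+ $(,)?) => {
        [$($part.to_string()),+].join($sep)
    };
}

// Each variant's payload is its own struct, so it can be used as a type.
#[derive(Debug, PartialEq)]
struct Circle {
    radius: f64,
}

#[derive(Debug, PartialEq)]
struct Rect {
    w: f64,
    h: f64,
}

#[derive(Debug, PartialEq)]
enum Shape {
    Circle(Circle),
    Rect(Rect),
}

/// Accepts only the rectangle case. Rust cannot currently spell a bare
/// enum variant as a parameter type, hence the wrapper struct.
fn rect_area(r: &Rect) -> f64 {
    r.w * r.h
}

fn main() {
    println!("{}", join_all!(", "; 1, "two", 3.5));

    let shapes = vec![
        Shape::Circle(Circle { radius: 1.0 }),
        Shape::Rect(Rect { w: 2.0, h: 3.0 }),
    ];
    for shape in &shapes {
        // The match is still needed to get from the enum to the payload type.
        if let Shape::Rect(r) = shape {
            println!("rect area = {}", rect_area(r));
        }
    }
}
```

Both pieces work, but the macro loses ordinary function signatures and the wrapper structs roughly double the type definitions, which fits the post's description of them as workarounds.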
64GB of RAM is not enough anymore.
November 17, 2024 at 5:45 PM
DataFusion v43 has seen a lot of performance work, especially around reading Parquet, and the numbers are very nice! From the ClickBench benchmark on the same hardware type:
November 15, 2024 at 4:17 PM