Everything data (dist sys, databases, messaging, data eng/analytics).
https://jack-vanlightly.com, https://www.hotds.dev
Credit: ESO/B. Tafresh
* Flink wants temporal locality.
* Spark wants value locality.
Same table, conflicting physics.
New post: jack-vanlightly.com/blog/2025/11...
My new post explains the KIPs and the trade-offs between reusing old abstractions and embracing stateless compute over S3.
jack-vanlightly.com/blog/2025/10...
From a systems design view, it trades storage savings for coupling and complexity.
Sometimes, duplication is cheaper than coupling.
jack-vanlightly.com/blog/2025/10...
Because analytics workloads and OLTP workloads optimize for opposite I/O patterns.
See my dive into data layout, pruning, and what “indexing” really means in open table formats: jack-vanlightly.com/blog/2025/10...
I spent August reverse-engineering Fluss, Alibaba’s new table storage engine for Flink (partially forked from Kafka). This post covers its architecture, tiering, and how it tackles changelogs & low-latency state.
jack-vanlightly.com/blog/2025/9/...
The post defines what storage unification means, establishes the terminology, and evaluates different building blocks and approaches for achieving it.
jack-vanlightly.com/blog/2025/8/...
jack-vanlightly.com/blog/2025/7/...
Coordinated Progress is a four-part series that explores the common structure behind reliable distributed systems.
jack-vanlightly.com/blog/2025/6/...
✅ KIP-966: Strengthens the replication protocol.
✅ KIP-996: Introduces PreVote for more stable KRaft leadership.
✅ KIP-848: Delivers more efficient, predictable rebalancing.
The Kafka Replication Protocol:
🔹Separation of control plane from data plane.
🔹Role separation with minimal coupling.
🔹Kafka’s alignment with Paxos roles.
jack-vanlightly.com/blog/2025/2/...
jack-vanlightly.com/blog/2025/2/...