https://distsys.fponzi.me/
wyounas.github.io/aws/concurre...
We’ll use a model checker to see how such a race could happen. Formal verification can’t prevent every failure, but it helps us think more clearly about correctness and reason about subtle bugs.
muratbuffalo.blogspot.com/2025/11/tla-...
AWS’s N. Virginia region suffered a DynamoDB outage triggered by a DNS automation defect. This post focuses narrowly on the race condition at the core of the bug, which is best understood through TLA+ modeling.
www.xtxmarkets.com/tech/2025-te...
This post motivates TernFS, explains its high-level architecture, and then explores some key implementation details.
www.allthingsdistributed.com/2025/05/just...
Each component follows the Unix mantra—do one thing, and do it well—but working together they are able to offer all the features users expect from a database.
marc-bowes.com/dsql-auth.html
How connections to Aurora DSQL are authenticated and authorized. This information is meant to be supplemental to what is found in the official Amazon Aurora DSQL documentation.
brooker.co.za/blog/2025/08...
People often ask me about the architectural relationship between Amazon Dynamo, Amazon DynamoDB and Aurora DSQL. I’ll start off on comparing how the systems achieve a few key properties.
s2.dev/blog/lineari...
We can gain confidence that S2 is linearizable by taking an empirical validation approach, using a model checker like Knossos or Porcupine.
dbos.dev/blog/durable...
What we really needed to make distributed task queueing robust are durable queues that checkpoint the status of our queued tasks to a durable store like Postgres.
relentless-leader.com/dive-deep-in...
muratbuffalo.blogspot.com/2020/06/lear...
A principled, from the foundations-up, studying of distributed systems, which will take a good three months in the first pass, and many more months to build competence after that.
groups.csail.mit.edu/tds/papers/L...
www.allthingsdistributed.com/2025/05/just...
A few weeks ago, at our internal dev conference, I watched a talk from two of our PEs on building DSQL. I asked if they’d be willing to turn their insights into a deeper exploration of DSQL’s development.
decentralizedthoughts.github.io/2025-05-23-s...
Reasoning about distributed algorithms is hard at the best of times, with state split across remote nodes, asynchrony, concurrency, and non-determinism in the order that events occur.
relentless-leader.com/apache-icebe...
Apache Iceberg is an ACID table format designed for large-scale analytics workloads.
www.elastic.co/search-labs/...
Debugging concurrency bugs is no picnic, but we're going to get into it. Enter Fray, a deterministic concurrency testing framework that turns flaky failures into reproducible ones.
stevana.github.io/erlangs_not_...
To me it’s clear that the big idea there isn’t lightweight processes and message passing, but rather the generic components which in Erlang are called behaviours.
pierrezemb.fr/posts/learn-...
A curated collection of resources about deterministic simulation testing for distributed systems.
sumercip.com/posts/patter...
ilyasergey.net/YSC4231/
This course on basic concurrent and parallel algorithms has been taught by Ilya Sergey at Yale-NUS College in 2019-2024.
dl.acm.org/doi/10.1145/...
shachaf.net/w/consensus
This page is a relatively informal discussion of distributed consensus and Paxos, what it does, how it works, and some tricks and variants.
groups.google.com/g/raft-dev/c...
tigerbeetle.com/blog/2021-08...
CLOCK_MONOTONIC_RAW, CLOCK_MONOTONIC and CLOCK_BOOTTIME, all monotonic clock stopwatches provided by the Linux kernel through the clock_gettime(2) syscall to measure elapsed time
restate.dev/blog/buildin...
We built a precursor and, from all the lessons learned there, we arrived at a design with a self-contained complete stack, centered around a command log and event-processor, shipping as a single Rust binary.