Lorin Hochstein
@norootcause.surfingcomplexity.com
Student of complex systems failures, resilience engineering, cognitive systems engineering. Will talk your ear off about @resilienceinsoftware.org
New blog post, on the role of attrition in software incidents, and how we don't talk about that in incident write-ups:
surfingcomplexity.blog/2025/11/02/y...
You’ll never see attrition referenced in an RCA
In the wake of the recent AWS us-east-1 outage, I saw speculation online about how the departure of experienced engineers played a role in the outage. The most notable one was from the acerbic clou…
surfingcomplexity.blog
November 3, 2025 at 1:08 AM
New blog post on the recent AWS incident: surfingcomplexity.blog/2025/10/25/q...
Quick thoughts on the recent AWS outage
AWS recently posted a public write-up of the us-east-1 incident that hit them this past Monday. Here are a couple of quick thoughts on it. Reliability → Automation → Complexity → New failure modes …
surfingcomplexity.blog
October 26, 2025 at 3:57 AM
Reposted by Lorin Hochstein
Manual intervention was necessary to correct.
Oh the humanity!
October 24, 2025 at 6:15 AM
Reposted by Lorin Hochstein
I hope this email never finds you. I hope you’re free.
October 23, 2025 at 3:41 PM
More fun with Sora. sora.chatgpt.com/p/s_68f9b2d0...
norootcause on Sora
Host of trivia game show reads out the following categories: us-east-1, DNS, race conditions, clean-up automation, delays due to backlog, health checks
sora.chatgpt.com
October 23, 2025 at 4:46 AM
EVERYBODY GO READ THE AWS INCIDENT WRITE-UP! aws.amazon.com/message/1019...
Summary of the Amazon DynamoDB Service Disruption in Northern Virginia (US-EAST-1) Region
aws.amazon.com
October 23, 2025 at 3:47 AM
norootcause on Sora
Have the audience shout out the name of the show one word at a time
sora.chatgpt.com
October 21, 2025 at 2:43 AM
Reposted by Lorin Hochstein
OH: "Why are all my industry incident-friends utterly destroyed today?"
October 20, 2025 at 11:29 PM
Reposted by Lorin Hochstein
I once asked a very skilled and experienced rower how people who make every boat they are in better do it. He said that he'd known two people like that, and they weren't the fastest or strongest rowers, but their superpower was adding stability where the boat needed it. Not something you'd see ...
New blog post: The illegible nature of software development talent
surfingcomplexity.blog/2025/10/08/t...
The illegible nature of software development talent
Here’s another blog post on gathering some common threads from reading recent posts. Today’s topic is about the unassuming nature of talented software engineers. The first thread was a …
surfingcomplexity.blog
October 9, 2025 at 10:03 PM
Reposted by Lorin Hochstein
I think the spooky magic of Pivotal Labs, at its best, was getting a bunch of these people together and then pulling in more and retaining them.
Bunch of people doing Prisoner's Dilemma and choosing "trust" every time.
October 9, 2025 at 10:18 PM
New blog post: The illegible nature of software development talent
surfingcomplexity.blog/2025/10/08/t...
The illegible nature of software development talent
Here’s another blog post on gathering some common threads from reading recent posts. Today’s topic is about the unassuming nature of talented software engineers. The first thread was a …
surfingcomplexity.blog
October 9, 2025 at 5:17 AM
Reposted by Lorin Hochstein
Google blacklisted the iCloud Private Relay IP addresses. It has been broken for ages now. On iOS Safari you need to press the “show IP address” option and then try searching again, and it works.
October 6, 2025 at 3:58 AM
I can't access Google Scholar over my phone (it gives me an error when I try to search or click a link on a paper), and I have no idea why.
October 5, 2025 at 8:03 PM
New blog post, some thought experiments about learning from incident write-ups: surfingcomplexity.blog/2025/10/04/t...
Two thought experiments
Here’s a thought experiment that John Allspaw related to me, in paraphrased form (John tells me that he will eventually capture this in a blog post of his own, at which time I’ll put a …
surfingcomplexity.blog
October 5, 2025 at 5:01 AM
New blog post that's nominally about statistics, based on various things I've been reading lately: surfingcomplexity.blog/2025/09/28/a...
A statistic is as a statistic does
(With apologies to the screenwriters of Forrest Gump) I’m going to use this post to pull together some related threads from different sources I’ve been reading lately. Rationalization a…
surfingcomplexity.blog
September 29, 2025 at 3:17 AM
New blog post on the problem of fixation during incident response: surfingcomplexity.blog/2025/09/20/f...
Fixation: the ever-present risk during incident handling
Recent U.S. headlines have been dominated by school shootings. The bulk of the stories have been about the assassination of Charlie Kirk on the campus of Utah Valley University and the correspondin…
surfingcomplexity.blog
September 20, 2025 at 6:21 PM
New blog post on the trade-offs of fine-grained progressive rollouts
surfingcomplexity.blog/2025/09/13/t...
The hidden trade-offs of fine-grained progressive rollouts
A progressive rollout refers to the act of rolling out some new functionality gradually rather than all at once. This means that, when you initially deploy it, the change only impacts a fraction of…
surfingcomplexity.blog
September 14, 2025 at 1:38 AM
Good public incident write-up from UptimeLabs. Quick notes:
* A patch-level dependency update unexpectedly changed behavior.
* Failure mode: saturation
* The role of health checks (a tool to improve availability!) in enabling this incident.
uptimelabs.io/when-fast-fl...
When Fast Flow Delivers A Real Blow: A PIR - Uptime Labs
An honest Post Incident Review: how fast flow and continuous delivery caused an outage, and why progress over perfection drives resilience.
uptimelabs.io
September 1, 2025 at 7:18 PM
Reposted by Lorin Hochstein
and he would have gotten away with it too if it weren’t for those meddling kids
August 27, 2025 at 1:26 PM
I would not have gotten my PhD if this rule was in effect when I was a student. (I started off in an EE PhD program at Boston University, decided I wanted to do CS instead, and left BU with an MSEE for U. Maryland, where I got my PhD. That was six years on my student visa).
Trump admin planning to change student visas from lasting for duration of academic program to fixed 4-yr term, and then much harder to renew
Could destroy US ability to attract global talent, particularly those seeking advanced degrees in STEM. The median time to complete a PhD is 5.7 yrs per NSF.
Trump Deals A New Immigration Blow To International Students
Trump officials have proposed a new rule limiting international students to fixed periods of entry, making a U.S. education more precarious.
www.forbes.com
August 29, 2025 at 12:18 PM
New blog post: My favorite developer productivity research method that nobody uses
surfingcomplexity.blog/2025/08/24/m...
My favorite developer productivity research method that nobody uses
You’ve undoubtedly heard of the psychological concept called flow state. This is the feeling you get when you’re in the zone, where you’re doing some sort of task, and you’r…
surfingcomplexity.blog
August 24, 2025 at 7:05 PM
Reposted by Lorin Hochstein
pretty sure the fire pigeons aren’t gonna care about your silly little sign
August 24, 2025 at 12:35 PM
August 23, 2025 at 10:47 PM