Explaining Software Design
@explaining.software.web.brid.gy
For the past few years, I've been working on a book about software design. Its thesis is fairly simple: Software development can be reduced to a […]

[bridged from https://explaining.software/ on the web: https://fed.brid.gy/web/explaining.software ]
complexity as entropy
In the previous post, I characterized Ron Jeffries' meandering approach to software design as "a shrug in the face of entropy." Some readers seem to have taken this as a strange, flowery metaphor. It wasn't.

In this newsletter, our analytic framework is borrowed from information theory: simplicity is a fitness between content and expectation. When we say software is complex, we mean it's difficult to explain; we need to bridge the distance between our audience's expectations and the realities of our software. This distance — the subjective complexity of our software — can be measured in bits.

In some literature, this metric is called surprisal. This is a good name; it reminds us that the complexity of our code varies with its audience. But there is a more common term for the number of bits in a message, borrowed from thermodynamics: entropy.

To understand the relationship between information and disorder in a physical system, consider why a sequence of random numbers cannot be compressed. When we compress data, we exploit its internal relationships; we use one part to describe another. You can see this in run-length encoding, or in how gzip encodes duplicate strings as back-references. But in a random dataset, there are no internal relationships; with each element, our explanation must begin anew.

And this is why, in this newsletter, we spend so much time on the structures that bind the disparate parts of our software together. These structures compress our code, amplify our explanations. They make our software simpler.
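The connection to compression is easy to demonstrate. Here is a minimal sketch using Python's `zlib` (the library behind gzip); the exact byte counts will vary by platform and zlib version, but not the orders of magnitude:

```python
import random
import zlib

# Structured data compresses well, because one part can describe another.
# Random data has no internal relationships to exploit, so its
# explanation must begin anew with every byte.
structured = bytes(i % 9 for i in range(10_000))              # a repeating pattern
noise      = bytes(random.randrange(256) for _ in range(10_000))

print(len(zlib.compress(structured)))  # a few dozen bytes: back-references suffice
print(len(zlib.compress(noise)))       # roughly 10,000 bytes: incompressible
```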
## entropy as decay

While entropy is an instantaneous measure, it's often used as a metonym for the second law of thermodynamics: in physical systems, entropy increases over time. When we talk about entropy, it connotes more than temporary disorder; it has the weight of inevitability, an inexorable decline.

This also applies to entropy in our software. Consider how, in the previous post, Jeffries' code was full of tiny, needless complexities: muddled naming conventions, a single mutable method on an otherwise immutable class. These small inconsistencies make the existing relationships harder to see, easier to ignore. If Jeffries continued to tinker with his solver, we'd expect this complexity to compound.

We cannot prevent our software from being misunderstood, even by ourselves. Its structures will, inevitably, be weakened as our software changes. Good design requires continuous maintenance.

According to Ron Jeffries, his pinhole approach to software design provides this maintenance. This is demonstrably false; despite spending months iterating on a few hundred lines of code, the inconsistencies remain. But Jeffries is not alone in his belief. Many writers espouse a similar approach to software maintenance, and all of them should be approached with skepticism.

## broken windows, clean code

Shortly before co-authoring the Agile Manifesto, Andrew Hunt and David Thomas wrote _The Pragmatic Programmer_. In the first few pages, they introduce the metaphor of **broken windows**:

> One broken window, left unrepaired for any substantial length of time, instills in the inhabitants of the building a sense of abandonment.1

And the same was true for software. "We've seen," they said, "clean, functional systems deteriorate pretty quickly once windows start breaking."

This metaphor was borrowed from an article in _The Atlantic_, published in 1982. It described how a social psychologist named Philip Zimbardo had parked a car in Palo Alto. He left the hood up and removed the license plate, as an invitation for passersby to do something antisocial. He observed it for an entire week, but nothing happened. "Then," the article tells us, "Zimbardo smashed part of it with a sledgehammer. Soon, passersby were joining in. Within a few hours, the car had been turned upside down and utterly destroyed."2 It is implied, but never stated, that the car was destroyed because of a simple broken window.

This is not what actually happened3, but few people noticed or cared. The article justified a belief that most readers of _The Atlantic_ already held: systems decay from the bottom-up. And this, in turn, suggested a corollary: systems are _restored_ from the bottom-up.

This preexisting belief was a reflection of both the publication and the time. _The Atlantic_ is a center-right magazine, and individualism was the ascendant political philosophy of the 1980s. As Margaret Thatcher put it:

> There is no such thing as society. There is a living tapestry of men and women and people and the beauty of that tapestry and the quality of our lives will depend upon how much each of us is prepared to take responsibility for ourselves.4

The moral of the broken window fable was one of personal responsibility. If we have problems — in our neighborhood or in our software — we should look to ourselves. Systems are just the sum of our choices.

We can find these same themes in Robert Martin's paean to hygiene, _Clean Code_:

> Have you ever waded through a mess so grave that it took weeks to do what should have taken hours? Have you seen what should have been a one-line change, made instead in hundreds of modules?5

"Why does this happen to code?" he asks. Is it external constraints? Is it poorly articulated goals? No, says Martin, this only happens to us when "[w]e are unprofessional." If our software is complex, it's only because we haven't kept it clean.

Martin does not offer a concise definition of clean code. His book, he says, will explain it "in hideous detail."6 He does, however, ask some of his friends to define the term for him. Ward Cunningham provides a familiar answer:

> You know you are working on clean code when each routine you read turns out to be pretty much what you expected.7

Martin calls this "profound." But where are these expectations set? His rules are syntactic, focusing entirely on the text of our software.8 It is implied, then, that our expectations arise from that text. Higher-level structures are just the sum of our syntactic choices.

This implies that code ought to be self-documenting. And indeed, Martin has repeatedly warned that comments are best avoided. If our software's paratext has anything to teach us, he says, that only means its text has fallen short:

> It's very true that there is important information that is not, or cannot be, expressed in code. That's a failure. A failure of our languages, or of our ability to use them to express ourselves. In every case a comment is a failure of our ability to use our languages to express our intent.
>
> And we fail at that very frequently, and so comments are a necessary evil — or, if you prefer, an unfortunate necessity. If we had the perfect programming language (TM) we would never write another comment.9

As you read through _Clean Code_, you begin to understand the ideal Martin is chasing. Our classes should be small, with an eye towards reuse in other applications.
Our functions should be "just two, or three, or four lines long."10 Martin wants his software to resemble Thatcher's atomized society: a collection of components which are small, clean, decoupled.

But this atomization doesn't make our software simpler. It leads, instead, towards entropy; without structures, our software becomes little more than random numbers. When we maintain our software, we are trying to preserve its simplicity. The cleanliness of our code is, at best, a small part of that simplicity. To focus entirely on broken windows — to dismiss the rest as incidental, regrettable — is less than a shrug. It's closing our eyes, and hoping the rest will take care of itself.

* * *

1. Hunt 1999, p. 5 ↩
2. Kelling 1982 ↩
3. According to Zimbardo's report, his research assistants destroyed the car on their own. After getting over their "considerable reluctance to take that first blow," he said, they found that "the next one comes more easily, with more force, and feels even better." For Zimbardo, this "awakening of dark impulses" in his students was a glimpse of the violence that lurked beneath polite society. Two years later, he would return to these themes in the now-discredited Stanford Prison Experiment. ↩
4. Thatcher 1987 ↩
5. Martin 2009, p. 5 ↩
6. ibid, p. 12 ↩
7. ibid, p. 11 ↩
8. Martin went on to publish _Clean Architecture_, _Clean Craftsmanship_, and _Clean Agile_, but this seems to be largely a branding exercise. The themes in _Clean Code_ are rarely mentioned, and never built upon. ↩
9. Ousterhout 2025 ↩
10. Martin 2009, p. 34 ↩
explaining.software
March 22, 2025 at 12:54 AM
the sudoku affair
In 2006, Ron Jeffries wrote a series of posts describing his attempts to build a Sudoku solver. He began by wrapping a class around a simple datatype for the board — essentially a `List[Option[Int]]` — and after that, there isn't much to tell. As Peter Seibel puts it:

> [H]e basically wandered around for the rest of his five blog postings fiddling with the representation, making it more “object oriented” and then fixing up the tests to work with the new representation and so on until eventually, it seems, he just got bored and gave up, having made only one minor stab at the problem of actually solving puzzles.1

This story has, in some circles, become notorious. There are two reasons for this.

The first is that Ron Jeffries is a leading proponent of a post-design approach to software development. Good design, he asserts, is simply the result of keeping your code "properly-factored:"

> [Kent] Beck has those rules for properly-factored code: 1) runs all the tests, 2) contains no duplication, 3) expresses every idea you want to express, 4) minimal number of classes and methods. When you work with these rules, you pay attention only to micro-design matters.
>
> When I used to watch Beck do this, I was sure he was really doing macro design "in his head" and just not talking about it, because you can see the design taking shape, but he never seems to be doing anything directed to the design. So I started trying it. What I experience is that I am never doing anything directed to macro design or architecture: just making small changes, removing duplication, improving the expressiveness of little patches of code. Yet the overall design of the system improves. I swear I'm not doing it.2

The second reason is that, in the same week that Jeffries was wandering around, Peter Norvig released a complete Sudoku solver. And in his first fifteen lines of code, he got further than Jeffries:

```python
def cross(A, B):
    "Cross product of elements in A and elements in B."
    return [a+b for a in A for b in B]

digits  = '123456789'
rows    = 'ABCDEFGHI'
cols    = digits
squares = cross(rows, cols)
unitlist = ([cross(rows, c) for c in cols] +
            [cross(r, cols) for r in rows] +
            [cross(rs, cs) for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')])
units = dict((s, [u for u in unitlist if s in u]) for s in squares)
peers = dict((s, set(sum(units[s],[]))-set([s])) for s in squares)
```

If you compare both implementations, Norvig's is notable for its clarity and directness. Even more notable is his chosen datatype. When Jeffries chose `List[Option[Int]]`, he was clearly mimicking how a Sudoku board was rendered on a page: there are 81 cells, and some of them have numbers. In Norvig's code, however, the Sudoku board is represented as a collection of possible moves: `Map[Coord, Set[Int]]`. An empty board is a collection of full sets; each cell can be assigned any integer in `{1, ..., 9}`. When a cell is assigned a number, its set is reduced to a singleton, and that number is removed from the sets of all its `peers`. If, as a result, a peer set is reduced to a singleton, this process repeats.

After that, the only thing left was a simple recursive search. At each step, a move was randomly selected from the available options. If that move emptied one of the sets, that line of search was abandoned. And if all the sets were reduced to singletons, then our search was over. While Norvig didn't employ any heuristics in his search, they'd be easy to add.
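Before turning to those heuristics, it's worth seeing the propagation loop in code. The following is a sketch in the spirit of Norvig's solver, reusing `digits`, `squares`, and `peers` from above; it is an illustration, not a quotation, and Norvig's own `assign` and `eliminate` differ in their details:

```python
def assign(values, cell, digit):
    """Reduce `cell` to a singleton, and propagate the consequences."""
    values[cell] = {digit}
    return all(eliminate(values, peer, digit) for peer in peers[cell])

def eliminate(values, cell, digit):
    """Remove `digit` from a cell's candidate set, recursing as needed."""
    if digit not in values[cell]:
        return True                      # nothing to do
    values[cell] = values[cell] - {digit}
    if not values[cell]:
        return False                     # contradiction: this line of search is dead
    if len(values[cell]) == 1:           # reduced to a singleton, so the process repeats
        (last,) = values[cell]
        return all(eliminate(values, peer, last) for peer in peers[cell])
    return True

# An empty board is a collection of full sets.
empty_board = {s: set(digits) for s in squares}
```

A recursive search then only needs to copy the board, `assign` a candidate move, and abandon the branch whenever `False` comes back.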
A common strategy in Sudoku is looking for naked pairs: if two peers share the same two candidate values, those two values can appear nowhere else in their unit. To employ this, all you'd need to do is replace the random selection with a function that looks for matching 2-sets. And so Norvig's implementation was more than a toy solution. It was a minimal, but extensible, example of constraint propagation. It provided a foundation for exploring both the problem and the solution.

* * *

## what it all means

Norvig, for his part, didn't assign any of this much importance. When interviewed by Seibel, he said the key difference was that Jeffries "didn't know how to solve the problem."

> I actually knew — from AI — that, well, there's this field of constraint propagation — I know how that works. There's this field of recursive search — I know how that works. And I could see, right from the start, you put these two together, and you could solve this Sudoku thing. He didn't know that so he was sort of blundering in the dark even though all his code "worked" because he had all these test cases.3

I agree, but would take it a little further. Both Norvig and Jeffries are **genre programmers**; they have spent most of their career solving a specific kind of problem. And that problem, inevitably, lends itself to a particular kind of solution.

Peter Norvig's genre is search. He literally wrote the book on good old-fashioned AI, where every problem is reduced — for better or worse — to a search problem.

Ron Jeffries' genre is, as best I can tell, the database application. Like the rest of the Agile Manifesto co-authors, he came up in an era where every business was seeking to "computerize" its processes. This led to a decade's worth of applications consisting of a database, a thin layer of business logic, and an even thinner frontend. There is, in these applications, a close relationship between the database schema and user interface. Consider the scaffolding provided by the Rails framework: you describe an entity, and it generates the code necessary to view and change those entities.

This is why Jeffries chose the `List[Option[Int]]` representation; it mimicked how a Sudoku board is presented to its user. This choice is not remarked upon. It is, to Jeffries, simply the obvious place to start. And I'd imagine that in his professional career, this intuition served him well. But his intuition was developed, and applied, within his chosen genre. Here, he was doing something new; a mystery novelist trying his hand at fantasy. But in the end, it was just a bunch of elves and dwarves in a stately manor, waiting for the wizard to tell them whodunnit.

Jeffries, it should be noted, also assigns little importance to this episode. After weathering fifteen years of online discourse, he wrote this:

> Did incremental design and development fail? I don’t think so. Certainly I was using an incremental approach, and certainly no product came out. Did the approach fail?
>
> I don’t think so. I think I wasn’t having fun and just stopped working on the project.4

This is belied, however, by the fact that he returned to Sudoku two years later, writing forty-five new posts on the topic. By the end of the fifth post, Jeffries had a working Sudoku solver. _But then he kept going._ He continued to tinker with the solver for another two months and forty posts. And in this prolonged epilogue, we can see the limits of his incremental approach to software design.
* * *

## back in the saddle point

Jeffries' second attempt at a Sudoku solver begins, predictably, with a `List[Option[Int]]` representation. From there, he writes a function that calculates the possible values for an empty cell. He writes a simple recursive search function, which always selects the first possible value for the first possible cell. And with that, his solver is complete.

His solution, from a design perspective, is serviceable. It's fewer than a hundred lines, but because of his chosen representation, much of that is spent on integer arithmetic:

```python
def used_numbers_in_sub_grid(self, position):
    first_index_in_row = position // self.line_size * self.line_size
    offset_in_row = position % 9 // 3 * 3
    first_index_in_sub_grid = first_index_in_row // 27 * 27 + offset_in_row
    ...
    for row in range(first_index_in_sub_grid, first_index_in_sub_grid+27, 9):
        for col in range(0, 3):
            ...
```

We can contrast this with the (roughly) equivalent code in Norvig's implementation:

```python
[cross(rs, cs) for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')]
```

The difference lies in how each cell is represented. In Jeffries' code, each cell is an integer, representing its index in the `List`. Norvig's code, on the other hand, represents each cell as a string describing its row and column. As a result, he doesn't need to fuss with modulo operators; he can simply use string concatenation.

This, by itself, is not an indictment of Jeffries' approach. His software design is incremental; what matters is the destination. As he explains in his twenty-third post:

> Naturally, I try to make good design decisions, not bad ones, although often enough the decisions I make do not pan out. (Yes, I hear you saying that makes your case for coming up with a solid design early, but no, it doesn’t. It makes mine: I don’t know enough to make a better decision.) So I try to make small decisions, simple decisions, decisions that will be easy to change when, not if, a better idea comes along.5

There is, unfortunately, little evidence of this in the preceding posts. After demonstrating a working solver, Jeffries spends the next six posts trying to "simplify" it. He tinkers a bit with the integer arithmetic, and then hides it away in a separate class.

Then, in the twelfth post, things get interesting. While debugging his implementation, he generates an analogue of Norvig's representation: rather than an `Option[Int]`, each cell is a `Set[Int]` of possible values. And in the thirteenth post, he arrives at the notion of constraint propagation:

> 1. Suppose the puzzle contains, not just the current solution state, but also, for each of the 81 cells, the available values for that cell. So when we do our `new_puzzle_trying` method, we’d
> 2. Create a new puzzle that copied all those values … then
> 3. Assign our guess … and then
> 4. Recompute the available values for the position’s components, which
> 5. Can be done either by removing the guess or recalculating, whichever is better.6

This is a pivotal point in Jeffries' design process. He has solved the problem, understood the limitations of that solution, and imagined something entirely different. It is, in other words, a chance to start over. If Jeffries started with a different core representation, then it's likely his subsequent design decisions would also change.
The bookkeeping for constraint propagation might push him towards Norvig's relational approach to the rules of Sudoku; rather than continually recomputing the rows, columns, and boxes, he could simply have a map of each cell onto its `peers`. He could distill every lesson of the previous posts, creating something simpler and faster.

But Jeffries isn't in the business of starting over. He not only believes in incremental design, but in using the smallest possible increments. In his posts, he regularly returns to GeePaw Hill's maxim of "many more much smaller steps." He is only interested in designs that are reachable through a series of small, discrete steps:

> In most every article of the 5,000 articles here, at least the programming ones, I have intentionally done minimal design up front. My intention was to demonstrate what happens when I do that, expecting that small steps, tests, and refactoring would enable me to improve the design as needed.... In almost every situation, that has turned out to be the case. Incremental design, at least at the level I do it here, works well. And the techniques thereof allow us to improve any code that needs it.7

This, again, is contradicted by the preceding posts. His attempts to implement constraint propagation had unambiguously failed. He began in his seventeenth post, adding a `List[Set[Int]]` as a sidecar to his core representation. But since the incremental updates were, as yet, unimplemented, it had to be recomputed after each move. As a result, his solver became two orders of magnitude slower; his test puzzle, which once took 200ms to solve, now took a full 20 seconds.

Jeffries seems largely unbothered by this. He disables the slow test, and begins to look at search heuristics, under the theory that "moving towards more human approaches"8 might offset the performance loss. He only returns to it ten days and eleven posts later, at which point he's clearly lost the thread. Rather than try to implement the incremental updates, he simply makes it so that the `List[Set[Int]]` is lazily generated. And since nothing is actually _using_ this new data structure, this "fixes" the issue.

In the twenty-odd posts that follow, Jeffries never returns to constraint propagation. Instead, he putters around with search heuristics like naked pairs; something that, again, Norvig's approach makes fairly trivial. And after the forty-fifth post, Jeffries seems to lose interest. He has, apparently, reached his destination.

The resulting code is, still, serviceable. It solves Sudoku puzzles, and has support for various "human approaches" to solving puzzles. But the implementation9 has a diffuse, muddled quality. In his `Puzzle` class, some methods refer to the `Set[Int]` associated with a cell as `possible_answers`, and others as `candidates`. Likewise, most method names distinguish between `position` (index) and `cell` (value), but `Puzzle.unknown_cells` returns a list of indices. And while `Puzzle` began as an immutable representation, somewhere along the way it grew a mutable `remove_naked_pair` method.

In a larger codebase, these sorts of inconsistencies are inevitable. Despite our best efforts, entropy creeps in. But Jeffries' solver is only a few hundred lines of code, and was refined for months on end. We must treat every line as intentional.

When we say software is simple, we mean it's easy to explain. Well-designed software often has a narrative structure; there is a natural order to its components, and each helps to explain the next.
We can see this in Norvig's implementation: it codifies the rules as a set of relationships, and then uses those relationships to solve the problem. This doesn't happen by accident. The developer needs to hold the entire structure in their head, and find a simple path that connects its constituent parts. And if they can’t find that path, they need to find a better structure.

Jeffries, however, does not believe in bigger pictures; his approach to software design is proudly myopic. He prevents himself from seeing the forest by pressing his face against the trees. And sometimes, as he moves from one tree to the next, he takes a moment to celebrate:

> As I refine and refine and refine, the design moves toward smaller objects, with single responsibilities and simple code. The design improves, bit by bit, almost by itself.10

But it doesn't. Software design is a deliberate process, and requires deliberate effort. Anything less is just a shrug in the face of entropy.

* * *

1. Seibel 2009a ↩
2. Marick 2008, p. 185 ↩
3. Seibel 2009b, p. 200 ↩
4. Jeffries 2022 ↩
5. Jeffries 2024a ↩
6. Jeffries 2024b ↩
7. Jeffries 2024a ↩
8. Jeffries 2024c ↩
9. Unfortunately, Jeffries doesn't provide a complete listing of the final version of his solver. To piece it together, you'll need to read all four of his final review posts. ↩
10. Jeffries 2024a ↩
explaining.software
February 14, 2025 at 12:46 AM
state and trace
When we say fiction belongs to a genre, we mean that it builds upon familiar themes and structures. By assigning a genre — a slasher flick, or a comedy of manners, or a murder mystery — we shape the audience's expectations. Genre is a locus; it makes the rest of the explanation less surprising.

And this is the point. People like genre fiction because it's familiar; it can be read easily, or even mindlessly. Literary fiction, on the other hand, challenges the reader. Its goal, according to theorists, is to make the familiar feel strange:

> [T]he essential function of poetic art is to counteract the process of habituation encouraged by routine everyday modes of perception. We very readily cease to 'see' the world we live in, and become anaesthetized to its distinctive features. The aim of poetry is to reverse that process, to _defamiliarize_ that with which we are overly familiar, to 'creatively deform' the usual, the normal, and so to inculcate a new childlike, non-jaded vision in us.1

When we create software, however, our sympathies must lie with the genre writer. We want the text to be simple, unsurprising. We want our readers to feel a sense of cozy familiarity. We should, then, try to understand how genres create this familiarity.

We can begin with the story beats; the events that comprise the narrative. We will call this the story's **trace**. Each of the events in this trace represents a change to the **state** of the story. This state is a structure, composed of the key entities and their relationships. These key entities are what shape the narrative. They make the trace predictable; each part of the state helps us anticipate future states. In fiction, this structure is usually learned through repetition; we must read, and compare, many stories within the same genre.

Let's consider the structure of a murder mystery, as popularized by Agatha Christie. These stories follow a familiar sequence: a murder occurs, a detective is called in, each suspect is interviewed, and then the murderer is revealed. Each event, as it occurs, updates our state.

This structure is quite sparse. There is no place for incidental characters or incidental relationships; everything relates to the murder. In an archetypal murder mystery, there are no aimless conversations, no quiet interludes. Each event is understood as bringing us closer to learning whodunnit. The obvious locus is the victim, who has a strong relationship to all of the suspects. The relationship between the suspects and detective is much weaker; most detectives are itinerant, solving murders under a variety of circumstances and in a variety of locales.2

Structural analysis of fiction tends to focus on the story's trace.3 On the face of it, this seems quite reasonable; the state, after all, is simply the sum of past events. We can adopt a similar perspective when looking at the implementation of a database. As Pat Helland puts it:

> Transaction logs record all the changes made to the database. High-speed appends are the only way to change the log. From this perspective, the contents of the database hold a caching of the latest record values in the logs. The truth is the log.4

Datastores such as Apache Kafka take this to its logical conclusion: they only provide a log of events, a trace. State is derived by the application.
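In code, deriving state from a trace is a left fold. Here is a minimal sketch, with a hypothetical stream of deposit events standing in for a Kafka topic:

```python
from dataclasses import dataclass

@dataclass
class Deposit:
    account: str
    amount: int

def apply_event(state: dict, event: Deposit) -> dict:
    """Fold a single event into the derived state (account -> balance)."""
    state[event.account] = state.get(event.account, 0) + event.amount
    return state

# The trace is the log of events; the state is just its running sum.
trace = [Deposit("alice", 100), Deposit("bob", 50), Deposit("alice", -30)]

state = {}
for event in trace:
    state = apply_event(state, event)

assert state == {"alice": 70, "bob": 50}
```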
But consider this scenario: you're watching a television show, and are deep into its fourth season. A friend sits down beside you. They've never seen the show before. "But that's fine," they say, "just catch me up."

You will not, in this situation, recite every event of the past four seasons. Most of those events, at this point, don't matter. They belong to story lines that have long since been resolved. Our trace, it turns out, contains narrative arcs.

A narrative arc offers closure; once complete, it congeals into the sum of its events. It allows us to focus on the present state, the new equilibrium. Our explanation to our friend will be similarly biased towards the present. We will focus on the state of the narrative; if we discuss events, that suggests a narrative arc which is still in-flight.

Similarly, databases continually compact their transaction logs. The transactions are, themselves, narrative arcs; we don't store every step of a transaction, only its outcome. And in applications built atop Kafka, it's impractical to store and process an ever-growing log of events. There will, inevitably, be mechanisms for the derived state to be cached, and the trace to be truncated.

Our software is full of narrative arcs. Consider, for instance, that pure functions — functions without any side-effects or context-dependence — are **referentially transparent**. This means that function invocation and the return value are interchangeable:

```
val x = 2 + 2
val x = 4
```

We can, in other words, ignore what happens inside a pure function. All that matters is how it ends. We can even, if we're careful, apply this logic to less-pure functions:

```
val config = load_config_file()
```

Here, our `config` value depends on the contents of the config file. Different executions may produce different traces. But within a given trace, once we have `config`, it doesn't really matter what happened inside `load_config_file()`. All that matters is how it ended.

It's instructive to consider how we use debuggers. We can, if we like, step through our programs one instruction at a time. We can peel away the narrative arcs, examine the raw trace. In practice, however, we never start at the beginning; we start at our breakpoint. And only then, after billions of instructions have elapsed, do we sit down and ask to be caught up.

We cannot, then, understand our software solely through its trace. There are too many events, and too few with any real importance. We must layer narrative arcs atop the endless details of our software's execution, and then look to its state.

Unlike in fiction, our software's state is explicitly modeled. It is, essentially, the graph of in-memory data. Even as this graph changes, its topology remains redundant, unsurprising. The text of our software is organized around the structures within its state.

Consider if this weren't the case. Dijkstra, in _Go To Statement Considered Harmful_, argues for the opposite:

> [W]e should do ... our utmost ... to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.5

When implementing a function, this is good advice. When we read a function, we are usually trying to understand its trace. The text of our function should make that as easy as possible.

This argument, however, breaks down at larger scales. If two functions are called in sequence, they don't need to sit side-by-side in our codebase. And that sequence, in any case, may vary with our inputs. There's little point in trying to mirror the trace of our software in its text. This is why, in nearly every language and every project, the text of our software is organized around datatypes.
Together, these datatypes comprise our software's model; each is defined by its relation to that whole. And, in turn, facets of that model are reflected in our software's state.

Software, however, is as much a machine as it is a text. We must be able to reason about its execution, even as the trace grows impossibly large. Narrative arcs are central to the practice of software design. And so, we must explore the structures within our software's state, the narrative arcs atop our software's trace, and the ways in which each shapes the other.

* * *

1. Hawkes 2003, p. 47 ↩
2. Even Miss Marple, who spends most of her time in the South of England, sometimes vacations in the Caribbean. ↩
3. See, for instance, Joseph Campbell's _The Hero With a Thousand Faces_, Vladimir Propp's _Morphology of the Folktale_, and Tzvetan Todorov's _Grammar of the Decameron_. ↩
4. Helland 2015 ↩
5. Dijkstra 1968 ↩
explaining.software
January 23, 2025 at 12:41 AM
structuralism
The idea of structure, as used in this newsletter, is nothing new. It was first introduced by Ferdinand de Saussure in 1916, in his enormously influential _Course in General Linguistics_. In it, Saussure describes language as a graph, and words as vertices. Between these vertices, there are negative edges representing difference, and positive edges representing signification and similarity. To understand a word, then, we must understand the topology that surrounds it. The meaning of each word arises from its relationships. Language is not a mere collection of words; it is a structure, and each part is defined by its relation to the whole.

Saussure's approach gave rise to **structuralism**, a school of thought in which a wide variety of systems — social, narrative, biological, mathematical — were modeled as structures.1 The structuralists were especially interested in paired concepts like masculine/feminine, which were defined through their mutual opposition. These pairings are typically referred to as **binary oppositions**, or simply binaries.

These binaries, however, aren't Boolean. Both "masculine" and "feminine" are continuums; our language simply places them on opposite ends of the _same_ continuum. The structuralists were interested in the implications of these oppositions — to become less masculine is, necessarily, to become more feminine — and how they shaped our culture.

It's entirely possible to understand structures without studying structuralism. We are, perhaps, uniquely qualified; few disciplines allow a definition as concise as "a graph where vertices are concepts and edges are relationships." But structures exist in every domain. These structures are opportunities; they allow us to draw parallels between those domains and our own. Whenever we delve into a new topic, we should look for the parts which resonate.

For example, consider this metaphor:

> That's a very _shallow reading_

Here, text is a layered container. The upper layers, the **surface**, must be peeled away to reveal the lower layers, the **substance**. When we say a person is shallow, we mean they never delve into these lower layers. They consume nothing of substance, and therefore _have_ no substance.

A surface, then, is cosmetic; changes to something's surface won't affect its substance. It is a facade, a veil to be lifted. Whenever this isn't the case, it must be called out:

* Don't _read too deeply_ into that
* It does what it _says on the tin_

The shallow-metaphor contains two implicit assumptions: surface and substance are opposites, and shallow/deep is **aligned** with surface/substance. They are, effectively, the same continuum.

The first assumption is only sometimes true. A good interface, for instance, _distills_ the underlying implementation. The surface reveals the substance.

The second assumption, however, is often a useful heuristic in software. Consider an application, split into the familiar binary of frontend and backend. The application is available on multiple platforms, and thus requires multiple frontends.

As part of the application, these frontends are shaped by that application's sense. They share a common purpose. Their implementations, however, tend to reflect their differences. Parallel implementations are difficult to write, and even harder to maintain; broad similarities tend to hide subtle differences. Unable to live in both, business logic is squeezed into the backend. And this business logic is, arguably, the substance of the application.
As users move between frontends, the backend remains the same. When building a new frontend, we're rarely able to reshape the backend to fit our needs.

None of this applies, however, if there's a single frontend. Here, the frontend and backend exist on the same continuum, but we can draw the threshold wherever we like. If our team is full of frontend developers, our frontend will be full of business logic. The surface and substance are intermingled.

And then we come to software without a frontend, which industry luminaries sometimes call **deep tech**. This is meant to mirror the binary of basic and applied research. Basic research delves into the nature of the world; while lacking any immediate applications, it is expected to enable a wide range of applied research in the future. Likewise, deep tech is expected to enable a wide range of future applications. But where basic research is defined by its proximity to observable phenomena, deep tech is only defined by its distance from the user. Nothing aligns depth with substance. All too often, what someone calls deep tech is just a solution in search of a problem.

Surface is not necessarily opposed to substance. And where it is, surface/substance is not necessarily aligned with shallow/deep. Nevertheless, we often speak and act as if they are. This is, perhaps, necessary. By placing two concepts at opposite ends of the same continuum, or aligning two continuums with each other, we reduce the world's dimensionality. We make it tractable.

We should, however, be willing to introspect on these reductions. There is, for instance, a convention that surfaces are feminine. Consider our assumptions about a person who buys makeup, or works as a receptionist. In our industry, this means that early-career women are often pushed towards frontend roles. I've seen this justified as the sort of work they'd be "good at" or "interested in."

Our intuition is a reflection of what we choose to align, and how. When communicating, then, we must be aware of our audience's intuition. To be concise, we must work with that intuition, building atop it. But if we wish to be precise, it's often necessary to break that intuition apart, articulating differences that have long been overlooked.

* * *

1. A survey of these various projects can be found in Jean Piaget's _Structuralism_. ↩
explaining.software
January 23, 2025 at 12:41 AM
senior developer agents
When you're responsible for a junior developer, there's an early, crucial milestone: they know when to ask for help. Before this milestone, every task must be carefully curated. Each day of radio silence — be it from embarrassment or enthusiasm — is a cause for concern.

Achieving this milestone can be difficult. It requires the developer to intuit the difference between forward motion and progress. It is, however, necessary; until they do, the developer will remain a drain on their team's resources.

And this is what we've seen in the first generation of developer agents. Consider Devin, an "AI developer." Everyone I know who's evaluated it has reported the same thing: armed with a prompt, a search engine, and boundless yes-and enthusiasm, it pushes tirelessly forward. Successful outcomes require problems to be carefully scoped, and solutions to be carefully reviewed. It is, in other words, an intern simulator.

In this post, we will be exploring a few different topics. The first is the nature of seniority in software development. The second is how agents can incrementally achieve that seniority. Lastly, we will consider our own future: if there _are_ senior developer agents, how will we spend our time?

* * *

# seniority and simplicity

In this newsletter, simplicity is defined as a fitness between content and expectations. There are two ways to accomplish this: we can shape the content, or we can shape the reader's expectations. Whenever we explain our software, there are some natural orderings: class before method, interface before implementation, name before referent. By changing one, we change expectations for the other. This is the essence of software design.

This is why designers often fixate on the things that are explained first: the project's purpose, the architectural diagrams, and the names of various components and sub-components. These are the commanding heights of our software; they shape everything that follows. A designer, then, is someone whose work has explanatory power for the work of others. And as a rule, the more explanatory the work, the more experienced the worker. For our purposes, seniority is just another word for ascending the explanatory chain.

The impact of this work is, sometimes, easy to overlook. This is because there are three parts to every explanation, but only one is said aloud. The **prefix** is what everyone already knows, the **suffix** is all tomorrow's explanations, and the **content** is everything in between.

To understand each part, let's consider a pull request. In addition to the diffs, each pull request usually includes an explanation. Often, this will link to external texts — bug reports, conversations on Slack, other pull requests — to explain why the change is being made. In literary theory, these are called **paratexts**; they "surround, shape, support, and provide context"1 for the changes being made.

Most paratexts, however, tend to be left unreferenced. In an intra-team code review, everyone involved will have received the same onboarding, attended the same meetings, and participated in the same discussions. This shared knowledge has explanatory power. To judge the simplicity of a pull request, we must first understand its prefix.

But the prefix, on its own, is not enough. Consider how, in a code review, there are two types of negative feedback. The first is simple: this doesn't work. The second is more nuanced: this _works_, but won't _last_. Our industry has a number of terms for this.
We talk about _incurring technical debt_: our work today will have to be redone in the future. We talk about _a lack of extensibility_: our work today doesn't provide a foundation for the future. We talk about _losing degrees of freedom_: our work today precludes something we want to do in the future. These are different ways of saying the same thing: this content doesn't fit its suffix. It has no explanatory power for the changes yet to come. It sets expectations that will, eventually, have to be unwound.

A code review, then, is about more than checking for mistakes. It is about ensuring _simplicity_. There should be a fitness between prefix, content, and suffix; each should help to explain the next.

And this is the core responsibility of a senior developer: to ensure simplicity. Within a project, this requires a strong grasp of its prefix and suffix. Across _all_ projects, this requires an internal metric for explanatory power: the degree to which one piece of content, presented first, makes another less surprising.

This metric is not some hand-wavy abstraction. In information theory, **surprisal** — a mismatch between expectation and content — is a synonym for entropy. It can be measured in bits. If we can model our audience's expectations, then we can create an explanatory metric.

Intuitively, this seems like a problem well-suited to a language model. And, having given it a bit more thought, I think this intuition stands. Language models aren't an off-the-shelf solution2, but I'm optimistic that we can use them to create a passable explanatory metric.

It's worth noting, however, that a _better_ explanatory metric is not simply more accurate. Some explanations are more demanding than others. The explanatory relationship between our software's business model and product design is likely complex, requiring careful interpretation by the audience. By comparison, we'd expect the relationship between a function's name and body to be much simpler. And if it isn't, then the fault lies with the code, not the audience.

A middling explanatory metric, then, would only be useful for judging the middle (and lower) layers of our software's design. But this is fine. As we'll see, even a limited metric can get us past that early, crucial milestone. And each time we improve the metric, our agents will become a little more senior.

But first, if we want to measure the prefix's explanatory power, then it cannot be left unsaid. It must be reified.

* * *

# reifying the prefix

In previous posts, our primary analytic framework has been the **structure**. Each structure is a graph, where each vertex is a subset of our codebase3, and each edge is a relationship. And unless a vertex represents individual lines of code, it will contain its own sub-structure. For the purposes of this post, we will treat each structure as a collection of short, declarative sentences. Each sentence consists of two components, and a relationship:

* The **storage proxy** is a router and read-through cache for the **storage service**.
* The **lexer** converts text into tokens for the **parser**.
* Each value added by **enqueue** can be removed, in order, by **dequeue**.

Together, these structures and sub-structures comprise our software's prefix. Every explanation of our software begins (implicitly) at the root structure, where our software is a single vertex in a larger system. From there, our explanation can descend until it reaches the relevant code.4 These structures, then, represent our explanatory topology.
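To make this concrete, here is a sketch of how such sentences might be scored. Everything in it is hypothetical: `surprisal` uses compression as a crude, runnable stand-in, where a real implementation would lean on a language model's conditional log-likelihoods, and `explanatory_power` is an illustrative name, not an existing API:

```python
import zlib

def surprisal(text: str, context: str = "") -> int:
    """Hypothetical metric: the extra bytes needed to encode `text`
    once `context` has already been seen. A crude stand-in for a
    language model's conditional log-likelihood."""
    with_context = len(zlib.compress((context + text).encode()))
    alone = len(zlib.compress(context.encode()))
    return with_context - alone

def explanatory_power(sentence: str, prefix: list[str], code: str) -> int:
    """How much less surprising does `code` become when `sentence`
    is included in the prefix? Measured by omitting it."""
    without = [s for s in prefix if s != sentence]
    return surprisal(code, "\n".join(without)) - surprisal(code, "\n".join(prefix))
```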
We expect each parent to have a high degree of explanatory power for its children: it's difficult to explain `pop` without first explaining `Stack`. Likewise, we expect siblings to have explanatory power: `pop` and `push` are typically explained together. Conversely, if two nodes lack a parent or sibling relationship, then they should have little explanatory power for each other. Each can be understood, or changed, in isolation. Coupling, after all, is just another word for co-explanation.

And so, if we were to write out the sentences of our software's structures, we could use the explanatory metric to validate them. The higher-level structures will be the hardest to validate, and likely require human review. But this is fine; high-level structures are fewer in number, slower to change, and thus long-standing. They should already be widely understood.

Everywhere else, we can lean on our tooling. By omitting each sentence in turn, we can measure its individual explanatory power. We can highlight the weakest sentences, along with any code that remains surprising. And then we can iterate, reshaping the prefix and code until they fit.

Once we're done, we will have a living design document. Each time the code changes, we can check for incipient staleness in our prefix. And whenever we find it, we will be forced to make a choice: which is at fault? The prefix, or the code?

* * *

# judging the content

Sometimes, this question has a simple answer. Tasks for junior developers — human or otherwise — shouldn't require much software design. Their changes, then, should preserve the prefix's explanatory power. If they don't, then something's gone awry; it's time to ask for help. There are a few reasons this might happen:

1. **The task is ill-conceived.** What we're trying to do doesn't fit our software's design; we need to try something different.
2. **The design is ill-conceived.** What we're trying to do doesn't fit our software's design; we need a different design.
3. **The developer can't make it work.** The task is too demanding; it needs to be done by someone more experienced.

This last case is relatively straightforward: we reassign the task. The other two cases, unfortunately, are a bit more challenging; to start, we need to figure out which one we're dealing with.

Here, again, we can turn to the reified prefix. How much of our prefix is undermined by our changes? And, if we update the prefix to match our changes, how much does that undermine the surrounding code? This is the **impact** of our pull request. The higher-impact the changes, the more likely they are to be ill-conceived.

A reified prefix, then, plays two roles. For junior developers, it's a set of guardrails. If they ever bump up against its limitations, then it's time to ask for help. For senior developers, it's a bicycle for their software's design space.

Used as a guardrail, a prefix will make agents a tiny bit more senior. It may even, on its own, make them net-positive contributors. If an agent can quickly recognize, and explain, its inability to create a low-impact pull request, there's real value in letting it try. And from there, we can let the agent ride the bicycle, and see what happens. We can allow it to propose higher-impact changes, and if they're any good, we can continue to raise that threshold. Whenever the agent gets a little more senior, we get a little more value.
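Continuing the earlier sketch, an impact score might look something like this. It reuses the hypothetical `surprisal` stand-in from above, and `impact` is again an illustrative name, not an existing tool:

```python
def impact(prefix: list[str], code_before: str, code_after: str) -> int:
    """How much more surprising does the codebase become, under the
    existing prefix, once a change is applied? Low-impact changes
    preserve the prefix's explanatory power."""
    context = "\n".join(prefix)
    return surprisal(code_after, context) - surprisal(code_before, context)
```

A guardrail, then, is just a threshold on this score: beneath it, the agent proceeds; above it, the agent stops and asks for help.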
* * *

# scrying the suffix

Let's imagine, for the moment, that we had an oracle. While the oracle can't tell us exactly what we'll be working on, it _can_ provide a representative sample of possible future tasks. With this in hand, measuring the fitness between content and suffix would become a straightforward, albeit expensive, Monte Carlo process. First, we take a large sample of future tasks. Then we try to solve each task, both with and without our proposed changes. Finally, we compare their complexity; did our changes make the task easier, or harder? If, on average, they make things easier, then our content has explanatory power for the suffix.

Unfortunately, we don't have any oracles. Our roadmap can only offer a single, fragile narrative, and what we need is a garden of forking paths. When evaluating a design choice, we must consider both what it enables and what it precludes. We need to ensure that the paths not followed, our **counterfactuals**, aren't something we'll later regret.

And this, by itself, is a more tractable problem. We need something akin to QuickCheck, which, given a failing input for a test, will try to reduce it down to its minimal form. In our case, we want minimal counterfactuals: simple tasks which, if we make the proposed changes, become significantly more complex. Once generated, they can be added to our review process.

In our imagined future, each pull request will have four parts: the description, the codebase diffs, the prefix diffs, and the counterfactuals. And then, finally, a person must review the changes.

* * *

# the future of software development is software design

There are two ways to speed up software development: we can make code easier to generate, or we can make code easier to evaluate. Recently, our industry's attention has been on the former; it is, after all, _generative_ AI. But generation isn't the problem; language models can, and will, drown you in code. The challenge is ensuring the code is _worth the time it takes to evaluate_. And so, unless the quality of your code is improving day-over-day, it's worth considering the other side of that equation.

In this post, I've suggested a few different ways that evaluation can be improved. While I think all of them are promising, the only part that's load-bearing is the explanatory metric. If we have that, then agents can begin to evaluate their own output. They can iterate, and improve. And whenever they hit a fixed point, they can ask a human to enter the loop.

And this, I think, is a plausible future for software development. If agents are sufficiently inexpensive, then every task can be delegated by default. The agent will make as much progress as it can, and then report back. We will, by necessity, develop an intuition for why they failed. Does the task need to change? The design? Or do we simply need to implement it ourselves? We will, in other words, all become lead developers.

And this may not be a smooth transition. Today, these skills are treated as inherently tacit; they cannot be taught, only learned. We rely on the fact that some people, after years of hands-on experience, will demonstrate the necessary aptitudes. In this imagined future, we'd need a more direct path. And this, I hope, is where newer facets of the review process — the prefix, the counterfactuals, and so on — would come into play. By giving voice to the unspoken parts of each explanation, we make it easier for junior developers to participate.

Looking to my own future, this is something I'd enjoy building. I have, for the past few years, been working on a book about software design.
I'm excited by the possibility of putting those ideas into practice. If you have a role matching that description, drop me a line. I'd be happy to chat.

* * *

1. Consalvo 2007, p. 182 ↩
2. On the surface, there seems to be a strong resemblance between a language model's perplexity and human surprisal. But the expectations we derive from a text are not just from what it says, but from what it implies. Perplexity may capture some surface level implications, but it won't get us all the way there. ↩
3. More generally, vertices in a structure can represent any abstract concept — datatypes, actors, actions, and so on — but for the purposes of this post, they represent contiguous regions of code. ↩
4. In practice, these paths may be a little indirect. In an MVC web application, for instance, reaching the controller requires a detour through the model. ↩
explaining.software
January 29, 2025 at 12:44 AM
the death of the architect
Once upon a time, every project began with the creation of a canonical design document. This was called the system **architecture**, because it "rightly implie[d] the notion of the _arch_, or prime, structure."1 Then, documents would be written for each module. These would provide detailed instructions for how the module should be implemented. Often, there would be diagrams for the control flow of each subroutine. And only then, once all the documentation was complete, would the implementation begin. This was seen as a largely mechanical task, akin to compilation. It was generally assigned to the junior members of the team; after all, the hard part was already done.

This approach was a major contributor to the software crisis. Decisions made early in this process would become load-bearing, impossible to change.2 This, however, only made the design phase seem _more_ important. Countless methodologies were proposed, all of them design-first.

And so, when Kent Beck began to talk about iterative development, people were ready to listen. In his Extreme Programming (XP) methodology, design and implementation were interleaved. "There will never be a time," he said, "when the system 'is designed.' It will always be subject to change, although there will be parts of the system that remain quiet for a while."3

We can find a number of familiar ideas in the first edition of _Extreme Programming Explained_. Beck, for instance, also measures complexity in bits:

> Simplicity and communication have a wonderful mutually supportive relationship. The more you communicate, the clearer you can see exactly what needs to be done and the more confidence you have about what really doesn't need to be done. The simpler your system is, the less you have to communicate about, which leads to more complete communication, especially if you can simplify the system enough to require fewer programmers.4

And for your design to remain simple, you would need "a clear overall metaphor so you were sure future design changes would tend to follow a convergent path."5 In fact, metaphors were central to the XP methodology. A metaphor "helps everyone on the project understand the basic elements and their relationships."6 And they are especially useful for high-level design:

> Architecture is just as important in XP projects as it is in any software project. Part of the architecture is captured by the system metaphor. If you have a good metaphor in place, everyone on the team can tell about how the system as a whole works.7

Instead of exhaustive design, Beck wanted _just enough_ design. His system metaphor was something that could be explained in a moment, and was robust to change. It was a pane of frosted glass, a locus.

A year later, Beck's ideas were distilled into the Agile Manifesto.8 His notion of lightweight design became a preference for "responding to change over following a plan." What's the point of a plan that solves yesterday's problems?

And then, three years after the Manifesto, Beck released the second edition of _Extreme Programming Explained_. It had been rewritten from scratch. There was not a single mention of metaphors or system architecture. Nor, really, any discussion of the future. In this iteration of XP, you simply moved from moment to moment.

As readers, our instinct is to treat the second edition as a continuation of the first. After all, they have the same author. And there was a reason Beck called it "extreme" programming:

> When I first articulated XP, I had the mental image of knobs on a control board.
> Each knob was a practice that from experience I knew worked well. I would turn all the knobs up to 10 and see what happened.9

The second edition, then, could be Beck simply turning Agile's preference for "responding to change" up to 10. If you're always living in the moment, what's the use of architecture or metaphors?

A broader review of the Agile literature, however, reveals a different story. As Robert Martin explains it:

> In the years just before and after the signing of the Agile Manifesto, the _Metaphor_ practice was something of an embarrassment for us because we couldn't describe it. We knew it was important, and we could point to some successful examples. But we were unable to effectively articulate what we meant. In several of our talks, lectures, or classes, we simply bailed out and said things like, "You'll know it when you see it."10

And this embarrassment, when you look for it, is plain to see. Metaphors are treated like stray parts next to newly constructed furniture; if they're mentioned at all, it's to explain why they probably don't matter. In _Domain-Driven Design_, for instance, the topic is buried four hundred pages deep:

> System metaphor has become a popular approach because it is one of the core practices of extreme programming. Unfortunately, few projects have found really useful metaphors, and people have tried to push the idea into domains where it is counterproductive.11

This was not a refinement; it was a tactical retreat. Despite everyone's best efforts, Beck's ideas about lightweight software design remained stubbornly tacit. And so, they were quietly discarded. With them, the Agile methodology lost any notion of continuity, of describing the future. It was left floating, unmoored, in the eternal now.

* * *

Decades later, software design has become something of a backwater. Most writing on the subject can only be called "post-design." It is defined by what it refuses to discuss. "Programmers," Sandi Metz warns us, "are not psychics."

> Practical design does not anticipate what will happen to your application. It simply accepts that something will and that, in the present, you cannot know what.12

Our only choice, Ron Jeffries tells us, is to focus on the present:

> The source code is also the ONLY document in whatever collection you may have that is guaranteed to reflect reality exactly. As such, it is the only design document that is known to be true. The thoughts, dreams, and fantasies of the designer are only real insofar as they are in the code.13

This seems like hard-nosed pragmatism, until you realize that software is a living text. Each snapshot is just a stepping stone. And the path they follow is, undeniably, shaped by our "thoughts, dreams, and fantasies."

There is a _smallness_ to this post-design literature. It confines itself to the syntax, offering heuristics for better code. Sometimes, these heuristics are fenced in by warnings about the failures of the past. But more often, the limitations are treated as self-evident; software design _is_ just a collection of heuristics.

This newsletter (and the underlying book) is an attempt to turn back the clock. It imagines a world in which Beck's ideas about software design were more explainable. And it begins, appropriately enough, with a metaphor.

* * *

1. Blaauw 1972 ↩
2. According to Barry Boehm's _Software Engineering Economics_, which assumed a design-first methodology, the cost of design changes would grow exponentially over a project's lifetime. ↩
3. Beck 2000, p. 104 ↩
4. ibid, p. 31 ↩
ibid, p. 65 ↩ 6. ibid, p. 56 ↩ 7. ibid, p. 113 ↩ 8. Martin 2020, p. 32 ↩ 9. Beck 2000, p. xvi ↩ 10. Martin 2020, p. 98 ↩ 11. Evans 2004, p. 447 ↩ 12. Metz 2018, p. 4 ↩ 13. Jeffries 2002 ↩
explaining.software
January 29, 2025 at 12:43 AM
making things better
Previously, we explored how abstract explanations, paired with intent, become specific. And in our case, the intent is almost always to improve our software. But what does this actually mean? To begin, let's consider this metaphor: > Things are _looking up_ This is a statement of optimism: things are improving, and we expect this trend to continue. There are, however, a wide range of up-metaphors. And, as shown by George Lakoff and Mark Johnson in their book _Metaphors We Live By_ , these metaphors align. Combined, they create a specific vision of how things will improve: Happy is up; sad is down: > You're in _high_ spirits. He's really _low_ these days. More is up; less is down: > My income _rose_ last year. He is _under_ age. Having control is up; being subject to control is down: > I am _on top_ of the situation. He _fell_ from power. Good is up; bad is down: > He does _high_ -quality work. Things are at an all-time _low_. Foreseeable future events are up (and ahead): > All _up_ coming events are listed in the paper. I'm afraid of what's _up ahead_ of us.1 When things are looking up, then, we are looking into a future where there is _more_. We will be happier, we will have more control; whatever we consider to be good, there will be more of it in our life. The up-metaphor asserts there is an alignment between everything we consider good. It rests upon what Albert Hirschman called the **synergy illusion** : the belief that all good things go together. > It is of course an ancient idea, traceable in particular to the Greeks, that there is harmony among ... various desirable qualities such as the good, the beautiful, and the true. A celebrated expression of the idea is in Keats's "Ode on a Grecian Urn": "Beauty is truth, truth beauty."2 We know that this is an illusion. Tradeoffs exist; improving one aspect of a system can make other aspects worse. As projects grow, our control over them shrinks. Ugly truths abound, and beauty is a luxury we can rarely afford. Knowing this, however, does not mean accepting it. Confronted with this dissonance, this ugliness, we inevitably gesture towards a better future. We talk about better design, better practices, better processes. We await better abstractions. We imagine a world in which we cannot help but make something beautiful. This belief in the future, in an unending ascent towards perfection, is a belief in **progress**. The flaws in this belief — its internal tensions, the fact that it is closer to a theology than a theory — have been pointed out for centuries.3 It is, nevertheless, an inescapable part of the software industry. Everything we do, whether design or implementation, is oriented towards an imagined future. Any discussion of improvement, then, should build upon these intuitions. Our metric should, wherever possible, allow for indefinite growth. Ideally, the metric should be linear; the return on our effort shouldn't have any plateaus or sudden jumps. A little more should be a little better, forever and always. Often, there are many such metrics. Consider how we might improve a queue. We could focus on the queue's capacity; a better queue can hold more messages. We could focus on the queue's throughput; a better queue can process more messages at once. We could also focus on the queue's latency; a better queue can process a message in less time. This, however, lacks linearity; when reducing latency, there are always diminishing returns. 
In fact, as Donald Knuth reminds us, attempts at optimization can make things worse: > [T]hese attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We _should_ forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.4 We must, however, still optimize: > Yet we should not pass up our opportunities in that critical 3%. Optimization, then, is what philosophers call a _pharmakon_ , an Ancient Greek word that means both remedy and poison. It is only useful when paired with expertise. Wherever you're cautioned to do something in moderation, you've found a _pharmakon_. Consider the "Rule of Three," as popularized by Martin Fowler: > The first time you do something, you just do it. The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway. The third time you do something similar, you refactor.5 Optimization is a _pharmakon_ , abstraction is a _pharmakon_. These quotes, however, tell us little beyond that. They have the structure of advice, but omit any of the expertise we'd need to apply them. What differentiates the critical code we should optimize from everything else? Likewise, similarity is a continuum; where should we draw the line? If our intent is a _pharmakon_ , we need a shared understanding of when it should no longer be pursued. The easiest way to do this is to describe a different intent, and assign it higher precedence. Let's look at a concrete example. Some years back, I was responsible for the design and implementation of a server which would authenticate and route all incoming API requests. When trying to articulate the design goals for the project, I came up with an ordered list of sub-goals: 1. It should be **transparent** — it should be easy for us to reason about the handling of each individual request, as well as the overall state of a server process. 2. It should be **stable** — it should be robust when receiving unexpected volumes of both normal and pathological requests. 3. It should be **fast** — when forwarding requests to backend services, it should add minimal overhead. 4. It should be **extensible** — it should be easy to understand and modify in ways that won't jeopardize the first three properties. This list was not a decomposition of the project into different pieces. There wasn't one component relating to transparency, and another relating to stability. Each represented a different holistic intent, and collectively they defined what "better" meant for the project. The primary intent was transparency. We wanted stability _except_ where it would make things less transparent. We wanted it to be fast _except_ where that would make things less stable or transparent, and so on. In this sort of list, all but the first intent are assumed to be _pharmakons_. This is what is often missing from discussions of optimization or abstraction: the concrete, project-specific goals that should take precedence. And even the first intent, transparency, is only an absolute good in the context of the project; we must always consider if our time is better spent elsewhere. Intent is generative. It is what takes us from an abstract metaphor, like a queue, to a concrete implementation. Creating a shared understanding of that intent, then, is one of the most important things we can do. We must assume, however, that this intent will be followed _indefinitely_. 
If that has the potential to lead the project astray, then we must create a list. And each time we find one intent impinging upon another, our expertise will grow. * * * 1. All quotes excerpted from Lakoff and Johnson 1980, pp. 15-16 ↩ 2. Hirschman 1991, p. 151 ↩ 3. For a good survey of these analyses and critiques, see Dinerstein 2006. ↩ 4. Knuth 1974 ↩ 5. Fowler 2018, p. 50 ↩
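To make the Rule of Three quoted above concrete, here is a minimal sketch in Python. The report functions and their names are hypothetical, invented for this illustration rather than taken from Fowler:

```python
from collections import namedtuple

Order = namedtuple("Order", ["price", "quantity"])

# First and second occurrence: we wince at the duplication, but keep it.
def weekly_report(orders):
    total = sum(o.price * o.quantity for o in orders)
    return f"weekly total: {total:.2f}"

def monthly_report(orders):
    total = sum(o.price * o.quantity for o in orders)
    return f"monthly total: {total:.2f}"

# Third occurrence: now we refactor, naming the concept the reports share.
def order_total(orders):
    return sum(o.price * o.quantity for o in orders)

def quarterly_report(orders):
    return f"quarterly total: {order_total(orders):.2f}"

print(quarterly_report([Order(9.99, 2)]))  # quarterly total: 19.98
```

The rule leaves the judgment with us, since "similar" is a continuum, but the third occurrence is a useful default for where to draw the line.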
explaining.software
January 23, 2025 at 12:41 AM
intent and implication
There was, in the mid-2010s, a popular formula for explaining a new startup: "Uber, but for ____." This was a metaphor: the startup, despite targeting a different market, was similar to Uber. It was, however, a fairly ambiguous metaphor; there were many ways that a company could resemble Uber. The most visible facet of Uber was their use of a mobile app to affect the physical world. This was, at the time, a novel concept. Every app promised a "magical user experience;" you could summon a car, summon a cleaning service, summon a doctor to prescribe medicinal marijuana1. There was, it seemed, no limit to what you could accomplish with the tap of a screen. Also important, however, was Uber's use of so-called gig workers. Early press coverage of this labor model typically focused on its flexibility — workers could work wherever and whenever they wanted — and glossed over the lack of benefits or guaranteed income. For Uber to retain its magic, the car had to appear quickly. Unfortunately for the drivers, the easiest way to minimize latency is to also minimize utilization. To Uber, the passenger's time was precious and the driver's time was cheap. This was not, however, true of every startup that resembled Uber. My visit from the doctor, for instance, was scheduled several days out; the magic was that he showed up at all. Now imagine the year is 2013, and a friend is telling you about their new startup: Uber, but for dog walkers. When interpreting this metaphor, we must consider which aspects of Uber would make this a more viable business. It's unlikely, for instance, that they are describing a roving fleet of walkers, ready to pick up your dog at a moment's notice. Dog walking is a recurring service, built on trust. Even if the startup used gig workers, those workers would be significantly less fungible. Users would expect to recognize the person walking their dog; if a walker quit, that would weaken the user's trust. This implies a very different relationship between the startup and their labor force. It also implies that the mobile app would need to offer some sort of matchmaking service: Tinder, but for dog walkers. Our friend's **intent** , when describing their startup, is to describe a viable business model. Our interpretation of the metaphor, the perspective we adopt, must satisfy that intent. Likewise, when we propose a change — adding a queue, for instance — the intent is to improve our software. This intent, paired with domain expertise, carves away the ambiguity. The metaphor becomes specific; it tells us what we need to create. This domain expertise builds atop the broader expertise we've developed through a lifetime of communication. In his influential paper _Logic and Conversation_ , H.P. Grice described what he calls the "cooperative principle" that underpins every conversation: > Our talk exchanges do not normally consist of a succession of disconnected remarks, and would not be rational if they did. They are characteristically, to some degree at least, cooperative efforts; and each participant recognizes in them, to some extent, a common purpose or set of purposes, or at least a mutually accepted direction. Our words, paired with some intent, generate unstated **implications**. And within every conversation, there is a cooperative intent: > Suppose that A and B are talking about a mutual friend C, who is now working in a bank. A asks B how C is getting on in his job, and B replies, _Oh, quite well, I think; he likes his colleagues, and hasn't been to prison yet_. 
Our natural assumption is that B is cooperating with A; their intent is to answer the question. We must, then, find a reason why B felt this was a useful response. Are they implying that C is prone to illegal behavior? His colleagues? Bankers as a whole? The context, in a fully cooperative conversation, should remove any lingering ambiguities. Our explanations, then, are rarely self-explanatory. The implications are left as an exercise for the audience. There are a number of ways this can go wrong. Our intent could be unclear. Our audience could be unable, or unwilling, to work out the implications. Or, perhaps, the explanation itself could be flawed. But these risks exist in any collaboration. A high-level explanation, like the Uber-metaphor, is an invitation to apply our expertise. Even the person using the metaphor must, like everyone else, reason through the implications of their own words. * * * 1. My doctor, when he arrived, was riding an electric skateboard. ↩
explaining.software
January 23, 2025 at 12:41 AM
structures as paths
In the fractal-metaphor, our software is an open space. Each reader traverses that space, noting new details as they come into view. When our software is well-designed, these details are small and incremental. But this isn't quite right. If it were, movement through our software would satisfy the triangle inequality: in Euclidean geometry, if `AB` and `BC` are not collinear, then a shorter path `AC` must exist. In our software, however, this is almost never the case: there is no explanation of `pop` which skips over `Stack`. Likewise, there is no explanation of `pop` which skips over `push`. Their meaning is intertwined; they are part of the same structure. The simplest explanation, then, is rarely a straight line. To explore our software, a reader must traverse its structures. The paths taken through our software are, in many ways, predictable. Explanation is a top-down process. Every explanation of an MVC web app, for instance, begins with its purpose and then its model. As a result, these early paths become well-worn. They are quickly relegated to the unstated prefix of our explanations. This context may be implicit, but it cannot be ignored. The earliest use of the MVC pattern was in the Xerox Alto personal computer. The controller's inputs were clicks and keypresses. The view's output was bitmapped pixels. Everything on the screen — the windows, the menus, the scrollbars — had its own model, view, and controller. These were, in some cases, colocated in the same Smalltalk-80 object.1 It was, in other words, different in every detail from an MVC web application. This is to be expected; it had a different prefix. As a result, the MVC structure took on an entirely different meaning. It is popular, in some circles, to talk about "emergent design." If we focus on the low-level details, they say, the rest will take care of itself. Software, in other words, is best explained from the bottom-up. But these people are, invariably, experienced developers. Often, they are consultants who have spent much of their career building variations on the same theme.2 The design, by all indications, is emerging from their own tacit expertise. As designers, most of what we create is left unsaid. We cannot, however, let it be forgotten. * * * 1. Reenskaug 2003 ↩ 2. Most early consulting work came from businesses looking to "computerize" themselves. In almost every case, this involved building some sort of CRUD app. ↩
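To see why no explanation of `pop` can skip over `Stack` or `push`, consider a minimal sketch (hypothetical Python, written for this illustration):

```python
class Stack:
    """A last-in, first-out container."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # `push` establishes the ordering that `pop` will later reverse.
        self._items.append(item)

    def pop(self):
        # `pop` has no standalone meaning: it returns whatever was
        # pushed most recently, and fails if nothing ever was.
        if not self._items:
            raise IndexError("pop from an empty Stack")
        return self._items.pop()
```

Any plain-language explanation of `pop`, such as "returns the most recently pushed item," already presupposes `Stack` and `push`; the three names are vertices of a single structure, and every explanatory path visits all of them.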
explaining.software
January 23, 2025 at 12:41 AM
decoupling in depth
In his influential paper _On the Criteria To Be Used in Decomposing Systems into Modules_ , David L. Parnas offers some simple, timeless advice: if two things change together, they belong together. This is not, in itself, an answer. It simply replaces a nebulous question — do these belong in the same module? — with something more concrete. In this newsletter, we are trying to do something similar. Previously, we established that every change is an explanation. When we say two things are coupled, we mean that they tend to be co-explained. This co-explanation is out of necessity; their meaning is intertwined. One, or both, has explanatory power for the other. And so, if we want to know if two things will change together, we should look to their explanatory power. If one thing explains another, they belong together. And if not, we should preserve that separation with an interface. Consider a simple hash table. It's difficult to imagine an explanation which begins with our application's business logic, and ends with "and this is how a hash table is implemented." Our business logic has no explanatory power for the underlying data structures. The converse is a bit more nuanced. There is a small but real possibility that we will give an explanation that begins with hash collisions, and ends with "and that's why our application is slow." But hash collisions are an incremental concept; we can still gloss over the remaining implementation details. It's difficult to imagine an explanation which begins with the in-memory layout of a hash table, and ends with our business logic. We can find a similar pattern in SQL. While ostensibly a declarative language — it describes the result, not the process — we can often use different queries to generate the same result. Sometimes, the performance of these queries varies by orders of magnitude. This is why many SQL implementations provide an `EXPLAIN` command. Given a query, it will provide a plan for how that query will be executed. Like hash collisions, these query plans are an incremental concept; they explain more without explaining everything. When we say two things are **decoupled** , we mean co-explanation is unlikely. Our goal is not perfect separation. UUIDs are only _very likely_ to be unique. Checksums are only _very likely_ to detect changes. Most explanations are simple, a few are not. Complexity should be weighted by frequency. And so, interfaces for complex implementations are often layered. The purpose of each layer is to reduce the audience for the next. In military terms, the implementation is defended in depth: > Rather than defeating an attacker with a single, strong defensive line, defence in depth relies on the tendency of an attack to lose momentum over time or as it covers a larger area.1 Your readers are not determined invaders. They want the interface to suffice. Give them a bit of explanatory distance, and they'll find a reason to turn back. * * * 1. Wikipedia 2024 ↩
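For a concrete look at `EXPLAIN`, here is a minimal sketch using Python's built-in `sqlite3` module. The table and index are hypothetical, and the exact plan text varies across SQLite versions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# The plan explains *how* the query will run, an index search versus a
# full table scan, without explaining how the B-tree or the pager work.
query = "SELECT id FROM users WHERE email = ?"
for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("ada@example.com",)):
    print(row)
```

Like hash collisions, the plan explains more without explaining everything: we learn whether the index is used, but nothing of the B-tree's in-memory layout.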
explaining.software
January 29, 2025 at 12:49 AM
similar, but different
In the software design literature, cohesion is often referred to by a different name: single responsibility. As Sandi Metz explains it: > When everything in a class is related to its central purpose, the class is said to be _highly cohesive_ or to have a single responsibility.1 To determine if a method belongs inside a class, Metz suggests posing it as a question. "Mr. Bicycle Gear, what is your ratio," she says, makes sense. "Mr. Bicycle Gear, what is your tire size," does not. In a cohesive class, then, all the methods are alike. They all extend from the same singular responsibility. They all share a common locus. We do not need this similarity to be reaffirmed at the method level. We do not ask, "Mr. Bicycle Gear, what is your gear ratio." Instead, a method's name only needs to describe its specific role. When we give something a name, it is to explain how it is different. Consider the names inside a function. A variable named `count` doesn't tell us what is being counted, but it shouldn't have to. The surrounding lexical scopes — the function, the class, the namespace — should explain most of it. A quick glance at the code directly adjacent to `count` should explain the rest. A name like `count` describes its relationship to the surrounding code, the singular way in which it differs. In all other ways, we must assume, it is similar. If it were at odds with its surroundings, counting something unrelated, then the name would be longer. Names, then, should be as short as possible. A long name tells the reader to pay close attention; the code is doing something surprising, something at odds with its surroundings. It signals a lack of cohesion; the disparate components don't quite fit. Simple names arise from cohesion. They are only possible when each component is similar, but different. This difference means that a name conveys two things. It describes what its referent _is_ , but also what nearby referents _are not_. This is why `Util` is a bad name; it suggests one class has utility, and all the others don't. This is also why we prefer `count` to `n`. We want to partition our variables by role, not by whether they're numeric. Simple names are short, but meaningful. Single-letter names can, however, have real meaning. There is, for instance, a long-standing convention that `i` is an index for the outermost loop, `j` for the first inner loop, and so on. In semiotics, these are called **indexical signs** ; like a finger, they point to their referent. As long as the loops are nearby, these indices tell us everything we need to know. Single responsibility, like cohesion, is a tacit concept. Robert Martin tells us that "[a] class should only have one reason to change,"2 but Sandi Metz cautions that the "SRP doesn't require that a class do only one very narrow thing or that it change for only a single nitpicky reason." It's left to us to find the golden mean. Nothing in this post changes that. Similarity and difference are both continuums; only we can decide how much is enough. It does, however, suggest an indirect measure for cohesion. If every name within some lexical scope is short and meaningful, then that scope is probably cohesive. This is why, as a software designer, I constantly return to the names. Sometimes, they tell me I've done enough. And the rest of the time, they at least tell me if I'm moving in the right direction. * * * 1. Metz 2018, p. 22 ↩ 2. Martin 2003, p. 95 ↩
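As a small illustration of the naming conventions above, consider this hypothetical Python sketch (none of it is drawn from Metz):

```python
from collections import namedtuple

User = namedtuple("User", ["name", "active"])

def count_active(users):
    # Within this small, cohesive scope, `count` needs no qualifier;
    # the function's name and the adjacent code explain what is counted.
    count = 0
    for user in users:
        if user.active:
            count += 1
    return count

print(count_active([User("ada", True), User("bob", False)]))  # 1

# Indexical names: so long as the loops stay nearby, `i` and `j`
# point at their referents like a finger.
grid = [[1, 2], [3, 4]]
total = 0
for i in range(len(grid)):          # outermost loop
    for j in range(len(grid[i])):   # first inner loop
        total += grid[i][j]
print(total)  # 10
```

If `count` were tallying something unrelated to the surrounding function, a longer name like `abandoned_cart_count` would be the warning sign, and a hint that the scope lacks cohesion.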
explaining.software
January 23, 2025 at 12:41 AM
transparent like frosted glass
Sherry Turkle wrote her study of the culture of computing, _The Second Self_ , "on an Apple II computer that had, quite literally, been torn bare."1 Its circuitry had been exposed, and its operating system replaced. Even her word processor felt close to the machine; it required her to use "a formal language of nested delimiters" that could be easily "translated into electrical impulses." This experience helped her to understand "the aesthetic of technological transparency" in early personal computing. The purpose of transparency, she said, was to give enthusiasts "the pleasure of understanding a complex system down to its simplest level." But this understanding couldn't last. With each year, the hardware and software grew more complex. No one could expect the average user — who had, just recently, bought their first computer — to hold the entire system in their head. And so, the meaning of transparency changed. Newer machines, like the Macintosh, encouraged users to "take the machine at (inter)face value." Deep understanding was neither required nor rewarded. "By the mid-1990s," Turkle says, "when people said that something was transparent, they meant that they could immediately make it work, not that they knew how it worked." This contranymic transparency was especially popular in the distributed systems literature. In the 1990s, many researchers believed that the network ought to be abstracted away entirely. As Wolfgang Emmerich explained it: > [T]he fact that a system is composed from distributed components should be hidden from users; it has to be transparent.2 Here, transparency means no outward difference between local and remote resources. Any method invocation might fire off a network request. This, Emmerich asserted, would prevent the "average application engineer" from being "slowed down by the complexity introduced through distribution." This simplicity, however, is fragile. Method invocation has no explanatory power for latency or DNS issues. When things go wrong, the "average application engineer" will understand less than nothing. A good interface is a locus. It is _usually_ where our explanation ends, but _always_ where it begins. It is a stepping stone, not a terminus. And so, an interface should reveal the shape of the underlying implementation. It should only obscure the finer details that are, to most people, irrelevant. It should be transparent like frosted glass. This is easier than it may seem. Imagine you're being onboarded. They've drawn lines and boxes on the whiteboard, to illustrate the overall system. One box, far removed from your own project, is labelled "auth service." In that moment, the name suffices. You know that there is, somewhere, a service responsible for authentication and authorization. And if you ever need to know more, the name gives you a broad sense of what to expect. Some people, of course, will always look past the name. And this is fine. As we saw with the fractal-metaphor, we don't need to bisect our software with a single, perfect interface. Instead, we can split it into layers, each revealing incrementally more detail. And so, when we look past the auth service, we will find a small number of named components. Those components, in turn, can be decomposed into named classes, methods, and values. Each decomposition, however, should be less likely than the last. In a well-designed system, the names usually suffice. * * * 1. Turkle 2005, p. 7 ↩ 2. Emmerich 2000, p. 19 ↩
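What might frosted glass look like in code? Here is a hypothetical sketch; the auth service, its endpoint, and every name in it are invented for illustration:

```python
import json
import urllib.request

class AuthServiceError(Exception):
    """The auth service was unreachable, or rejected the request."""

def check_token(token: str, base_url: str, timeout_s: float = 1.0) -> bool:
    """Ask the auth service whether `token` is valid.

    The signature admits that a network sits underneath (there is a
    timeout, and a failure mode) without exposing DNS, connection
    pools, or the wire format.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/check",
        data=json.dumps({"token": token}).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return json.load(response).get("valid", False)
    except OSError as error:  # DNS failure, timeout, connection refused
        raise AuthServiceError(str(error)) from error
```

A fully transparent proxy would hide the timeout and the failure mode; this signature reveals the shape of the implementation while obscuring its finer details.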
explaining.software
January 23, 2025 at 12:47 AM
the simplicity of a fractal
Previously, we've looked at code generation in both Rails and Thrift. But unlike Thrift, the code generated by Rails is meant to be changed. Any change is fine, so long as it's not too surprising; the only limit is our judgement. Rails, then, doesn't fit the limb-metaphor. Our explanation will not always end with the model. Even so, the model remains a locus; for many, it suffices. The only things users need to know about a CRUD app are the entities.1 And this is all we want from an interface. It doesn't need to be impermeable, it just needs to suffice for a reasonable number of people. An interface suffices whenever the underlying details become **textural** ; we know they exist, but wouldn't notice if they changed. This phenomenon is self-similar; every component, at every level, will have its own texture.2 Our software, in other words, resembles fractal geometry. It's easy to create fractal geometry. Generate a small number of random values within `[-0.5, 0.5]`. Then generate twice as many values at half the amplitude, and repeat. The sum of these **octaves** is a fractal curve. Here, the final result is composed of six octaves. The effect of the last four octaves, however, is fairly subtle. After a handful of octaves, it's all texture. And once we hit that threshold, we can stop. We can be confident there are no hidden surprises. This is an ideal to which we can all aspire. But software is understood one piece at a time. And each piece is contiguous with others. A commonsense compression scheme would preserve nearby details, and elide everything else. This is a common strategy in video game engines. When the player is far away from an object, the game will render a less-detailed version. Smaller and less significant objects may be skipped over entirely.3 As the player moves, the elided details are reintroduced. When done poorly, this can be very distracting. Imagine walking across a flat, grassy meadow. Suddenly, something appears in the middle distance. Is it important? Dangerous? No, it's just a tree. This is bad design. The tree is significant; it belongs to the higher octaves.4 It should be visible at a distance. In an open space, new details should be barely noticeable. In the fractal-metaphor, simplicity means gradual surprise. It's common, when discussing software abstractions, to use a container-metaphor. The interface contains, and obscures, the underlying implementation. And when we open the container, everything is revealed at once; there is a sudden spike of surprise. But the usefulness of the container-metaphor is short-lived; the surprise doesn't last. The next time, we know what to expect. When we read familiar code, we are just reacquainting ourselves with the finer details. And so, when reasoning about familiar code, we should prefer the fractal-metaphor. Our layered abstractions are not containers, but octaves. The higher octaves should be memorable, and the lower octaves easy to page into memory. Even sprawling code can be simple; it just needs to be explained one step at a time. * * * 1. This phenomenon is both explored and generalized in Daniel Jackson's _The Essence of Software_. ↩ 2. Even if our software is designed to run on specific hardware, it isn't designed to run on a specific chunk of silicon. ↩ 3. This isn't only for performance reasons. Details smaller than a single pixel can lead to Moiré patterns and other visual artifacts. ↩ 4. Within this metaphor, "high" octaves are the high-level octaves, not the high-frequency octaves. ↩
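The construction described above fits in a few lines of Python. A minimal sketch, assuming linear interpolation when summing octaves of different lengths (the original post's plots are not reproduced here):

```python
import random

def octave(n, amplitude):
    """n random samples, uniform within [-amplitude/2, amplitude/2]."""
    return [random.uniform(-amplitude / 2, amplitude / 2) for _ in range(n)]

def resample(samples, length):
    """Stretch `samples` to `length` points via linear interpolation."""
    out = []
    for i in range(length):
        t = i * (len(samples) - 1) / (length - 1)
        lo = int(t)
        hi = min(lo + 1, len(samples) - 1)
        out.append(samples[lo] + (samples[hi] - samples[lo]) * (t - lo))
    return out

def fractal_curve(octaves=6, base=4):
    """Sum octaves: each has twice the samples, at half the amplitude."""
    length = base * 2 ** (octaves - 1)
    curve = [0.0] * length
    for k in range(octaves):
        layer = resample(octave(base * 2 ** k, 0.5 ** k), length)
        curve = [c + y for c, y in zip(curve, layer)]
    return curve

print(fractal_curve()[:4])  # the first few points of a 128-point curve
```

Plotting the running sum after each octave shows the effect described above: the first octaves set the shape, and the later ones add texture.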
explaining.software
January 7, 2025 at 12:38 AM
the simplicity of a limb
In their paper on _Evolvability_ , Marc Kirschner and John Gerhart discuss the separation of concerns within our genetic code. They pay special attention to limbs: > The limb is a complex structure with precisely placed bone, cartilage, muscle, nerves, and vascular elements, and one might think it is difficult for such a structure to change in evolution.1 Every vertebrate uses the same genes to encode their limbs. This mechanism can describe wings, flippers, arms, and everything in between. But how? Our genome can't contain separate descriptions of our bones, muscles, nerves, and so on. If it did, a mutation in one would need to be mirrored in all the others. Productive evolution would be nearly impossible. "The basic structure of a limb," it turns out, "depends on serial cartilaginous condensations." These condensations are both a scaffold and a signal; they ossify into bone, but also guide the development of the surrounding muscles, nerves, and blood vessels. Only one part of the limb is described in absolute terms; everything else is relative. As a result, "[e]volutionary modification of vertebrate limb shape and size is reduced mostly to the mutational modification of the cartilaginous condensations. It need not be simultaneously accompanied by mutationally derived changes in the muscle, nerve, and vascular systems, which can accommodate to any of a wide range of limb sizes and shapes." The limb is highly coupled, but still easy to change. This is because it has a locus, and that locus suffices. Our explanation can begin and end with the cartilaginous condensations. This is a common strategy in software design. Consider the Thrift interface definition language (IDL), which is used to specify datatypes:

```
struct Classroom {
  1: string building,
  2: string room,
}
```

For each `struct`, there is a corresponding binary format. The Thrift compiler can generate, in almost any language, a codec for that format. This generated code is long, repetitive, and usually ignored. The specification is a locus, and it suffices. In the limb-metaphor, simplicity means opacity. The interface provides a language that can describe every problem, every solution. The underlying details are forever excluded from our explanation's suffix. This view of simplicity, however, can sometimes lead us astray. Consider this named value:

```
const total_cost = ...
```

Like all names in software, this is an interface. We can reference `total_cost` without knowing how it was computed. And when reading this expression, we can usually ignore the right-hand side. The name suffices. But this interface is not opaque. Imagine if it were: the right-hand side of the expression would be hidden away entirely. We would be unable to change the underlying computation; all we could do is change the name, and hope the compiler does the right thing. No one wants this. Names and referents are, by design, part of the same structure. When changing one, we must consider if the other still fits. This is the shortcoming of most biological metaphors in software. Evolution is a blind watchmaker; we, on the other hand, can see. And so, interfaces that conform to the limb-metaphor are most useful at the periphery of our software. They provide a terminus, a place where our explanation naturally ends. To understand the intermediate interfaces in our software, we need a different metaphor. Next week, we will explore the simplicity of a fractal. * * * 1. Kirschner 1998 ↩
explaining.software
January 7, 2025 at 12:38 AM
better explanations through coupling
Previously, we explored how coupling and cohesion are not separable concepts. * * * When our software is cohesive, everything fits. Each part is shaped by its relationships. Together, they comprise an undirected graph, which we will call a **structure**. Each structure is an amplifier; by explaining one vertex, we begin to explain the others. Often, these vertices have a natural order. There is one vertex that, when explained first, simplifies all the others. We will call this the **locus** of a structure. These two concepts — structures and loci — will become our primary tools for understanding software design. To begin, let's look at three examples. ## model/view/controller An MVC web application has three major components: the model, the view, and the controller. Sometimes these components are drawn as part of a narrative. The request goes into the **controller** , the response comes out of the **view** , and somewhere in between we interact with our database via the **model**. This narrative, however, is not the structure of MVC. Instead, the structure describes the relationship between the model and the contents of the view and controller. This is reflected in the code generation offered by the Rails web framework. When adding a new database entity, developers are expected to use a command line tool:

```
rails generate scaffold Widget name:string cost:decimal
```

This creates a database table which can store our widgets. It also, however, generates a dozen ancillary files allowing the application to create, edit, and display widgets. This code, once generated, can be changed. The relationship between the model and the rest of the application is predictable rather than fixed. Even so, by understanding this vertex, we gain a broad understanding of the rest of the structure. The model is a locus. MVC frameworks are, today, a bit out of fashion. The point is not to extol their virtues, but to analyze their success; for over a decade, they shaped how people thought about an entire category of software. As designers, we should aspire to do the same. ## entities and hierarchy The MVC structure describes an entire web application. Vertices within a structure, however, often contain their own sub-structures. For instance, our model might contain these entities: a `student`, a `teacher`, a `course`, a `classroom`, and a `janitor`. The entities, and their relationships, are easy to grasp. A `student` attends `course`s, each of which is taught by a `teacher`. Each `course` is in a `classroom`, which is cleaned by a `janitor`. This structure, however, provides no hints as to what each entity should contain. There are countless things we could know about a `classroom`: location, size, seating capacity, layout, and so on. We could track every work order ever assigned to the room, or every piece of gum ever removed from the bottom of a desk. Like all relational schemas, this graph is undirected. There are no roots; we can start anywhere, and traverse everywhere else. In-memory data, however, _does_ have roots; usually, we start with the user. Suppose, then, that we choose `student` as our locus. All paths begin with `student`; each entity is seen through a `student`'s eyes. A `classroom` is where a `course` is attended; what matters most is its location. Some parts of the schema are nebulous, hidden from view; what should a `student` know about a `janitor`? By changing our locus from `student` to `teacher`, we change our perspective. Now, a `classroom` is where a `course` is taught. We care about the location, but also about other aspects. What's the seating capacity? Are there chalkboards, whiteboards? Is there a projector? Are lectures easily recorded? 
In our early prototypes, most of these facets will be unrepresented. They are, nevertheless, part of our design. That design describes what our software is, but also what we expect it to become. ## what kind of queue? A queue is a line of people, all waiting for the same thing. When we describe part of our software as a queue, we are using a metaphor. To make this metaphor work — to perceive our software as a line of people — we must answer a number of questions. How often do the people arrive? How long are they willing to wait? Where are they waiting? How many people can that space hold? The queue is a useful metaphor because it makes us ask useful questions. This remains true even when we have a more abstract association — a conveyor belt, or simply a bounded space between two processes.1 And these questions, asked in context, often have intuitive answers. Consider a system which contains client process A and service B. Someone suggests adding a queue between them. What do they mean? Absent any context, they could mean any number of things. They could mean adding an in-memory queue to A, or a disk-backed queue that can survive process restarts. They could mean adding in-memory queues for each process in B, to buffer requests. Or, perhaps, they could mean adding an entirely new service to act as a queue between A and B. If it's a new service, that only raises further questions. Do messages persist on the queue after being passed to B? If so, how long? Is it a task queue, requiring processes in B to confirm completion? If so, how long do we wait before a task is considered abandoned? Should we have a heartbeat, to support long-lived tasks? The decisions, large and small, are nearly endless. But when using the queue-metaphor, there _is_ a context: the system. And, typically, everyone involved in a design discussion understands that system. They know its purpose, its structure, and its shortcomings. These are all part of the metaphor's prefix. And so, when someone suggests a queue, the conversation doesn't devolve into endless questions. In context, their meaning is clear. The queue-metaphor is, again, a locus. It gives us a broad understanding of the underlying implementation. But that understanding is a product of both the metaphor and the system that surrounds it. A piece can only be understood through its relation to the whole. * * * 1. One association which _doesn't_ raise useful questions is a generic `Queue[T]` data structure. Real-world queues are associated with _waiting_ , and raise questions about latency and capacity. The data structure, on the other hand, has few associations beyond implementing breadth-first searches. ↩
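One way to keep those questions in view is to write the answers down. A hypothetical sketch; the field names are invented, and belong to no real queue library:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QueueDesign:
    location: str                   # in-memory in A? disk-backed? a new service?
    capacity: int                   # how many people can the space hold?
    retention_s: Optional[float]    # do delivered messages persist? for how long?
    ack_required: bool              # a task queue, where B confirms completion?
    ack_timeout_s: Optional[float]  # when is a task considered abandoned?
    heartbeat_s: Optional[float]    # do long-lived tasks report progress?

# In context, "add a queue between A and B" might resolve to:
broker = QueueDesign(
    location="new service",
    capacity=10_000,
    retention_s=None,      # deleted once B confirms completion
    ack_required=True,
    ack_timeout_s=30.0,
    heartbeat_s=5.0,
)
```

Most of these fields never need to be said aloud in a design discussion; in context, they are part of the metaphor's prefix. The sketch only makes explicit what the queue-metaphor already implies.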
explaining.software
November 20, 2024 at 12:51 AM
glossary
### complexity The sum of every explanation. Weighted heavily towards future explanations. Measured in bits, but only relative to your audience's expectations. See also: * a brief introduction * * * ### coupling The degree to which two things tend to be explained together. Sometimes it makes things simpler, and sometimes it doesn't. See also: * coupling as co-explanation * * * ### explanation The core task of software development. When we try to understand software, we explain it to ourselves. When we change software, we explain it to others. See also: * a brief introduction * the anatomy of an explanation * * * ### explanatory power The degree to which one text, explained first, makes another text less surprising. See also: * the anatomy of an explanation * transparent like frosted glass * * * ### locus plural **loci** (LOW-sigh) The vertex in a structure which has explanatory power for all the others. It is where our explanation naturally begins, and often ends. As an example, consider how often the name of something — a class, a function, a value — suffices in an explanation. And even when it doesn't, we still reference it by name. See also: * better explanations through coupling * transparent like frosted glass * * * ### paratext An external text that shapes how our text is understood. In software development, the text is our code, and the paratext is everything else: the conversations, the diagrams, the READMEs, and so on. See also: * a brief introduction * * * ### prefix The things that your audience already knows. Provides explanatory power for the content of your explanation. See also: * the anatomy of an explanation * * * ### structure An undirected graph, where the vertices are concepts and the edges are relationships. Usually, each vertex contains its own substructure. At lower levels, each structure corresponds to a contiguous chunk of code. Since each vertex is shaped and constrained by its neighbors, a structure is an amplifier: by explaining one vertex, we begin to explain the others. See also: * better explanations through coupling * similar, but different * * * ### suffix Everything you anticipate explaining in the future. What your current explanation is seeking to simplify. See also: * the anatomy of an explanation * * * ### surprisal Also known as entropy. We prefer this term because it emphasizes that the information content of a message is always relative to the recipient's expectations. See also: * a brief introduction * * * ### tacit knowledge Something that we know, but struggle to articulate. Usually learned through repetition. See also: * the anatomy of an explanation * _The Tacit Dimension_ by Michael Polanyi * * *
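A small appendix to the glossary: "measured in bits" has a standard information-theoretic form, included here for reference. If the audience expects message `x` with probability `p(x)`, then the surprisal of that message, and the entropy (the expected surprisal across all messages), are:

```latex
I(x) = -\log_2 p(x)               % surprisal of one message, in bits
H(p) = -\sum_x p(x) \log_2 p(x)   % entropy: the expected surprisal
```

A message the audience fully expects, with `p(x) = 1`, carries zero bits; the less expected the message, the more bits our explanation must bridge.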
explaining.software
August 23, 2024 at 7:17 PM
coupling as co-explanation
Previously, I provided a brief introduction to this newsletter, and then analyzed the structure of explanations in software development. * * * For many in the software industry, "coupling" is a dirty word. This dates back to a 1974 paper entitled _Structured Design_1, which asserted that the goal of software design was to minimize coupling and maximize cohesion. The authors defined coupling as "relationships among modules" and cohesion as "relationships among elements in the same module." This distinction was objective, delineated by the syntactic boundaries of our code. But which syntactic boundaries? According to the paper, modules were "PL/I procedures, FORTRAN mainlines and subprograms, and, in general, subroutines of all types." In other words, functions. No one, however, gave this much weight. When object-oriented programming came into vogue, people talked about the need for cohesive classes.2 Others talked about cohesion within components, or entire systems.3 And this makes sense. We want our software to feel cohesive, at every level. We want the disparate parts to fit together. But according to the original paper, this is impossible. Cohesion exists between "elements in the same module." These elements cannot, themselves, be modules. Otherwise, our cohesion becomes their coupling. And so, the meaning of cohesion has changed. It is no longer the simple, explicit concept offered by the original paper. Instead, we're told that we can intuit when code lacks cohesion; it simply "doesn't feel right."4 Cohesion has become a tacit concept. This retreat into tacit knowledge isn't all bad. Cohesion offers a way to discuss and refine our intuitions about software design. Most discussions of cohesion, however, fall short of this. Often, they can be reduced to a bare tautology: coupling is good, except when it isn't. This newsletter aims to change that. We can begin by defining coupling as co-explanation. The degree of coupling between any two things is determined by how often they're explained together. If two things are regularly co-explained, it's because their meaning is intertwined. Understanding one requires understanding the other. Changing one requires changing the other. They are, in other words, coupled. This coupling has a cost; when we explain one, the other comes along for the ride. It also, however, has a benefit; coupling creates explanatory power. When we explain one, we begin to explain the other. Coupling is an invaluable tool in software design. It's our job — perhaps our only job — to recognize when the benefits outweigh the costs. Next week, we will look at some examples of beneficial coupling, and try to understand how it works. * * * 1. Stevens, Myers, and Constantine 1974 ↩ 2. Booch 1991, p. 124 ↩ 3. Larman 2005, p. 299 ↩ 4. Metz 2018, p. 22 ↩
explaining.software
August 20, 2024 at 8:52 PM
the anatomy of an explanation
Previously, I provided a brief introduction to the ideas that will be covered in this newsletter. * * * Software development can be reduced to a single, iterative action. Almost everything we do in the course of a day — the pull requests, the meetings, the whiteboard diagrams, the hallway conversations — is an explanation. Our job is to explain, over and over, the meaning of our software: what it is, and what we expect it to become. There are three parts to these explanations, and only one is said aloud. The **prefix** is what everyone already knows. It is the shared context, the foundation upon which our explanation rests. The **suffix** is all tomorrow's explanations. It is everything that, for now, no one needs to hear. The **content** is everything in between. It is what our audience needs to know, but doesn't. To judge an explanation, and to improve it, we must understand all three parts. Let's consider each in turn. ## the prefix In any conversation, most things are left unsaid. When talking to a friend, we don't begin by reciting the details of past conversations. When talking to a stranger, we don't begin with a review of social norms. But these unstated things are, nevertheless, important. They have **explanatory power** for our conversation; they make our words less surprising. Similarly, we don't begin every discussion of our software by describing our company's product and business model. These are well-understood; they are baked into our audience's expectations. They are part of the unstated prefix. Our explanations rest upon these prefixes. Changes can be enormously disruptive. Consider what happens when a company pivots to a new business model: each past decision must be reexplained, to see if it still makes sense. We can separate any prefix into two parts: the explicit and the tacit. The **explicit** prefix is the part that can be articulated. It is the things that we have said, or written down, or drawn on whiteboards. The **tacit** prefix is the part that is inarticulate; the things that we know, but struggle to say. It is the skills and intuitions that we only know how to convey through repetition. Our industry is awash in tacit knowledge. The Python community, for instance, aspires to write "Pythonic" code. Such code "seems natural to proficient Python programmers"1; in other words, it fits their expectations. And this proficiency, by all accounts, develops over time; it cannot be taught, only learned. As a rule, an explicit prefix is easier to teach, easier to discuss, and easier to change. In practice, however, our prefixes remain overwhelmingly tacit. Our expectations arise from an admixture of habit, mimicry, and half-forgotten arguments. And this is for good reason. The leap from tacit to explicit is uncertain; we might not stick the landing. As Michael Polanyi explains it: > Repeat a word several times, attending carefully to the motion of your tongue and lips, and to the sound you make, and soon the word will sound hollow and eventually lose its meaning.2 This is a failed explanation. We have taken something familiar and made it foreign. And unless we have some special insight into phonetics, it's the only outcome that can be expected. This is why Pythonic code remains a tacit concept. The cost of explicit knowledge is uncertain, unbounded. There are, invariably, a dozen better ways to spend our time. A tacit prefix is harder to teach, harder to discuss, and harder to change. But it is still, in most cases, simpler than the alternative. 
## the suffix As software designers, we spend a lot of time thinking about the future. Consider how difficult it can be to evaluate a software framework. Each framework provides a set of concepts, and you need to decide if those concepts have explanatory power for your application. Most people start with a small prototype, to ensure that simple problems have simple solutions. But this isn't enough; what you want to know is if the simplicity will last. As our software grows — in size, in capability, in complexity — we will sometimes need to color outside the lines. We will need to work around the framework rather than within it. And each time we do, the framework will lose a bit of its explanatory power. To judge one explanation, we must consider its effect on future explanations. The cost of a framework is proportional to its workarounds. The cost of a workaround is proportional to how often it must be re-explained. The suffix reflects all of this; it is everything we expect to explain. It is our feature roadmap. It is the sharp edges in our code. It is the implementation details we will, eventually, have to understand. It is what, with each design choice, we are trying to simplify. ## the content In between the prefix and suffix, we have the actual content of our explanation: the words, the code, the gestures, the diagrams. Simple content builds atop the prefix, and towards the suffix. Content, then, conveys more than itself. The prefix is reflected in what it takes for granted. The suffix is reflected in what it values; whatever we have today, we'll probably want more tomorrow. The prefix, suffix, and content are **coupled**. They are distinct, but not separable. By changing one, we affect the others. In this newsletter, we will look at the relationships that bind our explanations. But more importantly, we will delve into the nature of coupling itself. We will try to understand when it hurts, when it helps, and how to tell the difference. * * * 1. Ramalho 2015, p. 748 ↩ 2. Polanyi 1966, p. 18 ↩
explaining.software
August 12, 2024 at 9:22 PM
a brief introduction
As software designers, our goal is to reduce complexity. We want our software to be easier to understand, and easier to change. These are not distinct concerns. For our software to be understood or changed, it must be **explained** ; we must tell a story about what our software is, and what it's expected to become. When understanding software, we tell that story to ourselves. When changing software, we tell that story to others. Software which is **complex** takes a long time to explain. Our explanations are scoped to the task at hand. When we onboard someone, we provide a broad and shallow overview. When we perform a code review, we narrowly focus on the salient details. Good design makes all of these explanations simpler. The complexity of a system is the sum of its explanations. An explanation is a message. According to communication theory, it can be measured in bits. But the information conveyed by a message, its **surprisal** , depends on its audience. For something to be surprising, it must be unexpected. And so, simplicity is not intrinsic; it does not arise from our code's size or syntax. Simplicity is a fitness between software and our expectations. There are two core tasks in software development: design and implementation. As designers, we set expectations. As implementers, we try to realize those expectations. In both cases, we are pursuing simplicity. Design often occurs around our code, rather than within it. A README is software design. Onboarding is software design. Code review is software design. Answering a question is software design. In literary theory, these external documents and discussions are called **paratexts**. These paratexts "surround, shape, support, and provide context for texts. They may alter the meaning of texts, further enhance meaning, or provide challenges to sedimented meanings."1 And so, in the posts that follow, we will explore both text and paratext. We will analyze the combined meaning of our words, our diagrams, and our code. We will attempt to understand how we understand software. * * * 1. Consalvo 2007, p. 182 ↩
explaining.software
August 6, 2024 at 8:32 PM