Please enable JavaScript.
Coggle requires JavaScript to display documents.
SRE (Cap. 9 - Simplicity) (Modularity (Expanding outward from APIs and…
SRE (Cap. 9 - Simplicity)
System Stability Versus Agility
For the majority of production software systems, we want a balanced mix of stability and agility.
SREs work to create procedures, practices, and tools that render software more reliable. At the same time, SREs ensure that this work has as little impact on developer agility as possible.
In fact, SRE’s experience has found that reliable processes tend to actually increase developer agility: rapid, reliable production
rollouts make changes in production easier to see. As a result, once a bug surfaces, it takes less time to find and fix that bug.
Software systems are inherently dynamic and unstable.
A software system can only be perfectly stable if it exists in a vacuum. If we stop changing the codebase, we stop introducing bugs. If the underlying hardware or libraries never change, neither of these components will introduce bugs. If we freeze the current user base, we’ll never have to scale the system.
In fact, a good summary of the SRE approach to managing systems is:
"At the end of the day, our job is to keep agility and stability in balance in the system."
The Virtue of Boring
In the words of Google engineer Robert Muth,
"Unlike a detective story, the lack of excitement, suspense, and puzzles is actually a desirable property of source code."
Surprises in production are the nemeses of SRE.
I Won’t Give Up My Code!
Because engineers are human beings who often form an emotional attachment to their creations, confrontations over large-scale purges of the source tree are not uncommon. Some might protest,
"What if we need that code later?" "Why don’t we just comment the code out so we can easily add it again later?" or "Why don’t we gate the code with a flag instead of deleting it?" These are all terrible suggestions
.
At the risk of sounding extreme, when you consider a web service that’s expected to be available 24/7, to some extent, every new line of code written is a liability.
Minimal APIs
Writing clear, minimal APIs is an essential aspect of managing simplicity in a software system
. The fewer methods and arguments we provide to consumers of the API, the easier that API will be to understand, and the more effort we can devote to making those methods as good as they can possibly be
Again, a recurring theme appears:
the conscious decision to not take on certain problems allows us to focus on our core problem and make the solutions we explicitly set out to create substantially better
. In software, less is more! A small, simple API is usually also a hallmark of a well-understood problem.
Modularity
Expanding outward from APIs and single binaries,
many of the rules of thumb that apply to object-oriented programming also apply to the design of distributed systems
.
The ability to make changes to parts of the system in isolation is essential to creating a supportable system. Specifically, loose coupling between binaries, or between binaries and configuration, is a simplicity pattern that simultaneously promotes developer agility and system stability.
If a bug is discovered in one program that is a component of a larger system, that bug can be fixed and pushed to production independent of the rest of the system.
Release Simplicity
Simple releases are generally better than complicated releases.
It is much easier to measure and understand the impact of a single change rather than a batch of changes released simultaneously. If we release 100 unrelated changes to a system at the same time and performance gets worse, understanding which changes impacted performance, and how they did so, will take considerable effort or additional instrumentation
A Simple Conclusion
This chapter has repeated one theme over and over:
software simplicity is a prerequisite to reliability.
We are not being lazy when we consider how we might simplify each step of a given task. Instead, we are clarifying what it is we actually want to accomplish and how we might most easily do so.
Every time we say "no" to a feature, we are not restricting innovation; we are keeping the environment uncluttered of distractions so that focus remains squarely on innovation, and real engineering can proceed.