Line of Representation

John Allspaw, CTO at Adaptive Capacity Labs, presented _How Your Systems Keep Running Day After Day_ at DevOps Enterprise Summit 2017. He brings vital attention to the complexity of the systems we build and the essential role of human performance. youtube

The System from The STELLA Report source

One of the most important diagrams on the Internet.

The cloud in the lower right, below and connected to "monitoring tools," is The System as Peter Alvaro described it: a graph of connected systems and a combinatorial search space. John Allspaw and SNAFUcatchers expand The System to include all the infrastructure, tooling, and monitoring used to manage the artifacts of the system.

All of these computational systems exist below the line of representation. We cannot see our systems directly. We can only see and manipulate them through this line of representation.

More importantly, all of the real work done in the system happens above the line, where the humans are. After an incident, when the system has surprised us, we conduct retros and come up with todo lists to prevent the same incident from happening again. These interventions are completely necessary, yet they are still interventions below the line.

These two experts in resilience will tell you: all those technologists (us included)?

Indy and Sallah: They're digging in the wrong place. source

It's easy to dig in the wrong place. Remember, it's a combinatorial space. As I quoted Peter Alvaro earlier:

> We're going to have to be smart about how we select our experiments.

If only we had a laser pointer to show us where in the combinatorial space we should dig. source

John Allspaw tells us that incidents are the light to guide us.