Learning from Security Incidents

Kelly Shortridge shares how to use insights from resilience engineering to: think differently about incident response and recovery; leverage resilience stress tests to build muscle memory for response; foster constructive, healthy practices; fuel a feedback loop to improve system design and operation; and overcome the pernicious cognitive biases that otherwise sabotage our response and recovery efforts.

YOUTUBE LDPyekzEu10 Kelly Shortridge shares how to use insights from resilience engineering to: think differently about incident response and recovery

# Resilience 101

Failure is inevitable. It is a natural part of complex systems as they operate, because the behaviors of such systems are impossible to predict.

YOUTUBE LDPyekzEu10 START 263 END 530 Resilience 101 (5m)

Resilience is the ability of a system to learn and adapt to changing conditions to continue succeeding.

Attempting to prevent failure is often a waste of resources.

Invest efforts instead to prepare and plan for, absorb, recover from, and more successfully adapt to adverse events.

The heart of resilience is adaptive capacity: how poised a system is to change how it works based on context.

We want resilience, not just robustness. In an information security incident, we want to adapt the system to close the vulnerability, not just bounce back to the state it was in before.

Software is sociotechnical. The code cannot adapt itself. Humans are the primary mechanism for adaptation in our complex software systems.