Four concepts for resilience and the implications for the future of resilience engineering

3 Apr 2021

This week we have a paper by David Woods, who is a principal at Adaptive Capacity Labs, a sponsor. Sponsorship or relation to a sponsor does not influence how I analyze papers, and I have featured Woods' papers previously.

I’ve talked to a lot of readers who have told me that they would also like to have a real-time, face-to-face discussion about these papers as a way to better understand and absorb them. I’d like to help with that, so as an experiment over the next few weeks, I’ll be holding a session where you can join and discuss the paper I’ve featured here that week.

Interested? You can sign up to get the calendar invite and meeting link here. I’ll be sending invites out later in the week. The first session will be about an hour, this Friday, January 31st at 9:00 am PST (5:00 pm UTC). I hope to see you there!

This is a paper by David Woods that appeared in Initiative on Complexity in Natural, Social & Engineered Systems. Woods points out, as you’ll likely have noticed, dear reader, that the term “resilience” can be very overloaded and used in a number of different ways, some more helpful than others.

This is a good guide to the different things that get lumped in with resilience, like robustness. It also serves as an easier intro to some of Woods' denser works and ideas (such as How Adaptive Systems Fail).

From a practical perspective, by knowing which form of resilience we or others are discussing, we can be more specific and clear and communicate more effectively. Also, by knowing the various facets of what resilience could mean, we can begin to choose better perspectives as we analyze and build systems.

To help sort out these usages, Woods groups them into four categories. They are resilience as:

Rebound
Robustness
Graceful extensibility
Sustained adaptability

Rebound

Resilience as rebound looks at why some systems (or individuals) recover well from challenges, while others do not.

One of the difficulties in adopting this view and definition of resilience is that approaching the question directly just means restating the question.

Further, this view looks at the specifics of each event or challenge. Systems that are able to recover well need to have an ability to look to the future and anticipate threats or challenges, but to learn about rebound, we look to the past.

Additionally, the focus on the specifics of each event can cause an important aspect of the event to be overlooked: the surprise that occurs because the event is not consistent with the model that existed previously (and will now have to be revised).

Some research in the area has shown that in order to recover well, processes and resources need to be in place prior to the incident. Examining what was in place prior is one way in which learning from this view can still be useful.

Another difficulty with this view is that the idea of a system returning to some “normal” state hasn’t held up well in the face of further study. In adapting and responding to challenges, the system changes and becomes something new.

Robustness

Woods defines this category of resilience as “increased ability to absorb perturbations.” When a system becomes more robust, the list of disturbances that it can respond to effectively expands. Woods provides a good way of recognizing and understanding robustness in paraphrasing David Alderson and John Doyle:

"robustness is always of the form: system X has property Y that is robust in sense Z to perturbation W"

Though the list of things that the system can effectively respond to grows, the system becomes vulnerable in different, new ways. We cannot say that by increasing robustness, and thus that list, we have increased the performance of the system in the face of all challenges; we have simply changed it. This is a trade-off that we must make.

Robustness does not answer the question of what happens to the system when disturbances occur outside of those that had been planned for. This is where resilience comes into play, when the challenges the system will face are not well known or when they are continually changing.
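
As a rough illustration of the "system X has property Y that is robust in sense Z to perturbation W" form, here is a minimal Python sketch. It is not from the paper: the `RobustnessClaim` class, the `covered` helper, and the "checkout-api" service with its claims are all hypothetical, made up only to show how a robustness statement is always tied to a specific list of perturbations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobustnessClaim:
    """One statement of the form: system X has property Y that is
    robust in sense Z to perturbation W. (Hypothetical structure.)"""
    system: str        # X
    prop: str          # Y
    sense: str         # Z
    perturbation: str  # W


def covered(claims, system, perturbation):
    """Return the claims that say something about this system and perturbation."""
    return [
        c for c in claims
        if c.system == system and c.perturbation == perturbation
    ]


# Made-up claims about an imaginary service, for illustration only.
claims = [
    RobustnessClaim("checkout-api", "p99 latency", "stays under 500 ms", "loss of one replica"),
    RobustnessClaim("checkout-api", "order data", "no writes are lost", "database failover"),
]

# A planned-for perturbation has a matching claim.
print(covered(claims, "checkout-api", "loss of one replica"))

# A perturbation nobody planned for has none: robustness is silent here.
print(covered(claims, "checkout-api", "expired TLS certificate"))
```

An empty result does not mean the system will fail under that perturbation. It only means that robustness, framed this way, has nothing to say about it, which is the gap the remaining two views try to address.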

Graceful Extensibility

Woods tells us that this is where we see resilience as the opposite of brittleness. This view asks: “how do systems stretch to handle surprises?”

All systems have some area of surprises, as they all have boundaries. This is complicated by the fact that though the boundaries exist, they are rarely known exactly and are often changing. Woods tells us that brittleness is how quickly a system declines when performing at or near the boundary, and that systems with low graceful extensibility tend to collapse at the boundaries.

Because the boundaries and the system itself change as the system adapts to challenges, it’s important to note that graceful extensibility is a dynamic capability. Further, Woods uses the term to indicate that adaptation need not simply be “less negative,” but can actually be a net positive as well (more on graceful extensibility).

In this view of resilience, resilience comes from having a system that in advance can handle various classes of surprises.

Sustained Adaptability

The final view of resilience is that of sustained adaptability or, as Woods puts it: “the ability to manage/regulate adaptive capacities of systems that are layered networks.”

This view asks more questions than the previous views:

What governance or architectural characteristics explain the difference between networks that produce sustained adaptability and those that fail to sustain adaptability?
What design principles and techniques would allow one to engineer a network that can produce sustained adaptability?
How would one know if one succeeded in their engineering (how can one confidently assess whether a system has the ability to sustain adaptability over time, like evolvability from a biological perspective and like a new kind of stability from a control engineering perspective)?

Unlike some of the other views that look at a very narrow area of the system (like how it recovers), this area of resilience is a higher-level concept that includes the trade-offs and multiple dimensions of how human performance really works. In this view, Woods tells us:

"it makes sense to say a system is resilient, or not, based on how well it balances all the tradeoffs, or not"

Not only does this view seem to be the most fruitful from a research perspective, it is also a good vantage point from which to view our systems as software people. This is a perspective that can help us better design systems and assess their performance.

Takeaways

The term “resilience” has come to mean different things in different contexts to different people.
- This overloading can make learning and discussion difficult.
The many uses of “resilience” can be broken down into 4 categories:
- Rebound
- Robustness
- Graceful Extensibility
- Sustained Adaptability
The first two categories haven’t yielded nearly as much as the latter two in research and understanding.
- The latter two are also good lenses from which to view our systems, and good focuses for areas of study and examination.
Knowing more specifically which type of resilience we’re talking about in a given conversation can help us be more precise and more clear in our discussions.
The view of resilience as sustained adaptability is one (though not the only) perspective that we can adopt to help us better build and evaluate systems.
