A Sensemaking Lens on Reliability

Thanks to everyone who took time out to hang out and say hi while you were in Vegas for re:Invent. If you’re ever in Vegas, I’m happy to grab coffee or give some local recommendations!

This week we’re taking a look at a paper by Ruth Blatt, Marlys Christianson, Kathleen Sutcliffe, and Marilynn Rosenthal published in the Journal of Organizational Behavior.

I found this paper kind of confusing in some spots, but despite that, I still think it’s useful and informative, especially when the authors go through how they gathered their data.

They interviewed 26 of the 85 residents (chosen randomly) at a 550-bed hospital they nickname “University Hospital”. They used sensemaking in two ways: as a lens through which to view the work they were learning about from residents, and as the thing they would be investigating.

They tell us that sensemaking is a mixture of thinking and action. People take a stream of experiences about events and categorize them in a way that they can understand, forming a cohesive story shaped by their identity and experiences. From this understanding, they can take action. They act as if something is the case, then test their understanding through action.

Sensemaking is how people answer two questions:

  1. “What is the story?”
  2. “Now what?”

The authors chose healthcare to investigate because though there are processes and procedures, the process of treating individual patients is highly variable and can be unpredictable.

The paper really sets out to answer two questions.

  1. What do we learn about the two existing approaches to reliability in this setting (prevention vs. resilience)?
  2. What do we learn about the sensemaking processes of residents as they encounter lapses in reliability?

First they define two different approaches to dealing with what they call “lapses” of reliability or reliable care. The first is prevention, which relies on preparation in advance. The problem with this, of course, is that many things can happen in complex domains that cannot be planned for in advance and avoided or mitigated.

The second approach they call resilience. They explain that this approach is one in which lapses are noticed in real time and addressed or adapted to where possible. The problem here, they note, is that their research showed many of the so-called “lapses” went unnoticed at the time they occurred, so in some cases it was too late to do anything.

The interview

What I like most about this paper is that they talk quite a bit about how they interviewed to gather this data. Since they were interviewing people about their own performance, it is possible that the interviewees, inadvertently or otherwise, misremembered their successes or failures. The authors say that they still feel the data is fairly good, given that it ultimately aligns with some other sociological research about interventions.

The authors asked the residents to recall a “medical mishap” that they were involved in. They define “involved in” as either witnessed or caused. These seem like drastically different things to me, and it’s not clear what portion of the mishaps the residents described were ones they had witnessed and what portion were ones they felt they had caused. But the authors do note that they made a point to ask specifically about “medical mishaps” so as to avoid terms that imply judgment, such as mistake or error, or even more formal terms such as “adverse event”.

I think this is a good idea, and mirrors much of my experience and training that says it of course matters how you ask questions. But what is strange is that in the paper itself the authors note that they use the terms mistake, error, and mishap interchangeably.

They are a little clearer and more informative about how they went about the interviews than some other papers that I’ve seen. I’m not suggesting that we all model our data gathering or interviews after purely academic approaches, but I think seeing behind the scenes a bit can help shape how we gather information within our own teams and organizations, especially for things like post-incident review.

Choosing to intervene

Even though it wasn’t one of the main questions the authors set out to answer, they were able to examine why and when some residents spoke up when they noticed a mishap and others did not.

As they began cataloging and coding the data, they noticed patterns about when someone would speak up and began referencing other sociological papers to help them understand it.

Ultimately they determined that speaking up, or using “voice” as opposed to silence, wasn’t just a binary, one-time choice as some other research had portrayed. Instead, it was something that the residents were reassessing throughout an event.

When asking the residents more about speaking up, they learned that there were two factors that played a large role in whether or not someone would speak up. One was whether or not they anticipated speaking up to be effective. Typically residents would “voice” as a default, but if they’d had other experiences to tell them that the recipient wouldn’t act on the feedback, then they were less likely to do so. The next was whether or not they anticipated speaking up causing a negative outcome for themselves or others. For example, whether or not they’d be punished or seen as incompetent.

Many of the residents in their later years of residency described trying, and struggling, to strike a balance between speaking up or admitting uncertainty and behaving as the attending that they wanted and were expected to be.

I think this is something that comes up a lot in software as well. So often we set up structures and roles where it can seem (or in fact be) someone’s job to know the answer as opposed to applying their expertise and experience. It’s no wonder that it can be difficult to say “I don’t know” or “I’m not sure about this” in those moments.


Where things start to get really confusing is in some of the terminology that they use. As you can see by the title, they talk about reliability alongside resilience. But it seems like when the authors are referring to reliability of healthcare, they’re really talking about creating successful or positive outcomes. When I think about reliability of healthcare, by contrast, I tend to think about things like getting care, or getting care that has been shown to be effective.

This is reinforced by a story they share from a resident where a patient came in with a fever and had tests and imaging done, but they couldn’t find the source of the fever. This typically means that antibiotics aren’t given until the source is found. Later, when the source was determined, it was too late to intervene for the patient. The resident speculated that perhaps had they given antibiotics earlier, the results would have been different.

The authors say that “This quote highlights that physicians can undertake actions that may seem appropriate in the moment but that turn out to be mistaken in retrospect.” That seems to gloss over a lot. It also seems to state the obvious while at the same time missing the point. Not being a doctor, I of course don’t know why it made sense to wait for antibiotics; I presume it was so that the right ones could be used. Presuming there is some sort of reason this is the “standard care,” it’s hard for me to see this as a “lapse”. It is of course an unfortunate outcome, but it seems that if this is a lapse, then anything these residents did without perfect foresight would be a lapse.

Other lapses they mention are things like the wrong medication being administered or even, in some cases, potentially reaching the limits of medicine, where the hospital staff were unable to determine what was wrong with a patient. Reaching the limits of medicine is an unfortunate outcome, but I again struggle to see it as a “lapse” against which the authors can then test the two schools of thought they’re examining.

I think we can fall into similar patterns in software, assuming that anything that results in an adverse outcome must have been preceded by a “lapse” somewhere. It’s entirely possible I’ve completely misunderstood the authors, but I think this pattern of thinking can be harmful and can prevent us from learning about incidents and focusing our investments in the right places. If we insist on looking for some lapse, we’ll find it, but how effective or accurate that finding is remains very questionable.

They ultimately conclude that neither of the approaches, prevention or resilience, is appropriate in all situations, and emphasize how difficult error detection and correction is. They also say that they:

“join with other scholars, such as Wears and Cook[…] to call for research aimed at unpacking the mechanisms through which individuals are able to detect and correct errors earlier in their development.”


Takeaways

  • Whether or not speaking up was anticipated to be effective or well received played a large role in whether or not residents spoke up when they noticed a mishap.
    • This was not just a one-time choice, but something the residents continued to reassess throughout a procedure.
    • Decision support tools, roles, and other artifacts that are not reliant upon specific people to be effective may help.
  • Many of the mishaps were not noticed in a time period that would allow them to be corrected or mitigated.
  • Sensemaking is influenced by personal factors such as identity and previous experience.
  • Sensemaking is both a lens through which work can be examined and something that people do that can be investigated.
  • Error detection is very difficult; no single approach or tool can solve it completely.
  • Not everything that has an adverse outcome should necessarily be seen as caused by a “lapse” somewhere.