The complexity of success and failure: the story of the Gimli Glider
This story shows how complex and intertwined the notions of failure and success are. Successes and failures during incidents (and operations in general) are more nuanced than a simple binary. If we’re too quick to label something a failure, we risk overlooking the many shifting goals that arise throughout an incident. This limits our understanding and can lead us astray when it comes to making changes.
When an incident starts, our goal may be something like make it all better, as if it had never happened. But if the problem is especially complex or elusive, the goal often shifts to one of mitigation: contain the disturbance and hide it from users. If the disturbance continues to spread, then safing actions may become the goal: keep the database from being lost, for example, or in the case of this incident, get the plane on the ground somehow without injuries.
Tools and resources that are designed to help you during an incident or just during a normal day can cause their own problems. On the other hand, those things that are unexpected can sometimes help you as well.
I like the analogy of treating the incident or accident as a tunnel. When you’re in an incident, whether you realize it yet or not, it’s unfolding in front of you a little at a time. While you’re in it, you can’t see very far ahead. The tunnel might be straight and fairly short, or it could be the longest one yet, full of twists and turns; while you’re inside, there’s no way to know.
Outside (and in the future) is a different story, however. As humans we have a natural inclination to connect the dots and imagine a straight line. This straight line obscures so much of what helped create success, or why we might label the outcome a failure. In fact, this straight line, the assumption of a fairly simple, linear narrative, even obscures the complexity of the notions of failure and success themselves.
The story of the “Gimli Glider” highlights this very well.
We’ve talked in the past about first and second stories. I’ll tell this story from a couple of perspectives; along the way we’ll see what each highlights and obscures, and what we can learn about operating our own systems and handling our own incidents. For now, I’ll tell the story starting in the air; next time we’ll revisit what it was like on the ground.
I’m also reminded of the old story about luck. That’s what this story, and our incidents, are like. You’ll notice this sort of back-and-forth pattern as the story goes on.
There is a Taoist story of an old farmer who had worked his crops for many years. One day his horse ran away. Upon hearing the news, his neighbors came to visit. “Such bad luck,” they said sympathetically. “May be,” the farmer replied.
The next morning the horse returned, bringing with it three other wild horses. “How wonderful,” the neighbors exclaimed. “May be,” replied the old man.
The following day, his son tried to ride one of the untamed horses, was thrown, and broke his leg. The neighbors again came to offer their sympathy on his misfortune. “May be,” answered the farmer.
The day after, military officials came to the village to draft young men into the army. Seeing that the son’s leg was broken, they passed him by. The neighbors congratulated the farmer on how well things had turned out. “May be,” said the farmer.
The Gimli Glider is the nickname for Air Canada Flight 143, a Boeing 767-233 flying between Montreal and Edmonton on July 23, 1983. Midway through the flight, it ran out of fuel.
The in-flight incident centered on a failure of the sensor system that tells you how much fuel is left (the Fuel Quantity Indicator System, or FQIS). These had been failing at a fairly high rate in the 767s, so the only replacement on hand turned out to be bad as well. There was a backup channel in the plane, but due to confusion over how the maintenance had been done, it was inadvertently turned off. There was a working replacement available, though, borrowed from another airline. That part was waiting at their destination, slated to be installed after landing.
There was a manual process they could use to measure the fuel, using a “drip stick.” This gives a measurement of the volume of fuel on board. But the fuelers and the navigational computer needed to know the amount of fuel by weight; in this case, the navigational computer needed kilograms (this was one of the few planes in the fleet that used metric). So a conversion had to be applied.
An incorrect conversion factor was applied, but the pilots and the ground crew alike agreed that there was enough fuel for the trip. Several people looked at the numbers and thought they were correct. This shows us that having someone else take a look or “double-check” something isn’t always enough; what is confusing or misleading for one person can be so for another. This is an example of why “human error” is not a useful finding in an investigation, but should instead be a prompt to dig further: what made sense at the time to one person can make sense to another.
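To make the error concrete, here is a sketch of the arithmetic, using the approximate figures commonly cited from the inquiry report (roughly 7,682 litres already on board, 22,300 kg required for the trip, a correct density of about 0.803 kg/L for this fuel, and the pounds-per-litre factor of 1.77 that was actually used; treat all of these as approximate):

```python
# Sketch of the Gimli Glider fuel-conversion error (approximate figures).

LITRES_ON_BOARD = 7682   # drip-stick reading, litres
REQUIRED_KG = 22_300     # fuel needed for the trip, kilograms

KG_PER_LITRE = 0.803     # correct factor for this fuel (kg per litre)
LB_PER_LITRE = 1.77      # factor actually used (pounds per litre)

# The crew computed "kilograms" on board using the pounds factor,
# so the number was really pounds:
assumed_kg = LITRES_ON_BOARD * LB_PER_LITRE          # ~13,597 "kg"

# They then added just enough litres to reach the required figure,
# again using the wrong factor:
litres_added = (REQUIRED_KG - assumed_kg) / LB_PER_LITRE  # ~4,917 L

# Actual fuel on board, converted correctly:
actual_kg = (LITRES_ON_BOARD + litres_added) * KG_PER_LITRE
print(round(actual_kg))  # ~10,117 kg, less than half of the 22,300 kg needed
```

Because pounds and kilograms differ by a factor of about 2.2, the same wrong factor in both steps left the plane with less than half the fuel everyone believed it had, while every number on paper looked consistent.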
A working sensor borrowed from another airline was available and waiting to be installed at the destination. It’s easy to see why flying could have made sense: there was a manual procedure, and as far as they knew, they were literally flying to the solution.
The problem was first noticed in flight when an alarm went off telling the pilots they had a fuel pressure problem on the left side of the plane. Not a big deal, they think: a failed fuel pump isn’t catastrophic, since the engine can be fed by gravity, so they turn off the alarm. Almost immediately, they get the same alarm again, this time for the right engine. At this point they’re talking to controllers, letting them know they’ll divert to Winnipeg, planning to make a single-engine landing and to try to restart the left engine, which has stopped.
Only seconds later there is another warning alarm, one they described as a “bong” that no one in the cockpit could remember ever hearing before. This is the warning sound for “all engines out,” and shortly thereafter the right engine stops as well. At this point they have no engines, a situation that had not been anticipated and thus was not covered in training.
Expertise played a huge role in this incident (as it so often does in ours). The Captain, Robert Pearson, was an experienced glider pilot, so when the plane ran out of fuel, he was able to fly it in a way that isn’t taught and is rarely used in commercial flying!
Without the engines running, the pilots didn’t have access to main power. This meant they couldn’t do things like deploy the landing gear the normal way. Instead, they had to manually release the landing gear and rely on gravity to lower and lock it. Even so, only the main gear locked; the wheel under the nose did not.
As they descended, they searched the emergency checklist for a section on flying with both engines out, only to find that there was no such section. The First Officer, Maurice Quintal, who was in communication with Winnipeg controllers, did some calculations and determined that they weren’t going to be able to reach Winnipeg.
Now they have the landing gear down, but they are without power or fuel, flying the plane like a glider. They need to find a place to land; you can’t glide forever, after all!
The First Officer had been a pilot in the Royal Canadian Air Force and had served at nearby Station Gimli and so suggested that they land there.
As they approach it, though, they learn what neither the First Officer nor the controller they were talking to knew: the area was no longer an air force base, and a section of it had been converted to a race track, one that was in use for a race at that very moment. The area around the (now decommissioned) runway was full of cars and people.
As they’re coming in, they realize they’re still too high and moving too fast. Remember, without the engines they had no main power and, unlike previous aircraft, very little hydraulic power. In order to lose altitude, Captain Pearson decides to do a “slip” by “crossing the controls”: a maneuver where the rudder goes in one direction and the ailerons (the hinged surfaces they can control on the back of the wing) go in the other. This lets them descend without gaining the forward speed they would have if they had simply pointed the nose down. The technique comes straight from gliding and is almost never used in large passenger jets like this.
Now that the plane had slowed, though, its hydraulic power was even lower than before, which led to the pilots being surprised when the controls became slow to respond.
Also, with the engines out, the plane didn’t make much noise, so the people on the ground around the runway got no warning that a plane was coming in. As they descended, the captain noticed two kids on bicycles within 1,000 feet of where they were expecting to land.
As they land, the captain brakes so hard that two of the tires blow out. Remember that unlocked nose wheel? When they hit the ground, it collapses and is pushed back into its storage position. This makes the nose hit, bounce, and then drag along the ground. That extra friction helps keep them from crashing into the crowds around the runway.
Also, remember that the runway was decommissioned? In order to make it suitable for drag racing, it now had a guardrail down the middle. The captain applies extra right brake and gets the main landing gear to hang over the guardrail, bringing everything to a complete stop. Seventeen minutes after losing power, all 61 passengers and 8 crew are safely on the ground with no major injuries.
Not only was no one seriously hurt, but the aircraft itself wasn’t even significantly damaged! It was repaired and continued to fly until 2008.
After the event, Air Canada does an internal investigation, demotes the captain for six months, suspends the first officer for two weeks, and suspends three members of the maintenance crew. This is somewhat expected of a process designed from the beginning to find fault, as opposed to one designed to facilitate learning.
Recently, at LFIConf, Dr. Ivan Pupulidy talked about just this sort of thing in the context of the US Forest Service. I strongly recommend watching his talk for a great explanation of what it takes to change these sorts of processes.
Almost two years later, in 1985, both pilots were given the first ever Fédération Aéronautique Internationale Diploma for Outstanding Airmanship.
Later, several crews would try the same scenario in a simulator; every attempt ended in a crash.
We’ve looked at this second story from one perspective: the one in the air. We got to see how expertise and experience played a huge role, in this case unanticipated expertise. There was no “training session” or “knowledge transfer session” that could have conveyed these things to others.
That doesn’t mean we shouldn’t care about training. It means we need to approach it differently: we need to give people the time, space, and resources to practice the things they’ve learned, a place to develop the experience that leads to expertise. That means a lot of the things we think of as “learning” in our industry are really only a first step.
Hearing from an expert (whether live or in writing), taking a workshop, or reading a book is really only a first step. A very important and valuable first step, but one that only pays off if the learning continues and develops.
- Dekker, S. (2014). The field guide to understanding “human error” (Third edition). Ashgate.
- Zen Stories to Tell Your Neighbors: Maybe.
- Final report of the Board of Inquiry