Air France 447: Lessons Learned

Ten years ago, as May turned into June 2009, Air France Flight 447, bound for Paris from Rio de Janeiro, crashed into the Atlantic. All 228 passengers and crew on board were killed. In the immediate aftermath it was difficult to understand how it happened. The plane was an Airbus A330, one of the most sophisticated in operation and manned by three highly-trained pilots. For two years commentators speculated on why this plane had simply fallen out of the sky. Then, in May 2011, the flight recorders were salvaged from the ocean floor and an explanation could begin to be pieced together.

That explanation still reverberates today. In a long-form article for Vanity Fair in 2014 called The Human Factor, author William Langewiesche outlines the story of the fateful flight and unpicks the reasons for its crash. Rather than being caused by a mechanical fault or freak occurrence of nature, the events of that night can be traced back to human fallibility. And because human fallibility is so pervasive beyond the field of aviation, the incident has become something of a case study, recounted in many more articles and books since.

There are four elements to the crash of Air France Flight 447 that keep it relevant.

Relationship Between Humans and Machines

The first centres around the relationship between humans and machines.

Autopilot technology has been a feature of aircraft design for years. Individual components had been automated as long ago as the 1960s. Then faster processing power and digital innovations allowed automation to become more integrated. Flight management computers were added that could be programmed on the ground and autopilot systems were refined to handle the plane from take-off to the rollout after landing. In 1987 Airbus introduced a ‘fly-by-wire’ technology that replaced conventional manual flight controls with an electronic interface. Whatever a pilot did at the controls was converted into electronic signals that executed the manoeuvres on the wings and the tail.

All of this progress led to a much better safety record across the industry. The A330 model itself had never crashed in commercial service over the fifteen years since its introduction in 1994. But while the chances of disaster had been reduced, they had not been entirely eliminated. In his book, Messy, Tim Harford writes, “paradoxically, there is a risk to building a plane that protects pilots so assiduously from even the tiniest error. It means that when something challenging does occur, the pilots will have very little experience to draw on as they try to meet that challenge.” The risk had not disappeared, it had simply changed form.

The pilot at the controls that night was Pierre-Cédric Bonin. Although he had clocked up many hours in an Airbus cockpit, his actual experience of manually flying a plane like the A330 was minimal. His role had primarily been to monitor the automatic system. The time he had spent manually flying would likely have been focused on take-off and landing. So, when the autopilot disengaged, because ice crystals had begun to form inside the air-speed sensors in the fuselage, he didn’t know what to do. The fly-by-wire system downgraded itself to a mode that gave Bonin less assistance. With the safety net gone, the plane was now liable to stall if conditions allowed, and Bonin inadvertently proceeded to create those conditions.

Harford goes on: “This problem has a name: the paradox of automation. It applies to a wide variety of contexts, from the operators of a nuclear power station to the crew of a cruise ship to the simple fact that we can’t remember phone numbers anymore because we have them all stored in our mobile phones – or that we struggle with mental arithmetic because we’re surrounded by electronic calculators. The better the automatic systems, the more out-of-practice human operators will be, and the more unusual will be the situations they face.”

Harford quotes psychologist James Reason, author of Human Error: “Manual control is a highly skilled activity, and skills need to be practised continuously in order to maintain them. Yet an automatic control system that fails only rarely denies operators the opportunity for practising these basic control skills … when manual takeover is necessary something has usually gone wrong; this means that operators need to be more rather than less skilled in order to cope with these atypical conditions.”

In the case of Air France Flight 447 this asymmetry had dire consequences. Years earlier Earl Wiener, an aviation engineer, coined what became known as Wiener’s Laws of aviation and human error. One of these laws is: ‘Digital devices tune out small errors while creating opportunities for large errors.’

The issue is highly relevant in the design of self-driving cars. In her book Hello World, Hannah Fry dedicates a chapter to self-driving cars in which she draws on the Air France analogy. Asking passengers to pay attention when for most of the time they don’t have to is unrealistic; asking them suddenly to pay attention in an emergency is unsafe. She quotes Gill Pratt, head of Toyota’s research institute:

“The worst case is a car that will need driver intervention once every 200,000 miles … An ordinary person who has a [new] car every 100,000 miles would never see it [the automation hand over control]. But every once in a while, maybe once for every two cars that I own, there would be that one time where it suddenly goes ‘beep beep beep, now it’s your turn!’ and that person, typically having not seen this for years and years, would … not be prepared when that happened.”

Both Harford and Fry caution against over-reliance on automation in a world that is becoming increasingly automated. They argue that it should be used to support decision making rather than supplant it. It is a challenge that the autonomous vehicle industry is having to face up to.
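Pratt's arithmetic in the quote above can be checked in a few lines. This is only a sketch: the two mileage figures come from his quote, while the annual mileage is a hypothetical assumption added here to turn the gap between handovers into years.

```python
# Figures taken from the Gill Pratt quote.
MILES_PER_HANDOVER = 200_000   # one driver intervention per 200,000 miles
MILES_PER_CAR = 100_000        # a new car every 100,000 miles

# Hypothetical assumption, not from the source: annual mileage of one driver.
ASSUMED_MILES_PER_YEAR = 10_000


def handovers_per_car(miles_per_handover: float, miles_per_car: float) -> float:
    """Expected number of handovers during the life of one car."""
    return miles_per_car / miles_per_handover


def years_between_handovers(miles_per_handover: float, miles_per_year: float) -> float:
    """Expected number of years a driver waits between handovers."""
    return miles_per_handover / miles_per_year


per_car = handovers_per_car(MILES_PER_HANDOVER, MILES_PER_CAR)
cars_per_handover = 1 / per_car
years = years_between_handovers(MILES_PER_HANDOVER, ASSUMED_MILES_PER_YEAR)

print(f"Handovers per car: {per_car}")
print(f"Cars owned per handover: {cars_per_handover}")
print(f"Years between handovers (assumed mileage): {years}")
```

Half a handover per car is the "once for every two cars" in the quote. Under the assumed mileage, the driver would go about two decades between emergencies, which is the practice gap that the paradox of automation describes.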

Depths of Human Psychology

The second element of the Air France episode that lends it relevance today is the spotlight it shines on how humans perform under pressure. In his book Smarter Faster Better, author Charles Duhigg describes the concept of cognitive tunnelling. When people become overly focused on what’s directly in front of them or become preoccupied with immediate tasks, they often fail to perceive something that lurks in plain sight. That’s cognitive tunnelling. The most famous demonstration of it is The Invisible Gorilla test, where subjects are asked to watch a video of groups of people passing a basketball around. The task is to count the number of passes made by one of the teams. But like many psychological experiments that’s not really the task. Without giving it away completely, it is worth trying for yourself.

Cognitive tunnelling can kick in when our brains are forced suddenly to transition from a state of relaxation to one of stimulation. Duhigg argues that when the alarm sounded to alert the crew of Air France Flight 447 that the autopilot had shut down after ice crystals had formed on the air-speed sensors, Bonin entered a cognitive tunnel. His attention had been relaxed for the past four hours and was now looking to latch onto a focal point, which he found in the primary flight display right in front of his eyes.

The primary flight display is the gauge in the cockpit that shows if the plane is level. Duhigg contends that Bonin became so focused on correcting the roll in the plane and levelling it that he pulled back on his stick, causing the plane’s nose to lift higher. Whether he’s right or not, we don’t really know. What we do know is that Bonin pulled the plane higher, an action that aviation experts find mysterious. It was this action that ultimately led to the plane stalling. Meanwhile, Bonin’s co-pilot, David Robert, was focussing on a screen next to him that displayed an updating series of instructions generated by the aircraft’s computer. One of the instructions was to stabilize and go back down, and he communicated that to Bonin, but rather than check that Bonin had complied, he became consumed with the scrolling instructions, and Bonin continued to climb.

A short while later the ice crystals cleared and the air-speed sensors began functioning again, but by now both men were in a state of panic. A speaker in the cockpit was blaring STALL STALL STALL (seventy-five times in total, according to the recording) but the pilots didn’t acknowledge it. The plane had stalled, it was losing altitude, and they didn’t know what was going on. Less than a minute and a half after he had taken over manual control of the plane Bonin shouted: ‘Putain, j’ai plus le contrôle de l’avion, là! J’ai plus le contrôle de l’avion!’ Damn it, I don’t have control of the plane, I don’t have control of the plane at all!

Duhigg argues that when crisis struck, the Air France 447 pilots didn’t know where to focus. They didn’t have sufficient mental models available to them to navigate the scenario they were faced with and make reliable decisions, so they became reactive and descended into a state of confusion. Bonin’s last words were ‘Mais qu’est-ce qui se passe?’ But what’s happening?

He contrasts their flight with that of Qantas Flight 32 in November 2010. The aircraft serving that flight suffered an uncontained engine failure shortly after take-off in Singapore but was landed safely by pilot Richard de Crespigny. It was regarded by experts as an exceptional landing and 469 lives were saved. As Duhigg tells it, the turning point came when de Crespigny realised ‘We need to stop focusing on what’s wrong and start paying attention to what’s still working’. With the benefit of hindsight perhaps, de Crespigny later told Duhigg: “Computers fail, checklists fail, everything can fail. But people can’t. We have to make decisions, and that includes what deserves our attention. The key is forcing yourself to think …That’s why we have human pilots. It’s our job to think about what might happen, instead of what is”.

Duhigg cites research conducted by psychologists at Klein Associates in the 1980s that suggests that the antidote to cognitive tunnelling is an ability to tell ourselves stories and create mental pictures – to play out scenarios in our heads, as it were. He elaborates with research conducted at MIT in 2007 that looked at how productivity compares among workers in information-intensive jobs. Various common traits emerged among the most productive employees – there was an optimal level of multi-tasking (five projects, not more) and assignments were more likely to be in earlier stages of development. But the trait Duhigg identifies as the most important to mitigate attention lapse is the employees’ eagerness to generate theories. They were constantly trying to figure out how information fits together, in the words of one of the researchers.

The modern world is full of distractions and increasingly we are being forced to flit from one task to the next. Fortunately, most professionals rarely have to switch from relaxation mode into a mode that requires life-or-death responses. But the antidote may be the same – tell yourself stories, anticipate, build mental models.

Antifragility

The third element of the fateful flight that keeps it relevant is simply what we can learn from it.

In his book Antifragile, Nassim Taleb writes, “Every plane crash brings us closer to safety, improves the system, and makes the next flight safer—those who perish contribute to the overall safety of others. Swiss flight 111, TWA flight 800, and Air France flight 447 allowed the improvement of the system.”

For the crash of Air France Flight 447 to make any contribution to system improvement, the wreckage of the plane first needed to be found. It wasn’t easy. The black boxes that contained flight data and voice recorders were the size of shoe boxes and they were lost in an undersea area the size of Switzerland. Spearheaded by the French air accident investigation agency (BEA), the search became the longest, most difficult and most expensive undersea search launched to date. It took two years and US$45 million to pinpoint the wreckage, according to Sharon Bertsch McGrayne. The effort that went into finding it demonstrates how important understanding failure is to the airline industry.

Once the black boxes had been found, the BEA studied their contents and filed a final report in July 2012. The report contained a number of recommendations that have since been incorporated into aircraft systems and pilot training. These include an emphasis on training new pilots to fly the plane when the autopilot fails and on prompting pilots regularly to turn off autopilot to maintain skills.

Atul Gawande comments in his book, The Checklist Manifesto, that few other industries investigate their failures as thoroughly. He suggests that unless mistakes turn up on cable news, they are not typically analysed. That’s so in healthcare, in teaching, in the legal profession and in financial services. Take healthcare. In the United States there are an estimated 200,000 preventable medical deaths every year, the equivalent of almost three fatal airline crashes per day. Ian Leslie wrote a piece for the New Statesman in 2014, in which he said that globally there is a one-in-ten chance that, owing to preventable mistakes or oversights, a patient will leave a hospital in a worse state than when they entered it. He tells the story of ex-pilot Martin Bromiley, who, like Gawande, is keen to borrow practices developed by the airline industry in an effort to improve safety in the field of medicine.

Two such practices are ‘Crew Resource Management’ (CRM) and the checklist.

Crew Resource Management was born out of work done in the late 1970s by NASA, which concluded that many aircraft accidents were the result of poor communication in the cockpit and that teamwork matters more than individual flying skill. The solution was to design a system that fosters collective intelligence to mitigate the risk of individual human error. Attributes include a less authoritarian cockpit culture, open communication and a stronger focus on teamwork. The system is credited with much of the improvement in aircraft safety that has been witnessed over the past forty years.

In the case of Air France Flight 447 it is clear that CRM deteriorated as the disaster developed. Initially the co-pilots defined a shared representation of the situation, but then they busied themselves in their individual tasks and failed to communicate clearly. One of the BEA recommendations following the crash was to improve and standardise pilot CRM training.

Over recent years CRM has been applied in other fields, notably medicine and in particular surgery. Studies have highlighted differences between the environments, but the basic risk that flows from undue deference to the captain / consultant surgeon is similar.

The checklist has also gained traction outside of aviation, in no small part due to the work of Gawande. A checklist simply consists of a standardised list of procedures to follow for every operation and for every eventuality. What it’s not is a complex algorithm designed to allow anyone to do the job at hand; rather it is a quick and easy tool aimed at supporting the skills of expert professionals. It empowers the people at the edges of the room and invites collective responsibility. Its advantage is also as a means to distribute recommended practice to the front-line. Gawande notes that when fields like medicine do hold investigations following an incident, recommendations can become lost in bulky reports and don’t make it out. A checklist distils key take-aways into actionable points.

It is important not to over-exaggerate the read-across from air travel into other fields. Taleb would see the application in medicine, but he makes a distinction in fields such as financial services. He writes: “There are hundreds of thousands of plane flights every year, and a crash in one plane does not involve others, so errors remain confined and highly epistemic—whereas globalized economic systems operate as one: errors spread and compound. This creates a separation between good and bad systems. Good systems such as airlines are set up to have small errors, independent from each other—or, in effect, negatively correlated to each other, since mistakes lower the odds of future mistakes… If every plane crash makes the next one less likely, every bank crash makes the next one more likely.”

Blame the Pilot

This point about systems brings us to the fourth and final element.

Even with all the data available there is still disagreement on the root cause of the accident. The Air France 447 pilots ‘were hideously incompetent,’ says William Langewiesche, author of the Vanity Fair article. Tom Dieusaert, author of the book Computer Crashes: When Airplane Systems Fail, has a different perspective: ‘That plane should not have been flying at all,’ he says, ‘but afterwards the blame was put on the pilots.’

The truth is that a confluence of events occurred to create the disaster. The crew decided to fly through a thunderstorm near the equator rather than around it; the pilot in charge – Bonin’s superior – had had very little sleep in Rio the night before; he took himself off to rest just before the incident took place; the air-speed sensors froze over, causing the air-speed gauge to go blank; the systems were programmed for autopilot to disengage in such an eventuality; Bonin pulled the plane’s nose up, inexplicably; the pilots may have assumed that the plane could not stall, possibly not realising that many protections were bypassed when autopilot disengaged; the controls on the right and left seats in the cockpit were not co-joined in this particular aircraft unlike in others, so Robert did not feel that Bonin was pulling up; the pilots were not sufficiently trained to deal with an eventuality like this; the pilots communicated poorly; the pilots ignored the stall warning; the pilots became overly focused on some small details.

That’s twelve contributory factors, and there are likely more. Some rest with the pilots, some with their employer, Air France, some with the aircraft manufacturer, Airbus. But it is the interaction of all these factors that led to the plane crashing. Humans have a natural tendency to gravitate towards simple narratives of cause and effect. We often target the last link in the chain – in this case the pilots – as the culprits and underestimate the influence of all the other decisions that led up to the fateful ones. In the case of Air France 447 these decisions touched on the design of the aircraft cockpit, the computers inside and the training given to the pilots.

This idea of a broader cause of failure was explored by James Reason in the early 1990s. Rather than focusing on the person, he advocates focusing on the system. Through his lens errors are to be expected, but they are seen as consequences rather than causes. In order to protect against such errors, systems need to be designed with barriers to absorb them. In an ideal world these barriers would be impermeable, but that can be unrealistic, and so the next best thing is to make sure that holes in the various barriers don’t line up. This is Reason’s Swiss Cheese Model. The system’s barriers – be they engineered, human, or procedural – are analogous to slices of Swiss cheese. Holes arise from either active failures or latent conditions. As long as the holes in different layers don’t line up, the system is safe.

The model has widespread use today in safety science. Perhaps it will always be the case that, as Reason says, blaming individuals is emotionally more satisfying than targeting institutions, but thinking in systems has been shown to generate better results.
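Reason’s model is qualitative, but a rough numerical sketch shows why layered barriers help and why latent conditions matter. The sketch assumes that each barrier fails independently and uses entirely hypothetical probabilities; none of the numbers come from Reason or from the Air France 447 investigation.

```python
from math import prod


def breach_probability(hole_probabilities):
    """Chance that a single hazard passes through every barrier.

    Simplifying assumption: the barriers fail independently, so the
    probability that all the holes line up is the product of the
    per-barrier probabilities.
    """
    return prod(hole_probabilities)


# Hypothetical chance that each barrier has a hole in the hazard's path.
barriers = {"engineered": 0.01, "human": 0.1, "procedural": 0.1}
baseline = breach_probability(barriers.values())

# Hypothetical latent condition (for example, a training gap) that
# widens the holes in the human and procedural barriers.
degraded = dict(barriers, human=0.5, procedural=0.5)
with_latent = breach_probability(degraded.values())

print(f"Baseline breach probability: {baseline:.6f}")
print(f"With latent condition:       {with_latent:.6f}")
```

With these made-up numbers the chance of the holes lining up rises from about 1 in 10,000 to 1 in 400, even though no single barrier has disappeared. Real failures are rarely independent, so the product should be read as an optimistic lower bound rather than a forecast.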

Conclusion

Air France Flight 447 became a story when it disappeared so suddenly. It remains a story because of what it reflects about the way we work with the machines we create and the ways in which we think and learn. Air disasters make front page news, but there are multiple other domains in which those same factors play a part. My own field of investing doesn’t deal in life-or-death decisions, but it confronts the cognitive tunnelling problem on a regular basis and needs to be conscious of the automation problem if risk management systems are set up that way. Howard Marks, founder of Oaktree Capital, once compared the job of an investment manager to that of an airline pilot: ‘hours of boredom punctuated by moments of terror’. As we continue to automate more and more tasks that description will permeate more widely, and the lessons from Air France Flight 447 will only get more relevant.

