Safety Moment #19: Root Cause Analysis

Root cause analysis in the process industries

Last week’s Safety Moment, “Incident Investigation: Words, Words, Words”, prompted a discussion of the vexed topic of root cause analysis. While we all agree that it is vital to determine the underlying reasons for incidents, the catch is that there is no agreed-upon definition of the term “root cause”.

An 800 person forum comprised of Root Cause Analysis (RCA) practitioners from all over the world tried to define “Root Cause Analysis.” They could not agree on an answer. . . . It means different things to different industries – even different things within the same industries. It is even difficult to find consistency within the same companies, or even sites within a company.

Nelms 2007

Which raises the question: why can we not come up with an agreed-upon definition of “root cause”?

Oscar Wilde (1854 – 1900) once said,

A truth ceases to be a truth as soon as two people perceive it.

What he meant is that, while facts are facts, each person selects and interprets them according to their own background and experience, that is, their own version of reality. This is why there is never a single root cause that explains why incidents occur at process and energy facilities.

To continue the Nelms quotation,

The problem with Root Cause Analysis is that it has become whatever people want it to be. If you only want to see problems in your "Management Systems," that's all you will see. If you only want to understand the physical mechanisms of problems that is all you will see.

Ironically, the different ways in which different people look at potential root causes can be beneficial if handled properly. For example, if a pump seal fails, one investigator may note that the wrong type of seal was installed. Therefore her root cause trail will examine the company’s purchasing and procurement procedures. At the conclusion of the investigation she may define the root cause of the failure as “Limitations in the enterprise resource software”.

Another investigator may find that the maintenance technician who installed the seal had not been provided with accurate procedures, and had never received training for the installation of this type of seal. Therefore his root cause trail will scrutinize the process for writing procedures and for making sure that people are properly trained in the use of those procedures. His definition for the root cause of the failure may be “Failure to write adequate maintenance procedures and to properly train maintenance technicians”.

A third investigator may note that the process liquid in the pump is different from the original design. He or she may then develop a root cause trail concerning materials failure, arriving at the root cause “Management of Change system provides inadequate guidance regarding material integrity checks”.

The key point here is that all three of these analyses are correct. Each person has found a root cause, but none has found the single root cause, because there is no such thing.

The practical upshot of this line of thought is that incident investigation teams should include as many different disciplines as possible. A team made up of just process engineers, for example, will miss insights from the purchasing or human resources departments. The more points of view that are represented, the better the chance of preventing a similar incident from recurring.

The material in this Safety Moment is taken from Chapter 11 of the second edition of the book Process Risk and Reliability Management.