Safety Moment #21: Happy Trails

Choosing root cause analysis in incident investigation.

The last three Safety Moments have looked at various aspects of Root Cause Analysis. In Incident Investigation: Words, Words, Words, we stressed the importance of defining and using terms such as ‘incident’, ‘accident’ and ‘root cause analysis’ precisely. This was followed by Root Cause Analysis, which showed that there is no single root cause. Instead, people develop a root cause trail based on their own experience and insights. This is healthy: the multiple points of view can suggest different ways of preventing similar incidents from recurring. Last week we published It’s Turtles All The Way Down, in which we showed that root cause analysis is an example of infinite regress, so there can be no such thing as a true root cause.

The challenge facing those who run incident investigations is deciding which of the multiple lines of thinking (the “Happy Trails”) to follow, given that all organizations have limited time, money and people.

In the previous Safety Moments we postulated an incident in which a pump seal failed, causing a leak of highly flammable material. We created three potential root cause trails. All of them were good, but none of them was the root cause, because there is no such thing. The three trails were:

  1. Deficiencies in the enterprise management software that led to the wrong replacement seal being ordered.
  2. Lack of procedures and training for the maintenance technicians.
  3. Failure to recognize that a change in the composition of the process fluid should have been assessed through the Management of Change system.

A starting point for deciding which trail to follow is to ask the following questions.

  1. Which trail would be the most cost effective?
    For example, would it be more costly to examine the enterprise management software in depth or to make sure that all maintenance procedures are up to date?
  2. Which approach is most likely to be successful?
    Updating the maintenance procedures and training is probably the approach most likely to succeed.
  3. Which approach could yield benefits quickly?
    Working on maintenance procedures and training would probably be best, since the benefits would start to appear almost as soon as the updated procedures are published.
  4. Which of these lines of analysis could reveal the most about the overall management systems?
    Because proper management of change is so fundamental to a process safety system, following this trail could tell management the most about their systems.
  5. Do previous incident analyses provide guidance?
    If the current incident seems to share features with previous events then it may be best to look for a root cause that addresses them all. For example, if there have been other situations where process conditions were changed without proper evaluation then maybe the Management of Change trail is the best one to follow.
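
The five questions above can be turned into a simple weighted scoring exercise. The Python sketch below is one hypothetical way of doing that; it is not a method taken from the book. The criterion names, the 1-to-5 scale, the weights and every score are illustrative assumptions that loosely follow the tentative answers given above. A real investigation team would supply its own numbers.

```python
from dataclasses import dataclass

# One criterion per question in the list above (names are hypothetical).
CRITERIA = (
    "cost_effectiveness",
    "likelihood_of_success",
    "speed_of_benefit",
    "systems_insight",
    "matches_past_incidents",
)


@dataclass
class Trail:
    name: str
    scores: dict  # criterion name -> score from 1 (poor) to 5 (good)


def rank_trails(trails, weights):
    """Return (trail name, weighted total) pairs, highest total first."""
    totals = []
    for trail in trails:
        total = sum(weights[c] * trail.scores[c] for c in CRITERIA)
        totals.append((trail.name, total))
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


# Placeholder scores for illustration only; they are not measured data.
trails = [
    Trail("Enterprise management software",
          dict(zip(CRITERIA, (2, 2, 2, 3, 1)))),
    Trail("Maintenance procedures and training",
          dict(zip(CRITERIA, (4, 5, 5, 2, 3)))),
    Trail("Management of Change",
          dict(zip(CRITERIA, (3, 3, 2, 5, 4)))),
]

# Case 1: every question counts equally.
equal_weights = {c: 1.0 for c in CRITERIA}
ranking = rank_trails(trails, equal_weights)

# Case 2: management cares most about what the trail reveals about its
# systems and about patterns shared with earlier incidents.
insight_weights = dict(equal_weights,
                       systems_insight=3.0,
                       matches_past_incidents=2.0)
insight_ranking = rank_trails(trails, insight_weights)

for label, result in (("equal", ranking), ("insight", insight_ranking)):
    print(label, "->", [name for name, _ in result])
```

With these placeholder numbers the maintenance trail ranks first when all questions are weighted equally, and the Management of Change trail ranks first when systems insight and past incidents are weighted more heavily. The point is not the arithmetic but that the choice of trail depends on which questions the organization considers most important.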

A potential snag with the above approaches is that the incident analysis team may follow well-worn trails and so miss an opportunity to come up with original insights. For example, it is quite likely that other investigations have highlighted the need for better maintenance procedures. But it may be that no one has thought about analyzing the enterprise management software in the context of process safety.

In the words of Robert Frost,


I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

The material in this Safety Moment is taken from Chapter 11 of the book Process Risk and Reliability Management.