FMEA stands for Failure Mode and Effect Analysis. It is used to measure risk in a system and to prioritize which risks need to be tackled most urgently. For some reason I could never remember the term until I split it into its two sections: ‘Failure Modes’ and ‘Effect Analysis’.
Failure Modes are the ways, or ‘modes’, in which an operation can fail.
Effect Analysis is analyzing the effect that any failures would have.
FMEA is therefore ‘analyzing the effects’ of the ‘modes (ways) a system can fail’.
Why do you need to do FMEA?
Surely you want to remove all issues, so why not just work through solving them all, using a Pareto chart to make sure you’re solving the most common issues first?
Not all issues are created equal.
For simplicity, let’s take the example of a company that makes electric saws, where only two things can go wrong in the manufacturing process. In 10% of the issues the guard doesn’t get put on the saw, and in 90% of the issues the saw doesn’t get painted. Using a Pareto chart to prioritize your issues, you’d be merrily trying your hardest to solve the paint problem, as this would remove 90% of the issues in one go. The guard not being on the saw, though, is a much more severe issue, as a customer could cut off a hand, whereas they may not even notice the paint issue.
FMEA tries to solve this problem by factoring in not only how common the issue is, but also what the consequences are and how likely it is to slip through to the customer (an issue the customer finds is much worse than an issue you find; think back to the ‘saw guard’ issue).
How do you perform FMEA?
Failure Mode and Effect Analysis is a simple but powerful tool, which is useful whenever you need to minimize risk. The step-by-step guide is:
- Identify all the core risks in the system you’re looking at (the Cause and Effect or Fishbone chart is good for this, or you could use brainstorming)
- Put all your risks into a table (N.B. there will be several risks of failure per part number / process, and likely several causes per failure). Common columns are:
  - Part or Process being analyzed
  - Risk of failure
  - Consequences of the risk occurring
  - Causes of the failure
  - Methods of detection for the failure
- Rate the following on a score of 1 to 10 (10 being worst):
  - Severity of the consequences
  - Likelihood of Occurrence for the failure
  - Detectability of the failure (10 meaning the failure is least likely to be caught before it reaches the customer)
- Calculate the Risk Priority Number (RPN) by multiplying your 3 numbers together – this is your measure for how risky that risk is
- Sort your table by Risk Priority Number – you now have a prioritized list for tackling your risks
- Assign actions to improve the system
- Re-rate the system
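As a rough illustration of the rating, multiplying and sorting steps above, here is a minimal Python sketch using the electric saw example. The scores, the process names and the `FailureMode` class are all made up for illustration; in a real FMEA the ratings come from your team's judgment, not from code.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of the FMEA table (hypothetical structure for illustration)."""
    process: str
    failure: str
    severity: int     # 1-10, 10 = worst consequences
    occurrence: int   # 1-10, 10 = happens most often
    detection: int    # 1-10, 10 = least likely to be caught before the customer

    @property
    def rpn(self) -> int:
        # Risk Priority Number: the three ratings multiplied together
        return self.severity * self.occurrence * self.detection


# Assumed scores for the two saw failures; not taken from any real study
rows = [
    FailureMode("Finishing", "Saw not painted", severity=2, occurrence=9, detection=2),
    FailureMode("Assembly", "Guard not fitted", severity=10, occurrence=2, detection=6),
]

# Highest RPN first gives the prioritized list
ranked = sorted(rows, key=lambda row: row.rpn, reverse=True)

for row in ranked:
    print(f"{row.rpn:>4}  {row.process}: {row.failure}")
```

With these assumed scores the missing guard comes out at 120 and the paint issue at 36, so the guard is tackled first even though the paint issue is far more common.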
Now that you’ve got your prioritized list, you can draw a Pareto chart with the RPN as the variable, to make sure you have effectively covered all the core risks.
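To show what the Pareto view of RPNs might look like in numbers, the sketch below works out the cumulative share of total RPN for a handful of risks. The failure names and RPN values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from itertools import accumulate

# Hypothetical RPN values per failure mode (illustrative only)
rpns = {
    "Saw not painted": 36,
    "Guard not fitted": 120,
    "Label missing": 24,
    "Blade misaligned": 60,
}

# Sort from highest to lowest RPN, as on a Pareto chart
ranked = sorted(rpns.items(), key=lambda item: item[1], reverse=True)

total = sum(rpns.values())
running = accumulate(rpn for _, rpn in ranked)

# Each entry: (failure, RPN, cumulative % of total RPN)
pareto = [
    (name, rpn, round(100 * cum / total, 1))
    for (name, rpn), cum in zip(ranked, running)
]

for name, rpn, cum_pct in pareto:
    print(f"{name:<18} {rpn:>4} {cum_pct:>6}%")
```

Here the top two risks account for 75% of the total RPN, which is the kind of check the Pareto chart gives you at a glance.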