Human Error, Safety and Reliability
The interaction of design and human capabilities

The Crash of Eastern Flight 401 - Dec. 1972
Account drawn from Danaher (1980)
Diverted from approach to Miami Int'l Airport because an indicator light suggested a malfunction in the nose landing gear.
Set autopilot to 2000 feet to reduce work load while checking nose landing gear.
Autopilot was inadvertently switched off by pilot, leading to a gradual descent.
Crew did not notice descent

The Crash - continued
ATC saw the plane's altitude reading at 900 feet.  The radar system in use at the time could report erroneous readings for up to three sweeps.
The controller did contact the plane but was told all was OK.
Controller’s attention was diverted by 5 other planes he was responsible for
30 seconds later the plane crashed, killing 99 out of 176.

Crash of Eastern Flight 401 - Errors
Pilot Error: not watching altitude, which is the pilot's responsibility.
Pilot assumed autopilot worked.
Controller Error: did not report low altitude to the pilot
 (They are required to now).
Can you name all the factors that contributed to this crash?

Error
DEFINITION:  an action or lack of action that violates some tolerance limit(s) of the system.
Thus defined in terms of system requirements and capabilities.
The occurrence of an error does not imply anything about the human, even if it is “the person's fault.”
It could be a system flaw

Try This:  Name the Colors
red blue yellow green
yellow green yellow red
blue red green blue
yellow yellow blue green
green red red yellow
blue blue green red
red green blue yellow

Human Error Probability
Error Probability (EP) also known as Human Error Probability (HEP):
EP = (# of errors)/(total # of opportunities for the error)
value between 0 and 1
gives rate of errors
this is a probabilistic value
it does not indicate if an error will or will not occur
just the likelihood
does not indicate type or cause of error
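The EP formula above can be sketched in a few lines of Python. The function name `error_probability` and the counts used in the example are hypothetical, chosen only to illustrate the ratio.

```python
def error_probability(errors: int, opportunities: int) -> float:
    """EP = (# of errors) / (total # of opportunities for the error)."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 <= errors <= opportunities:
        raise ValueError("errors must be between 0 and opportunities")
    return errors / opportunities


# Hypothetical data: 3 errors observed in 1000 opportunities.
ep = error_probability(3, 1000)
print(f"EP = {ep:.3f}")
```

The result is a rate between 0 and 1. It says how likely the error is, not whether it will occur on a given trial or why.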

Reliability
DEFINITION:  Probability of a successful outcome of the system or component.
Reliability is also defined in terms of system requirements.
Thus, to evaluate a system it is necessary to know the goals and purposes of the system.
Reliability is a probabilistic term.
No one has ever seen a perfect system.
Calculation of Reliability
R = (# of successful operations)/(total # of operations)
R = 1 - EP
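As a quick check that the two reliability formulas agree, here is a minimal Python sketch. The function name `reliability` and the counts are hypothetical, and every operation is assumed to end either in success or in an error.

```python
def reliability(successes: int, operations: int) -> float:
    """R = (# of successful operations) / (total # of operations)."""
    if operations <= 0:
        raise ValueError("operations must be positive")
    if not 0 <= successes <= operations:
        raise ValueError("successes must be between 0 and operations")
    return successes / operations


# Hypothetical data: 1000 operations, 3 of which ended in an error.
operations = 1000
errors = 3

r = reliability(operations - errors, operations)
ep = errors / operations

# Both routes give the same value: R = 1 - EP.
print(f"R = {r:.3f}, 1 - EP = {1 - ep:.3f}")
```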

Human Error Classification Systems - 1
Basic Error Types
Unintentional vs. Intentional
e.g., a mistake on a test vs. the speeds most of us drive.
Unrecovered vs. Recovered
Recovered:  Error with possibility for damage but no damage actually occurred. (Driving home drunk safely).
Unrecovered:  Error where damage could not be avoided.
The recovered error of one day could be the next day's unrecovered error.

Human Error Classification Systems - 2
Swain and Guttman’s (1980) Human Error Categories.
Error of Omission
tpographicl errrs
Error of Commission
Hitting thumb with the hammer
Extraneous Act
reading a different class's assignment in class
Sequential Error
My usual: light the fire before opening the damper
Time Error
running a red light

Human Error Classification Systems - 3
Meister’s (1971) Types of Failures
Based on where the error originates.
Operating error:
System is not operated according to intended procedure.
Design Error:
Designer does not take into account human abilities.
Manufacturing Error:
System is not built according to design.
Installation and Maintenance Errors
System is not installed or maintained correctly.
Scary how common these are.

Human Error Classification Systems - 4

Human Error Classification Systems - 5
Another Cognitively Based System - Slips vs. Mistakes by Reason and Norman
Slips are errors in execution
Mistakes are errors in planning an action
Lawrence’s (1974) Model with Relative Frequency
Failure to perceive a hazard 36%
Underestimate a hazard 25%
Failure to respond 17%
Ineffective response 14%
Importance: Different types of errors need different types of actions to prevent.

Error Measurement
Variable Error:  errors that differ from trial to trial.
Constant Error: errors that are constant from trial to trial.
Figure - after Chapanis (1951)
Constant errors are easier to predict and thus to correct.
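One common way to separate the two kinds of error, sketched here in Python, is to treat the constant error as the mean signed deviation from the target and the variable error as the spread of the responses around their own mean. That operationalization, the function names, and the sample data are assumptions made for illustration; the slide itself gives only the verbal definitions.

```python
from statistics import mean, pstdev


def constant_error(responses, target):
    """Mean signed deviation from the target (the consistent bias)."""
    return mean(r - target for r in responses)


def variable_error(responses):
    """Spread of the responses around their own mean (population SD)."""
    return pstdev(responses)


target = 100.0
# Hypothetical data: one operator is consistently off, the other is scattered.
biased = [104.0, 105.0, 106.0, 105.0]
scattered = [95.0, 105.0, 90.0, 110.0]

for name, data in (("biased", biased), ("scattered", scattered)):
    ce = constant_error(data, target)
    ve = variable_error(data)
    print(f"{name:9s} | constant = {ce:+.2f} | variable = {ve:.2f}")
```

The biased operator could be corrected with a fixed offset. The scattered operator is right on average, but no single adjustment removes the trial-to-trial error.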

Human-Machine and Error Analysis
A Brief Overview
Some Steps that are part of a complete analysis (Swain & Guttman, 1980)
1. Describe system goals and functions.
2. Describe situation.
3. Describe tasks and jobs.
4. Analyze tasks for where errors are likely.
5. Estimate probability of each error.
6. Estimate probability error is not corrected.
7. Devise means to increase reliability.
8. Repeat steps 4 - 7 in light of changes.

Calculation of Human Error Probability
There are several techniques; we will discuss THERP, the Technique for Human Error Rate Prediction (Swain, 1963).
Start at the top with the probability of a correct/incorrect action.
Each following branch is the probability of the next act given the last action.
These are conditional probabilities - They are not independent.
Sum of partial error probabilities at bottom is overall error probability.

Calculation of Human Error Probability - 2
THERP (Cont.)
In the diagram, a capital letter is a correct outcome and a small letter is an erroneous action.
The | symbol indicates a conditional probability.
Apply to starting a car.
K = correct key
k = incorrect key
S = getting key into ignition
s = missing ignition
P(S|K) is the probability of getting the key into the ignition, given that the correct key was selected. The path K then S is the only correct outcome, with probability P(K)*P(S|K).
P(error) = 1 - P(K)*P(S|K)
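The car-starting tree can be worked through numerically. In this Python sketch the branch probabilities are hypothetical values, not tabled HEPs. The success path multiplies P(K) by P(S|K), and the overall error probability is the sum of the error paths at the bottom of the tree.

```python
# Hypothetical branch probabilities (assumed for illustration only).
p_K = 0.99          # P(K): the correct key is selected
p_S_given_K = 0.95  # P(S|K): key goes into the ignition, given the correct key

p_k = 1 - p_K                  # P(k): an incorrect key is selected
p_s_given_K = 1 - p_S_given_K  # P(s|K): ignition missed, given the correct key

# The only correct outcome is the path K then S.
p_success = p_K * p_S_given_K

# Each error path ends the tree at the bottom; their sum is the overall error.
error_paths = {
    "k": p_k,
    "K then s": p_K * p_s_given_K,
}
p_error = sum(error_paths.values())

print(f"P(success) = {p_success:.4f}")
print(f"P(error)   = {p_error:.4f}")
```

The two totals add to 1, so summing the error paths gives the same answer as subtracting the success path from 1.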

Calculation of Human Error Probability - 3
To get probabilities of specific actions, it is common to use tabled values.
Example HEPs (Swain and Guttman, 1980)
Select wrong control in a group of labeled identical controls | .003
Turn control wrong direction under stress when design violates population norm | .5
Failure to recognize an incorrect status of item in front of operator | .01

Effects of System Complexity on Reliability
In general reliability goes down as number of components goes up (i.e. as complexity goes up).
Components in a Series
In a series if any single component fails the whole system fails - the four tires on the car.
Rs = R1 * R2 * ... * Rn
Examples: All components have reliability of 0.90.
n = 1 | Rs = .9 = .90
n = 2 | Rs = .9*.9 = .81
n = 3 | Rs = .9*.9*.9 = .73
n = 10 | Rs = .9^10 = .35
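The series values above can be reproduced with a short Python sketch. The function name `series_reliability` is hypothetical, and the components are assumed to fail independently.

```python
import math


def series_reliability(reliabilities):
    """Rs = R1 * R2 * ... * Rn for independent components in a series."""
    return math.prod(reliabilities)


# Every component has reliability 0.90, as in the examples above.
for n in (1, 2, 3, 10):
    rs = series_reliability([0.90] * n)
    print(f"n = {n:2d} | Rs = {rs:.2f}")
```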

Effects of Redundancy on Reliability
Active Redundancy: Both components operate all the time but only one is needed.
Failure occurs only when both fail, which has probability (EP1)*(EP2).
Thus reliability is: Rs = 1 - (1 - R1) * (1 - R2) * ... * (1 - Rn)
Example: Use two components and both components have a reliability of 0.90.
In a series | Rs = .9*.9 = .81 (above)
Redundant | Rs = 1 - (1 - .9)^2 = .99
Two redundant components in a series | Rs = .99*.99 = .98
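The redundancy example can be checked the same way. This Python sketch assumes independent failures; the function name `redundant_reliability` is hypothetical.

```python
import math


def redundant_reliability(reliabilities):
    """Active redundancy: the group fails only if every component fails."""
    return 1 - math.prod(1 - r for r in reliabilities)


# Two components, each with reliability 0.90.
in_series = 0.90 * 0.90
redundant = redundant_reliability([0.90, 0.90])

# Two such redundant groups placed in a series.
two_groups_in_series = redundant * redundant

print(f"In a series             | Rs = {in_series:.2f}")
print(f"Redundant               | Rs = {redundant:.2f}")
print(f"Two redundant in series | Rs = {two_groups_in_series:.2f}")
```

Redundancy raises the reliability of the pair from .81 to .99 while using the same two components.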

Techniques to Improve Reliability
HARDWARE
KISS (Keep It Simple Stupid).
A-10 ~33% unavailable at any one time.
F-111D ~66% unavailable
Apache Helicopter has a similar record to the F-111D
Make it reliable/Quality Control
HUMAN
Use human factors knowledge in design - back to Three Mile Island.
Use human as redundant system.
Others?

Risk Analysis
DEFINITION: An estimation of the consequences associated with particular errors.
Includes estimate of probability
i.e., risk = p(error)*consequences(error)
Can be any sort of risk
e.g., loss of life, money, etc.
Must estimate significance of these various consequences
Used to assist many types of decisions:
Estimates of safety
Estimates of probable success
Types of training to use to help operators not to miss important errors
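The relation risk = p(error)*consequences(error) can be used to rank errors by expected loss. In this Python sketch the function name `risk` and the dollar costs are hypothetical; the two error probabilities are example HEPs of the kind found in tabled values.

```python
def risk(p_error: float, consequence: float) -> float:
    """risk = p(error) * consequences(error)."""
    if not 0.0 <= p_error <= 1.0:
        raise ValueError("p_error must be between 0 and 1")
    return p_error * consequence


# (probability of the error, hypothetical cost in dollars if it occurs)
errors = {
    "select wrong control": (0.003, 50_000.0),
    "fail to recognize incorrect status": (0.01, 20_000.0),
}

# Rank the errors from highest to lowest expected loss.
ranked = sorted(errors.items(), key=lambda item: risk(*item[1]), reverse=True)

for name, (p, cost) in ranked:
    print(f"{name:36s} | risk = ${risk(p, cost):,.0f}")
```

Here the more frequent but cheaper error carries the larger risk, which is the kind of comparison that can guide where to spend training or redesign effort.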