Lisanne Bainbridge
Department of Psychology, University College London
August 1984
Interest in the ergonomics of process control has increased recently. This has been partly as a result of the Three Mile Island nuclear power plant incident, in which operators made errors which could be attributed to inadequacies in both interface and training (Malone et al, 1980). Also, the recently developed potential for supporting human decision making by computer has led control system designers to ask what form this support should take.
Earlier research focussed on the process operator as a controller (e.g. Edwards & Lees, 1974; Bainbridge, 1981). In large modern processes, the operator is expected mainly to deal with infrequent plant transients such as start up, shut down, and system failure. The operator has to detect and diagnose as well as to recover from system failure, so the emphasis is on their cognitive functions ('cognitive' refers to memory, attention, interpretation and thinking). Recent theoretical work has emphasised diagnosis rather than control (e.g. Rasmussen & Rouse, 1981), although the reason for this emphasis is not clear, as studies of operator behaviour during real incidents (see below) show that the operator may have more difficulty during compensation/recovery/fault management than with diagnosis. It is frequently stated that perceptual-motor control skills are no longer used in process control but in fact this is not the case. The operator does have to use control skills in these transient situations (e.g. Ainsworth & Whitfield, 1983), and has little opportunity for practice.
This paper will cover two main topics : the way operators deal with system failure, and the
implications of these findings for interface design, under the following headings :
- Diagnosis and compensation during process faults.
- Operator aspects of power plant incidents.
- Data required from operator studies.
- Diagnosis and compensation behaviour in a simulated incident.
- General methods of diagnosis by process operators.
- Interface support.
- Form of the operators' knowledge.
- General aspects of interface design and performance prediction.
Diagnosis And Compensation During Process Faults
Operator Aspects Of Power Plant Incidents
The six nuclear power plant incidents commented on here have been reported as follows :
| Pew et al (1981) | Prairie Island | PI | Pressurised water reactor (PWR) |
| | Oyster Creek | OY | Boiling water reactor |
| | North Anna | NA | PWR |
| | Oconee | OC | PWR |
| Woods (1982) | Ginna | G | |
| Malone et al (1980) | TMI-2 | TMI | PWR |
These reports give detailed post-event analyses, made with operators, of what happened during
each incident. Reference numbers below refer to numbered decisions in the above reports.
Operator activities in these incidents
a. In 3 out of 6 of these incidents there were no problems with diagnosing the system failure.
b. In at least two cases the operators considered alternative causes for the plant failure
symptoms, and looked for confirming evidence (PI/1 and G).
c. There were 3 diagnostic failures :
1. At TMI-2 diagnosis took a long time because an indicator implied that a valve was closed
when it was actually open.
2. In OY/2 and OY/3 one operator assumed wrongly that another operator had followed earlier
instructions correctly, and made further decisions on this basis. (It is possible that interface
design led to the initial error).
3. In G/3 the operators accepted inadequate evidence as confirmation of their hypothesis about
the fault, possibly because they were busy.
During compensation for system failure, the operators predicted the effects of alternative
actions in order to choose between them. In two cases they clearly showed good control (NA/2,
OC/4). There were various reasons why the operators' other action decisions were made under
high uncertainty :
1. The action had unpredictable and risky effects (PI/3).
2. The displays gave inadequate information about present system state, either because of poor
interface design (OY/1, G/1B, OY/2, OY/3, OC/3, G/5, TMI/1, TMI/3) or through
instrumentation failure (OC/2).
3. The operator assumed wrongly that another operator had made the correct actions (OY/2 and
OY/3), or that the process automatics had functioned correctly (OC/3, TMI/2).
4. The operators could think through the direction of change in a cause-effect chain, that is
whether variables increase or decrease (for a formal representation of such chains see
Nakamura et al, 1982). However the operators did not have sufficient knowledge of process
dynamics to predict the size and timing of these effects (NA/2, G/2, G/5).
5. The operators could think through a cause-effect chain, but did not have enough knowledge
about the conditions in which some actions should not be used (G/1C1, TMI/1).
6. The operator did not know enough about cause-effect chains in the plant, due to inadequate
training (TMI/1).
7. The reports show that operators did not follow 'procedures' blindly, but thought out the
effects of suggested actions and assessed whether they were appropriate. There were several
occasions when the operators had difficulty in deciding whether the required procedure was the
best action in the circumstances (PI/3, NA/2, G/1A, TMI/4).
8. The operators were sometimes distracted (NA/1, NA/3) and sometimes preoccupied
(G/1C2, G/3).
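The distinction drawn in item 4, between knowing the direction of change along a cause-effect chain and knowing its size and timing, can be illustrated with a minimal sketch. This is not the formal representation of Nakamura et al (1982), and the variable names and link signs below are invented purely for illustration. Each link carries only a sign, so the sketch can answer 'does this variable go up or down ?' but, like the operators described, has nothing to say about how much or how soon.

```python
# Hypothetical signed cause-effect links (assumption, not taken from the incident reports).
# +1 means the effect moves in the same direction as the cause, -1 the opposite direction.
CAUSAL_LINKS = {
    ("feedwater_flow", "steam_generator_level"): +1,
    ("steam_generator_level", "heat_removal"): +1,
    ("heat_removal", "primary_temperature"): -1,
}


def direction_of_change(chain, initial_direction):
    """Propagate a direction (+1 increase, -1 decrease) along a chain of variables.

    Only the sign is tracked: the links hold no gains or time constants,
    so the size and timing of the final effect cannot be predicted.
    """
    direction = initial_direction
    for cause, effect in zip(chain, chain[1:]):
        direction *= CAUSAL_LINKS[(cause, effect)]
    return direction


chain = [
    "feedwater_flow",
    "steam_generator_level",
    "heat_removal",
    "primary_temperature",
]
# An increase at the start of this invented chain gives a decrease at the end.
final_direction = direction_of_change(chain, +1)
```

Predicting size and timing would need a gain and a lag on every link, which is the kind of knowledge of process dynamics that the reports suggest the operators lacked.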
Practical Implications
These findings suggest several practical recommendations :
A. Operator faith in automatic equipment can be misplaced, so it could be a mistake to allocate task functions or train operators on the assumption that automatic equipment is failure proof, unless there is adequate back-up equipment.
B. The operators' control activities during plant transients must be better supported. The findings suggest that operators usually know about the causal chains in the process, but not enough about the dynamics (size and timing) of the process changes. The operators would have less difficulty in evaluating alternative actions if they had less uncertainty about the future development of the present plant state, and the effect of actions on it. Information about causal chains is intellectual, and can be learned from lectures or conversations. Information about dynamics is a 'feel' skill which can best be learned through hands-on experience with a well designed interface. It is important to ensure that operators can identify time lags and rates of change from the interface. It is also interesting to ask whether it is possible for the operators to learn 'feel' skills by using a keyboard and button pressing sequence to indicate the size of the control action they want to make, rather than a control on which effort of movement correlates with size of effect.
C. The studies showed that procedures could be difficult to look up, and ambiguous. Also :
1. The procedures were treated as advice, but the operators could not evaluate alternative
actions without uncertainty and therefore risk, when they knew too little about plant dynamics.
2. The operator assessed whether a procedure did not allow for the particular circumstances
and should therefore be overridden. Process operators are frequently instructed to take this
approach. The operators did not always think that NRC procedures suggested the best action,
but these actions have a regulatory status so the operator was under extra stress in evaluating
them.
3. In each case where the operators questioned the use of a procedure they were concerned that
if they followed it they would lose some important and preferred controlling function, such as
a method of cooling. This suggests that operator training should include experience of
controlling the plant when major control functions are not available, so that they have a better
basis for evaluating whether loss of a function will be crucial.
D. Distraction is known to be a major source of human error (Reason & Mycielska, 1982).
Several ways of combating this can be suggested.
1. Interface design should support considering all eventualities, and give feedback that actions
have been made.
2. Operators should have experience of completing tasks, when they have other competing
tasks of high importance, so that they are aware of distraction as a human limitation.
Data Required From Operator Studies
The above reports all contain detailed information about what the operators thought and did, from which readers can draw their own conclusions about what was going on. At this stage of our understanding, when there are no generally accepted concepts for cognitive tasks, let alone a theory sufficient to account for them, then detailed data is necessary. For example, it is not adequate to study only incident reports, from which one is simply likely to discover the implicit accident theory of the people making out the reports.
In considering support for decision making, the crucial questions are about how the decision
making is done, not just the end result. We need information about the operators'
understanding of the situation and their thinking, not only a record of their actions. (Duncan,
1982, distinguishes 'process' models, models of thinking, from 'product' models, models of
actions.) For example, in Hollnagel's (1981) study of verbal ('think aloud') protocols collected
during simulated process operation, the operator said (S1 at 03.15) : 'so if I run it [the rod
bank] up now, then I have to take in some water to get the rods back in again'. This appears in
the activity summary as :
determine status of system
increase water batch.
The protocol shows that the operator is anticipating the need for an action, he is not looking at
the system state without prior expectations or intentions and then deciding to respond by
increasing the water flow. If only the activity summary had been reported in this study, it
would not be possible for someone else to reach this interpretation.
Pew et al (1981) divide the operators' behaviour into categories of : available information, event signalled, knowledge or belief about system state, intention, expectation, decision/action, source for decision/action, immediate feedback. Woods (1982) used : detect, interpret, control, feedback. Both these reports give a brief summary of the behaviour in the category, so these categories provide a structured precis of events, described with a cognitive emphasis.
If verbal protocols, rather than interviews, have been collected, they should be presented in expanded form. Operators tend to talk cryptically, and say things like 'I must do that because ....', without completing the phrase. The reason must be obtained from an interview, and the meaning of 'that' must be identified, for use by a reader who does not know the plant well. A diagram of the plant and components mentioned is also necessary for a reader not familiar with the industry.
Diagnosis And Compensation Behaviour In A Simulated Incident
The reports on the nuclear plant incidents give rich information about what the operators did. The analyses were however made after the event. The operators' reports may have been influenced by changes in memory for events. Also, in post-event interviews people can give reasons which were not thought about explicitly while doing the task (e.g. Bainbridge, 1981, p.281). Post-event interviews may not therefore give valid information about whether operators reached their conclusions unconsciously or by thinking through at the time. Post-incident interviews may also focus on the strategies used to find the fault, while at the time the operators may have focussed on explaining or anticipating unusual process behaviour.
The above reports must therefore be supplemented by data gathered during diagnosis and compensation, rather than after the event. I know of only one study (on a full scale simulator, using experienced operators who did not know which fault to expect so that the situation was as much like a real incident as possible) in which detailed protocols have been collected during response to the failure, and analysed from a cognitive viewpoint. Page et al (1983) studied a team of 3 commissioning engineers working in a PWR training simulator.
In this 'incident', the first few things to happen were :
1. audible and visual alarms indicated that a pump had failed.
2. the Shift Leader initiated the procedures for stabilising the plant in response to this failure.
3. the Shift Leader asked the Reactor and Turbine Operators to monitor for the possible effect
of the out-of-action pump.
4. he then telephoned the technician on the plant to ask him to find out what was wrong with
the pump.
Even this brief extract shows that 'diagnosis' and 'recovery' are not single processes, but are
general words for several different types of activity.
There are three ways of detecting changes such as non-normal plant conditions :
1. responding to an alarm. To psychologists this is an 'orienting response'.
2. thinking of something which needs to be checked. This is active attention, or hypothesis
testing, and is difficult to distinguish from diagnosis.
3. incidentally noticing that something is wrong while doing something else. In psychology
and artificial intelligence this may be called the operation of a 'demon' (Charniak, 1972).
At this stage in this incident these operators thought that they knew what was wrong with the plant (information that this was not true only appeared later). Their main concern was to maintain system integrity, to prevent the 'disturbance' from becoming large enough to set off the shut-down systems. This is an important reminder. The six real incidents above were analysed because the reactor had tripped and a dangerous possibility had arisen. As the worst failure in a PWR can develop within a few seconds it is of course important to design the safety systems to cope with this, but this should not distract from the fact that in many more everyday failures the operators are not dealing with a situation in which the shut-down safety systems have operated.
The operators in this case knew what part of the plant was not functioning correctly, at the level of the component which caused the abnormal process behaviour, i.e. the pump, but did not investigate in more detail. A technician was given the task of finding out which component within the pump needed to be replaced or repaired.
There can be several phases of stabilisation/ compensation/ recovery, e.g. :
1. The operator tries to keep the process in, or return it to, a stable state. The operator may
know what is wrong and how this affects plant behaviour ('compensation for fault', Rouse,
1982), or the operator may not know what is wrong ('compensation for symptoms').
Compensation for symptoms may be necessary either because the operator is still trying to diagnose
the fault, using in part the information gained by trying to stabilise it, or because the operator does not
have adequate knowledge of plant causality in this fault situation.
2. a technician repairs the faulty component.
3. the operator brings the process back up to its normal operating level.
General Methods Of Diagnosis By Process Operators
Diagnosis is a form of problem solving: the operators have hypotheses about what is wrong, and these hypotheses have to be tested. General models of problem solving can include a first stage of devising a problem solving strategy. Experienced process operators do not appear to do this, which suggests that their general strategies are already developed.
Origin of hypotheses
The models for how operators produce their hypotheses about the reasons for plant failure, which are reviewed by Rouse (1982), are of two basic types. The selection of things to consider further could arise 'unconsciously', that is they may be thought of without any conscious awareness of the mental processes by which the alternatives were suggested, or they may occur as a result of thinking explicitly about the potential alternatives. It is known that the unconscious process can be a highly efficient way of suggesting appropriate behaviour, given experience, and it is an important form of cognitive skill.
Before asking which of these methods is used by experienced process operators, we need a method for identifying them. Any method has to depend on reports by the operators, and 'introspection' has well known difficulties. For present purposes I suggest a 'negative' method of inferring from verbal protocols. We know, from the protocols of individual operators and the conversations of teams, that the operators do explicitly think through the effects of causal chains in the process when comparing alternative actions during compensation. I suggest that if material of this sort does not appear in the protocols, then the operators have not explicitly thought through causal chains to identify possibilities. Of course this is weak evidence about the operators' conscious experience, but it is clearly identifiable.
On this basis, there is unpublished evidence from Page et al's (1983) study that the operators think of the hypotheses to test by unconscious cognitive skill. If this is so, then a major implication is that studies of plant fault diagnosis must be done using skilled operators, as extrapolation from the methods of inexperienced operators may be invalid. The efficiency of this unconscious process will depend on the operators' experience of faults and knowledge of the process plant, so may be incomplete and must be supported by the interface design.
Two further points must be made. One is that in the Page et al example the operators were commissioning engineers, who would be expected to have more experience of dealing with plant failure than the average operator. The second point is about the type of fault training. Training of English operators, at least until 1981, was in the form : they are told what fault has occurred, and trained to work out what the process behaviour will be as a result. This is the reverse of a real fault situation, in which they see the abnormal process behaviour and have to find out what caused it. Unfortunately cognitive processes are not instantly reversible. It might be that one would find explicit reasoning sequences in the generation of hypotheses by operators who had been trained to think about faults the appropriate way round. However German operators receive mixed training, with some faults presented without prior warning, and these operators still appear not to reason explicitly in the symptom-fault direction. (Reasoning from event to effect is used by the operators in the compensation part of their task, when they anticipate the effects of actions in order to choose between them. This is the easier direction to handle as it reduces the combinatorial explosion of possibilities to consider.)
Testing the hypotheses
There are three ways in which the operators could test their hypotheses about what is wrong
with the plant :
a. by checking the interface or the plant for direct information about whether the hypothesised
faulty component is working correctly. There were many examples of this in the 6 real
incidents and the Page et al study.
b. by deliberately making a change to the process which will have one anticipated and useful
effect if the hypothesised component is faulty and another if it is not. There was one example
of this in the 7 analyses which give information about the operators' intentions (OY, time
1421-3). This is evaluative diagnosis (see below). There is no example in these process
operation incidents where the operator 'injects a test signal' into the process just to see what
happens, as maintenance technicians do.
c. Thinking through and evaluating predicted consequences only seems to occur (in these
examples) during compensation, as analysed above. Although both diagnosis and
compensation are problem solving situations, the hypothesis testing stages are essentially
different. During diagnosis the hypotheses are about the state of the external world, which
must be checked directly. During compensation, the hypotheses are about 'good' actions, and
evaluating the 'goodness' of a proposed action consists of mentally thinking of its
consequences and comparing these predictions with known criteria.
Cognitive skill
The above analysis makes use of the notion that operators could think of hypotheses to test either unconsciously/ automatically, or by thinking through causal chains. This is a superficial categorisation of the possibilities, which is convenient for this level of discussion but is inadequate as an account of the nature of cognitive skill. Whether an operator uses automatic skilled behaviour, or thinks out what to do (which can itself be more or less skilled) depends on their experience and on the unpredictability of the environment. A highly skilled operator is more likely to act automatically, but should be able to change freely to 'thinking it out' methods if something unusual happens. Operators should use both interchangeably as required. There is some discussion of this flexibility in Bainbridge (1978).
The account of cognitive behaviour which control engineers may be most familiar with is that given by Rasmussen (e.g. 1983a), which is simply based on this automatic/think through distinction. This simple categorisation does have value. Pew et al (1981) found both Rasmussen's behaviour taxonomy and his pyramid model were useful in explaining to operators what Pew et al were interested in finding out about. The diagrams are useful for giving a basic idea, to people who know nothing about cognitive processes, of the sequence of stages in making a decision, the ways a decision can be made, and the flexibility of the processes. However they do not give an account of cognitive mechanisms which is sufficient for a specialist making design decisions. Two types of way in which Rasmussen's account is incomplete can be illustrated.
One is the interpretation of the words 'skill', 'rule', and 'knowledge'. Rasmussen (e.g. 1983a) suggests three main types of cognitive behaviour, which he calls skill based, rule based, and knowledge based. There are problems with using this taxonomy. For example, when an operator detects abnormal plant behaviour they may immediately think, without conscious deliberation, what faults it could be due to. This immediate thought, in which the person is not aware of the processes mediating between input and response, might be described by a psychologist as cognitive skill. It is conditional, i.e. 'if x then y', behaviour so could be described as a rule or production system. It is also knowledge based, in the sense that it could only be done effectively by someone who knows a great deal about the process.
Inversely, the word 'rule' may be used to describe following a given procedure or 'algorithm', or using a standard method which has developed on the basis of experience, or using a 'heuristic' or 'rule of thumb'. In accounts of 'rule' and 'knowledge' based behaviour it seems to be implicit that an 'if variable value is i then do action j ' sequence is an example of a 'rule', while 'if variable A changes then variable B changes' or 'if component x changes then variable y changes' are 'knowledge'. In an expert system these could all be rules in the knowledge base.
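The point that 'rules' and 'knowledge' can share one representation in an expert system can be made concrete with a small sketch. It is an illustrative toy, not a description of any particular expert system : the Rule class, the fire function and the fact names are all hypothetical. The 'kind' field records the label an analyst might attach to each entry, but the interpreter never looks at it, so all three entries are processed identically.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One 'if condition then conclusion' entry in a toy knowledge base."""
    kind: str                                  # analyst's label only; unused by fire()
    condition: Callable[[Dict[str, str]], bool]
    conclusion: str


# The three example sequences from the text, all stored in the same form.
RULES: List[Rule] = [
    Rule("rule", lambda facts: facts.get("variable") == "i", "do action j"),
    Rule("knowledge", lambda facts: facts.get("variable_A") == "changed",
         "variable B changes"),
    Rule("knowledge", lambda facts: facts.get("component_x") == "changed",
         "variable y changes"),
]


def fire(rules: List[Rule], facts: Dict[str, str]) -> List[str]:
    """Return the conclusions of every entry whose condition holds for the facts."""
    return [rule.conclusion for rule in rules if rule.condition(facts)]


conclusions = fire(RULES, {"variable": "i", "variable_A": "changed"})
# conclusions == ["do action j", "variable B changes"]
```

Because the same mechanism handles the action sequence and the causal relations, the rule/knowledge boundary in this sketch exists only in the labels, which is the difficulty with the taxonomy described above.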
These are not just semantic quibbles : the difficulties arise because more than 3 different types of processing are required of the operator, and it is difficult to find a way of assigning them to only 3 categories which people will agree on. In the taxonomies which have proved most useful in analysing real tasks, the categories have distinguished the function of the behaviour within the operator's thinking (e.g. Pew et al, 1981), rather than making assumptions about mechanism. For example, Pew et al (1981) distinguished : knowledge or belief about the process state, intention, expectation, decision. Whether it is appropriate to use more detailed categories of cognitive behaviour (such as compare, explain, recall) depends on the task being investigated and the purpose of the study, e.g. Umbers (1975), Ainsworth & Whitfield (1983).
The second problem is with the associated model for the organisation of cognitive behaviour, described by Rasmussen as a pyramid. Again people with no knowledge of cognitive processes find this gives them useful insights, but it does not contain mechanisms sufficient to account for complex behaviour.
In Rasmussen's 'pyramid' model (e.g. 1983b) he places 'skilled' behaviour at the base of the
pyramid and 'knowledge' based behaviour at the top. Indeed anyone with an academic
background is taught to consider conscious problem solving as the highest form of mental
activity. However, some of the most important contributions to problem solving come in
'eureka' experiences, when a solution appears without any conscious thinking activity.
Rasmussen shows various routes through his three behaviour types, from the input stimulus to
the output action. All the routes through the model are from stimulus to response. This can be
misleading as it does not include most of the aspects of cognitive behaviour which make human
thinking so powerful, and does not recognise important aspects of its flexibility which must be
supported by interface design. At least the following mechanisms affect the operators'
behaviour relative to current goals and anticipated events :
1. Feedback : Feedback of information obtained as a result of doing the action, so errors can be
identified and parts of the previous behaviour repeated.
2. Recursion : While someone is solving a problem they may come across another problem
which must be solved before they can solve the first one. This embedding of the same
operation within itself is called recursion.
3. Mental simulation and anticipation : the primary use of a 'mental model' of the behaviour of
the outside world is as a basis for preplanning, anticipating events, or for thinking through the
effects of an action to evaluate it before a signal arrives or an action is made.
4. Working memory and multiple goals : All the foregoing types of behaviour are coordinated
by reference to the person's knowledge of the actual and desired state of the external world.
Goals can be adapted to the present possibilities (e.g. Hayes-Roth & Hayes-Roth, 1979).
All these aspects of cognitive behaviour are powerful, and it would be difficult to produce a single diagram which showed how they interact with each other. The complexity of the sequencing processes in cognitive behaviour, and the flexibility of interchange between 'skill' and 'problem solving' types of behaviour are indicated in Bainbridge (1981), which is also too simple an account.
Interface Support
We are beginning to know a little about how operators think when responding to a process fault. What sorts of knowledge does this thinking depend on and build up, and how can this be supported by interface design ? The general term for this knowledge is the operator's 'mental model', but this knowledge is not some sort of simple unit. A recent collection of papers on mental models (Gentner & Stevens, 1983) is of interest to psychological researchers in this area, but the papers use such general terms as 'schema' without giving explicit information about the form of the user's knowledge from which one could make interface design recommendations.
The process operator could have a large number of different types of knowledge, the uses of which are interrelated, and which might be generated/ inferred from each other by more or less obvious mappings. Are there basically different types of knowledge, which are processed in different ways ? What information is used by the operator in diagnosis ? Is the focus of the operator's attention during diagnosis on the process variables or the plant components ? What forms of knowledge do they need, and what are the problems of moving between these ?
Patterns or sentences
The models of diagnosis which Rouse (1982) reviews are concerned either with unconscious skill or with explicit thinking through. They can also be characterised on another dimension : the data they work with are either in the form of images/patterns or of logical predicates/ propositions/ language. Information in pattern or language form may be best represented by different types of display. There is a major debate in psychology about whether there actually are these two distinct forms of mental representation, and whether it is ultimately possible to distinguish between them (for the debate in action see Kosslyn et al, 1979). In this paper the concern is whether a task can be done more easily using one form of display than another. Such a result is usually taken to have implications about the nature of the mental representation used in doing the task, but questions about whether this inference is valid are not of concern here. Presumably anyway, in any complex behaviour all possible forms of mental representation will be used.
Display support. A few examples can be quoted :
1. In laboratory experiments which have compared the performance of people using image and
language forms of display, studies using deductive reasoning tasks (i.e. ones in which the
answer lies within the information given) give inconclusive results (e.g. Mayer, 1976; Polich
& Schwartz, 1974), while there is evidence that a spatial representation is easier to use in more
creative tasks, in which the solution requires an integration and interpretation of the information
given (e.g. Gerwin & Newstead, 1977; Carroll et al, 1980).
2. Thorndyke & Hayes-Roth (1982), in a study of training for spatial knowledge, found that
training in using a map led to better performance in some tests while experience of walking
around the space led to greater ability on others. After considerable experience the difference
between the two forms of training disappeared. This does not mean that the two forms of
representation are equivalent, but suggests that after practice the user develops both forms of
mental representation. Starting with a map, some spatial relations are easier to handle but
knowledge about paths through the space develops with experience, and vice versa.
Propositions about individual parts of the structure can be extracted from a pattern, and a
pattern can be built up by relating individual propositions, so the question in display design is
which form of knowledge needs to be most easily accessible.
In most aspects of process control the operator is concerned with the structure of relations between facts, rather than with isolated pieces of information, so one might infer that visual patterns would be a more effective display format. There are general points as well as experimental findings which support using spatial displays for structured information. Information can often be expressed in pictures more succinctly and using a less specialised professional vocabulary. The problems of transforming predicates about spatial relations disappear. Patterns can contain implicit information, from which propositions can be extracted if necessary. However an important problem with graphic displays is that it is difficult to represent some types of information, e.g. to represent the many types of organisation of cognitive behaviour in a single diagram. An example of the power of visual presentations comes from engineering papers on cognitive processes : the diagrams or tables are called 'models' and are considered more rigorous than verbal passages which discuss the difficulties of a simple account, such as appear in the text accompanying Rasmussen's diagrams.
Process information used in diagnosis
As usual, observation of actual behaviour shows that the theoretical analyses leave out some important aspects. Rasmussen has published the most interesting analyses of the ways in which information is actually used in diagnosis. He has published two different ones and uses the term 'topographic search' in both, which confused me considerably. Rasmussen studied diagnosis by electronic maintenance technicians, and has extrapolated from his findings to make suggestions about diagnosis in process control. Now we have more information about process control diagnosis it is important to ask whether this extrapolation is valid.
One use of the term 'topographic' is concerned with the information used in diagnosing
abnormal process behaviour. Rasmussen (1981) has distinguished between :
1. 'topographic search' : using information about the normal level of the variable being checked
to identify abnormal behaviour.
2. 'symptomatic search' : using specific information about the relation between actual
symptoms and particular faults, so that it is possible to go straight to the component or group
of components which might be faulty.
Symptomatic search does not necessarily require knowledge of the causal chain linking fault
and symptoms. Duncan (1981) has devised ingenious techniques for training operators to
recognise such symptoms. Symptomatic search is very effective, but other methods are also
necessary when dealing with faults which have not been experienced by the operator or
anticipated by the plant designers.
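The distinction between these two uses of information can be illustrated with a minimal sketch. The variable names, normal bands and fault patterns below are invented for illustration and do not come from Rasmussen (1981). For brevity the symptoms are expressed as the deviations found by the first step, which is a simplification.

```python
# Hypothetical normal operating bands for three process variables (invented values).
NORMAL_RANGES = {
    "feedwater_flow": (40.0, 60.0),
    "steam_pressure": (150.0, 170.0),
    "drum_level": (0.4, 0.6),
}

# Hypothetical symptom-to-fault associations, stored without any causal chain.
# Each key is the set of (variable, direction) deviations that make up the symptom.
FAULT_PATTERNS = {
    frozenset({("feedwater_flow", "low"), ("drum_level", "low")}): "feedwater pump trip",
    frozenset({("steam_pressure", "high")}): "turbine governor valve stuck",
}


def topographic_search(readings):
    """Compare each reading with its normal band and return the deviations."""
    deviations = set()
    for name, value in readings.items():
        low, high = NORMAL_RANGES[name]
        if value < low:
            deviations.add((name, "low"))
        elif value > high:
            deviations.add((name, "high"))
    return deviations


def symptomatic_search(deviations):
    """Look the deviations up among known fault patterns; None if unfamiliar."""
    return FAULT_PATTERNS.get(frozenset(deviations))


readings = {"feedwater_flow": 22.0, "steam_pressure": 160.0, "drum_level": 0.3}
deviations = topographic_search(readings)
fault = symptomatic_search(deviations)
unfamiliar = symptomatic_search({("drum_level", "high")})
```

In this sketch the lookup returns nothing for a pattern that is not in the table, which corresponds to the case of a fault that has been neither experienced nor anticipated, where other methods are needed.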
Rasmussen's second analysis concentrates on the information guiding the sequence of tests
made during diagnosis. Rasmussen (1983b; Rasmussen & Jensen, 1974) distinguishes three
types of sequencing :
1. 'topographic search' : the electronic maintenance technicians studied by Rasmussen usually
used the wiring diagrams to indicate the information flow in the equipment they were testing,
and followed this in their sequence of tests, so they did not need to understand the system in
order to check it. Rouse (1982) calls this 'context free search'.
2. 'functional search' : The technicians less frequently used knowledge of the system to test
functionally related sub-units in the equipment. The result of one test (good/bad) led to a
functionally related next test.
3. 'evaluation' : the technicians had richer knowledge, a 'mental model', of the equipment
which enabled them to relate system function and specific behaviour, so they could
immediately go to components.
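The contrast between the first and third types of sequencing can be sketched as follows. This is an illustration only: the signal path, the faulty component and the 'mental model' entry are hypothetical, and functional search is omitted for brevity.

```python
# Hypothetical signal path read off a wiring diagram, in order of information flow.
SIGNAL_PATH = ["power_supply", "oscillator", "amplifier", "filter", "output_stage"]


def topographic_sequence(path, reading_is_good):
    """Test points in diagram order; return (first bad point, number of tests made)."""
    for tests_made, point in enumerate(path, start=1):
        if not reading_is_good(point):
            return point, tests_made
    return None, len(path)


def evaluation(symptom, mental_model):
    """Go straight to the component the mental model associates with the symptom."""
    return mental_model.get(symptom)


# Simulated equipment (assumption): readings are good upstream of the faulty
# component and bad at that component and everywhere downstream of it.
FAULTY = "filter"


def reading_is_good(point):
    return SIGNAL_PATH.index(point) < SIGNAL_PATH.index(FAULTY)


found, tests_made = topographic_sequence(SIGNAL_PATH, reading_is_good)

# Hypothetical richer knowledge relating a specific behaviour to a component.
MENTAL_MODEL = {"distorted output": "filter"}
direct = evaluation("distorted output", MENTAL_MODEL)
```

Here the topographic sequence needs four tests and no knowledge of what the equipment does, whereas evaluation reaches the same component in one step but only if the relevant knowledge is already held.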
In analysing the evidence from process operation it is helpful to make a finer division. The operator could deal with each variable independently (A), or could know which variables change together, the structure in the process. (Letters refer to the following table.) Within such functional groups the operator could test each variable separately (B), or know patterns of variable values which may occur in one of these groups, or use the functional knowledge to think out what might be wrong (E). The specific knowledge about patterns might be either knowledge of the normal patterns of behaviour used to recognise that 'something' is wrong (C), or specific fault patterns (D).
The alternative accounts of the information used in diagnosis can be mapped onto each other as in the following table. (For interest, a recent account of automated maintenance diagnosis is included : Davis, 1983).
| | | Rouse (1982) | Rasmussen (1981) | Rasmussen (1983) | Davis (1983) |
| A | Each variable independently | Context free | topographic | topographic | Test generation |
| | Variables in known groups : | Context specific | - | - | - |
| B | Each variable separately | - | - | functional | Discrepancy detection |
| C | Normal pattern known | - | - | - | - |
| D | Specific fault pattern known | - | symptomatic | - | - |
| E | Thinking through | - | - | evaluation | Models of causal interaction |
Which of these methods are used by process operators ? Rasmussen and Jensen (1974) found that electronic maintenance technicians primarily use topographic search. Technicians work with a variety of equipment and this strategy does not require special knowledge of each. The technicians do not show what would be considered classic problem-solving behaviour. What they do is less efficient in terms of testing procedure but requires less complex mental work. Rouse (1981) has shown that inexperienced technicians practising diagnosis on randomly connected components acquire some general skill which transfers to related real diagnosis situations. In contrast, process operators work for many years with one system. They are not in need of a strategy which reduces the problems of dealing with several different systems each day, so one might expect they will show some form of context specific behaviour. The dangers of extrapolating from Rasmussen's results to process control are illustrated by Rouse's (1982) theoretical suggestions. He suggests that context specific pattern recognition behaviour should be easier than context free sequential search, the opposite of Rasmussen's findings with technicians.
In the Page et al (1983) study, the operators work by recognising that a pattern of process behaviour is not normal (C), as far as can be identified. It is important to note that they did not have the sort of alarm annunciator panel (matrix of alarm lights) which is characteristic of English and US control rooms. Operators using these claim to be able to recognise specific fault patterns, unless too many alarms go off at once.
Process variables or plant components
The second question is whether the operators' patterns of knowledge are primarily in terms of relations between process variables, or between plant components. The maintenance technicians studied by Rasmussen and Jensen (1974) interpreted the results of their tests in terms of acceptability of components. This has led Rasmussen and Lind (1981) to recommend displays for process plant diagnosis which are based on the state of components, or of functional groupings in the plant. Their suggested process representation is a network in which functional components or component groupings are the nodes, and process variables are implicitly represented by branches between the nodes.
However an important difference between technicians and operators is that operators have a primary responsibility for maintaining plant stability, while maintenance technicians usually work on equipment which is not simultaneously in operation. During normal process control the operators focus on acceptability of process variable values rather than the adequacy of the plant components. Baerentsen et al (1983) present the 'mental model' of operators controlling a conventional oil-fired power plant, based on the plant knowledge mentioned by the operators in verbal protocols and interviews. This mental model is a network with the variables as nodes, the emphasis of the representation. The functions relating the variables are represented as branches between the nodes.
This leads one to ask whether operators structure their knowledge of the present plant state, during diagnosis, with the main focus in terms of variables rather than components. The Page et al study (Bainbridge and Reinartz, 1984) shows that those operators thought in terms of explaining why variables were not behaving normally, for the purpose of which they checked whether component states were acceptable. This focus on variables has the advantage that it is usable for the parallel task of compensation. As mentioned above, once the operators have identified which component is not functioning correctly, at the level of component which influences the behaviour of the process variables, further exploration and diagnosis of the components is passed to technicians. There may be a different allocation of responsibility in different countries or in different industries, or even within one industry (de Keyser, 1984). However, as the results of this one example are the inverse of the recommendations made by Rasmussen and colleagues it is important to investigate this further.
Several different display formats could be available to the operators for diagnosis and compensation. However the Page et al study suggests that operators focus on the same aspects of process information in both tasks. Even if this is not the case, it could cause difficulties for the operator to use different displays for tasks between which they interchange as frequently and flexibly as they do during real incidents, unless both displays are available simultaneously.
Descriptive mode
The operator uses at least five different forms of information about the process. Three of these describe overall relations, and two describe events over time.
1. A representation of the cause-effect relations in the process, which focusses on the process
variables, such as a signal-flow-graph.
An SFG shows each variable by a standard symbol, connected by branches labelled to indicate
the dynamics of the connecting function. Standard symbols do not have the same mnemonic
effectiveness as a mimic diagram, but this representation can show the main causal chains more
clearly than the mimic as it can show energy flows and chemical changes explicitly.
2. A mimic diagram of the plant.
This shows the plant structures by symbolic representations (using the word 'symbol' with the
display design rather than semiotic meaning) linked by the major flow paths. This shows the
plant context within which process changes occur. Static mimics focus on plant mechanisms
rather than process behaviour. We have seen that the operators' focus is on variables; this
could explain why static mimics are not much used by experienced operators. Dynamic mimics
can include analogue or digital information about variable values. For mass flows, mimics can
show causal changes and conditions on events, for example by showing the status of pumps or
valves. There is not necessarily a 1:1 mapping between mimic and SFG, and as they show
different aspects of the process the optimum spatial layout for each may be different.
3. The geographical location of displays and controls on the interface, and of parts of the
process on the plant.
In a well designed interface there should be a meaningful mapping between interface and
process. There is no necessary mapping between geographical location of parts of the plant and
their function.
4. A representation of major phases during a process transient such as start-up or shutdown,
extracting the most important causal chains and the most important dynamic changes in each
phase.
This makes explicit information which is implicit in chart recordings, but which takes time and
experience to extract, and is not represented in a form which is easy to think about.
In some task types, sequences of events can be described by a state-transition network, in which states of the task are the nodes, and actions which change the state from one to another are represented on the branches. I do not find such networks helpful in describing most process operations, as the networks require discrete states, and include no mechanism for describing the dynamic changes in process state with which the operators' actions are concerned.
5. A chart recording of changes in the process over time, as a result of step changes in input variables or in manipulated variables. In theory there is a 1:1 mapping between this and the SFG. The SFG makes explicit ('compiled') the overall relationships in the plant, with a description of the connecting functions, from which its behaviour over time could be generated, given information about changes in input variables not under the control of the operator. In practice this prediction is difficult, especially as the connecting functions are often not known sufficiently accurately.
Inversely it should be possible to infer the causal relationships in the plant from the chart recording, which makes explicit the process behaviour over time, but it is only easy to do this when you already know what you are looking for. There are however impressive examples of operators who have discovered a great deal about observable functional relations in the plant, after extensive experience of processes which are not theoretically well understood.
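The relation between the signal-flow-graph and the chart recording can be illustrated by a minimal sketch in which behaviour over time is generated from the graph. Everything specific here is an assumption made for illustration: the variable names, the gains and time constants, the choice of a first-order lag as the connecting function, and the restriction that each variable has at most one incoming branch.

```python
# Hypothetical signal-flow graph: each branch (cause, effect) carries a steady-state
# gain and a time constant in seconds, i.e. a first-order lag as connecting function.
SFG = {
    ("fuel_flow", "furnace_temp"): {"gain": 2.0, "tau": 5.0},
    ("furnace_temp", "steam_temp"): {"gain": 0.5, "tau": 10.0},
}


def simulate(sfg, inputs, steps, dt=1.0):
    """Generate a 'chart recording': one dict of variable deviations per time step.

    Assumes each variable has at most one incoming branch. Input variables are
    held constant at the step values given in `inputs`.
    """
    names = {name for branch in sfg for name in branch}
    state = {name: 0.0 for name in names}
    state.update(inputs)
    chart = [dict(state)]
    for _ in range(steps):
        new_state = dict(state)
        for (cause, effect), f in sfg.items():
            target = f["gain"] * state[cause]
            # Euler step of a first-order lag towards the target value.
            new_state[effect] += (target - state[effect]) * dt / f["tau"]
        state = new_state
        chart.append(dict(state))
    return chart


# Step change of one unit in the manipulated variable, recorded for 200 seconds.
chart = simulate(SFG, {"fuel_flow": 1.0}, steps=200)
final = chart[-1]
```

Going from the graph to the recording is mechanical once the connecting functions are known. The inverse inference, recovering the branches and their dynamics from the recording alone, is not, which is consistent with the remark that it is only easy when one already knows what to look for.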
There are therefore several types of plant description which do not map onto each other, and several types which do contain other information implicitly but to extract it takes knowledge and time. This is not to mention the potential that computer generated displays give for showing data which have been transformed or inferred in some way, e.g. displays for monitoring for off-normal states or for showing temperature distributions. These suggestions also do not cover all the different types of data which are used in cognitive skill, and all this ignores the fact that in many real plants the operators get much of their information from talking to other people.
Several of these display formats contain unchanging information about process structure, and the general properties of its behaviour over time, which do not need to be displayed to operators who are continually refreshing this knowledge by interaction with the process. In the diagnosis context the main problems arise however in unfamiliar situations, where knowledge support is needed.
Levels of knowledge
Within these types of information, the process operators' knowledge could be at several levels
of detail. One of the many interesting things which Rasmussen has focussed discussion on is
the use of computer generated displays which give information about the state of combined
functions of the process or parts of the plant, rather than individual components. Rasmussen
and Lind (1981) propose two hierarchies of combination :
1. 'aggregation' refers to the level of resolution at which parts of the plant are described, for
example a pump or a cooling system.
2. 'abstraction' refers to the type of descriptive mode being used, e.g. a physical component or
a mass-energy flow.
In practice the two types of usage may be correlated, as one type of concept may be more
appropriate for describing a given level of detail. These proposed hierarchies raise the question
of whether operators do actually think using different levels of representation.
Aggregation. The evidence suggests that operators do work at at least two 'levels' of detail. They consider the behaviour of individual process variables, but they also think in terms of aggregates such as cooling. Individual variables are mentioned repeatedly in their reports and conversations. Evidence that they think of aggregates is given by their behaviour in real incidents, for example when they are questioning whether to follow a procedure because it will remove the availability of a cooling function. Other considerations however suggest that it is not possible to identify one 'level' of a given variable or component which is true in all circumstances. Instead the knowledge is in the form of a heterarchy, which appears as a hierarchy in relation to the focus of attention in a particular task. Consider for example the relation between main cooling water pump and reactor in a PWR. The reactor and pump are at different 'levels' from the perspective of production, as the reactor is primarily concerned with energy production while the pump is part of a subsidiary function of maintaining process efficiency. However the pump and reactor are at the same 'level' from the point of view of fault management, as the most effective compensatory action if the pump fails is to reduce heat output from the reactor. This suggests that it might be rash to devise separate displays for the operator showing process information at different levels of aggregation, or at least this should only be done after careful study of the contexts in which a given piece of information might be used.
The evidence quoted from the Page et al (1983) study suggests that operators work mainly with changes in process variables, and the flow or other functions which can affect these. At least in this example they do not go down to the level of combinations of components, within a pump for example. This level of consideration is handed over to the technicians.
Abstraction. Rasmussen and Lind (1981) use this term to refer to the type of 'descriptive
mode'. They relate descriptive modes to a hierarchy, and suggest that the operator is concerned
with at least three levels of abstraction :
- the purpose of the process, the goal : 'why' things are done.
- the nature of the process : 'what' it does.
- the physical properties of the process : 'how' functions are implemented.
Actually it is confusing to use these 'why', 'what', 'how' words in association with a
hierarchy of abstraction, as each of these words can be used within one 'level' of
'abstraction'. Suppose for example that the operator knows that :
increased fuel flow increases temperature.
This information can be used to answer several types of question :
what happens if fuel flow increases ?
why has temperature increased ?
how can temperature be increased ?
why has fuel flow been increased ?
It cannot be used alone at this level to answer the question :
why has temperature been increased ?
which can only be answered at this level by reference to the next item in the causal chain, i.e. to
what is affected when temperature changes.
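The way one item of knowledge answers several types of question can be sketched as follows. The function names are invented for illustration, and the second causal link ('steam pressure') is a hypothetical next item in the chain, not something stated in the source.

```python
# Operator knowledge as (cause, effect) links, each read as
# "an increase in cause increases effect". Only this first link is from the text.
KNOWLEDGE = [("fuel flow", "temperature")]


def what_happens_if_increased(variable, links):
    """'What happens if X increases ?' -> the effects of X."""
    return [effect for cause, effect in links if cause == variable]


def why_has_increased(variable, links):
    """'Why has X increased ?' and 'How can X be increased ?' -> the causes of X."""
    return [cause for cause, effect in links if effect == variable]


def why_has_been_increased(variable, links):
    """'Why has X been increased ?' (deliberately) -> what X is meant to affect."""
    return what_happens_if_increased(variable, links)


effects_of_fuel = what_happens_if_increased("fuel flow", KNOWLEDGE)
causes_of_temp = why_has_increased("temperature", KNOWLEDGE)
purpose_of_fuel = why_has_been_increased("fuel flow", KNOWLEDGE)

# With this one link alone, the purpose of raising temperature cannot be given.
purpose_of_temp = why_has_been_increased("temperature", KNOWLEDGE)

# Hypothetical next item in the causal chain (an assumption for illustration).
EXTENDED = KNOWLEDGE + [("temperature", "steam pressure")]
purpose_of_temp_extended = why_has_been_increased("temperature", EXTENDED)
```

All of the answers are drawn from the same kind of link, so the 'what', 'why' and 'how' questions are distinguished by the direction in which the link is read rather than by a change in level of abstraction.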
Operators do tend to give explanations within this type of description, e.g. Ainsworth and Whitfield (1983) (MA.3 at 0.12) : 'the temperature on this mill has perhaps dropped by 2 degrees, that all, since we started, that's as a result of reducing in the PA [primary air] flow, because we have put very little coal in the mill'. This might not be what an engineer would consider as an explanation, but we are concerned here with the mental models of operators not the mental models of design engineers. However, although an operator gives this level of explanation while under pressure of work, they may be able to give a fuller account in an interview. Cuny (1977) has done extensive interviews to investigate how much operators do understand about their process at the technical rather than the functional level. He found that less than a third of explanations given by 3 experienced operators were at a technical level; the majority of explanations were in terms of empirically observable relationships.
There is an interesting and important question about the level at which operators need to call on specialist advice when responding to plant failure. Within the seven incidents analysed above, there is one example when the Technical Support Center was manned (Woods, 1982, G/5). There was extensive discussion between the TSC and operators about whether use of the safety injection pumps should be continued or terminated. From the summary provided, it appears that both were concerned about the process behaviour, and with establishing the availability of future recovery functions, given insufficient knowledge about both the present state and system response. From the data provided it appears that they differed in the priority which they assigned to the availability of different functions, rather than in their knowledge of the process dynamics or the way they reasoned. (One might add that in retrospect it appears the TSC were incorrect.)
General Aspects Of Interface Design And Performance Prediction
Knowing what the operator needs to know does not ensure a good interface. If this information
is conveyed badly, perhaps using indiscriminable symbols or a layout which does not clearly
map the structure, then this sort of factor, which is apparently a detail, can outweigh the effect
of an excellent task analysis. There are three main aspects to consider. However most of the
issues are not unique to process control so they will not be discussed in detail here.
1. perceptual-motor skills in using the interface.
2. human performance capacity.
3. working memory.
Perceptual-motor skill. This heading refers to the physical and cognitive skills of using the interface, of looking at or reaching to the correct place on the interface and interpreting the information there, rather than to the perceptual-motor skills of controlling the process. It should be trivially easy to find a particular place in a data base, and to interpret the information once it has been found. If the operators have to solve problems in order to obtain the information they need, this will interrupt their thinking about their main task. The above analyses of real nuclear incidents give many examples where the operators' problems were increased by the interface.
Performance capacity. Engineers looking for advice from ergonomists tend to ask for absolute numbers for performance levels. 'We don't want to know about cognitive processes, just tell us what is the human : error rate/ information transmission capacity/ memory capacity/ perception capacity, and we will design the system accordingly.' Unfortunately the task categories used as a basis for asking for such numbers are too simple. Suppose for example one asks for human failure rates in 'deductive reasoning'. The '3-term series' problem (if A is bigger than B and C is smaller than B, which is the biggest ?) is a task used in reasoning studies which supplies a simply described example. Hunter (1957) found (to adapt his results for this purpose) that the number of people who can say which is the biggest item within a given time period depends on the way the task is presented :
| presentation | % people finding solution in time |
| a greater than b, b greater than c | 70 |
| a greater than b, c less than b | 59 |
| b less than a, c less than b | 43 |
| b less than a, b greater than c | 27 (in this case it is easier to say which is the largest) |
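The four presentations in the table are logically equivalent, which a short sketch can confirm. The encoding of premises as tuples and the function name are assumptions made for illustration. The sketch says nothing about how people solve the problem, only that the same answer follows from each presentation.

```python
def largest(premises):
    """Return the item that is never on the smaller side of any premise.

    Each premise is (x, relation, y) with relation 'greater' or 'less',
    read as "x is greater than y" or "x is less than y".
    """
    items = set()
    smaller = set()
    for x, relation, y in premises:
        items.update((x, y))
        smaller.add(y if relation == "greater" else x)
    (answer,) = items - smaller  # exactly one item remains in a 3-term series
    return answer


# The four presentations from the table, in the same order.
PRESENTATIONS = [
    [("a", "greater", "b"), ("b", "greater", "c")],
    [("a", "greater", "b"), ("c", "less", "b")],
    [("b", "less", "a"), ("c", "less", "b")],
    [("b", "less", "a"), ("b", "greater", "c")],
]

answers = [largest(p) for p in PRESENTATIONS]
```

Since every presentation yields the same item, the differences in the percentages reflect the form of presentation rather than the logical content of the task, so a single figure for human failure rate in 'deductive reasoning' would not be meaningful.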
It is important to have data which identify priorities for financial investment to improve operator performance. In many cases, available data on relative performance with different types of equipment can be used. Time pressures tend not to be critical in process control, so engineers are concerned only about interface aspects which will double the time taken, or increase human error rates by an order of magnitude. In office automation, and rapid response tasks such as flying, much smaller differences in performance become critical. This paper concentrates on the content of what should be displayed, rather than the technology of how it should be displayed, so the relevant data will not be reviewed.
Working memory. The operator's knowledge of the current state of the process provides the context for making rapid wise decisions. Knowledge about how this information develops, and what form it takes, are important in design decisions about manual take-over, or the number of VDU pages of information which must be available simultaneously. For example, in modest installations where investment does not justify installing more than one or two VDUs, it might be better to get the computer to drive conventional instruments, so that all the information needed in a decision can be available together.
Interesting recent work by cognitive psychologists supports the notion that working memory develops as a function of thinking about the task, using knowledge available in longer-term memory about what potentially could happen, and that this working memory provides the context which determines optimum future thinking, e.g. Chase and Ericsson (1981), Johnson-Laird (1983). (Confusingly for us, what Johnson-Laird calls a 'mental model' is more akin to the notion of working storage as used in ergonomics). These studies reinforce and expand our understanding of the underlying cognitive processes, but they do not have added implications for design issues so will not be discussed further here.
Obviously there is a daunting amount of research to be done, on the best techniques for task analysis, the best interface formats, and the best mapping between the two. Some recommendations can be based on what is known to cause difficulty in using an interface. The memory load problems caused by having to call up a sequence of VDU pages, carrying information which needs to be cross referenced, argue strongly for having sufficient VDUs to display all the information needed in any one decision, and without having to do information extraction tasks which require complex cognitive processes.
Conclusion
The author would like to thank Susan J. Reinartz (formerly Page) for comments on an earlier
version of this manuscript.
1988 Addendum :
Types of Knowledge and Display Design
Two of my recent papers are relevant to this :
Bainbridge (1987) VDU/VDT interfaces for process control.
Bainbridge (1988) Types of representation.
Charles Brennan's M.Sc. Thesis (1987) compared mimic and signal-flow-graph representations. His results suggest that mimics are better display formats for observers who know about the process being studied, as mimics provide many reminder cues about things to consider. SFGs may be better for inexperienced observers who do not already know the underlying causal structure.
1997 Addendum
Some other interesting papers
Hukki and Norros (1993) Diagnostic orientation in control of disturbance situations. Ergonomics, 35, 1317-1328.
Marshall et al (1981) Panel diagnosis training for major hazard continuous process installations. The Chemical Engineer, 365, 66-69.
Patrick (1993) Cognitive aspects of fault-finding : training and transfer. Le Travail Humain, 56, 185-210.
Shepherd et al (1977) Control panel diagnosis : a comparison of three training methods. Ergonomics, 20, 347-361.
©1997 Lisanne Bainbridge