Reducing errors
Provide explicit cues:
- Cues must be visually salient, just-in-time, and meaningful; cues that are not specific are ineffective, and habituation is a concern
- Selection of visual cues/features has been shown to be automatic, and they are processed whether or not they are informative. Even adding a simple visual cue (e.g. an orange dot) can change how people interact with physical objects, e.g. doors (Wallace and Huffman, 1990)
- External reminders/environmental cues should reduce post-completion errors (PCEs)/omissions by reactivating attention --> frequently used in industrial settings and aviation to reduce human error, by prompting the actor to make sure all steps in the task have been completed. Everyday devices have indicator lights/beeps/alarms to act as reminders
- Chung and Byrne (2008): 2 expts with 81 p's performing the phaser task
- Experimental cue appears when p is required to press the tracking button a second time to disengage the firing system. PCE = p moving on to the “main control step” without disengaging the system first
Expt 1: control (no intervention), cued (cued version of task- orange light cue), mode error (cost version of task- tells you if activated), and combined cued and mode error condition
- no condition sig. reduced PCE rate --> visual cue not reducing number of errors shows that placing a reminder right next to the target can be ineffective.
- Is this a visual error? e.g. didn't pay attention to the cue/forgot its association with the PC action = WM load and speed-accuracy trade off likely contribute
- Cost in the form of a mode error didn't change behaviour on subsequent tasks --> follows findings of Serig (2001) that p's error commission is independent of negative/positive feedback about task performance
Expt 2: further looked at by changing cue appearance and function
- Hollnagel (1993): it is a cue's strength relative to other elements of the task that matters when making it a potential reminder --> based on the observation that when a task is more trivial, attention is more easily diverted = need specific cues that demand actions
- Sutcliffe (1995): most effective visual attributes for attracting attention on interfaces are movement, shape and size, colour (red, green and yellow), brightness, shading and texture, surroundings
- expt altered visual salience (alternating red and yellow), visual specificity (directional shape of cue towards the button), and it was either just-in-time or not (cue appeared with PC step --> gave prior warning by highlighting the mode)
- cue that was visually salient, just-in-time, and meaningful entirely eliminated the error. cues that were not as specific were ineffective
- Limitations: cue effectiveness in expt 2 could be attributed to training procedures --> promoted better recall of task at testing. Better training may account for some of the difference, but PCE frequencies in the control were similar in both expts
- Implications: interfaces must be designed to both reduce the frequency of human error and mitigate their effects, particularly in safety critical domains. This work was an initial step to extend our understanding of how visual cues may be used effectively to improve performance in interactive tasks.
- Design changes e.g. ATM
often impossible/too expensive to force individuals to do the post-completion action (as with ATMs)
- Alternative = add a cue to remind user of some action e.g. bright warning label, green light on ATM card slot
Perceptual interference effect (PIE) = phenomenon of disfluency leading to better information processing (perceptual errors)
- Introducing disfluency into perceived information (e.g. obscuring text or numbers with a different typeface or colour) leads to more effortful and deeper processing, and to better memory encoding and retrieval (Kahneman, 2008).
- Soboczenski et al. (2013) disfluency in the presentation of a number can decrease errors in both the transcription of sentences and in number-entry tasks --> something as simple as making the characters harder to read by presenting them in light grey rather than standard black font = sig. reduction in the number of errors without adversely affecting the speed of either task.
= perceptual difficulties trigger deeper processing --> reducing errors
- need to apply to safety critical situations e.g. programming syringe pumps in healthcare settings, where info programming is a matter of life or death
- Soboczenski et al. (2015): used auditory distractors alongside the PIE to investigate the effect on number-entry errors. Auditory distractors are important because high-pressure safety jobs are often conducted in noisy, distracting environments = does the PIE still hold?
--> the number of errors is sig. reduced by PIE and the rate of making errors is reduced = PIE improves number-entry accuracy, even in the presence of auditory distractors.
- Only modest effect size and only marginally sig. but bearing in mind the healthcare context where number-entry is a widespread and safety-critical task, even small effects of this sort can be important
Some numbers are more familiar/meaningful than others (encoding errors):
- Wiseman (2014): familiar numbers are represented more strongly than non-familiar numbers in memory = familiar numbers are sig faster to transcribe, opens up possibility of more errors
- log analysis of hospital devices shows that there are clear patterns in the numbers used = medical workers are likely to be more familiar with some numbers than others
- outlined a no. of heuristics to improve the design of number entry interfaces, e.g. entering a decimal point should require the user to request that functionality, to prevent accidental slips (big impact because it changes magnitude, and the point is so small it might be visually missed) = decimal point button should be harder to reach. Numbers 1,000, 100 and 50 should be accessible in one keypress (much like .com appears on phones when entering a URL)
- tested these, found adapting number entry interfaces to fit the specific task they are used for could help reduce the opportunities for making errors by reducing the number of keypresses needed to enter data (didn't reduce errors)
- specifically for the medical domain: by requiring p's to enter all three numbers (usually two of the required numbers can be used to calculate the third), a natural checksum could be used to ensure that there were no errors during the number entry task --> by using this method, number entry error rates could be sig. reduced (no errors went unnoticed)
- BUT, came at a cost to time to complete task (3 no. instead of 2) --> can improve design to reduce error rates, but it's important to figure out how to balance increased accuracy with decreased speed
Checking: can stop errors from happening, but can also make it more likely that people will detect errors
- Wiseman et al. (2013) looked at 2 possible interface designs to help users detect number entry errors using idea of checksum (an additional, redundant, number that is related to the to-be-entered numbers in such a way that it is sufficient to verify the correctness of the checksum, as opposed to checking each of the entered numbers)
- 1st interface- users check their own work with the help of the checksum (2 number interface)
- 2nd interface - users enter the checksum along with the other numbers so that the system can do the checking (3 number interface)
- For both cases 2 numbers needed to be entered, while the third number served as a checksum.
- For 1, users caught only 36% of their errors (not surprising, without a forcing function why would people check? it wastes time), for 2 all errors were caught, but the need to enter the checksum increased entry time by 46%.
- When p's were allowed to choose between the two interfaces, they chose the second interface in only 12% of the cases.
- Limitation = p's not penalised in a meaningful way for entering inaccurate info --> in situations where the results of incorrect number entry are more costly, users may pay more attention to the information provided by the checksum in the 2-number interface, or be more willing to use the slower 3-number interface.
- p's asked to enter many sets of numbers --> might not generalise to most workplaces e.g. when infusion pumps are programmed on the hospital ward, only one set of numbers for a single prescription is entered at a single time.
- Although these results cannot be generalized to other specific contexts, the results illustrate the strengths and weaknesses of each way of using checksums to catch number entry errors.
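The 3-number (system-checked) interface described above can be sketched in a few lines. This is only a minimal illustration, not the interface Wiseman et al. built: the relation volume = rate x time, the function name `check_entry`, the units and the tolerance are all assumptions chosen to show how a redundant third number lets the system catch a slip.

```python
import math


def check_entry(rate_ml_per_h: float, time_h: float, volume_ml: float,
                rel_tol: float = 1e-6) -> bool:
    """Return True when the three entered numbers are mutually consistent.

    Assumed relation (illustrative only): volume = rate * time.
    The third number is redundant, so it acts as a checksum on the other two.
    """
    return math.isclose(rate_ml_per_h * time_h, volume_ml, rel_tol=rel_tol)


# Correct entry: 50 ml/h for 4 h gives 200 ml.
correct_entry_ok = check_entry(50, 4, 200)

# Slip: an extra zero typed in the rate (500 instead of 50).
slip_entry_ok = check_entry(500, 4, 200)

print(correct_entry_ok, slip_entry_ok)
```

The slip is rejected because the three values no longer agree, which is the sense in which the system "does the checking"; the cost is the extra number the user has to type.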
Motor movement errors: Oladimeji et al. (2011)
- 2 main styles of number entry interfaces found on medical devices: serial interfaces like the ubiquitous 12-key numeric keypad, and incremental interfaces that use a knob or a pair of keys to increase or decrease numbers.
- experiment investigated the effect of interface design on error detection in number entry --> the incremental interface produces more accurate inputs than the serial interface (also slower to input), and the magnitude of errors suggests that the incremental interface could reduce the death rate relative to the numeric keypad.
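One way to see why error magnitude can differ between the two styles is a toy comparison of a single slip on each. This is a hedged sketch, not Oladimeji et al.'s analysis: the slip types (one extra key on a keypad, one extra press on an up/down control), the step size of 0.1 and both function names are assumptions made for illustration.

```python
def serial_slip(intended_keys: str, pos: int, extra_key: str) -> float:
    """Keypad slip (assumed): one extra key is inserted at position pos."""
    typed = intended_keys[:pos] + extra_key + intended_keys[pos:]
    return float(typed)


def incremental_slip(intended: float, step: float, extra_presses: int = 1) -> float:
    """Incremental slip (assumed): the 'up' key is pressed too many times."""
    return intended + step * extra_presses


intended = 5.0

# Serial: "5" becomes "50" after an accidental extra "0".
serial_value = serial_slip("5", 1, "0")

# Incremental: one press too many, with an assumed step of 0.1.
incremental_value = incremental_slip(intended, step=0.1)

serial_ratio = serial_value / intended            # tenfold overdose
incremental_ratio = incremental_value / intended  # about 2% over

print(serial_ratio, incremental_ratio)
```

Under these assumptions a single keypad slip changes the value by an order of magnitude, while a single incremental slip changes it by one step.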
Task lockouts encourage checking:
- O’Hara and Payne (1999): preventing p's from starting a task for a short period improved performance on a problem-solving activity --> being locked out for 7s rather than 3s meant that p's planned their actions more carefully = superior performance.
Explained: as long as extra planning costs are expected to be offset by reductions in execution costs, people will exert effort on planning
- Lockouts have been used successfully in safety- critical settings (Green et al., 2015)
- Brumby et al. (2013) experimented with introducing lockouts immediately after interruptions. They found that lockouts reduced the rate of errors made on resuming a task after an interruption. Additionally, post-interruption resumptions were faster if they followed a lockout.
- Previous work provides little guidance on, for example, how long a lockout must be in order to be effective. Perhaps more importantly, prior work gives us little indication of whether lockouts are effective in practice or simply encourage people to attend to other tasks while they are locked out
- Gould et al. (2016) examine whether lockouts induce switching behavior and whether such behavior is deleterious to checking performance.
- 1st lab expt, investigates relationship between lockout duration (0s, 3s, 6s) and error detection in a routine number-entry task.
- longer lockouts yield improved error detection
- 2nd expt. = an online crowdsourced experiment
- found that in less controlled environments the effectiveness of lockouts in encouraging checking is compromised by their tendency to induce people to switch to other activities.
BUT, as in expt. 2, lockout duration has to be chosen carefully: overly long durations encourage switching
- presence of competing tasks might mean that people can derive more utility from a lockout period by switching to something else
- One of the factors that is likely to influence peoples’ propensity to switch during lockouts is the cost of making switches (Borst et al., 2013)
- will switch when the perceived utility of switching exceeds the perceived costs of switching
- when lockouts are very short, the costs of switching are proportionally higher: more of a lockout period is lost on switching costs.
= expect people to be more inclined to switch to other activities as lockouts lengthen.
- Reasons to be cautious about applying lockouts:
1) longer lockouts increase p's feelings of frustration --> negative affective states reduce accuracy in number-entry tasks (Cairns et al., 2014)
2) practical issues --> cumulative time lost to alarms on medical devices already costs hospitals large sums in lost time (Lee et al., 2012). Introducing even very short lockouts may have large impacts when aggregated over an entire healthcare system.
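The cost/utility argument about switching during lockouts can be made concrete with a toy model. This is a sketch under stated assumptions, not a model from Gould et al. or Borst et al.: the fixed switch cost of 2.5s, the minimum useful time of 1s and all function names are hypothetical.

```python
def usable_time(lockout_s: float, switch_cost_s: float) -> float:
    """Time left for the other activity after paying the switch cost (assumed fixed)."""
    return max(0.0, lockout_s - switch_cost_s)


def cost_fraction(lockout_s: float, switch_cost_s: float) -> float:
    """Share of the lockout period lost to switching; 1.0 if nothing is left."""
    if lockout_s <= 0:
        return 1.0
    return min(1.0, switch_cost_s / lockout_s)


def will_switch(lockout_s: float, switch_cost_s: float,
                min_useful_s: float = 1.0) -> bool:
    """Switch only if enough usable time remains to be worth it (assumed rule)."""
    return usable_time(lockout_s, switch_cost_s) >= min_useful_s


SWITCH_COST_S = 2.5  # hypothetical cost of switching away and back

for lockout in (3.0, 6.0):
    print(lockout, cost_fraction(lockout, SWITCH_COST_S),
          will_switch(lockout, SWITCH_COST_S))
```

With these made-up numbers, most of a 3s lockout is eaten by the switch cost so switching is not worthwhile, whereas a 6s lockout leaves enough time to make switching attractive, matching the expectation that longer lockouts invite more switching.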
Types of errors
- Slips (errors at skill level) = the action is appropriate, but is carried out incorrectly
-more difficult to avoid --> you can't train people out of making slips and more knowledge/info won't help as they know the correct action
- Mistakes (errors at higher levels) = Action is carried out correctly, but it is inappropriate
- Knowledge-based: faulty conceptual knowledge, incomplete knowledge, biases and faulty heuristics, incorrect selection of knowledge, information overload.
- Rule-based: Misapplication of good rules. Encoding deficiencies in rules. Action deficiencies in rules. Dissociation between knowledge and rules.
- Solve by giving people knowledge
Categorisation of action slips: Norman (1981)
- looked at self-reports of slip errors using diary study method --> kept a record of all the slip errors that he made throughout the day and attempted to classify slip errors according to their cause
- Problems with relying on self-report --> no reliable estimates of frequency of error and this can't determine error cause
- His 7 stages of action theory can help explain these slips (see diagram) --> assumes action sequences are controlled by sensorimotor knowledge structures (schemas). Successfully completing an action depends on activation and selection of the appropriate schemas. Uses a triggering mechanism: appropriate conditions must be satisfied for a schema to operate
- there are stages of execution (intention --> action sequence --> executed on physical world), and stages of evaluation (evaluate success of action --> perception of world --> interpret according to expectations --> compare with goal)
Theory permits several opportunities for slips - error in selection of intention or errors in the specification of the components --> identifies 3 main categories of slips:
- 1) errors in the formation of intention
- mode errors (errors in classifying the situation) --> an action appropriate for one situation is performed, but that is not the current situation
- description errors (errors resulting from ambiguous/not specific intentions) e.g. putting the lid of the sugar container on the coffee cup
- 2) faulty activation of schemas
- capture errors --> familiar habit instead of intended action sequence e.g. example on next slide
- data-driven --> environment forces intrusion on intention activation e.g. Stroop task
- associative activations --> strong association between schemas (not similarity) leads to inappropriate ones being activated
- loss of activation --> certain schemas lose activation due to normal decay/interference properties of primary memory e.g. forget task goal
- 3) Faulty triggering
- spoonerisms- components of words are interchanged
- blends- when unsure which of two actions to perform can merge 2 schemas e.g. indecision between use of words close and shut = response clut
Lab studies of human error --> slips happen in routine, procedural tasks so important to train p's, and in lab can increase control and reliability of measures
- lots of focus placed on the post-completion step --> final step in a procedure, completed after the achievement of the main goal
- post-completion error = people forget this stage because they've achieved their goal (e.g. getting the original paper after making a photocopy, leaving card in ATM because have got the money)
Why errors occur
- hierarchical task analysis --> routine procedures can typically be decomposed into a series of subtasks and when completing a task, the user must keep track of where they are in the procedure (i.e. memory for goals)
- control codes are goals in working memory that store place-keeping information about task progress (Altmann & Trafton, 2002)
- memory for goals (i.e. like with interruptions) --> declarative memory representations are defined by activation, which declines over time and may fall below retrieval threshold
- = predicts that working memory load should mediate likelihood of error
Byrne and Bovair (1997):
- With larger WM capacity the goal can be actively rehearsed = error less likely.
- With increased WM load the main goal will decay more rapidly, so higher chance of error because the post-completion step (PCS) is less likely to reach activation threshold.
- Computational theory suggests PCE due to goal loss from WM --> goals in WM supply activation to their sub-goals, so sub-goals only receive activation as long as their parent goal is active. When goal eliminated from WM by being satisfied, its sub-goals lose activation supply from the satisfied goal. When WM load is high, the loss of support from the parent goal may lead to the loss of unsatisfied sub-goals
- e.g. for photocopier, higher WM load assumed to be associated with faster decay/displacement of information from WM, so whether or not the sub-goal of removing the original from the copier reaches threshold depends on WM load.
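The threshold idea in these notes (the PC sub-goal is only carried out if its activation is still above threshold, and WM load speeds up decay) can be sketched as a toy calculation. This is not the CAPS model or its equations: exponential decay, the way load scales the decay rate, and every parameter value and function name below are assumptions for illustration.

```python
import math


def activation(initial: float, decay_rate: float, elapsed_s: float) -> float:
    """Assumed exponential decay of a sub-goal's activation over time."""
    return initial * math.exp(-decay_rate * elapsed_s)


def pc_step_remembered(wm_load: float, elapsed_s: float = 5.0,
                       initial: float = 1.0, base_decay: float = 0.1,
                       threshold: float = 0.3) -> bool:
    """True if the PC sub-goal is still above the retrieval threshold.

    wm_load is a hypothetical non-negative scalar; higher load is assumed
    to speed up decay once the parent goal has been satisfied.
    """
    decay_rate = base_decay * (1.0 + wm_load)
    return activation(initial, decay_rate, elapsed_s) >= threshold


low_load_ok = pc_step_remembered(wm_load=0.0)   # activation ~0.61, above threshold
high_load_ok = pc_step_remembered(wm_load=2.0)  # activation ~0.22, below threshold

print(low_load_ok, high_load_ok)
```

With low load the sub-goal of removing the original from the copier survives until it is needed; with high load it drops below threshold and a PCE results.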
Expt 1: phaser task and deflector task, each with 2 versions - control and PC versions
- Control and PC versions had same steps until the end of the task where the PC version has the main goal satisfied before the task is completed
- Phaser task - complex procedure involving several sub-goals and a time-consuming tracking subtask. Final step = turning off tracking. In the PC version this step occurs after the main goal of the procedure has been satisfied.
- Deflector task – simpler (fewer sub-goals and steps). In the PC version p's can fulfill the main goal before satisfying the ‘reset switching’ sub-goal.
- Found PCE can be generated in lab, phaser task is useful paradigm (some degree of ecological validity --> unusual distractions and/or fatigue were not necessary to get p's to produce errors). Errors in general (including PCE) not generated by deflector task.
- PCE are somehow different from procedural errors (i.e. errors in the execution of an otherwise routine procedure)
- Task complexity (link to WM) may be key variable in producing PCE --> naturalistic tasks where PCE appear (e.g. photocopiers and ATMs) are not particularly complex but can become complex if there is an external memory load present
Expt 2: extend Expt 1 through the addition of WM capacity assessment and an external memory load condition.
- did phaser task as well as another task (transporter) with a PC step
- WM load manipulated: no load = task done as in expt 1. Load = did concurrent, unrelated task as well
- WM capacity measured: sentence digit and operation span tests
- Findings support a Collaborative Activation-based Production System (CAPS) model --> high capacity and low memory load should produce fewer PCEs e.g. under conditions in which WM capacity was not taxed (high-capacity ps doing phaser task without external load) PCE were extremely rare
- General principle: When enough capacity was available, the model did not make the error, but with less available capacity it did
- Issue of learning in these situ's relatively unexplored --> evidence for learning, but not clear whether people can learn to completely stop making PCE
- Limitations: CAPS can't explain all errors, and doesn't include a learning mechanism even though learning reduced PCE
HCI is the study of interfaces between people and computers --> people make a lot of errors e.g. failure to check inputs, making slips when typing
- these can even be life threatening e.g. deaths caused by badly developed interface (e.g. accidentally giving chemo medication over 24 mins not 24 hrs)
- in the aviation domain alone an analysis of the flight recorders revealed that 70–80% of accidents are based on human error or based on a chain of failures in relation to the human factor (Martins et al., 2013)
- need to aid in the development of systems that protect from/limit errors --> e.g. contactless max spend of £30 protects the user when they fail to check the amount/there's an input error
- Errors also tell us about normal behaviour
- Kohn et al. (2000) stated that in the USA more people die from medical errors than from road accidents, breast cancer or AIDS. While errors in healthcare are prevalent, the majority of errors are not caused by people acting carelessly but by erroneous systems and procedures which lead people to commit errors