Critical incident reporting and observational studies have identified nontechnical skills that are vital to successful anesthesia crisis management. Examples of such skills include task management, team working, situation awareness, and decision making. These skills are not necessarily acquired through clinical experience and may need to be specifically taught. This study uses a high-fidelity patient simulator to assess the effect of repeated exposure to simulated anesthesia crises on the nontechnical skills of anesthesia residents.
After institutional research board approval and informed consent, 20 anesthesia residents were recruited. Each resident was randomized to participate as the primary anesthesiologist in the management of three different simulated anesthesia crises using a high-fidelity patient simulator. After each session, videotaped footage was used to facilitate debriefing of their nontechnical skills. The videotapes were later reviewed by two expert blinded independent assessors who rated each resident's nontechnical skills by using a previously validated and reliable marking system.
A significant improvement in the nontechnical skills of residents was demonstrated from their first to second session and from their first to third session (both P < 0.005). However, from their second to third session, no significant improvement was observed. Interrater reliability between assessors was modest (single rater intraclass correlation = 0.53).
A single exposure to anesthesia crises using a high-fidelity patient simulator can improve the nontechnical skills of anesthesia residents. However, an additional simulation session may confer little or no additional benefit.
Traditional anesthesia teaching has placed significant emphasis on knowledge acquisition and the mastering of technical skills. However, critical incident reporting and observational studies, both in the clinical setting and on patient simulators, have identified nontechnical skills to be major determinants of successful anesthesia crisis management.1,2 Nontechnical skills are those that do not relate to medical knowledge or technical procedures but instead encompass cognitive skills (e.g., decision making, situation awareness) and interpersonal skills (e.g., exchanging information, assertiveness).2 These qualities are not necessarily acquired by anesthesia trainees through routine clinical experience and may need to be specifically taught.3
Despite worldwide adoption of patient simulation in anesthesiology, there remains a lack of valid and reliable simulation performance assessment tools.4 Although most of the literature has focused on assessment of knowledge and technical skills during anesthesia simulation, research on nontechnical skills has become a recent area of interest.2,5–7 A comprehensive and reliable nontechnical skills assessment tool called the Anaesthetists' Non-Technical Skills (ANTS) system has recently been developed.5
The hierarchical ANTS scoring system consists at the highest level of four basic skill categories, namely task management, team working, situation awareness, and decision making. These skill categories are further divided up into 15 skill elements. Each element is anchored for rating with examples of behaviors indicating good and poor practice.
Although studies have addressed the issue of performance improvement with repeated simulation exposure, all of these have focused on knowledge and technical skills ability.8–11 Debriefing inclusive of videotape review was used between simulation sessions in two of these studies.9,10
The purpose of this study was to prospectively investigate the effects of repeated simulation of anesthesia crisis management and videotape-aided debriefing on the nontechnical skills ability of anesthesia residents using the ANTS scoring system.
Materials and Methods
Recruitment and Orientation Phases
After institutional research board (St. Michael's Hospital, University of Toronto, Toronto, Ontario, Canada) approval, anesthesia residents in postgraduate years 2 and 4 from within the University of Toronto training program were invited to participate as study subjects. Informed consent was obtained. There were no exclusion criteria. Residents were free to decline to participate. In addition to informed consent, confidentiality agreements were signed to ensure that information pertaining to the simulation scenarios would not be disseminated.
Before the simulation sessions, an orientation session was held for all subjects. During an initial 1-h didactic period, crisis evolution, patient simulation, and anesthesia crisis resource management (ACRM) principles1,3 were discussed. Although many of the behaviors that the ANTS system addresses were discussed, there was no specific mention of the ANTS scoring system itself. Subjects then participated in hands-on familiarization with the Laerdal SimMan® simulator mannequin and monitors (Laerdal Medical Canada Ltd., Toronto, Ontario, Canada), the Datex® anesthesia machine (Datex Corporation, St. Laurent, Quebec, Canada), and the mock operating room environment. This orientation session did not include practice at crisis management during an actual scenario.
Interventions Phase
Subjects attended their simulator sessions in groups of three. Each session consisted of three different scenarios. For each scenario, one subject played the role of the primary anesthesiologist, and another subject remained in an adjacent room, available as a secondary anesthesiologist if help was requested. Simulation center personnel and one of the two principal study investigators functioned as perioperative personnel in each scenario in the scripted roles of surgeon and nurse. The third subject observed the scenario in a passive role.
Each scenario consisted of a verbal handover from a principal investigator to the primary anesthesiologist that provided pertinent information such as patient history, investigations, and anesthesia and operative progress to date. A mock anesthesia record sheet containing most of this information was also provided. This principal investigator then left the operating room and directed the prescripted scenario from the control room with help from a simulator technician. The entire simulation was videotaped. A graphical display of the patient's vital signs throughout was overlaid onto the videotaped footage.
During the simulation, the primary anesthesiologist was able to call for help at any time from the secondary anesthesiologist in an adjacent room. The secondary anesthesiologist was previously instructed to be a semiactive participant (i.e., perform tasks only if instructed, and not offer crisis management advice or differential diagnoses). The surgeon and nurse roles were played according to a script, and they were available to perform tasks only if instructed. The scenario concluded either with resolution of the crisis or at the discretion of the principal investigator in the control room.
Immediately after the scenario, all subjects received a videotape-assisted debriefing, guided by ACRM training principles.1,3 The critique of each performance focused predominantly on nontechnical skills.
Subjects rotated through the three scenarios, taking turns at being the primary anesthesiologist, the secondary anesthesiologist, and the passive observer. A debriefing occurred after each of the three scenarios.
The subjects were kept in the same group for the duration of the study. A month later, the same group was brought back to the simulation center to participate in their second simulation session, which consisted of three different scenarios, again with debriefing after each scenario. A further month later, the group participated in their third simulation session, involving another three different scenarios. During the second and third sessions, debriefing of the performance of a given subject was not specifically targeted to areas of weakness identified during previous sessions.
Nine different anesthesia crisis scenarios were used. Although each subject participated in some capacity in all nine scenarios, each was the primary anesthesiologist in only three of them. These three scenarios, each separated in time by 1 month, formed the basis for the repeated performance assessments. In addition, the order in which subjects participated as the primary anesthesiologist was rotated over the three simulation sessions.
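The rotation described above can be illustrated with a short sketch. The study does not report the exact rotation scheme, so the cyclic shift below, the function name `rotate_roles`, and the subject labels are all hypothetical; the sketch only shows one way that each member of a group of three could be the primary anesthesiologist once per session and in a different scenario position in each session.

```python
ROLES = ("primary", "secondary", "observer")


def rotate_roles(subjects, n_sessions=3):
    """Return schedule[session][slot] -> {role: subject}.

    Assumed scheme (not taken from the study): a cyclic shift, so the
    subject who is primary in the first scenario of one session is
    primary in a different scenario position in the next session.
    """
    n = len(subjects)
    if n != len(ROLES):
        raise ValueError("this sketch assumes groups of exactly three")
    schedule = []
    for session in range(n_sessions):
        session_plan = []
        for slot in range(n):
            # Shift the starting subject by both slot and session index.
            assignment = {
                ROLES[r]: subjects[(slot + session + r) % n] for r in range(n)
            }
            session_plan.append(assignment)
        schedule.append(session_plan)
    return schedule


if __name__ == "__main__":
    # Hypothetical subject labels for one group of three residents.
    for s, plan in enumerate(rotate_roles(["A", "B", "C"]), start=1):
        for slot, roles in enumerate(plan, start=1):
            print(f"session {s}, scenario {slot}: {roles}")
```

Under this assumed scheme, every subject is primary exactly once per session, and the position of that scenario within the session changes from one session to the next.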
The scenarios used were selected from the institution's existing ACRM teaching program and included latex anaphylaxis, massive fat embolism, blocked endotracheal tube, concealed massive hemorrhage, difficult airway in a burn victim, severe intracranial hypertension, local anesthesia toxicity, malignant hyperthermia, and pipeline oxygen failure. Each scenario had a predefined sequence of when and how the crisis situation evolved. The responses to predicted therapeutic interventions were also standardized as much as possible.
Assessment Phase
Two staff anesthesiologists with expertise in simulation and ACRM principles were recruited and trained by the principal investigators to be assessors using the ANTS scoring system.5†† The assessors were not familiar with any of the subjects.
Initial assessor training consisted of providing them with the background ANTS literature2,5 and the User Manual.†† They then underwent 4 h of training using the ANTS system to independently rate the videotaped performance of residents managing simulated anesthesia crises. These videotapes documented performances by residents not involved in this study working through the scenarios used for this study. The assessors were free to use the videotape rewind function at any time. After the assessment of each videotape, ANTS scores were compared, and use of the system was discussed. Although no formal attempt was made to calibrate the assessors, scores that diverged widely were further discussed.
The ANTS system is hierarchical and consists of the four skill categories of task management, team working, situation awareness, and decision making (table 1). Task management, for example, is defined as “skills for organizing resources and required activities to achieve goals, be they individual case plans or longer term scheduling issues.”
Each skill category is further divided into a number of skill elements (table 1). Each skill element then has a number of different example behaviors for good and poor performance. For example, in the skill element of identifying and utilizing resources, an example of good performance is “allocates tasks to appropriate members of team,” whereas one of poor performance is “overloads team members with tasks.”
The behaviors observed were rated at both the category and element levels. The ANTS scoring system uses a four-point scale to describe the performance of the nontechnical skills observed (table 2). During the initial training phase, feedback by the assessors suggested this four-point scale did not provide enough scope to rate many of the observed skills. Therefore, the scale was modified to include the utilization of half points, thus turning it into a seven-point scale, i.e., 1, 1.5, 2, 2.5, 3, 3.5, and 4. This limitation in the ANTS has been previously observed.12
At the conclusion of the interventions phase of the study, all study videotapes were forwarded to the assessors. During the rating process, the assessors were blinded as to whether a subject was performing as the primary anesthesiologist in their first, second, or third session. They viewed and rated all videotapes independently and in random order.
Statistical Analysis
Statistical analysis was performed using SigmaStat 2.03 (SPSS Incorporated, Chicago, IL). Borrowing from the psychological field, effect sizes of greater than 1.0 SD are acceptable in assessing teaching interventions.13 With 20 subjects using a two-tailed α of 0.0125, after a Bonferroni correction for four primary outcomes, we had 94% power to detect an effect size of 1.0 SD between the first and third simulator sessions.
The primary outcome measures used were the ANTS scores given by the assessors for the four skill categories. These category scores were analyzed parametrically using repeated-measures analysis of variance. Analyzing global rating scales parametrically, when possible, as continuous data has become convention in the educational literature because it is more powerful than nonparametric analysis.13 A two-tailed P value of less than 0.0125 was considered statistically significant, after a Bonferroni correction for four independent primary outcomes. Significant results were then analyzed using a Tukey post hoc analysis.
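As a rough illustration of the arithmetic behind this paragraph, the following sketch derives the Bonferroni-corrected α and approximates the power of a paired comparison. The method behind the reported 94% is not stated, so the normal approximation, the assumed family-wise α of 0.05, and the function names here are assumptions. A normal approximation is expected to give a figure somewhat above a t-based calculation and therefore above the reported value.

```python
from math import sqrt
from statistics import NormalDist


def bonferroni_alpha(family_alpha, n_outcomes):
    """Per-outcome alpha after a Bonferroni correction."""
    return family_alpha / n_outcomes


def approx_paired_power(n_subjects, effect_size_sd, alpha_two_tailed):
    """Normal-approximation power for a two-tailed paired comparison.

    effect_size_sd is the mean within-subject difference divided by the
    SD of the differences (an assumption about how the 1.0 SD effect
    size is defined).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha_two_tailed / 2)
    shift = effect_size_sd * sqrt(n_subjects)
    # Probability of exceeding either critical value under the alternative.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


alpha = bonferroni_alpha(0.05, 4)          # four primary outcomes
power = approx_paired_power(20, 1.0, alpha)
print(f"per-outcome alpha = {alpha:.4f}, approximate power = {power:.3f}")
```

The same function can be called with a smaller effect size or sample to see how quickly power falls off, which is relevant to the later comparison of second- and fourth-year residents.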
The secondary outcome measures used were the ANTS scores given by the assessors for the 15 skill elements. These element scores were also compared using repeated-measures analysis of variance for parametric data and chi-square analysis for nonparametric data. A two-tailed P value of less than 0.05 was considered statistically significant for multiple secondary outcomes. Significant results were then analyzed using a Tukey post hoc analysis.
Interrater reliability for the two ANTS assessors was evaluated using intraclass correlation over the range of data, with a two-tailed P value of less than 0.05. Interrater reliability was measured at both the category and element levels.
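For readers unfamiliar with the statistic, the sketch below computes a single-rater intraclass correlation from a two-way analysis-of-variance decomposition. The specific form of the intraclass correlation used in the study is not stated. The two-way random-effects, absolute-agreement form shown here is an assumption, and the scores are hypothetical half-point ratings, not study data.

```python
def icc_single_rater(ratings):
    """Single-rater ICC (two-way random effects, absolute agreement).

    ratings[i][j] is the score given to performance i by rater j.
    """
    n = len(ratings)        # rated performances
    k = len(ratings[0])     # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for performances, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical category scores from two assessors on the 1-4 half-point scale.
hypothetical_scores = [
    [2.0, 2.5], [3.0, 3.0], [1.5, 2.5],
    [3.5, 3.0], [2.5, 2.0], [4.0, 3.5],
]
print(f"single-rater ICC = {icc_single_rater(hypothetical_scores):.2f}")
```

Because this form penalizes systematic differences between raters as well as random disagreement, it tends to be more conservative than a consistency-only form.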
Results
Demographics
Twenty-seven subjects were approached to take part in this study. Twenty subjects completed three scenarios as the primary anesthesiologist and had adequate videotaped footage available for subsequent analysis. Of the 20 subjects who formed the basis for this study, there was an even distribution of 10 second-year and 10 fourth-year residents. A greater number of male subjects participated in the study, reflective of the demographics of our anesthesia training program (table 3).
Primary Outcome Measures: Category Scores
The ANTS results from the first sessions are most representative of preintervention control scores, because they reflect the preexisting skill levels of the residents before any additional training. The scores from the second session correspond to the additional skills acquired from the previous session's training. The scores from the third session should correspond to the additional skills obtained from the second session's training (table 3).
For each of the four skill categories of task management, team working, situation awareness, and decision making, there was significant improvement in the mean scores of subjects between their first and second sessions (all P < 0.005) and their first and third sessions (all P < 0.005; fig. 1). These results represent the effect of the intervention of a single simulation session with debriefing. No significant differences were seen in the mean category scores between their second and third sessions (all P = not significant; fig. 1), representing the effect of the additional intervention of a further simulation session with debriefing 1 month later.
Secondary Outcome Measures: Element Scores
At the element level, for all 15 of the nontechnical skill elements, there were significant improvements in the mean scores of subjects between their first and second sessions (all P < 0.05) and their first and third sessions (all P < 0.05; figs. 2–5). No significant differences were seen in the mean element scores between their second and third sessions (all P = not significant; figs. 2–5).
Interrater Reliability
At the category level, across the four categories, interrater reliability overall was acceptable (single rater intraclass correlation = 0.53; P < 0.001). At the element level, across the 15 elements, interrater reliability was modest (single rater intraclass correlation = 0.50; P < 0.001).
Second- versus Fourth-year Residents
Mean category scores for the first, second, and third sessions tended toward higher scores for fourth-year residents as compared with second-year residents, but this was not statistically significant (table 3).
Discussion
Anesthesia education using patient simulation modeled on ACRM-type courses involving scenario-based teaching with debriefing has become widespread among anesthesia residency training programs. Demonstrating the benefit of this type of simulation-based education has been problematic.
The results from our study suggest that a single simulation session improves the nontechnical skills of residents. An additional simulation session 1 month later seems to confer little or no additional benefit.
However, before removing additional simulation sessions from a curriculum, some additional points should be considered. First, we did not observe a ceiling effect in the evaluation of nontechnical skills, because residents did not achieve the maximum score by the third session. This suggests opportunities exist for further improvement in these skills. Second, studies in simulation have not yet examined the optimal interval between training sessions to achieve and maintain proficiency in ACRM. Currently, many centers conduct successive ACRM training sessions over a period of years (typically one full-day course per year), using modules incorporating more ACRM-related concepts and more complex scenarios and involving more of the anesthesia subspecialties. It is possible that the short interval between our simulation sessions was inadequate to show ongoing improvement in nontechnical skills.
There were several design and methodologic limitations to this study. Our study lacked a control group without serial exposure to simulation and debriefing. Therefore, we cannot determine whether the improvements in nontechnical skills were due to repeated exposure to a simulation environment, to debriefing, or to both. Studies of skills improvement through simulation must attempt to control for familiarity of the simulation environment, so that observed improvements in performance are not entirely attributed to greater experience with the test modality.14 An attempt was made to control for this by introducing an orientation session before formal commencement of the study.
Our scenarios were subjectively judged to be equally difficult by investigators with expertise in simulation, recognizing that creating scenarios of equal complexity is challenging. Hence, some of our scenarios may have advantaged more senior trainees with greater medical knowledge and clinical experience. However, the randomization of scenarios should have minimized this potential bias.
In our study, the secondary anesthesiologist was instructed to perform tasks only if instructed to, and not to assume the leadership role of the primary anesthesiologist. This modification from clinical practice was incorporated so that the scenarios involving relatively passive and less verbal primary anesthesiologists were not taken over by secondary anesthesiologists of a more vocal and aggressive disposition. The study design aimed to investigate an individual's serial performance as a team leader rather than the overall team performance within a mutually cooperative environment.
The design of this study necessitated that subjects attend in groups of three during each simulation session. Although within each session each subject was the primary anesthesiologist in only one scenario, he or she was involved in some capacity for all three of the scenarios and took part in three debriefings. It is probable that subjects not only learned nontechnical skills from the scenario and debriefing in which they were the primary anesthesiologist, but they also learned passively from their participation in the other scenarios. We attempted to control for the effect of passive learning by rotating the order in which subjects participated as the primary anesthesiologist over the three simulation sessions.
Although we were not able to demonstrate any statistically significant difference in the mean category scores between second- and fourth-year residents, the results trend toward superior nontechnical skills performance in fourth-year residents. This trend was seen in all categories and elements and in all three of the simulation sessions. This is not surprising, because almost all of the fourth-year residents had a single previous remote simulator experience. The greater nontechnical ability of the senior residents may have occurred through a combination of an increased familiarity with ACRM and the simulation environment, and greater clinical experience. This study was not powered to demonstrate statistically significant differences between junior and senior residents. A larger study may have allowed us to demonstrate statistically significant differences in performance between the groups of residents.
Anesthesia crisis management should ideally combine cognitive and interpersonal skills with medical knowledge and procedural skills. The two domains are interdependent. However, the significance of this interdependence is somewhat variable. Gaba et al.6‡‡ examined the relation between technical and behavioral (nontechnical) performance and showed a general trend of correlation, but with some outlying values. Although their study looked more at team rather than individual performance, most groups showed that the levels of technical and behavioral performance tended to match. One pattern of outliers comprised groups that worked poorly as a team and thus had low behavioral scores, but had good technical scores that resulted from the individual efforts of only a few members. A single group that had good behavioral and team processes but a lack of collective knowledge, such that a poor technical score resulted, represented the other outlier pattern. Weller et al.7 also demonstrated good correlation between behavior and knowledge using a simple global rating scale. Difficulty therefore arises in attempting to devise a performance assessment tool, such as the ANTS system, that exclusively measures nontechnical skills ability.
Validity and reliability must be present before an evaluative tool, such as the ANTS system, becomes widely adopted. Initial evaluative studies with the ANTS system have suggested that it is a reliable and usable measure of nontechnical skills ability in the simulator environment and fulfils some aspects of validity.5 Construct validity refers to the extent to which a test reflects the concept that is being tested, and it is verified if the test results are in keeping with expectation. The results of this study imply that the ANTS scoring system has construct validity. The expectation was that repeated simulation, debriefing, and nontechnical skills teaching would result in demonstration of improved nontechnical skills ability. This was confirmed by the statistically significant improvement in mean category and element scores from the first to the second and from the first to the third sessions.
With regard to reliability, previous studies have shown that the variability between raters when assessing nontechnical skills is greater than when assessing technical skills.6,7 Although the ANTS authors found satisfactory interrater reliability, they mentioned that it was not ideal.5 In this study, despite the principal investigators and assessors having limited familiarity with the ANTS tool, the interrater reliability was modest and acceptable.
However, the ANTS performance assessment tool does have limitations. Although it is used predominantly for assessment of nontechnical skills, some of the categories and elements are inherently linked to medical knowledge and expertise. For example, the element of providing and maintaining standards is defined as “supporting safety and quality by adhering to accepted principles of anesthesia; following where possible, codes of good practice, treatment protocols or guidelines, and mental checklists.” Moreover, how a subject is rated in the element of using authority and assertiveness is influenced by the appropriateness of their diagnosis and management strategy. In our study, we acknowledge that the evaluation of nontechnical skills in certain elements may have been influenced by the subject's medical knowledge.
Some additional criticisms of the ANTS system have been raised in a recent review.12 The ANTS system does not differentiate between those nontechnical skills needed for different scenarios, because it assumes that these skills are completely generic and context free. It also makes no distinction between required nontechnical skills in a given clinical setting and the generic set of nontechnical skills.
Ideally, improved skills in the simulator would translate into improved ability in real clinical situations, thus validating ACRM simulation training. However, evidence for this is lacking because of difficulties in creating and using valid and reliable performance measures in the clinical setting. Moreover, assessing crisis management performance in the clinical setting is difficult, given that crises are rare and unpredictable.
Subjective changes in real-life anesthesia practice after simulator training may provide a surrogate measure of its benefit. A recent survey showed that after ACRM training, participants perceived a long-term change in practice that included improved communication, leadership, ability to work collaboratively with colleagues, and improved problem-solving strategies.15
Despite the current paucity of evidence demonstrating the benefit of ACRM-type simulation training, much of the anesthesia community seems to have embraced it, judging by the ever-growing list of simulation sites worldwide. Subjectively, the impression from both teachers and participants is that simulation-based education is very useful. We believe that we have demonstrated in this study that ACRM-type simulation-based education is beneficial and can significantly improve the nontechnical skills ability of residents. These results add to the existing evidence and expert opinion that simulation-based education should be incorporated into all anesthesia curricula.
The authors thank the second- and fourth-year anesthesiology residents in the University of Toronto (Toronto, Ontario, Canada) training program for their participation in this study.