In medicine, the evidence on the effectiveness of checklists and MECA is mixed: whereas prior research in the perioperative setting showed a highly significant reduction of missed critical process steps (6% when checklists were available vs. 23% when they were unavailable), the implementation of a pediatric sedation safety checklist failed to show a significant reduction in sedation-related adverse events. Additionally, there is conflicting evidence for the effectiveness of checklists in improving perioperative outcomes in some populations. In intensive and emergency care medicine, the implementation of a multidisciplinary safety checklist during bedside bronchoscopy-guided percutaneous tracheostomy was independently associated with a 580% reduction in adverse procedural events, and the implementation of a preintubation checklist for ED intubation of trauma patients was associated with a 7.7% absolute risk reduction. In contrast, implementation of a multifaceted quality improvement intervention with daily checklists, goal setting, and clinician prompting did not reduce in-hospital mortality among critically ill patients treated in ICUs in Brazil. Since data on risk reduction using checklists in intensive care and emergency medicine still seem to be very limited, it is difficult to classify our results precisely. How the observed effect should be classified depends on the intention-to-treat population: whether MECA were not applied, were only partially applied, or, in the worst case, the wrong MECA was chosen (which can be expected to lead to worse results than in the control group) is a crucial consideration in the evaluation of the system. A per-protocol analysis may help to further break down the observed effect.
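To make the distinction between absolute and relative risk reduction concrete, the following minimal sketch applies both definitions to the perioperative figures quoted above (6% vs. 23% missed critical steps). The function name `risk_reduction` is an illustrative assumption and is not taken from any of the cited studies.

```python
def risk_reduction(control_risk: float, intervention_risk: float) -> tuple[float, float]:
    """Return (absolute, relative) risk reduction for two event proportions.

    Illustrative helper only; both arguments are proportions between 0 and 1.
    """
    if not 0 < control_risk <= 1:
        raise ValueError("control_risk must be in (0, 1]")
    if not 0 <= intervention_risk <= 1:
        raise ValueError("intervention_risk must be in [0, 1]")
    absolute = control_risk - intervention_risk   # difference in proportions
    relative = absolute / control_risk            # fraction of the control risk removed
    return absolute, relative


# Missed critical steps: 23% without checklists vs. 6% with checklists
arr, rrr = risk_reduction(control_risk=0.23, intervention_risk=0.06)
print(f"Absolute risk reduction: {arr:.1%}")
print(f"Relative risk reduction: {rrr:.1%}")
```

The absolute reduction is the difference in percentage points, whereas the relative reduction expresses that difference as a share of the control risk; the two can differ considerably for the same data.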
Additionally, checklists may be designed as diagnostic (Scriven schema) or problem-solving (Higgins and Boorman schema) tools. However, as the text-based algorithms used in this study do not contain any obvious actions or criteria to “check off”, we chose the term “medical emergency cognitive aid” (MECA) instead of checklist to be transparent toward readers. In our simulations, MECA were subjectively judged as helpful by 94% of the participants. Overall, our results suggest that in a high-fidelity simulation, MECA use by residents of different specialties led to a relevant risk reduction in the management of medical emergencies. Teams performed significantly better when MECA were available, even though the MECA used were previously unknown to the study participants. Where performance was insufficient despite MECA being available, the teams had simply not used the checklists to the extent provided. Interestingly, the use of a wrong MECA led to even worse performance than not using MECA at all, which resulted from a consecutive fault after misinterpretation of the underlying cardiac rhythm. Our study participants, mostly 2nd- or 3rd-year residents, were quite equally distributed in terms of age, gender, and overall work experience. “Perioperative” residents had significantly more prior experience in ED medicine and pre-hospital trauma life support (PHTLS), which is in line with previous data from Germany. Participants’ perceptions of MECA revealed a high level of acceptance across all specialties, with a single exception regarding potential stress reduction, which was significantly more pronounced in “medical” than in “perioperative” residents. The underlying cause remains largely uncertain, and we can only speculate on the reasons. Since the introduction of the WHO surgical checklist in 2008, checklists have become an indispensable part of daily surgical and anesthesiologic routine.
Thus, “perioperative” residents may have been more experienced in the use of checklists or MECA in general. Additionally, “perioperative” residents had significantly more preexisting emergency medicine experience, which could also explain why the stress-reducing effect in this group was not as pronounced as in the group of “medical” residents.
Participants with prior experience using checklists who were not randomized to the MECA group also rated them mostly favorably.
Although our results are promising and inspiring concerning a broader introduction of MECA in hospitals, their use can also be problematic when applied to clinical problems that require nonlinear responses, and there may be a risk of therapeutic misalignment [5,6,7,8] with the delivery of excess or inappropriate interventions, as in sepsis. To minimize this risk, we deliberately selected established guidelines that can easily be described by checklists, algorithms, MECA, or protocols [16,17,18,19,20]. If teamwork-training initiatives are combined with the implementation of MECA, this may confound the results. In our study, MECA, although available, were not used at all in 34% of cases. In the per-protocol analysis, MECA were used in 63 of the 120 scenarios. We can only speculate why this was the case despite the broad fundamental agreement in the survey conducted afterwards. The approximately 50% usage in the intervention group is hard to reconcile with the 94% of the 501 respondents who subjectively judged the checklists to be useful. We cannot exclude a bias in this context. One has to bear in mind that self-reported perceptions are weaker than other, more objective forms of evidence on the impact and efficacy of MECA. However, we anticipate that training the teams in their individual roles before the emergency and, most importantly, training them in the handling of the MECA would have improved MECA utilization and may have led to a further reduction in failed critical work steps, so that our results potentially underestimate the effect. In fact, the only information our participants received before the start of the scenario was that a total of 10 MECA would be available. In order not to distort the results, the specific content of the MECA was not discussed in advance.
In our analysis of the performance of the two groups, we excluded teams who chose to use checklists but selected the wrong ones during the simulation scenarios. This omission of data carried a slight chance of inadvertently inflating the performance ratings of the intervention group. However, a recalculation showed only negligible differences. One might discuss how far the “critical steps” selection process contributes to the construct validity of the ratings based on the scores generated. We aimed to provide a scoring system that is as transparent and comprehensible as possible and therefore chose a binary technique, as this method seemed most appropriate. Additional limiting factors for MECA usage are lack of time and high stress levels in an emergency. This might explain why most groups used the MECA only briefly at the beginning, whereas other groups did not use MECA until they were stuck. Only 10 groups systematically worked through the MECA from beginning to end, resulting in > 90% execution of correct work steps.
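The effect of such exclusions can be illustrated with a minimal sketch that contrasts an intention-to-treat grouping (all teams with MECA available) with a per-protocol grouping (only teams that used the correct MECA). The `Team` record, its field names, and the four example teams are hypothetical and serve only to show the mechanics; they are not study data.

```python
from dataclasses import dataclass


@dataclass
class Team:
    assigned_meca: bool  # randomized to the group with MECA available
    usage: str           # hypothetical labels: "none", "correct", or "wrong"
    steps_done: int      # critical steps performed (binary scoring, summed)
    steps_total: int     # critical steps required in the scenario


def missed_rate(teams: list[Team]) -> float:
    """Proportion of critical steps missed, pooled over the given teams."""
    total = sum(t.steps_total for t in teams)
    if total == 0:
        raise ValueError("no critical steps to evaluate")
    return 1 - sum(t.steps_done for t in teams) / total


def intention_to_treat(teams: list[Team]) -> list[Team]:
    # Everyone randomized to MECA, regardless of whether or how MECA were used
    return [t for t in teams if t.assigned_meca]


def per_protocol(teams: list[Team]) -> list[Team]:
    # Only teams that actually used the correct MECA
    return [t for t in teams if t.assigned_meca and t.usage == "correct"]


# Hypothetical example records, NOT results from the study
teams = [
    Team(True, "correct", 9, 10),
    Team(True, "none", 6, 10),
    Team(True, "wrong", 4, 10),
    Team(False, "none", 7, 10),
]

print(f"ITT missed rate:          {missed_rate(intention_to_treat(teams)):.1%}")
print(f"Per-protocol missed rate: {missed_rate(per_protocol(teams)):.1%}")
```

In this toy example, dropping the teams that used no MECA or the wrong MECA makes the intervention look better than the intention-to-treat figure, which is why a recalculation with and without the excluded teams is informative.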
Simulator-based studies nearly always lack real patients. However, high-fidelity simulation has become an accepted part of medical training and evaluation. Other fields, such as aviation, show that simulation is an established and efficient means of testing and assessing (a) the value of safety protocols and (b) the possible consequences of deviating from such protocols, without the often-deleterious results seen in real life. For medical simulation, high agreement rates with findings in real cases have been reported [21, 22]. Simulation also helps to make rare events, like the emergencies used in this study, trainable and makes it possible to investigate topics that, for a variety of practical and ethical reasons, would be very difficult to investigate in real cases. Further strengths include the number of participants and the identical conditions for all teams, which would have been impossible to achieve outside of a simulated setting. Video analysis and the high number of checkpoints enabled us to perform a detailed analysis. We believe that trainees benefit from using MECA; however, with numbers as low as they are, this cannot be stated with certainty. Although our findings may not yet be generalizable, our study contributes to the literature and paves the way for further investigations with additional data and analysis.