Quality indicators and scientific evidence
QIs aim to measure quality. The common definition of quality, by the United States Institute of Medicine, is 'the degree to which health services for individuals and populations increase the likelihood of desired health outcomes and are consistent with current professional knowledge.' Thus, any QI should be related to a certain level of the desired health outcomes. In principle, anyone may regard a given outcome as 'desired' and devise QIs accordingly. Nevertheless, there is little doubt that health-care quality ultimately aims at influencing mortality and/or morbidity. Indeed, these two outcomes are themselves the most widely used QIs, under the category of 'outcome indicators' in the classic classification by Donabedian. However, the effects of quality on mortality may be difficult to measure because of a low signal-to-noise ratio. To overcome this problem, it has been suggested to measure the processes of care (through the so-called 'process indicators') instead of the outcomes. However, in order to improve quality, the processes measured by such QIs should 'increase the likelihood of desired health outcomes.' Therefore, the link with survival, sent 'out through the door', comes back 'in through the window'. The link is usually provided by research and represents the evidence underpinning the QI itself. For instance, a good level of evidence (i.e. a survival benefit) was first established by scientific research for administering beta-blockers in the emergency room to patients with myocardial infarction. Subsequently, a QI measuring the actual adherence to this practice was widely adopted [20, 21]. Further attempts to validate this QI by proving its link with the outcome (i.e. comparing patient survival in hospitals with high adherence vs. hospitals with low adherence) may be desirable, but are not indispensable.
Quality indicators in trauma care
Trauma care, as compared to other branches of medicine, suffers from a paucity of evidence as a result of the underfunding of research. In addition, some characteristics of trauma care itself, such as multidisciplinarity, logistic complexity, and emergency, make the collection of information especially difficult and also contribute to the insufficient evidence. Therefore, a majority of the processes of care are not supported by evidence. Consequently, the corresponding QIs are not supported either. The ensuing attempts to validate these QIs (i.e. the assessment of their relationship with the patients' outcomes, usually mortality) are not substantially different from the scientific research into the processes themselves, and are hindered by the same difficulties. Thus, such attempts are often unsuccessful [1, 23–25]. For example, it is reasonable for an Emergency Medical System to evaluate its quality through the rate of prehospital intubation of head-injured patients with GCS <9. However, a researcher seeking to validate this indicator against survival (the 'golden' outcome) would face the same uncertainties that affect research into intubation itself.
Hence, it is not by chance that the most used and accepted QI in trauma care is a straightforward outcome measure, i.e. the benchmarked risk-adjusted mortality. The main advantage of this QI is that it does not need validation. Its main disadvantage is that it does not refer to specific processes of care. As a consequence, when it comes to identifying and targeting the causes of mortality differences, those responsible for quality are still in search of valid process indicators.
The 'off-hour' effect as a quality indicator
Comparing mortality in 'after time' with mortality in 'business time' shows whether the quality of care is the same across the time periods being compared. This analysis is meaningful and of practical interest, as the possible deficiencies in trauma care during after-hours are widely recognized. Such deficiencies are caused by the differential availability of staff, facilities, resources and procedures, by fatigue or sleepiness of the personnel and by increased logistic difficulties in pre-hospital rescue (e.g. flight restrictions for helicopters at night).
Like the benchmarked risk-adjusted mortality, the investigation of the 'off-hour' effect would enjoy the important benefit of being an outcome indicator. Therefore, this indicator would not require validation against the outcome. At the same time, unlike the benchmarked risk-adjusted mortality, it does not measure the quality of care as a whole, but just a portion of it. Therefore, it could act as a process indicator as well, and help identify the processes that should be targeted to improve the quality of care. For example, this QI might drive interventions to increase the staffing of hospitals during weekends or to launch a night-flight HEMS program. Moreover, re-calculating the QI at a later time could assess the efficacy of these interventions. On the other hand, the absence of the 'off-hour' effect could be a sort of quality mark for hospitals or systems, whose specific processes of care could then become models for others to copy.
Another advantage is that this QI can be calculated at the local level (trauma center, trauma system or geographical region) without complex benchmarking against data from other settings, a procedure that may be biased if the data are inhomogeneous. The evaluation of the 'off-hour' effect is an internal comparison, as the compared groups come from the same setting. Thus, unaccounted differences (e.g. systematic between-hospital differences in severity score assignment) are less probable. Conceptually, this resembles the difference between the case-control and the case-crossover study designs. In the former design, cases and controls are different subjects, while in the latter design cases and controls are the same subjects, observed at different times. Consequently, some sources of potential confounding, i.e. those related to the fixed characteristics of the unit of analysis, do not change within the matched pairs and are controlled for by the design.
However, some caution is necessary. All the factors influencing survival at the patient level (age, mechanism of injury, injury severity etc.) should be carefully accounted for and adjusted for, because systematic differences may still occur. For instance, patients admitted in the off hours are known to be younger, plausibly because the young tend to go out at night. In addition, penetrating trauma occurs more often, probably because more violence takes place at night. Finally, injury severity may be worse because traffic accidents are also more severe at night. For all these reasons, a crude, unadjusted comparison of mortality would not be reasonable. Thus, a risk-adjusted model would be required for a proper application of this QI.
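The difference between a crude and an adjusted comparison can be illustrated with a minimal sketch of an observed-to-expected (O/E) mortality ratio computed separately for each period. This is only one possible form of risk adjustment and is not a method prescribed here. The sketch assumes that each patient record already carries a predicted probability of death from some previously fitted risk model; the field names `period`, `died` and `p_death`, the function name and the records themselves are hypothetical and made up for illustration.

```python
from collections import defaultdict


def observed_expected_by_period(patients):
    """Return {period: (observed deaths, expected deaths, O/E ratio)}.

    Expected deaths are the sum of the per-patient predicted probabilities
    of death, which are assumed to come from an external risk model.
    """
    observed = defaultdict(int)
    expected = defaultdict(float)
    for p in patients:
        observed[p["period"]] += 1 if p["died"] else 0
        expected[p["period"]] += p["p_death"]

    result = {}
    for period, exp in expected.items():
        # Guard against an empty or zero-risk group.
        ratio = observed[period] / exp if exp > 0 else float("nan")
        result[period] = (observed[period], exp, ratio)
    return result


# Made-up records, for illustration only.
patients = [
    {"period": "business", "died": False, "p_death": 0.10},
    {"period": "business", "died": True, "p_death": 0.40},
    {"period": "off", "died": True, "p_death": 0.50},
    {"period": "off", "died": False, "p_death": 0.50},
]

summary = observed_expected_by_period(patients)
for period, (obs, exp, ratio) in summary.items():
    print(period, obs, round(exp, 2), round(ratio, 2))
```

In these made-up data the crude mortality is identical in the two periods (one death out of two patients), yet the O/E ratios differ because the predicted risk of the two groups differs. In a real application the predicted probabilities would come from a validated trauma risk model, and the uncertainty of each ratio would have to be estimated as well.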
Another important caveat is that the aspect of quality measured by this indicator is relative, not absolute. In other words, the absence of the 'off-hour' effect is always desirable, but not sufficient. Although this is uncommon, a system with the same mortality in working and off hours could still have an elevated overall mortality, which would be a matter of concern. Therefore, this QI should not be considered as an alternative to the benchmarked risk-adjusted mortality, but only as a complement to it.
This QI would remain meaningful when applied at any level (e.g. one or more hospitals, one trauma system or one geographical area). However, it could capture the full picture of possible differences between parts of the day or week only if all the hospitals to which a patient could be brought were included. The processes of care that bring patients from the trauma scene to the definitive hospital are crucial. Further, these processes are also likely to be affected by the time of day. They would be fully reflected only if the indicator were applied at the population level, i.e. to a trauma system or geographical area. Suppose, for instance, that only some hospitals within a system (usually the referral centers) are considered and that the patients transferred from another facility are excluded. In that case, a possible increase in mortality caused by a malfunctioning of the referral system in after hours could go undetected. For the same reason, the choice of the variable used to classify patients (time of injury, time of arrival at the first hospital or time of arrival at the definitive hospital) could also influence the results.
Finally, the detailed definition of the working time should not be fixed but variable. It should depend on the characteristics of the setting being analyzed. For instance, the resources available on a Saturday morning may resemble those of business time in some hospitals/systems and those of aftertime in others. Thus, the QI should be adapted accordingly.
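A setting-dependent definition of working time can be expressed as a small, configurable classifier. The following is a minimal sketch under stated assumptions: the default schedule (Monday to Friday, 08:00 to 18:00), the function name `is_off_hours` and its parameters are hypothetical choices for illustration, not definitions taken from any guideline. The timestamp passed in may be any of the classification variables discussed above (time of injury, time of arrival at the first hospital or time of arrival at the definitive hospital).

```python
from datetime import datetime, time


def is_off_hours(moment, start=time(8, 0), end=time(18, 0),
                 business_weekdays=frozenset({0, 1, 2, 3, 4})):
    """Return True if `moment` falls outside the locally defined business time.

    `business_weekdays` uses datetime.weekday() numbering
    (Monday is 0, Sunday is 6). All defaults are illustrative assumptions.
    """
    if moment.weekday() not in business_weekdays:
        return True
    # Business time is the half-open interval [start, end) on business days.
    return not (start <= moment.time() < end)


# 6 January 2024 was a Saturday.
saturday_morning = datetime(2024, 1, 6, 10, 30)

# Setting A: Saturday is treated as off hours (default schedule).
label_setting_a = is_off_hours(saturday_morning)

# Setting B: Saturday resources resemble business time, so Saturday is included.
label_setting_b = is_off_hours(saturday_morning,
                               business_weekdays=frozenset(range(6)))
```

The same Saturday-morning admission is classified as off hours in the first setting and as business time in the second, which is why the schedule should be chosen to match the resources actually available in the setting being analyzed.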
The feasibility of an indicator is an important aspect, because 'measures based on data that are difficult to obtain must be extremely valuable or they will result in misspent resources'. In this respect, measuring trauma mortality inside and outside working hours appears feasible, as the necessary data are already part of the core set recommended by the Utstein Template (30-day mortality, time of first emergency call or time of hospital arrival, and predictive model variables) [31, 32].
As mentioned previously, the literature investigating the 'off-hour' effect is inconsistent and divided more or less equally between positive and negative findings. Curiously, all the studies focusing on trauma yielded negative results (no difference), whereas the opposite occurred for studies focusing on myocardial infarction. This is surprising, as the two conditions share many features: time-dependency, early mortality and the importance of early and centralized care. A majority of the studies on trauma were conducted in Level 1 Trauma Centers. These studies used the time of arrival at the hospital to classify the patients and excluded patients transferred between hospitals. This could have lowered the chances of finding a difference, as explained above. The other explanation is that the quality of trauma care in those studies was simply good enough to protect against the 'weekend effect'. This appears reasonable given that Level 1 Trauma Centers have immediate access to a full trauma team at all times, while interventional cardiologists are rarely in-house during off-hours.