Which score should be used for posttraumatic multiple organ failure? - Comparison of the MODS, Denver- and SOFA- Scores

Background Multiple organ dysfunction and multiple organ failure (MOF) remain a major complication and challenge in the treatment of severely injured patients. The reported incidence varies considerably between current studies, which complicates the comparison of risk factors, treatment recommendations and patients’ outcome. Therefore, we analysed how the currently used scoring systems, the MODS, Denver and SOFA scores, influence the definition of MOF and compared the scores’ predictive ability. Methods From datasets of severely injured patients (ISS ≥ 16, age ≥ 16 years) staying more than 48 h on the ICU, each of the three scores was calculated. The scores’ predictive ability on day three after trauma for resource-requiring measures and patient-specific outcomes was compared using receiver operating characteristic curves. Results One hundred seventy-six patients with a mean ISS of 28 ± 13 were included. MODS and SOFA score defined the incidence of MOF consistently (46.5 % vs. 52.3 %), while the Denver score defined MOF in 22.2 %. The MODS outperformed the Denver and SOFA scores in predicting mortality (area under the curve/AUC: 0.83 vs. 0.67 vs. 0.72), but was inferior in predicting the length of stay (AUC 0.71 vs. 0.80 vs. 0.82) and a prolonged time on mechanical ventilation (AUC 0.75 vs. 0.81 vs. 0.84). MODS and SOFA score were comparably sensitive and the Denver score was more specific in all analyses. Conclusions All three scores have a comparable ability to predict the outcome in trauma patients, including patients with severe traumatic brain injury (TBI). Any of the scores could be favored depending on whether a higher sensitivity or specificity is targeted. The SOFA score showed the most balanced relation of sensitivity and specificity. The incidence of posttraumatic MOF depends decisively on the score applied. Therefore, harmonizing the competing scores and definitions is desirable.


Background
Despite all improvements in trauma care during the last decades, post-injury multiple organ failure (MOF) remains a major complication and challenge in severely injured patients [1]. During the post-traumatic hospital course, it has been described as "resource-intensive, morbid and lethal" and is considered the main cause of late post-injury mortality [2,3]. Furthermore, MOF accounts for up to 30 % of the possibly preventable deaths [4].
According to our group's previous work, the incidence of posttraumatic MOF in severely injured patients increased during the last decade, accompanied by a decreasing case fatality rate [5]. However, the incidences of MOF in different comparable studies varied considerably and ranged from 6 to 42 % [1,2,5-7]. On the one hand, differences in inclusion criteria, patient treatment and trauma systems may explain some of these differences. However, all of these studies originated from developed trauma systems and focused on severely injured patients. On the other hand, in most recent publications, three different scores defining MOF were used: the Sequential Organ Failure Assessment (SOFA) score, the Marshall Multiple Organ Dysfunction Score (MODS), and the Denver score. Although these three scores define the same syndrome, there are substantial differences in the selection and assessment of the observed organ systems. Presumably, the selected score has a significant influence on the observed MOF incidence and complicates the comparison of observed incidence rates.
Originally, these scores were not developed to predict patients' outcome. Since MOF is a major complication during post-traumatic treatment, the scores' predictive value for patients' outcome is clinically relevant in daily trauma care. Furthermore, an understanding of MOF scoring is valuable in research, for example, in stratifying and including patients in clinical trials that include the endpoint MOF. To date, there has not been a comparison of the Denver, MODS and SOFA scores applied to the same data set. Therefore, in the present study we aimed to compare these three most frequently used MOF scores for their ability to predict the outcome in severely injured patients.

Study population
All severely injured patients (n = 749), who were admitted to the intensive care unit (ICU) of our Level I Trauma Center between 2011 and 2013, were eligible for further analysis. Inclusion criteria were a relevant trauma load displayed by an ISS (Injury Severity Score) ≥ 16, age ≥ 16 years and a length of stay on ICU for more than 48 h. Patients with incomplete data sets regarding one of the scores were excluded. Detailed patient numbers are displayed in Fig. 1.
Patient characteristics such as demographics and comorbidities were recorded at hospital admission. Vital parameters and laboratory data were recorded daily through the ICU stay. Injury pattern including injury mechanism and severity displayed by ISS and New Injury Severity Score (NISS) were assessed. Using the Revised Injury Severity Classification (RISC II) the predicted mortality was calculated [8]. Clinical events were recorded until death or hospital discharge. The local ethics committee of the Cologne Merheim Medical Center approved the study. Patient records and information were anonymised prior to analysis. According to the ethics committee individual patient consent was not required.

Multiple organ failure scores
Multiple organ failure was defined according to three currently used scores, the SOFA, MODS and Denver scores (Table 1).
The SOFA score, initially used to assess critically ill ICU patients and subsequently validated for trauma patients, is composed of scores from six organ systems, each graded from 0 to 4 according to the degree of dysfunction or failure [9][10][11]. A score of 3 or greater for one of the organ systems was defined as a failure of this organ. Neither the initial description nor the evaluation of the SOFA score by Vincent et al. [9,10] states when to define multiple organ failure. As frequently done previously, MOF was defined in the current study as organ failure (score ≥ 3 points) of at least two of the listed organs or systems [5,12].
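As an illustration only, the SOFA-based MOF rule described above can be sketched in Python. The function names and the organ-system keys are hypothetical labels chosen for this sketch; only the thresholds (sub-score ≥ 3, at least two organ systems) are taken from the text.

```python
# Thresholds taken from the text; all names below are illustrative assumptions.
ORGAN_FAILURE_THRESHOLD = 3  # a sub-score >= 3 counts as failure of that organ
MIN_FAILED_ORGANS = 2        # MOF = failure of at least two organ systems


def sofa_failed_organs(subscores):
    """Return the organ systems whose SOFA sub-score (0-4) indicates failure."""
    for organ, score in subscores.items():
        if not 0 <= score <= 4:
            raise ValueError(f"SOFA sub-score for {organ} must be 0-4, got {score}")
    return sorted(o for o, s in subscores.items() if s >= ORGAN_FAILURE_THRESHOLD)


def sofa_mof(subscores):
    """True if at least two organ systems have a sub-score of 3 or more."""
    return len(sofa_failed_organs(subscores)) >= MIN_FAILED_ORGANS


# Hypothetical day-three sub-scores for one patient.
example = {
    "respiratory": 3, "coagulation": 1, "hepatic": 0,
    "cardiovascular": 4, "cns": 2, "renal": 0,
}
print(sofa_failed_organs(example), sofa_mof(example))
```

Note that under this rule a patient with a high total score but only one organ system at 3 or more points would not be classified as MOF.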
The Marshall Multiple Organ Dysfunction Score (MODS) assesses the same six organ systems, each graded from 0 to 4, using slightly different values for the grades of organ dysfunction. Most obvious is the difference in grading the cardiovascular system: instead of a surrogate parameter (use of inotropic medication), a composite measure, the pressure-adjusted heart rate (PAR), is used. The PAR is calculated as the heart rate (HR) multiplied by the ratio of the central venous pressure (CVP) to the mean arterial pressure (MAP) [13]. The total score, ranging from 0 to 24, is the sum of all single organ scores using the first measured value of the day. Marshall et al. did not define a specific cut-off for the diagnosis of MOF. Instead, the authors associated score ranges with mortality rates [13]. However, previous studies that validated the MODS have used a score of more than 5, either on one day or on two consecutive days, to define the presence of MOF [1,14].
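The PAR formula and the MODS cut-off described above can be written out as a short Python sketch. The function names are hypothetical, and the grading of the PAR into points is deliberately omitted because the grade boundaries are not given in the text.

```python
def pressure_adjusted_heart_rate(hr, cvp, mean_arterial_pressure):
    """PAR = HR x (CVP / MAP), as described in the text."""
    if mean_arterial_pressure <= 0:
        raise ValueError("MAP must be positive")
    return hr * cvp / mean_arterial_pressure


def mods_total(subscores):
    """Sum of the six organ sub-scores (each 0-4), giving a total of 0-24."""
    if len(subscores) != 6:
        raise ValueError("MODS expects exactly six organ systems")
    if any(not 0 <= s <= 4 for s in subscores.values()):
        raise ValueError("each MODS sub-score must be 0-4")
    return sum(subscores.values())


def mods_mof(total, cutoff=5):
    """MOF if the total score is greater than the cut-off (more than 5 in prior validations)."""
    return total > cutoff


# Hypothetical values: HR 100/min, CVP 12 mmHg, MAP 80 mmHg.
print(pressure_adjusted_heart_rate(100, 12, 80))
```

With the hypothetical values above, the PAR is 100 × 12 / 80 = 15.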
The Denver score has been specifically developed to assess posttraumatic organ failure excluding severe traumatic brain injury (TBI). The score rates four organ systems on a scale from 0 to 3 (Table 1). In contrast to the previously presented scores, the Denver score does not include a grading of the hematologic system and the CNS. The Denver score defines MOF as a score of more than 3 occurring more than 48 h after injury [14,15].
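For comparison with the other two scores, the Denver definition can be sketched in the same way. The organ-system keys and function names are assumptions made for this sketch; the grading range (0 to 3), the cut-off (more than 3) and the time condition (more than 48 h after injury) follow the text.

```python
# Illustrative labels for the four graded organ systems (no hematologic system, no CNS).
DENVER_ORGANS = ("respiratory", "renal", "hepatic", "cardiovascular")


def denver_total(subscores):
    """Sum of the four organ sub-scores, each graded 0-3."""
    missing = [o for o in DENVER_ORGANS if o not in subscores]
    if missing:
        raise ValueError(f"missing organ systems: {missing}")
    if any(not 0 <= subscores[o] <= 3 for o in DENVER_ORGANS):
        raise ValueError("each Denver sub-score must be 0-3")
    return sum(subscores[o] for o in DENVER_ORGANS)


def denver_mof(subscores, hours_since_injury):
    """MOF if the total is greater than 3 and more than 48 h have passed since injury."""
    return hours_since_injury > 48 and denver_total(subscores) > 3


day3 = {"respiratory": 2, "renal": 1, "hepatic": 1, "cardiovascular": 0}
print(denver_total(day3), denver_mof(day3, hours_since_injury=72))
```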
For calculation of all scores, daily laboratory and physiological values were used. To ensure comparability of the results, the worst daily values were used for all scores. MOF status was determined daily throughout the ICU stay as recommended by the scores' authors. However, we revisited the previously described cut-off points using receiver operating characteristic (ROC) curves. Reversible physiologic derangements during the early posttraumatic treatment influence the scores' grading, but do not represent a substantial organ failure [16]. On the other hand, a prediction of the outcome as soon as possible after trauma would be desirable. Therefore, and in accordance with previous validations, we used MOF score values on day three after trauma for further analysis and prediction of outcome [14,17].
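The selection of the worst daily value depends on the parameter, since for some measurements the lowest value is the most abnormal and for others the highest. A minimal sketch is given below; the parameter names and the set of "lower is worse" parameters are assumptions for illustration and do not reproduce the study database.

```python
# Parameters for which the LOWEST daily measurement is the most abnormal one.
# This set and its names are illustrative assumptions, not the study's data model.
WORST_IS_LOWEST = {"platelets", "pao2_fio2", "gcs", "mean_arterial_pressure"}


def worst_daily_value(parameter, values):
    """Return the most abnormal measurement of one parameter on one day."""
    if not values:
        raise ValueError(f"no measurements for {parameter}")
    return min(values) if parameter in WORST_IS_LOWEST else max(values)


# Hypothetical measurements on day three after trauma.
day3 = {
    "platelets": [180, 95, 120],   # 10^3/µl, lower is worse
    "creatinine": [1.1, 2.4, 1.9], # mg/dl, higher is worse
}
worst = {name: worst_daily_value(name, vals) for name, vals in day3.items()}
print(worst)
```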

Patient adverse outcomes
The scores were compared by evaluating their association with adverse patient outcomes, which were ICU length of stay (LOS), days with mechanical ventilation (MVD), ventilator free days (VFD) and hospital mortality. VFDs were calculated as days without mechanical ventilation within 28 days after the injury to account for patients who died early and accordingly had fewer MVDs [18]. As LOS, MVDs and VFDs were not normally distributed, these outcome parameters were dichotomized for further analysis: 1. ICU LOS and mechanical ventilation up to seven days or longer; 2. ventilator free days of more or less than 21 days. The cut-off points of seven days for LOS and MVD and 21 days for VFD were chosen to depict a complicated course during the ICU stay. Furthermore, this stratification allows a comparison with previous validations, each of which covered two of the scores [14,17]. Sepsis was defined according to the criteria of Bone et al. [19].
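The outcome definitions above can be illustrated with a short Python sketch. Two points are assumptions of this sketch and are not stated in the text: patients who die within the 28-day window are assigned zero VFDs (a common convention), and a value exactly at a cut-off is counted as not exceeding it. All function names are hypothetical.

```python
WINDOW_DAYS = 28  # observation window after injury, from the text


def ventilator_free_days(mv_days, died_within_window=False):
    """Days without mechanical ventilation within 28 days after injury.

    Assumption: death within the window yields 0 VFDs (not stated in the text).
    """
    if mv_days < 0:
        raise ValueError("mv_days must be non-negative")
    if died_within_window:
        return 0
    return max(0, WINDOW_DAYS - mv_days)


def is_prolonged(days, cutoff=7):
    """Dichotomize ICU LOS or MVD: longer than seven days (boundary handling assumed)."""
    return days > cutoff


def has_few_vfd(vfd, cutoff=21):
    """Dichotomize VFD: fewer than 21 ventilator free days (boundary handling assumed)."""
    return vfd < cutoff


vfd = ventilator_free_days(mv_days=10)
print(vfd, is_prolonged(10), has_few_vfd(vfd))
```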

Statistical analysis
Data are presented as mean ± standard deviation (SD) (range of values) for continuous variables or percentages for categorical variables. For the comparison of the performance of the SOFA, MODS and Denver scores in predicting patients' adverse outcomes, the area under the receiver operating characteristic curve (AUROC) was calculated with LOS, MVD, VFD and hospital mortality as the state variables. The comparison of two areas under the receiver operating characteristic curve was based upon the 95 % confidence interval for each curve. For all statistical analyses, a probability of less than 0.05 was considered to be statistically significant. All data were analysed using IBM SPSS 22 (IBM Corporation, IBM Inc., Armonk, NY, USA).
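For readers without access to SPSS, the type of comparison described above can be sketched in plain Python. This is not the SPSS procedure used in the study: the AUROC is computed here in its rank-based (Mann-Whitney) form, and the confidence interval uses the Hanley-McNeil approximation, which is an assumption of this sketch. All function names are hypothetical.

```python
import math


def auroc(scores, outcomes):
    """Probability that a patient with the outcome has a higher score than one without."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    # Ties between a positive and a negative case count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate 95 % confidence interval of an AUROC (Hanley-McNeil)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    half = z * math.sqrt(max(var, 0.0))
    return max(0.0, auc - half), min(1.0, auc + half)


def intervals_overlap(a, b):
    """Two curves are treated as not different if their confidence intervals overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical day-three score values and outcomes (1 = died) for four patients.
auc = auroc([1, 2, 3, 4], [0, 1, 0, 1])
print(auc, hanley_mcneil_ci(auc, n_pos=2, n_neg=2))
```

Comparing curves by overlap of their confidence intervals is conservative; a paired test on the same patients would be an alternative, but it is not what the text describes.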

Results
In an observation period of three years, 176 severely injured trauma patients remained eligible for further analysis with complete data sets to calculate the Denver, MODS and SOFA scores. In the final cohort, patients had a mean age of 53 ± 21 (range: 16-91) years, were predominantly male (67 %) and sustained mainly blunt trauma (96.9 %). Patients were severely injured with a mean ISS of 28 ± 13 (range: 16-50). Severe TBI (AIShead ≥ 3) and thoracic trauma (AISthorax ≥ 3) were observed in 119 and 89 patients, respectively, while severe abdominal and skeletal injuries were less frequent. Out of the final cohort, 32 patients (18.2 %) died a mean of 10.2 ± 11.7 (range: 4-29) days after injury. Detailed patient demographics and injury scoring are presented in Table 2. Causes of death included failure of several organs (28 %), respiratory failure (22 %), failure of cerebral functions (22 %), and sepsis (12 %). In 16 % of the cases, the cause of death was not documented. Outcome parameters are presented in Table 3. Neither demographic data nor injury severity differed significantly between the respective score groups. Regardless of the score applied, MOF patients were more severely injured, as shown by a higher ISS, and a higher proportion had severe head injuries compared with the whole cohort (Table 2).
As expected, patients with MOF had a poor outcome. Regardless of the score applied, MOF patients required a longer ICU LOS and more days on mechanical ventilation, while the length of the inpatient treatment did not differ compared with all patients. Mortality was likewise higher in patients labelled as having MOF, regardless of the score applied. However, there was no difference in mortality between the MOF groups.
The analysis of the sensitivity and specificity regarding (a) patient-specific adverse outcomes such as mortality and ventilator free days and (b) resource-requiring measures such as ICU LOS and days on mechanical ventilation revealed some differences between the scores (Table 4). In predicting VFD, the MODS and SOFA score showed a better relation of sensitivity and specificity than the Denver score, without differences in the AUC. The MODS showed the best sensitivity and highest AUC in predicting mortality, while the Denver score showed poor sensitivity but good specificity (Fig. 2).
In predicting prolonged ICU LOS and days on mechanical ventilation, the SOFA score substantially surpassed the Denver score, but regarding the overall performance, both scores outperformed the MODS in the AUC (Fig. 2).
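The sensitivity and specificity reported for each score follow from cross-tabulating the dichotomized MOF status against the dichotomized outcome. A minimal sketch is shown below; the function name and the example data are hypothetical and do not reproduce the values in Table 4.

```python
def sensitivity_specificity(mof_status, outcome):
    """Sensitivity and specificity of a binary MOF label for a binary outcome."""
    pairs = list(zip(mof_status, outcome))
    tp = sum(1 for m, o in pairs if m and o)          # MOF and outcome occurred
    fn = sum(1 for m, o in pairs if not m and o)      # no MOF but outcome occurred
    tn = sum(1 for m, o in pairs if not m and not o)  # no MOF, no outcome
    fp = sum(1 for m, o in pairs if m and not o)      # MOF but no outcome
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes are required")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: MOF on day three (1 = yes) versus hospital death (1 = yes).
mof = [1, 1, 0, 0, 1]
died = [1, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(mof, died)
print(sens, spec)
```

A score that labels fewer patients as MOF, as the Denver score does here, will tend to produce fewer false positives (higher specificity) at the cost of more missed cases (lower sensitivity).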

Discussion
Originally, the MODS, Denver and SOFA scores were created for defining the presence of MOF. The three scores differ obviously in their components, as the MODS and SOFA score consider the CNS and coagulation system. This might contribute to the increased incidence of MOF, especially in the presence of TBI. As TBI is associated with an increased mortality in trauma patients, this could furthermore influence the scores' predictive value.
When comparing the three scores, the weighting of the components has to be recognised, since the Denver and SOFA scores grade the cardiovascular system using a surrogate parameter (use of inotropic medication) while the MODS uses physiologic parameters. Beyond the interest in an accurate definition of this syndrome, the scores' ability to predict patients' adverse outcomes and resource utilization after severe trauma is of clinical relevance. The presented study revealed substantial differences between the scores in sensitivity and specificity, which lead to pronounced variations in the assessed incidence rates of MOF and consequently in the scores' predictive values. The observed incidence of 22 to 52 %, depending on the applied score, appears comparable to previous studies. In a large registry analysis, the MOF incidence was 32.7 % using the SOFA score [5]. Using one data set, Sauaia et al. described an incidence of 49.7 % for the MODS and 22.2 % for the Denver score [14]. However, all of these numbers appear high compared to clinical experience. In a study comparing the presence of MOF defined by experienced intensive care physicians to the performance of different scores, the clinically defined incidence rate was 26 % and was significantly lower than defined by the scores [20]. The strict classification into MOF and non-MOF patients by the scores, which is inevitable for predicting the clinical outcome and for statistical analysis, might not be the ideal instrument for daily practice. Instead, using the scores as a continuous scale might be helpful with respect to the patients' daily development.
Recognizing MOF as soon as possible after trauma enables the early assessment of the clinical outcome and the potentially required resource utilisation. Previously, day three after trauma has been shown to be the earliest possible time point for defining MOF, since organ dysfunction during the immediate posttraumatic treatment may occur due to reversible physiologic derangements [16]. Regarding the overall performance in predicting resource-requiring outcome parameters such as ICU LOS and MVD, the SOFA and Denver scores outperformed the MODS. Among the patient-specific outcome parameters, all scores performed similarly in predicting VFD. However, the MODS surpassed the other scores in predicting mortality. The differences in sensitivity and specificity were remarkable, but were also observed in previous studies [14,17]. For example, due to the Denver score's low sensitivity, only half of the fatal cases were captured in the presented study. Certainly, there are difficulties in defining this complex syndrome in trauma patients, especially in patients with TBI. The necessity of mechanical ventilation due to thoracic injuries or the required deep sedation due to TBI complicates the scoring of the CNS. Since the GCS is assessed as an essential part of the MODS and SOFA score, this grading might lead to falsely high score values. Furthermore, maintaining a sufficient cerebral perfusion pressure (CPP) often demands the use of vasopressors or inotropic medication, which directly affects the scoring of the cardiovascular system in the SOFA and Denver scores. Using physiologic parameters such as the PAR for grading the cardiovascular system could protect the MODS against the influence of therapeutic actions. However, injury pattern and organ dysfunction are what necessitate sedation or inotropic medication, and these therapies therefore contribute to the patients' overall status displayed by the total score value.
The Denver score was specifically defined and validated in patients without TBI [14,15]. However, TBI is frequent in European countries with an incidence of 214/100,000 persons/year [21]. In the presented cohort, we observed a TBI incidence of 68 %, which underlines the importance of TBI for daily practice in trauma care. Therefore, we deliberately decided to include these patients, and we observed that all three scores work reasonably well in a generalised cohort without excluding this major group of patients.
Considering CNS as a confounder, previous studies have analysed the MODS and SOFA score excluding the GCS scale [12,22]. Unfortunately, the overall score performance regarding adverse outcomes was not described. Recently Vasilevskis et al. introduced a promising approach substituting the GCS scale by the use of the Richmond Agitation-Sedation Scale (RASS), which is easier to apply in sedated and intubated patients [23,24]. Although this score has to be validated in a more comprehensive patient cohort including trauma patients, the use of RASS can avoid an underestimation of the SOFA score, which might occur when the neurologic component is ignored [25].
However, some limitations have to be acknowledged. This is a retrospective, single-center study using clinical data. All values have been reassessed for plausibility, but the analysis relies on data documented during the inpatient stay. Eighty-four cases had to be excluded because one or more day three values needed to apply the scores were missing. In most of these cases, either patients were not invasively monitored or the GCS was not documented. These missing data might add a selection bias to the presented study. In patients with complete data sets, we used worst daily values for all three scores, although Marshall et al. recommend the first morning values every day to avoid capturing momentary physiological changes [13]. For improved comparability, we applied the same worst daily values to all three scores, as recommended by the SOFA and Denver score authors [9,14]. Certainly, this might have influenced the MODS' performance. In the analysed cohort, the cause of death was unfortunately not documented in all cases. Furthermore, not all deaths were associated with or caused by MOF. Nevertheless, mortality is the poorest possible outcome in trauma patients. Therefore, regardless of the actual cause of death, predicting this fatal clinical course based on daily patient values could be a valuable tool in patient treatment.

Conclusion
The MODS, Denver and SOFA scores have a comparable ability to predict the outcome in severely injured patients, including patients with severe TBI. The Denver and SOFA scores performed well in predicting ICU resource use, while the MODS surpassed the other scores in predicting mortality. The SOFA score showed the most balanced relation between sensitivity and specificity. The incidence of posttraumatic MOF depends decisively on the score applied. Therefore, harmonizing the competing scores and definitions would be desirable.