Helicopter emergency medical services use of thoracic point of care ultrasound for pneumothorax: a systematic review and meta-analysis

Background: Auscultating for breath sounds to assess for pneumothorax in the helicopter emergency medical services (HEMS) setting can be extremely challenging. Thoracic point of care ultrasound (POCUS) offers a seemingly more useful visual (rather than audible) alternative. This review critically and quantitatively evaluates the use of thoracic POCUS for pneumothorax in the HEMS setting.
Methods: A systematic literature review with meta-analysis was conducted. Only papers reporting on patients undergoing POCUS for pneumothorax in the helicopter or pre-hospital setting were included. Primary outcome was accuracy, focusing on sensitivity and specificity. Secondary outcome was practicality. PubMed, Embase and the Cochrane Library were searched. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) was used to assess validity of studies.
Results: Twelve studies reporting on n = 1,936 images from medical and trauma patients were included in qualitative synthesis. Studies were nearly all observational designs. Most images were acquired by nurses or paramedics who were previously novices to ultrasound. The reference standard was predominantly CT. Specificity results were unanimously precise and very high, whereas sensitivity results were imprecise and extremely variable. Meta-analysis of eight studies involving n = 1,713 images yielded pooled sensitivity 61% (95% CI: 27–87%; I2 = 94%) and pooled specificity 99% (95% CI: 98–100%; I2 = 89%). Six studies involving n = 315 images reported practicality. The highest or second highest categorisation of image quality was reported in around half of those images.
Conclusion: Thoracic POCUS is highly specific but has extremely variable sensitivity for pneumothorax when performed in the HEMS setting. This is from purely a diagnostic (not clinical) perspective. Sensitivity increases when only clinically significant pneumothoraces are considered. Case reports reveal thoracic POCUS can appropriately alter treatment and triage decisions, but only for a small number of patients. It appears predominantly useful in mitigating against unnecessary interventions. More research reporting patient focused outcomes is required. In the meantime, thoracic POCUS appears to offer a more appropriate visual alternative to auscultation for breath sounds when assessing for pneumothorax in the HEMS setting.


Rationale
Helicopter emergency medical services (HEMS) provide pre-hospital critical care and interfacility transfers. They encounter patients presenting with pneumothorax and tension pneumothorax. Pneumothorax occurs when air enters the pleural cavity through a pleural fault. These faults may have traumatic, idiopathic (spontaneous or relating to disease) or iatrogenic (related to medical intervention) causes. Prevalence of pneumothorax amongst patients presenting to HEMS providers is reported as being between 10 and 20% [1-6].
Tension pneumothorax occurs when the pleural fault functions as a one-way valve [7]. Air continues to enter the pleural cavity more quickly than it can escape. An increase in intrapleural pressure ensues [7]. This causes lung collapse, diaphragmatic depression, chest wall expansion and contralateral lung compression [7]. Eventually, compression of the thoracic vena cava ensues [7]. This leads to reduced venous return and eventual circulatory collapse [7]. It is an immediately life-threatening condition characterised by these clinical manifestations. Hence, early recognition and immediate treatment are imperative. Similarly, timely identification of a simple pneumothorax alerts clinicians to the risk of inducing a tension pneumothorax in patients who undergo positive pressure ventilation or altitude-related volume expansion [8].
One of the cornerstones of pre-hospital assessment for pneumothorax is auscultation of the chest. However, auscultating to determine the presence of breath sounds in the pre-hospital setting can be extremely challenging. Brown et al. evaluated the accuracy of auscultation by pre-hospital clinicians to detect breath sounds whilst in a moving ambulance [9]. In a sample of n = 260, they reported n = 117 false negatives [9]. Similarly, Hunt et al. concluded that auscultation for breath sounds in the helicopter environment was impossible [10]. The inability to auscultate in this setting renders differentiating pneumothorax seemingly more difficult. There appears greater potential for pneumothorax going undiagnosed.
A recent meta-analysis reported a 19% complication rate associated with performing a thoracostomy [11]. These included iatrogenic injury, bleeding, and infection. Hence, there also appears to be a risk of clinicians unnecessarily exposing patients to these complications because of the apparently greater difficulty of ruling out a pneumothorax in the HEMS setting [12].

Clinical role of index test
The advent of handheld ultrasound machines combined with their improved image quality has brought a point of care ultrasound (POCUS) capability into the HEMS arena. Ultrasonic imaging of underlying anatomy can now be depicted on handheld electronic tablets or smart phone devices. They can depict the visceral pleura sliding on the parietal pleura (termed lung sliding) as a glistening movement at the pleural line (Fig. 1) [13,14]. In motion mode (M-mode), movement of the lung appears as a grainy image below the pleural line, while the still chest wall above is depicted as static straight lines. This is termed the seashore sign (Fig. 2a) [13,14]. Pneumothorax pathology results in an absence of lung sliding and the resultant M-mode image is that of parallel horizontal lines above and below the pleural line (Fig. 2b) [13,14]. This pattern is often referred to as the barcode or stratosphere sign (Fig. 2b) [13,14]. Thoracic POCUS can also depict lung pulse (pulsation of the heart transmitted through lung tissue) which is present in normal lung, but absent in pneumothorax [13,14].
Another useful observation when assessing for pneumothorax is the presence of lung comets. These are caused by reverberation of ultrasound waves at the peripheral lung parenchyma and inter-pleural layer (Fig. 1) [15]. They appear as short (typically less than 1 cm) vertical artefacts beginning at the pleural line, which then taper and fade with increasing depth (Fig. 1) [15]. Fibrosed lung interstitium or a mixing of air and fluid in the interstitium can cause another phenomenon termed B-lines [14,15]. These also appear as bright vertical lines, but are much longer than lung comets; they shine down from the pleura to the end of the screen [15]. Lung comets and B-lines move with lung sliding [15]; as they both arise from lung tissue and/or the inter-pleural layer, their presence can be used to discount pneumothorax [14]. Lung comets are the more useful diagnostic observation as they are ubiquitous irrespective of disease status [15]. By virtue of its visual modality, thoracic POCUS appears to offer a superior alternative to auscultation of the chest to aid diagnosis of pneumothorax in the HEMS setting.

Objectives
Recent updates to resuscitation guidelines place greater emphasis on the use of POCUS to identify underlying pathology and target resuscitative interventions [16,17]. They make specific mention of the merits of its use for assessing for pneumothorax [16,17]. However, no quantitative analysis focusing on the pre-hospital or helicopter setting has been published to date. The aim of this paper was to address this evidentiary gap by conducting a systematic review and meta-analysis of the accuracy and practicality of thoracic POCUS for pneumothorax in the HEMS setting. Measurements of accuracy focussed on sensitivity and specificity. Practicality was measured as declared rates of diagnostically adequate images, or practicality rating.

Methods
All elements of the Preferred Reporting Items for Systematic Reviews and Meta-Analysis for Diagnostic Test Accuracy (PRISMA-DTA) studies checklist are reported under separate subheadings [18].

Protocol and registration
The protocol for this review was prospectively submitted to the International Prospective Register of Systematic Reviews (PROSPERO) on the 21st of November 2020. It was first published on the PROSPERO database on the 2nd of December 2020 (Registration No. CRD42020221946).

Eligibility criteria
To be included for analysis, results had to meet all the following population, index-test, reference test and target condition eligibility criteria:
4. Target condition: reporting on accuracy or practicality in identifying the presence or absence of pneumothorax.
Only randomised trials and non-randomised studies were eligible for inclusion. Articles which were not available in full text or not written in English were excluded.

Information sources
PubMed (includes MEDLINE), Embase and the Cochrane Library were searched between 4th and 15th January 2021. An additional search (guided by GreyNet.org) for papers not published in mainstream journals was also conducted. Reference lists of the search results were checked for studies which were eligible for inclusion. A search of the International Clinical Trials Registry Platform (which includes ClinicalTrials.gov) was also completed.

Search
MeSH and EMTREE terms were customised to search PubMed (also adopted by the Cochrane Library) and Embase respectively. A free-text search for key terms (including their synonyms and related terms) appearing in titles and abstracts was also conducted. Searches using MeSH and EMTREE terms were 'exploded' to automatically search the respective subheadings where appropriate. Search terms are included at "Appendix 1".

Study selection
The study selection process is depicted in Fig. 3.

Data collection process
Data was extracted into a data collection table generated as a spreadsheet using Microsoft Excel (Version 16.0.13628.20274, Office 365, Microsoft Corporation, Redmond, Washington, 2021).

Definitions for data extraction
Data items harvested for analysis are presented as column headings in Table 1.

Risk of bias and applicability
The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) assessment tool as described in a publication by Whiting et al. was used to assess the internal and external validity of studies [19]. Funnel plot asymmetry analysis for reviews of diagnostic studies developed by Deeks et al. was used to assess for the presence of publication bias [20].
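The funnel plot asymmetry approach mentioned above is commonly described as a regression of the log diagnostic odds ratio on the inverse square root of the effective sample size, weighted by the effective sample size. The following is a minimal Python sketch of that regression under those assumptions. The function names and the 2x2 tables are hypothetical, the 0.5 continuity correction is an illustrative choice, and the sketch returns only the fitted line, not the significance test of the slope that the full method relies on.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def log_diagnostic_odds_ratio(tp, fp, fn, tn):
    """ln(DOR) with 0.5 added to every cell so that zero cells stay finite."""
    tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def deeks_regression(tables):
    """tables: iterable of (tp, fp, fn, tn). Returns (intercept, slope)."""
    x, y, w = [], [], []
    for tp, fp, fn, tn in tables:
        ess = effective_sample_size(tp + fn, fp + tn)
        x.append(1 / math.sqrt(ess))
        y.append(log_diagnostic_odds_ratio(tp, fp, fn, tn))
        w.append(ess)
    return weighted_line(x, y, w)


# Hypothetical 2x2 tables, not data from the included studies.
tables = [(12, 2, 8, 178), (5, 1, 15, 120), (20, 3, 4, 60), (9, 0, 21, 300)]
intercept, slope = deeks_regression(tables)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

A slope far from zero would suggest that smaller studies report systematically different diagnostic odds ratios, which is the pattern the asymmetry test looks for.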

Diagnostic accuracy measures
STATA (StataCorp, Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC, 2019) was used to calculate each study's prevalence of pneumothorax, sensitivities, specificities, positive predictive value (PPV), negative predictive value (NPV) and the respective 95% confidence intervals (CI) using the data extracted. Review Manager (Version 5.4.1. Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2020) software was used to create forest plots of sensitivity and specificity.
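As an illustration of the per-study measures listed above, the following Python sketch derives prevalence, sensitivity, specificity, PPV and NPV from a 2x2 table. It is not the STATA procedure used in this review. The Wilson score interval is an assumed choice of confidence interval method, the function names are hypothetical, and the counts are invented for the example.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval (about 95% for z = 1.96) for a proportion."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def proportion(successes, total):
    """Return (estimate, lower, upper), or None when the denominator is zero."""
    if total == 0:
        return None
    low, high = wilson_interval(successes, total)
    return successes / total, low, high


def accuracy_measures(tp, fp, fn, tn):
    """Diagnostic accuracy measures from the four cells of a 2x2 table."""
    total = tp + fp + fn + tn
    return {
        "prevalence": proportion(tp + fn, total),
        "sensitivity": proportion(tp, tp + fn),
        "specificity": proportion(tn, tn + fp),
        "ppv": proportion(tp, tp + fp),
        "npv": proportion(tn, tn + fn),
    }


# Hypothetical counts, not taken from any included study.
example = accuracy_measures(tp=12, fp=2, fn=8, tn=178)
for name, value in example.items():
    estimate, low, high = value
    print(f"{name}: {estimate:.1%} (95% CI: {low:.1%} to {high:.1%})")
```

When a sample contains no pneumothoraces, the sensitivity denominator is zero and the sketch returns None for that measure, which reflects why zero-prevalence studies cannot contribute a sensitivity estimate.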

Synthesis of results
Practicality and accuracy were reported in qualitative thematic synthesis. The intention was to perform subgroup analysis to account for differences between the helicopter (in-flight) setting versus the pre-hospital setting.

Meta-analysis
STATA statistics software package (StataCorp, Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC, 2019) was used to conduct the meta-analysis. It focused on summarising sensitivities and specificities using a random effects model [21]. A positive or negative result (presence or absence of pneumothorax) was modelled as a single common binary threshold across all studies.
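To make the idea of random effects pooling concrete, the sketch below pools study-level proportions (for example, sensitivities) on the logit scale using the DerSimonian-Laird estimator. This is a simplified univariate illustration and an assumption on our part: it is not the STATA model used in this review, and it ignores the correlation between sensitivity and specificity that bivariate diagnostic models account for. Function names and counts are hypothetical.

```python
import math


def inv_logit(x):
    """Map a logit back to a proportion."""
    return 1 / (1 + math.exp(-x))


def pool_random_effects(events, totals, z=1.96):
    """DerSimonian-Laird pooling of proportions on the logit scale.

    Returns (pooled, lower, upper, tau2).
    """
    effects, variances = [], []
    for e, n in zip(events, totals):
        # 0.5 is added to both cells of every study so that the logit
        # stays finite when a study has 0% or 100% sensitivity.
        a, b = e + 0.5, n - e + 0.5
        effects.append(math.log(a / b))
        variances.append(1 / a + 1 / b)

    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add the between-study variance tau2.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return (inv_logit(pooled), inv_logit(pooled - z * se),
            inv_logit(pooled + z * se), tau2)


# Hypothetical true positives and diseased totals for three studies.
pooled, low, high, tau2 = pool_random_effects([2, 18, 9], [20, 20, 20])
print(f"pooled={pooled:.1%} (95% CI: {low:.1%} to {high:.1%}), tau2={tau2:.2f}")
```

When studies disagree strongly, tau2 grows, the weights become more equal across studies, and the confidence interval widens, which mirrors the imprecise pooled sensitivity that heterogeneous data tend to produce.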

Additional analyses
The intention was to conduct sensitivity analysis to account for the biases and concerns reported using the QUADAS-2 tool. Practicality was reported using simple descriptive statistics. The intention was to present mean acquisition rates (percentages) of diagnostically adequate images.

Study characteristics
All twelve included studies were published between 2011 and 2020 [1-6, 22-26, 33]. Collectively, they reported on the interpretations of n = 1,936 images captured predominantly from trauma patients presenting in the pre-hospital and/or in-flight setting. All included patients were adults (≥ 18 years). Apart from one randomised simulation trial [26], all were observational designs. Results involved overall quantitative and qualitative evaluations, including the raw data used to make these calculations and subjective conclusions. Although the protocols of ultrasonography varied, they all included comparable elements of thoracic scanning to evaluate for pneumothorax. The majority (58%, n ≈ 1,132) of images were acquired by nurses or paramedics who had undertaken familiarisation training to enable them to participate in the studies. They were previously novices in thoracic ultrasound. Around 34% (n ≈ 659) were acquired by physicians experienced and accredited in ultrasound use, or by experienced sonographers. Authors did not declare the experience of those conducting the scans in around 8% (n ≈ 145) of images acquired. The reference standard was predominantly CT scanning. However, there were exceptions; these included X-ray, emergency department (ED) clinical assessment (including ultrasound) and expert review of the saved images. One study was funded by an academic institution [33], another was funded by the manufacturer of the ultrasound device used in the study [2]. The remaining studies either declared that no funding had been received, or they made no comment about sources of funding. Included study characteristics are summarised in Table 1. Figure 4 depicts the risk of bias and applicability concerns of the eight studies included in meta-analysis [1-6, 22, 24]. Of these eight studies, five were assessed as having a high risk of bias in at least one domain and/or area of applicability. The remaining three studies involved an unclear risk of bias. 
The derivation of the risk of bias and applicability results is presented in "Appendix 2".

Accuracy
Eleven studies reported on the accuracy of a total of n = 1,900 separate images [1-6, 22-26]. The majority (60%, n ≈ 1,132) were acquired by paramedics or nurses; 32% (n ≈ 623) were acquired by physicians; the authors did not declare the credentials of the device operator for the remaining 8% (n = 145). The Neesse et al. and Scharonow et al. studies involved patients undergoing thoracic POCUS examinations performed by physicians certified in sonography [23,25]. Results were compared with CT, ED ultrasound or X-ray. Hospital staff were blinded to the results of pre-hospital imaging. Pneumothorax was correctly ruled out in all patients in both these studies. This zero prevalence can be explained by these studies reporting on predominantly non-trauma patients. Although these studies were of interest as they reported the apparent ability of thoracic POCUS to correctly rule out pneumothorax in a population of predominantly medical patients, they were excluded from meta-analysis on account of them reporting a zero prevalence of pneumothorax [23,25].
Khalil et al. randomised n = 30 paramedics to undertake a 30-min cardiac and thoracic POCUS lecture followed by practical scanning of n = 10 volunteer subjects [26]. This intervention group was compared to n = 30 paramedics with no additional training, the majority of whom (n = 28, 93%) had never performed a POCUS examination. Both groups were then exposed to blinded simulation scenarios, one of which involved a tension pneumothorax in the pre-hospital setting. The simulation involved loud noise to hinder auscultation. Most paramedics (n = 27, 90%) in the intervention group utilised thoracic POCUS during their examination of the pneumothorax patient, whereas only n = 2 (7%) utilised it in the control group. Although a higher percentage of paramedics correctly diagnosed the tension pneumothorax in the intervention group (77% versus 57%), this difference was not considered statistically significant (p = 0.1). Although the sample size met the power calculation requirements, it relied on the premise that the thoracic POCUS education curriculum would improve diagnostic accuracy by 35%. No references were cited to substantiate modelling the required sample size on this magnitude of improvement.
The remaining n = 8 studies reporting accuracy data were included in the meta-analysis [1-6, 22, 24]. This data is summarised in Figs. 5, 6 and 7. Pertinent additional aspects of these studies are described in more detail below.
Prevalence of pneumothorax in the eight studies included in the meta-analysis was between 10 and 20% (Fig. 5) [1-6, 22, 24]. Exceptions were Ketelaars et al. who reported a prevalence of 40% (95% CI: 28-55%) and Lyon et al. who simulated a prevalence of 44% (95% CI: 31-57%) [22,24]. PPV results were typically high but blighted by poor precision. This was on account of the relatively few images analysed. NPVs were similarly high but more precise (Fig. 7). In the Quick et al. study, accuracy reduced to sensitivity 68% (95% CI: 46-85%) and specificity 96% (95% CI: 90-98%) when the sample was limited to only patients who underwent CT as the reference test (n = 116) [3]. The authors stressed that all those that did not undergo CT had clear signs of pneumothorax on X-ray, or definitive clinical signs.
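Predictive values depend on prevalence as well as on sensitivity and specificity. The short Python sketch below shows that dependence using Bayes' theorem. Feeding it the pooled point estimates from this review (sensitivity 61%, specificity 99%) at prevalences of 10% and 20% is purely an illustrative assumption, not an analysis performed in the review, and the wide confidence interval around the pooled sensitivity is ignored. The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled point estimates from this review, applied at the lower and
# upper ends of the typical prevalence range (illustration only).
for prevalence in (0.10, 0.20):
    ppv, npv = predictive_values(0.61, 0.99, prevalence)
    print(f"prevalence={prevalence:.0%}: PPV={ppv:.1%}, NPV={npv:.1%}")
```

As prevalence rises, PPV increases and NPV falls, so a negative scan is less reassuring in populations where pneumothorax is more common.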
Multivariate binomial regression analysis in the Oliver et al. study revealed none of the variables observed had a significant effect on accuracy [1]. These included the operator's clinical discipline (paramedic or physician), time from POCUS examination to CT scan, means of transportation (ground versus air), patient demographic and mechanism of injury [1].

Practicality
Six studies rated the quality of a total of n = 315 separate images [4,5,22-24,33]. The largest share (48%, n = 151) were acquired by physicians and 6% (n = 19) were acquired by non-physicians; authors did not declare the credentials of the device operator for the remaining 46% (n = 145). Methods of categorising image quality varied greatly between studies. Images in the Ketelaars et al. study were evaluated as Good (55%), Moderate (25%), Poor (14%) and Not rated (16%) [22]. Neesse et al. published similar observational data in the P-CHEST study [23]. Image quality was rated as Excellent (27%), Mediocre (44%) and Poor (29%) [23]. In the Ronaldson et al. study, expert reviewers graded 79% (n = 19) of images as diagnostically adequate [5]. This involved a variety of settings including the back of ambulances, roadside and whilst in fixed- and rotary-wing aircraft. Roline et al. focussed purely on in-flight thoracic POCUS imaging [4]. The results of n = 81 saved images were reviewed by a recognised expert in POCUS who was blinded to flight-crew interpretations. They rated image quality as Good (54%) and Poor (44%) [4].
Snaith et al. compared the results of Extended Focused Assessment with Sonography in Trauma (eFAST) imaging conducted in an ED, versus a stationary ambulance, versus a moving ambulance [33]. A total of n = 36 examinations were performed in these settings by experienced emergency physicians or sonographers. When graded by an experienced clinical academic sonographer, no significant difference was observed between the quality of images produced. Although the mean time to conduct the eFAST examination was 20 s longer in the moving ambulance compared to the other two settings, this was not deemed statistically significant (p = 0.15). This study was a small feasibility study which the authors admit was likely underpowered.
The study by Lyon et al. differed in that it involved an airborne model consisting of an air-filled intravenous pressure bag placed inside another pressure bag to simulate the pleural interface of the lungs [24]. Air was injected between the bags to simulate pneumothorax. The images published in this paper show that the model produced a seemingly life-like depiction characteristic of an ultrasound image of the pleural interface. Hence, despite this being a simulation study, it was deemed appropriate for inclusion in this review. A total of n = 16 M-mode images of the model pleura were obtained whilst in flight. These simulated no-pneumothorax (no air injected) and pneumothorax (air injected) in various flight configurations. Four emergency physicians experienced in the use of ultrasound to detect pneumothorax reviewed the captured images independently. They were blinded to the simulation and constituted the reference test on which to deduce the accuracy of M-mode imaging to detect pneumothorax in flight. They reported M-mode tracing during thoracic POCUS examination had a fine sawtooth wave pattern which was more pronounced in flight than on the ground [24]. However, this did not impede image interpretation. The authors concede that human tissue may have behaved differently.
Three studies involving n = 317 separate images reported the average time it took to complete the POCUS examination [6,23,25]. Most of these scans (60%, n = 190) were completed by nurses or paramedics. The mean time to conduct the P-CHEST assessment (including cardiac views) was two minutes and the time limit of five minutes was never exceeded [23]. Similarly, the average time to complete the entire eFAST examination (including cardiac and abdominal views) in a study by Yates et al. was also around two minutes [6]. Unexpectedly, overall on-scene time was reduced by an average of four minutes after the introduction of POCUS into this service [6]. This was attributed to the POCUS training program, and the result yielded by the eFAST examinations, improving decision making on rapid transportation. However, confounding due to the Hawthorne effect cannot be discounted. In the Scharonow et al. study, the time to complete a thoracic POCUS examination was approximately 30 s [25]. Regression analysis revealed that the use of ultrasound did not have a statistically significant impact on mission time [25]. The most common reason reported for poor image quality in studies was larger body habitus, a complication shared in all settings [5,6,22,23,35]. Authors also cited short flight times and packaging as barriers to image acquisition [2,4].

Synthesis of results: meta-analysis
Of the twelve studies yielded by the search strategy [1-6, 22-26, 33], four were excluded from meta-analysis [23,25,26,33]. The Snaith et al. study was excluded as it did not report sufficient accuracy data [33]. The Neesse et al. and Scharonow et al. studies were excluded on account of there being no pneumothoraces present in the samples [23,25]. Khalil et al. reported the number of thoracostomies performed [26]. As this outcome was not necessarily an indication of POCUS interpretation, this data was also excluded.

Additional analysis
Studies involving multiple settings did not differentiate between in-flight and other settings when reporting accuracy. Hence, it was not possible to conduct subgroup analysis of results exclusively reporting in-flight image acquisition (nor any a posteriori identified subgroups) as intended due to a lack of data.

Summary of evidence
This systematic review and meta-analysis quantified the sensitivity and specificity of thoracic POCUS for pneumothorax amongst HEMS providers. It also reports on the practicality of performing a thoracic POCUS examination in this setting. The included studies were all vulnerable to accusations of bias. Specificity results were unanimously precise and very high, whereas sensitivity results were imprecise and extremely variable. Meta-analysis results reflected this with low and imprecise pooled sensitivity blighted by considerable heterogeneity: 61% (95% CI: 27-87%; I2 = 94%). Pooled specificity results were precise and extremely high: 99% (95% CI: 98-100%; I2 = 89%). The highest or second highest categorisation of image quality was obtained in around half of patients.

Accuracy
The extreme variability in the sensitivity results was expected as studies involved image acquisition and subjective interpretation in an extremely variable and unpredictable setting by operators of differing abilities; such is the nature of pre-hospital care. Included studies were also vulnerable to differences on account of variations in experimental methods between studies. Further investigation of the heterogeneity was not performed as the small numbers involved would not produce meaningful analysis [36]. These heterogeneity and bias vulnerabilities demand that interpretation and external application of these sensitivity findings to a specific setting or circumstance be done in an extremely cautious manner. The pooled sensitivity result was much lower compared to previously published reviews involving the ED setting. The National Institute for Health and Care Excellence reported that thoracic POCUS in the ED for pneumothorax had pooled sensitivity 85% (95% CI: 68-95%) [37]. This was superior to X-ray [37]. Four other reviews reported similarly higher and more consistent sensitivity results and also corroborated the superiority of ED POCUS over X-ray [38-41]. This can be explained by the ED setting being more controlled, posing fewer environmental challenges. It may also be explained by differences in operator training and experience: the ED reviews differed in that they involved studies where operators were clinicians with previous experience in POCUS. However, a more recently published review in the ED setting by Netherton et al. reported a pooled sensitivity 69% (95% CI: 66-73%) [42]. For various reasons, this latter paper included different studies with lower sensitivities compared to the former reviews. These lower sensitivities may be partly explained by recent advances in CT imaging accuracy and higher instances of occult detection due to an increase in routine CT scanning. Their pooled results also suffered with considerable heterogeneity [38-42].
Previously published reviews involving pre-hospital thoracic POCUS for pneumothorax are exclusively narrative in nature [43-48]. Authors cite a paucity in suitable evidence at the time of writing as the reason why they did not conduct quantitative analysis. However, they did unanimously conclude that pre-hospital ultrasound (including thoracic POCUS for pneumothorax) is feasible and useful, but only for some patients. The extremely variable and seemingly unpredictable sensitivity results in this literature review render thoracic POCUS an inappropriate tool to rule out pneumothorax in the HEMS setting. However, this is from purely a diagnostic and academic perspective, not necessarily a practical perspective considering sensitivity in terms of the clinical significance of the pneumothoraces. Two of the studies involving around a third (n = 679) of the total images also reported on only pneumothoraces requiring intervention [2,6]. Yates et al. revealed a reduced prevalence (compared to all pneumothoraces) of 5% (95% CI: 3-10%) versus 9% (95% CI: 6-15%); but crucially, an increased sensitivity of 40% (95% CI: 12-74%) versus 22% (95% CI: 6-48%) [6]. The same trend was reported in the Press et al. study. They reported a reduced prevalence of 4% (95% CI: 2-6%) versus 9% (95% CI: 6-12%); and again, increased sensitivity: 50% (95% CI: 22-58%) versus 19% (95% CI: 9-34%) [2]. Three studies also reported on comparisons between pre-hospital thoracic POCUS versus in-hospital diagnosis (prior to CT) [2,6,22]. In the Yates et al. study, when the receiving trauma team's assessment was used as the reference test, sensitivity increased to 67% (95% CI: 22-96%) versus 22% (95% CI: 6-48%) [6]. Similarly, the poor sensitivity rate in the Press et al. study was replicated in X-ray imaging [2]. Of the n = 35 false negative HEMS interpretations, n = 31 were also false negative on X-ray. Sensitivity in the Ketelaars et al. study was also comparable to X-ray [22]. Of the n = 15 false negatives, n = 12 were also false negative on X-ray. Pneumothorax was only evident on CT in these cases, thus rendering their significance questionable [22].
The accuracy results of this literature review appear to present an overly pessimistic representation of sensitivity when considered in the context of clinical significance. Sensitivity improves when evaluating patient focussed outcomes rather than diagnostic ones. Although some pneumothoraces are missed, these are only apparent on CT (not clinically or on X-ray). For the missed cases that subsequently underwent an in-hospital intervention, the appropriateness of intervening in these cases in the HEMS setting is debatable. The difficulty of conducting a clinical assessment (specifically auscultating) in the HEMS setting seemingly hinders diagnosis of pneumothorax [9,10]. Combining difficulties in diagnosis with an imperative to treat this potentially life threatening condition may contribute towards performing unnecessary pre-hospital thoracostomies. A study reviewing n = 56 pre-hospital thoracostomies revealed around a quarter were unnecessary as no pneumothorax had been present [49]. Another study reported no evidence of pneumothorax in 79% (n = 15) of n = 19 cases where pre-hospital thoracostomies had been performed [50]. One HEMS service reported blindly performing thoracostomies on all pulseless trauma patients to relieve a potential tension pneumothorax [12]. They concluded this subsequently appeared unnecessary in 90% (n = 130) of cases.
It appears that thoracic POCUS can be used to mitigate against performing such unnecessary thoracic procedures. Lyon et al. reported a 21% decrease in chest decompressions performed following the introduction of thoracic POCUS into service [51]. Other authors also corroborate this hypothesis [6,22,51,52]. It appears that usefulness is not limited to mitigating unnecessary interventions. Case reports also describe how thoracic POCUS enabled timely differential diagnosis and directed targeted treatments in rapidly deteriorating patients [6,35]. Nevertheless, there appears to be no high-quality evidence reporting the usefulness of thoracic POCUS for pneumothorax. The evidence is limited to case reports demonstrating benefit for a small number of patients only. Crucially, there is no suggestion of performing a thoracic POCUS examination having a deleterious effect.

Practicality
Despite the challenges of the pre-hospital and helicopter environment, results revealed that it is possible to obtain diagnostically adequate images in the HEMS setting. In general, it was reported that the highest or second highest categorisation of image quality was obtained in around half of patients. Unexpectedly, the greatest barrier to image acquisition was cited as body habitus, as opposed to one of the difficulties more commonly associated with the pre-hospital and helicopter environments. In-flight image acquisition was reportedly more difficult, but nevertheless possible.
The time it took clinicians to perform the examination was negligible. Regardless, prolonging on-scene times is arguably a moot point. Concerns around prolonging on-scene times leading to worse outcomes are usually attributable to the Golden Hour mantra [53]. This timeliness paradigm dictates that on-scene time should be minimised for the critically injured. This is so their needs can be met at hospital within an hour of injury [53]. The crux is that this relies on the presupposition that their immediate needs can only be met at an appropriate trauma hospital. The advent of more advanced pre-hospital diagnostic and interventional capabilities means this is no longer the case [54]. Besides, some patients will require significant intervention to avoid mortality much sooner [55].
Most of the images in the studies included in the literature review were conducted by previous novices to ultrasound. They demonstrated they were able to acquire and interpret thoracic POCUS images after undergoing only short training courses. Studies evaluating such curricula conclude that these programmes enable operators to competently acquire and interpret thoracic images to assess for pneumothorax [27,29,56,57]. This may be explained by pre-hospital clinicians being already familiar with locating intercostal spaces due to familiarity with performing needle thoracostomies. In addition, unlike some other aspects of ultrasound, differentiating between normal and pathological findings involves assessment for several relatively easily distinguishable features.

Limitations
The study selection process was conducted by one person, thus rendering it vulnerable to selection bias. This was mitigated to some extent by application of objective selection criteria and transparent reporting of the reasons for not including papers in both qualitative and quantitative analysis. The search yielded only a small number of studies, and these included relatively few images. The predominantly high and unclear risk of bias associated with the included studies compromises the validity of the meta-analytical estimates.
Meta-analysis was blighted by apparent considerable heterogeneity indicated by an I² value ≥ 75%. However, calculating separate I² statistics for sensitivity and specificity fails to account for any correlation between the two [36]. This can result in an over-estimation of the degree of heterogeneity [36]. Comparison with visual assessment of the forest plots revealed that the high level of heterogeneity in sensitivity is corroborated by little overlap in some of the relatively wide (imprecise) 95% CIs. Conversely, the high level of heterogeneity in specificity is contested by a consistent overlap in their narrow 95% CIs. Whilst heterogeneity exists in the specificity results, its magnitude remains debatable. It was not possible to conduct analysis of exclusively in-flight image acquisition, nor of any sub-groups identified a posteriori.
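To make the univariate I² statistic discussed above concrete, the following is a minimal sketch of how it can be derived from Cochran's Q for a single accuracy measure such as sensitivity. It is an illustration only and not the method used in this review: the function names, the logit transformation with a 0.5 continuity correction, and the per-study counts are all assumptions introduced here, and the counts are hypothetical rather than taken from the included studies.

```python
import math


def logit_and_variance(events, non_events):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is added to both cells (an assumption
    made here so that zero cells do not break the logarithm).
    """
    a = events + 0.5
    b = non_events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def i_squared(effects, variances):
    """Higgins' I² (as a percentage) from inverse-variance weighted Cochran's Q."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    # I² is truncated at zero when Q is smaller than its degrees of freedom.
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical (true positive, false negative) counts per study.
hypothetical_studies = [(2, 18), (18, 2), (10, 10), (5, 15)]
pairs = [logit_and_variance(tp, fn) for tp, fn in hypothetical_studies]
sensitivity_i2 = i_squared([y for y, _ in pairs], [v for _, v in pairs])
print(f"I² for sensitivity: {sensitivity_i2:.1f}%")
```

Because this calculation treats sensitivity in isolation, it shares the limitation described above: it ignores any correlation between sensitivity and specificity across studies and may therefore over-estimate heterogeneity.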
In the Ronaldson et al. study, practitioners correctly diagnosed a pneumothorax that was excluded from analysis as the image was deemed not diagnostically adequate by the reviewers (pneumothorax was confirmed using X-ray) [5]. This highlights the inability to account for other clinical variables associated with pneumothorax that may bias diagnosis.
The test used for funnel plot asymmetry has low power when data are heterogeneous [20]. The Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy recommends caution when using it to assess for publication bias in these cases [21]. Hence, there appeared to be no useful method of determining the risk of publication bias that would yield meaningful results. However, unlike in meta-analysis of interventional data, it appears unclear to what extent (if at all) the potential for publication bias compromises the validity of meta-analysis of diagnostic test accuracy results.
This review was performed according to the PRISMA-DTA checklist, with a prospectively submitted protocol and application of validated tools. However, some aspects, such as assessment of bias and determining sources of heterogeneity, were unavoidably subjective.

Conclusions
It is possible to acquire diagnostically adequate thoracic POCUS images during HEMS missions. It also appears that novices to ultrasound can be taught to acquire and interpret images in this setting after relatively short training programmes. Specificity results are consistently very high and precise. Sensitivity appears imprecise and extremely unpredictable. This can be explained by differences in operator ability, settings, and the various environmental challenges associated with this area of practice. Sensitivity appears to increase when only clinically significant pneumothoraces are considered. The relevance of the false negatives in the HEMS setting is debatable. Irrespective, POCUS appears superior to auscultation with a conventional stethoscope when assessing for pneumothorax. It can appropriately alter treatment and triage decisions, but only for a small number of patients. This is predominantly on account of its apparent potential to reduce the number of unnecessary procedures. This hypothesis and the benefits it may yield require further research. Randomised controlled methodologies reporting on patient-focused outcomes are required. Reporting on mortality or morbidity may prove impractical. Future research may need to involve patient-focused surrogate outcomes such as numbers of clinically significant pneumothoraces detected, or the appropriateness of pre-hospital thoracic interventions performed or withheld. Crucially, it should account for potential confounding. In the meantime, thoracic POCUS appears to offer a more appropriate visual (rather than audible) alternative to auscultation for breath sounds when assessing for pneumothorax in the HEMS setting. It is imperative that users remain mindful that in the HEMS setting, environmental factors can compromise the high sensitivity (but not the specificity) previously reported in studies involving the ED setting.

Appendix 1: Search terms
Embase was searched using the following EMTREE (/exp) and free text (ab,ti) terms: