The performance of a deep learning (DL) algorithm should be validated in actual clinical situations before clinical implementation.
The purpose of this study was to evaluate the performance of a DL algorithm for identifying chest radiographs with clinically relevant abnormalities in the emergency department (ED) setting.
This single-center retrospective study included consecutive patients who visited the ED and underwent initial chest radiography between January 1 and March 31, 2017. Chest radiographs were analyzed with a commercially available DL algorithm. The performance of the algorithm was evaluated by determining the area under the receiver operating characteristic curve (AUC) and the sensitivity and specificity at two predefined operating cutoffs (a high-sensitivity cutoff and a high-specificity cutoff). McNemar tests were used to compare the sensitivities and specificities of the algorithm with those of the on-call radiology residents who had interpreted the chest radiographs in actual practice. When the findings of the algorithm and the resident were discordant, the residents reinterpreted the chest radiographs with use of the algorithm’s output.
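The paired comparison described above can be sketched as an exact McNemar test, which looks only at the radiographs on which the two readers disagree. This is a minimal illustration, not the study's analysis code: the function name `mcnemar_exact` and the discordant counts `b` and `c` are hypothetical, since the abstract does not report the discordant-pair counts or say whether an exact or chi-square version of the test was used.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases called positive by reader A only (e.g., the algorithm)
    c: cases called positive by reader B only (e.g., the resident)
    Concordant cases do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is equally likely to
    # fall on either side, so min(b, c) follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only (not study data).
b, c = 50, 9
print(f"exact McNemar p-value: {mcnemar_exact(b, c):.2e}")
```

For sensitivity the test would be run on the abnormal radiographs only, and for specificity on the normal radiographs only, because each comparison is paired within the same set of images.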
A total of 1135 patients (mean age, 53 years ± 18; 582 men) were evaluated. In the identification of abnormal chest radiographs, the algorithm showed an AUC of 0.95 (95% confidence interval [CI]: 0.93, 0.96), a sensitivity of 88.7% (227 of 256 radiographs; 95% CI: 84.1%, 92.3%), and a specificity of 69.6% (612 of 879 radiographs; 95% CI: 66.5%, 72.7%) at the high-sensitivity cutoff and a sensitivity of 81.6% (209 of 256 radiographs; 95% CI: 76.3%, 86.2%) and specificity of 90.3% (794 of 879 radiographs; 95% CI: 88.2%, 92.2%) at the high-specificity cutoff. Radiology residents showed lower sensitivity (65.6% [168 of 256 radiographs; 95% CI: 59.5%, 71.4%], P < .001) and higher specificity (98.1% [862 of 879 radiographs; 95% CI: 96.9%, 98.9%], P < .001) compared with the algorithm. After reinterpretation of chest radiographs with use of the algorithm’s outputs, the sensitivity of the residents improved (73.4% [188 of 256 radiographs; 95% CI: 68.0%, 78.8%], P = .003), whereas specificity was reduced (94.3% [829 of 879 radiographs; 95% CI: 92.8%, 95.8%], P < .001).
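The reported percentages can be checked directly against the counts given above. The sketch below recomputes each proportion and adds a Wilson score interval. The Wilson method is an assumption made here for illustration: the abstract does not state how its confidence intervals were calculated, so the intervals printed by this code may differ slightly from the reported ones.

```python
from math import sqrt

# (label, numerator, denominator, percentage as reported in the abstract)
REPORTED = [
    ("algorithm sensitivity, high-sensitivity cutoff", 227, 256, 88.7),
    ("algorithm specificity, high-sensitivity cutoff", 612, 879, 69.6),
    ("algorithm sensitivity, high-specificity cutoff", 209, 256, 81.6),
    ("algorithm specificity, high-specificity cutoff", 794, 879, 90.3),
    ("resident sensitivity", 168, 256, 65.6),
    ("resident specificity", 862, 879, 98.1),
    ("resident sensitivity after reinterpretation", 188, 256, 73.4),
    ("resident specificity after reinterpretation", 829, 879, 94.3),
]


def percent(k: int, n: int) -> float:
    """Proportion k/n expressed as a percentage."""
    return 100.0 * k / n


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval in percent (assumed method, see lead-in)."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return 100 * (center - half), 100 * (center + half)


for label, k, n, reported in REPORTED:
    lo, hi = wilson_interval(k, n)
    print(f"{label}: {percent(k, n):.1f}% (reported {reported}%), "
          f"Wilson 95% CI {lo:.1f}%-{hi:.1f}%")
```

The two denominators, 256 abnormal and 879 normal radiographs, sum to the 1135 patients evaluated.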
A deep learning algorithm used with emergency department chest radiographs showed high diagnostic performance (AUC, 0.95) for identifying clinically relevant abnormalities and helped improve the sensitivity of radiology residents’ evaluation.
Eui Jin Hwang, Ju Gang Nam, Woo Hyeon Lim, Sae Jin Park, Yun Soo Jeong, Ji Hee Kang, Eun Kyoung Hong, Taek Min Kim, Jin Mo Goo, Sunggyun Park, Ki Hwan Kim, Chang Min Park
From the Department of Radiology, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea (E.J.H., J.G.N., W.H.L., S.J.P., Y.S.J., J.H.K., E.K.H., T.M.K., J.M.G., C.M.P.); and Lunit, Seoul, Korea (S.P., K.H.K.).