
Clinical Validation of a Deep Learning Algorithm for Detection of Pneumonia on Chest Radiographs in Emergency Department Patients with Acute Febrile Respiratory Illness

Jae Hyun Kim et al. — Journal of Clinical Medicine (2020)

Early identification of pneumonia is essential in patients with acute febrile respiratory illness (FRI). We evaluated the performance and added value of a commercial deep learning (DL) algorithm in detecting pneumonia on chest radiographs (CRs) of patients visiting the emergency department (ED) with acute FRI. This single-centre, retrospective study included 377 consecutive patients who visited the ED between August 2018 and January 2019 and the resulting 387 CRs. The performance of the DL algorithm in detecting pneumonia on CRs was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). In an observer performance test, three ED physicians independently reviewed the CRs to detect pneumonia and re-evaluated them with the assistance of the algorithm eight weeks later. AUROC, sensitivity, and specificity were compared between “DL algorithm” and “physicians-only” readings and between “physicians-only” and “physicians aided with the algorithm” readings. Among the 377 patients, 83 (22.0%) had pneumonia. AUROC, sensitivity, specificity, PPV, and NPV of the algorithm for detection of pneumonia on CRs were 0.861, 58.3%, 94.4%, 74.2%, and 89.1%, respectively. For the detection of ‘visible pneumonia on CR’ (60 CRs from 59 patients), AUROC, sensitivity, specificity, PPV, and NPV were 0.940, 81.7%, 94.4%, 74.2%, and 96.3%, respectively. In the observer performance test, the algorithm performed better than the physicians for pneumonia (AUROC, 0.861 vs. 0.788, p = 0.017; specificity, 94.4% vs. 88.7%, p < 0.0001) and visible pneumonia (AUROC, 0.940 vs. 0.871, p = 0.007; sensitivity, 81.7% vs. 73.9%, p = 0.034; specificity, 94.4% vs. 88.7%, p < 0.0001). Detection of pneumonia (sensitivity, 82.2% vs. 53.2%, p = 0.008; specificity, 98.1% vs. 88.7%, p < 0.0001) and ‘visible pneumonia’ (sensitivity, 82.2% vs. 73.9%, p = 0.014; specificity, 98.1% vs. 88.7%, p < 0.0001) improved significantly when the physicians used the algorithm. Mean reading time for the physicians decreased from 165 to 101 min with the assistance of the algorithm. Thus, the DL algorithm outperformed ED physicians in diagnosing pneumonia, particularly pneumonia visible on CR, and improved the physicians' diagnostic performance in patients with acute FRI.
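The sensitivity, specificity, PPV, and NPV figures quoted for the algorithm all derive from a 2x2 table of radiograph-level counts. As a minimal sketch, the Python below computes the four measures from such a table. The counts themselves (49 true positives, 35 or 11 false negatives, 17 false positives, 286 true negatives) are not stated in the abstract: they are an assumption, back-calculated here so that they total 387 CRs for all pneumonia and reproduce the reported percentages. The function name `diagnostic_metrics` is also hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),  # true positives among diseased
        "specificity": 100 * tn / (tn + fp),  # true negatives among non-diseased
        "ppv": 100 * tp / (tp + fp),          # diseased among positive calls
        "npv": 100 * tn / (tn + fn),          # non-diseased among negative calls
    }


# ASSUMED counts, inferred from the reported percentages (not given in the abstract).
all_pneumonia = diagnostic_metrics(tp=49, fn=35, fp=17, tn=286)
# Restricting positives to the 60 CRs with visible pneumonia changes only fn.
visible_pneumonia = diagnostic_metrics(tp=49, fn=11, fp=17, tn=286)

for label, metrics in [("all pneumonia", all_pneumonia),
                       ("visible pneumonia", visible_pneumonia)]:
    print(label, {name: round(value, 1) for name, value in metrics.items()})
```

Under these assumed counts, the sketch gives 58.3%, 94.4%, 74.2%, and 89.1% for all pneumonia, and a sensitivity of 81.7% with an NPV of 96.3% for visible pneumonia, in line with the abstract. Specificity and PPV are identical in both analyses because only the number of false negatives differs.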


Jae Hyun Kim (1), Jin Young Kim (1), Gun Ha Kim (1), Donghoon Kang (2), In Jung Kim (2), Jeongkuk Seo (2), Jason R. Andrews (3) and Chang Min Park (4)

(1) Department of Radiology, Armed Forces Goyang Hospital, 215, Hyeeum-ro, Deogyang-gu, Goyang-si, Gyeonggi-do 10271, Korea
(2) Department of Internal Medicine, Armed Forces Goyang Hospital, 215, Hyeeum-ro, Deogyang-gu, Goyang-si, Gyeonggi-do 10271, Korea
(3) Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, 291 Campus Drive, Stanford, CA 94305, USA
(4) Department of Radiology and Institute of Radiation Medicine, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea

