To perform test-retest reproducibility analyses for a deep learning–based automatic detection algorithm (DLAD) using two stationary chest radiographs (CRs) acquired at a short-term interval, to analyze factors influencing test-retest variation, and to investigate the robustness of the DLAD to simulated post-processing and positional changes.
This retrospective study included patients with pulmonary nodules resected in 2017. Preoperative CRs without interval changes were used. Test-retest reproducibility was analyzed in terms of median differences of abnormality scores, intraclass correlation coefficients (ICC), and 95% limits of agreement (LoA). Factors associated with test-retest variation were investigated using univariable and multivariable analyses. Shifts in classification between the two CRs were analyzed using pre-determined cutoffs. Radiograph post-processing (blurring and sharpening) and positional changes (translations in x- and y-axes, rotation, and shearing) were simulated and agreement of abnormality scores between the original and simulated CRs was investigated.
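The agreement metrics named above can be illustrated with a minimal sketch. It assumes abnormality scores are expressed as percentages, that the "median difference" is taken over absolute paired differences, and that the ICC is the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the abstract does not state which ICC variant or software was used, so these choices and all function names are illustrative only.

```python
from statistics import mean, median, stdev


def median_abs_difference(test, retest):
    """Median of absolute paired differences (assumed definition)."""
    return median(abs(r - t) for t, r in zip(test, retest))


def limits_of_agreement(test, retest):
    """Bland-Altman 95% limits of agreement: bias +/- 1.96 * SD of differences."""
    diffs = [r - t for t, r in zip(test, retest)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation
    return bias - spread, bias + spread


def icc_2_1(test, retest):
    """ICC(2,1) for two measurements per patient (assumed ICC variant)."""
    rows = list(zip(test, retest))
    n, k = len(rows), 2
    grand = mean(v for row in rows for v in row)
    row_means = [mean(row) for row in rows]
    col_means = [mean(test), mean(retest)]
    # Two-way ANOVA sums of squares: patients (rows), sessions (columns), error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in rows for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical scores (%) for four patients; not data from the study.
first_cr = [10, 20, 30, 40]
second_cr = [12, 18, 33, 41]
low, high = limits_of_agreement(first_cr, second_cr)
icc = icc_2_1(first_cr, second_cr)
```

With real data, `first_cr` and `second_cr` would hold the model output for each patient's two radiographs in the same patient order.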
Our study analyzed 169 patients (median age, 65 years; 91 men). The median difference of abnormality scores was 1–2% and ICC ranged from 0.83 to 0.90. The 95% LoA was approximately ±30%. Test-retest variation was negatively associated with solid portion size (β, −0.50; p = 0.008) and good nodule conspicuity (β, −0.94; p < 0.001). A small fraction (15/169) showed discordant classifications when the high-specificity cutoff (46%) was applied to the model outputs (p = 0.04). DLAD was robust to the simulated positional change (ICC, 0.984, 0.996), but relatively less robust to post-processing (ICC, 0.872, 0.968).
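The classification-shift analysis can be sketched as a count of patients whose binary label flips between the two radiographs at a fixed cutoff. This is a minimal illustration, assuming a score at or above the cutoff counts as positive; the function name and the example scores are hypothetical, and the statistical test behind the reported p value is not reproduced here.

```python
def discordant_pairs(test, retest, cutoff):
    """Return (negative-to-positive, positive-to-negative) flip counts.

    A score >= cutoff is treated as positive (assumed convention).
    """
    to_positive = sum(t < cutoff <= r for t, r in zip(test, retest))
    to_negative = sum(r < cutoff <= t for t, r in zip(test, retest))
    return to_positive, to_negative


# Hypothetical scores (%) for five patients; not data from the study.
first_cr = [10, 44, 50, 80, 47]
second_cr = [12, 48, 45, 85, 46]
to_positive, to_negative = discordant_pairs(first_cr, second_cr, cutoff=46)

# Reported discordance rate: 15 of 169 patients.
reported_rate = round(100 * 15 / 169, 1)  # 8.9
```

In the example, one patient moves from negative to positive (44 to 48) and one from positive to negative (50 to 45), while the patient scoring 47 and then 46 stays positive under the assumed convention.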
DLAD was robust to the test-retest variation. However, inconspicuous nodules may cause fluctuations of the model output and subsequent misclassifications.
• The deep learning–based automatic detection algorithm was robust to the test-retest variation of the chest radiographs in general.
• The test-retest variation was negatively associated with solid portion size and good nodule conspicuity.
• The high-specificity cutoff (46%) resulted in discordant classifications between the test and retest radiographs in 8.9% of patients (15/169; p = 0.04).
Hyungjin Kim, Chang Min Park & Jin Mo Goo