AUTHORS
Thiego Ramon Soares (a), Roberto Dias de Oliveira (a,b), Yiran E. Liu (c), Andrea da Silva Santos (a), Paulo Cesar Pereira dos Santos (a), Luma Ravena Soares Monte (b), Lissandra Maia de Oliveira (d), Chang Min Park (e,f), Eui Jin Hwang (e,f), Jason R. Andrews (c,i), Julio Croda (d,g,h,i)
(a) Faculty of Health Sciences, Federal University of Grande Dourados, Dourados, MS, Brazil
(b) Nursing School, State University of Mato Grosso do Sul, Dourados, MS, Brazil
(c) Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, United States of America
(d) Oswaldo Cruz Foundation, Campo Grande, MS, Brazil
(e) Department of Radiology, Seoul National University College of Medicine, Seoul, Korea
(f) Department of Radiology, Seoul National University Hospital, Seoul, Korea
(g) Department of Epidemiology of Microbial Diseases, Yale University School of Public Health, New Haven, CT, United States of America
(h) School of Medicine, Federal University of Mato Grosso do Sul, Campo Grande, MS, Brazil
Abstract
Background
The World Health Organization (WHO) recommends systematic tuberculosis (TB) screening in prisons.
Evidence is lacking for accurate and scalable screening approaches in this setting. We aimed to assess the accuracy of
artificial intelligence-based chest x-ray interpretation algorithms for TB screening in prisons.
Methods
We performed prospective TB screening in three male prisons in Brazil from October 2017 to December
2019. We administered a standardized questionnaire, performed a chest x-ray in a mobile unit, and collected sputum
for confirmatory testing using Xpert MTB/RIF and culture. We evaluated x-ray images using three algorithms
(CAD4TB version 6, LunitTB version 3.1.0.0, and qXR version 3) and compared their accuracy. We used multivariable
logistic regression to assess the effect of demographic and clinical characteristics on algorithm accuracy. Finally, we
investigated the relationship between abnormality scores and Xpert semi-quantitative results.
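
As an illustration of how the accuracy of an abnormality score can be summarized, the sketch below computes the area under the ROC curve from per-person scores and confirmed TB status. It is a minimal example and not the study's analysis code: the function name `auc` and the toy data are hypothetical, and it uses the rank-based (Mann-Whitney) formulation of the AUC.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney formulation.

    scores: abnormality scores, higher meaning more suggestive of TB.
    labels: 1 for confirmed TB, 0 otherwise.
    Returns the probability that a randomly chosen TB case scores higher
    than a randomly chosen non-case, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_scores = [0.9, 0.2, 0.8, 0.3]
toy_labels = [1, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

The pairwise loop is quadratic in sample size, which is acceptable for a cohort of a few thousand people; a sort-based implementation would be preferable for much larger datasets.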
Findings
Among 2075 incarcerated individuals, 259 (12.5%) had confirmed TB. All three algorithms performed
similarly overall, with an area under the receiver operating characteristic curve (AUC) of 0.88–0.91. At 90% sensitivity,
only LunitTB and qXR met the WHO Target Product Profile (TPP) requirements for a triage test, with specificity of 84% and
74%, respectively. The performance of all algorithms varied by age, prior TB, smoking, and presence of TB symptoms.
LunitTB was the most robust to this heterogeneity but nonetheless failed to meet the TPP for individuals with
previous TB. Abnormality scores of all three algorithms were significantly correlated with sputum bacillary load.
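
To make the "specificity at 90% sensitivity" comparison concrete, the sketch below picks the highest score threshold that still flags at least 90% of confirmed cases and reports the specificity at that threshold. It is a hedged illustration, not the study's method: the function name, the toy data, and the 70% minimum specificity used in the final check are assumptions (the last reflects our understanding of the WHO triage-test TPP and is not stated in the text above).

```python
import math


def specificity_at_sensitivity(scores, labels, target_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) for a score cutoff.

    A person is flagged when score >= threshold. The threshold is the
    highest one that still flags at least `target_sensitivity` of cases.
    """
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    # Number of cases that must be flagged; the small epsilon guards
    # against floating-point error pushing an exact product upward.
    k = max(1, math.ceil(target_sensitivity * len(pos) - 1e-9))
    threshold = pos[k - 1]
    sensitivity = sum(s >= threshold for s in pos) / len(pos)
    specificity = sum(s < threshold for s in neg) / len(neg)
    return threshold, sensitivity, specificity


# Hypothetical toy data: 10 cases followed by 5 non-cases.
toy_scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.1,
              0.58, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1] * 10 + [0] * 5
thr, sens, spec = specificity_at_sensitivity(toy_scores, toy_labels)

# Assumed minimum specificity for a triage test (not given in the text).
ASSUMED_MIN_SPECIFICITY = 0.70
meets_assumed_tpp = sens >= 0.90 and spec >= ASSUMED_MIN_SPECIFICITY
```

Fixing sensitivity first and then comparing specificity puts algorithms with differently scaled scores on a common footing, which is why a single shared cutoff is not used here.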
Interpretation
Automated x-ray interpretation algorithms can be an effective triage tool for TB screening in prisons.
However, their specificity is insufficient in individuals with previous TB.