
AI-integrated screening to replace double reading of mammograms: a population-wide accuracy and feasibility study

Mohammad T. Elhakim et al. - Radiology: Artificial Intelligence

AUTHORS

Mohammad T. Elhakim, Sarah W. Stougaard, Ole Graumann, Mads Nielsen, Oke Gerke, Lisbet B. Larsen, Benjamin S. B. Rasmussen

From the Department of Radiology (M.T.E., L.B.L., B.S.B.R.), Department of Nuclear Medicine (O. Gerke), and CAI-X–Centre for Clinical Artificial Intelligence (B.S.B.R.), Odense University Hospital, Kløvervænget 10, Entrance 112, 2nd Floor, 5000 Odense C, Denmark; Department of Clinical Research, Research and Innovation Unit of Radiology, University of Southern Denmark, Odense, Denmark (M.T.E., S.W.S., O. Graumann, O. Gerke, B.S.B.R.); Department of Radiology, Aarhus University Hospital, Aarhus, Denmark (O. Graumann); Department of Clinical Research, Aarhus University, Aarhus, Denmark (O. Graumann); and Department of Computer Science, University of Copenhagen, Copenhagen, Denmark (M.N.).

PUBLISHED

Radiology: Artificial Intelligence

Abstract

Mammography screening supported by deep learning–based artificial intelligence (AI) solutions can potentially reduce workload without compromising breast cancer detection accuracy, but the site of deployment in the workflow might be crucial. This retrospective study compared three simulated AI-integrated screening scenarios with standard double reading with arbitration in a sample of 249 402 mammograms from a representative screening population. A commercial AI system replaced the first reader (scenario 1: integrated AIfirst), the second reader (scenario 2: integrated AIsecond), or both readers for triaging of low- and high-risk cases (scenario 3: integrated AItriage). AI threshold values were chosen partly on the basis of previous validation and partly to set the screen-read volume reduction at approximately 50% across scenarios. Detection accuracy measures were calculated. Compared with standard double reading, integrated AIfirst showed no evidence of a difference in accuracy metrics except for a higher arbitration rate (+0.99%, P < .001). Integrated AIsecond had lower sensitivity (−1.58%, P < .001), negative predictive value (NPV) (−0.01%, P < .001), and recall rate (−0.06%, P = .04) but a higher positive predictive value (PPV) (+0.03%, P < .001) and arbitration rate (+1.22%, P < .001). Integrated AItriage achieved higher sensitivity (+1.33%, P < .001), PPV (+0.36%, P = .03), and NPV (+0.01%, P < .001) but a lower arbitration rate (−0.88%, P < .001). Replacing one or both readers with AI seems feasible; however, the site of application in the workflow can have clinically relevant effects on accuracy and workload.
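For readers less familiar with the accuracy measures named in the abstract, the following is a minimal sketch of how sensitivity, specificity, PPV, NPV, and recall rate are conventionally derived from screening outcome counts. The class name `ScreeningCounts`, the function `accuracy_measures`, and the counts in the usage lines are hypothetical illustrations; they are not taken from the study, and the study's own definitions (for example, how arbitration rate is computed) are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Outcome counts for one screening scenario (hypothetical container)."""
    tp: int  # recalled, cancer confirmed
    fp: int  # recalled, no cancer
    fn: int  # not recalled, cancer present
    tn: int  # not recalled, no cancer


def accuracy_measures(c: ScreeningCounts) -> dict[str, float]:
    """Return standard accuracy measures as fractions between 0 and 1."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),   # cancers that were recalled
        "specificity": c.tn / (c.tn + c.fp),   # non-cancers not recalled
        "ppv": c.tp / (c.tp + c.fp),           # recalls that were cancers
        "npv": c.tn / (c.tn + c.fn),           # non-recalls without cancer
        "recall_rate": (c.tp + c.fp) / total,  # share of exams recalled
    }


# Made-up counts for illustration only, not study data.
example = ScreeningCounts(tp=80, fp=320, fn=20, tn=9580)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Comparing two scenarios then amounts to computing these measures for each and taking the difference, which is the form in which the abstract reports its results (for example, a change in sensitivity in percentage points).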


Summary

Evaluation of three scenarios of simulated artificial intelligence (AI)–integrated screening in an entire mammography screening population showed that replacing one reader or both readers with AI in a double-reading setting can reduce workload without reducing cancer detection accuracy.


Key Points

  • Accuracy and feasibility of three simulated scenarios of artificial intelligence (AI)–integrated screening were evaluated in a large-scale cohort from population-wide double-read mammography screening.

  • Replacing both readers for triaging of low- and high-risk cases or replacing the first reader fully with AI reduced the volume of screening reads by 49.7% and 48.8%, respectively, without reducing cancer detection accuracy.

  • Fully replacing the second reader with AI reduced the volume of screening reads by 48.7% and the number of recalls by 2.2%, but at the cost of reduced sensitivity (−1.5%, P < .001).
