AI-based EGFR-mutation prediction from haematoxylin and eosin (H&E) images in non-small cell lung cancer (NSCLC): A global multi-cohort validation study

Published 2025

Jongchan Park, Biagio Brattoli, Jack Shi, Michael Senior, Talha Qaiser, Jack Rawson, Woochan Hwang, Sangwon Shin, Taebum Lee, Chan-Young Ock, Sérgio Pereira, Huw Bannister, Elia Riboni-Verri, Siraj Ali, Luiza Moore

AACR, 2025

Abstract
Background: Despite guideline recommendations for EGFR testing in NSCLC patients, many patients remain under-tested because of resource constraints and long turnaround times. Existing AI-based genotype prediction models using H&E whole slide images (WSIs) have mostly been validated in limited settings, which restricts their real-world applicability. To address these challenges, we developed and validated a deep learning model that demonstrates robust and consistent performance across various clinical settings and independent patient cohorts.
Method: Our study used WSIs from 4,684 EGFR-mutated and 7,576 wild-type NSCLC cases from multiple sites, including the USA, China, and the Republic of Korea, to train convolutional neural network (CNN) and vision transformer (ViT) based models to predict EGFR mutations from H&E WSIs. To improve generalization, we adopted data-centric approaches, including augmentations, oversampling, and weighted losses, and used the ensemble of the models' prediction scores as the final prediction. We report validation performance in three cohorts: an independent internal validation cohort; a multi-scanner cohort of paired images scanned on six different scanners and at two different magnifications; and independent external validation cohorts from AstraZeneca.
Results: The internal validation cohort (n=625) and the multi-scanner cohort (n=2,261; 323 in each subgroup) included 206 (33.0%) and 168 (52.0%) EGFR-mutated samples, respectively. Our methods showed significant improvement in performance, achieving an area under the ROC curve (AUC) of 0.880, compared with 0.723 for a model we previously presented. Performance was consistent across mutation subtypes (AUC 0.843 to 0.894), specimen types (0.823 to 0.897), scanners (0.822 to 0.855), and magnifications (0.824 to 0.832). The model was further validated on independent external cohorts that include additional EGFR-mutated samples.
Conclusion: An AI model demonstrated robust validation performance across diverse settings for EGFR mutation prediction from H&E WSIs in NSCLC, representing a vital step toward real-world clinical impact.
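
The abstract names three ingredients without giving implementation details: weighted losses for the class imbalance, an ensemble of prediction scores, and AUC as the evaluation metric. The sketch below illustrates one plausible reading of each in plain Python. It is not the authors' implementation: the inverse-frequency weighting scheme, the unweighted mean used for ensembling, and all function names and toy scores are assumptions made for illustration. Only the training case counts (4,684 EGFR-mutated, 7,576 wild-type) come from the abstract.

```python
from typing import List, Sequence, Tuple


def inverse_frequency_weights(n_pos: int, n_neg: int) -> Tuple[float, float]:
    """Return (positive, negative) class weights inversely proportional to
    class frequency. Assumed scheme; the abstract only says 'weighted losses'."""
    total = n_pos + n_neg
    return total / (2 * n_pos), total / (2 * n_neg)


def ensemble_scores(per_model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-slide scores from several models with an unweighted mean.
    Assumed rule; the abstract does not say how scores are combined."""
    n_models = len(per_model_scores)
    return [sum(column) / n_models for column in zip(*per_model_scores)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class weights from the training case counts reported in the abstract.
w_mutated, w_wild_type = inverse_frequency_weights(4684, 7576)

# Toy, made-up scores for four slides (1 = EGFR-mutated, 0 = wild-type).
labels = [1, 0, 1, 0]
cnn_scores = [0.9, 0.2, 0.4, 0.5]  # hypothetical CNN outputs
vit_scores = [0.7, 0.4, 0.8, 0.3]  # hypothetical ViT outputs

final_scores = ensemble_scores([cnn_scores, vit_scores])
auc_cnn = roc_auc(labels, cnn_scores)
auc_ensemble = roc_auc(labels, final_scores)
```

In this toy data the CNN alone misranks one mutated/wild-type pair, and averaging with the second model corrects it. That shows the mechanism by which ensembling can help, not evidence about the reported model. The pairwise AUC is quadratic in cohort size, so a rank-based formulation would be preferable for cohorts of thousands of slides.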