
Observer Performance Study to Examine the Feasibility of the AI-powered PD-L1 Analyzer to Assist Pathologists’ Assessment of PD-L1 Expression Using Tumor Proportion Score in Non-Small Cell Lung Cancer

Seokhwi Kim et al. — ASCO(2022)



Programmed death ligand 1 (PD-L1) expression is the standard biomarker for PD-L1 inhibitors in advanced non-small cell lung cancer (NSCLC). However, evaluation of the PD-L1 tumor proportion score (TPS) by pathologists is subject to inter-observer variation and is time-consuming. This study aimed to evaluate the benefit of an artificial intelligence (AI) algorithm in assisting pathologists to determine TPS on PD-L1 immunohistochemistry (IHC) whole-slide images (WSIs) in NSCLC.


Lunit SCOPE PD-L1, an AI-powered PD-L1 TPS analyzer, was developed from 393,565 tumor cells annotated by board-certified pathologists for PD-L1 expression in 802 WSIs stained with 22C3 pharmDx IHC. The AI model is based on a region-based convolutional neural network; it detects and counts PD-L1 positive and negative tumor cells in WSIs to calculate TPS. Seven independent board-certified pathologists, serving as ground truth (GT) readers, scored PD-L1 TPS on 199 WSIs of NSCLC stained with 22C3 pharmDx IHC. TPS from each GT reader was grouped as negative (< 1%), low (1% to 49%), or high (≥ 50%). The GT of each slide was determined by the consensus of the GT readers. Another twelve independent board-certified pathologists scored PD-L1 TPS on the same WSIs as observer performance testers (OPT). They scored TPS twice, once with and once without AI assistance, separated by a washout interval of 4 weeks. The change in TPS accuracy and the reading time of OPT readers with and without AI assistance were analyzed.
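The TPS calculation and grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Lunit SCOPE PD-L1 implementation: the function names are hypothetical, values between 49% and 50% are assumed to fall in the low group, and "consensus" is assumed here to mean a strict plurality vote, which the abstract does not specify.

```python
from collections import Counter
from typing import Optional, Sequence


def tumor_proportion_score(positive_cells: int, negative_cells: int) -> float:
    """Return TPS as the percentage of PD-L1 positive tumor cells."""
    total = positive_cells + negative_cells
    if total == 0:
        raise ValueError("no tumor cells counted")
    return 100.0 * positive_cells / total


def tps_group(tps: float) -> str:
    """Map a TPS percentage to the negative / low / high groups."""
    if tps < 1.0:
        return "negative"
    # Assumption: anything below 50% that is not negative counts as low.
    if tps < 50.0:
        return "low"
    return "high"


def consensus_group(groups: Sequence[str]) -> Optional[str]:
    """Assumed consensus rule: strict plurality; None when the top groups tie."""
    if not groups:
        raise ValueError("no reader groups given")
    counts = Counter(groups).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]
```

For example, a slide with 30 positive and 70 negative tumor cells has a TPS of 30% and falls in the low group.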


The standalone accuracy of the AI model was 0.809 (95% confidence interval [CI]: 0.690–0.941). With AI assistance, the overall accuracy of TPS increased from 0.799 (95% CI: 0.764–0.836) to 0.832 (95% CI: 0.796–0.869) (P = 0.004). AI assistance increased accuracy in 11 of the 12 OPT readers. A generalized linear mixed model showed that AI assistance and specimen type affected the probability of a correct answer, while the order of reading did not (Table 1). The mean reading time with AI was 195.4 ± 506.5 seconds (mean ± standard deviation), significantly shorter than the mean reading time without AI (285.1 ± 1578.4 seconds; P < 0.001).
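As a rough illustration of the accuracy figures above, the sketch below treats accuracy as the fraction of reads whose TPS group matches the GT group. That definition is an assumption inferred from the abstract, the function name is hypothetical, and the slide labels are made-up toy data rather than study results.

```python
from typing import Dict


def group_accuracy(reads: Dict[str, str], ground_truth: Dict[str, str]) -> float:
    """Fraction of slides where the reader's TPS group equals the GT group."""
    if not reads:
        raise ValueError("no reads given")
    correct = sum(
        1 for slide_id, group in reads.items() if ground_truth[slide_id] == group
    )
    return correct / len(reads)


# Toy data for illustration only (not from the study).
gt = {"s1": "high", "s2": "low", "s3": "negative", "s4": "low"}
without_ai = {"s1": "high", "s2": "negative", "s3": "negative", "s4": "high"}
with_ai = {"s1": "high", "s2": "low", "s3": "negative", "s4": "high"}

accuracy_without_ai = group_accuracy(without_ai, gt)  # 2 of 4 correct
accuracy_with_ai = group_accuracy(with_ai, gt)        # 3 of 4 correct
```

The study itself compared the two conditions with a generalized linear mixed model; this sketch covers only the per-condition accuracy arithmetic.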


This study demonstrates that an AI-powered PD-L1 TPS analyzer can assist board-certified pathologists in evaluating TPS of NSCLC by improving the accuracy of TPS group evaluation and reducing the time to read slides.

Table 1. Generalized linear mixed model of various factors that can influence the result of evaluating the correct TPS group


Seokhwi Kim1, Hyojin Kim2, Wonkyung Jung3, Soo Ick Cho3, Joel Bentz4, Warren Clingan4, Kevin Golden4, Ramir Arcega4, Hyunsik Bae5, Dawn L. Butler4, Sangjoon Choi5, Maryam Farinola4, Daniel Harrison4, Soohyun Hwang5, Minsun Jung6, Nilesh Kashikar4, Hyunsung Kim7, Julia Manny4, Carmen Winters4, Jonathan H. Hughes4.

1Department of Pathology, Ajou University School of Medicine, Suwon, Republic of Korea. 2Department of Pathology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea. 3Lunit Inc., Seoul, Republic of Korea. 4Aurora Research Institute, FL, United States. 5Department of Pathology and Translational Genomics, Samsung Medical Center, Seoul, Republic of Korea. 6Department of Pathology, Yonsei University College of Medicine, Seoul, Republic of Korea. 7Department of Pathology, Hanyang University College of Medicine, Seoul, Republic of Korea.

