NEW research has shown that an AI system can detect clinically significant prostate cancer on MRI more accurately than radiologists. These findings could substantially aid the diagnostic pathway for prostate cancer, easing the ever-increasing workload of healthcare professionals and reducing dependence on experienced radiologists.
In this international, paired, non-inferiority, confirmatory study, the team trained and externally validated an AI system (developed within an international consortium) for detecting Gleason grade group 2 or greater cancers using a retrospective cohort of 10,207 MRI examinations from 9,129 patients. Of these examinations, 9,207 cases from three centres (11 sites) based in the Netherlands were used for training and tuning, and 1,000 cases from four centres (12 sites) based in the Netherlands and Norway were used for testing. They also conducted a multireader, multicase observer study in which 62 radiologists (45 centres in 20 countries; median 7 years of experience in reading prostate MRI) used the Prostate Imaging–Reporting and Data System (PI-RADS) (2.1) on 400 paired MRI examinations from the testing cohort. The primary endpoints were the sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) of the AI system, compared with those of all readers using PI-RADS (2.1) and with those of the historical radiology readings made during multidisciplinary routine practice.
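For readers less familiar with the three primary endpoints, the following is a minimal sketch of how sensitivity, specificity, and AUROC are computed in general. It is not the study's analysis code: the function names, the 0.5 threshold, and the eight toy cases are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int],
                            predictions: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and binary calls."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = clinically significant cancer, 0 = no such cancer.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]

# Hypothetical operating point: call a case positive at a score of 0.5 or more.
predictions = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(labels, predictions)
print(sens, spec)             # 0.75 0.75
print(auroc(labels, scores))  # 0.9375
```

Sensitivity and specificity depend on the chosen operating point, whereas AUROC summarises ranking performance across all thresholds, which is why the study reports comparisons "at the same specificity" or "at the same sensitivity" alongside the AUROC.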
The researchers found that, of the 10,207 examinations, 2,440 cases had histologically confirmed Gleason grade group 2 or greater prostate cancer. In the subset of 400 testing cases in which the AI system was compared with the radiologists participating in the reader study, the AI system showed a statistically superior and non-inferior AUROC of 0·91 (95% confidence interval [CI] 0·87-0·94; p<0·0001), in comparison to the pool of 62 radiologists with an AUROC of 0·86 (0·83-0·89), with a lower boundary of the two-sided 95% Wald CI for the difference in AUROC of 0·02. At the mean PI-RADS 3 or greater operating point of all readers, the AI system detected 6·8% more cases with Gleason grade group 2 or greater cancers at the same specificity (57·7%, 95% CI 51·6-63·3), or 50·4% fewer false-positive results and 20·0% fewer cases with Gleason grade group 1 cancers at the same sensitivity (89·4%, 95% CI 85·3-92·9). In all 1,000 testing cases where the AI system was compared with the radiology readings made during multidisciplinary practice, non-inferiority was not confirmed, as the AI system showed lower specificity (68·9% [95% CI 65·3-72·4] vs 69·0% [65·5-72·5]) at the same sensitivity (96·1%, 94·0-98·2) as the PI-RADS 3 or greater operating point. The lower boundary of the two-sided 95% Wald CI for the difference in specificity (−0·04) was greater than the non-inferiority margin (−0·05) and a p value below the significance threshold was reached (p<0·001).
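The interval-versus-margin comparisons quoted above can be sketched as follows. This illustrates only the generic rule of comparing the lower boundary of a two-sided Wald CI with a margin (or with zero for superiority); it does not reproduce the study's full prespecified analysis, on which its stated conclusions rest. The function names are hypothetical, and the standard error passed to `wald_ci` is an invented value, not a figure from the paper.

```python
from statistics import NormalDist
from typing import Tuple


def wald_ci(estimate: float, std_error: float,
            level: float = 0.95) -> Tuple[float, float]:
    """Two-sided Wald interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return estimate - z * std_error, estimate + z * std_error


def exceeds_margin(lower_bound: float, margin: float) -> bool:
    """True when the CI lower boundary lies above the non-inferiority margin."""
    return lower_bound > margin


def exceeds_zero(lower_bound: float) -> bool:
    """True when the CI lower boundary lies above zero (superiority)."""
    return lower_bound > 0


# Lower boundaries and margin as reported in the text above.
specificity_diff_lower = -0.04
non_inferiority_margin = -0.05
auroc_diff_lower = 0.02

print(exceeds_margin(specificity_diff_lower, non_inferiority_margin))  # True
print(exceeds_zero(auroc_diff_lower))                                  # True

# Hypothetical inputs, for illustration only: a difference of 0.05 with an
# assumed standard error of 0.015.
lower, upper = wald_ci(0.05, 0.015)
print(round(lower, 3), round(upper, 3))
```

A margin of −0·05 means that a loss of up to five percentage points is tolerated before the AI system would be considered inferior on that measure.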
This research has shown that, in a reader study of 400 cases, an AI system was superior to radiologists using PI-RADS (2.1) at detecting clinically significant prostate cancer. Implementing such a system could substantially improve diagnostic practice, with benefits for both clinicians and patients. More research is needed; however, these findings are promising for the future of prostate cancer diagnosis.
Victoria Antoniou, EMJ, London, UK
Reference
Saha A et al. Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study. Lancet Oncol. 2024;S1470-2045(24)00220-1.