Example screening mammograms with an invasive ductal carcinoma (arrows) in which the women would not have been recalled with an AI-reading-only strategy. However, these examinations would have been shown to radiologists in a hybrid reading strategy based on the AI uncertainty score of the entropy of the mean probability of malignancy (PoM) score of the most suspicious region. For both examinations, mediolateral oblique (left) and craniocaudal (right) views of the affected breast are shown. (A) Images in a 67-year-old woman who was recalled because both radiologists scored the right breast as Breast Imaging Reporting and Data System (BI-RADS) 0. The woman would not have been recalled if the examination was read by the AI model, which assigned a PoM score of 40, but the prediction would have been classified as an uncertain prediction with an uncertainty quantification of 0.86. (B) Images in a 63-year-old woman who was recalled because both radiologists scored the right breast as BI-RADS 4. The woman would not have been recalled if the examination was read by the AI model, with a PoM score of 44, but the prediction would be classified as an uncertain prediction with an uncertainty quantification of 0.98. Credit: Radiological Society of North America (RSNA)
A hybrid reading strategy for screening mammography, developed by Dutch researchers and deployed retrospectively to more than 40,000 exams, reduced radiologist workload by 38% without changing recall or cancer detection rates.
The study, which emphasizes AI confidence, was published in Radiology.
“Although the overall performance of state-of-the-art AI models is very high, AI sometimes makes mistakes,” said Sarah D. Verboom, M.Sc., a doctoral candidate in the Department of Medical Imaging at Radboud University Medical Center in the Netherlands.
“Identifying exams in which AI interpretation is unreliable is crucial to allow for and optimize use of AI models in breast cancer screening programs.”
The hybrid reading strategy involves using a combination of radiologist readers and a stand-alone AI interpretation of cases in which the AI model performs as well as, or better than, the radiologist.
“We can achieve this performance level if the AI model provides not only an assessment of the probability of malignancy (PoM) for a case but also a rating of its certainty of that assessment,” Verboom said.
“Unfortunately, the PoM itself is not always a good predictor of certainty because deep neural networks tend to be overconfident in their predictions.”
To develop and evaluate a hybrid reading strategy, the researchers used a dataset of 41,469 screening mammography exams from 15,522 women (median age 59 years) with 332 screen-detected cancers and 34 interval cancers. The exams were performed between 2003 and 2018 in Utrecht, Netherlands, as part of the Dutch National Breast Cancer Screening Program.
The dataset was divided at the patient level into two equal groups with identical cancer detection, recall and interval cancer rates. The first group was used to determine the optimal thresholds for the hybrid reading strategy, while the second group was used to evaluate the reading strategies.
Of the uncertainty metrics evaluated by the researchers, the entropy of the mean PoM score of the most suspicious region produced a cancer detection rate of 6.6 per 1,000 cases and a recall rate of 23.7 per 1,000 cases, similar to rates of standard double-reading by radiologists.
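For readers unfamiliar with the metric, the Python sketch below shows one common way an entropy-of-the-mean uncertainty score can be computed. It is an illustration under stated assumptions, not the researchers' implementation: it assumes the model yields several PoM estimates for the most suspicious region, each expressed as a probability between 0 and 1, and it uses binary Shannon entropy in bits. The article does not say how the mean is formed or how scores are scaled, and the PoM scores quoted in the figure captions appear to be on a different scale, so this sketch is not expected to reproduce those values. The function names are hypothetical.

```python
import math
from statistics import fmean


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a yes/no outcome with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def entropy_of_mean(pom_samples: list[float]) -> float:
    """Average several PoM estimates (assumed to lie in [0, 1]) for one
    region, then take the entropy of that average."""
    return binary_entropy(fmean(pom_samples))


# Estimates near 0 or 1 give a low score; estimates near 0.5 give a high one.
confident = entropy_of_mean([0.01, 0.02, 0.03])
ambiguous = entropy_of_mean([0.40, 0.50, 0.60])
```

The score is highest when the averaged probability sits near 0.5, which is where a prediction is least decisive.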
The final hybrid reading strategy involved AI evaluating every screening mammogram to produce two outputs: the PoM and an uncertainty estimate of that prediction. When AI determined the PoM was below the established threshold with certainty, the case was considered normal.
When AI detected a PoM above the established threshold, women were recalled for further testing, but only when that prediction was deemed confident. Otherwise, the exam was double-read by radiologists.
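The three-way decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name, the `Decision` labels and both threshold parameters are hypothetical, the article does not report the threshold values the researchers chose, and treating a PoM exactly at the threshold as a recall is an assumption made here.

```python
from enum import Enum


class Decision(Enum):
    NORMAL = "considered normal, no radiologist read"
    RECALL = "recalled for further testing"
    DOUBLE_READ = "double-read by radiologists"


def triage(pom: float, uncertainty: float,
           pom_threshold: float, uncertainty_threshold: float) -> Decision:
    """Route one exam using the AI's PoM and its uncertainty estimate.

    Both thresholds are placeholders; in the study they were tuned on
    one half of the dataset and evaluated on the other.
    """
    if uncertainty > uncertainty_threshold:
        # The AI is not confident either way: defer to human readers.
        return Decision.DOUBLE_READ
    # The AI is confident: act on its prediction (tie goes to recall, assumed).
    return Decision.RECALL if pom >= pom_threshold else Decision.NORMAL


# Toy values chosen purely for illustration.
print(triage(pom=0.05, uncertainty=0.10, pom_threshold=0.5, uncertainty_threshold=0.6))
print(triage(pom=0.45, uncertainty=0.95, pom_threshold=0.5, uncertainty_threshold=0.6))
```

Checking uncertainty first makes the fallback explicit: any exam the model is unsure about reaches radiologists regardless of its PoM.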

The only example of a screening examination with a screen-detected cancer that would have been missed by AI in a hybrid reading strategy based on the AI uncertainty score of the entropy of the mean probability of malignancy (PoM) score of the most suspicious region. During screening, a 52-year-old woman was recalled following arbitration scoring of the right breast as Breast Imaging Reporting and Data System (BI-RADS) 4 after the first and second radiologists scored the right breast as BI-RADS 1 and 4, respectively. This woman would not have been recalled if the examination was read by the AI model, which assigned a PoM score of 30, which would be classified as a certain prediction with an uncertainty quantification of 0.57. Both the mediolateral oblique (left) and craniocaudal (right) views of the affected breast are shown. The boxes indicate the calcifications found during screening, and the final diagnosis of this examination was ductal carcinoma in situ. Credit: Radiological Society of North America (RSNA)
Although the majority of AI decisions were uncertain and deferred to a human reader, 38% were classified as certain and could be read solely by AI. Using the researchers’ strategy reduced radiologist reading workload to 61.9% without changing recall (23.6‰ vs. 23.9‰) or cancer detection (6.6‰ vs. 6.7‰) rates, both of which are comparable to those of standard double-reading.
When the AI model was certain, the area under the curve (AUC) was higher (0.96 vs. 0.87). Its sensitivity nearly matched that of double radiologist reading (85.4% vs. 88.9%). Younger women with dense breasts were more likely to have an uncertain AI score.
“The key component of our study isn’t necessarily that this is the best way to split the workload, but that it’s helpful to have uncertainty quantification built into AI models,” Verboom said. “I hope commercial products integrate this into their models, because I think it’s a very useful metric.”
Verboom noted that if the study results occurred in clinical practice, the decision to recall 19% of women would be made by AI without the intervention of a radiologist.
“Several studies have shown that women participating in breast cancer screening programs have positive attitudes about the use of AI,” she said. “However, most women prefer their mammogram to be read by at least one radiologist.”
She said it may be more acceptable for radiologists to review exams deemed uncertain by AI, as well as AI recall cases.
“The use of AI with uncertainty quantification can be a possible solution for workforce shortages and could help build trust in the implementation of AI,” Verboom said.
Verboom said further research, ideally a prospective trial, is needed to determine how the workload reduction achieved by the hybrid reading strategy could decrease radiologist reading time.
“I think in the future, we could get to a point where a portion of women are sent home without ever having a radiologist look at their mammogram because AI will determine that their exam is normal,” she said. “We’re not there yet, but I think we could get there with this uncertainty metric and quality control.”
More information:
AI Should Read Mammograms Only When Confident: A Hybrid Breast Cancer Screening Reading Strategy, Radiology (2025).
Provided by
Radiological Society of North America
Citation:
AI hybrid strategy improves mammogram interpretation (2025, August 19)
retrieved 19 August 2025
from https://medicalxpress.com/information/2025-08-ai-hybrid-strategy-mammogram.html

