| Prospective Evidence on Artificial Intelligence-Assisted Melanoma Diagnostics: A Systematic Review and Meta-Analysis |
| Journal Article | DKFZ-2026-00693 |
2026
American Medical Association
Chicago, Ill.
Abstract:

Importance: Dermoscopy is a standard of care for melanoma diagnostics, and artificial intelligence (AI) systems are increasingly investigated as decision-support tools. Prospective evidence is essential to assess their performance compared with dermatologists.

Objective: To evaluate the diagnostic performance of dermatologists, AI systems, and dermatologists assisted by AI in prospective studies of melanoma detection, and to assess the readiness of AI for clinical use.

Data Sources: PubMed, Embase, Web of Science, and Google Scholar were searched from inception through July 9, 2025.

Study Selection: Eligible studies were prospective, used dermoscopic images, and reported or allowed calculation of performance metrics for dermatologists, AI, or dermatologists assisted by AI against a histopathologic reference standard. Nondermoscopic comparators and retrospective designs were excluded. Studies with 20 or fewer histopathologically confirmed melanomas were excluded a priori from quantitative synthesis.

Data Extraction and Synthesis: Two reviewers independently screened studies and extracted data; discrepancies and missing values were resolved among all authors. Risk of bias and applicability were assessed with QUADAS-2 and QUADAS-C. Study-level sensitivity and specificity were summarized and plotted; head-to-head comparisons were analyzed descriptively.

Main Outcomes and Measures: Diagnostic outcomes were sensitivity, specificity, accuracy, and balanced accuracy for melanoma detection.

Results: Eleven prospective studies with a total of more than 2500 patients and 50 participant-dermatologists were included in the analyses. Dermatologists achieved a pooled sensitivity of 78.6% (95% CI, 67.5%-88.1%) and specificity of 75.2% (95% CI, 63.3%-84.3%), whereas AI alone reached 80.9% (95% CI, 63.6%-94.5%) sensitivity and 75.6% (95% CI, 64.5%-85.6%) specificity. In the single study reporting AI-assisted dermatologists, sensitivity was 91.9% and specificity was 83.7%. In direct clinical comparisons, AI demonstrated higher specificity and similar sensitivity. Most studies were at high risk of bias in the patient selection and index test domains, primarily due to the preselection of lesions suspected of melanoma and binary classifications.

Conclusions and Relevance: In this systematic review and meta-analysis of prospective settings, AI systems performed at levels comparable to dermatologists for melanoma diagnostics and may enhance performance when used as a decision-support tool. However, the frequent risk of bias and limited generalizability of current studies highlight the need for broader validation in unselected patient populations in the clinical setting.
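For readers less familiar with the four outcome measures named in the abstract, the following is a minimal sketch of how they are conventionally derived from a 2x2 confusion matrix against a histopathologic reference standard. The function name `diagnostic_metrics` and the example counts are hypothetical and chosen only for illustration; they are not data from the review, and the sketch does not reproduce the pooling method used by the authors.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard diagnostic accuracy measures from confusion-matrix counts.

    tp: melanomas correctly flagged as melanoma
    fn: melanomas missed
    tn: benign lesions correctly classified as benign
    fp: benign lesions incorrectly flagged as melanoma
    """
    positives = tp + fn  # all histopathologically confirmed melanomas
    negatives = tn + fp  # all histopathologically confirmed benign lesions
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one melanoma and one benign lesion")

    sensitivity = tp / positives
    specificity = tn / negatives
    accuracy = (tp + tn) / (positives + negatives)
    # Balanced accuracy averages the two rates, so it is not dominated
    # by the larger (usually benign) class.
    balanced_accuracy = (sensitivity + specificity) / 2

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from any study).
    example = diagnostic_metrics(tp=80, fn=20, tn=150, fp=50)
    for name, value in example.items():
        print(f"{name}: {value:.3f}")
```

Because melanomas are typically a minority of the lesions assessed, plain accuracy can look favorable even when many melanomas are missed, which is one reason balanced accuracy is reported alongside it.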