Preprint DKFZ-2025-02822

ScaleMAI: Accelerating the Development of Trusted Datasets and AI Models


2025
arXiv

arXiv [10.48550/ARXIV.2501.03410]

Abstract: Building trusted datasets is critical for transparent and responsible Medical AI (MAI) research, but creating even small, high-quality datasets can take years of effort from multidisciplinary teams. This process often delays the benefits of AI, because human-centric data creation and AI-centric model development are treated as separate, sequential steps. To overcome this, we propose ScaleMAI, an agent for AI-integrated data curation and annotation that allows data quality and AI performance to improve in a self-reinforcing cycle, reducing development time from years to months. We adopt pancreatic tumor detection as an example. First, ScaleMAI progressively creates a dataset of 25,362 CT scans, including per-voxel annotations for benign/malignant tumors and 24 anatomical structures. Second, through progressive human-in-the-loop iterations, ScaleMAI provides a Flagship AI Model that can approach the proficiency of expert annotators (30 years of experience) in detecting pancreatic tumors. The Flagship Model significantly outperforms models developed from smaller, fixed-quality datasets, with substantial gains in tumor detection (+14%), segmentation (+5%), and classification (72%) on three prestigious benchmarks. In summary, ScaleMAI transforms the speed, scale, and reliability of medical dataset creation, paving the way for a variety of impactful, data-driven applications.

Keyword(s): Computer Vision and Pattern Recognition (cs.CV) ; FOS: Computer and information sciences


Note: ScaleMAI_Accelerating_the_Development_of_Trusted_D.pdf

Contributing Institute(s):
  1. E230 Medizinische Bildverarbeitung (E230)
Research Program(s):
  1. 315 - Bildgebung und Radioonkologie (POF4-315)

Appears in the scientific report 2025

The record appears in these collections:
Document types > Reports > Preprints
Public records
Publications database

 Record created 2025-12-08, last modified 2025-12-09


