TY - JOUR
AU - Maier-Hein, Lena
AU - Reinke, Annika
AU - Godau, Patrick
AU - Tizabi, Minu D
AU - Büttner, Florian
AU - Christodoulou, Evangelia
AU - Glocker, Ben
AU - Isensee, Fabian
AU - Kleesiek, Jens
AU - Kozubek, Michal
AU - Reyes, Mauricio
AU - Riegler, Michael A
AU - Wiesenfarth, Manuel
AU - Kavur, A Emre
AU - Sudre, Carole H
AU - Baumgartner, Michael
AU - Eisenmann, Matthias
AU - Heckmann-Nötzel, Doreen
AU - Rädsch, Tim
AU - Acion, Laura
AU - Antonelli, Michela
AU - Arbel, Tal
AU - Bakas, Spyridon
AU - Benis, Arriel
AU - Blaschko, Matthew B
AU - Cardoso, M Jorge
AU - Cheplygina, Veronika
AU - Cimini, Beth A
AU - Collins, Gary S
AU - Farahani, Keyvan
AU - Ferrer, Luciana
AU - Galdran, Adrian
AU - van Ginneken, Bram
AU - Haase, Robert
AU - Hashimoto, Daniel A
AU - Hoffman, Michael M
AU - Huisman, Merel
AU - Jannin, Pierre
AU - Kahn, Charles E
AU - Kainmueller, Dagmar
AU - Kainz, Bernhard
AU - Karargyris, Alexandros
AU - Karthikesalingam, Alan
AU - Kofler, Florian
AU - Kopp-Schneider, Annette
AU - Kreshuk, Anna
AU - Kurc, Tahsin
AU - Landman, Bennett A
AU - Litjens, Geert
AU - Madani, Amin
AU - Maier-Hein, Klaus
AU - Martel, Anne L
AU - Mattson, Peter
AU - Meijering, Erik
AU - Menze, Bjoern
AU - Moons, Karel G M
AU - Müller, Henning
AU - Nichyporuk, Brennan
AU - Nickel, Felix
AU - Petersen, Jens
AU - Rajpoot, Nasir
AU - Rieke, Nicola
AU - Saez-Rodriguez, Julio
AU - Sánchez, Clara I
AU - Shetty, Shravya
AU - van Smeden, Maarten
AU - Summers, Ronald M
AU - Taha, Abdel A
AU - Tiulpin, Aleksei
AU - Tsaftaris, Sotirios A
AU - Van Calster, Ben
AU - Varoquaux, Gaël
AU - Jäger, Paul
TI - Metrics reloaded: recommendations for image analysis validation
JO - Nature Methods
VL - 21
IS - 2
SN - 1548-7091
CY - London [u.a.]
PB - Nature Publishing Group
M1 - DKFZ-2024-00337
SP - 195
EP - 212
PY - 2024
N1 - #EA:E130#LA:E290#
AB - Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint, a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.
LB - PUB:(DE-HGF)16
C6 - pmid:38347141
DO - 10.1038/s41592-023-02151-z
UR - https://inrepo02.dkfz.de/record/288083
ER -