LLM-powered breast cancer staging from PET/CT reports: a comparative performance study.

Spitzl, Daniel; Steinhelfer, Lisa; Eiber, Matthias; Endrös, Lukas; Braren, Rickmer; Mergen, Markus

doi:10.1016/j.ijmedinf.2025.106053

Journal Article

DKFZ-2025-01532

Spitzl, D. ; Mergen, M. ; Braren, R. (DKFZ*) ; Endrös, L. ; Eiber, M. (DKFZ*) ; Steinhelfer, L.

2025
Elsevier Amsterdam [u.a.]

International journal of medical informatics 204, 106053 (2025) [10.1016/j.ijmedinf.2025.106053]

Abstract: Imaging reports are crucial in breast cancer management, with the tumor-node-metastasis (TNM) classification serving as a widely used model for assessing disease severity, guiding treatment decisions, and predicting patient outcomes. Large language models (LLMs) offer a potential solution by extracting standardized UICC TNM classifications and the corresponding UICC stage directly from existing PET/CT reports. This approach holds promise to enhance staging accuracy, streamline multidisciplinary discussions, and improve patient outcomes. Here, we evaluated four LLMs (ChatGPT-4o, DeepSeek V3, Claude 3.5 Sonnet, and Gemini 2.0 Flash) for their capacity to determine TNM staging based on UICC/AJCC breast cancer guidelines. A total of 111 fictitious PET/CT reports were analyzed, and each model's outputs were measured against expert-generated TNM classifications and stage categorizations. Among the tested models, Claude 3.5 Sonnet achieved the highest F1 scores: 0.95, 0.95, and 1.00 for T, N, and M classification, respectively, and 0.92 for UICC stage classification. These findings underscore the ability of advanced natural language processing (NLP) technologies to support reliable cancer staging, potentially aiding clinicians. Despite the encouraging performance, prospective clinical trials and validation across diverse practice settings remain critical to confirming these preliminary outcomes. Nonetheless, this study highlights the promise of LLM-based systems in reinforcing the accuracy of oncologic workflows and lays the groundwork for broader adoption of AI-driven tools in breast cancer management.
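
The abstract reports one F1 score per staging component (T, N, M, UICC stage) but does not say how per-class scores were averaged. As a minimal sketch, the following Python computes per-class F1 from expert labels and model outputs and then combines them with a support-weighted average. The weighting scheme, the function names, and the example labels are assumptions for illustration only, not the study's method or data.

```python
from collections import Counter


def per_class_f1(expected, predicted):
    """Return {label: F1} for every label seen in either list."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(expected, predicted):
        if truth == guess:
            tp[truth] += 1
        else:
            fn[truth] += 1   # the true class was missed
            fp[guess] += 1   # the guessed class was wrongly assigned
    scores = {}
    for label in set(expected) | set(predicted):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def weighted_f1(expected, predicted):
    """Average per-class F1, weighting each class by its count in `expected`.

    Assumption: support-weighted averaging; the abstract does not state
    which averaging the authors used.
    """
    if not expected:
        raise ValueError("expected must not be empty")
    scores = per_class_f1(expected, predicted)
    support = Counter(expected)
    return sum(scores[label] * n for label, n in support.items()) / len(expected)


# Hypothetical T-category labels (not taken from the study).
expert_t = ["T1", "T2", "T2", "T3", "T4"]
model_t = ["T1", "T2", "T3", "T3", "T4"]
print(round(weighted_f1(expert_t, model_t), 2))
```

The same two functions would be applied separately to the N, M, and UICC stage labels to obtain one score per component.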

Keyword(s): Artificial intelligence ; Breast cancer ; Clinical decision support ; Diagnostics

Classification:

ddc:004

Contributing Institute(s):

DKTK Koordinierungsstelle München (MU01)

Research Program(s):

899 - without topic (POF4-899)

Appears in the scientific report 2025

Database coverage: Medline ; BIOSIS Previews ; Biological Abstracts ; Clarivate Analytics Master Journal List ; Current Contents - Clinical Medicine ; Current Contents - Life Sciences ; Ebsco Academic Search ; Essential Science Indicators ; IF < 5 ; JCR ; Nationallizenz ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection

Record created 2025-07-25, last modified 2025-07-26
