001 | 303479 | ||
005 | 20250814114517.0 | ||
024 | 7 | _ | |a 10.1158/1055-9965.EPI-25-0787 |2 doi |
024 | 7 | _ | |a pmid:40794097 |2 pmid |
024 | 7 | _ | |a 1055-9965 |2 ISSN |
024 | 7 | _ | |a 1538-7755 |2 ISSN |
037 | _ | _ | |a DKFZ-2025-01676 |
041 | _ | _ | |a English |
082 | _ | _ | |a 610 |
100 | 1 | _ | |a Pestarino, Luca |0 0000-0001-7097-2954 |b 0 |
245 | _ | _ | |a Recommendations for validating omics prediction models: Insights from a lung cancer RNA biomarker study. |
260 | _ | _ | |a Philadelphia, Pa. |c 2025 |b AACR |
336 | 7 | _ | |a article |2 DRIVER |
336 | 7 | _ | |a Output Types/Journal article |2 DataCite |
336 | 7 | _ | |a Journal Article |b journal |m journal |0 PUB:(DE-HGF)16 |s 1755152012_13098 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a ARTICLE |2 BibTeX |
336 | 7 | _ | |a JOURNAL_ARTICLE |2 ORCID |
336 | 7 | _ | |a Journal Article |0 0 |2 EndNote |
500 | _ | _ | |a epub |
520 | _ | _ | |a External validation of predictive models in medical research is crucial to ensure their generalizability and applicability across diverse populations. However, validation often reveals discrepancies in model performance due to cohort differences, sample collection and storage, overfitting, and inconsistencies in data handling. This study investigates the challenges encountered during external validation of predictive models for early lung cancer detection using small RNA biomarkers, tying these challenges to specific validation outcomes and deriving recommendations. Predictive models based on the XGBoost algorithm, developed from serum samples in the JanusRNA cohort, were externally validated in two independent Norwegian cohorts: HUNT and NOWAC. These cohorts differed in sample types, RNA abundance, library preparation protocols, and lung cancer histological classification. Strategies to harmonize data processing and address these discrepancies were employed to ensure a robust validation process. Validation revealed significant challenges due to cohort heterogeneity. Median AUC values ranged from 0.50 to 0.66 in validation cohorts, compared to 0.62-0.76 in the original models. Models performed worse in the female-only NOWAC cohort, where plasma was used, highlighting the impact of sample type and cohort characteristics on predictive accuracy. Based on the challenges encountered during validation, we propose seven recommendations to guide robust external validation of omics-based predictive models, including harmonizing data processing across cohorts, re-evaluating overfitting, and critically assessing model performance for clinical applications. By highlighting practical issues in model validation and providing recommendations, this study supports more reliable and clinically applicable biomarker-based prediction models, ultimately aiding cancer screening and prevention efforts. |
536 | _ | _ | |a 313 - Krebsrisikofaktoren und Prävention (POF4-313) |0 G:(DE-HGF)POF4-313 |c POF4-313 |f POF IV |x 0 |
588 | _ | _ | |a Dataset connected to CrossRef, PubMed, Journals: inrepo02.dkfz.de |
700 | 1 | _ | |a Turzanski-Fortner, Renée |0 P:(DE-He78)74a6af8347ec5cbd4b77e562e10ca1f2 |b 1 |u dkfz |
700 | 1 | _ | |a Nøst, Therese H |0 0000-0001-6805-3094 |b 2 |
700 | 1 | _ | |a Fotopoulos, Ioannis |0 0009-0006-3398-3498 |b 3 |
700 | 1 | _ | |a Urbarova, Ilona |0 0000-0001-6626-2917 |b 4 |
700 | 1 | _ | |a Røe, Oluf D |0 0000-0002-4870-5822 |b 5 |
700 | 1 | _ | |a Langseth, Hilde |0 0000-0002-9446-4855 |b 6 |
700 | 1 | _ | |a Rounge, Trine Ballestad |0 0000-0003-2677-2722 |b 7 |
773 | _ | _ | |a 10.1158/1055-9965.EPI-25-0787 |0 PERI:(DE-600)2036781-8 |p nn |t Cancer epidemiology, biomarkers & prevention |v nn |y 2025 |x 1055-9965 |
909 | C | O | |o oai:inrepo02.dkfz.de:303479 |p VDB |
910 | 1 | _ | |a Deutsches Krebsforschungszentrum |0 I:(DE-588b)2036810-0 |k DKFZ |b 1 |6 P:(DE-He78)74a6af8347ec5cbd4b77e562e10ca1f2 |
913 | 1 | _ | |a DE-HGF |b Gesundheit |l Krebsforschung |1 G:(DE-HGF)POF4-310 |0 G:(DE-HGF)POF4-313 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-300 |4 G:(DE-HGF)POF |v Krebsrisikofaktoren und Prävention |x 0 |
914 | 1 | _ | |y 2025 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0200 |2 StatID |b SCOPUS |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0300 |2 StatID |b Medline |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0199 |2 StatID |b Clarivate Analytics Master Journal List |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1050 |2 StatID |b BIOSIS Previews |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0160 |2 StatID |b Essential Science Indicators |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1030 |2 StatID |b Current Contents - Life Sciences |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1190 |2 StatID |b Biological Abstracts |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1110 |2 StatID |b Current Contents - Clinical Medicine |d 2024-12-10 |
915 | _ | _ | |a WoS |0 StatID:(DE-HGF)0113 |2 StatID |b Science Citation Index Expanded |d 2024-12-10 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0150 |2 StatID |b Web of Science Core Collection |d 2024-12-10 |
915 | _ | _ | |a JCR |0 StatID:(DE-HGF)0100 |2 StatID |b CANCER EPIDEM BIOMAR : 2022 |d 2024-12-10 |
915 | _ | _ | |a IF < 5 |0 StatID:(DE-HGF)9900 |2 StatID |d 2024-12-10 |
920 | 1 | _ | |0 I:(DE-He78)C180-20160331 |k C180 |l Krebsepidemiologie |x 0 |
980 | _ | _ | |a journal |
980 | _ | _ | |a VDB |
980 | _ | _ | |a I:(DE-He78)C180-20160331 |
980 | _ | _ | |a UNRESTRICTED |