| 001 | 307383 | ||
| 005 | 20251223120207.0 | ||
| 024 | 7 | _ | |a 10.1016/j.jclinepi.2025.112117 |2 doi |
| 024 | 7 | _ | |a pmid:41423140 |2 pmid |
| 024 | 7 | _ | |a 0895-4356 |2 ISSN |
| 024 | 7 | _ | |a 1878-5921 |2 ISSN |
| 037 | _ | _ | |a DKFZ-2025-03027 |
| 041 | _ | _ | |a English |
| 082 | _ | _ | |a 610 |
| 100 | 1 | _ | |a Legha, Amardeep |b 0 |
| 245 | _ | _ | |a Sequential sample size calculations and learning curves safeguard the robust development of a clinical prediction model for individuals. |
| 260 | _ | _ | |a Amsterdam [u.a.] |c 2025 |b Elsevier Science |
| 336 | 7 | _ | |a article |2 DRIVER |
| 336 | 7 | _ | |a Output Types/Journal article |2 DataCite |
| 336 | 7 | _ | |a Journal Article |b journal |m journal |0 PUB:(DE-HGF)16 |s 1766414821_3634864 |2 PUB:(DE-HGF) |
| 336 | 7 | _ | |a ARTICLE |2 BibTeX |
| 336 | 7 | _ | |a JOURNAL_ARTICLE |2 ORCID |
| 336 | 7 | _ | |a Journal Article |0 0 |2 EndNote |
| 500 | _ | _ | |a epub |
| 520 | _ | _ | |a When recruiting participants to a new study developing a clinical prediction model (CPM), sample size calculations are typically conducted before data collection based on sensible assumptions. This leads to a fixed sample size, but if the assumptions are inaccurate, the actual sample size required to develop a reliable model may be higher or even lower. To safeguard against this, adaptive sample size approaches have been proposed, based on sequential evaluation of (changes in) a model's predictive performance. To illustrate and extend sequential sample size calculations for CPM development by (i) proposing stopping rules for prospective data collection based on minimising uncertainty (instability) and misclassification of individual-level predictions, and (ii) showcasing how it safeguards against inaccurate fixed sample size calculations. Using the sequential approach repeats the pre-defined model development strategy every time a chosen number (e.g., 100) of participants are recruited and adequately followed up. At each stage, CPM performance is evaluated using bootstrapping, leading to prediction and classification stability statistics and plots, alongside optimism-adjusted measures of calibration and discrimination. Learning curves display the trend of results against sample size and recruitment is stopped when a chosen stopping rule is met. Our approach is illustrated for model development of acute kidney injury using (penalised) logistic regression CPMs. Prior to recruitment based on perceived sensible assumptions, the fixed sample size calculation suggests recruiting 342 patients to minimise overfitting; however, during data collection the sequential approach reveals that a much larger sample size of 1100 is required to minimise overfitting (targeting a bootstrap-corrected calibration slope ≥0.9). If the stopping rule criteria also target small uncertainty and misclassification probability of individual predictions, the sequential approach suggests an even larger sample size of about n=1800. For CPM development studies involving prospective data collection, a sequential sample size approach allows users to dynamically monitor individual-level prediction and classification instability. This helps determine when enough participants have been recruited and safeguards against using inaccurate assumptions in a sample size calculation prior to data recruitment. Engagement with patients and other stakeholders is crucial to identify sensible context-specific stopping rules for robust individual predictions. |
| 536 | _ | _ | |a 315 - Bildgebung und Radioonkologie (POF4-315) |0 G:(DE-HGF)POF4-315 |c POF4-315 |f POF IV |x 0 |
| 588 | _ | _ | |a Dataset connected to CrossRef, PubMed, Journals: inrepo02.dkfz.de |
| 650 | _ | 7 | |a Clinical Prediction Models |2 Other |
| 650 | _ | 7 | |a Instability |2 Other |
| 650 | _ | 7 | |a Learning Curves |2 Other |
| 650 | _ | 7 | |a Model Development |2 Other |
| 650 | _ | 7 | |a Sample Size |2 Other |
| 650 | _ | 7 | |a Sequential |2 Other |
| 650 | _ | 7 | |a Uncertainty |2 Other |
| 700 | 1 | _ | |a Ensor, Joie |b 1 |
| 700 | 1 | _ | |a Whittle, Rebecca |b 2 |
| 700 | 1 | _ | |a Archer, Lucinda |b 3 |
| 700 | 1 | _ | |a Van Calster, Ben |b 4 |
| 700 | 1 | _ | |a Christodoulou, Evangelia |0 P:(DE-He78)8da2eca0bc6341c8681c317fe2b8e27b |b 5 |u dkfz |
| 700 | 1 | _ | |a Snell, Kym I E |b 6 |
| 700 | 1 | _ | |a Sadatsafavi, Mohsen |b 7 |
| 700 | 1 | _ | |a Collins, Gary S |b 8 |
| 700 | 1 | _ | |a Riley, Richard D |b 9 |
| 773 | _ | _ | |a 10.1016/j.jclinepi.2025.112117 |g p. 112117 - |0 PERI:(DE-600)1500490-9 |p nn |t Journal of clinical epidemiology |v nn |y 2025 |x 0895-4356 |
| 909 | C | O | |o oai:inrepo02.dkfz.de:307383 |p VDB |
| 910 | 1 | _ | |a Deutsches Krebsforschungszentrum |0 I:(DE-588b)2036810-0 |k DKFZ |b 5 |6 P:(DE-He78)8da2eca0bc6341c8681c317fe2b8e27b |
| 913 | 1 | _ | |a DE-HGF |b Gesundheit |l Krebsforschung |1 G:(DE-HGF)POF4-310 |0 G:(DE-HGF)POF4-315 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-300 |4 G:(DE-HGF)POF |v Bildgebung und Radioonkologie |x 0 |
| 914 | 1 | _ | |y 2025 |
| 915 | _ | _ | |a Nationallizenz |0 StatID:(DE-HGF)0420 |2 StatID |d 2024-12-11 |w ger |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0200 |2 StatID |b SCOPUS |d 2024-12-11 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0300 |2 StatID |b Medline |d 2024-12-11 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0199 |2 StatID |b Clarivate Analytics Master Journal List |d 2024-12-11 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0160 |2 StatID |b Essential Science Indicators |d 2024-12-11 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1030 |2 StatID |b Current Contents - Life Sciences |d 2024-12-11 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1110 |2 StatID |b Current Contents - Clinical Medicine |d 2024-12-11 |
| 915 | _ | _ | |a WoS |0 StatID:(DE-HGF)0113 |2 StatID |b Science Citation Index Expanded |d 2024-12-11 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0150 |2 StatID |b Web of Science Core Collection |d 2024-12-11 |
| 915 | _ | _ | |a JCR |0 StatID:(DE-HGF)0100 |2 StatID |b J CLIN EPIDEMIOL : 2022 |d 2024-12-11 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0600 |2 StatID |b Ebsco Academic Search |d 2024-12-11 |
| 915 | _ | _ | |a Peer Review |0 StatID:(DE-HGF)0030 |2 StatID |b ASC |d 2024-12-11 |
| 915 | _ | _ | |a IF >= 5 |0 StatID:(DE-HGF)9905 |2 StatID |b J CLIN EPIDEMIOL : 2022 |d 2024-12-11 |
| 920 | 1 | _ | |0 I:(DE-He78)E130-20160331 |k E130 |l E130 Intelligente Medizinische Systeme |x 0 |
| 980 | _ | _ | |a journal |
| 980 | _ | _ | |a VDB |
| 980 | _ | _ | |a I:(DE-He78)E130-20160331 |
| 980 | _ | _ | |a UNRESTRICTED |