TY  - JOUR
AU  - Shmatko, Artem
AU  - Jung, Alexander Wolfgang
AU  - Gaurav, Kumar
AU  - Brunak, Søren
AU  - Mortensen, Laust Hvas
AU  - Birney, Ewan
AU  - Fitzgerald, Tom
AU  - Gerstung, Moritz
TI  - Learning the natural history of human disease with generative transformers
JO  - Nature
VL  - nn
SN  - 0028-0836
CY  - London [u.a.]
PB  - Nature Publ. Group
M1  - DKFZ-2025-01925
SP  - nn
PY  - 2025
N1  - #EA:B450#LA:B450# / epub
AB  - Decision-making in healthcare relies on understanding patients' past and current health states to predict and, ultimately, change their future course [1-3]. Artificial intelligence (AI) methods promise to aid this task by learning patterns of disease progression from large corpora of health records [4,5]. However, their potential has not been fully investigated at scale. Here we modify the GPT [6] (generative pretrained transformer) architecture to model the progression and competing nature of human diseases. We train this model, Delphi-2M, on data from 0.4 million UK Biobank participants and validate it using external data from 1.9 million Danish individuals with no change in parameters. Delphi-2M predicts the rates of more than 1,000 diseases, conditional on each individual's past disease history, with accuracy comparable to that of existing single-disease models. Delphi-2M's generative nature also enables sampling of synthetic future health trajectories, providing meaningful estimates of potential disease burden for up to 20 years, and enabling the training of AI models that have never seen actual data. Explainable AI methods [7] provide insights into Delphi-2M's predictions, revealing clusters of co-morbidities within and across disease chapters and their time-dependent consequences on future health, but also highlight biases learnt from training data. In summary, transformer-based models appear to be well suited for predictive and generative health-related tasks, are applicable to population-scale datasets and provide insights into temporal dependencies between disease events, potentially improving the understanding of personalized health risks and informing precision medicine approaches.
LB  - PUB:(DE-HGF)16
C6  - pmid:40963019
DO  - 10.1038/s41586-025-09529-3
UR  - https://inrepo02.dkfz.de/record/304606
ER  -