TY  - JOUR
AU  - De Bin, Riccardo
AU  - Boulesteix, Anne-Laure
AU  - Benner, Axel
AU  - Becker, Natalia
AU  - Sauerbrei, Willi
TI  - Combining clinical and molecular data in regression prediction models: insights from a simulation study
JO  - Briefings in Bioinformatics
VL  - 21
IS  - 6
SN  - 1477-4054
CY  - Oxford [u.a.]
PB  - Oxford University Press
M1  - DKFZ-2019-02681
SP  - 1904
EP  - 1919
PY  - 2020
N1  - 2020 Dec 1;21(6):1904-1919
AB  - Data integration, i.e. the use of different sources of information for data analysis, is becoming one of the most important topics in modern statistics. Especially in, but not limited to, biomedical applications, a relevant issue is the combination of low-dimensional (e.g. clinical data) and high-dimensional (e.g. molecular data such as gene expressions) data sources in a prediction model. Not only the different characteristics of the data, but also the complex correlation structure within and between the two data sources, pose challenging issues. In this paper, we investigate these issues via simulations, providing some useful insight into strategies to combine low- and high-dimensional data in a regression prediction model. In particular, we focus on the effect of the correlation structure on the results, while accounting for the influence of our specific choices in the design of the simulation study.
LB  - PUB:(DE-HGF)16
C6  - pmid:31750518
DO  - 10.1093/bib/bbz136
UR  - https://inrepo02.dkfz.de/record/147704
ER  -