Integrating multiple molecular sources into a clinical risk prediction signature by extracting complementary information.

Hieke, Stefanie; Schlenk, Richard F; Bullinger, Lars; Binder, Harald; Schumacher, Martin; Benner, Axel

doi:10.1186/s12859-016-1183-6

Marc 21

001			128776
005			20240228143347.0
024	7	_	\|a 10.1186/s12859-016-1183-6 \|2 doi
024	7	_	\|a pmid:27578050 \|2 pmid
024	7	_	\|a pmc:PMC5004308 \|2 pmc
024	7	_	\|a altmetric:10909583 \|2 altmetric
037	_	_	\|a DKFZ-2017-04791
041	_	_	\|a eng
082	_	_	\|a 004
100	1	_	\|a Hieke, Stefanie \|0 0000-0002-1810-9149 \|b 0
245	_	_	\|a Integrating multiple molecular sources into a clinical risk prediction signature by extracting complementary information.
260	_	_	\|a London \|c 2016 \|b BioMed Central
336	7	_	\|a article \|2 DRIVER
336	7	_	\|a Output Types/Journal article \|2 DataCite
336	7	_	\|a Journal Article \|b journal \|m journal \|0 PUB:(DE-HGF)16 \|s 1522055462_13173 \|2 PUB:(DE-HGF)
336	7	_	\|a ARTICLE \|2 BibTeX
336	7	_	\|a JOURNAL_ARTICLE \|2 ORCID
336	7	_	\|a Journal Article \|0 0 \|2 EndNote
520	_	_	\|a High-throughput technology allows for genome-wide measurements at different molecular levels for the same patient, e.g. single nucleotide polymorphisms (SNPs) and gene expression. Correspondingly, it might be beneficial to also integrate complementary information from different molecular levels when building multivariable risk prediction models for a clinical endpoint, such as treatment response or survival. Unfortunately, such a high-dimensional modeling task will often be complicated by a limited overlap of molecular measurements at different levels between patients, i.e. measurements from all molecular levels are available only for a smaller proportion of patients. We propose a sequential strategy for building clinical risk prediction models that integrate genome-wide measurements from two molecular levels in a complementary way. To deal with partial overlap, we develop an imputation approach that allows us to use all available data. This approach is investigated in two acute myeloid leukemia applications combining gene expression with either SNP or DNA methylation data. After obtaining a sparse risk prediction signature e.g. from SNP data, an automatically selected set of prognostic SNPs, by componentwise likelihood-based boosting, imputation is performed for the corresponding linear predictor by a linking model that incorporates e.g. gene expression measurements. The imputed linear predictor is then used for adjustment when building a prognostic signature from the gene expression data. For evaluation, we consider stability, as quantified by inclusion frequencies across resampling data sets. Despite an extremely small overlap in the application example with gene expression and SNPs, several genes are seen to be more stably identified when taking the (imputed) linear predictor from the SNP data into account.
In the application with gene expression and DNA methylation, prediction performance with respect to survival also indicates that the proposed approach might work well. We consider imputation of linear predictor values to be a feasible and sensible approach for dealing with partial overlap in complementary integrative analysis of molecular measurements at different levels. More generally, these results indicate that a complementary strategy for integrating different molecular levels can result in more stable risk prediction signatures, potentially providing a more reliable insight into the underlying biology.
536	_	_	\|a 313 - Cancer risk factors and prevention (POF3-313) \|0 G:(DE-HGF)POF3-313 \|c POF3-313 \|f POF III \|x 0
588	_	_	\|a Dataset connected to CrossRef, PubMed,
700	1	_	\|a Benner, Axel \|0 P:(DE-He78)e15dfa1260625c69d6690a197392a994 \|b 1 \|u dkfz
700	1	_	\|a Schlenk, Richard F \|b 2
700	1	_	\|a Schumacher, Martin \|b 3
700	1	_	\|a Bullinger, Lars \|b 4
700	1	_	\|a Binder, Harald \|b 5
773	_	_	\|a 10.1186/s12859-016-1183-6 \|g Vol. 17, no. 1, p. 327 \|0 PERI:(DE-600)2041484-5 \|n 1 \|p 327 \|t BMC bioinformatics \|v 17 \|y 2016 \|x 1471-2105
909	C	O	\|o oai:inrepo02.dkfz.de:128776 \|p VDB
910	1	_	\|a Deutsches Krebsforschungszentrum \|0 I:(DE-588b)2036810-0 \|k DKFZ \|b 1 \|6 P:(DE-He78)e15dfa1260625c69d6690a197392a994
913	1	_	\|a DE-HGF \|l Krebsforschung \|1 G:(DE-HGF)POF3-310 \|0 G:(DE-HGF)POF3-313 \|2 G:(DE-HGF)POF3-300 \|v Cancer risk factors and prevention \|x 0 \|4 G:(DE-HGF)POF \|3 G:(DE-HGF)POF3 \|b Gesundheit
914	1	_	\|y 2016
915	_	_	\|a JCR \|0 StatID:(DE-HGF)0100 \|2 StatID \|b BMC BIOINFORMATICS : 2015
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0200 \|2 StatID \|b SCOPUS
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0300 \|2 StatID \|b Medline
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0310 \|2 StatID \|b NCBI Molecular Biology Database
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0501 \|2 StatID \|b DOAJ Seal
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0500 \|2 StatID \|b DOAJ
915	_	_	\|a Creative Commons Attribution CC BY (No Version) \|0 LIC:(DE-HGF)CCBYNV \|2 V:(DE-HGF) \|b DOAJ
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0600 \|2 StatID \|b Ebsco Academic Search
915	_	_	\|a Peer Review \|0 StatID:(DE-HGF)0030 \|2 StatID \|b ASC
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0199 \|2 StatID \|b Thomson Reuters Master Journal List
915	_	_	\|a WoS \|0 StatID:(DE-HGF)0111 \|2 StatID \|b Science Citation Index Expanded
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)0150 \|2 StatID \|b Web of Science Core Collection
915	_	_	\|a DBCoverage \|0 StatID:(DE-HGF)1050 \|2 StatID \|b BIOSIS Previews
915	_	_	\|a IF < 5 \|0 StatID:(DE-HGF)9900 \|2 StatID
920	1	_	\|0 I:(DE-He78)C060-20160331 \|k C060 \|l Biostatistik \|x 0
980	_	_	\|a journal
980	_	_	\|a VDB
980	_	_	\|a I:(DE-He78)C060-20160331
980	_	_	\|a UNRESTRICTED
