001     144575
005     20240229112629.0
024 7 _ |a 10.1186/s12859-019-3014-z
|2 doi
024 7 _ |a pmid:31419933
|2 pmid
024 7 _ |a pmc:PMC6697926
|2 pmc
024 7 _ |a altmetric:65102660
|2 altmetric
037 _ _ |a DKFZ-2019-02018
041 _ _ |a eng
082 _ _ |a 610
100 1 _ |a Johann, Pascal
|0 P:(DE-He78)3fdc3623477264cb5d0e14f256dbfbb8
|b 0
|e First author
|u dkfz
245 _ _ |a RF_Purify: a novel tool for comprehensive analysis of tumor-purity in methylation array data based on random forest regression.
260 _ _ |a Heidelberg
|c 2019
|b Springer
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1577106984_15304
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a With the advent of array-based techniques to measure methylation levels in primary tumor samples, systematic investigations of methylomes have been widely performed on a large number of tumor entities. Most of these approaches do not measure methylation in individual cells but rather in bulk tumor sample DNA, which contains a mixture of tumor cells, infiltrating immune cells and other stromal components. This raises questions about the purity of a given tumor sample, given the varying degrees of stromal infiltration in different entities. Previous methods to infer tumor purity require or are based on the use of matching control samples, which are rarely available. Here we present a novel, reference-free method to quantify tumor purity, based on two Random Forest classifiers, which were trained on ABSOLUTE as well as ESTIMATE purity values from TCGA tumor samples. We subsequently apply this method to a previously published, large dataset of brain tumors, proving that these models perform well in datasets that have not been characterized with respect to tumor purity. Using two gold standard methods to infer purity - the ABSOLUTE score based on whole genome sequencing data and the ESTIMATE score based on gene expression data - we have optimized Random Forest classifiers to predict tumor purity in entities that were contained in the TCGA project. We validated these classifiers using an independent test data set and cross-compared them to other methods which have been applied to the TCGA datasets (such as ESTIMATE and LUMP). Using Illumina methylation array data of brain tumor entities (as published in Capper et al. (Nature 555:469-474, 2018)) we applied this model to estimate tumor purity and find that subgroups of brain tumors display substantial differences in tumor purity. Random forest-based tumor purity prediction is a well-suited tool to extrapolate gold standard measures of purity to novel methylation array datasets. In contrast to other available methylation-based tumor purity estimation methods, our classifiers do not need a priori knowledge about the tumor entity or matching control tissue to predict tumor purity.
536 _ _ |a 312 - Functional and structural genomics (POF3-312)
|0 G:(DE-HGF)POF3-312
|c POF3-312
|f POF III
|x 0
588 _ _ |a Dataset connected to CrossRef, PubMed,
700 1 _ |a Jäger, Natalie
|0 P:(DE-He78)bff9e3e3d86865d2b0836bb8f3ce98f3
|b 1
|u dkfz
700 1 _ |a Pfister, Stefan M
|0 P:(DE-He78)f746aa965c4e1af518b016de3aaff5d9
|b 2
|u dkfz
700 1 _ |a Sill, Martin
|0 P:(DE-He78)45440b44791309bd4b7dbb4f73333f9b
|b 3
|e Last author
|u dkfz
773 _ _ |a 10.1186/s12859-019-3014-z
|g Vol. 20, no. 1, p. 428
|0 PERI:(DE-600)2041484-5
|n 1
|p 428
|t BMC bioinformatics
|v 20
|y 2019
|x 1471-2105
909 C O |p VDB
|o oai:inrepo02.dkfz.de:144575
910 1 _ |a Deutsches Krebsforschungszentrum
|0 I:(DE-588b)2036810-0
|k DKFZ
|b 0
|6 P:(DE-He78)3fdc3623477264cb5d0e14f256dbfbb8
910 1 _ |a Deutsches Krebsforschungszentrum
|0 I:(DE-588b)2036810-0
|k DKFZ
|b 1
|6 P:(DE-He78)bff9e3e3d86865d2b0836bb8f3ce98f3
910 1 _ |a Deutsches Krebsforschungszentrum
|0 I:(DE-588b)2036810-0
|k DKFZ
|b 2
|6 P:(DE-He78)f746aa965c4e1af518b016de3aaff5d9
910 1 _ |a Deutsches Krebsforschungszentrum
|0 I:(DE-588b)2036810-0
|k DKFZ
|b 3
|6 P:(DE-He78)45440b44791309bd4b7dbb4f73333f9b
913 1 _ |a DE-HGF
|l Krebsforschung
|1 G:(DE-HGF)POF3-310
|0 G:(DE-HGF)POF3-312
|2 G:(DE-HGF)POF3-300
|v Functional and structural genomics
|x 0
|4 G:(DE-HGF)POF
|3 G:(DE-HGF)POF3
|b Gesundheit
914 1 _ |y 2019
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b BMC BIOINFORMATICS : 2017
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0310
|2 StatID
|b NCBI Molecular Biology Database
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0320
|2 StatID
|b PubMed Central
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0501
|2 StatID
|b DOAJ Seal
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0500
|2 StatID
|b DOAJ
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b DOAJ : Blind peer review
915 _ _ |a Creative Commons Attribution CC BY (No Version)
|0 LIC:(DE-HGF)CCBYNV
|2 V:(DE-HGF)
|b DOAJ
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0600
|2 StatID
|b Ebsco Academic Search
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b ASC
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
915 _ _ |a WoS
|0 StatID:(DE-HGF)0111
|2 StatID
|b Science Citation Index Expanded
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1050
|2 StatID
|b BIOSIS Previews
915 _ _ |a IF < 5
|0 StatID:(DE-HGF)9900
|2 StatID
920 1 _ |0 I:(DE-He78)B062-20160331
|k B062
|l Pädiatrische Neuroonkologie
|x 0
920 1 _ |0 I:(DE-He78)L101-20160331
|k L101
|l DKTK Heidelberg
|x 1
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a I:(DE-He78)B062-20160331
980 _ _ |a I:(DE-He78)L101-20160331
980 _ _ |a UNRESTRICTED

