TY  - JOUR
AU  - Saadati, Maral
AU  - Benner, Axel
TI  - Statistical challenges of high-dimensional methylation data.
JO  - Statistics in medicine
VL  - 33
IS  - 30
SN  - 0277-6715
CY  - Chichester [u.a.]
PB  - Wiley
M1  - DKFZ-2017-04207
SP  - 5347 - 5357
PY  - 2014
AB  - With the fast growing field of epigenetics comes the need to better understand the intricacies of DNA methylation data analysis. High-throughput profiling using techniques, such as Illumina's BeadArray assay, enable the quantitative assessment of methylation. Challenges arise from the fact that resulting methylation levels (so-called beta values) are proportions between 0 and 1, often from an asymmetric, bimodal distribution with peaks close to 0 and 1. Therefore, the majority of standard statistical approaches do not apply. The logit transformation into so-called M-values is a common approach to circumvent this problem and aims to allow the use of common statistical methods. However, it can be observed that the transformation from beta to M-values does not necessarily result in an approximately homoscedastic distribution. Often, bimodality, asymmetry and heteroscedasticity are conserved even after transformation. We give an overview and discussion of methods suggested in the recent years that attempt to address the characteristics of methylation data in univariate screening settings. In order to identify 'differential' methylation with respect to covariates of interest while adjusting for confounders, we compare parametric methods, such as linear and beta regression, and nonparametric methods, such as rank-based regression. Our goal is to sensitise researchers to the challenges and issues that arise from this type of data as well as to present possible solutions.
LB  - PUB:(DE-HGF)16
C6  - pmid:25042556
DO  - DOI:10.1002/sim.6251
UR  - https://inrepo02.dkfz.de/record/128189
ER  -