TY  - JOUR
AU  - Deutelmoser, Heike
AU  - Scherer, Dominique
AU  - Brenner, Hermann
AU  - Waldenberger, Melanie
AU  - Suhre, Karsten
AU  - Kastenmüller, Gabi
AU  - Lorenzo Bermejo, Justo
TI  - Robust Huber-LASSO for improved prediction of protein, metabolite and gene expression levels relying on individual genotype data
JO  - Briefings in Bioinformatics
VL  - 22
IS  - 4
SN  - 1477-4054
CY  - Oxford [u.a.]
PB  - Oxford University Press
M1  - DKFZ-2020-02221
SP  - bbaa230
PY  - 2021
N1  - #EA:C120#
AB  - Least absolute shrinkage and selection operator (LASSO) regression is often applied to select the most promising set of single nucleotide polymorphisms (SNPs) associated with a molecular phenotype of interest. While the penalization parameter λ restricts the number of selected SNPs and the potential model overfitting, the least-squares loss function of standard LASSO regression translates into a strong dependence of statistical results on a small number of individuals with phenotypes or genotypes divergent from the majority of the study population (typically comprised of outliers and high-leverage observations). Robust methods have been developed to constrain the influence of divergent observations and generate statistical results that apply to the bulk of study data, but they have rarely been applied to genetic association studies. In this article, we review, for newcomers to the field of robust statistics, a novel version of standard LASSO that utilizes the Huber loss function. We conduct comprehensive simulations and analyze real protein, metabolite, mRNA expression and genotype data to compare the stability of penalization, the cross-iteration concordance of the model, the false-positive and true-positive rates and the prediction accuracy of standard and robust Huber-LASSO. Although the two methods showed controlled false-positive rates ≤2.1
LB  - PUB:(DE-HGF)16
C6  - pmid:33063116
DO  - 10.1093/bib/bbaa230
UR  - https://inrepo02.dkfz.de/record/164053
ER  -