Evaluation of a phenotype imputation approach using GAW20 simulated data

Abstract: Statistical power, the probability of correctly rejecting a false null hypothesis, is a limiting factor in genome-wide association studies (GWAS). Sample size is a major determinant of statistical power, and it is readily reduced by missing phenotypic data, which restricts the ability to detect associated single-nucleotide polymorphisms (SNPs) with small effect sizes. Although some phenotypes are hard to collect because of cost and loss to follow-up, correlated phenotypes that are easily collected can be leveraged for association analysis. In this paper, we evaluate a phenotype imputation method that incorporates family structure and correlation between multiple phenotypes using GAW20 simulated data. The distribution of each missing value is derived from information on the relatives of the individual with the missing phenotype and from additional correlated phenotypes. We show that this imputation method can improve power in the association analysis compared with excluding observations with missing data, while achieving the correct Type I error rate. We also examine factors that may affect the imputation accuracy.
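The idea of deriving a distribution for a missing value from relatives and correlated phenotypes can be illustrated with a minimal sketch. It assumes, purely for illustration, that the phenotype values are jointly multivariate normal with a known mean vector and covariance matrix; the abstract does not state the model the paper uses, and the function name `conditional_normal` and the example numbers are hypothetical.

```python
import numpy as np

def conditional_normal(mu, sigma, y):
    """Conditional mean and covariance of the missing entries of y (NaN)
    given the observed entries, under an ASSUMED joint normal model
    with mean vector mu and covariance matrix sigma."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    y = np.asarray(y, dtype=float)

    miss = np.isnan(y)   # positions to impute
    obs = ~miss          # positions with observed values

    s_oo = sigma[np.ix_(obs, obs)]    # covariance among observed values
    s_mo = sigma[np.ix_(miss, obs)]   # covariance between missing and observed
    s_mm = sigma[np.ix_(miss, miss)]  # covariance among missing values

    # w = s_mo @ inv(s_oo), computed with a linear solve instead of an inverse
    w = np.linalg.solve(s_oo, s_mo.T).T

    cond_mean = mu[miss] + w @ (y[obs] - mu[obs])
    cond_cov = s_mm - w @ s_mo.T
    return cond_mean, cond_cov

# Hypothetical example: one phenotype measured in two relatives whose values
# have correlation 0.5 (illustrative numbers, not taken from GAW20).
mu = [0.0, 0.0]
sigma = [[1.0, 0.5],
         [0.5, 1.0]]
y = [np.nan, 2.0]  # first relative missing, second observed

mean, cov = conditional_normal(mu, sigma, y)
```

In this toy case the observed relative pulls the imputed value toward their own measurement, and the conditional variance is smaller than the marginal variance. The same calculation extends to several relatives and several correlated phenotypes by enlarging `mu` and `sigma`.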
Source: BMC Proceedings - Category: Biomedical Science - Source Type: research