Analysis of cancer gene expression data with an assisted robust marker identification approach
In this study, we develop an ARMI (assisted robust marker identification) approach for analyzing cancer studies with measurements on GEs as well as regulators. The proposed approach borrows information from regulators and can be more effective than analyzing GE data alone. A robust objective function is adopted to accommodate long‐tailed distributions. Marker identification is effectively realized using penalization. The proposed approach has an intuitive formulation and is computationally much affordable. Simulation shows its satisfactory performance under a variety of settings. TCGA (The Cancer Genome Atlas) data on me...
Source: Genetic Epidemiology - September 14, 2017 Category: Epidemiology Authors: Hao Chai, Xingjie Shi, Qingzhao Zhang, Qing Zhao, Yuan Huang, Shuangge Ma Tags: RESEARCH ARTICLE Source Type: research

Phenotype validation in electronic health records based genetic association studies
Abstract The linkage between electronic health records (EHRs) and genotype data makes it plausible to study the genetic susceptibility of a wide range of disease phenotypes. Despite that EHR‐derived phenotype data are subjected to misclassification, it has been shown useful for discovering susceptible genes, particularly in the setting of phenome‐wide association studies (PheWAS). It is essential to characterize discovered associations using gold standard phenotype data by chart review. In this work, we propose a genotype stratified case‐control sampling strategy to select subjects for phenotype validation. We develo...
Source: Genetic Epidemiology - September 1, 2017 Category: Epidemiology Authors: Lu Wang, Scott M. Damrauer, Hong Zhang, Alan X. Zhang, Rui Xiao, Jason H. Moore, Jinbo Chen Tags: RESEARCH ARTICLE Source Type: research

Evolutionarily derived networks to inform disease pathways
ABSTRACT Methods to identify genes or pathways associated with complex diseases are often inadequate to elucidate most risk because they make implicit and oversimplified assumptions about underlying models of disease etiology. These can lead to incomplete or inadequate conclusions. To address this, we previously developed human phenotype networks (HPN), linking phenotypes based on shared biology. However, such visualization alone is often uninterpretable, and requires additional filtering. Here, we expand the HPN to include another method, evolutionary triangulation (ET). ET utilizes the hypothesis that alleles affecting d...
Source: Genetic Epidemiology - September 1, 2017 Category: Epidemiology Authors: Britney E. Graham, Christian Darabos, Minjun Huang, Louis J. Muglia, Jason H. Moore, Scott M. Williams Tags: RESEARCH ARTICLE Source Type: research

The more you test, the more you find: The smallest P‐values become increasingly enriched with real findings as more tests are conducted
ABSTRACT The increasing accessibility of data to researchers makes it possible to conduct massive amounts of statistical testing. Rather than follow specific scientific hypotheses with statistical analysis, researchers can now test many possible relationships and let statistics generate hypotheses for them. The field of genetic epidemiology is an illustrative case, where testing of candidate genetic variants for association with an outcome has been replaced by agnostic screening of the entire genome. Poor replication rates of candidate gene studies have improved dramatically with the increase in genomic coverage, due to fa...
Source: Genetic Epidemiology - September 1, 2017 Category: Epidemiology Authors: Olga A. Vsevolozhskaya, Chia‐Ling Kuo, Gabriel Ruiz, Luda Diatchenko, Dmitri V. Zaykin Tags: RESEARCH ARTICLE Source Type: research

Iterative hard thresholding for model selection in genome‐wide association studies
ABSTRACT A genome‐wide association study (GWAS) correlates marker and trait variation in a study sample. Each subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here, we assume that subjects are randomly collected unrelateds and that trait values are normally distributed or can be transformed to normality. Over the past decade, geneticists have been remarkably successful in applying GWAS analysis to hundreds of traits. The massive amount of data produced in these studies present unique computational challenges. Penalized regression with the ℓ1 penalty (LASSO) or minimax c...
Source: Genetic Epidemiology - September 1, 2017 Category: Epidemiology Authors: Kevin L. Keys, Gary K. Chen, Kenneth Lange Tags: RESEARCH ARTICLE Source Type: research

A multivariate distance‐based analytic framework for microbial interdependence association test in longitudinal study
ABSTRACT Human microbiome is the collection of microbes living in and on the various parts of our body. The microbes living on our body in nature do not live alone. They act as integrated microbial community with massive competing and cooperating and contribute to our human health in a very important way. Most current analyses focus on examining microbial differences at a single time point, which do not adequately capture the dynamic nature of the microbiome data. With the advent of high‐throughput sequencing and analytical tools, we are able to probe the interdependent relationship among microbial species through longit...
Source: Genetic Epidemiology - September 1, 2017 Category: Epidemiology Authors: Yilong Zhang, Sung Won Han, Laura M. Cox, Huilin Li Tags: RESEARCH ARTICLE Source Type: research

Improving power of association tests using multiple sets of imputed genotypes from distributed reference panels
ABSTRACT The accuracy of genotype imputation depends upon two factors: the sample size of the reference panel and the genetic similarity between the reference panel and the target samples. When multiple reference panels are not consented to combine together, it is unclear how to combine the imputation results to optimize the power of genetic association studies. We compared the accuracy of 9,265 Norwegian genomes imputed from three reference panels—1000 Genomes phase 3 (1000G), Haplotype Reference Consortium (HRC), and a reference panel containing 2,201 Norwegian participants from the population‐based Nord Trøndelag H...
Source: Genetic Epidemiology - September 1, 2017 Category: Epidemiology Authors: Wei Zhou, Lars G. Fritsche, Sayantan Das, He Zhang, Jonas B. Nielsen, Oddgeir L. Holmen, Jin Chen, Maoxuan Lin, Maiken B. Elvestad, Kristian Hveem, Goncalo R. Abecasis, Hyun Min Kang, Cristen J. Willer Tags: RESEARCH ARTICLE Source Type: research

A functional U‐statistic method for association analysis of sequencing data
Abstract Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease‐associated genes, the different ty...
Source: Genetic Epidemiology - August 29, 2017 Category: Epidemiology Authors: Sneha Jadhav, Xiaoran Tong, Qing Lu Tags: RESEARCH ARTICLE Source Type: research

The 2017 Annual Meeting of the International Genetic Epidemiology Society
Source: Genetic Epidemiology - August 22, 2017 Category: Epidemiology Tags: ABSTRACTS Source Type: research

Issue Information
Source: Genetic Epidemiology - August 16, 2017 Category: Epidemiology Tags: ISSUE INFORMATION Source Type: research

An efficient study design to test parent‐of‐origin effects in family trios
ABSTRACT Increasing evidence has shown that genes may cause prenatal, neonatal, and pediatric diseases depending on their parental origins. Statistical models that incorporate parent‐of‐origin effects (POEs) can improve the power of detecting disease‐associated genes and help explain the missing heritability of diseases. In many studies, children have been sequenced for genome‐wide association testing. But it may become unaffordable to sequence their parents and evaluate POEs. Motivated by the reality, we proposed a budget‐friendly study design of sequencing children and only genotyping their parents through sing...
Source: Genetic Epidemiology - July 1, 2017 Category: Epidemiology Authors: Xiaobo Yu, Gao Chen, Rui Feng Tags: RESEARCH ARTICLE Source Type: research

Adaptive testing for association between two random vectors in moderate to high dimensions
ABSTRACT Testing for association between two random vectors is a common and important task in many fields, however, existing tests, such as Escoufier's RV test, are suitable only for low‐dimensional data, not for high‐dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often expected with only a few, but not many, variables associated with each other. We generalize the RV test to moderate‐to‐high dimensions. The key idea is to data adaptively weight each variable pair based on its empirical association. As the consequence, the proposed test is adaptive, alleviatin...
Source: Genetic Epidemiology - July 1, 2017 Category: Epidemiology Authors: Zhiyuan Xu, Gongjun Xu, Wei Pan Tags: RESEARCH ARTICLE Source Type: research

A comparison of methods for inferring causal relationships between genotype and phenotype using additional biological measurements
ABSTRACT Genome wide association studies (GWAS) have been very successful over the last decade at identifying genetic variants associated with disease phenotypes. However, interpretation of the results obtained can be challenging. Incorporation of further relevant biological measurements (e.g. ‘omics’ data) measured in the same individuals for whom we have genotype and phenotype data may help us to learn more about the mechanism and pathways through which causal genetic variants affect disease. We review various methods for causal inference that can be used for assessing the relationships between genetic variables, oth...
Source: Genetic Epidemiology - July 1, 2017 Category: Epidemiology Authors: Holly F. Ainsworth, So‐Youn Shin, Heather J. Cordell Tags: RESEARCH ARTICLE Source Type: research

Improving power for rare‐variant tests by integrating external controls
ABSTRACT Due to the drop in sequencing cost, the number of sequenced genomes is increasing rapidly. To improve power of rare‐variant tests, these sequenced samples could be used as external control samples in addition to control samples from the study itself. However, when using external controls, possible batch effects due to the use of different sequencing platforms or genotype calling pipelines can dramatically increase type I error rates. To address this, we propose novel summary statistics based single and gene‐ or region‐based rare‐variant tests that allow the integration of external controls while controllin...
Source: Genetic Epidemiology - June 28, 2017 Category: Epidemiology Authors: Seunggeun Lee, Sehee Kim, Christian Fuchsberger Tags: RESEARCH ARTICLE Source Type: research

Accommodating missingness in environmental measurements in gene‐environment interaction analysis
In this study, we conduct G‐E interaction analysis with prognosis data under an accelerated failure time (AFT) model. To accommodate missingness in E measurements, we adopt a nonparametric kernel‐based data augmentation approach. With a well‐designed weighting scheme, a nice “byproduct” is that the proposed approach enjoys a certain robustness property. A penalization approach, which respects the “main effects, interactions” hierarchy, is adopted for selection (of important interactions and main effects) and regularized estimation. The proposed approach has sound interpretations and a solid statistical basis....
Source: Genetic Epidemiology - June 28, 2017 Category: Epidemiology Authors: Mengyun Wu, Yangguang Zang, Sanguo Zhang, Jian Huang, Shuangge Ma Tags: RESEARCH ARTICLE Source Type: research