Omics feature selection with the extended SIS R package: identification of a body mass index epigenetic multi-marker in the Strong Heart Study

Am J Epidemiol. 2024 Feb 20:kwae006. doi: 10.1093/aje/kwae006. Online ahead of print.

ABSTRACT

The statistical analysis of omics data poses a great computational challenge because the data are ultra-high dimensional and the features are frequently correlated with one another. In this work, we extended the Iterative Sure Independence Screening (ISIS) algorithm by pairing ISIS with elastic-net (Enet) and two versions of adaptive Enet (AEnet and MSAEnet) to improve feature selection and effect estimation in omics research. We then used genome-wide human blood DNA methylation data from American Indian participants in the Strong Heart Study (N=2,235), measured in 1989-1991, to compare the performance (predictive accuracy, coefficient estimation and computational efficiency) of SIS-paired regularization methods with that of Bayesian shrinkage and traditional linear regression in identifying an epigenomic multi-marker of body mass index (BMI). ISIS-AEnet outperformed the other methods in prediction. In biological pathway enrichment analysis of genes annotated to BMI-related differentially methylated positions, ISIS-AEnet captured most of the enriched pathways that were shared by at least two of the evaluated methods. ISIS-AEnet can favor biological discovery because it identifies the most robust biological pathways while achieving an optimal balance between bias and efficient feature selection. In the extended SIS R package, we also implemented ISIS paired with Cox and logistic regression for time-to-event an...
Source: Am J Epidemiol - Category: Epidemiology - Source Type: research
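
The screen-then-regularize idea behind the abstract can be illustrated with a small sketch. This is not the SIS R package and not the authors' implementation: it is a simplified, single-pass version (plain SIS rather than iterative ISIS, plain elastic net rather than the adaptive variants), written in Python with NumPy only. The function names `sis_screen`, `elastic_net` and `sis_enet`, the screening size n/log(n), the fixed penalty `lam` (a real analysis would tune it, for example by cross-validation) and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def sis_screen(X, y, d):
    """Keep the d features with the largest absolute marginal correlation with y."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)   # assumes no constant columns
    yc = (y - y.mean()) / y.std()
    score = np.abs(Xc.T @ yc) / len(y)
    return np.sort(np.argsort(score)[::-1][:d])

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net(X, y, lam, alpha=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    Expects standardized columns in X and a centered y."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]          # partial residual without feature j
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            resid -= X[:, j] * beta[j]          # restore the full residual
    return beta

def sis_enet(X, y, d=None, lam=0.2, alpha=0.5):
    """Hypothetical helper: one SIS pass followed by an elastic-net fit."""
    n = X.shape[0]
    if d is None:
        d = int(n / np.log(n))                  # assumed screening size
    keep = sis_screen(X, y, d)
    Xs = X[:, keep]
    Xs = (Xs - Xs.mean(axis=0)) / Xs.std(axis=0)
    beta = elastic_net(Xs, y - y.mean(), lam, alpha)
    nonzero = beta != 0
    return keep[nonzero], beta[nonzero], keep

# Simulated data (not from the study): 2000 features, only the first 3 matter.
rng = np.random.default_rng(0)
n, p = 200, 2000
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 2.0]
y = X @ true_beta + rng.standard_normal(n)

selected, coefs, screened = sis_enet(X, y)
print("features kept by screening:", len(screened))
print("features kept by elastic net:", selected)
```

Screening first reduces the elastic-net fit from thousands of columns to a few dozen, which is where the computational saving comes from. The iterative variant described in the abstract repeats the screening step on residuals so that features missed by marginal correlation can still enter the model.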