Predicting childhood allergy using machine learning methods on multi-omics data

Asthma, rhinitis, and eczema are among the most prevalent allergic diseases worldwide. They have strong genetic and epigenetic contributions. We hypothesized that integrating multiple omics layers can accurately predict allergy using machine learning methods. We combined data on environmental and genetic risk scores with blood and nasal DNA methylation data from 348 subjects aged 16 years from the Dutch PIAMA (Prevention and Incidence of Asthma and Mite Allergy) birth cohort. After assessing multiple machine learning methods, we selected Elastic Net for its accuracy, low overfitting, and interpretability. The majority of predictive power could be attributed to nasal DNA methylation. Using strict feature selection, we created a parsimonious allergy prediction model based on just three nasal CpG sites that robustly predicts allergic disease. This model achieved a ROC AUC of 0.86 in the discovery PIAMA cohort and 0.82 in a Puerto Rican replication cohort of similar age. Lower performance was observed in two younger replication cohorts, one Dutch and one Danish, both at age 6 years, which could be explained by age-dependent differences in methylation levels. The DNA methylation levels of the model’s three CpG sites are related to IgE sensitization and allergic disease comorbidity and are able to differentiate between symptomatic and asymptomatic allergy. The transcriptomic features associated with methylation at these CpG sites indicated increased presence or activity of ...
Source: European Respiratory Journal - Category: Respiratory Medicine Authors: Tags: Paediatric asthma and allergy Source Type: research
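
The two technical ingredients named in the abstract, an Elastic Net penalized classifier and ROC AUC as the performance measure, can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (not PIAMA), the fitting routine is a plain proximal-gradient loop chosen for readability, and every function name and hyperparameter (`make_synthetic`, `fit_elastic_net_logistic`, `alpha`, `l1_ratio`, `lr`, `n_iter`) is hypothetical.

```python
import math
import random


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (case, control) pairs in which the
    case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_elastic_net_logistic(X, y, alpha=0.05, l1_ratio=0.5, lr=0.1, n_iter=300):
    """Logistic regression with penalty
    alpha * (l1_ratio * |w|_1 + 0.5 * (1 - l1_ratio) * |w|_2^2),
    fitted by proximal gradient descent. The intercept is not penalized.
    All hyperparameter values are illustrative assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        for j in range(d):
            # Gradient step on log-loss plus the L2 part, then L1 shrinkage.
            step = w[j] - lr * (grad_w[j] + alpha * (1.0 - l1_ratio) * w[j])
            w[j] = soft_threshold(step, lr * alpha * l1_ratio)
    return w, b


def make_synthetic(n, d, seed=0):
    """Synthetic stand-in data: feature 0 carries signal, the rest are noise."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if row[0] + 0.5 * rng.gauss(0.0, 1.0) > 0 else 0 for row in X]
    return X, y


X, y = make_synthetic(200, 6, seed=0)
w, b = fit_elastic_net_logistic(X, y)
scores = [b + sum(wj * xj for wj, xj in zip(w, row)) for row in X]
print("coefficients:", [round(wj, 3) for wj in w])
print("in-sample ROC AUC:", round(roc_auc(y, scores), 3))
```

The L1 part of the penalty can shrink uninformative coefficients to exactly zero, which is one reason Elastic Net lends itself to small, interpretable models. The AUC printed here is computed on the training data, so it is optimistic; the abstract's figures come from a discovery cohort and separate replication cohorts.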