Comparative evaluation of multiomics integration tools for the study of prediabetes: insights into the earliest stages of type 2 diabetes mellitus

Abstract: Type 2 diabetes mellitus (T2D) remains a critical health concern, particularly in its early disease stages such as prediabetes. Understanding these early stages is paramount for improving patient outcomes. Multiomics data integration tools offer promise in unraveling the underlying mechanisms of T2D. The advent of high-throughput technology and the increasing availability of multiomics data have led to the development of several statistical and network-based integration methods. However, the performance of these methods varies, so their outputs need to be evaluated in an unbiased manner. Here, we conducted a comparative analysis of three representative unsupervised multiomics integration tools, MOFA+, GFA, and ICluster, alongside an in-house supervised model, EMFR, using two complementary benchmarks. First, we assessed how well the features selected by each tool could discriminate between patient and control samples using both linear and nonlinear classification models. Second, we quantified how much the features selected from each omics data type contributed to the total variance. Comparing the unsupervised tools in this way, we observed that the features selected by MOFA+ and GFA gave the best F1 score (0.7) in the nonlinear classification model, clearly discriminating between patient and control classes. Hence, we recommend these two unsupervised integration tools for feature selection purposes. Our comparative analyses were conducted on a real biological...
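For readers unfamiliar with the metric used in the first benchmark, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a single positive class. The labels are made-up illustrative values, not data from the study, and the abstract does not say how F1 was averaged across classes, so the single-class form shown here is an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for illustration only: 1 = patient, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this toy example three of four patients are recovered and one control is misclassified as a patient, so precision and recall are both 0.75 and the F1 score is 0.75.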
Source: Network Modeling Analysis in Health Informatics and Bioinformatics - Category: Bioinformatics Source Type: research