High-Order Correlation Integration for Single-Cell or Bulk RNA-seq Data Analysis

Conclusion: Distinct types of biological data can together provide a more precise picture of complex biological processes (Ghazalpour et al., 2006; Kutalik et al., 2008; Li et al., 2012; Zhang et al., 2012; Chen and Zhang, 2016; Zeng et al., 2016; Feng et al., 2018; Yu and Zeng, 2018). In recent decades, many approaches have been proposed for analyzing single-cell or multi-omics data to identify subtypes and construct biological networks (Gygi et al., 1999; Ding and He, 2004; Chari et al., 2010; Zhang et al., 2011; Kiselev et al., 2017; Guo et al., 2018a,b; Wang et al., 2018). However, most methods are limited in how reliably they can identify sample types from multiple datasets, for example by noise in the data and by computational cost. Some methods also fail to make full use of the similarity information between samples, which makes their results unreliable. To overcome these problems, a flexible and efficient integration method with automated information fusion and bias correction is needed. In this work, we introduced HCI (high-order correlation integration), a data-driven integration method. Its key idea is to incorporate high-order similarity matrices (e.g., Pearson correlation matrix) into pattern fusion analysis, so that the sample cluster or subtype structure can be determined with the help of the high-order correlations. The combinatorial sample patterns obtained from HCI could represent comprehensive characterizat...
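To make the notion of a high-order similarity matrix concrete, the following is a minimal sketch and not the authors' HCI implementation. It assumes that a "second-order" correlation can be illustrated as the Pearson correlation between rows of the first-order sample-by-sample correlation matrix. The function names, the synthetic data, and the `order` parameter are hypothetical, and the pattern fusion step is not shown.

```python
import numpy as np


def sample_correlation(X):
    """First-order Pearson correlation between samples.

    X is a (features x samples) expression matrix, so samples are columns.
    """
    return np.corrcoef(X, rowvar=False)


def high_order_correlation(X, order=2):
    """Illustrative high-order correlation (an assumption, not the HCI definition).

    Order 1 is the ordinary sample correlation matrix. Each further order
    correlates the rows of the previous matrix, so two samples are similar
    if they relate to all other samples in a similar way.
    """
    C = sample_correlation(X)
    for _ in range(order - 1):
        C = np.corrcoef(C)
    return C


# Synthetic example: two groups of 10 samples sharing a group-specific profile.
rng = np.random.default_rng(0)
n_genes, n_per_group = 200, 10
profile_a = rng.normal(size=(n_genes, 1))
profile_b = rng.normal(size=(n_genes, 1))
X = np.hstack([
    profile_a + 0.5 * rng.normal(size=(n_genes, n_per_group)),
    profile_b + 0.5 * rng.normal(size=(n_genes, n_per_group)),
])

C1 = high_order_correlation(X, order=1)
C2 = high_order_correlation(X, order=2)

within = C2[:n_per_group, :n_per_group].mean()
between = C2[:n_per_group, n_per_group:].mean()
print("mean second-order correlation within group:", within)
print("mean second-order correlation between groups:", between)
```

In this toy setting the second-order matrix tends to sharpen the contrast between within-group and between-group similarity, which is the intuition behind using high-order correlations to expose cluster or subtype structure.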
Source: Frontiers in Genetics - Category: Genetics & Stem Cells - Source Type: research