scAMAC: self-supervised clustering of scRNA-seq data based on adaptive multi-scale autoencoder
Brief Bioinform. 2024 Jan 22;25(2):bbae068. doi: 10.1093/bib/bbae068. ABSTRACT: Cluster assignment is vital to analyzing single-cell RNA sequencing (scRNA-seq) data to understand high-level biological processes. Deep learning-based clustering methods have recently been widely used in scRNA-seq data analysis. However, existing deep models often overlook the interconnections and interactions among network layers, leading to the loss of structural information within the network layers. Herein, we develop a new self-supervised clustering method based on an adaptive multi-scale autoencoder, called scAMAC. The self-supervised clust...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Dayu Tan Cheng Yang Jing Wang Yansen Su Chunhou Zheng Source Type: research

CRISPRlnc: a machine learning method for lncRNA-specific single-guide RNA design of CRISPR/Cas9 system
In this study, we first evaluated the performances of a series of known sgRNA-designing tools in the context of both coding and non-coding datasets. Meanwhile, we analyzed the underpinnings of their varied performances on the sgRNA's specificity for lncRNA including nucleic acid sequence, genome location and editing mechanism preference. Furthermore, we introduce a support vector machine-based machine learning algorithm named CRISPRlnc, which aims to model both CRISPR knock-out (CRISPRko) and CRISPR inhibition (CRISPRi) mechanisms to predict the on-target activity of targets. CRISPRlnc combined the paired-sgRNA design and ...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Zitian Yang Zexin Zhang Jing Li Wen Chen Changning Liu Source Type: research

TransGCN: a semi-supervised graph convolution network-based framework to infer protein translocations in spatio-temporal proteomics
Brief Bioinform. 2024 Jan 22;25(2):bbae055. doi: 10.1093/bib/bbae055. ABSTRACT: Protein subcellular localization (PSL) is essential for understanding protein function, and the movement of proteins between subcellular niches within cells plays fundamental roles in the regulation of biological processes. Mass spectrometry-based spatio-temporal proteomics technologies can provide new insights into protein translocation, but noise interference and insufficient data mining make it challenging to identify reliable protein translocation events. We propose a semi-supervised graph convolution network (GCN)-based framework termed Tr...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Bing Wang Xiangzheng Zhang Xudong Han Bingjie Hao Yan Li Xuejiang Guo Source Type: research

Elevated incidence of somatic mutations at prevalent genetic sites
Brief Bioinform. 2024 Jan 22;25(2):bbae065. doi: 10.1093/bib/bbae065. ABSTRACT: The common loci represent a distinct set of human genome sites that harbor genetic variants found in at least 1% of the population. Small somatic mutations occurring at common loci and at non-common loci, i.e. csmVariants and ncsmVariants, are presumed to arise with similar probabilities. However, our work revealed that within the coding region, common loci constituted only 1.03% of all loci, yet they accounted for 5.14% of TCGA somatic mutations. Furthermore, the small somatic mutation incidence rate at these common loci was 2.7 times that observed in th...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Mengyao Wang Shuai Cheng Li Bairong Shen Source Type: research

Deeply integrating latent consistent representations in high-noise multi-omics data for cancer subtyping
Brief Bioinform. 2024 Jan 22;25(2):bbae061. doi: 10.1093/bib/bbae061. ABSTRACT: Cancer is a complex and high-mortality disease regulated by multiple factors. Accurate cancer subtyping is crucial for formulating personalized treatment plans and improving patient survival rates. The underlying mechanisms that drive cancer progression can be comprehensively understood by analyzing multi-omics data. However, the high noise levels in omics data often pose challenges in capturing consistent representations and adequately integrating their information. This paper proposed a novel variational autoencoder-based deep learning model, na...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Yueyi Cai Shunfang Wang Source Type: research

Adjustment of scRNA-seq data to improve cell-type decomposition of spatial transcriptomics
Brief Bioinform. 2024 Jan 22;25(2):bbae063. doi: 10.1093/bib/bbae063. ABSTRACT: Most sequencing-based spatial transcriptomics (ST) technologies do not achieve single-cell resolution, so each captured location (spot) may contain a mixture of cells from heterogeneous cell types. Several cell-type decomposition methods have been proposed to estimate the cell type proportions of each spot by integrating single-cell RNA sequencing (scRNA-seq) data. However, these existing methods did not fully consider the effect of distribution difference between scRNA-seq and ST data for decomposition, leading to biased cell-type-specific...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Lanying Wang Yuxuan Hu Lin Gao Source Type: research

Biolinguistic graph fusion model for circRNA-miRNA association prediction
Brief Bioinform. 2024 Jan 22;25(2):bbae058. doi: 10.1093/bib/bbae058. ABSTRACT: Emerging clinical evidence suggests that sophisticated associations with circular ribonucleic acids (RNAs) (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor of various pathological processes and play a critical role in most intricate human diseases. Nonetheless, the above correlations via wet experiments are error-prone and labor-intensive, and the underlying novel circRNA-miRNA association (CMA) has been validated by numerous existing computational methods that rely only on single correlation data. Considering the inadequacy of e...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Lu-Xiang Guo Lei Wang Zhu-Hong You Chang-Qing Yu Meng-Lei Hu Bo-Wei Zhao Yang Li Source Type: research

Ion entropy and accurate entropy-based FDR estimation in metabolomics
Brief Bioinform. 2024 Jan 22;25(2):bbae056. doi: 10.1093/bib/bbae056. ABSTRACT: Accurate metabolite annotation and false discovery rate (FDR) control remain challenging in large-scale metabolomics. Recent progress leveraging proteomics experiences and interdisciplinary inspirations has provided valuable insights. While target-decoy strategies have been introduced, generating reliable decoy libraries is difficult due to metabolite complexity. Moreover, continuous bioinformatics innovation is imperative to improve the utilization of expanding spectral resources while reducing false annotations. Here, we introduce the concept of...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Shaowei An Miaoshan Lu Ruimin Wang Jinyin Wang Hengxuan Jiang Cong Xie Junjie Tong Changbin Yu Source Type: research

Dual-channel hypergraph convolutional network for predicting herb-disease associations
Brief Bioinform. 2024 Jan 22;25(2):bbae067. doi: 10.1093/bib/bbae067. ABSTRACT: The applicability of herbs in disease treatment has been verified through experience over thousands of years. The understanding of herb-disease associations (HDAs) is still far from complete due to the complicated mechanism inherent in multi-target and multi-component (MTMC) botanical therapeutics. Most of the existing prediction models fail to incorporate the MTMC mechanism. To overcome this problem, we propose a novel dual-channel hypergraph convolutional network, namely HGHDA, for HDA prediction. Technically, HGHDA first adopts an autoencoder to projec...
Source: Briefings in Bioinformatics - March 1, 2024 Category: Bioinformatics Authors: Lun Hu Menglong Zhang Pengwei Hu Jun Zhang Chao Niu Xueying Lu Xiangrui Jiang Yupeng Ma Source Type: research
