MoCoLo: a testing framework for motif co-localization
We present a new analytical method for examining feature interaction by introducing the notion of reciprocal co-occurrence, define statistics to estimate it and hypotheses to test for it. Our approach leverages conditional motif co-occurrence events between features to infer their co-localization. Using reverse conditional probabilities and introducing a novel simulation approach that retains motif properties (e.g. length, guanine-content), our method further accounts for potential confounders in testing. As a proof-of-concept, motif co-localization (MoCoLo) confirmed the co-occurrence of histone markers in a breast cancer...
Source: Briefings in Bioinformatics - March 23, 2024 Category: Bioinformatics Authors: Qi Xu Imee M A Del Mundo Maha Zewail-Foote Brian T Luke Karen M Vasquez Jeanne Kowalski Source Type: research
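
The abstract above describes co-occurrence measured in both conditional directions. As a rough illustration only, the sketch below computes two overlap fractions for a pair of interval sets: the share of A features overlapped by any B feature, and the reverse. The function names, the half-open interval convention and the overlap rule are assumptions made for this example; the truncated abstract does not define MoCoLo's actual statistics, and the property-preserving simulation step is not shown.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) on one chromosome.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def conditional_cooccurrence(query: List[Interval], target: List[Interval]) -> float:
    """Fraction of query intervals overlapped by at least one target interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, t) for t in target))
    return hits / len(query)


def reciprocal_cooccurrence(a: List[Interval], b: List[Interval]) -> Tuple[float, float]:
    """Return both directions: (share of A hit by B, share of B hit by A)."""
    return conditional_cooccurrence(a, b), conditional_cooccurrence(b, a)


feature_a = [(0, 10), (20, 30), (40, 50), (60, 70)]
feature_b = [(5, 8), (100, 110)]
a_given_b, b_given_a = reciprocal_cooccurrence(feature_a, feature_b)
print(a_given_b, b_given_a)
```

The two fractions generally differ, which is why a one-directional overlap count can give a misleading picture of co-localization.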

scHybridBERT: integrating gene regulation and cell graph for spatiotemporal dynamics in single-cell clustering
In this study, spatiotemporal embedding and cell graphs are extracted to capture spatial dynamics at the molecular level. To enhance the accuracy of cell type detection, this study proposes the scHybridBERT architecture to conduct multi-view modeling of scRNA-seq data using extracted spatiotemporal patterns. In this scHybridBERT method, graph learning models are employed to deal with cell graphs and the Performer model employs spatiotemporal embeddings. Experimental results on benchmark scRNA-seq datasets indicate that the proposed scHybridBERT method is able to enhance the accuracy of single-cell clustering t...
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Zhang Wei Wu Chenjun Xing Feiyang Jiang Mingfeng Zhang Yixuan Liu Qi Shi Zhuoxing Dai Qi Source Type: research

ReHoGCNES-MDA: prediction of miRNA-disease associations using homogenous graph convolutional networks based on regular graph with random edge sampler
Brief Bioinform. 2024 Jan 22;25(2):bbae103. doi: 10.1093/bib/bbae103. Abstract: Numerous investigations increasingly indicate the significance of microRNA (miRNA) in human diseases. Hence, unearthing associations between miRNA and diseases can contribute to precise diagnosis and efficacious remediation of medical conditions. The detection of miRNA-disease linkages via computational techniques utilizing biological information has emerged as a cost-effective and highly efficient approach. Here, we introduced a computational framework named ReHoGCNES, designed for prospective miRNA-disease association prediction (ReHoGCNES-MDA)....
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Yufang Zhang Yanyi Chu Shenggeng Lin Yi Xiong Dong-Qing Wei Source Type: research

GPCR-IPL score: multilevel featurization of GPCR-ligand interaction patterns and prediction of ligand functions from selectivity to biased activation
In this study, we designed 3D multilevel features to describe the extracellular interaction patterns. Subsequently, these 3D features were utilized to predict the post-binding events that result from conformational dynamics from the extracellular to intracellular areas. To understand the adaptability of GPCR ligands, we collected the conformational information of flexible residues during binding and performed molecular featurization on a broad range of GPCR-ligand complexes. As a result, we developed GPCR-ligand interaction patterns, binding pockets, and ligand features as score (GPCR-IPL score) for predicting the function...
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Surendra Kumar Mahesh K Teli Mi-Hyun Kim Source Type: research

polyGBLUP: a modified genomic best linear unbiased prediction improved the genomic prediction efficiency for autopolyploid species
In this study, we developed a modified genomic best linear unbiased prediction (GBLUP) model (polyGBLUP) through constructing additive and dominant genomic relationship matrices based on different allele dosages. polyGBLUP could carry out genomic prediction for autopolyploid species regardless of the ploidy level. Through comprehensive simulations and analysis of real data of autotetraploid blueberry and guinea grass and autohexaploid sweet potato, the results showed that polyGBLUP achieved higher prediction accuracy than GBLUP and its superiority was more obvious when the ploidy level of autopolyploids is high. Furthermor...
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Hailiang Song Qin Zhang Hongxia Hu Source Type: research

pathMap: a path-based mapping tool for long noisy reads with high sensitivity
In this study, we present pathMap, a novel k-mer graph-based mapper that is specifically designed for mapping SMS reads with high sensitivity. By viewing the alignment chain as a path containing as many anchors as possible in the matched k-mer graph, pathMap treats chaining as a path selection problem in the directed graph. Because pathMap iteratively searches for the longest path among the remaining nodes, more high-quality candidate chains can be detected and aligned. Compared to other state-of-the-art mapping methods such as minimap2 and Winnowmap2, experimental results on simulated and real-life datasets demonstrate th...
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Ze-Gang Wei Xiao-Dan Zhang Xing-Guo Fan Yu Qian Fei Liu Fang-Xiang Wu Source Type: research
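
The chaining idea in the abstract above (pick the path with the most anchors, remove it, repeat) can be sketched as a longest-path search over anchors ordered by position. This is a minimal illustration under stated assumptions, not pathMap's implementation: the anchor representation, the rule that both coordinates must strictly increase along a chain, the quadratic dynamic program and the `min_anchors` cutoff are all choices made for this example.

```python
from typing import List, Tuple

# Hypothetical anchor: (reference position, read position) of a matched k-mer.
Anchor = Tuple[int, int]


def longest_chain(anchors: List[Anchor]) -> List[Anchor]:
    """Longest path in the implicit DAG where an edge i -> j exists when
    anchor j lies strictly after anchor i on both the reference and the read."""
    pts = sorted(anchors)
    if not pts:
        return []
    best = [1] * len(pts)   # best[j]: most anchors on any chain ending at j
    prev = [-1] * len(pts)  # back-pointer used to recover the chain
    for j in range(len(pts)):
        for i in range(j):
            colinear = pts[i][0] < pts[j][0] and pts[i][1] < pts[j][1]
            if colinear and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
                prev[j] = i
    end = max(range(len(pts)), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(pts[end])
        end = prev[end]
    return chain[::-1]


def iterative_chains(anchors: List[Anchor], min_anchors: int = 2) -> List[List[Anchor]]:
    """Repeatedly take the longest chain from the anchors not yet used."""
    remaining = list(anchors)
    chains = []
    while remaining:
        chain = longest_chain(remaining)
        if len(chain) < min_anchors:
            break
        chains.append(chain)
        used = set(chain)
        remaining = [a for a in remaining if a not in used]
    return chains


anchors = [(1, 1), (2, 2), (3, 3), (10, 1), (11, 2), (5, 9)]
chains = iterative_chains(anchors)
print(chains)
```

In this toy input the second chain corresponds to a second candidate mapping location for the start of the read, which a single best-chain search would discard.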

Comparative analysis of models in predicting the effects of SNPs on TF-DNA binding using large-scale in vitro and in vivo data
Brief Bioinform. 2024 Jan 22;25(2):bbae110. doi: 10.1093/bib/bbae110. Abstract: Non-coding variants associated with complex traits can alter the motifs of transcription factor (TF)-deoxyribonucleic acid binding. Although many computational models have been developed to predict the effects of non-coding variants on TF binding, their predictive power lacks systematic evaluation. Here we have evaluated 14 different models built on position weight matrices (PWMs), support vector machines, ordinary least squares and deep neural networks (DNNs), using large-scale in vitro (i.e. SNP-SELEX) and in vivo (i.e. allele-specific binding, ...
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Dongmei Han Yurun Li Linxiao Wang Xuan Liang Yuanyuan Miao Wenran Li Sijia Wang Zhen Wang Source Type: research

DDK-Linker: a network-based strategy identifies disease signals by linking high-throughput omics datasets to disease knowledge
Brief Bioinform. 2024 Jan 22;25(2):bbae111. doi: 10.1093/bib/bbae111. Abstract: High-throughput genomic and proteomic scanning approaches allow investigators to quantify genes (or gene products) genome-wide for certain disease conditions, which plays an essential role in promoting the discovery of disease mechanisms. These approaches often generate a large list of genes of interest (GOIs), such as differentially expressed genes/proteins. However, researchers have to perform manual triage and validation to explore the most promising, biologically plausible linkages between the known disease g...
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Xiangren Kong Lihong Diao Peng Jiang Shiyan Nie Shuzhen Guo Dong Li Source Type: research

DEMO-EM2: assembling protein complex structures from cryo-EM maps through intertwined chain and domain fitting
In this study, we extend our previously developed DEMO-EM to present DEMO-EM2, an automated method for constructing protein complex models from cryo-EM maps through an iterative assembly procedure intertwining chain- and domain-level matching and fitting for predicted chain models. The method was carefully evaluated on 27 cryo-electron tomography (cryo-ET) maps and 16 single-particle EM maps, where DEMO-EM2 models achieved an average TM-score of 0.92, outperforming those of state-of-the-art methods. The results demonstrate an efficient method that enables the rapid and reliable solution of challenging cryo-EM structure mod...
Source: Briefings in Bioinformatics - March 22, 2024 Category: Bioinformatics Authors: Ziying Zhang Yaxian Cai Biao Zhang Wei Zheng Lydia Freddolino Guijun Zhang Xiaogen Zhou Source Type: research
