ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity
The main objective of ViSEAGO package is to carry out a data mining of biological functions and establish links between genes involved in the study. We developed ViSEAGO in R to facilitate functional Gene Ontolog... (Source: BioData Mining)
Source: BioData Mining - August 6, 2019 Category: Bioinformatics Authors: Aurélien Brionne, Amélie Juanchich and Christelle Hennequet-Antier Tags: Short report Source Type: research

Disease associations depend on visit type: results from a visit-wide association study
Widespread adoption of Electronic Health Records (EHR) increased the number of reported disease association studies, or Phenome-Wide Association Studies (PheWAS). Traditional PheWAS studies ignore visit type (i.e... (Source: BioData Mining)
Source: BioData Mining - July 11, 2019 Category: Bioinformatics Authors: Mary Regina Boland, Snigdha Alur-Gupta, Lisa Levine, Peter Gabriel and Graciela Gonzalez-Hernandez Tags: Short report Source Type: research

Exploration of a diversity of computational and statistical measures of association for genome-wide genetic studies
The principal line of investigation in Genome Wide Association Studies (GWAS) is the identification of main effects, that is individual Single Nucleotide Polymorphisms (SNPs) which are associated with the trai... (Source: BioData Mining)
Source: BioData Mining - July 9, 2019 Category: Bioinformatics Authors: Elisabetta Manduchi, Patryk R. Orzechowski, Marylyn D. Ritchie and Jason H. Moore Tags: Research Source Type: research

On the utilization of deep and ensemble learning to detect milk adulteration
Fraudulent milk adulteration is a dangerous practice in the dairy industry that is harmful to consumers since milk is one of the most consumed food products. Milk quality can be assessed by Fourier Transformed... (Source: BioData Mining)
Source: BioData Mining - July 8, 2019 Category: Bioinformatics Authors: Habib Asseiss Neto, Wanessa L.F. Tavares, Daniela C.S.Z. Ribeiro, Ronnie C.O. Alves, Leorges M. Fonseca and Sérgio V.A. Campos Tags: Methodology Source Type: research

ClickGene: an open cloud-based platform for big pan-cancer data genome-wide association study, visualization and exploration
Tremendous amount of whole-genome sequencing data have been provided by large consortium projects such as TCGA (The Cancer Genome Atlas), COSMIC and so on, which creates incredible opportunities for functional... (Source: BioData Mining)
Source: BioData Mining - June 26, 2019 Category: Bioinformatics Authors: Jia-Hao Bi, Yi-Fan Tong, Zhe-Wei Qiu, Xing-Feng Yang, John Minna, Adi F. Gazdar and Kai Song Tags: Software article Source Type: research

Confounding of linkage disequilibrium patterns in large scale DNA based gene-gene interaction studies
In Genome-Wide Association Studies (GWAS), the concept of linkage disequilibrium is important as it allows identifying genetic markers that tag the actual causal variants. In Genome-Wide Association Interactio... (Source: BioData Mining)
Source: BioData Mining - June 10, 2019 Category: Bioinformatics Authors: Marc Joiret, Jestinah M. Mahachie John, Elena S. Gusareva and Kristel Van Steen Tags: Research Source Type: research

Innovative strategies for annotating the “relationSNP” between variants and molecular phenotypes
Characterizing how variation at the level of individual nucleotides contributes to traits and diseases has been an area of growing interest since the completion of sequencing the first human genome. Our unders... (Source: BioData Mining)
Source: BioData Mining - May 14, 2019 Category: Bioinformatics Authors: Jason E. Miller, Yogasudha Veturi and Marylyn D. Ritchie Tags: Review Source Type: research

Within-sample co-methylation patterns in normal tissues
DNA methylation is an epigenetic event that may regulate gene expression. Because of this regulation role, aberrant DNA methylation is often associated with many diseases. Within-sample DNA co-methylation is t... (Source: BioData Mining)
Source: BioData Mining - May 9, 2019 Category: Bioinformatics Authors: Lillian Sun and Shuying Sun Tags: Research Source Type: research

Characterizing human genomic coevolution in locus-gene regulatory interactions
Coevolution has been used to identify and predict interactions and functional relationships between proteins of many different organisms including humans. Current efforts in annotating the human genome increas... (Source: BioData Mining)
Source: BioData Mining - March 15, 2019 Category: Bioinformatics Authors: Daniel Savel and Mehmet Koyutürk Tags: Research Source Type: research

Encodings and models for antimicrobial peptide classification for multi-resistant pathogens
Antimicrobial peptides (AMPs) are part of the inherent immune system. In fact, they occur in almost all organisms including, e.g., plants, animals, and humans. Remarkably, they show effectivity also against mu... (Source: BioData Mining)
Source: BioData Mining - March 4, 2019 Category: Bioinformatics Authors: Sebastian Spänig and Dominik Heider Tags: Review Source Type: research

Testing the assumptions of parametric linear models: the need for biological data mining in disciplines such as human genetics
(Source: BioData Mining)
Source: BioData Mining - February 11, 2019 Category: Bioinformatics Authors: Jason H. Moore, Trudy F. C. Mackay and Scott M. Williams Tags: Editorial Source Type: research

Approximate kernel reconstruction for time-varying networks
Most existing algorithms for modeling and analyzing molecular networks assume a static or time-invariant network topology. Such view, however, does not render the temporal evolution of the underlying biologica... (Source: BioData Mining)
Source: BioData Mining - February 6, 2019 Category: Bioinformatics Authors: Gregory Ditzler, Nidhal Bouaynaya, Roman Shterenberg and Hassan M. Fathallah-Shaykh Tags: Research Source Type: research

A biplot correlation range for group-wise metabolite selection in mass spectrometry
Analytic methods are available to acquire extensive metabolic information in a cost-effective manner for personalized medicine, yet disease risk and diagnosis mostly rely upon individual biomarkers based on st... (Source: BioData Mining)
Source: BioData Mining - February 4, 2019 Category: Bioinformatics Authors: Youngja H Park, Taewoon Kong, James R. Roede, Dean P. Jones and Kichun Lee Tags: Research Source Type: research

Predicting opioid dependence from electronic health records with machine learning
The opioid epidemic in the United States is averaging over 100 deaths per day due to overdose. The effectiveness of opioids as pain treatments, and the drug-seeking behavior of opioid addicts, leads physicians... (Source: BioData Mining)
Source: BioData Mining - January 29, 2019 Category: Bioinformatics Authors: Randall J. Ellis, Zichen Wang, Nicholas Genes and Avi Ma’ayan Tags: Research Source Type: research

Use case driven evaluation of open databases for pediatric cancer research
A plethora of Web resources are available offering information on clinical, pre-clinical, genomic and theoretical aspects of cancer, including not only the comprehensive cancer projects as ICGC and TCGA, but a... (Source: BioData Mining)
Source: BioData Mining - January 15, 2019 Category: Bioinformatics Authors: Fleur Jeanquartier, Claire Jean-Quartier and Andreas Holzinger Tags: Research Source Type: research

Application of an interpretable classification model on Early Folding Residues during protein folding
Machine learning strategies are prominent tools for data analysis. Especially in life sciences, they have become increasingly important to handle the growing datasets collected by the scientific community. Mea... (Source: BioData Mining)
Source: BioData Mining - January 5, 2019 Category: Bioinformatics Authors: Sebastian Bittrich, Marika Kaden, Christoph Leberecht, Florian Kaiser, Thomas Villmann and Dirk Labudde Tags: Methodology Source Type: research

Unified Cox model based multifactor dimensionality reduction method for gene-gene interaction analysis of the survival phenotype
One strategy for addressing missing heritability in genome-wide association study is gene-gene interaction analysis, which, unlike a single gene approach, involves high-dimensionality. The multifactor dimensio... (Source: BioData Mining)
Source: BioData Mining - December 14, 2018 Category: Bioinformatics Authors: Seungyeoun Lee, Donghee Son, Yongkang Kim, Wenbao Yu and Taesung Park Tags: Methodology Source Type: research

Distributed retrieval engine for the development of cloud-deployed biological databases
The integration of cloud resources with federated data retrieval has the potential of improving the maintenance, accessibility and performance of specialized databases in the biomedical field. However, such an... (Source: BioData Mining)
Source: BioData Mining - November 12, 2018 Category: Bioinformatics Authors: David Bouzaglo, Israel Chasida and Elishai Ezra Tsur Tags: Software article Source Type: research

Knomics-Biota - a system for exploratory analysis of human gut microbiota data
Metagenomic surveys of human microbiota are becoming increasingly widespread in academic research as well as in food and pharmaceutical industries and clinical context. Intuitive tools for investigating experi... (Source: BioData Mining)
Source: BioData Mining - November 6, 2018 Category: Bioinformatics Authors: Daria Efimova, Alexander Tyakht, Anna Popenko, Anatoly Vasilyev, Ilya Altukhov, Nikita Dovidchenko, Vera Odintsova, Natalya Klimenko, Robert Loshkarev, Maria Pashkova, Anna Elizarova, Viktoriya Voroshilova, Sergei Slavskii, Yury Pekov, Ekaterina Filippova Tags: Software article Source Type: research

A fast forward 3D connection algorithm for mitochondria and synapse segmentations from serial EM images
It is becoming increasingly clear that the quantification of mitochondria and synapses is of great significance to understand the function of biological nervous systems. Electron microscopy (EM), with the nece... (Source: BioData Mining)
Source: BioData Mining - November 5, 2018 Category: Bioinformatics Authors: Weifu Li, Jing Liu, Chi Xiao, Hao Deng, Qiwei Xie and Hua Han Tags: Methodology Source Type: research

Transition-transversion encoding and genetic relationship metric in ReliefF feature selection improves pathway enrichment in GWAS
ReliefF is a nearest-neighbor based feature selection algorithm that efficiently detects variants that are important due to statistical interactions or epistasis. For categorical predictors, like genotypes, th... (Source: BioData Mining)
Source: BioData Mining - November 3, 2018 Category: Bioinformatics Authors: M. Arabnejad, B. A. Dawkins, W. S. Bush, B. C. White, A. R. Harkness and B. A. McKinney Tags: Methodology Source Type: research

Combining DNA methylation and RNA sequencing data of cancer for supervised knowledge extraction
In the Next Generation Sequencing (NGS) era a large amount of biological data is being sequenced, analyzed, and stored in many public databases, whose interoperability is often required to allow an enhanced ac... (Source: BioData Mining)
Source: BioData Mining - October 25, 2018 Category: Bioinformatics Authors: Eleonora Cappelli, Giovanni Felici and Emanuel Weitschek Tags: Methodology Source Type: research

To know the objective is not (necessarily) to know the objective function
(Source: BioData Mining)
Source: BioData Mining - October 4, 2018 Category: Bioinformatics Authors: Moshe Sipper, Ryan J. Urbanowicz and Jason H. Moore Tags: Editorial Source Type: research

Grasping frequent subgraph mining for bioinformatics applications
Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. Subgraph mining techniques focus on the discovery of patterns in graphs that exhibit a specific network struct... (Source: BioData Mining)
Source: BioData Mining - September 3, 2018 Category: Bioinformatics Authors: Aida Mrzic, Pieter Meysman, Wout Bittremieux, Pieter Moris, Boris Cule, Bart Goethals and Kris Laukens Tags: Review Source Type: research

Functional relevance for central cornea thickness-associated genetic variants by using integrative analyses
The genetic architecture underlying central cornea thickness (CCT) is far from understood. Most of the CCT-associated variants are located in the non-coding regions, raising the difficulty of following functio... (Source: BioData Mining)
Source: BioData Mining - August 15, 2018 Category: Bioinformatics Authors: Jing Zhang, Dan Wu, Yiqin Dai and Jianjiang Xu Tags: Research Source Type: research

Evolutionary methods for variable selection in the epidemiological modeling of cardiovascular diseases
This study implements an advanced evo... (Source: BioData Mining)
Source: BioData Mining - August 14, 2018 Category: Bioinformatics Authors: Christina Brester, Jussi Kauhanen, Tomi-Pekka Tuomainen, Sari Voutilainen, Mauno Rönkkö, Kimmo Ronkainen, Eugene Semenkin and Mikko Kolehmainen Tags: Research Source Type: research

Protein folding prediction in the HP model using ions motion optimization with a greedy algorithm
The function of a protein is determined by its native protein structure. Among many protein prediction methods, the Hydrophobic-Polar (HP) model, an ab initio method, simplifies the protein folding prediction ... (Source: BioData Mining)
Source: BioData Mining - August 8, 2018 Category: Bioinformatics Authors: Cheng-Hong Yang, Kuo-Chuan Wu, Yu-Shiun Lin, Li-Yeh Chuang and Hsueh-Wei Chang Tags: Methodology Source Type: research

A multi-objective gene clustering algorithm guided by apriori biological knowledge with intensification and diversification strategies
Biologists aim to understand the genetic background of diseases, metabolic disorders or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information abou... (Source: BioData Mining)
Source: BioData Mining - August 7, 2018 Category: Bioinformatics Authors: Jorge Parraga-Alava, Marcio Dorn and Mario Inostroza-Ponta Tags: Methodology Source Type: research

TRIQ: a new method to evaluate triclusters
Triclustering has shown to be a valuable tool for the analysis of microarray data since its appearance as an improvement of classical clustering and biclustering techniques. The standard for validation of tric... (Source: BioData Mining)
Source: BioData Mining - August 6, 2018 Category: Bioinformatics Authors: David Gutiérrez-Avilés, Raúl Giráldez, Francisco Javier Gil-Cumbreras and Cristina Rubio-Escudero Tags: Research Source Type: research

PathCORE-T: identifying and visualizing globally co-occurring pathways in large transcriptomic compendia
Investigators often interpret genome-wide data by analyzing the expression levels of genes within pathways. While this within-pathway analysis is routine, the products of any one pathway can affect the activit... (Source: BioData Mining)
Source: BioData Mining - July 3, 2018 Category: Bioinformatics Authors: Kathleen M. Chen, Jie Tan, Gregory P. Way, Georgia Doing, Deborah A. Hogan and Casey S. Greene Tags: Software article Source Type: research

Integrative analysis of gene expression and methylation data for breast cancer cell lines
The deadly costs of cancer and necessity for an accurate method of early cancer detection have demanded the identification of genetic and epigenetic factors associated with cancer. DNA methylation, an epigenet... (Source: BioData Mining)
Source: BioData Mining - June 25, 2018 Category: Bioinformatics Authors: Catherine Li, Juyon Lee, Jessica Ding and Shuying Sun Tags: Research Source Type: research

Measuring associations between the microbiota and repeated measures of continuous clinical variables using a lasso-penalized generalized linear mixed model
Human microbiome studies in clinical settings generally focus on distinguishing the microbiota in health from that in disease at a specific point in time. However, microbiome samples may be associated with dis... (Source: BioData Mining)
Source: BioData Mining - June 15, 2018 Category: Bioinformatics Authors: Laura Tipton, Karen T. Cuenco, Laurence Huang, Ruth M. Greenblatt, Eric Kleerup, Frank Sciurba, Steven R. Duncan, Michael P. Donahoe, Alison Morris and Elodie Ghedin Tags: Research Source Type: research

Soft document clustering using a novel graph covering approach
In text mining, document clustering describes the efforts to assign unstructured documents to clusters, which in turn usually refer to topics. Clustering is widely used in science for data retrieval and organi... (Source: BioData Mining)
Source: BioData Mining - June 14, 2018 Category: Bioinformatics Authors: Jens Dörpinghaus, Sebastian Schaaf and Marc Jacobs Tags: Methodology Source Type: research

Characterizing the effects of missing data and evaluating imputation methods for chemical prioritization applications using ToxPi
The Toxicological Priority Index (ToxPi) is a method for prioritization and profiling of chemicals that integrates data from diverse sources. However, individual data sources (“assays”), such as in vitro bioas... (Source: BioData Mining)
Source: BioData Mining - June 13, 2018 Category: Bioinformatics Authors: Kimberly T. To, Rebecca C. Fry and David M. Reif Tags: Research Source Type: research

Feature selection for gene prediction in metagenomic fragments
Computational approaches, specifically machine-learning techniques, play an important role in many metagenomic analysis algorithms, such as gene prediction. Due to the large feature space, current de novo gene... (Source: BioData Mining)
Source: BioData Mining - June 7, 2018 Category: Bioinformatics Authors: Amani Al-Ajlan and Achraf El Allali Tags: Methodology Source Type: research

Connecting genetics and gene expression data for target prioritisation and drug repositioning
Developing new drugs continues to be a highly inefficient and costly business. By repurposing an existing compound for a different indication, drug repositioning offers an attractive alternative to traditional... (Source: BioData Mining)
Source: BioData Mining - May 31, 2018 Category: Bioinformatics Authors: Enrico Ferrero and Pankaj Agarwal Tags: Short report Source Type: research

Gene set analysis methods: a systematic comparison
Gene set analysis is a valuable tool to summarize high-dimensional gene expression data in terms of biologically relevant sets. This is an active area of research and numerous gene set analysis methods have be... (Source: BioData Mining)
Source: BioData Mining - May 31, 2018 Category: Bioinformatics Authors: Ravi Mathur, Daniel Rotroff, Jun Ma, Ali Shojaie and Alison Motsinger-Reif Tags: Research Source Type: research

Collective feature selection to identify crucial epistatic variants
Machine learning methods have gained popularity and practicality in identifying linear and non-linear effects of variants associated with complex disease/traits. Detection of epistatic interactions still remai... (Source: BioData Mining)
Source: BioData Mining - April 19, 2018 Category: Bioinformatics Authors: Shefali S. Verma, Anastasia Lucas, Xinyuan Zhang, Yogasudha Veturi, Scott Dudek, Binglan Li, Ruowang Li, Ryan Urbanowicz, Jason H. Moore, Dokyoon Kim and Marylyn D. Ritchie Tags: Research Source Type: research

Improving machine learning reproducibility in genetic association studies with proportional instance cross validation (PICV)
Machine learning methods and conventions are increasingly employed for the analysis of large, complex biomedical data sets, including genome-wide association studies (GWAS). Reproducibility of machine learning... (Source: BioData Mining)
Source: BioData Mining - April 19, 2018 Category: Bioinformatics Authors: Elizabeth R. Piette and Jason H. Moore Tags: Methodology Source Type: research

Pairwise gene GO-based measures for biclustering of high-dimensional expression data
Biclustering algorithms search for groups of genes that share the same behavior under a subset of samples in gene expression data. Nowadays, the biological knowledge available in public repositories can be use... (Source: BioData Mining)
Source: BioData Mining - March 27, 2018 Category: Bioinformatics Authors: Juan A. Nepomuceno, Alicia Troncoso, Isabel A. Nepomuceno-Chamorro and Jesús S. Aguilar-Ruiz Tags: Research Source Type: research

A novel joint analysis framework improves identification of differentially expressed genes in cross disease transcriptomic analysis
Detecting differentially expressed (DE) genes between disease and normal control group is one of the most common analyses in genome-wide transcriptomic data. Since most studies don’t have a lot of samples, res... (Source: BioData Mining)
Source: BioData Mining - February 20, 2018 Category: Bioinformatics Authors: Wenyi Qin and Hui Lu Tags: Methodology Source Type: research

Investigating the parameter space of evolutionary algorithms
Evolutionary computation (EC) has been widely applied to biological and biomedical data. The practice of EC involves the tuning of many parameters, such as population size, generation count, selection size, an... (Source: BioData Mining)
Source: BioData Mining - February 17, 2018 Category: Bioinformatics Authors: Moshe Sipper, Weixuan Fu, Karuna Ahuja and Jason H. Moore Tags: Research Source Type: research

Identification of influential observations in high-dimensional cancer survival data through the rank product test
Survival analysis is a statistical technique widely used in many fields of science, in particular in the medical area, and which studies the time until an event of interest occurs. Outlier detection in this co... (Source: BioData Mining)
Source: BioData Mining - February 14, 2018 Category: Bioinformatics Authors: Eunice Carrasquinha, André Veríssimo, Marta B. Lopes and Susana Vinga Tags: Research Source Type: research

Scalable non-negative matrix tri-factorization
Matrix factorization is a well established pattern discovery tool that has seen numerous applications in biomedical data analytics, such as gene expression co-clustering, patient stratification, and gene-disea... (Source: BioData Mining)
Source: BioData Mining - December 29, 2017 Category: Bioinformatics Authors: Andrej Čopar, Marinka Žitnik and Blaž Zupan Tags: Research Source Type: research

An automated pipeline for bouton, spine, and synapse detection of in vivo two-photon images
In the nervous system, the neurons communicate through synapses. The size, morphology, and connectivity of these synapses are significant in determining the functional properties of the neural network. Therefo... (Source: BioData Mining)
Source: BioData Mining - December 20, 2017 Category: Bioinformatics Authors: Qiwei Xie, Xi Chen, Hao Deng, Danqian Liu, Yingyu Sun, Xiaojuan Zhou, Yang Yang and Hua Han Tags: Research Source Type: research

TSPmap, a tool making use of traveling salesperson problem solvers in the efficient and accurate construction of high-density genetic linkage maps
Recent advances in nucleic acid sequencing technologies have led to a dramatic increase in the number of markers available to generate genetic linkage maps. This increased marker density can be used to improve... (Source: BioData Mining)
Source: BioData Mining - December 19, 2017 Category: Bioinformatics Authors: J. Grey Monroe, Zachariah A. Allen, Paul Tanger, Jack L. Mullen, John T. Lovell, Brook T. Moyers, Darrell Whitley and John K. McKay Tags: Software article Source Type: research

Sparse generalized linear model with L0 approximation for feature selection and prediction with big omics data
Feature selection and prediction are the most important tasks for big data mining. The common strategies for feature selection in big data mining are L1, SCAD and MC+. However, none of... (Source: BioData Mining)
Source: BioData Mining - December 19, 2017 Category: Bioinformatics Authors: Zhenqiu Liu, Fengzhu Sun and Dermot P. McGovern Tags: Methodology Source Type: research

Cluster ensemble based on Random Forests for genetic data
Clustering plays a crucial role in several application domains, such as bioinformatics. In bioinformatics, clustering has been extensively used as an approach for detecting interesting patterns in genetic data... (Source: BioData Mining)
Source: BioData Mining - December 15, 2017 Category: Bioinformatics Authors: Luluah Alhusain and Alaaeldin M. Hafez Tags: Methodology Source Type: research

PMLB: a large benchmark suite for machine learning evaluation and comparison
The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world ... (Source: BioData Mining)
Source: BioData Mining - December 11, 2017 Category: Bioinformatics Authors: Randal S. Olson, William La Cava, Patryk Orzechowski, Ryan J. Urbanowicz and Jason H. Moore Tags: Research Source Type: research

Ten quick tips for machine learning in computational biology
Machine learning has become a pivotal tool for many projects in computational biology, bioinformatics, and health informatics. Nevertheless, beginners and biomedical researchers often do not have enough experi... (Source: BioData Mining)
Source: BioData Mining - December 8, 2017 Category: Bioinformatics Authors: Davide Chicco Tags: Review Source Type: research