The discordant method: a novel approach for differential correlation
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Siska, C., Bowler, R., Kechris, K. Tags: CORRIGENDUM Source Type: research

OntoBrowser: a collaborative tool for curation of ontologies by subject matter experts
Summary: The lack of controlled terminology and ontology usage leads to incomplete search results and poor interoperability between databases. One of the major underlying challenges of data integration is curating data to adhere to controlled terminologies and/or ontologies. Finding subject matter experts with the time and skills required to perform data curation is often problematic. In addition, existing tools are not designed for continuous data integration and collaborative curation. This results in time-consuming curation workflows that often become unsustainable. The primary objective of OntoBrowser is to provide an ...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Ravagli, C., Pognan, F., Marc, P. Tags: DATABASES AND ONTOLOGIES Source Type: research

Precrec: fast and accurate precision-recall and ROC curve calculations in R
Summary: The precision–recall plot is more informative than the ROC plot when evaluating classifiers on imbalanced datasets, but fast and accurate curve calculation tools for precision–recall plots are currently not available. We have developed Precrec, an R library that aims to overcome this limitation of the plot. Our tool provides fast and accurate precision–recall calculations together with multiple functionalities that work efficiently under different conditions. Availability and Implementation: Precrec is licensed under GPL-3 and freely available from CRAN (https://cran.r-project.org/package=precrec...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Saito, T., Rehmsmeier, M. Tags: DATA AND TEXT MINING Source Type: research
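
As a rough illustration of what a precision-recall calculation involves, the sketch below sweeps score thresholds over a small labelled set in Python. It is not Precrec's algorithm or API (Precrec is an R package); the function name `precision_recall_points` and the toy data are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(
    scores: Sequence[float], labels: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, one per distinct score threshold.

    Hypothetical helper for illustration; labels are 1 (positive) or 0 (negative).
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank observations from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all observations tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


# Toy data (assumed): four scored items, two of them positive.
points = precision_recall_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
for recall, precision in points:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

A plain threshold sweep like this yields only the observed points; how a dedicated tool connects or interpolates between them is outside the scope of this sketch.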

AKT: ancestry and kinship toolkit
Motivation: Ancestry and Kinship Toolkit (AKT) is a statistical genetics tool for analysing large cohorts of whole-genome sequenced samples. It can rapidly detect related samples, characterize sample ancestry, calculate correlation between variants, check Mendel consistency and perform data clustering. AKT brings together the functionality of many state-of-the-art methods, with a focus on speed and a unified interface. We believe it will be an invaluable tool for the curation of large WGS datasets. Availability and Implementation: The source code is available at https://illumina.github.io/akt. Contacts: joconnell@illumina....
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Arthur, R., Schulz-Trieglaff, O., Cox, A. J., O'Connell, J. Tags: GENETICS AND POPULATION ANALYSIS Source Type: research
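
To make the "Mendel consistency" check mentioned in the AKT abstract concrete, here is a minimal Python sketch for a single biallelic site in a parent-offspring trio. It is not AKT's code or interface; the function name `mendel_consistent` and the 0/1/2 alternate-allele-count genotype encoding are assumptions chosen for this example.

```python
from typing import Set


def transmissible_alleles(genotype: int) -> Set[int]:
    """Alleles (0 = reference, 1 = alternate) a parent can pass on.

    Genotype is the alternate-allele count: 0 (hom-ref), 1 (het), 2 (hom-alt).
    """
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def mendel_consistent(father: int, mother: int, child: int) -> bool:
    """True if the child's genotype can arise from one allele per parent."""
    if child not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    possible_children = {
        a + b
        for a in transmissible_alleles(father)
        for b in transmissible_alleles(mother)
    }
    return child in possible_children


# Two hom-ref parents cannot have a heterozygous child under Mendelian
# inheritance (ignoring de novo mutation and genotyping error).
print(mendel_consistent(0, 0, 1))
print(mendel_consistent(0, 2, 1))
```

Real cohort tools also have to handle missing genotypes, sex chromosomes and multiallelic sites, none of which this sketch attempts.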

RiboDiff: detecting changes of mRNA translation efficiency from ribosome footprints
We present a statistical framework and an analysis tool, RiboDiff, to detect genes with changes in translation efficiency across experimental treatments. RiboDiff uses generalized linear models to estimate the over-dispersion of RNA-Seq and ribosome profiling measurements separately, and performs a statistical test for differential translation efficiency using both mRNA abundance and ribosome occupancy. Availability and Implementation: RiboDiff webpage http://bioweb.me/ribodiff. Source code, including scripts for preprocessing the FASTQ data, is available at http://github.com/ratschlab/ribodiff. Contacts: zhongy@cbio.mskcc....
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Zhong, Y., Karaletsos, T., Drewe, P., Sreedharan, V. T., Kuo, D., Singh, K., Wendel, H.-G., Rätsch, G. Tags: GENE EXPRESSION Source Type: research
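
For readers unfamiliar with the quantity being tested, translation efficiency is commonly taken as ribosome occupancy relative to mRNA abundance. The Python sketch below computes a naive log2 change in that ratio between two conditions. It deliberately does not reproduce RiboDiff's generalized linear model or dispersion estimation; the function name `log2_te_change`, the pseudocount, and the assumption that counts are already normalized for library size are all illustrative choices.

```python
import math


def log2_te_change(
    ribo_control: float,
    rna_control: float,
    ribo_treated: float,
    rna_treated: float,
    pseudocount: float = 1.0,
) -> float:
    """Naive log2 fold change in translation efficiency (TE) for one gene.

    TE is approximated as ribosome footprint count / mRNA count.
    Inputs are assumed to be library-size-normalized counts (an assumption
    of this sketch). The pseudocount guards against division by zero.
    """
    te_control = (ribo_control + pseudocount) / (rna_control + pseudocount)
    te_treated = (ribo_treated + pseudocount) / (rna_treated + pseudocount)
    return math.log2(te_treated / te_control)


# Hypothetical gene: footprints double while mRNA stays constant, so TE
# roughly doubles and the log2 change is close to 1.
print(round(log2_te_change(100, 200, 200, 200), 3))
```

A point estimate like this carries no measure of significance, which is the gap that a count-based statistical test is meant to fill.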

MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments
Motivation: With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. Results: We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequenc...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Georgakopoulos-Soares, I., Jain, N., Gray, J. M., Hemberg, M. Tags: GENE EXPRESSION Source Type: research

DAPAR & ProStaR: software to perform statistical analyses in quantitative discovery proteomics
Summary: DAPAR and ProStaR are software tools to perform the statistical analysis of label-free XIC-based quantitative discovery proteomics experiments. DAPAR contains procedures to filter, normalize, impute missing values, aggregate peptide intensities, perform null hypothesis significance tests and select the most likely differentially abundant proteins with a corresponding false discovery rate. ProStaR is a graphical user interface that allows friendly access to the DAPAR functionalities through a web browser. Availability and implementation: DAPAR and ProStaR are implemented in the R language and are available on the we...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Wieczorek, S., Combes, F., Lazar, C., Giai Gianetto, Q., Gatto, L., Dorffer, A., Hesse, A.-M., Coute, Y., Ferro, M., Bruley, C., Burger, T. Tags: GENE EXPRESSION Source Type: research

FATSLiM: a fast and robust software to analyze MD simulations of membranes
Summary: When studying biological membranes, Molecular Dynamics (MD) simulations prove to be quite complementary to experimental techniques. Because the simulated systems keep increasing in both size and complexity, the analysis of MD trajectories needs to be computationally efficient while being robust enough to handle membranes that may be curved or deformed due to their size and/or protein-lipid interactions. This work presents new software named FATSLiM (‘Fast Analysis Toolbox for Simulations of Lipid Membranes’) that can extract physical properties from MD simulations of membranes (with or w...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Buchoux, S. Tags: STRUCTURAL BIOINFORMATICS Source Type: research

Verdant: automated annotation, alignment and phylogenetic analysis of whole chloroplast genomes
Motivation: Chloroplast genomes are now produced in the hundreds for angiosperm phylogenetics projects, but current methods for annotation, alignment and tree estimation still require some manual intervention, reducing throughput and increasing analysis time for large chloroplast systematics projects. Results: Verdant is a web-based software suite and database built to take advantage of a novel annotation program, annoBTD. Using annoBTD, Verdant provides accurate annotation of chloroplast genomes without manual intervention. Subsequent alignment and tree estimation can incorporate newly annotated and publicly available plast...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: McKain, M. R., Hartsock, R. H., Wohl, M. M., Kellogg, E. A. Tags: PHYLOGENETICS Source Type: research

PHYLOViZ 2.0: providing scalable data integration and visualization for multiple phylogenetic inference methods
Summary: High Throughput Sequencing provides a cost-effective means of generating high-resolution data for hundreds or even thousands of strains, and is rapidly superseding methodologies based on a few genomic loci. The wealth of genomic data deposited in public databases such as Sequence Read Archive/European Nucleotide Archive provides a powerful resource for evolutionary analysis and epidemiological surveillance. However, many of the analysis tools currently available do not scale well to these large datasets, nor provide the means to fully integrate ancillary data. Here we present PHYLOViZ 2.0, an extension of PHYLOViZ...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Nascimento, M., Sousa, A., Ramirez, M., Francisco, A. P., Carrico, J. A., Vaz, C. Tags: PHYLOGENETICS Source Type: research

The OGCleaner: filtering false-positive homology clusters
We present the Orthology Group Cleaner (the OGCleaner), a tool designed for filtering putative orthology groups as homology or non-homology clusters by considering all sequences in a cluster. The OGCleaner relies on high-quality orthologous groups identified in OrthoDB to train machine learning algorithms that are able to distinguish between true-positive and false-positive homology groups. This package aims to improve the quality of phylogenetic tree construction especially in instances of lower-quality transcriptome assemblies. Availability and Implementation: https://github.com/byucsl/ogcleaner Contact: sfujimoto@gmail....
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Fujimoto, M. S., Suvorov, A., Jensen, N. O., Clement, M. J., Snell, Q., Bybee, S. M. Tags: PHYLOGENETICS Source Type: research

PseKRAAC: a flexible web server for generating pseudo K-tuple reduced amino acids composition
Summary: Reduced amino acid alphabets are powerful for both simplifying protein complexity and identifying functionally conserved regions. However, different protein problems may need different kinds of clustering methods. Encouraged by the success of the pseudo-amino acid composition algorithm, we developed a freely available web server called PseKRAAC (the pseudo K-tuple reduced amino acids composition). By implementing reduced amino acid alphabets, the protein complexity can be significantly simplified, which decreases the chance of overfitting, lowers the computational burden and reduces information redunda...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Zuo, Y., Li, Y., Chen, Y., Li, G., Yan, Z., Yang, L. Tags: SEQUENCE ANALYSIS Source Type: research

stringMLST: a fast k-mer based tool for multilocus sequence typing
Rapid and accurate identification of the sequence type (ST) of bacterial pathogens is critical for epidemiological surveillance and outbreak control. Cheaper and faster next-generation sequencing (NGS) technologies have taken precedence over the traditional method of amplicon sequencing for multilocus sequence typing (MLST). However, data generated by NGS platforms necessitate quality control, genome assembly and sequence similarity searching before an isolate’s ST can be determined. These are computationally intensive and time-consuming steps, which are not ideally suited for real-time molecular epidemiology. Here, we pr...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Gupta, A., Jordan, I. K., Rishishwar, L. Tags: GENOME ANALYSIS Source Type: research

LncVar: a database of genetic variation associated with long non-coding genes
Motivation: Long non-coding RNAs (lncRNAs) are essential in many molecular pathways and are frequently associated with disease, but the mechanisms of most lncRNAs have not yet been characterized. Genetic variations, including single nucleotide polymorphisms (SNPs) and structural variations, are widely distributed in the genome, including lncRNA gene regions. As the number of studies on lncRNAs grows rapidly, it is necessary to evaluate the effects of genetic variations on lncRNAs. Results: Here, we present LncVar, a database of genetic variation associated with long non-coding genes in six species. We collected lncRNAs fro...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Chen, X., Hao, Y., Cui, Y., Fan, Z., He, S., Luo, J., Chen, R. Tags: DATABASES AND ONTOLOGIES Source Type: research

Assessment of cancer and virus antigens for cross-reactivity in human tissues
Motivation: Cross-reactivity (CR) or invocation of autoimmune side effects in various tissues has important safety implications in adoptive immunotherapy directed against selected antigens. The ability to predict CR (on-target and off-target toxicities) may help in the early selection of safer therapeutically relevant target antigens. Results: We developed a methodology for the calculation of quantitative CR for any defined peptide epitope. Using this approach, we performed assessment of 4 groups of 283 currently known human MHC-class-I epitopes including differentiation antigens, overexpressed proteins, cancer-testis anti...
Source: Bioinformatics - December 28, 2016 Category: Bioinformatics Authors: Jaravine, V., Raffegerst, S., Schendel, D. J., Frishman, D. Tags: SYSTEMS BIOLOGY Source Type: research