The IMEx coronavirus interactome: an evolving map of Coronaviridae–host molecular interactions
We present a curated dataset of physical molecular interactions focused on proteins from SARS-CoV-2, SARS-CoV-1 and other members of the Coronaviridae family that has been manually extracted by International Molecular Exchange (IMEx) Consortium curators. Currently, the dataset comprises over 4400 binarized interactions extracted from 151 publications. The dataset can be accessed in the standard formats recommended by the Proteomics Standards Initiative (HUPO-PSI) at the IntAct database website (https://www.ebi.ac.uk/intact) and will be continuously updated as research on COVID-19 progresses.
Source: Database : The Journal of Biological Databases and Curation - November 18, 2020 Category: Databases & Libraries Source Type: research

OCCAM: prediction of small ORFs in bacterial genomes by means of a target-decoy database approach and machine learning techniques
Abstract: Small open reading frames (ORFs) have been systematically disregarded by automatic genome annotation. The difficulty in finding patterns in tiny sequences is the main reason that makes small ORFs to be overlooked by computational procedures. However, advances in experimental methods show that small proteins can play vital roles in cellular activities. Hence, it is urgent to make progress in the development of computational approaches to speed up the identification of potential small ORFs. In this work, our focus is on bacterial genomes. We improve a previous approach to identify small ORFs in bacteria. Our method u...
Source: Database : The Journal of Biological Databases and Curation - November 18, 2020 Category: Databases & Libraries Source Type: research

TopoDB: a novel multifunctional management system for laboratory animal colonies
Abstract: Animal models are widely employed in basic research to test mechanistic hypotheses in a complex biological environment as well as to evaluate the therapeutic potential of candidate compounds in preclinical settings. Rodents, and in particular mice, represent the most common in vivo models for their small size, short lifespan and possibility to manipulate their genome. Over time, a typical laboratory will develop a substantial number of inbred strains and transgenic mouse lines, requiring a substantial effort, in both logistic and economic terms, to maintain an animal colony for research purposes and to safeguard the...
Source: Database : The Journal of Biological Databases and Curation - November 18, 2020 Category: Databases & Libraries Source Type: research

A content-based dataset recommendation system for researchers: a case study on Gene Expression Omnibus (GEO) repository
Abstract: It is a growing trend among researchers to make their data publicly available for experimental reproducibility and data reusability. Sharing data with fellow researchers helps in increasing the visibility of the work. On the other hand, there are researchers who are inhibited by the lack of data resources. To overcome this challenge, many repositories and knowledge bases have been established to date to ease data sharing. Further, in the past two decades, there has been an exponential increase in the number of datasets added to these dataset repositories. However, most of these repositories are domain-specific, and...
Source: Database : The Journal of Biological Databases and Curation - November 12, 2020 Category: Databases & Libraries Source Type: research

Color Data v2: a user-friendly, open-access database with hereditary cancer and hereditary cardiovascular conditions datasets
Abstract: Publicly available genetic databases promote data sharing and fuel scientific discoveries for the prevention, treatment and management of disease. In 2018, we built Color Data, a user-friendly, open access database containing genotypic and self-reported phenotypic information from 50 000 individuals who were sequenced for 30 genes associated with hereditary cancer. In a continued effort to promote access to these types of data, we launched Color Data v2, an updated version of the Color Data database. This new release includes additional clinical genetic testing results from more than 18 000 individuals who we...
Source: Database : The Journal of Biological Databases and Curation - November 11, 2020 Category: Databases & Libraries Source Type: research

A Collection of Benchmark Data Sets for Knowledge Graph-based Similarity in the Biomedical Domain
Abstract: The ability to compare entities within a knowledge graph is a cornerstone technique for several applications, ranging from the integration of heterogeneous data to machine learning. It is of particular importance in the biomedical domain, where semantic similarity can be applied to the prediction of protein–protein interactions, associations between diseases and genes, cellular localization of proteins, among others. In recent years, several knowledge graph-based semantic similarity measures have been developed, but building a gold standard data set to support their evaluation is non-trivial. We present a collec...
Source: Database : The Journal of Biological Databases and Curation - November 11, 2020 Category: Databases & Libraries Source Type: research

A checklist recipe: making species data open and FAIR
Abstract: Species checklists are a crucial source of information for research and policy. Unfortunately, many traditional species checklists vary wildly in their content, format, availability and maintenance. The fact that these are not open, findable, accessible, interoperable and reusable (FAIR) severely hampers fast and efficient information flow to policy and decision-making that are required to tackle the current biodiversity crisis. Here, we propose a reproducible, semi-automated workflow to transform traditional checklist data into a FAIR and open species registry. We showcase our workflow by applying it to the public...
Source: Database : The Journal of Biological Databases and Curation - November 11, 2020 Category: Databases & Libraries Source Type: research

CircR2Cancer: a manually curated database of associations between circRNAs and cancers
Abstract: Accumulating evidences have shown that the deregulation of circRNA has close association with many human cancers. However, these experimental verified circRNA–cancer associations are not collected in any database. Here, we develop a manually curated database (circR2Cancer) that provides experimentally supported associations between circRNAs and cancers. The current version of the circR2Cancer contains 1439 associations between 1135 circRNAs and 82 cancers by extracting data from existing literatures and databases. In addition, circR2Cancer contains the information of cancer exacted from Disease Ontology and bas...
Source: Database : The Journal of Biological Databases and Curation - November 11, 2020 Category: Databases & Libraries Source Type: research

CitrusKB: a comprehensive knowledge base for transcriptome and interactome of Citrus spp. infected by Xanthomonas citri subsp. citri at different infection stages
Abstract: Citrus canker type A is a serious disease caused by Xanthomonas citri subsp. citri (X. citri), which is responsible for severe losses to growers and to the citrus industry worldwide. To date, no canker-resistant citrus genotypes are available, and there is limited information regarding the molecular and genetic mechanisms involved in the early stages of the citrus canker development. Here, we present the CitrusKB knowledge base. This is the first in vivo interactome database for different citrus cultivars, and it was produced to provide a valuable resource of information on citrus and their interaction with the citrus...
Source: Database : The Journal of Biological Databases and Curation - November 11, 2020 Category: Databases & Libraries Source Type: research

WGVD: an integrated web-database for wheat genome variation and selective signatures
Abstract: Bread wheat is one of the most important crops worldwide. With the release of the complete wheat reference genome and the development of next-generation sequencing technology, a mass of genomic data from bread wheat and its progenitors has been yield and has provided genomic resources for wheat genetics research. To conveniently and effectively access and use these data, we established Wheat Genome Variation Database, an integrated web-database including genomic variations from whole-genome resequencing and exome-capture data for bread wheat and its progenitors, as well as selective signatures during the process of...
Source: Database : The Journal of Biological Databases and Curation - November 11, 2020 Category: Databases & Libraries Source Type: research

YQFC: a web tool to compare quantitative biological features between two yeast gene lists
Abstract: Nowadays high-throughput omics technologies are routinely used in biological research. From the omics data, researchers can easily get two gene lists (e.g. stress-induced genes vs. stress-repressed genes) related to their biological question. The next step would be to apply enrichment analysis tools to identify distinct functional/regulatory features between these two gene lists for further investigation. Although various enrichment analysis tools are already available, two challenges remain to be addressed. First, most existing tools are designed to analyze only one gene list, so they cannot directly compare two g...
Source: Database : The Journal of Biological Databases and Curation - November 11, 2020 Category: Databases & Libraries Source Type: research

Exploring functionally annotated transcriptional consensus regulatory elements with CONREL
Abstract: Understanding the interaction between human genome regulatory elements and transcription factors is fundamental to elucidate the structure of gene regulatory networks. Here we present CONREL, a web application that allows for the exploration of functionally annotated transcriptional ‘consensus’ regulatory elements at different levels of abstraction. CONREL provides an extensive collection of consensus promoters, enhancers and active enhancers for 198 cell-lines across 38 tissue types, which are also combined to provide global consensuses. In addition, 1000 Genomes Project genotype data and the ‘total binding...
Source: Database : The Journal of Biological Databases and Curation - November 9, 2020 Category: Databases & Libraries Source Type: research

LAMP2: a major update of the database linking antimicrobial peptides
Abstract: Antimicrobial peptides (AMPs) have been regarded as a potential weapon to fight against drug-resistant bacteria, which is threating the globe. Thus, more and more AMPs had been designed or identified. There is a need to integrate them into a platform for researchers to facilitate investigation and analyze existing AMPs. The AMP database has become an important tool for the discovery and transformation of AMPs as agents. A database linking antimicrobial peptides (LAMPs), launched in 2013, serves as a comprehensive tool to supply exhaustive information of AMP on a single platform. LAMP2, an updated version of LAMP, h...
Source: Database : The Journal of Biological Databases and Curation - August 25, 2020 Category: Databases & Libraries Source Type: research

FAIR digital objects in environmental and life sciences should comprise workflow operation design data and method information for repeatability of study setups and reproducibility of results
Abstract: Repeatability of study setups and reproducibility of research results by underlying data are major requirements in science. Until now, abstract models for describing the structural logic of studies in environmental sciences are lacking and tools for data management are insufficient. Mandatory for repeatability and reproducibility is the use of sophisticated data management solutions going beyond data file sharing. Particularly, it implies maintenance of coherent data along workflows. Design data concern elements from elementary domains of operations being transformation, measurement and transaction. Operation desig...
Source: Database : The Journal of Biological Databases and Curation - August 20, 2020 Category: Databases & Libraries Source Type: research

FLUTE: Fast and reliable knowledge retrieval from biomedical literature
Abstract: State-of-the-art machine reading methods extract, in hours, hundreds of thousands of events from the biomedical literature. However, many of the extracted biomolecular interactions are incorrect or not relevant for computational modeling of a system of interest. Therefore, rapid, automated methods are required to filter and select accurate and useful information. The FiLter for Understanding True Events (FLUTE) tool uses public protein interaction databases to filter interactions that have been extracted by machines from databases such as PubMed and score them for accuracy. Confidence in the interactions allows for...
Source: Database : The Journal of Biological Databases and Curation - August 6, 2020 Category: Databases & Libraries Source Type: research