Which methods are the most effective in enabling novice users to participate in ontology creation? A usability study
Abstract: Producing findable, accessible, interoperable and reusable (FAIR) data cannot be accomplished solely by data curators in all disciplines. In biology, we have shown that phenotypic data curation is not only costly, but it is burdened with inter-curator variation. We intend to propose a software platform that would enable all data producers, including authors of scientific publications, to produce ontologized data at the time of publication. Working toward this goal, we need to identify ontology construction methods that are preferred by end users. Here, we employ two usability studies to evaluate effectiveness, effi...
Source: Database : The Journal of Biological Databases and Curation - June 22, 2021 Category: Databases & Libraries Source Type: research

MetamORF: a repository of unique short open reading frames identified by both experimental and computational approaches for gene and metagene analyses
Abstract: The development of high-throughput technologies revealed the existence of non-canonical short open reading frames (sORFs) on most eukaryotic ribonucleic acids. They are ubiquitous genetic elements conserved across species and suspected to be involved in numerous cellular processes. MetamORF (https://metamorf.hb.univ-amu.fr/) aims to provide a repository of unique sORFs identified in the human and mouse genomes with both experimental and computational approaches. By gathering publicly available sORF data, normalizing them and summarizing redundant information, we were able to identify a total of 1 162 675 uniqu...
Source: Database : The Journal of Biological Databases and Curation - June 22, 2021 Category: Databases & Libraries Source Type: research

mPPI: a database extension to visualize structural interactome in a one-to-many manner
Abstract: Protein–protein interaction (PPI) databases with structural information are useful to investigate biological functions at both systematic and atomic levels. However, most existing PPI databases only curate binary interactome. From the perspective of the display and function of PPI, as well as the structural binding interface, the related database and resources are summarized. We developed a database extension, named mPPI, for PPI structural visualization. Compared with the existing structural interactomes that curate resolved PPI conformation in pairs, mPPI can visualize target protein and its multiple intera...
Source: Database : The Journal of Biological Databases and Curation - June 22, 2021 Category: Databases & Libraries Source Type: research

ReMeDy: a platform for integrating and sharing published stem cell research data with a focus on iPSC trials
Abstract: Recent regenerative medicine studies have emphasized the need for increased standardization, harmonization and sharing of information related to stem cell product characterization, to help drive these innovative interventions toward public availability and to increase collaboration in the scientific community. Although numerous attempts and numerous databases have been made to manage these data, a platform that incorporates all the heterogeneous data collected from stem cell projects into a harmonized project-based framework is still lacking. The aim of the database, which is described in this study, is to ...
Source: Database : The Journal of Biological Databases and Curation - June 22, 2021 Category: Databases & Libraries Source Type: research

dbGENVOC: database of GENomic Variants of Oral Cancer, with special reference to India
Abstract: Oral cancer is highly prevalent in India and is the most frequent cancer type among Indian males. It is also very common in southeast Asia. India has participated in the International Cancer Genome Consortium (ICGC) and some national initiatives to generate large-scale genomic data on oral cancer patients and analyze them to identify associations and systematically catalog the associated variants. We have now created an open, web-accessible database of these variants found significantly associated with Indian oral cancer patients, with a user-friendly interface to enable easy mining. We have value added to this database...
Source: Database : The Journal of Biological Databases and Curation - May 28, 2021 Category: Databases & Libraries Source Type: research

BENviewer: a gene interaction network visualization server based on graph embedding model
Abstract: BENviewer is a brand-new online gene interaction network visualization server based on graph embedding models. With mature graph embedding algorithms applied on several interaction network databases, it provides human-friendly 2D visualization based on more than 2000 biological pathways, and these results present not only genes involved but also tightness of interactions in an intuitive way. As a unique visualization server introducing graph embedding application for the first time, it is expected to help researchers gain deeper insights into biological networks beyond generating results explainable by existing kno...
Source: Database : The Journal of Biological Databases and Curation - May 28, 2021 Category: Databases & Libraries Source Type: research

emiRIT: a text-mining-based resource for microRNA information
Abstract: microRNAs (miRNAs) are essential gene regulators, and their dysregulation often leads to diseases. Easy access to miRNA information is crucial for interpreting generated experimental data, connecting facts across publications and developing new hypotheses built on previous knowledge. Here, we present extracting miRNA Information from Text (emiRIT), a text-mining-based resource, which presents miRNA information mined from the literature through a user-friendly interface. We collected 149,233 miRNA–PubMed ID pairs from Medline between January 1997 and May 2020. emiRIT currently contains ‘miRNA–gene regulat...
Source: Database : The Journal of Biological Databases and Curation - May 28, 2021 Category: Databases & Libraries Source Type: research

Integration of 1:1 orthology maps and updated datasets into Echinobase
Abstract: Echinobase (https://echinobase.org) is a central online platform that generates, manages and hosts genomic data relevant to echinoderm research. While the resource primarily serves the echinoderm research community, the recent release of an excellent quality genome for the frequently studied purple sea urchin (Strongylocentrotus purpuratus genome, v5.0) has provided an opportunity to adapt to the needs of a broader research community across other model systems. To this end, establishing pipelines to identify orthologous genes between echinoderms and other species has become a priority in many contexts including nom...
Source: Database : The Journal of Biological Databases and Curation - May 19, 2021 Category: Databases & Libraries Source Type: research

An overview of graph databases and their applications in the biomedical domain
Abstract: Over the past couple of decades, the explosion of densely interconnected data has stimulated the research, development and adoption of graph database technologies. From early graph models to more recent native graph databases, the landscape of implementations has evolved to cover enterprise-ready requirements. Because of the interconnected nature of its data, the biomedical domain has been one of the early adopters of graph databases, enabling more natural representation models and better data integration workflows, exploration and analysis facilities. In this work, we survey the literature to explore the evolution...
Source: Database : The Journal of Biological Databases and Curation - May 18, 2021 Category: Databases & Libraries Source Type: research

Application of beta and gamma carbonic anhydrase sequences as tools for identification of bacterial contamination in the whole genome sequence of inbred Wuzhishan minipig (Sus scrofa) annotated in databases
In this study, we used bioinformatics methods and web tools such as UniProt, European Bioinformatics Institute, National Center for Biotechnology Information, Ensembl Genome Browser, Ensembl Bacteria, RCSB PDB and Pseudomonas Genome Database. Our analysis defined that pig has 12 classical α-CAs and 3 CA-related proteins. Meanwhile, it was confirmed that the detected CAs in WZSP are categorized in the β- and γ-CA families, which belong to Pseudomonas spp. and Acinetobacter spp. The protein structure study revealed that the identified β-CA sequence from WZSP belongs to Pseudomonas aeruginosa with PDB ID: 5JJ8, and the identif...
Source: Database : The Journal of Biological Databases and Curation - May 18, 2021 Category: Databases & Libraries Source Type: research

covid19census: U.S. and Italy COVID-19 metrics and other epidemiological data
In conclusion, it was observed that the ‘covid19census’ package makes it easy to extract area-level data from both the USA and Italy using a few functions. These comprehensive data can be used to provide reliable estimates of the effect of certain variables on COVID-19 outcomes. Database URL: https://github.com/c1au6i0/covid19census
Source: Database : The Journal of Biological Databases and Curation - May 15, 2021 Category: Databases & Libraries Source Type: research

Challenges for FAIR-compliant description and comparison of crop phenotype data with standardized controlled vocabularies
Abstract: Crop phenotypic data underpin many pre-breeding efforts to characterize variation within germplasm collections. Although there has been an increase in the global capacity for accumulating and comparing such data, a lack of consistency in the systematic description of metadata often limits integration and sharing. We therefore aimed to understand some of the challenges facing findable, accessible, interoperable and reusable (FAIR) curation and annotation of phenotypic data from minor and underutilized crops. We used bambara groundnut (Vigna subterranea) as an exemplar underutilized crop to assess the ability of the C...
Source: Database : The Journal of Biological Databases and Curation - May 15, 2021 Category: Databases & Libraries Source Type: research

COVIDOUTCOME: estimating COVID severity based on mutation signatures in the SARS-CoV-2 genome
Abstract: Numerous studies demonstrate frequent mutations in the genome of SARS-CoV-2. Our goal was to statistically link mutations to severe disease outcome. We used an automated machine learning approach where 1594 viral genomes with available clinical follow-up data were used as the training set (797 ‘severe’ and 797 ‘mild’). The best algorithm, based on random forest classification combined with the LASSO feature selection algorithm, was applied to the training set to link mutation signatures and outcome. The performance of the final model was estimated by repeated, stratified, 10-fold cross validation (CV) and ...
Source: Database : The Journal of Biological Databases and Curation - May 8, 2021 Category: Databases & Libraries Source Type: research

Human IRES Atlas: an integrative platform for studying IRES-driven translational regulation in humans
Abstract: It is now known that cap-independent translation initiation facilitated by internal ribosome entry sites (IRESs) is vital in selective cellular protein synthesis under stress and different physiological conditions. However, three problems make it hard to understand transcriptome-wide cellular IRES-mediated translation initiation mechanisms: (i) complex interplay between IRESs and other translation initiation–related information, (ii) reliability issue of in silico cellular IRES investigation and (iii) labor-intensive in vivo IRES identification. In this research, we constructed the Human IRES Atlas database for a ...
Source: Database : The Journal of Biological Databases and Curation - May 4, 2021 Category: Databases & Libraries Source Type: research

CANNUSE, a database of traditional Cannabis uses: an opportunity for new research
Abstract: Cannabis is one of the most versatile genera in terms of plant uses and has been exploited by humans for millennia due to its medicinal properties, strong fibres, nutritious seeds and psychoactive resin. Nowadays, Cannabis is the centre of many scientific studies, which mainly focus on its chemical composition and medicinal properties. Unfortunately, while new applications of this plant are continuously being developed, some of its traditional uses are becoming rare and even disappearing altogether. Information on traditional uses of Cannabis is vast, but it is scattered across many publication sources in different f...
Source: Database : The Journal of Biological Databases and Curation - May 1, 2021 Category: Databases & Libraries Source Type: research