IBDDB: a manually curated and text-mining-enhanced database of genes involved in inflammatory bowel disease
Abstract: To date, research on inflammatory bowel disease (IBD, encompassing Crohn’s disease and ulcerative colitis), a chronic complex disorder, has generated a large amount of data scattered across published literature (106 333) listed in PubMed on 14 October 2020, and no dedicated database currently exists that catalogues information on genes associated with IBD. We aimed to manually curate 289 genes that are experimentally validated to be linked with IBD and its known phenotypes. Furthermore, we have developed an integrated platform providing information about different aspects of these genes by incorporating sever...
Source: Database: The Journal of Biological Databases and Curation - April 30, 2021 Category: Databases & Libraries Source Type: research

Increasing metadata coverage of SRA BioSample entries using deep learning–based named entity recognition
Abstract: High-quality metadata annotations for data hosted in large public repositories are essential for research reproducibility and for conducting fast, powerful and scalable meta-analyses. Currently, a majority of sequencing samples in the National Center for Biotechnology Information’s Sequence Read Archive (SRA) are missing metadata across several categories. In an effort to improve the metadata coverage of these samples, we leveraged almost 44 million attribute–value pairs from SRA BioSample to train a scalable, recurrent neural network that predicts missing metadata via named entity recognition (NER). The netwo...
Source: Database: The Journal of Biological Databases and Curation - April 29, 2021 Category: Databases & Libraries Source Type: research

Tripal MegaSearch: a tool for interactive and customizable query and download of big data
Abstract: Tripal MegaSearch is a Tripal module for querying and downloading biological data stored in Chado. This module allows site users to select data types, restrict the dataset by applying various filters and then customizing fields to view and download through a single interface. Set by site administrators, example data types include gene, germplasm, marker, map, QTL, genotype, phenotype and expression data. When querying for genes, users can restrict the gene dataset using various filters such as name, chromosome position and functional annotation. They can then customize fields to download, such as name, organism, ty...
Source: Database: The Journal of Biological Databases and Curation - April 26, 2021 Category: Databases & Libraries Source Type: research

BC-TFdb: a database of transcription factor drivers in breast cancer
Abstract: Transcription factors (TFs) are DNA-binding proteins, which regulate many essential biological functions. In several cancer types, TF function is altered by various direct mechanisms, including gene amplification or deletion, point mutations, chromosomal translocations, expression alterations, as well as indirectly by non-coding DNA mutations influencing the binding of the TF. TFs are also actively involved in breast cancer (BC) initiation and progression. Herein, we have developed an open-access database, BC-TFdb (Breast Cancer Transcription Factors database), of curated, non-redundant TF involved in BC. The datab...
Source: Database: The Journal of Biological Databases and Curation - April 21, 2021 Category: Databases & Libraries Source Type: research

APICURON: a database to credit and acknowledge the work of biocurators
Abstract: APICURON is an open and freely accessible resource that tracks and credits the work of biocurators across multiple participating knowledgebases. Biocuration is essential to extract knowledge from research data and make it available in a structured and standardized way to the scientific community. However, processing biological data, mainly from literature, requires a huge effort that is difficult to attribute and quantify. APICURON collects biocuration events from third-party resources and aggregates this information, spotlighting biocurator contributions. APICURON promotes biocurator engagement implementing gam...
Source: Database: The Journal of Biological Databases and Curation - April 21, 2021 Category: Databases & Libraries Source Type: research

H3ABioNet genomic medicine and microbiome data portals hackathon proceedings
Abstract: African genomic medicine and microbiome datasets are usually not well characterized in terms of their origin, making it difficult to find and extract data for specific African ethnic groups or even countries. The Pan-African H3Africa Bioinformatics Network (H3ABioNet) recognized the need for developing data portals for African genomic medicine and African microbiomes to address this and ran a hackathon to initiate their development. The two portals were designed and significant progress was made in their development during the hackathon. All the participants worked in a very synergistic and collaborative atmosphere...
Source: Database: The Journal of Biological Databases and Curation - April 17, 2021 Category: Databases & Libraries Source Type: research

Post-translational modifications in proteins: resources, tools and prediction methods
Abstract: Posttranslational modifications (PTMs) refer to amino acid side chain modification in some proteins after their biosynthesis. There are more than 400 different types of PTMs affecting many aspects of protein functions. Such modifications happen as crucial molecular regulatory mechanisms to regulate diverse cellular processes. These processes have a significant impact on the structure and function of proteins. Disruption in PTMs can lead to the dysfunction of vital biological processes and hence to various diseases. High-throughput experimental methods for discovery of PTMs are very laborious and time-consuming. The...
Source: Database: The Journal of Biological Databases and Curation - April 7, 2021 Category: Databases & Libraries Source Type: research

MENSAdb: a thorough structural analysis of membrane protein dimers
Abstract: Membrane proteins (MPs) are key players in a variety of different cellular processes and constitute the target of around 60% of all Food and Drug Administration–approved drugs. Despite their importance, there is still a massive lack of relevant structural, biochemical and mechanistic information mainly due to their localization within the lipid bilayer. To help fulfil this gap, we developed the MEmbrane protein dimer Novel Structure Analyser database (MENSAdb). This interactive web application summarizes the evolutionary and physicochemical properties of dimeric MPs to expand the available knowledge on the fund...
Source: Database: The Journal of Biological Databases and Curation - April 5, 2021 Category: Databases & Libraries Source Type: research

Wormicloud: a new text summarization tool based on word clouds to explore the C. elegans literature
Abstract: Finding relevant information from newly published scientific papers is becoming increasingly difficult due to the pace at which articles are published every year as well as the increasing amount of information per paper. Biocuration and model organism databases provide a map for researchers to navigate through the complex structure of the biomedical literature by distilling knowledge into curated and standardized information. In addition, scientific search engines such as PubMed and text-mining tools such as Textpresso allow researchers to easily search for specific biological aspects from newly published papers, f...
Source: Database: The Journal of Biological Databases and Curation - March 31, 2021 Category: Databases & Libraries Source Type: research

Drugmonizome and Drugmonizome-ML: integration and abstraction of small molecule attributes for drug enrichment analysis and machine learning
Abstract: Understanding the underlying molecular and structural similarities between seemingly heterogeneous sets of drugs can aid in identifying drug repurposing opportunities and assist in the discovery of novel properties of preclinical small molecules. A wealth of information about drug and small molecule structure, targets, indications and side effects; induced gene expression signatures; and other attributes are publicly available through web-based tools, databases and repositories. By processing, abstracting and aggregating information from these resources into drug set libraries, knowledge about novel properties of d...
Source: Database: The Journal of Biological Databases and Curation - March 31, 2021 Category: Databases & Libraries Source Type: research

Bioinformatics tools developed to support BioCompute Objects
Abstract: Developments in high-throughput sequencing (HTS) result in an exponential increase in the amount of data generated by sequencing experiments, an increase in the complexity of bioinformatics analysis reporting and an increase in the types of data generated. These increases in volume, diversity and complexity of the data generated and their analysis expose the necessity of a structured and standardized reporting template. BioCompute Objects (BCOs) provide the requisite support for communication of HTS data analysis that includes support for workflow, as well as data, curation, accessibility and reproducibility of com...
Source: Database: The Journal of Biological Databases and Curation - March 30, 2021 Category: Databases & Libraries Source Type: research

An immunologically friendly classification of non-peptidic ligands
Abstract: The Immune Epitope Database (IEDB) freely provides experimental data regarding immune epitopes to the scientific public. The main users of the IEDB are immunologists who can easily use our web interface to search for peptidic epitopes via their simple single-letter codes. For example, ‘A’ stands for ‘alanine’. Similarly, users can easily navigate the IEDB’s simplified NCBI taxonomy hierarchy to locate proteins from specific organisms. However, some epitopes are non-peptidic, such as carbohydrates, lipids, chemicals and drugs, and it is more challenging to consistently name them and search upon, making ac...
Source: Database: The Journal of Biological Databases and Curation - March 27, 2021 Category: Databases & Libraries Source Type: research

Development of a biomarker database toward performing disease classification and finding disease interrelations
Abstract: A biomarker is a measurable indicator of a disease or abnormal state of a body that plays an important role in disease diagnosis, prognosis and treatment. The biomarker has become a significant topic due to its versatile usage in the medical field and in rapid detection of the presence or severity of some diseases. The volume of biomarker data is rapidly increasing and the identified data are scattered. To provide comprehensive information, the explosively growing data need to be recorded in a single platform. There is no open-source freely available comprehensive online biomarker database. To fulfill this purpose,...
Source: Database: The Journal of Biological Databases and Curation - March 11, 2021 Category: Databases & Libraries Source Type: research

CMBD: a manually curated cancer metabolic biomarker knowledge database
Abstract: The pathogenesis of cancer is influenced by interactions among genes, proteins, metabolites and other small molecules. Understanding cancer progression at the metabolic level is propitious to the visual decoding of changes in living organisms. To date, a large number of metabolic biomarkers in cancer have been measured and reported, which provide an alternative method for cancer precision diagnosis, treatment and prognosis. To systematically understand the heterogeneity of cancers, we developed the database CMBD to integrate the cancer metabolic biomarkers scattered over literatures in PubMed. At present, CMBD cont...
Source: Database: The Journal of Biological Databases and Curation - March 9, 2021 Category: Databases & Libraries Source Type: research