Progress in the study of genome size evolution in Asteraceae: analysis of the last update
Abstract: The Genome Size in Asteraceae Database (GSAD, http://www.asteraceaegenomesize.com) has been recently updated, with data from papers published or in press until July 2018. This constitutes the third release of GSAD, currently containing 4350 data entries for 1496 species, which represent a growth of 22.52% in the number of species with available genome size data compared with the previous release, and a growth of 57.72% in terms of entries. Approximately 6% of Asteraceae species are covered in terms of known genome sizes. The number of source papers included in this release (198) means a 48.87% increase with respect ...
Source: Database : The Journal of Biological Databases and Curation - October 14, 2019 Category: Databases & Libraries Source Type: research

RumimiR: a detailed microRNA database focused on ruminant species
Abstract: The ever-increasing use of next-generation sequencing technologies to explore the genome has generated large quantities of data in recent years. Numerous publications have described several thousand sequences of microRNAs, all species included. A new database (RumimiR) has been created from the literature to provide a detailed description of microRNAs for three ruminant species: cattle, goats and sheep. To date, 2887, 2733 and 5095 unique microRNAs from bovine, caprine and ovine species, respectively, are included. In addition to the most recent reference genomic position and sequence of each microRNA, this databas...
Source: Database : The Journal of Biological Databases and Curation - October 14, 2019 Category: Databases & Libraries Source Type: research

The extraction of complex relationships and their conversion to biological expression language (BEL) overview of the BioCreative VI (2017) BEL track
Abstract: Knowledge of the molecular interactions of biological and chemical entities and their involvement in biological processes or clinical phenotypes is important for data interpretation. Unfortunately, this knowledge is mostly embedded in the literature in such a way that it is unavailable for automated data analysis procedures. Biological expression language (BEL) is a syntax representation allowing for the structured representation of a broad range of biological relationships. It is used in various situations to extract such knowledge and transform it into BEL networks. To support the tedious and time-intensive extra...
Source: Database : The Journal of Biological Databases and Curation - October 11, 2019 Category: Databases & Libraries Source Type: research

RiceRelativesGD: a genomic database of rice relatives for rice research
Abstract: Rice (Oryza sativa L.) is one of the most important crops worldwide. Its relatives, including phylogenetically related species of rice and paddy weeds with a similar ecological niche, can provide crucial genetic resources (such as resistance to biotic and abiotic stresses and high photosynthetic efficiency) for rice research. Although many rice genomic databases have been constructed, a database providing large-scale curated genomic data from rice relatives and offering specific gene resources is still lacking. Here, we present RiceRelativesGD, a user-friendly genomic database of rice relatives. RiceRelativesGD int...
Source: Database : The Journal of Biological Databases and Curation - September 27, 2019 Category: Databases & Libraries Source Type: research

ApicoTFdb: the comprehensive web repository of apicomplexan transcription factors and transcription-associated co-factors
Abstract: Despite significant progress in apicomplexan genome sequencing and genomics, the current list of experimentally validated transcription factors (TFs) in these genomes is incomplete and mainly consists of AP2 family of proteins, with only a limited number of non-AP2 family TFs and transcription-associated co-factors (TcoFs). We have performed a systematic bioinformatics-aided prediction of TFs and TcoFs in apicomplexan genomes and developed the ApicoTFdb database which consists of experimentally validated as well as computationally predicted TFs and TcoFs in 14 apicomplexan species. The predicted TFs are manually cu...
Source: Database : The Journal of Biological Databases and Curation - September 16, 2019 Category: Databases & Libraries Source Type: research

Benchmarking database systems for Genomic Selection implementation
Abstract: Motivation: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. To make use of this information effectively requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples with a rapid turnaround time that allows for selection before crosses needed to be made. In reality, breeders often have a short window of time to make decisions by the time they are able to collect all their phenotyping data and receive corresponding genotyping data. This presents a c...
Source: Database : The Journal of Biological Databases and Curation - September 11, 2019 Category: Databases & Libraries Source Type: research

PICEAdatabase: a web database for Picea omics and phenotypic information
Abstract: Picea belongs to the Pinaceae family and is a famous commercial tree species because of its straight trunk and excellent timber traits. Recently, omics have been widely used for fundamental and mechanism studies on Picea plants. To improve the accessibility to omics and phenotypic data and facilitate further studies, we compiled the sequences of 2 chloroplast genomes (Picea crassifolia and Picea asperata) and 32 complete omics data sets, including 20 transcriptomes, 4 proteomes, 2 degradomes and 6 microRNAs from P. crassifolia, P. asperata, Picea balfouriana and Picea abies tissues under different treatments, in PICEAdataba...
Source: Database : The Journal of Biological Databases and Curation - August 15, 2019 Category: Databases & Libraries Source Type: research

CasPDB: an integrated and annotated database for Cas proteins from bacteria and archaea
Abstract: Clustered regularly interspaced short palindromic repeats (CRISPR) and associated proteins (Cas) constitute CRISPR–Cas systems, which are antiphage immune systems present in numerous bacterial and most archaeal species. In recent years, CRISPR–Cas systems have been developed into reliable and powerful genome editing tools. Nevertheless, finding similar or better tools from bacteria or archaea remains crucial. This requires the exploration of different CRISPR systems, identification and characterization of new Cas proteins. Archives tailored for Cas proteins are urgently needed and necessitate the predict...
Source: Database : The Journal of Biological Databases and Curation - August 14, 2019 Category: Databases & Libraries Source Type: research

CDEK: Clinical Drug Experience Knowledgebase
Abstract: The Clinical Drug Experience Knowledgebase (CDEK) is a database and web platform of active pharmaceutical ingredients with evidence of clinical testing as well as the organizations involved in their research and development. CDEK was curated by disambiguating intervention and organization names from ClinicalTrials.gov and cross-referencing these entries with other prominent drug databases. Approximately 43% of active pharmaceutical ingredients in the CDEK database were sourced from ClinicalTrials.gov and cannot be found in any other prominent compound-oriented database. The contents of CDEK are structured around th...
Source: Database : The Journal of Biological Databases and Curation - August 14, 2019 Category: Databases & Libraries Source Type: research

SOCCOMAS: a FAIR web content management system that uses knowledge graphs and that is based on semantic programming
Abstract: We introduce Semantic Ontology-Controlled application for web Content Management Systems (SOCCOMAS), a development framework for FAIR (‘findable’, ‘accessible’, ‘interoperable’, ‘reusable’) Semantic Web Content Management Systems (S-WCMSs). Each S-WCMS run by SOCCOMAS has its contents managed through a corresponding knowledge base that stores all data and metadata in the form of semantic knowledge graphs in a Jena tuple store. Automated procedures track provenance, user contributions and detailed change history. Each S-WCMS is accessible via both a graphical user in...
Source: Database : The Journal of Biological Databases and Curation - August 8, 2019 Category: Databases & Libraries Source Type: research

Mammalian Annotation Database for improved annotation and functional classification of Omics datasets from less well-annotated organisms
Abstract: Next-generation sequencing technologies and the availability of an increasing number of mammalian and other genomes allow gene expression studies, particularly RNA sequencing, in many non-model organisms. However, incomplete genome annotation and assignments of genes to functional annotation databases can lead to a substantial loss of information in downstream data analysis. To overcome this, we developed Mammalian Annotation Database tool (MAdb, https://madb.ethz.ch) to conveniently provide homologous gene information for selected mammalian species. The assignment between species is performed in three steps: (i) m...
Source: Database : The Journal of Biological Databases and Curation - July 26, 2019 Category: Databases & Libraries Source Type: research

An effective biomedical data migration tool from resource description framework to JSON
Abstract: Resource Description Framework (RDF) is widely used for representing biomedical data in practical applications. With the increases of RDF-based applications, there is an emerging requirement of novel architectures to provide effective supports for the future RDF data explosion. Inspired by the success of the new designs in National Center for Biotechnology Information dbSNP (The Single Nucleotide Polymorphism Database) for managing the increasing data volumes using JSON (JavaScript Object Notation), in this paper we present an effective mapping tool that allows data migrations from RDF to JSON for supporting future...
Source: Database : The Journal of Biological Databases and Curation - July 25, 2019 Category: Databases & Libraries Source Type: research

Tripal v3: an ontology-based toolkit for construction of FAIR biological community databases
Abstract: Community biological databases provide an important online resource for both public and private data, analysis tools and community engagement. These sites house genomic, transcriptomic, genetic, breeding and ancillary data for specific species, families or clades. Due to the complexity and increasing quantities of these data, construction of online resources is increasingly difficult especially with limited funding and access to technical expertise. Furthermore, online repositories are expected to promote FAIR data principles (findable, accessible, interoperable and reusable) that presents additional challenges. Th...
Source: Database : The Journal of Biological Databases and Curation - July 22, 2019 Category: Databases & Libraries Source Type: research

IRAM: virus capsid database and analysis resource
Abstract: IRAM is an online, open access, comprehensive database and analysis resource for virus capsids. The database includes over 200 000 hierarchically organized capsid-associated nucleotide and amino acid sequences, as well as 193 capsids structures of high resolution (1–5 Å). Each capsid’s structure includes a data file for capsid domain (PDB), capsid symmetry unit (PDB) and capsid structure information (PSF); these contain capsid structural information that is necessary to run further computational studies. Physicochemical properties analysis is implemented for calculating capsid total ...
Source: Database : The Journal of Biological Databases and Curation - July 18, 2019 Category: Databases & Libraries Source Type: research

Combined alignments of sequences and domains characterize unknown proteins with remotely related protein search PSISearch2D
Abstract: Iterative homology search has been widely used in identification of remotely related proteins. Our previous study has found that the query-seeded sequence iterative search can reduce homologous over-extension errors and greatly improve selectivity. However, iterative homology search remains challenging in protein functional prediction. More sensitive scoring models are highly needed to improve the predictive performance of the alignment methods, and alignment annotation with better visualization has also become imperative for result interpretation. Here we report an open-source application PSISearch2D that runs que...
Source: Database : The Journal of Biological Databases and Curation - July 17, 2019 Category: Databases & Libraries Source Type: research

VariED: the first integrated database of gene annotation and expression profiles for variants related to human diseases
Abstract: Integrated analysis of DNA variants and gene expression profiles may facilitate precise identification of gene regulatory networks involved in disease mechanisms. Despite the widespread availability of public resources, we lack databases that are capable of simultaneously providing gene expression profiles, variant annotations, functional prediction scores and pathogenic analyses. VariED is the first web-based querying system that integrates an annotation database and expression profiles for genetic variants. The database offers a user-friendly platform and locates gene/variant names in the literature by connecting...
Source: Database : The Journal of Biological Databases and Curation - July 17, 2019 Category: Databases & Libraries Source Type: research

A curated collection of transcriptome datasets to investigate the molecular mechanisms of immunoglobulin E-mediated atopic diseases
Abstract: Prevalence of allergies has reached ~20% of population in developed countries and sensitization rate to one or more allergens among school age children are approaching 50%. However, the combination of the complexity of atopic allergy susceptibility/development and environmental factors has made identification of gene biomarkers challenging. The amount of publicly accessible transcriptomic data presents an unprecedented opportunity for mechanistic discoveries and validation of complex disease signatures across studies. However, this necessitates structured methodologies and visual tools for the interpretation of res...
Source: Database : The Journal of Biological Databases and Curation - July 10, 2019 Category: Databases & Libraries Source Type: research

MycoResistance: a curated resource of drug resistance molecules in Mycobacteria
Abstract: The emergence and spread of drug-resistant Mycobacterium tuberculosis is of global concern. To improve the understanding of drug resistance in Mycobacteria, numerous studies have been performed to discover diagnostic markers and genetic determinants associated with resistance to anti-tuberculosis drug. However, the related information is scattered in a massive body of literature, which is inconvenient for researchers to investigate the molecular mechanism of drug resistance. Therefore, we manually collected 1707 curated associations between 73 compounds and 132 molecules (including coding genes and non-coding RNAs) i...
Source: Database : The Journal of Biological Databases and Curation - July 10, 2019 Category: Databases & Libraries Source Type: research

Semalytics: a semantic analytics platform for the exploration of distributed and heterogeneous cancer data in translational research
Abstract: Each cancer is a complex system with unique molecular features determining its dynamics, such as its prognosis and response to therapies. Understanding the role of these biological traits is fundamental in order to personalize cancer clinical care according to the characteristics of each patient’s disease. To achieve this, translational researchers propagate patients’ samples through in vivo and in vitro cultures to test different therapies on the same tumor and to compare their outcomes with the molecular profile of the disease. This in turn generates information that can be subsequently translated into...
Source: Database : The Journal of Biological Databases and Curation - July 9, 2019 Category: Databases & Libraries Source Type: research

piRDisease v1.0: a manually curated database for piRNA associated diseases
Abstract: In recent years, researches focusing on PIWI-interacting RNAs (piRNAs) have increased rapidly. It has been revealed that piRNAs have strong association with a wide range of diseases; thus, it becomes very important to understand piRNAs’ role(s) in disease diagnosis, prognosis and assessment of treatment response. We searched more than 2500 articles using keywords, such as ‘PIWI-interacting RNAs’ and ‘piRNAs’, and further scrutinized the articles to collect piRNAs-disease association data. These data are highly complex and heterogeneous due to various types of piRNA identifiers (IDs) and differen...
Source: Database : The Journal of Biological Databases and Curation - July 2, 2019 Category: Databases & Libraries Source Type: research

Phytochelatin database: a resource for phytochelatin complexes of nutritional and environmental metals
Abstract: Phytochelatins (PyCs) are a diverse set of plant compounds that chelate metals, protect against metal toxicity and function in metal homeostasis. PyCs are present in plants consumed as food by humans and could, in principle, impact absorption and utilization of essential and toxic metals such as selenium and cadmium, respectively. PyCs vary in terminal amino acid composition and chain length, exist in multiple oxidation states and reversibly bind multiple metals; consequently, PyCs include a large set of possible structures. Although individual PyC-metal complexes have been studied, no resource exists to characteri...
Source: Database : The Journal of Biological Databases and Curation - July 2, 2019 Category: Databases & Libraries Source Type: research

PubMed Text Similarity Model and its application to curation efforts in the Conserved Domain Database
This study proposes a text similarity model to help biocuration efforts of the Conserved Domain Database (CDD). CDD is a curated resource that catalogs annotated multiple sequence alignment models for ancient domains and full-length proteins. These models allow for fast searching and quick identification of conserved motifs in protein sequences via Reverse PSI-BLAST. In addition, CDD curators prepare summaries detailing the function of these conserved domains and specific protein families, based on published peer-reviewed articles. To facilitate information access for database users, it is desirable to specifically identif...
Source: Database : The Journal of Biological Databases and Curation - July 2, 2019 Category: Databases & Libraries Source Type: research

PTSD Biomarker Database: deep dive metadatabase for PTSD biomarkers, visualizations and analysis tools
Abstract: The PTSD Biomarker Database (PTSDDB) is a database that provides a landscape view of physiological markers being studied as putative biomarkers in the current post-traumatic stress disorder (PTSD) literature to enable researchers to explore and compare findings quickly. The PTSDDB currently contains over 900 biomarkers and their relevant information from 109 original articles published from 1997 to 2017. Further, the curated content stored in this database is complemented by a web application consisting of multiple interactive visualizations that enable the investigation of biomarker knowledge in PTSD (e.g. clinica...
Source: Database : The Journal of Biological Databases and Curation - June 29, 2019 Category: Databases & Libraries Source Type: research

Oomycete Gene Table: an online database for comparative genomic analyses of the oomycete microorganisms
Abstract: Oomycetes form a unique group of the fungal-like, aquatic, eukaryotic microorganisms. Lifestyle and pathogenicity of the oomycetes are diverse. Many pathogenic oomycetes affect a broad range of plants and cause enormous economic loss annually. Some pathogenic oomycetes cause destructive and deadly diseases in a variety of animals, including humans. No effective antimicrobial agent against the oomycetes is available. Genomic data of many oomycetes are currently available. Comparative analyses of the oomycete genomes must be performed to better understand the oomycete biology and virulence, as well as to identify con...
Source: Database : The Journal of Biological Databases and Curation - June 29, 2019 Category: Databases & Libraries Source Type: research

PRRDB 2.0: a comprehensive database of pattern-recognition receptors and their ligands
Abstract: PRRDB 2.0 is an updated version of PRRDB that maintains comprehensive information about pattern-recognition receptors (PRRs) and their ligands. The current version of the database has ~2700 entries, which are nearly five times of the previous version. It contains extensive information about 467 unique PRRs and 827 pathogens-associated molecular patterns (PAMPs), manually extracted from ~600 research articles. It possesses information about PRRs and PAMPs that has been extracted manually from research articles and public databases. Each entry provides comprehensive details about PRRs and PAMPs that includes their na...
Source: Database : The Journal of Biological Databases and Curation - June 27, 2019 Category: Databases & Libraries Source Type: research

MolMeDB: Molecules on Membranes Database
Abstract: Biological membranes act as barriers or reservoirs for many compounds within the human body. As such, they play an important role in pharmacokinetics and pharmacodynamics of drugs and other molecular species. Until now, most membrane/drug interactions have been inferred from simple partitioning between octanol and water phases. However, the observed variability in membrane composition and among compounds themselves stretches beyond such simplification as there are multiple drug–membrane interactions. Numerous experimental and theoretical approaches are used to determine the molecule–membrane interactio...
Source: Database : The Journal of Biological Databases and Curation - June 27, 2019 Category: Databases & Libraries Source Type: research

LncCeRBase: a database of experimentally validated human competing endogenous long non-coding RNAs
This manuscript has been corrected to update the LncCeRBase web address to http://www.insect-genome.com/LncCeRBase
Source: Database : The Journal of Biological Databases and Curation - June 25, 2019 Category: Databases & Libraries Source Type: research

SeQuery: an interactive graph database for visualizing the GPCR superfamily
In this study, we propose a web-based graphical database tool, SeQuery, for intuitively visualizing proteome/genome networks by integrating the sequential, structural and functional information of sequences. As a demonstration of our tool’s effectiveness, we constructed a graph database of G protein-coupled receptor (GPCR) sequences by integrating data from the UniProt, GPCRdb and RCSB PDB databases. Our tool attempts to achieve two goals: (i) given the sequence of a query protein, correctly and efficiently identify whether the protein is a GPCR, and, if so, define its sequential and functional roles in the GPCR su...
Source: Database : The Journal of Biological Databases and Curation - June 25, 2019 Category: Databases & Libraries Source Type: research

MepmiRDB: a medicinal plant microRNA database
Abstract: MicroRNAs (miRNAs) have been recognized as a key regulator in plant development and metabolism. Recent reports showed that the miRNAs of medicinal plants not only act as a critical modulator in secondary metabolism but also had a great potential of performing cross-kingdom gene regulation. Although several plant miRNA repositories have been publicly available, no miRNA database specific for medicinal plants has been reported to date. Here, we report the first version of MepmiRDB (medicinal plant microRNA database), which is freely accessible at http://mepmirdb.cn/mepmirdb/index.html. This database accommodates thous...
Source: Database : The Journal of Biological Databases and Curation - June 24, 2019 Category: Databases & Libraries Source Type: research

Mr.Vc: a database of microarray and RNA-seq of Vibrio cholerae
In this study, we constructed a microarray and RNA-seq database of V. cholerae (Mr.Vc), containing gene transcriptional expression data of 145 experimental conditions of V. cholerae from various sources, covering 25 937 entries of differentially expressed genes. In addition, we collected relevant information including gene annotation, operons they may belong to and possible interaction partners of their protein products. With Mr.Vc, users can easily find transcriptome data they are interested in, such as the experimental conditions in which a gene of interest was differentially expressed in, or all genes that were di...
Source: Database : The Journal of Biological Databases and Curation - June 24, 2019 Category: Databases & Libraries Source Type: research

Re-curation and rational enrichment of knowledge graphs in Biological Expression Language
Abstract: The rapid accumulation of new biomedical literature not only causes curated knowledge graphs (KGs) to become outdated and incomplete, but also makes manual curation an impractical and unsustainable solution. Automated or semi-automated workflows are necessary to assist in prioritizing and curating the literature to update and enrich KGs. We have developed two workflows: one for re-curating a given KG to assure its syntactic and semantic quality and another for rationally enriching it by manually revising automatically extracted relations for nodes with low information density. We applied these workflows to the KGs ...
Source: Database : The Journal of Biological Databases and Curation - June 21, 2019 Category: Databases & Libraries Source Type: research

AmyloWiki: an integrated database for Bacillus velezensis FZB42, the model strain for plant growth-promoting Bacilli
Abstract: Since its isolation 20 years ago, many studies have been devoted to Bacillus velezensis FZB42 (former name Bacillus amyloliquefaciens subsp. plantarum FZB42), which has been gradually accepted as a model organism for Gram-positive rhizobacteria. FZB42 is different from another widely studied bacterial strain, Bacillus subtilis 168, in its many features that are closely associated with plants. FZB42 represents a large group of Bacillus isolates that are beneficial to plants and of great importance in agriculture. In this work a database for FZB42 named ‘AmyloWiki’ is built to integrate all information o...
Source: Database : The Journal of Biological Databases and Curation - June 19, 2019 Category: Databases & Libraries Source Type: research

CCRDB: a cancer circRNAs-related database and its application in hepatocellular carcinoma-related circRNAs
Abstract: Circular RNAs (circRNAs) are widely expressed in human cells and tissues and can form a covalently closed exon circularization, which have stable patterns and play important regulatory roles in physiological or pathological process. There is still lack of a comprehensively disease-related knowledge base for in-depth analysis of circRNAs. In this paper, a cancer circRNAs-related database (CCRDB) was established. The CCRDB’s initial circRNAs data were collected by sequencing experimental data of 10 samples from 5 patients with hepatocellular carcinoma (HCC), where a total of 11 501 circRNAs were found a...
Source: Database : The Journal of Biological Databases and Curation - June 19, 2019 Category: Databases & Libraries Source Type: research

GrainGenes: centralized small grain resources and digital platform for geneticists and breeders
Abstract: GrainGenes (https://wheat.pw.usda.gov or https://graingenes.org) is an international centralized repository for curated, peer-reviewed datasets useful to researchers working on wheat, barley, rye and oat. GrainGenes manages genomic, genetic, germplasm and phenotypic datasets through a dynamically generated web interface for facilitated data discovery. Since 1992, GrainGenes has served geneticists and breeders in both the public and private sectors on six continents. Recently, several new datasets were curated into the database along with new tools for analysis. The GrainGenes homepage was enhanced by making it more ...
Source: Database : The Journal of Biological Databases and Curation - June 18, 2019 Category: Databases & Libraries Source Type: research

ChlamBase: a curated model organism database for the Chlamydia research community
This manuscript has been amended to include additional authors who were inadvertently omitted.
Source: Database : The Journal of Biological Databases and Curation - June 18, 2019 Category: Databases & Libraries Source Type: research

SpinachBase: a central portal for spinach genomics
Abstract: Spinach (Spinacia oleracea L.) is a nutritious vegetable enriched with many essential minerals and vitamins. A reference spinach genome has been recently released, and additional spinach genomic resources are being rapidly developed. Therefore, there is an urgent need of a central database to store, query, analyze and integrate various resources of spinach genomic data. To this end, we developed SpinachBase (http://spinachbase.org), which provides centralized public accesses to genomic data as well as analytical tools to assist research and breeding in spinach. The database currently stores the spinach reference ge...
Source: Database : The Journal of Biological Databases and Curation - June 18, 2019 Category: Databases & Libraries Source Type: research

Using association rule mining and ontologies to generate metadata recommendations from multiple biomedical databases
Abstract: Metadata, the machine-readable descriptions of the data, are increasingly seen as crucial for describing the vast array of biomedical datasets that are currently being deposited in public repositories. While most public repositories have firm requirements that metadata must accompany submitted datasets, the quality of those metadata is generally very poor. A key problem is that the typical metadata acquisition process is onerous and time consuming, with little interactive guidance or assistance provided to users. Secondary problems include the lack of validation and sparse use of standardized terms or o...
Source: Database : The Journal of Biological Databases and Curation - June 10, 2019 Category: Databases & Libraries Source Type: research

Chickspress: a resource for chicken gene expression
Abstract: High-throughput sequencing and proteomics technologies are markedly increasing the amount of RNA and peptide data that are available to researchers, which are typically made publicly available via data repositories such as the NCBI Sequence Read Archive and proteome archives, respectively. These data sets contain valuable information about when and where gene products are expressed, but this information is not readily obtainable from archived data sets. Here we report Chickspress (http://geneatlas.arl.arizona.edu), the first publicly available gene expression resource for chicken tissues. Since there is no single s...
Source: Database : The Journal of Biological Databases and Curation - June 10, 2019 Category: Databases & Libraries Source Type: research

A web-based tool for the prediction of rice transcription factor function
Abstract: Transcription factors (TFs) are an important class of regulatory molecules. Despite their importance, only a small number of genes encoding TFs have been characterized in Oryza sativa (rice), often because gene duplication and functional redundancy complicate their analysis. To address this challenge, we developed a web-based tool called the Rice Transcription Factor Phylogenomics Database (RTFDB) and demonstrate its application for predicting TF function. The RTFDB hosts transcriptome and co-expression analyses. Sources include high-throughput data from oligonucleotide microarray (Affymetrix and Agilent) as well as...
Source: Database : The Journal of Biological Databases and Curation - June 6, 2019 Category: Databases & Libraries Source Type: research

Endometriosis Knowledgebase: a gene-based resource on endometriosis
Abstract: Endometriosis is a complex, benign, estrogen-dependent gynecological disorder with an incidence of ~10% in women of reproductive age. The implantation and growth of endometrial cells outside the uterus leads to the development of endometriosis. Endometriosis is also associated with comorbid conditions like cardiovascular and autoimmune diseases. The absence of non-invasive diagnostic markers, delayed diagnosis, high risk of recurrence of the disease on surgical removal of the tissue and absence of a definitive cure for endometriosis make it imperative to gain insights into the complex etiology of endometriosis. A ple...
Source: Database : The Journal of Biological Databases and Curation - June 5, 2019 Category: Databases & Libraries Source Type: research

ResMarkerDB: a database of biomarkers of response to antibody therapy in breast and colorectal cancer
Abstract: The clinical efficacy of therapeutic monoclonal antibodies for breast and colorectal cancer has greatly contributed to the improvement of patients' outcomes by individualizing their treatments according to their genomic background. However, primary or acquired resistance to treatment reduces its efficacy. In this context, the identification of biomarkers predictive of drug response would support research and development of new alternative treatments. Biomarkers play a major role in the genomic revolution, supporting disease diagnosis and treatment decision-making. Currently, several molecular biomarkers of ...
Source: Database : The Journal of Biological Databases and Curation - June 4, 2019 Category: Databases & Libraries Source Type: research

VigSatDB: genome-wide microsatellite DNA marker database of three species of Vigna for germplasm characterization and improvement
We present VigSatDB, the world’s first comprehensive microsatellite database of the genus Vigna, containing >875 K putative microsatellite markers, with 772 354 simple and 103 865 compound markers mined from six genome assemblies of three Vigna species, namely Vigna radiata (Mung bean), Vigna angularis (Adzuki bean) and Vigna unguiculata (Cowpea). It also contains 1976 validated published markers. Markers can be selected on the basis of chromosome/location specificity, and primers can be generated using the Primer3core tool integrated at the backend. Efficacy of VigSatDB for microsatellite loci genotyping ha...
Source: Database : The Journal of Biological Databases and Curation - May 31, 2019 Category: Databases & Libraries Source Type: research

Chemical–protein interaction extraction via contextualized word representations and multihead attention
We present a deep neural model for chemical–protein interaction (CPI) extraction based on deep context representation and multihead attention. Our model mainly consists of the following three parts: a deep context representation layer, a bidirectional long short-term memory network (Bi-LSTM) layer and a multihead attention layer. The deep context representation is employed to provide more comprehensive feature input for the Bi-LSTMs. The multihead attention can effectively emphasize the important part of the Bi-LSTM output. We evaluated our method on the public ChemProt corpus. The experimental results show that both deep context representation and m...
Source: Database : The Journal of Biological Databases and Curation - May 24, 2019 Category: Databases & Libraries Source Type: research

LanceletDB: an integrated genome database for lancelet, comparing domain types and combination in orthologues among lancelet and other species
Abstract: Lancelet (amphioxus) represents the most basally divergent extant chordate lineage (cephalochordates), which diverged from the other two chordate lineages (urochordates and vertebrates) more than half a billion years ago. As it occupies a key position in evolution, it is considered one of the best proxies for understanding the chordate ancestral state. Thus, the construction of a database with multiple lancelet genomes and gene annotation data, including protein domains, is urgently needed to investigate the loss and gain of domains in orthologues among species, especially ancient domain types (non-vertebrate-specific dom...
Source: Database : The Journal of Biological Databases and Curation - May 18, 2019 Category: Databases & Libraries Source Type: research

GIDB: a knowledge database for the automated curation and multidimensional analysis of molecular signatures in gastrointestinal cancer
Abstract: Gastrointestinal (GI) cancer is common, characterized by high mortality, and includes oesophagus, gastric, liver, bile duct, pancreas, rectal and colon cancers. The insufficient specificity and sensitivity of biomarkers is still a key clinical hindrance for GI cancer diagnosis and successful treatment. The emergence of the ‘precision medicine’, ‘basket trial’ and ‘field cancerization’ concepts creates an urgent need to understand how organ system cancers occur at the molecular level. Knowledge from both the literature and data available in public databases is informative ...
Source: Database : The Journal of Biological Databases and Curation - May 15, 2019 Category: Databases & Libraries Source Type: research

CropCircDB: a comprehensive circular RNA resource for crops in response to abiotic stress
Abstract: Circular RNAs (circRNAs) may mediate mRNA expression by acting as miRNA sponges. As the community has paid increasing attention to circRNAs, many circRNA databases have been developed for plants. However, a comprehensive collection of circRNAs involved in crop response to abiotic stress is still lacking. In this work, we applied a big-data approach to take full advantage of large-scale sequencing data, and developed a rich circRNA resource: CropCircDB for maize and rice, later extending to incorporate more crop species. We also designed a metric: stress detections score, which is specifically for detecting circRNAs under stress conditi...
Source: Database : The Journal of Biological Databases and Curation - May 6, 2019 Category: Databases & Libraries Source Type: research

The NCBI BioCollections Database
Source: Database : The Journal of Biological Databases and Curation - April 29, 2019 Category: Databases & Libraries Source Type: research

The MACADAM database: a MetAboliC pAthways DAtabase for Microbial taxonomic groups for mining potential metabolic capacities of archaeal and bacterial taxonomic groups
Here we present the MetAboliC pAthways DAtabase for Microbial taxonomic groups (MACADAM), a user-friendly database that makes it possible to find presence/absence/completeness statistics for metabolic pathways at a given microbial taxonomic position. For each prokaryotic ‘RefSeq complete genome’, MACADAM builds a pathway genome database (PGDB) using Pathway Tools software based on MetaCyc data that includes metabolic pathways as well as associated metabolites, reactions and enzymes. To ensure the highest quality of the genome functional annotation data, MACADAM also contains MicroCyc, a manually curated collection ...
Source: Database : The Journal of Biological Databases and Curation - April 29, 2019 Category: Databases & Libraries Source Type: research

YeasTSS: an integrative web database of yeast transcription start sites
Abstract: The transcription initiation landscape of eukaryotic genes is complex and highly dynamic. In eukaryotes, genes can generate multiple transcript variants that differ in their 5′ boundaries due to the use of alternative transcription start sites (TSSs), and the abundance of transcript isoforms is highly variable. Due to the large number and complexity of the TSSs, it is not feasible to depict the details of the transcription initiation landscape of all genes using text-format genome annotation files. Therefore, it is necessary to provide data visualization of TSSs to represent quantitative TSS maps and the core promoters (CPs). ...
Source: Database : The Journal of Biological Databases and Curation - April 26, 2019 Category: Databases & Libraries Source Type: research

PlantMP: a database for moonlighting plant proteins
Abstract: Moonlighting proteins are single polypeptide chains capable of executing two or more distinct biochemical and/or biological functions. Here, we describe the development of PlantMP, which is a manually curated online-based database of plant proteins that are known to ‘moonlight’. The database contains searchable UniProt IDs and names, canonical and moonlighting functions, gene ontology numbers, plant species as well as links to the PubMed indexed articles. Proteins homologous to experimentally confirmed moonlighting proteins from the model plant Arabidopsis thaliana are provided as a separate list of ‘likely m...
Source: Database : The Journal of Biological Databases and Curation - April 25, 2019 Category: Databases & Libraries Source Type: research