Named entity recognition of rice genes and phenotypes based on BiGRU neural networks

Comput Biol Chem. 2023 Nov 3;108:107977. doi: 10.1016/j.compbiolchem.2023.107977. Online ahead of print.

ABSTRACT

Named Entity Recognition (NER) is a fundamental task in natural language processing (NLP) and big data analysis, with a wide range of applications. NER for rice genes and phenotypes identifies gene and phenotype mentions in large volumes of text. It can facilitate the acquisition of information in the crop domain and provide references for our research on higher-quality crops. At the same time, named entity recognition still faces many challenges. In this paper, we propose an improved bidirectional gated recurrent unit neural network (BI-GRU) method that automatically identifies the required entities (i.e., gene names and rice phenotypes) in relevant rice literature and patents. The neural network is combined with the Softmax function to output label probabilities directly, forming the BI-GRU-SF model. Because it is a deep learning method, the model can learn semantic information from context without the need for feature engineering. Finally, we conducted experiments, and the results showed that our proposed model performed better than the other models. All datasets and source code of BI-GRU-SF are available at https://github.com/qqeeqq/NER for academic use.

PMID: 37995493 | DOI: 10.1016/j.compbiolchem.2023.107977
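
To make the architecture named in the abstract more concrete, the following is a minimal sketch of a bidirectional GRU whose per-token hidden states are passed through a softmax to give label probabilities. It is not the authors' implementation: the BIO label set, the example sentence, the dimensions, the helper names (`GRUCell`, `bigru_softmax`, `rand_matrix`), and the random untrained weights are all assumptions made for illustration, and bias terms are omitted for brevity.

```python
import math
import random

# Hypothetical BIO tag set; the tags actually used in the paper are not given in the abstract.
LABELS = ["O", "B-GENE", "I-GENE", "B-PHEN", "I-PHEN"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class GRUCell:
    """A single-direction GRU without bias terms (illustrative only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # Interpolate between the previous state and the candidate state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, xs):
        """Return the hidden state after each input vector in xs."""
        h = [0.0] * self.hidden_size
        states = []
        for x in xs:
            h = self.step(x, h)
            states.append(h)
        return states

def bigru_softmax(embeddings, fwd, bwd, W_out):
    """Concatenate forward and backward states per token, then apply softmax."""
    h_fwd = fwd.run(embeddings)
    # Run the backward GRU on the reversed sequence, then restore token order.
    h_bwd = bwd.run(embeddings[::-1])[::-1]
    return [softmax(matvec(W_out, f + b)) for f, b in zip(h_fwd, h_bwd)]

rng = random.Random(0)
tokens = ["OsSPL14", "increases", "grain", "yield"]  # made-up example sentence
embed_dim, hidden = 8, 6
# Random vectors stand in for learned word embeddings.
embeddings = [[rng.uniform(-1.0, 1.0) for _ in range(embed_dim)] for _ in tokens]
fwd = GRUCell(embed_dim, hidden, rng)
bwd = GRUCell(embed_dim, hidden, rng)
W_out = rand_matrix(len(LABELS), 2 * hidden, rng)

probs = bigru_softmax(embeddings, fwd, bwd, W_out)
predicted = [LABELS[max(range(len(LABELS)), key=p.__getitem__)] for p in probs]
for token, label in zip(tokens, predicted):
    print(token, label)
```

Because the weights here are random, the printed labels carry no meaning; the sketch only shows how each token receives a probability distribution over labels, with the highest-probability label taken as the prediction. A trained model would learn the embeddings and weight matrices from annotated rice literature.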
Source: Computational Biology and Chemistry - Category: Bioinformatics Authors: Source Type: research