Effects of Porting Essie Tokenization and Normalization to Solr

AMIA Annu Symp Proc. 2024 Jan 11;2023:369-378. eCollection 2023.

ABSTRACT

Searching for information is now an integral part of healthcare. Searches are enabled by search engines, whose objective is to efficiently retrieve the information relevant to the user's query. For retrieving biomedical text and literature, the Essie search engine developed at the National Library of Medicine (NLM) performs exceptionally well. However, Essie was developed for NLM, and its development and support have ceased. Solr, on the other hand, is a popular open-source enterprise search engine used by many of the world's largest internet sites; it is under continuous development and improvement and offers state-of-the-art features. In this paper, we present our approach to porting the key features of Essie to Solr as custom components. We demonstrate the effectiveness of the added components on three benchmark biomedical datasets. The custom components may aid the community in improving search methods for biomedical text retrieval.

PMID: 38222430 | PMC: PMC10785910
Source: AMIA Annual Symposium Proceedings - Category: Bioinformatics - Source Type: research