CIBS: A biomedical text summarizer using topic-based sentence clustering

Publication date: Available online 13 November 2018
Source: Journal of Biomedical Informatics
Author(s): Milad Moradi

Abstract: Automatic text summarizers can reduce the time required to read lengthy text documents by extracting the most important parts. Multi-document summarizers should produce a summary that covers the main topics of multiple related input texts to diminish the extent of redundant information. In this paper, we propose a novel summarization method named Clustering and Itemset mining based Biomedical Summarizer (CIBS). The summarizer extracts biomedical concepts from the input documents and employs an itemset mining algorithm to discover main topics. Then, it applies a clustering algorithm to put the sentences into clusters such that those in the same cluster share similar topics. Selecting sentences from all the clusters, the summarizer can produce a summary that covers a wide range of topics of the input text. Using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit, we evaluate the performance of the CIBS method against four summarizers including a state-of-the-art method. The results show that the CIBS method can improve the performance of single- and multi-document biomedical text summarization. It is shown that the topic-based sentence clustering approach can be effectively used to increase the informative content of summaries, as well as to decrease the redundant information.
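
The pipeline described in the abstract (concepts, then itemset-based topics, then sentence clusters, then selection from every cluster) can be illustrated with a small Python sketch. This is not the CIBS implementation: the abstract does not name the concept extractor, the itemset mining algorithm, the clustering algorithm, or the selection rule, so everything below is an assumption made for illustration. The sketch assumes each sentence has already been mapped to a set of concept labels, uses brute-force counting in place of a real itemset miner, a greedy single-pass clustering with Jaccard similarity, and a "most topics covered" selection rule. All function names and parameters (`frequent_itemsets`, `cluster_sentences`, `summarize`, `min_support`, `threshold`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force frequent itemset mining (illustrative stand-in only).

    Each transaction is the set of concepts found in one sentence. An
    itemset is kept if it occurs in at least `min_support` of sentences.
    """
    counts = Counter()
    for concepts in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(concepts), size):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return [items for items, c in counts.items() if c / n >= min_support]


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(topic_sets, threshold):
    """Greedy single-pass clustering (an assumed, simple algorithm).

    A sentence joins the first cluster whose seed sentence shares enough
    topics with it; otherwise it starts a new cluster.
    """
    clusters = []
    for idx, topics in enumerate(topic_sets):
        for cluster in clusters:
            if jaccard(topics, topic_sets[cluster[0]]) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


def summarize(sentence_concepts, min_support=0.5, threshold=0.5):
    """Return indices of one selected sentence per topic cluster."""
    itemsets = frequent_itemsets(sentence_concepts, min_support)
    # A sentence "covers" a topic if it contains every concept in the itemset.
    topic_sets = [
        frozenset(i for i, items in enumerate(itemsets) if items <= concepts)
        for concepts in sentence_concepts
    ]
    clusters = cluster_sentences(topic_sets, threshold)
    # Assumed selection rule: most topics covered, earliest sentence on ties.
    picked = [max(c, key=lambda i: (len(topic_sets[i]), -i)) for c in clusters]
    return sorted(picked)


# Toy input: concept sets for four sentences (made-up labels, not real data).
example = [
    {"diabetes", "insulin"},
    {"diabetes", "insulin", "glucose"},
    {"hypertension", "stroke"},
    {"hypertension", "stroke", "sodium"},
]
print(summarize(example))  # one sentence index per cluster
```

On the toy input, the two diabetes sentences fall into one cluster and the two hypertension sentences into another, so picking one sentence from each cluster covers both topics without repeating either. That is the redundancy-reduction idea the abstract describes, shown here under the simplifying assumptions stated above.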
Source: Journal of Biomedical Informatics - Category: Information Technology - Source Type: research