A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD)

Conclusion CARD detected 27 317 and 107 303 distinct abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache’s clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap’s performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at https://sbmi.uth.edu/ccb/resources/abbreviation.htm. We believe the CARD framework can be a valuable resource for improving abbreviation identification in clinical NLP systems.
Source: Journal of the American Medical Informatics Association - Category: Information Technology Authors: Tags: Research and Applications Source Type: research