Comparing Medline citations using modified N-grams.

DISCUSSION: Results show that the detection of duplicate Medline citations can be improved by modifying n-grams and that high performance can also be obtained using only unigrams (F1=0.959), particularly when allowing for substitutions of alternative phrases. PMID: 23715801 [PubMed - as supplied by publisher]
Source: Journal of the American Medical Informatics Association (J Am Med Inform Assoc) - Category: Information Technology - Source Type: research
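
To make the unigram comparison mentioned in the discussion concrete, here is a minimal sketch in Python. It is not the authors' method: the Dice coefficient, the 0.8 threshold, the token-level `SUBSTITUTIONS` table, and all function names are assumptions chosen for illustration, and the abstract does not say how substitutions of alternative phrases were actually implemented.

```python
import re

# Hypothetical substitution table (an assumption, not from the paper):
# maps an alternative spelling to one canonical token.
SUBSTITUTIONS = {"tumour": "tumor", "randomised": "randomized"}


def unigrams(text):
    """Return the set of lowercased word tokens, after substitutions."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {SUBSTITUTIONS.get(tok, tok) for tok in tokens}


def dice_similarity(a, b):
    """Dice coefficient over unigram sets: 2|A & B| / (|A| + |B|)."""
    ta, tb = unigrams(a), unigrams(b)
    if not ta and not tb:
        return 1.0  # two empty strings are treated as identical
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def is_duplicate(a, b, threshold=0.8):
    """Flag a pair as a likely duplicate; the threshold is an assumed value."""
    return dice_similarity(a, b) >= threshold


def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)
```

For example, `is_duplicate("Tumour growth in mice", "Tumor growth in mice.")` is true under these assumptions, because the substitution table maps both spellings to the same token. A real system would tune the threshold on labelled citation pairs and report F1 on held-out data.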