Clinical text annotation - what factors are associated with the cost of time?

In this study, we investigated how factors inherent to the text affect annotation time for a named entity recognition (NER) task. We recruited 9 users to annotate a clinical corpus and recorded the annotation time for each sample. We then defined a set of factors that we hypothesized might affect annotation time and used them as predictors in a linear regression model of annotation time. The model achieved an R² of 0.611 and revealed eight time-associated factors, including characteristics of sentences, individual users, and annotation order. These findings have implications for the practice of annotation and for the development of cost models in active learning research. PMID: 30815201 [PubMed - indexed for MEDLINE]
Source: AMIA Annual Symposium Proceedings
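
To make the modeling step in the abstract concrete, the following is a minimal sketch of fitting an ordinary least squares regression to annotation times and computing R². It is not the authors' code or data: the three predictors (sentence_length, entity_count, annotation_order), the synthetic timings, and the helper name fit_ols are all assumptions made for illustration. The abstract names only broad factor groups (sentence characteristics, individual users, annotation order).

```python
import numpy as np


def fit_ols(features, times):
    """Fit ordinary least squares with an intercept term.

    Returns (coefficients, r_squared), where coefficients[0] is the intercept.
    """
    # Prepend a column of ones so the model learns an intercept.
    design = np.column_stack([np.ones(len(features)), features])
    coefficients, *_ = np.linalg.lstsq(design, times, rcond=None)
    predicted = design @ coefficients
    residual_ss = np.sum((times - predicted) ** 2)
    total_ss = np.sum((times - times.mean()) ** 2)
    return coefficients, 1.0 - residual_ss / total_ss


# Synthetic, hypothetical data: none of these values come from the study.
rng = np.random.default_rng(0)
n_samples = 200
sentence_length = rng.integers(5, 40, size=n_samples)   # tokens per sentence
entity_count = rng.integers(0, 6, size=n_samples)       # entities per sentence
annotation_order = np.arange(n_samples)                 # position in the session

features = np.column_stack([sentence_length, entity_count, annotation_order])
times = (
    2.0
    + 0.5 * sentence_length
    + 3.0 * entity_count
    - 0.01 * annotation_order
    + rng.normal(0.0, 2.0, size=n_samples)               # measurement noise
)

coefficients, r_squared = fit_ols(features, times)
print("coefficients:", np.round(coefficients, 3))
print("R^2:", round(float(r_squared), 3))
```

In a sketch like this, a categorical factor such as the individual user would typically be one-hot encoded into additional columns of the feature matrix before fitting. How the study encoded its factors is not stated in the abstract.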