Natural Language Processing for Enterprise-scale De-identification of Protected Health Information in Clinical Notes

AMIA Annu Symp Proc. 2022 May 23;2022:92-101. eCollection 2022.

ABSTRACT

Patient privacy is a major concern when allowing data sharing and the flow of health information. Hence, de-identification and anonymization techniques are used to protect patient health information while supporting secondary uses of data to advance the healthcare system and improve patient outcomes. Several de-identification tools have been developed for free text; this research, however, focuses on developing a note de-identification and adjudication framework that has been tested for i2b2 searches. The aim is to facilitate research on clinical notes without an additional HIPAA approval process or consent by a clinician or patient, especially for narrative free-text notes such as physician and nursing notes. In this paper, we build a scalable, accurate, and maintainable pipeline for note de-identification using natural language processing, with a REDCap database serving as the method of adjudication verification. The system is deployed at enterprise scale, where researchers can search and visualize over 45 million de-identified notes hosted in an i2b2 instance.

PMID: 35854742 | PMC: PMC9285160
Source: AMIA Annual Symposium Proceedings - Category: Bioinformatics - Source Type: research