Using Machine Learning to Identify Health Outcomes from Electronic Health Record Data

AbstractPurpose of ReviewElectronic health records (EHRs) contain valuable data for identifying health outcomes, but these data also present numerous challenges when creating computable phenotyping algorithms. Machine learning methods could help with some of these challenges. In this review, we discuss four common scenarios that researchers may find helpful for thinking critically about when and for what tasks machine learning may be used to identify health outcomes from EHR data.Recent FindingsWe first consider the conditions in which machine learning may be especially useful with respect to two dimensions of a health outcome: (1) the characteristics of its diagnostic criteria and (2) the format in which its diagnostic data are usually stored within EHR systems. In the first dimension, we propose that for health outcomes with diagnostic criteria involving many clinical factors, vague definitions, or subjective interpretations, machine learning may be useful for modeling the complex diagnostic decision-making process from a vector of clinical inputs to identify individuals with the health outcome. In the second dimension, we propose that for health outcomes where diagnostic information is largely stored in unstructured formats such as free text or images, machine learning may be useful for extracting and structuring this information as part of a natural language processing system or an image recognition task. We then consider these two dimensions jointly to define four common...
Source: Current Epidemiology Reports - Category: Epidemiology Source Type: research