Using electronic health records to develop and validate a machine-learning tool to predict type 2 diabetes outcomes: a study protocol

Introduction: Type 2 diabetes mellitus (T2DM) is a major cause of blindness, kidney failure, myocardial infarction, stroke and lower limb amputation. However, it is still not possible to accurately predict or identify which patients are at higher risk of deterioration. Most risk stratification tools do not account for novel factors such as sociodemographic determinants, self-management ability or access to healthcare. Additionally, most tools are derived from clinical trial data, with limited external generalisability.

Objective: The aim of this work is to design and validate a machine learning-based tool to identify patients with T2DM at high risk of clinical deterioration, based on a comprehensive set of patient-level characteristics retrieved from a linked population health dataset.

Sample and design: Retrospective cohort study of patients with a diagnosis of T2DM on 1 January 2015, with a 5-year follow-up. Anonymised electronic healthcare records from the Whole System Integrated Care (WSIC) database will be used.

Preliminary outcomes: Outcome variables of clinical deterioration will include retinopathy, chronic renal disease, myocardial infarction, stroke, peripheral arterial disease or death. Predictor variables will include sociodemographic and geographic data, patients’ ability to self-manage disease, clinical and metabolic parameters and healthcare service usage. Prognostic models will be defined using multidependence Bayesian networks. The derivation cohort, comprising 80% ...
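
The derivation split mentioned above can be illustrated with a minimal sketch. The abstract is truncated after "80%", so the sketch assumes that the remaining patients form a validation cohort and that the split is a simple random partition at patient level; neither detail is confirmed by the text here. The function name `split_cohort`, its parameters and the integer patient identifiers are hypothetical.

```python
import random


def split_cohort(patient_ids, derivation_fraction=0.8, seed=0):
    """Randomly partition unique patient IDs into derivation and validation sets.

    Assumption: a random patient-level split. The protocol may instead use a
    temporal or geographic split; the truncated abstract does not say.
    """
    # Deduplicate so each patient lands in exactly one cohort.
    unique_ids = sorted(set(patient_ids))
    # A fixed seed keeps the partition reproducible across runs.
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    cut = round(len(unique_ids) * derivation_fraction)
    return unique_ids[:cut], unique_ids[cut:]


# Hypothetical anonymised identifiers standing in for WSIC patient records.
derivation, validation = split_cohort(range(1000))
```

Splitting on patient identifiers rather than on individual records keeps all of a patient's records in a single cohort, so no patient contributes to both model fitting and validation.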
Source: BMJ Open - Category: General Medicine - Tags: Open access, Health informatics - Source Type: research