Applying random forest in a health administrative data context: a conceptual guide

We describe in detail how RF can be useful in health services research, provide guidance on data set up, modeling decisions and demonstrate how to interpret results. We al so highlight specific considerations for applying RF to health administrative data. In a working example, we compare RF with logistic regression, Ridge regression and LASSO in their ability to predict whether a person has a regular medical doctor. We use survey responses to “do you have a regular medical doctor” from three cycles of the Canadian Community Health Survey (2007, 2009, 2011). Responses are linked with physician claims’ data from 2002 to 2012. We limit our cohort to persons 40 years and older at the time of responding to the survey. We discuss the strengths and weaknesses of using RF in a health services research setting in comparison to using more conventional modeling techniques. Applying a RF model in a health services research setting can have advantages over conventional modeling approaches and we encourage health services researchers to add RF to their toolbox of predictive modeling methods.
Source: Health Services and Outcomes Research Methodology - Category: Statistics Source Type: research