Predicting the spread of influenza epidemics by analyzing Twitter messages

In this study, we predicted influenza-like illness (ILI) activity from social media data derived from Twitter. Tweet counts and patient counts are not always linearly correlated; therefore, we employed a range of methods, including nonlinear ones: autoregressive with exogenous inputs (ARX), autoregressive-moving-average with exogenous inputs (ARMAX), nonlinear autoregressive exogenous (NARX), deep multilayer perceptron (DeepMLP), and a convolutional neural network (CNN). We introduced two new features that significantly reduced the prediction errors: the product of the tweets and Centers for Disease Control and Prevention (CDC) data, and the product of the tweets and Google data. Furthermore, we introduced a new entropy-based method that decreased both the errors and the time complexity. Among the available methods and features, the best results were obtained with the newly developed features in the deep neural network methods and with the entropy-based method, which decreased the mean average error by up to 25%. Applying these methods to the Twitter datasets from 2009–2010 and 2011–2014 revealed that an ILI outbreak can be predicted 2–4 weeks earlier than by the CDC.
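The product feature and the ARX model named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_arx_design`, `fit_arx`, `predict_next`), the lag order, and the ordinary least-squares fit are all assumptions chosen for illustration, and the demo series are synthetic.

```python
import numpy as np


def build_arx_design(ili, tweets, lags=2):
    """Build an ARX design matrix from weekly ILI and tweet series.

    Each row holds an intercept, the previous `lags` ILI values, the
    previous `lags` tweet counts, and the previous `lags` values of the
    tweets x ILI product (an assumed form of the product feature).
    """
    ili = np.asarray(ili, dtype=float)
    tweets = np.asarray(tweets, dtype=float)
    product = ili * tweets  # assumed product feature: tweets x CDC ILI
    rows, targets = [], []
    for t in range(lags, len(ili)):
        past = slice(t - lags, t)
        rows.append(
            np.concatenate(([1.0], ili[past], tweets[past], product[past]))
        )
        targets.append(ili[t])
    return np.array(rows), np.array(targets)


def fit_arx(ili, tweets, lags=2):
    """Estimate ARX coefficients by ordinary least squares."""
    X, y = build_arx_design(ili, tweets, lags)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict_next(coef, ili, tweets, lags=2):
    """One-step-ahead ILI prediction from the most recent `lags` weeks."""
    ili = np.asarray(ili, dtype=float)
    tweets = np.asarray(tweets, dtype=float)
    product = ili * tweets
    x = np.concatenate(([1.0], ili[-lags:], tweets[-lags:], product[-lags:]))
    return float(x @ coef)


if __name__ == "__main__":
    # Synthetic series for demonstration only, not data from the study.
    rng = np.random.default_rng(0)
    weeks = np.arange(60)
    ili = 2.0 + np.sin(weeks / 8.0) + 0.05 * rng.standard_normal(60)
    tweets = 100.0 * ili + 5.0 * rng.standard_normal(60)
    coef = fit_arx(ili, tweets, lags=2)
    print(predict_next(coef, ili, tweets, lags=2))
```

A least-squares fit is only one way to estimate such a model; the neural network variants in the study would replace the linear map with a learned nonlinear one over the same kind of lagged inputs.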
Source: Health and Technology - Category: Information Technology - Source Type: research