A recently published study, “Twitter-derived neighborhood characteristics associated with obesity and diabetes,” analyzed 422,094 tweets sent over one year from the state of Utah and gave a good demonstration of how tweets can predict the population-level prevalence of obesity and diabetes.
That’s a mouthful, so let me rephrase.
Roughly: by using machine learning and word identification to find “happy” tweets, physical activity tweets, and tweets about low-calorie-density foods, the researchers could identify the healthier zip codes, those with lower rates of obesity and diabetes (as measured by community medical records).
“Happy” tweets were identified with MALLET (MAchine Learning for LanguagE Toolkit), a software program that analyzes text sentiment based on pre-identified happy tweets. It basically rates each tweet on a happy scale from 0 to 1!
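To make the idea concrete, here is a minimal sketch of the general pipeline: score each tweet between 0 and 1, then average the scores per zip code. This is not the study's method. The `happy_score` function is a toy word-count stand-in for a trained MALLET classifier, and the word list, function names, and sample tweets are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical word list. The study used a classifier trained on
# pre-labeled happy tweets, not a fixed vocabulary like this one.
HAPPY_WORDS = {"happy", "great", "love", "fun", "awesome"}


def happy_score(text: str) -> float:
    """Toy stand-in for a sentiment model.

    Returns the fraction of words found in HAPPY_WORDS, a value in [0, 1].
    """
    words = [w.strip(".,!?#").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in HAPPY_WORDS for w in words) / len(words)


def mean_score_by_zip(tweets):
    """Average the happy score of tweets grouped by zip code.

    tweets: iterable of (zip_code, text) pairs.
    Returns a dict mapping zip_code to its mean score.
    """
    scores = defaultdict(list)
    for zip_code, text in tweets:
        scores[zip_code].append(happy_score(text))
    return {z: mean(s) for z, s in scores.items()}


# Invented sample data, for illustration only.
sample = [
    ("84101", "Love this great hike!"),
    ("84101", "Stuck in traffic again"),
    ("84604", "Awesome fun run today"),
]

print(mean_score_by_zip(sample))
```

In a real analysis the per-zip averages would then be compared against obesity and diabetes rates from health records, which is the step where the study's findings come from.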
There are quite a few limitations in this study, so I wouldn’t be using Zillow and Twitter to buy a new house in a zip code in order to lose weight just yet, but this does raise some interesting questions.
First, can we use big data gathered from social media to develop proxy metrics for public health issues?
Second, would it be possible to reverse engineer this (as in Facebook’s infamous emotional contagion study, “Experimental evidence of massive-scale emotional contagion through social networks”) and effect public health change via social media?
Imagine that: Twitter as a public weight loss tool!
Time will tell. This study is worth a read.