In recent years, annual and weekly case counts of syphilis in the U.S. have shot upwards. The growing frequency of this sexually transmitted disease (STD) has driven medical authorities to look for effective yet cheap ways to keep an eye on its spread.
Search engines and social media are relatively new technologies, yet they are already in widespread use. People who are worried about contracting syphilis are likely to go online to look up information on sexual health and their risk of the disease.
Social media users have also shown an increasing tendency to talk about sexual health and risky behaviors. Many of these discussions may be started by the very people who are at risk of STDs.
Researchers from several California universities have considered the use of search engine histories and social media as possible tools to monitor public health and conduct research. The Centers for Disease Control and Prevention (CDC) guided their efforts in two separate studies.
In the study concerning search engine results, the researchers used CDC data to find out how many weekly cases of syphilis were reported in each state from 2012 to 2014. Next, they looked up 25 keywords related to the risk of syphilis before mining the weekly online search query data from Google Trends that involved those keywords.
They ended up with 155 weeks' worth of data involving syphilis-related online queries. After accounting for a one-week lag, the researchers used the Google Trends data during the first 10 weeks of each year to train several different "models."
Each model used a different set of keyword predictors for each year. The researchers validated the 2012 and 2013 models over 52 weeks each, while the 2014 model was validated over 42 weeks.
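The train-then-validate scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say what kind of model the researchers fitted, so a plain least-squares line stands in for it, a single search-volume series stands in for the 25 keywords, and the data are synthetic. The names `lagged_split` and `fit_line` are hypothetical helpers, not anything from the studies.

```python
import numpy as np


def lagged_split(search_volume, cases, lag=1, train_weeks=10):
    """Pair each week's cases with search volume from `lag` weeks earlier,
    then split the aligned pairs into training and validation periods."""
    x = np.asarray(search_volume, dtype=float)
    y = np.asarray(cases, dtype=float)
    if x.shape != y.shape:
        raise ValueError("search_volume and cases must have the same length")
    if lag < 1 or lag >= len(x):
        raise ValueError("lag must be at least 1 and shorter than the series")
    x_lagged = x[:-lag]   # searches in week t
    y_aligned = y[lag:]   # cases reported in week t + lag
    return (x_lagged[:train_weeks], y_aligned[:train_weeks],
            x_lagged[train_weeks:], y_aligned[train_weeks:])


def fit_line(x, y):
    """Least-squares straight line (an assumed stand-in for the real model)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept


# Synthetic one-year example: cases follow searches by exactly one week.
weeks = 52
rng = np.random.default_rng(0)
searches = rng.uniform(20, 80, size=weeks)
cases = np.empty(weeks)
cases[0] = 10.0                          # no earlier search data for week 0
cases[1:] = 0.5 * searches[:-1] + 3.0

x_tr, y_tr, x_val, y_val = lagged_split(searches, cases)
slope, intercept = fit_line(x_tr, y_tr)

# Validate on the weeks that were not used for training.
predicted = slope * x_val + intercept
mean_abs_error = float(np.mean(np.abs(predicted - y_val)))
```

Because the synthetic cases are an exact linear function of the previous week's searches, the validation error here is essentially zero. Real surveillance data would be far noisier, which is why holding out the later weeks for validation matters.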
According to their reports, the models made correct predictions for 144 weeks of primary and secondary syphilis cases in all 50 states. The researchers concluded that Google Trends weekly search data was a viable means of predicting future syphilis rates in a state over the next few weeks.
The second study looked into the possibility of using social media data from 2012 to predict syphilis cases in 2013. For this experiment, the researchers turned to the CDC again for data on primary, secondary, and early latent syphilis cases in 2012 and 2013. This time, the data came from the county level instead of the state.
The researchers also mined more than 8,500 geo-located tweets in the U.S. These Twitter posts were selected because they included keywords connected to sexual risk. Vernacular terms for sex were included as keywords.
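A keyword filter of this kind might look like the following minimal Python sketch. The keyword list, the tweet structure (a dict with `text` and `county` fields), and the function names are all assumptions made for illustration; the article does not list the actual keywords or describe how the tweets were stored.

```python
import re

# Hypothetical keyword list; the study's real keywords are not given here.
RISK_KEYWORDS = ["syphilis", "std", "unprotected sex"]


def build_pattern(keywords):
    """Compile one case-insensitive pattern that matches any whole keyword."""
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def filter_tweets(tweets, keywords):
    """Keep tweets that have a county and mention at least one keyword."""
    pattern = build_pattern(keywords)
    return [t for t in tweets
            if t.get("county") and pattern.search(t["text"])]


sample = [
    {"text": "Got tested for syphilis today", "county": "Los Angeles"},
    {"text": "Great weather this weekend", "county": "Alameda"},
    {"text": "Worried about an STD", "county": None},  # no location, dropped
]
matches = filter_tweets(sample, RISK_KEYWORDS)
```

In this sample only the first tweet survives: the second mentions no keyword, and the third has no county attached.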
They looked at the correlation between syphilis-related tweets and reports of actual cases of the STD by county. The researchers reported that they found a strong link between syphilis-related tweets and all three case types. The more people tweeted about the STD, the more cases there were.
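To make the county-level comparison concrete, here is a small Python sketch that counts tweets per county and correlates those counts with reported cases. It assumes a Pearson correlation coefficient, which the article does not confirm as the researchers' statistic, and the county names and numbers are invented for the example.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented data: the county of each filtered tweet, and cases per county.
tweet_counties = ["A", "A", "A", "B", "B", "C"]
cases_by_county = {"A": 30, "B": 20, "C": 10}

tweets_by_county = Counter(tweet_counties)
counties = sorted(cases_by_county)
r = pearson([tweets_by_county[c] for c in counties],
            [cases_by_county[c] for c in counties])
```

A value of `r` close to 1 means counties with more risk-related tweets also report more cases. A strong correlation on its own does not show that tweets predict future cases, which is why the study paired 2012 tweets with 2013 case counts.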
As in their online query study, the California researchers made the case that social media can serve as an accurate and affordable way to follow and predict the spread of syphilis.