Yelp Helps Identify Food-borne Infectious Disease Outbreaks
Researchers from Columbia University develop a surveillance system that uses Yelp reviews to identify and monitor food-borne illnesses.
*Updated on 01/16/2017 at 12:02 PM EST
Since its creation in 2004, the social media outlet Yelp has served as a platform for individuals to share their thoughts on local businesses with other users. Now, investigators from Columbia University’s department of computer science have found another way to use the app: to identify and track food-borne illnesses.
When it comes to reporting instances of food-borne illness, individuals appear to be shifting from reporting through health department complaint registration systems to using social media. In a recent study published in the Journal of the American Medical Informatics Association, the investigators discuss how they developed a system that is capable of tracking food-borne illnesses by using restaurant reviews posted on Yelp.
"Several years ago, while investigating an outbreak of gastrointestinal illness associated with a restaurant, DOHMH noted that patrons had reported illnesses on the business review website Yelp that had not been reported to the citywide information service, 311. To explore the potential of using Yelp to identify unreported outbreaks, DOHMH worked with Columbia University and Yelp on a pilot project to prospectively identify restaurant reviews on Yelp that referred to foodborne illness," paper author Thomas Effland, PhD student, Computer Science Department, Data Science Institute, at Columbia University, told Contagion®. Since 2012, the New York City Department of Health and Mental Hygiene (DOHMH) has been using the system to track food-borne illness in NYC restaurants, according to a recent press release.
For the system, the investigators built classifiers for 2 tasks: 1) to determine if a review indicates an individual is experiencing a food-borne illness and 2) to determine if a review indicates that several individuals are experiencing food-borne illnesses. To do this, the system looks at keywords such as “sick” and “multiple,” according to the press release. "First, all of the reviews which are flagged by the system are then passed on to DOHMH epidemiologists for manual review and further investigation, so we can use their feedback to measure the performance as the system is in use. For the reviews that do not get flagged, we sample 1,000 reviews and have epidemiologists label them and use the labels to measure performance, as is done for the flagged reviews. We then use a formula to combine the performance measurements from these two parts to get the overall estimates," Effland explained. This is referred to as "bias-adjustment" in the paper.
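The two-part evaluation Effland describes can be illustrated with a rough sketch. The code below is not the formula from the paper; it is a simple stratified estimate, under the assumption that the labeled sample of unflagged reviews is drawn at random and can be scaled up to the full unflagged pool. The function name, parameter names, and all counts are hypothetical.

```python
def bias_adjusted_metrics(flagged_total, flagged_confirmed,
                          unflagged_total, sample_size, sample_confirmed):
    """Combine measurements from flagged reviews and a labeled sample of
    unflagged reviews into overall precision and recall estimates.

    Assumption (not taken from the paper): the unflagged sample is random,
    so its confirmed-illness rate can be scaled to the whole unflagged pool.
    """
    if flagged_total <= 0 or sample_size <= 0:
        raise ValueError("flagged_total and sample_size must be positive")

    # Every flagged review is checked by hand, so precision is measured directly.
    precision = flagged_confirmed / flagged_total

    # Scale the sample's confirmed count up to the full unflagged pool to
    # estimate how many true illness reports the system missed.
    estimated_missed = unflagged_total * sample_confirmed / sample_size

    # Recall: confirmed flagged reports over all estimated true reports.
    estimated_true_total = flagged_confirmed + estimated_missed
    recall = (flagged_confirmed / estimated_true_total
              if estimated_true_total > 0 else 0.0)

    return precision, recall, estimated_missed


# Hypothetical counts for illustration only; not figures from the study.
precision, recall, missed = bias_adjusted_metrics(
    flagged_total=200, flagged_confirmed=150,
    unflagged_total=10_000, sample_size=1_000, sample_confirmed=5,
)
print(f"precision={precision:.2f} recall={recall:.2f} estimated missed={missed:.0f}")
```

The point of the adjustment is that measuring performance only on flagged reviews would say nothing about the illness reports the system never surfaced; the sampled unflagged reviews supply that missing piece.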
Since 2012, this system has helped identify 8,523 additional food poisoning complaints as well as 10 outbreaks of food-borne illness associated with NYC restaurants, Effland said.
The Centers for Disease Control and Prevention estimates that contaminated food is responsible for a staggering 48 million illnesses and over 3000 deaths in the United States each year, according to the study authors. The majority (68%) of the 1200 food-borne outbreaks that are reported and investigated nationally are restaurant-related; these numbers underscore the need for improved surveillance.
“Younger people are less likely to report food-borne illness via traditional channels,” according to the press release. “The popularity of online reviews and the incorporation of social media data into public health surveillance systems are, however, becoming more common.”
In fact, according to the study authors, 1 evaluation that compared the use of “informal and unconventional outbreak detection methods” with more traditional methods found that “the informal source was the first to report in 70% of outbreaks,” underscoring the usefulness of incorporating social media reviews into the surveillance system.
"We know that people may not be aware of how to report to the Health Department. Using social media data, such as Yelp and Twitter data, allows us to identify foodborne illness complaints and outbreaks that might not have been reported to us through 311 or healthcare providers," Effland explained.
The system was used in a pilot study that ran from July 1, 2012, to March 31, 2013, and found a total of 463 Yelp reviews that suggested food-borne illness had occurred; upon further investigation, the investigators found that only 3% of those illnesses had been reported to the DOHMH through the traditional complaint system.
Because of the successes illustrated in the pilot study, the researchers are already working to improve the system on 3 fronts, according to Effland. "(1) We are exploring the incorporation of other online and social media sources for surveillance by the system. For example, we have begun to incorporate Twitter data over the past year or so. (2) We are exploring potential collaborations with other cities to deploy the system. (3) We are investigating the use of cutting-edge 'deep learning' machine learning methods to further improve the accuracy of the system's classifications," he said.
The DOHMH will continue working with Columbia University to incorporate more social media data sources, such as Twitter, into the surveillance system.
"Twitter presents a challenge in that tweets are more terse than Yelp reviews and many do not indicate a particular restaurant. Additionally, it may be challenging to identify tweets associated with a particular geographic area. However, other city health departments have been successful in identifying foodborne illness complaints on Twitter indicating that it may useful for outbreak detection," Effland concluded.
Feature Picture Source: brennanMKE / flickr / Creative Commons.