Title: Improved Real-Time Disease Surveillance: Leveraging Google Search Data to Monitor Influenza and Measles Outbreaks in Madagascar and the United States
Authors: Roszkowska, Natalia
Advisors: Grenfell, Bryan
Department: Ecology and Evolutionary Biology
Certificate Program: Global Health and Health Policy Program
Class Year: 2020
Abstract: Disease surveillance is essential for providing timely information and for detecting new health problems and epidemics. Often, however, this surveillance is nonexistent, heavily underreported, or substantially delayed. Given these drawbacks, recent years have seen a strong push to develop online tools that can provide data in real time and be applied in places without a strong traditional surveillance system. Our novel research tests the supplemental application of digital data in both a low-income and a high-income setting. We use AutoRegression with Google search data (ARGO), which combines disease-related Google searches with historical case trends, to track disease activity in near real time. We follow seasonal influenza in the United States, an example of a high-income country, and non-seasonal influenza and measles in Madagascar, a low-income nation. To analyze the value of Google search information for disease surveillance, we compare the predictive performance of ARGO against models that use only Google Trends data or only historical data. We also test ARGO’s usability during an ongoing disease outbreak and its robustness to data corruption by underreporting. In our analysis, ARGO outperforms all other models for influenza in both the United States and Madagascar. For measles, ARGO also outperforms all other models on two of the three metrics, demonstrating its ability to work on different diseases. The model performs well when faced with a distracting plague outbreak and with underreported cases. We obtain promising results for the use of online tools for disease surveillance in both countries. Although ARGO is limited by internet coverage, the model performs well even in low-income nations, especially if stronger historical data are available. Increasing internet coverage or improving historical data will enhance the model’s functionality.
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Ecology and Evolutionary Biology, 1992-2022
Global Health and Health Policy Program, 2017-2022

Files in This Item:
File: ROSZKOWSKA-NATALIA-THESIS.pdf
Size: 814.64 kB
Format: Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.