Using web search query data to monitor dengue epidemics: a new model for neglected tropical disease surveillance

Emily H Chan et al. PLoS Negl Trop Dis. 2011 May.

Abstract

Background: A variety of obstacles including bureaucracy and lack of resources have interfered with timely detection and reporting of dengue cases in many endemic countries. Surveillance efforts have turned to modern data sources, such as Internet search queries, which have been shown to be effective for monitoring influenza-like illnesses. However, few have evaluated the utility of web search query data for other diseases, especially those of high morbidity and mortality or where a vaccine may not exist. In this study, we aimed to assess whether web search queries are a viable data source for the early detection and monitoring of dengue epidemics.

Methodology/principal findings: Bolivia, Brazil, India, Indonesia and Singapore were chosen for analysis based on available data and adequate search volume. For each country, a univariate linear model was then built by fitting a time series of the fraction of Google search query volume for specific dengue-related queries from that country against a time series of official dengue case counts for a time-frame within 2003-2010. The specific combination of queries used was chosen to maximize model fit. Spurious spikes in the data were also removed prior to model fitting. The final models, fit using a training subset of the data, were cross-validated against both the overall dataset and a holdout subset of the data. All models were found to fit the data quite well, with validation correlations ranging from 0.82 to 0.99.
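The fitting and validation procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the data are synthetic, the function names (`fit_univariate_linear`, `predict`, `pearson_r`) are hypothetical, and the split of two training years followed by one holdout year is an assumption chosen for the example. Query selection and spike removal are not shown.

```python
import numpy as np

def fit_univariate_linear(query_fraction, case_counts):
    """Least-squares fit of case_counts ~ intercept + slope * query_fraction."""
    slope, intercept = np.polyfit(query_fraction, case_counts, deg=1)
    return slope, intercept

def predict(query_fraction, slope, intercept):
    """Model-fitted case counts for a series of query fractions."""
    return intercept + slope * np.asarray(query_fraction, dtype=float)

def pearson_r(a, b):
    """Pearson correlation between two equal-length series."""
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic weekly data for illustration only (NOT from the study).
rng = np.random.default_rng(0)
weeks = np.arange(156)                                   # three "years"
activity = 200 + 150 * np.sin(2 * np.pi * weeks / 52) ** 2
case_counts = activity + rng.normal(0, 10, weeks.size)   # official counts
query_fraction = 1e-6 * activity + rng.normal(0, 1e-5, weeks.size)

# Assumed split: fit on the first two years, hold out the last one.
train = slice(0, 104)
holdout = slice(104, 156)

slope, intercept = fit_univariate_linear(query_fraction[train], case_counts[train])
fitted = predict(query_fraction, slope, intercept)

r_holdout = pearson_r(fitted[holdout], case_counts[holdout])
r_overall = pearson_r(fitted, case_counts)
print(f"holdout r = {r_holdout:.2f}, overall r = {r_overall:.2f}")
```

Because the model has a single predictor, the correlation between fitted and official counts equals the correlation between the raw query fraction and official counts whenever the fitted slope is positive; the fit only rescales the search signal into case-count units.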

Conclusions/significance: Web search query data were found to be capable of tracking dengue activity in Bolivia, Brazil, India, Indonesia and Singapore. Whereas traditional dengue data from official sources are often not available until after a substantial delay, web search query data are available in near real-time. These data represent a valuable complement to traditional dengue surveillance.

Conflict of interest statement

This study was supported by funding from Google Inc., and two of the authors (VS, CC) are employees of Google Inc.

Figures

Figure 1. Comparison of the model-fitted and official case-count epidemic curves for dengue in each country.

The model-fitted epidemic curve as compared to the official case counts epidemic curve for dengue in each of the five countries for which a model built on Google search volume data was developed. Bolivia and Singapore are shown at a weekly resolution, the others on a monthly resolution. The activity index is a scaled measure of the case counts, representing the relative amount of dengue activity in each country on a scale from 0 to 100. Shaded regions indicate the season held out for testing the final models.
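The caption does not say how the activity index is scaled. One plausible reading, shown here purely as an assumption, is normalization by the peak value so that the largest count in a series maps to 100. The function name `activity_index` and the sample counts are hypothetical.

```python
import numpy as np

def activity_index(case_counts):
    """Scale a case-count series to 0-100 relative to its peak.

    Assumption: peak normalization. The source only states that the
    index is a scaled measure on a 0-100 scale.
    """
    counts = np.asarray(case_counts, dtype=float)
    peak = counts.max()
    if peak == 0:
        return np.zeros_like(counts)   # no activity at all
    return 100.0 * counts / peak

# Made-up counts for illustration.
index = activity_index([0, 25, 50, 200, 100])
print(index)
```

A min-max rescaling would be another way to reach a 0 to 100 range; it differs from the version above only when the smallest count in the series is nonzero.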
