How Google Search Data Can Predict COVID-19 Outbreaks


While some of the behaviors that lead to SARS-CoV-2 infections are clear, new waves of COVID-19 cases do not always follow predicted patterns.

Now, however, a study from researchers at New York University’s Courant Institute of Mathematical Sciences describes a possible means of spotting infection surges before they happen through the analysis of online searches.

The researchers discovered a correlation between a surge in searches relating to activities outside the home — activities that could put people at risk of SARS-CoV-2 infection — and a rise in COVID-19 cases 10–14 days afterward. Infections fell when there was an increase in searches relating to stay-at-home activities.
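To illustrate what a lagged relationship of this kind looks like in practice, here is a minimal sketch that correlates a search-volume series with a case-count series shifted by a given number of days. It is not the study's method: the function names, the use of Pearson correlation, and the synthetic data (built with a 12-day delay on purpose) are all assumptions made for the example.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def lagged_correlation(search, cases, lag):
    """Correlate search volume on day t with case counts on day t + lag."""
    if len(search) != len(cases):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(search) - 1:
        raise ValueError("lag out of range")
    return pearson(search[: len(search) - lag], cases[lag:])

def best_lag(search, cases, lags):
    """Return the candidate lag with the highest correlation."""
    return max(lags, key=lambda k: lagged_correlation(search, cases, k))

# Synthetic data (an assumption, not real search or case data):
# cases simply repeat the search curve 12 days later.
DELAY = 12
search = [50 + 20 * math.sin(t / 4.0) for t in range(60)]
cases = [search[0]] * DELAY + search[: len(search) - DELAY]

print(best_lag(search, cases, range(0, 21)))
```

On real data the peak would be far less sharp, and a correlation at some lag would not by itself show that searches predict cases.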

Study author Anasse Bari, a clinical assistant professor at the Courant Institute, notes that experts have already successfully used data mining “in finance to generate data-driven investments, such as studying satellite images of cars in parking lots to predict businesses’ earnings.”

“Our research shows the same techniques could be applied to combatting a pandemic by spotting, ahead of time, where outbreaks are likely to occur,” says senior author Megan Coffee of the Division of Infectious Disease & Immunology at the New York University (NYU) Grossman School of Medicine.

Identifying with greater precision those behaviors that produce infection spikes can help epidemiologists and policymakers more effectively shape public policies regarding closures, lockdowns, and so on.

The system that the study paper describes avoids privacy issues by using only large clusters of anonymized data.

The study appears in Social Network Analysis and Mining.

Mobility vs. isolation

The researchers’ first step was to develop categories based on search phrases or keywords that they could then track.

They tracked two key categories: the mobility index and the isolation index.

The team assigned certain searches to the mobility index track, including “theaters near me,” “flight tickets,” and other inquiries about activities that involve leaving the home and being in physical proximity with others.

As Bari puts it, “When someone searches the closing time of a local bar or looks up directions to a local gym, they give some insight into what future risks they may have.”

For the isolation index track, the researchers collected search queries — such as “at-home yoga” or “food delivery” — that indicated an intention to remain home and isolated.
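The two indices can be pictured as daily totals of search volume per category. The sketch below is a hypothetical illustration, not the researchers' implementation: the keyword sets contain only the four example phrases quoted above, and the record layout `(day, query, volume)`, the exact-match rule, and the function name `build_indices` are assumptions.

```python
from collections import defaultdict

# Hypothetical keyword lists; only these four phrases come from the article.
MOBILITY_TERMS = {"theaters near me", "flight tickets"}
ISOLATION_TERMS = {"at-home yoga", "food delivery"}

def build_indices(records):
    """Sum search volume per day for each category.

    records: iterable of (day, query, volume) tuples.
    Returns {day: {"mobility": total, "isolation": total}}.
    Queries that match neither list are ignored.
    """
    indices = defaultdict(lambda: {"mobility": 0.0, "isolation": 0.0})
    for day, query, volume in records:
        normalized = query.strip().lower()
        if normalized in MOBILITY_TERMS:
            indices[day]["mobility"] += volume
        elif normalized in ISOLATION_TERMS:
            indices[day]["isolation"] += volume
    return dict(indices)

# Made-up sample records for demonstration only.
sample = [
    ("2020-06-01", "Flight tickets", 40),
    ("2020-06-01", "food delivery", 70),
    ("2020-06-01", "weather", 99),
    ("2020-06-02", "theaters near me", 55),
]
result = build_indices(sample)
print(result)
```

A real pipeline would likely need fuzzier matching and normalization of volumes across regions, which this sketch leaves out.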

The researchers based their categorization of keywords on the Democracy Fund + UCLA Nationscape survey — a study in which respondents listed the things that they would be doing if “restrictions were lifted on the advice of public health officials regarding activities.”

The survey found that the top three activities that people missed were “going to a stadium/concert,” “going to the movies,” and “attending a sports event.”

According to Bari, “This is a first step toward building a tool that can help predict COVID-19 case surges by capturing higher risk activities and intended mobility, which searches for gyms and in-person dining can illuminate.”