How can open data help detect disease outbreaks?

The outbreak of disease is often a latent but hard-hitting global concern, as currently exemplified by the flurry of anxiety swirling around Ebola. Many efforts have sprung up to fight the spread of the disease, including a recent $400 million commitment from the World Bank. While the growing human cost of the disease is distressing enough, the World Bank has estimated that, if Ebola is not contained, it could cost the West Africa region an additional $32.6 billion.
So what can open data do in the health sector?
To find out, we recently spoke with Clark Freifeld, co-founder of HealthMap.org, a web-based tool that has been in the news for detecting Ebola nine days before the World Health Organization officially announced it. Started in 2006 at Boston Children’s Hospital, HealthMap uses public health, media, and open data sources to provide real-time information and alerts about disease outbreaks, which users can filter by a number of criteria, including location.
In 2006, co-founders Clark Freifeld (developer) and John Brownstein (epidemiologist) were working at Boston Children’s Hospital. They wanted to capture value from what was then the untapped explosion of information on the web about disease outbreaks. However, the problem was that this information was not well-organized and not useful for real-time decision making or analysis.
HealthMap uses a web crawler that searches the web 24 hours a day, pulling in data from hundreds of thousands of relevant publicly available sources, including news media, health groups, and government agencies. HealthMap then applies filtering and text-processing algorithms before making the data available, offering the public a global view of ongoing disease outbreak activity in 15 languages.
While there was initially skepticism in the health community about relying on informal sources, the HealthMap approach has become an important part of public health surveillance. Government agencies (including the CDC, HHS, and USAID) now take a structured feed from HealthMap for integration with their own data feeds. The CDC has also collaborated with HealthMap on an interactive Dengue map, which combines the CDC’s official risk maps and HealthMap’s real-time outbreak data.
A good mix of public health officials and the general public regularly access the site, many of them trying to understand what is happening around them. But there are challenges. As with all data approaches that mine content in the form of free text, there is a lot of noise. Automated algorithms are created to process and account for this noise, but it is hard to get a true sense of what is happening from an unfiltered raw feed.
Beyond sheer volume, this type of intake poses granularity and linguistic challenges. News articles sometimes contain no location information, making it difficult to pinpoint where an outbreak is occurring. Searches for disease-related terms also produce a lot of noise, typically turning up scientific findings, vaccination campaigns, and linguistic variations that don’t pertain to disease outbreaks. For example: Justin Bieber Fever.
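To make the noise problem concrete, here is a minimal Python sketch of the kind of keyword filter that would run into these issues. It is not HealthMap's actual algorithm, which the article does not describe in detail; the term lists, the `classify` function, and its labels are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical term lists, chosen only for illustration.
DISEASE_TERMS = {"ebola", "dengue", "cholera", "fever"}
NOISE_PHRASES = ("bieber fever", "vaccination campaign", "clinical trial")
KNOWN_PLACES = ("guinea", "liberia", "sierra leone")


def classify(headline):
    """Label a headline as 'irrelevant', 'noise', 'no_location', or 'candidate'."""
    text = headline.lower()
    words = set(re.findall(r"[a-z]+", text))

    # No disease-related term at all: not of interest.
    if not words & DISEASE_TERMS:
        return "irrelevant"

    # A disease term appears, but inside a phrase known to be unrelated
    # to an actual outbreak.
    if any(phrase in text for phrase in NOISE_PHRASES):
        return "noise"

    # Looks like an outbreak report, but no known place name is mentioned,
    # so it cannot be put on a map.
    if not any(place in text for place in KNOWN_PLACES):
        return "no_location"

    return "candidate"


if __name__ == "__main__":
    for headline in [
        "Ebola cases rise in Guinea",
        "Justin Bieber Fever sweeps the nation",
        "Cholera outbreak reported",
    ]:
        print(headline, "->", classify(headline))
```

Even this toy version shows why hand-written lists fall short: every new slang phrase or unlisted place name requires another entry, which is presumably why the team relies on continually refined algorithms rather than fixed rules.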
Beyond refining and improving existing text-mashing and filtering algorithms to reduce noise and to cast a wider net, the team has a number of areas set for future experimentation and development. The adoption of mobile devices has helped HealthMap take its next step: mobile technology has allowed for spin-off projects that explore direct reports from the field as an additional data source and a method for validation. Further refining algorithms to make better sense of social media data is another key area of exploration in the search for more valuable signals. There are also plans to increase the number of languages covered.
This post first appeared on The World Bank Blog
Author: Samuel Lee is a member of the World Bank Finances team, a Finance Complex project that aims to make financial data about the World Bank’s activities readily available, re-useable, and useful to the public-at-large and various stakeholder groups.
Image: A health inspection and quarantine researcher demonstrates to customs policemen the symptoms of Ebola, at a laboratory at an airport in Qingdao, Shandong province August 11, 2014. REUTERS.
License and Republishing
World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.
The views expressed in this article are those of the author alone and not the World Economic Forum.