How big data can crack development problems
How do we contain fast-moving diseases like Ebola and H1N1, the so-called swine flu? How can we make sure refugees living in shelters after natural disasters are healthy and protected from abuse? Are there ways to better predict and prevent anything from unemployment to unrest to civil war?
Big data analytics for development has the power to solve these and other seemingly intractable problems. But for analysis to produce insight, we need data to feed the models. In other words, data is the fuel that powers the algorithms.
Fortunately, the supply is there: every day we all generate enough data to help answer these questions and more. But there are justifiable concerns about how to protect privacy when using such data. For instance, if an NGO wants to look at the movements of populations to better understand how to prevent disease, how can it obtain geolocation data and use it in a way that safeguards individual privacy?
The answer may lie in the portability of data models. The old way was to turn over the data to governments and NGOs for analysis, but large data sets are unwieldy, and the more you transport them, the greater the risk to privacy and the greater the possibility of abuse. A more modern approach is to “flip the script” by taking the model to the data.
Once an analytical model is developed, it makes sense to export the model instead of importing the data. That way, the data can be analysed in the safe locations where it already resides – in essence, giving the data owner the keys to unlock the insights residing in their data. I call this a model lab. This model-driven approach to big data analytics has the potential to bring greater efficiency when answering known questions and testing hypotheses. Hub-based model labs are a complementary approach to the field-based data labs that help uncover the questions we didn’t know to ask.
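To make the idea concrete, here is a minimal sketch in Python of what exporting a model, rather than importing the data, might look like. It assumes a scikit-learn workflow; the features, file name and prediction task are illustrative assumptions, not any particular organization’s pipeline.

```python
# A minimal sketch of "taking the model to the data": the hub trains and
# exports a model; the data owner scores their own records locally.
# All names (features, file path, task) are illustrative assumptions.
import numpy as np
from joblib import dump, load
from sklearn.linear_model import LogisticRegression

# --- Hub side (the "model lab"): train on non-sensitive or synthetic data ---
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))            # e.g. mobility-style features
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X_train, y_train)
dump(model, "displacement_model.joblib")       # export the model, not the data

# --- Data-owner side: the sensitive records never leave the premises ---
local_model = load("displacement_model.joblib")
X_local = rng.normal(size=(100, 3))            # stands in for private data
scores = local_model.predict_proba(X_local)[:, 1]
print(f"mean predicted displacement risk: {scores.mean():.2f}")
```

The point is the direction of travel: the model file crosses the organizational boundary, while the sensitive records never do.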
Lessons from a super typhoon
The analytics industry is already making strides towards this goal. As I mentioned in my last blogpost, typhoon recovery efforts are a good example. When Super Typhoon Haiyan ripped through the Philippines in November 2013, thousands died and hundreds of thousands were left homeless.
To coordinate basic services, the International Organization for Migration (IOM) needed data on the movements of citizens. Within hours of bringing the data and the technology together, the IOM was able to visualize the data and surface questions that had not previously been asked. The answers weren’t always intuitive. For instance, the IOM learned that one of the worst-hit cities didn’t need food or water so much as it needed diesel, to help run hospital generators and to assist people in evacuating the island.
The IOM case shows how analytics can greatly reduce time to insight, but more can be done. Rather than bringing the data to the technology, especially when the data is big or sensitive and not easy to move, let’s simply give the models to our partners and avoid delays in getting answers. This would also broaden the engagement of non-traditional partners, such as telecoms providers, which can run models on their data while it remains in place and anonymous. When organizations like the IOM have predictive models, they can more quickly optimize evacuation routes or take steps to safeguard women’s security in shelters, and they can do it before a problem arises. As this year’s Typhoon Hagupit proves, there will be no end to the need.
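Continuing the sketch above, here is one illustration, again with assumed names and thresholds, of what running a model in place might look like for a data owner such as a telecoms provider: row-level records never leave the premises, and only aggregated counts, with small cells suppressed, are released.

```python
# A hedged sketch of "running the model in place": the data owner scores
# private records locally and releases only aggregates. The regions, the
# risk threshold and the minimum cell size are illustrative assumptions.
import numpy as np
from joblib import load

model = load("displacement_model.joblib")   # model file received from the hub

# Private, row-level features derived from the owner's own records
rng = np.random.default_rng(1)
regions = rng.choice(["Tacloban", "Ormoc", "Guiuan"], size=1000)
X_local = rng.normal(size=(1000, 3))        # stands in for sensitive data

risk = model.predict_proba(X_local)[:, 1]

# Only aggregated, anonymised counts cross the privacy boundary
MIN_CELL = 20                               # suppress small cells to protect individuals
for region in np.unique(regions):
    n_at_risk = int((risk[regions == region] > 0.5).sum())
    if n_at_risk >= MIN_CELL:
        print(f"{region}: ~{n_at_risk} subscribers predicted at high risk")
```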
Of course, exporting models is useful for more than just disaster relief. Whether it’s fighting Ebola or helping refugees displaced by war, any time the data is sensitive, it makes sense to move the model. As data scientists are learning, questions can be far more portable than the data needed to answer them. My company is working on just those types of arrangements, and I call on my colleagues throughout the industry to do the same. Whereas analytics used to be exclusively data-driven, increasingly it is model-driven: the question leads the way to the relevant data. And that data often resides outside conventional sources and inside proprietary, privacy-protected walls. That’s why it makes sense to focus on the model. Model-driven analytics enables us to ask, and crowdsource the answers to, the more complex multidisciplinary questions that will elevate the human condition.
Author: Mikael Hagstrom, Executive Vice-President, SAS Institute; Vice-Chair, Global Agenda Council on Data-Driven Development. To learn more about IOM’s use of analytics in disaster recovery, visit this website or download a PDF.
Image: A robotic tape library used for mass storage of digital data is pictured at the Konrad-Zuse Centre for applied mathematics and computer science (ZIB), in Berlin August 13, 2013. REUTERS/Thomas Peter