UN SUMMER ACADEMY 2013

Presentation at the UN Summer Academy, "Big Data for Development" clinic, June 15th 2013, 9.00 am - 3.00 pm.
by Emmanuel Letouzé on 18 June 2013

Transcript of UN SUMMER ACADEMY 2013

UN Summer Academy 2013
5 Reflections on Big Data for Development

2. Big Data for Development is not about the data
1. Automating Research Changes the Definition of Knowledge.
2. Claims to Objectivity and Accuracy are Misleading (Apophenia: seeing meaningful patterns where none exists)
3. Bigger Data are Not Always Better Data
4. Not All Data Are Equivalent
5. Just Because it is Accessible Doesn’t Make it Ethical
6. Limited Access to Big Data Creates New Digital Divides
Articles on Big Data are themselves a source of big data
Big Data is already affecting, to various degrees, business, journalism, official statistics, counter-terrorism and social science. Tomorrow, policy-making?
But many questions remain (more to come)
Question is: what is the novelty?
Source: Boyd, Danah and Crawford, Kate, Six Provocations for Big Data (September 21, 2011). http://dx.doi.org/10.2139/ssrn.1926431
6 provocations for Big Data
Key question:
How do you model human behavior?
"To be able to see the details of variations in the market and the beginnings of political revolutions, to predict them, and even control them, is definitely a case of Promethean fire. Big Data can be used for good or bad, but either way it brings us to interesting times. We're going to reinvent what it means to have a human society."
Alex (Sandy) Pentland, MIT, "Reinventing Society in the Wake of Big Data".
"Current attempts to build computer-based baseline of 'normal' human behaviour on the basis of Big Data may turn out to be similar to what happened with standardization of physical goods in the Industrial Revolution."
Patrick Wolfe, UCL
Problem: we are very good at predicting 10 of the next 3 revolutions
5. Big Data may revolutionize Development
40 min
3 sections:
1. Context & Promise
2. Challenges & Risks
3. Principles & Policies
==> we will need and find a better name
"Digital breadcrumbs" (Pentland)
"Digital behavioral data" (many, too vague)
"Ambient data" (many, too vague)
"Particle data"?
Emmanuel Letouzé
New York, June 15th, '13

3. Big Data for Development is no modern panacea

4. Big Data for Development is in its infancy

What is truly revolutionary here? Volume? Velocity? Variety? Other 'Vs'? NO. Not really.
"the dominant (...) model seems bogged down in a simplistic (...) framework, with the unnecessary and confusing "quality of children" notion clouding everyone's thinking. If it is to provide valid insights into human behaviour and guidance to policy, the economic theory of fertility must be substantially reformulated".
Warren C. Robinson on Gary Becker, 1997
==> Replace "quality of children" and "economic theory of fertility" with "Big Data"
This is the revolution: individual level behavioral data we never had before.
Movement of an individual in Rwanda over 4 years.
Source: Inferring Patterns of Internal Migration from Mobile Phone Call Records: Evidence from Rwanda J. Blumenstock, 2011
MAIN POINT IS:
FORGET ABOUT SIZE. STOP COUNTING. Size, velocity, etc. cloud everybody's thinking.
It is a qualitative change (these small, individual, behavioral and networked data) resulting IN a quantitative change, NOT resulting FROM one.
It's about the nature of the data. Never in history have we had these data. Understand what this means.

==> NEED TO SPECIFY WHICH FUNCTION IS SOUGHT--very different set ups and requirements
==> RELEVANCE AND ACCEPTABILITY of each (esp. predictive) is highly political
==> GREATER FOCUS needed on #3. Causality matters too.
Point is: LOTS happening and needed. Long way to go. Work in progress.
Additional thoughts on field:
It's relatively easy to predict what you know happened (cf. Bin Laden in the news)
It's harder to avoid false positives (we will typically predict 10 out of the next 3 revolutions): we over-predict, it's math (anomaly detection).
Democratic principles will hopefully get in the way of certain decisions being made on the basis of predictions.
This does not mean that predictive function is irrelevant--it is highly relevant.
There is just too much focus on it at the expense of the historical opportunity to improve our understanding of human ecosystems.
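The "it's math" remark about over-prediction can be illustrated with a rough base-rate sketch. All numbers below are hypothetical and chosen only for illustration; they are not from the talk or from any dataset. The function name `alarm_counts` is likewise an assumption. The point is that when real events are rare, even a small false-alarm rate on the many quiet cases produces more false alarms than true ones.

```python
def alarm_counts(n_events, n_quiet, sensitivity, false_positive_rate):
    """Expected true alarms, false alarms and precision of a rare-event detector."""
    # Real events that the detector correctly flags.
    true_alarms = n_events * sensitivity
    # Quiet cases that the detector flags by mistake.
    false_alarms = n_quiet * false_positive_rate
    total_alarms = true_alarms + false_alarms
    # Share of raised alarms that correspond to a real event.
    precision = true_alarms / total_alarms if total_alarms else 0.0
    return true_alarms, false_alarms, precision


# Hypothetical setting: 1,000 country-years, of which 3 see a revolution.
# The detector catches every real event but also flags 0.7% of quiet cases.
true_alarms, false_alarms, precision = alarm_counts(
    n_events=3, n_quiet=997, sensitivity=1.0, false_positive_rate=0.007
)
total = true_alarms + false_alarms
print(f"alarms raised: {total:.1f}, real events among them: {true_alarms:.0f}")
print(f"share of alarms that are real: {precision:.0%}")
```

Under these assumed numbers the detector raises roughly ten alarms for three real events, even though it is wrong about fewer than 1% of the quiet cases.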
Methodological work is the next frontier
Sample bias? Validity? Proxies? We need to develop tools and methods.
Bridging across communities--partnerships!
Medium term: "agile development"-feedback loops, more responsive policymaking
Think of other innovative applications and implications:
test economic theories with new data
change measurement of welfare--free data consumption
empowering communities--quantified self movement.
1. Big Data for Development is not about size
Thank you
The [Big] Data Revolution
"90% of the data ever produced has been produced in the past X months". Fine. But what does this mean/imply?
BUT:
Big Data is not about the big data (cf Gary King)
Big Data is about the nature of the big data--these "traces of human actions picked up by digital devices" AND the research ecosystem that makes sense of them
"Big Data is not about the data" -- it's about the analytics (Gary King, 2013)
Big Data vs. big data. Field vs. data.
We should add: "it's not JUST about the analytics understood as advanced statistical methods, data mining and machine learning; it's about contextualization, human insights"
Big Data is deeply ethnographic: example of CDR and mobility analysis: what you follow are not people but SIM-cards that can be shared!
Call Detail Records (CDRs) used to study:
Impact of human mobility on malaria transmission
Slum dynamics
Internal migration patterns
Track movement to predict cholera outbreak
Yahoo! data used to study international migration
Google data used to detect flu and dengue outbreaks
Early thinking & applications to analysis of poverty, mobility, public health, ...more every month
Facebook to study teenage drinking
NEEDED and ongoing: formalization of the field
Conceptual:
Big Data vs. big data
Qualitative nature of the data
Analytical--which one is sought?
1. Descriptive-Awareness
2. Predictive-Warning
3. Diagnostics-feedback
Institutional: Data philanthropy/partnership/LABS
'No Panacea'? Sounds like rehearsed developmental speech. BUT:
1. Critics claim early adopters believe it is... yet no one serious does!
2. No one serious does because there are real, serious risks, challenges and obstacles in the way.
==> We need to be aware of these and mitigate them.

Methodological: sample bias, internal vs. external validity...