Statistics Sweden keynote November 6, 2015

by Catherine O'Neil on 6 November 2015

Transcript of Statistics Sweden keynote November 6, 2015

Cathy O'Neil
mathbabe.org

What is a
big data skeptic?
Motivating Q's
What are some risks?
How bad could things get?
How can we address them?
What promise?

What problems?
How bad could it get?
What can we do?
Stats mistakes in
the big data world
Dirty or incomplete data
Survivor bias
Selection bias
Bad proxies
Bad or misleading evaluation metrics
In the academic world
- Carmen Reinhart and Kenneth Rogoff
- John Ioannidis
- David Madigan
- Victoria Stodden
- OSTP
Models often fail
on purpose
- Credit rating agencies
- LIBOR
- Pension models
- VAM
- VaR
Death Spiral
What do big data models do?
I'll make you click
I'll make you buy
I'll decide if you're at risk
I'll decide if you're smart
I'll decide whether to hire you
I'll optimize you for ROI
Political modeling
Set standards
for models

Set standards
for modeling
Data Privacy
Examples
After all, black boxes are nice, and they make us feel smart, and give us employment.
What are the systemic risks and who's keeping track?
Is this an intractable, inevitable problem of modern life?
Parting thoughts
We could do a lot of good if we represented the public somehow
Models aren't going away
Regulations alone won't solve this problem
Need to educate people about the risks
Models are also cool
Let's focus on making models work for people
instead of on them.
Let's be realistic.
Thank you!
Incomplete - revision history
Survivorship - financial data
Selection - recommendation engines
Proxies - "interest"
Eval metric - profit
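
A minimal sketch of the survivorship point, assuming a toy world that is not from the talk: every fund has a true expected return of zero, and a fund disappears from the database once its value falls below a cutoff. The function name, the cutoff, and the volatility are all hypothetical choices made for illustration.

```python
import random
import statistics


def survivorship_demo(n_funds=2000, n_years=10, cutoff=0.7, seed=1):
    """Compare the mean yearly return of all funds with that of survivors only.

    Assumption (illustrative only): yearly returns are drawn from a normal
    distribution with mean 0 and standard deviation 0.15, and a fund is
    dropped from the database once its value falls below `cutoff`.
    """
    rng = random.Random(seed)
    all_means = []
    survivor_means = []
    for _ in range(n_funds):
        yearly = [rng.gauss(0.0, 0.15) for _ in range(n_years)]
        value = 1.0
        alive = True
        for r in yearly:
            value *= 1 + r
            if value < cutoff:
                alive = False  # the fund closes and vanishes from the data
                break
        mean_return = statistics.mean(yearly)
        all_means.append(mean_return)
        if alive:
            survivor_means.append(mean_return)
    return statistics.mean(all_means), statistics.mean(survivor_means)


if __name__ == "__main__":
    everyone, survivors = survivorship_demo()
    print(f"mean yearly return, all funds:      {everyone:.4f}")
    print(f"mean yearly return, survivors only: {survivors:.4f}")
```

In this toy setup the surviving funds look better than the full population even though no fund has any skill, which is the distortion a database of survivors would introduce.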
The issue of time
Are models racist?
- Google search
- Job hiring
- Peer-to-peer lending
- "Digital redlining"
- Invisible failures
feedback loop
long term systemic changes
winner as witness
beyond the filter bubble
Is Segmentation "Good"?

- insurance
- screening at the airport
- generalized surveillance
- #BlackLivesMatter
Quantify this?
- Modeling the model
- Individual impact
- Monte Carlo for cumulative effect?
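
One way the Monte Carlo question might be approached, sketched under strong assumptions that are not from the talk: each person has a score between 0 and 1, a model approves anyone whose noisily observed score clears a threshold, approval nudges the score up and denial nudges it down. All names and parameter values below are hypothetical.

```python
import random
import statistics


def simulate_feedback_loop(n_people=1000, n_rounds=20, threshold=0.5,
                           step=0.02, noise=0.05, seed=0):
    """Return the population standard deviation of scores after each round.

    Assumption (illustrative only): being approved raises a person's score
    by `step`, being denied lowers it by `step`, and scores stay in [0, 1].
    """
    rng = random.Random(seed)
    scores = [rng.random() for _ in range(n_people)]
    spread = [statistics.pstdev(scores)]
    for _ in range(n_rounds):
        updated = []
        for s in scores:
            observed = s + rng.gauss(0.0, noise)  # the model sees a noisy score
            if observed >= threshold:
                s = min(1.0, s + step)   # approved: opportunity compounds
            else:
                s = max(0.0, s - step)   # denied: the score erodes
            updated.append(s)
        scores = updated
        spread.append(statistics.pstdev(scores))
    return spread


def mean_final_spread(n_runs=20):
    """Average the final spread over several independent simulated runs."""
    return statistics.mean(
        simulate_feedback_loop(seed=run)[-1] for run in range(n_runs)
    )


if __name__ == "__main__":
    spread = simulate_feedback_loop()
    print(f"spread before any decisions: {spread[0]:.3f}")
    print(f"spread after 20 rounds:      {spread[-1]:.3f}")
    print(f"mean final spread, 20 runs:  {mean_final_spread():.3f}")
```

The spread of scores is used here as a crude stand-in for inequality; a real study would need an actual model, real outcome data, and a defensible measure of individual impact.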
What are the long-term effects?
Rich get richer
Poor get poorer
less mobility
increased inequality
Go back to "time"
- Short term gains from private co's
- Long term negative effects
- Completely unregulated
- European laws are stricter
First it was Obama
Next it's everyone
Personal messaging
Personal offers
Is this democratic?
Related but different:
Poll models
- Feedback loop here too
- Might cause weird voting behavior
- But doesn't directly pervert issues
How it worked
- Individual appeals
- Facebook graph etc.
- Linking databases
- Money, then votes
- Targeted phone calls, emails
- Ads and Reddit
- Now used by Caesars
Tons of data out there
Acxiom
European privacy laws
Start with kids?
Scrutiny for:
high impact,
high stakes, and
widespread models
What would that look like?
Transparency
Access
Audits
Hippocratic Oath of modeling
Data skepticism
Data standards
Storytelling
Reproducibility and beyond
Storytelling
Modeling is hard
Need better tools
Wakari, IPython Notebook
Reproducibility
and beyond
Public access
Robustness tests
Open Models
The Promise and The Risks of Big Data
November 6, 2015
Stockholm, Sweden

Doesn't overestimate data
Twitter/obesity
Doesn't underestimate
Predator/prey
Develops data standards
Imagine if...
we used health scores to help people stay well
we used risk scores to help people find jobs or mental health services
we used personality tests to help people find a job that is suited to them
we used on-the-job surveillance to help people enjoy work more
we measured ability of jails and prisons to lead to better lives
Goldman Sachs and the social impact bonds
US News & World Report college rankings
VaR
Examples of gaming