How does knowledge emerge from data?
HOW DOES KNOWLEDGE EMERGE FROM DATA?
1. Openness as Infrastructure
(John Wilbanks, 2011)
2. As Data Overflows Online, Researchers Grapple With Ethics
(Vindu Goel, 2014)
Openness as infrastructure, 2011
Journal of Cheminformatics
"where we build tools and policies that help networks of people who have their health data share it with networks of people who like to analyze health data."
Special thanks to Camila Jenkin
Ted Talk: Let’s pool our medical data
Tulane University, Bachelor of Philosophy
The Sorbonne, Modern Letters
Ewing Marion Kauffman Foundation
co-founded Incellico, which is now part of Selventa.
Earthquake in Japan
Coreflood botnet taken down
The wedding of Prince William
Osama bin Laden killed
Microsoft buys Skype
Final film of Harry Potter
Openness as infrastructure
Where does the trust come from?
"the brand of the journal, built over years through the recruitment of trusted scientists to serve as [...]" (Wilbanks, 2011)
advertisement of research
understanding of the existing paradigm
describe method, results, implications
Science = wiki ?
"every topic in science is open for back and forth and new discoveries spark rounds of editing and re-editing, and the print equivalent of flame wars in biting letters to the editor." (Wilbanks, 2011)
era of increasingly computerized science
Science is drowning
What do we need?
Full Scale Revolution
put literature online
free of charge
free of copyright
provide credit to the author
(the Budapest Open Access Initiative)
separate the subjective judgement of impact from a more objective judgment of scientific validity in the peer review process
new system into the existing data infrastructure
but...in the data world
no accepted standard language
No good way to structure data
(given certain inputs, what our decision matrix looks like)
3 essential elements missing
1. Scientific Collaboration
3. Data Openness
infrastructure to distribute collaboration
ex. categories, links & tags
need of right search string
need of formal classification imposed
Open data license
Legal user interfaces
Tech implementations of licenses
create low-cost marketplace of ideas
address classification problem
Example 1 of open data
Example 2 of open data
longstanding tradition of sharing open data
evolved, open source infrastructure for virtual collaboration
data becomes larger
discoveries become more complex
only tractable method: Open data
ex. pharmaceutical industry's investment of data
ex. Sage Bionetworks
making reproducible claims under similar circumstances
Open data will win out
"return scientific data to its most natural state, one that is a pure public good, that gains more value as more people possess it" (Wilbanks, 2011)
As Data Overflows Online, Researchers Grapple With Ethics, 2014
The New York Times
University of Michigan, Knight-Wallace
Technology reporter at The Times
San Jose Mercury News
Contra Costa Times
The Plain Dealer
The Wall Street Journal
Ebola Virus Outbreak
Super Bowl XLVIII Champion: Seahawks
Malaysia Airlines plane crashes
End to NSA's bulk data collection
2014 FIFA World Cup
Twitter sues the US government
As Data Overflows Online, Researchers Grapple With Ethics
(Facebook & Twitter)
Social Science Research
without people knowing
not knowing they are subjects
no explicit consent
Co-author of Facebook study
(700,000 people's news feeds)
published in June, 2014
(Academics, corporate researchers & government agencies)
MIT & Stanford University
The Federal Trade Commission
panels & conferences
offering software tool
privacy and fair treatment of internet users
"Consumers should be in the driver's seat when it comes to their data." (Edith Ramirez)
Apologized but declined further comments
(What do people prefer to see?)
Facebook Emotion Experiment
Facebook Voting Experiment
Facebook data scientist Adam Kramer, Professor Hancock & academic researcher Jamie Guillory
How do emotions spread through a large population?
"deliberately changed the number of positive and negative posts in the subjects' news feeds"
"how the changes affected the emotional tone of the users' subsequent Facebook posts"
sent voting reminders to 61 million American users on Election Day in 2010
seeing more friends' posts of voting
prompting more people to vote
Goel, V. (2014, August 12). As Data Overflows Online, Researchers Grapple With Ethics. The New York Times. Retrieved from http://www.nytimes.com/2014/08/13/technology/the-boon-of-online-data-puts-social-science-in-a-quandary.html
Wilbanks, J. (2011). Openness as infrastructure. Journal of Cheminformatics, 3(1), 1–5.
What is the research for?
2 Facebook Experiments
Existing federal rules require consent from those studied unless the potential for harm is minimal.
--> inadequate guidance for large-scale research
Make the rules without preventing the development of research (Sinan Aral, MIT)
Researchers conduct research with little outside guidance. (Mary Gray, IU)
Researchers didn’t realize that manipulating news feeds would make some people feel violated. (Hancock)
1. John Wilbanks wonders if:
"The desire to protect our
privacy is slowing down research?"
How do you think privacy concerns affect the growth of knowledge?
2. What types of experiments are so intrusive that they need prior consent or prompt disclosure after the fact?
3. How do we decide what data we can freely assemble without restriction to create new knowledge, and what data we cannot?
WHO even decides where that line is?
4. What do you think about companies like Facebook that conduct experiments like the one we've discussed?
one side of the class discusses the pros of this experiment
the other side discusses the cons of the experiment