DataOne Face2Face Pres
Nic Weber on 20 October 2010
What are the data sharing policies affecting these domains in terms of journals, repositories and funding sources?
How do these policies differ by discipline,
publisher, country, funding body, data type or
How much do these policies affect the practice of reusing or citing data?
How have these policies changed over time, and what effect have these changes had?
Evaluate established policies for sharing and citing data in ecology, evolutionary biology and the broader environmental sciences.
Gather policies from publishing outlets, funding sources and domain repositories.
The impetus for this evaluation is to determine the effectiveness of policies in shaping the data sharing habits of domain practitioners.

Data Gathered: Summer 2010 Internship

Repositories
26 Total Evaluations
Funded by Governments, Institutions or Non-Profits
Representing National Agencies, Universities and Research Centers
24 Open Access, 2 Mixed Subscription/OA

Funding Sources
53 Total Evaluations
17 Different Nations (Europe, North America and Asia)
Representing Governments, Non-profits and Corporate Charities

Findings (Repositories)
11.5% require an associated publication, as well as direct journal affiliations
31% give directions for citing content in their repository
37% provide accession numbers for deposited content

Findings (Funding Sources)
44% require data to be shared in some way (7.5% for a specific length of time)
24.5% gave directions on the type of place to deposit; only 2 were explicit about where
7.5% made extra funding available for depositing data
Only 1 gave direction on how to cite data

Journals

Evolutionary Biology
38 Journals in ISI Category
5 OA, 15 Mixed, 18 Subscription
21% require data deposit for peer review
16% request, 8% require authors to share data
21% give instructions on how to share data
24% (9 overall) say where to share (7 repository, 2 IR)
Only 1 journal recommends how to cite data

Ecology
107 Journals in ISI Category
7 OA, 23 Mixed, 77 Subscription
9% request, 6 require data to be shared by author
7% give instructions on how to share
12% (13 overall) say where to share (8 repository, 4 registry, 1 journal)
6% require accession numbers from deposit
3% give instructions on how to cite data in the journal

Environmental Sciences
162 Journals in ISI Category
10 OA, 22 Mixed, 130 Subscription
9% request, 1% require data to be shared by author
Only 4 give instructions on how to share
6% (10 overall) say where to share (7 repository, 3 journal)
7% give some instruction on how to cite data, mostly with respect to the "Unpublished Data" section

The Outliers
Journals often give direction on how to cite unpublished data, but generally describe this as "communications or correspondence" and not the underlying quantitative or qualitative data we associate with scientific production.

Supplementary Data
Most journals, especially those with a generic set of publishing guidelines, include some mention of supplementary or a/v material that may be included with a submission. This statement is often accompanied by format (generally .pdf or .xls) and size (8-12 MB) restrictions, with no mention of preservation for the material.

What I Learned from Gathering the Data

Outputs

Audience
Policy makers
Data Curators

Why share or facilitate the re-use of data?
maximised investment in data collection
broader access where costs would be prohibitive for individual researchers/institutions
potential for new discoveries from existing data, especially where data are aggregated and integrated
reduced duplication of data collection costs and increased transparency of the scientific record
increased research impact and reduced time-lag in realising those impacts
new collaborations and new knowledge-based industries

How does citing data affect re-use?
Makes underlying data identifiable and hence accessible.
Provides method of attribution currently unrepresented in scientific reward structure
Potential for Scientists to create new metrics of their research impact
Creates new collaboration possibilities

Or... to paraphrase Neil Beagrie, data is the ladder we use to stand on the Newtonian 'Shoulders of Giants'

Fry, J., Lockyer, S., Oppenheim, C., Houghton, J. and Rasmussen, B. (2009) Identifying benefits arising from the curation and open sharing of research data produced by UK Higher Education and research institutes. Project Report. UNSPECIFIED. (Unpublished)

OR
Policies
Depending on your role in a research ecosystem are either
For sharing and citing data.
Motivation
What Failed?
subatomic particle

Methods
With three separate source types for policy, I was forced to take three different, yet related, strategies for data gathering. For all three sources I gathered both metadata about the entity and a quantified evaluation of its sharing / citation policy.
-Sherpa Juliet as baseline
-Queried ISI Web of Sci. for funding agencies (top 10-25) for journals and subject categories
-Eliminated those without formal application policies

Policy Elements Abstracted
Data Required for peer review
Underlying data requested/ required
Instructions how / where to share data
Repositories named for data
Accession number required
Instructions how to cite data
-Sought Comprehensive, Well Defined Grouping
-JCR / ISI Categories for Disciplines: Ecology, Evolutionary Biology and Environmental Sciences
Metadata Elements Collected
Impact factor (and numerous other Bibliometrics)
All ISI Categories
Peer Review
Journals

Policy Elements Abstracted:
Data Management Plan Requirement
Data Sharing Requirement
Length of time before data must be shared
Length of time data must be retained (if not deposited)
Where should data be deposited
Possibility of extra funding for sharing data
Instructions for citing data

Metadata Elements
(Gov, Cultural agency
Date of Est.
Date of most recent guidelines
Funding Sources

Policy Elements Abstracted
Requirements for Deposit
URI / DOI
Citation Statements
Repositories
-Least formal method for collecting resources
-Two general categories: very new or very well established (little middle ground)
-Many have related databases / repositories
Metadata Elements Collected
Affiliation
Type of Data Expected
Associated Journals

So how do we encourage citations (or acknowledged re-use), and data sharing?

Immediate Future
Complete statistical analysis
Compose poster presenting overall project effort
Write paper outlining the project's goals and suggesting potential for future work

Potential Future Work
Collaborate with Sarah and Valerie for larger publication
Build on qualitative methods of DataAccess project
Collaborate with Data Conservancy
Inform early dissertation work

Very limited success in finding explicit policies
Hard to determine a good measurement of a journal's importance (Impact Factor not best metric)
Hard to grasp iterations of a policy... especially in funding.
Limited statistical knowledge (correlations less apparent as I was working)

Worries
Limited stat knowledge.
Limited success, hard to make case for importance
Were author guidelines and repositories the best approach?
Form of my deliverables

How?
1. Bottom Up: Open Science / Open Data movements, Panton Principles etc.
2. Top Down: Funding Mandates and Data Sharing and Re-use Policy

What are the shoulders for this project to stand on?
McCain (1995) Mandate: Sharing Policies in Life Sciences
DataAccess Project: 2001 - 2004 UCSD researching access to publicly funded data, OECD working group looking into many issues with data sharing and consequently data attribution / citation
Networked Research and Digital Information (Nerdi): 2003 publication Promise and Practice in DataSharing, early look at "Big Science" data policy
Research Information Network: 2009 study 'Patterns of Information Use and Exchange' - 3 Policy Specific Influences

Good, But Vague:
"BBSRC recognises that different fields of study will require different approaches. What is sensible in one scientific or technological area may not work in others; therefore the policy aims to achieve the sharing of data in an appropriate manner and not to be overly prescriptive. Researchers are required to adhere to any relevant regulatory requirements, including those relating to the ethical use of data."
Data Sharing Policy Section 1 par 2.

'Polar Research' Sharing Policy
"We recommend that data for which public repositories are widely used, and are accessible to all, should be deposited in such a repository prior to publication. The appropriate linking details and identifier(s) should then be included in the publication and, where possible, in the repository, to facilitate linking between the journal article and the data. If such a repository does not exist, data should be included as supporting information to the published paper or authors should agree to make their data available upon reasonable request."

Thanks to:
NSF DataONE Summer Internship
GSLIS CIRSS
All photos and images CC attribution