http://svpow.wordpress.com/2009/04/23/mydd/
Published Data
Original data format
(Authors)
(Publishers)
My Frustrations
http://en.wikipedia.org/wiki/Nexus_file
A bit about myself...
Open Palaeontology
the
Data Obfuscation
metadata
putting data on the web for all to see and use
When data IS 'shared' in a publication
it's often significantly obfuscated
Some scientists won't share data
data wrapper
publishing
Ross Mounce
Birds
evolved from
Dinosaurs
Table 1: The Data Matrix
Aus 00100 10101 01010 101
Bus 01010 10101 0101A 013
Cus 03254 43504 58423 232
... etc
#NEXUS
Begin data;
Dimensions ntax=7 nchar=99;
Format datatype=standard missing=? gap=-;
Matrix
Aus 001001010101010101
Bus 01010101010101{01}013
Cus 032544350458423232
...
;
end;
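A minimal sketch of the reformatting step that a 'Table 1' style matrix forces on the reader before any program can use it. It assumes that the "A" in the tabulated Bus row stands for a 0/1 polymorphism (inferred only from comparing the table with the NEXUS version), and the function names are hypothetical, not taken from any existing tool. The Format line is copied from the slide; a real file with states beyond 0 and 1 may also need a symbols declaration.

```python
# Hypothetical mapping: assumes "A" in the printed table means polymorphic 0/1.
POLYMORPHISM_CODES = {"A": "{01}"}

TABLE_ROWS = [
    "Aus 00100 10101 01010 101",
    "Bus 01010 10101 0101A 013",
    "Cus 03254 43504 58423 232",
]


def table_row_to_nexus(row):
    """Join the space-separated blocks of one printed row into a NEXUS matrix row."""
    taxon, *blocks = row.split()
    states = "".join(blocks)
    for code, expanded in POLYMORPHISM_CODES.items():
        states = states.replace(code, expanded)
    return f"{taxon} {states}"


def count_characters(states):
    """Count characters, treating a {..} polymorphism group as one character."""
    count = 0
    inside_braces = False
    for symbol in states:
        if symbol == "{":
            inside_braces = True
            count += 1
        elif symbol == "}":
            inside_braces = False
        elif not inside_braces:
            count += 1
    return count


def to_nexus(rows):
    """Wrap the converted rows in a minimal NEXUS data block."""
    converted = [table_row_to_nexus(row) for row in rows]
    nchar = count_characters(converted[0].split(maxsplit=1)[1])
    header = [
        "#NEXUS",
        "Begin data;",
        f"Dimensions ntax={len(converted)} nchar={nchar};",
        "Format datatype=standard missing=? gap=-;",
        "Matrix",
    ]
    return "\n".join(header + converted + [";", "end;"])


if __name__ == "__main__":
    print(to_nexus(TABLE_ROWS))
```

Every reader of the paper has to repeat (and possibly get wrong) this step by hand or by script; sharing the .nex file directly removes it.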
...even if you explain why
...even post-publication
...even if you ask them face-to-face
...even if they're publicly funded
the publishing process
Non-standard formatting
(requires reformatting)
+
in a .pdf or .html
(which programs can't use)
Standardly-formatted data
(information)
+
in a usable digital file
(digital container)
NEXUS-formatted information
(phylogenetic info)
+
in a plain-text .nex file
(digital container)
Tabulated information
(phylogenetic info)
+
in an unusable .pdf file
(digital container)
slides here: http://bit.ly/openpal
2nd year PhD student | Research: Fossils & Phylogeny
their data stays private (potentially forever!)
#OpenScience #OpenData
@rmounce
Ubuntu user, occasional FOSS zealot, Internet addict...
the actual data
charge: $800 per research article
The Open Paleontology Journal
(Bentham Science Publishers)
ISSN: 1874-4257
3 of which are self-citations
6 articles -> 4 citations
~34th out of 42
The GenBank model
Morpho-Databases
already exist
Paper-based
thinking in 2011
Jo Wolfe (Yale), Graeme Lloyd (NHM)
Katie Davis (NHM), Rachel Warnock (Bristol)
an image of a glass of water
But hardly anyone uses them: it's not mandatory
Homo AGTCGGTC
Pan AGTGGATC
Molecular sequence data is treated appropriately
Why not morphological data too? Similar
Why?
Maxwell, E. E. Generic reassignment of an ichthyosaur from the Queen Elizabeth Islands, Northwest Territories, Canada. Journal of Vertebrate Paleontology 30, 403-415 (2010). http://dx.doi.org/10.1080/02724631003617944
e.g.
or
We can see it but we can't drink (use) it!
Homo 01021011121
Pan 01010100010
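A small illustration of why the two kinds of data are similar: to a program, both are just aligned rows of character states per taxon. The sketch below uses only the Homo and Pan rows shown on the slides; the function name is hypothetical, and a raw difference count is used purely for illustration, not as a recommended phylogenetic method.

```python
def count_differences(row_a, row_b):
    """Count positions at which two aligned rows of character states differ."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must be aligned to the same length")
    return sum(1 for a, b in zip(row_a, row_b) if a != b)


# Rows copied from the slides.
molecular = {"Homo": "AGTCGGTC", "Pan": "AGTGGATC"}
morphological = {"Homo": "01021011121", "Pan": "01010100010"}

for label, matrix in (("molecular", molecular), ("morphological", morphological)):
    differences = count_differences(matrix["Homo"], matrix["Pan"])
    print(f"{label}: {differences} differing characters")
```

The same function handles both matrices unchanged, which is the point: nothing about morphological data prevents it being archived and reused the way sequence data is.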
"Erm... let's do something about this!"
from: O’Meara, Brian, Whitacre, Jamie, Mounce, Ross, Rosauer, Dan, Vos, Rutger, and Stoltzfus, Arlin.
Publishing re-usable phylogenetic trees, in theory and practice. Presented at iEvoBio, June, 2011
Why not have a rich-content digital version, and a separate paper-friendly version?
in contrast, GenBank is mandatory for publishing molecular sequence analyses
Michael Pittman (UCL), Aodhán Butler (Uppsala), Alex Dunhill (Bristol), Russell Garwood (Imperial), James Lamsdell (Kansas), David Legg (Imperial)
http://precedings.nature.com/documents/6048/version/1
also from
a paper
published
in 2010
But the reaction wasn't 100% positive
Getting the message out
Other criticisms (too long to quote)
Over 120 signatures in the 1st week
A website + petition form
A draft Open Letter using:
(criticism is, however, good and welcome)
(aka 'leveraging the social networks')
http://supportpalaeodataarchiving.co.uk/
(Fear of people
stealing fossils)
Shouldn't laws
prevent this?
"Releasing data always jeopardises future research plans as others now have access to the data"
"...lowering citations for everyone, making it harder for us all to compete for funding"
"This could actually slow the rate of publication"
"I will not endorse anything that implies that we need even less public money or that we are wasting public money"
If they're 'undescribed' why mention them? If they're important, describe them to the level needed for your purposes.
Locality data in field-based studies.
Undescribed specimens
Long-term projects
(I can't know everything before I publish)
"Complete openness of data facilitates the advancement of knowledge & reduces information loss."
"As a postgraduate trying to establish myself, greater access to data would be a boon"
'Data accessibility and transparency are absolutely necessary to mend the damage done to science by "climategate." '
"I've been trying to get the Paleo Society to sign on with Dryad, but it's been like slamming my head on jello...."
(inspired by)
(Jeopardising)
Mailing-Lists
If you publish 'interim' results, don't expect people NOT to use that data
kudos to Jon Antcliffe (Bristol) for saying
what others were thinking, but didn't express
special thx to Jon Hill & Katie Davis
Pete Wagner (Smithsonian)
Lucy Muir & Joe Botting (Nanjing)
original source: http://mailman.nhm.ac.uk/pipermail/paleonet/2011-March/001933.html
original source: http://mailman.nhm.ac.uk/pipermail/paleonet/2011-March/001934.html
International support from Palaeontologists
Lessons Learned
What next...?
Other Barriers in Palaeontology
#OpenData heroes
in Palaeontology
Encourage/promote/support existing databases
Feel free to:
share
re-use/remix
re-distribute
my research is publicly funded hence I strive to make my works publicly available
Access to physical fossils is a minefield
(politically, legally, socially...)
http://figshare.com/
http://www.morphobank.org/
- All publicity is good publicity
especially criticism and debate
- Act NOW! Change is sloooow
Commercial rights over fossils vary
hugely between countries and museums.
http://www.treebase.org/
http://paleodb.org/
http://www.zoobank.org/
Graeme T. Lloyd
Mike P. Taylor
Certain institutions are very overprotective
and assertive of image 'rights'
cf. "The best time to plant a tree is 20 years ago.
The 2nd best time is NOW"
Matt Wedel
Andy A. Farke
awesome
http://svpow.wordpress.com/
http://maps.google.co.uk/maps/ms?msid=218353561524244017339.0004a697ea032baff7843&msa=0
Callaway, E. Fossil data enter the web period. Nature News.
e.g. Mike Taylor on the NHM http://dml.cmnh.org/2011Jun/msg00010.html
http://www.graemetlloyd.com/
http://www.slideshare.net/wilbanks/data-sharing-as-a-means-to-a-revolution-8421942
http://datadryad.org/
http://paleo.esrf.fr/
http://openpaleo.blogspot.com/
http://opendino.wordpress.com/
http://www.nature.com/news/2011/110411/full/472150a.html
@rmounce
or ross.mounce@gmail.com
for comments on slides: Jenny Greenwood, Alex Dunhill, Anne O'Connor, Martin Hughes, David M. Williams
for being an awesome supervisor: Matthew Wills
Team OKFN for inviting me, and organising a great conf
and to everyone who supported the Open Letter...
Palaeontologists: please share MORE content online;
raw data, high-res images, green OA papers, presentations...