Plus data from 67 other

Botanical / Mycological

publications (>250 datasets)

e.g. Canadian J Botany (>26)

Int J Plant Sci (>21)

Phytopathology (>17)

Lichenologist (>11)

Sydowia

Bryologist

Mycotaxon

Am J Potato Res

...

Where are the animals?

Burying data in PDFs is very naive

Acknowledgements

What can WE do?

The (current) publishing process

The (better?) publishing process

Bringing Systematics into the Digital Age

Some journals require

mandatory data archiving

Other (bad) examples...

Funding:

Helpful discussants:

Can you 'look' at a dataset and see that it's correct?

2.) Polymorphic/uncertain states are formatted in non-standard ways

(ubiquitous!)

Centralised datasets are easy to find: they're in the same place. (TreeBase)

e.g. Try finding all published arachnid datasets. It requires expert knowledge.

Look at the success of GenBank for raw sequence data

Look at the rise of Digital Taxonomy

Look at the future advances offered by Ontologies

Maureen O'Leary (MorphoBank)

William Piel (TreeBase)

Rutger Vos (TreeBase)

Dennis Stevenson (Ed. Cladistics)

Recommended Reading:

Directly lobby your societies

and/or journal editors

Set a good example by publicly archiving your published data, even if your journal doesn't require it

Spread the word! Data is important!!!

1.) Author runs data file, gets results

2.) Author exports the data, reformatted into an unusable form

and often spread over many pages, to a .doc or .pdf file

It hampers:

  • Discoverability
  • Transparency
  • Repeatability
  • Future re-usage
  • Education & Outreach

1.) Author runs data file, gets results

2.) Author submits the exact same data file, with results in a usable format, to a data repository or publisher
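
As a rough illustration of what "a usable format" can mean, here is a minimal Python sketch (not part of the original talk) that writes a small character matrix as a NEXUS DATA block, which phylogeny programs can read directly. The function name `to_nexus`, the taxon names and the character states are all hypothetical, and the FORMAT settings are assumptions for a simple binary morphological matrix.

```python
def to_nexus(matrix: dict[str, list[str]], symbols: str = "01") -> str:
    """Return a NEXUS DATA block for a taxon -> list-of-cells matrix.

    Each cell is one string, so an uncertain cell such as "{01}"
    still counts as a single character.
    """
    lengths = {len(cells) for cells in matrix.values()}
    if len(lengths) != 1:
        raise ValueError("all taxa must have the same number of characters")
    nchar = lengths.pop()
    width = max(len(name) for name in matrix)  # pad names so columns align

    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(matrix)} NCHAR={nchar};",
        f'  FORMAT DATATYPE=STANDARD MISSING=? GAP=- SYMBOLS="{symbols}";',
        "  MATRIX",
    ]
    for name, cells in matrix.items():
        lines.append(f"  {name.ljust(width)}  {''.join(cells)}")
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


# Hypothetical two-taxon, three-character example.
example = {
    "Taxon_A": ["0", "1", "{01}"],
    "Taxon_B": ["1", "?", "0"],
}
print(to_nexus(example))
```

Archiving a file like this alongside the paper means nobody has to re-type the matrix from a typeset table.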

Smith, V., 2009. Data publication: towards a database of everything.

BMC Research Notes

1.) Not including all the data, referring reader to another paper

(very common in palaeontology) :(

e.g. Norell et al 2008. A New Platynotan Lizard (Diapsida: Squamata) from the Late Cretaceous Gobi Desert (Ömnögov), Mongolia. American Museum Novitates

http://dx.doi.org/10.1186/1756-0500-2-113

Why!!!

Thanks to:

NEXUS format: {01} or (01)

Hennig (xread) format: [01]

published format: A or $ or z
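
To make the difference concrete, here is a minimal Python sketch (not part of the original talk) that rewrites Hennig-style [01] cells as NEXUS {01} or (01) cells. The function names are hypothetical, and the sketch assumes square brackets appear only around multi-state cells. The ad hoc single symbols used in print cannot be converted this way without each paper's own key.

```python
import re


def hennig_to_nexus(row: str, polymorphic: bool = False) -> str:
    """Rewrite [..] groups as NEXUS {..} (uncertain) or (..) (polymorphic)."""
    left, right = ("(", ")") if polymorphic else ("{", "}")
    return re.sub(
        r"\[([^\[\]]+)\]",
        lambda m: left + m.group(1) + right,
        row,
    )


def count_characters(row: str) -> int:
    """Count cells in a row, treating each bracketed group as one cell."""
    collapsed = re.sub(r"[\[{(][^\]})]+[\]})]", "X", row)
    return len(collapsed)


row = "01[01]1?"
print(hennig_to_nexus(row))                    # 01{01}1?
print(hennig_to_nexus(row, polymorphic=True))  # 01(01)1?
print(count_characters(row))                   # 5
```

Counting bracketed groups as single cells is a quick check that every taxon still has the same number of characters after conversion.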

My supervisor Matthew A. Wills for supporting this talk and all other members of the

Fossils, Phylogeny and Macroevolution Research Group at Bath.

+ all authors who already archive their data in appropriate centralized databases

Phylogenetic data is extremely valuable

and should be treated as such, with proper

centralised online publicly-accessible databasing

(or any random single character, it's never consistent between papers)

http://bit.ly/f6Dxfo

Want to see this again? :

http://bit.ly/phylodata

3.) Unusable data is typeset (not without error)

and published

Why miss out on

extra citations?

Making your data

available increases

your chances of citation

through re-use of your data

Surely, this is a better model,

with less room for error?

http://onlinelibrary.wiley.com/doi/10.1111/j.1558-5646.2010.01182.x/abstract

Wouldn't it be great

if undergrads could re-analyse the latest data,

to learn cladistics?

3.) Typesetting errors: author submits correct data, publisher mangles it

(clickable link)

Support / Ideas / Feedback: ross.mounce@gmail.com

Why not ALL journals?

(less frequent but occurs many times every year)

The (Continued) Growth of Phylogenetic Information

Cladistics (Mirande, 2009)

In 1993, Sanderson et al. (Syst. Biol.) tried to get a handle on

just how much new phylogenetic information was

being published each year (for the period 1989-1991).

They found 882 studies, whilst acknowledging that they

didn't sample every journal; palaeontological ones were notably lacking.

Publication was growing rapidly:

~50 more studies each and every year.

In 2010 the situation is far worse. There are easily thousands of novel

studies published each and every year of which only a tiny fraction

are archived in a centralized online electronic database.

I argue here that this needs to change.

Q: How many Cladistic studies are there?

(novel, explicit, phylogenetic systematic analyses)

~1966 - 2010

Systematics requires an accumulated wealth of knowledge

The (continued) growth of

phylogenetic information

Where is the depot of phylogenetic information?

a critical discourse on data publishing and (lack of) archiving

MorphoBank

TreeBase (II)

38 studies

~2500 studies

Have YOU ever tried to extract phylogenetic data from a paper?

Regrettably tiny. It's not their fault.

Authors just don't submit their work.

:(

by Ross Mounce

Is this everything?

NO!

Best viewed in fullscreen

Fossils, Phylogeny and Macroevolution Research Group

40,000 ?

Data particularly lacking:

Invertebrates (morphological)

Vertebrates (morphological)

Palaeontological

A: Who knows?

40k is my estimate based upon simple extrapolation from Sanderson et al (1993) with adjustment for post-1993 literature growth

Try this one:

Zhu, M., Gai, Z.-K., 2006. Phylogenetic relationships of Galeaspids (Agnatha). Vertebrata PalAsiatica 44 (1), 1-27.

The top 15 journals

contributing data (1635) to TreeBase

01. Syst Botany (443)

02. Mycologia (305)

03. Mol Phy Evo (228)

04. Syst Biol (156)

05. Am J Botany (95)

06. Molecular Ecology (61)

07. Mycoscience (53)

08. Stud in Mycology (46)

09. Taxon (40)

10. Persoonia (39)

11. Plant Syst Evo (38)

12. J Phycology (35)

13. Fungal Diversity (34)

14. Ann Miss Bot Gar (31)

15. Mycological Progress (31)

General Journals

(within which may be botanical or mycological)

http://www.ivpp.cas.cn/cbw/gjzdwxb/xbwzxz/200810/t20081023_2385439.html

(clickable link)

Botany / Mycology

journals

(neo)

Vertebrates

Palaeontological

Invertebrates

It's an Open Access journal

it should be easy, right?

BUT

Q: What phylogeny programs read

.pdf .doc or .xls files?

Wrong! It's virtually impossible without re-typing every single cell

TREEBASE is full of trees!

A: NONE! Manual re-formatting required...

and it's NEVER just a matter of copy n paste

Severely under-represented data

What's the problem?

Isn't the data matrix and cladogram published with most papers?

Animalia*

Morphological

&

Paleo-morphological

...in a way, yes.

Usually buried in appendices and supplementary materials

* excluding mammals which are reasonably well covered (pers. comm. Bill Piel)

The absurdity of data publishing:

1.) not providing all the data

e.g. Knoll 2010 Geological Magazine (sauropods)

Ignavusaurus phylogenetic analysis

provides coding data for Ignavusaurus (only)

see Smith & Pol 2007 for the rest of the matrix

find Smith & Pol 2007 -> matrix not printed in full

see Yates, 2007b (Special Papers in Palaeontology)

rather difficult to get hold of (no official electronic copies, paper only!)

Journals in which paleo-phylogenetic data has been published

(excluding mainstream journals and mainstream systematics journals)

None of which archive their data properly

Acta Palaeontologica Polonica

Ameghiniana

American Museum Novitates

Annals of Carnegie Museum

Antarctic Science

Bulletin of the American Museum of Natural History

Bulletin of the Peabody Museum of Natural History

Canadian Journal of Earth Sciences

Comptes Rendus Palevol

Contributions from the Museum of Paleontology, University of Michigan

Contributions in Science (NHM, LA)

Cretaceous Research

Fieldiana: Geology

Fossil Record

Geobios

Geodiversitas

Geological Magazine

Historical Biology

Journal of Human Evolution

Journal of Mammalian Evolution

Journal of Paleontology

Journal of Systematic Palaeontology

Journal of Vertebrate Paleontology

Naturwissenschaften

Neues Jahrbuch für Geologie und Paläontologie

Occasional Papers of the Natural History Museum University of Kansas

Palaeontology

Paläontologische Zeitschrift

Paleontological Research

Palaeontologia Electronica

Records of the Australian Museum

Revista Brasileira de Paleontologia

Revista del Museo Argentino de Ciencias Naturales

Senckenbergiana Lethaea

Vertebrata PalAsiatica
