ILD review

My Hennig XXX (Sao Jose do Rio Preto, Brazil) talk, given at 9am on Monday, August 1st 2011

Ross Mounce

on 28 September 2011

Transcript of ILD review

A review of the ILD randomisation test: uses and abuses
by Ross Mounce and Matthew Wills
@rmounce

My Frustrations
Some scientists won't share data
...even if you explain why
...even post-publication
...even if you ask them face-to-face
...even if they're publicly funded
their data stays private (potentially forever!)

My Frustrations
When data IS 'shared' in a publication it's often significantly obfuscated (rage!)

The Data Obfuscation publishing process
(Authors) Standardly-formatted data in a usable digital file (digital container)
(Publishers) Non-standard formatting (requires reformatting) in a .pdf or .html (which programs can't use)

An example: the publishing process
NEXUS-formatted information (phylogenetic info) in a plain-text .nex file (digital container)
Tabulated information (phylogenetic info) in an unusable .pdf file (digital container)

Original data format
#NEXUS
Begin data;
Dimensions ntax=7 nchar=99
Format datatype=standard missing=? gap=-;
Aus 001001010101010101
Bus 01010101010101{01}013
Cus 032544350458423232
end;

[Slide labels on the original data: metadata | the actual data | data wrapper]

The publishing process

Published Data
Aus 00100 10101 01010 101
Bus 01010 10101 0101A 013
Cus 03254 43504 58423 232
... etc

NO metadata
reformatted codings: it was formerly {01}
printed sideways (harder to re-extract)
Table 1: The Data Matrix in a .pdf

http://en.wikipedia.org/wiki/Nexus_file
http://svpow.wordpress.com/2009/04/23/mydd/

1. The Past
2. The Present
3. The Future...?

The Incongruence Length Difference value
Mickevich & Farris 1981 (proposed here, but not named)
http://dx.doi.org/10.2307/2413255
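Returning to the data matrix example above: here is a minimal sketch of why the reformatted table is harder to reuse than the original rows. It reads only single matrix rows as they appear on the slides, not complete NEXUS files. The function name, the regular expression and the `recode` table are illustrative assumptions, not part of any NEXUS library. The one `recode` entry comes from the slide's note that the recoded state was formerly {01}; without that note a program cannot recover it.

```python
import re

# A token is either a polymorphism such as {01} or a single state symbol.
# Whitespace between symbols (the published 5-character grouping) is skipped.
TOKEN = re.compile(r"\{[^}]*\}|\S")


def parse_row(line):
    """Split 'Name states...' into (name, list of per-character state sets)."""
    name, states = line.split(maxsplit=1)
    cells = []
    for token in TOKEN.findall(states):
        if token.startswith("{"):
            cells.append(frozenset(token[1:-1]))  # {01} -> {'0', '1'}
        else:
            cells.append(frozenset(token))
    return name, cells


original = "Bus 01010101010101{01}013"   # row as in the .nex file
published = "Bus 01010 10101 0101A 013"  # row as printed in the .pdf table

name, cells = parse_row(original)
_, pub_cells = parse_row(published)

# Zero-based positions where the published coding differs from the original.
diff = [i for i, (a, b) in enumerate(zip(cells, pub_cells)) if a != b]

# Assumption: this lookup has to be supplied by hand from the paper's text.
recode = {"A": frozenset("01")}
restored = [recode.get("".join(c), c) if len(c) == 1 else c for c in pub_cells]

print(name, "differs at character(s):", [i + 1 for i in diff])
print("restored matches original:", restored == cells)
```

The grouping into blocks of five is easy to undo. The recoded symbol is not, because the table carries no metadata saying what it stands for.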
[Figure: example data matrix for taxa tax1 ... tax4, shown combined and as separate partitions]

where L is length of the shortest MPT(s)

but what value of D is significantly different? (in the statistical sense)
'D' relative to other ways of partitioning the data matrix

Citations (according to ISI WoK): n = 2379 as of July 2011

[Figure labels: incongruent | not incongruent]

Some comments
and criticisms...

When Does the Incongruence Length Difference Test Fail? (Darlu & Lecointre, 2002)
Uninformative Characters and Apparent Conflict (Lee, 2001)
The "Hypercongruence" effect and multiple comparisons (Ramirez, 2006)
Not a flaw; just a refinement
Cunningham, 1997 - The ILD is the best of 3
Localized incongruence
Not a flaw; outside scope
Dolphin et al., 2000: noise can cause incongruence - we should test significance against 'shuffled' matrices
violates safe taxonomic reduction
uninformative taxa / characters
testing MUST be pairwise

pairwise partition testing ONLY
exclude uninformative characters and taxa
treat / code partitions the same e.g. gaps

How are other people using the ILD test?
It should not be used to assess 'combinability' of data:
Combinability was never an issue -> Total Evidence

Nor is it a measure of 'phylogenetic accuracy'

Sampling Method
278 papers citing the original Farris et al, 1994/5a paper (some understandable confusion over the exact year), all published in 2009 or 2010
sample bias #1: only articles I could access
sample bias #2: Mol. Phy. Evo. dominated
across 85 different journals

Number of articles sampled by Journal, out of a possible 454
MPE articles (101)
BMC Evo. Biol. (10)
PLoS ONE (9)
Am. J. Bot. | J. Biogeography | Plant Syst. and Evolution (8)

Is this a fair sample? Yes!
MPE is a high-volume phylogenetics journal: between 2009 - 2010 it published 764 articles*, c.f. Cladistics 169 articles
*according to ISI WoK database

An Open Literature Review
c.f. Rindal & Brower (2011) http://dx.doi.org/10.1111/j.1096-0031.2010.00342.x
(All papers manually assessed)

for n=276
Chart legend: An ILD test was performed in | The ILD test was only discussed in | Insufficient evidence
Chart values: 92% (254), 6% (17), 2% (5)

How many papers reported p-values?
for n=259
Chart legend: Reported exact p-value(s) | Reported p-values < or > X | Did not report p-value(s)
Chart values: 1.5% (4), 18.5% (48)

...quality not quantity
Between papers there is little consistency in practice

Just what is a significant ILD test p-value?
Ngamskulrungroj et al 2009 http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0005862
critical value used: 0.0001
0.002 -> concluded 'congruent'

[Figure labels: D 1, D 2, D 3, ..., D obs]

http://dx.doi.org/10.1111/j.1096-0031.1994.tb00181.x
http://mbe.oxfordjournals.org/content/14/7/733.abstract
http://mbe.oxfordjournals.org/cgi/content/abstract/19/4/432
http://mbe.oxfordjournals.org/cgi/content/abstract/18/4/676
http://dx.doi.org/10.1111/j.1096-0031.2006.00106.x
Great idea - not easy to implement, yet
http://dx.doi.org/10.1006/mpev.2000.0845
(...and these are just some of the published comments on the ILD test)

explicitly stated critical values inc. 0.05, 0.01, 0.005, 0.001 and 0.0001

An Open Question: Can we perform a prospective power analysis (sensu Cohen) to determine what critical level one should use? (Justification)

Evidence, Explicitness, and Verifiable Science
"using 100 replicates... the combinability of the ITS and cpDNA sequences was confirmed with the partition homogeneity test and ILD test"
Surveswaran et al, 2009 http://www.springerlink.com/content/x03004572m382267/
1000 reps abs. minimum (Allard et al 1999)
The ILD test is NOT a test of 'combinability'; it is for data exploration (sensu Grant & Kluge, 2003)
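The replicate counts and p-values discussed above are easier to judge with the randomisation procedure written out. What follows is a minimal sketch, not the implementation used by any particular program. It assumes four taxa (so all three unrooted topologies can be checked exhaustively), unordered single-state characters with no missing or polymorphic codings, and the convention that the observed partition counts as one of the replicates. The function names are illustrative. D is taken as L(combined) minus the sum of the partition lengths, with L the length of the shortest tree.

```python
import random

# The three unrooted topologies for four taxa (indices 0..3), as two cherries.
QUARTETS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def fitch_steps(column, topology):
    """Parsimony steps for one character on one four-taxon topology (Fitch)."""
    steps = 0
    cherry_sets = []
    for a, b in topology:
        sa, sb = {column[a]}, {column[b]}
        common = sa & sb
        if common:
            cherry_sets.append(common)
        else:
            cherry_sets.append(sa | sb)
            steps += 1
    if not (cherry_sets[0] & cherry_sets[1]):
        steps += 1  # change needed on the internal branch
    return steps


def shortest_length(columns):
    """L: length of the shortest tree(s) for a list of character columns."""
    return min(sum(fitch_steps(c, t) for c in columns) for t in QUARTETS)


def ild(part_a, part_b):
    """D = L(A+B) - (L(A) + L(B)) for two lists of character columns."""
    combined = shortest_length(part_a + part_b)
    return combined - (shortest_length(part_a) + shortest_length(part_b))


def ild_test(part_a, part_b, replicates=1000, seed=0):
    """Return (D_obs, p) from random repartitions of the pooled characters."""
    rng = random.Random(seed)
    d_obs = ild(part_a, part_b)
    pooled = part_a + part_b
    at_least = 1  # assumption: the observed partition is replicate number one
    for _ in range(replicates - 1):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        d = ild(shuffled[:len(part_a)], shuffled[len(part_a):])
        if d >= d_obs:
            at_least += 1
    return d_obs, at_least / replicates


# Each string is one character: one state per taxon, in taxon order.
conflicting_a = ["0011"] * 3  # supports (tax1,tax2)|(tax3,tax4)
conflicting_b = ["0101"] * 3  # supports (tax1,tax3)|(tax2,tax4)
d_obs, p = ild_test(conflicting_a, conflicting_b, replicates=1000)
print("D_obs =", d_obs, "p =", p)
```

Under this counting convention the smallest attainable p-value is 1/replicates, so a run of 100 replicates cannot resolve anything below 0.01, whatever critical value is quoted alongside it.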
Where is the statistical evidence?
No p-value and no critical level reported (this is quite common)

Feel free to look at the sample and my interpretations here: http://bit.ly/ILDdata
c.f. Rindal & Brower's (2011) analysis of MP, ML & BI analyses

Conclusions and reiterated 'suggestions'
The ILD test is a useful and informative method for comparative 'data exploration'

some suggestions...
absolute minimum 1000 replications
examine the tree-length distribution for extra potential information
[Figure panels: D obs, 'p-value = 0.001' | D obs, 'p-value = 0.001 for 10k reps']

slides here: http://bit.ly/ILDreview

Acknowledgements
Funding: Marie Stopes Student Travel Award
my supervisor: Matthew Wills
for discussion: Steve Farris
@rmounce
...and thank you for watching!

[Chart label: 80% (207)]