The possibility that two data sets may have different underlying
phylogenetic histories (such as gene trees that deviate from species
trees) has become an important argument against combining data in
phylogenetic analysis. However, two data sets sampled for a large number
of taxa may differ in only part of their histories. This is a realistic
scenario and one in which the relative advantages of combined, separate,
and consensus analysis become much less clear. I suggest a simple
methodology for dealing with this situation that involves (1) partitioning
the available data to maximize detection of different histories, (2)
performing separate analyses of the data sets, and (3) combining the data
but considering questionable or unresolved those parts of the combined
tree that are strongly contested in the separate analyses (and which
therefore may have different histories), until a majority of unlinked data
sets supports one resolution over another. In support of this methodology,
computer simulations suggest that (1) the accuracy of combined analysis at
recovering the true species phylogeny may exceed that of either of two
separately analyzed data sets under some conditions, particularly when the
mismatch between phylogenetic histories is small and the estimates of the
underlying histories are imperfect (few characters and/or high homoplasy),
and (2) combined analysis provides a poor estimate of the species tree in
areas of the phylogenies with different histories but an improved estimate
in regions that share the same history. Thus, when there is a localized
mismatch between the histories of two data sets, separate, consensus, and
combined analysis may all give unsatisfactory results in certain parts of
the phylogeny. Similarly, approaches that allow data combination only
after a global test of heterogeneity will suffer from the potential
failings of either separate or combined analysis, depending on the outcome
of the test. Excision of conflicting taxa is also problematic in that it
may obfuscate the position of conflicting taxa within a larger tree, even
when their placement is congruent between data sets. Application of the
proposed methodology to molecular and morphological data sets for
Sceloporus lizards is discussed.
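
The flagging step of the proposed methodology (step 3) can be illustrated with a minimal sketch. This is not the author's implementation. It assumes that each separately analyzed tree has been reduced to a mapping from rooted clades (sets of taxon names) to a support value such as a bootstrap percentage, that "strongly contested" means two incompatible clades each meet a support threshold, and that a threshold of 70 is acceptable for illustration. The function names, the threshold, and the five-taxon example are all hypothetical.

```python
from itertools import product


def incompatible(c1, c2):
    """Two rooted clades conflict if they overlap but neither contains the other."""
    return bool(c1 & c2) and not (c1 <= c2 or c2 <= c1)


def contested_clades(support_a, support_b, threshold=70):
    """Return clades that are strongly supported in one data set and
    incompatible with a strongly supported clade in the other.
    The threshold is an illustrative assumption, not a value from the source."""
    strong_a = [c for c, s in support_a.items() if s >= threshold]
    strong_b = [c for c, s in support_b.items() if s >= threshold]
    contested = set()
    for a, b in product(strong_a, strong_b):
        if incompatible(a, b):
            contested.update((a, b))
    return contested


def questionable(combined, contested):
    """Flag combined-tree clades that conflict with any strongly contested clade."""
    return {c for c in combined if any(incompatible(c, k) for k in contested)}


# Hypothetical example: the two data sets disagree only within (A, B, C).
clade = frozenset
tree_a = {clade("AB"): 95, clade("ABC"): 90, clade("DE"): 99}
tree_b = {clade("AC"): 88, clade("ABC"): 85, clade("DE"): 97}
combined = {clade("AB"), clade("ABC"), clade("DE")}

contested = contested_clades(tree_a, tree_b)
flagged = questionable(combined, contested)
accepted = combined - flagged
```

In this example only the combined clade (A, B) is flagged, because it conflicts with the strongly supported (A, C) from the second data set, while (A, B, C) and (D, E) are retained. The sketch does not cover the final part of step 3, in which a flagged region is resolved once a majority of unlinked data sets supports one resolution over another.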
300 views reported since publication in 2008.