The presence of rogue taxa (rogues) in a set of trees can frequently have
a negative impact on the results of a bootstrap analysis (e.g., the
overall support in consensus trees). We introduce an efficient graph-based
algorithm for rogue taxon identification as well as an interactive
web service implementing this algorithm. Compared to our previous method,
the new algorithm is up to four orders of magnitude faster, while
returning qualitatively identical results. Because of this significant
improvement in scalability, the new algorithm can now identify
substantially more complex and compute-intensive rogue taxon
constellations. On a large and diverse collection of real-world datasets,
we show that our method yields better-supported reduced/pruned consensus
trees than any competing rogue taxon identification method. Using the
parallel version of our open-source code, we successfully identified rogue
taxa in a set of 100 trees with 116,334 taxa each. Using simulated
datasets, we show that pruning rogue taxa from a tree set with our method
consistently yields bootstrap consensus trees as well as maximum
likelihood trees that are topologically closer to the respective true
trees.
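To make the effect of a rogue taxon on consensus support concrete, the following is a minimal brute-force sketch and not the graph-based algorithm described above. It assumes rooted trees encoded as nested tuples of leaf names (bootstrap trees are usually unrooted, so clades stand in for bipartitions here), and it uses the summed frequency of majority-rule clades as a stand-in score. The helper names `clades`, `prune`, and `majority_support`, and the toy tree set, are hypothetical.

```python
from collections import Counter

def clades(tree):
    """Return the non-trivial clades (as frozensets of leaf names) of a rooted tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):              # leaf
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    found.discard(walk(tree))                  # the root clade carries no information
    return found

def prune(tree, taxon):
    """Remove one leaf and collapse any internal node left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kids = [k for k in (prune(child, taxon) for child in tree) if k is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)

def majority_support(trees):
    """Sum the frequencies of all clades found in more than half of the trees."""
    counts = Counter(c for t in trees for c in clades(t))
    n = len(trees)
    return sum(k / n for k in counts.values() if k > n / 2)

# Toy tree set (assumed for illustration): taxon "X" jumps between positions.
trees = [
    ((("A", "X"), "B"), ("C", "D")),
    (("A", "B"), (("C", "X"), "D")),
    ((("A", "B"), "X"), ("C", "D")),
    (("A", "B"), ("C", ("D", "X"))),
]

before = majority_support(trees)
after = majority_support([prune(t, "X") for t in trees])

# Naive search: try pruning each taxon in turn and keep the best-scoring one.
taxa = sorted({"A", "B", "C", "D", "X"})
best = max(taxa, key=lambda x: majority_support([prune(t, x) for t in trees]))

print(before, after, best)
```

In this toy set, only the clade {A, B} reaches a majority before pruning; once "X" is removed, all four trees agree on both {A, B} and {C, D}. The exhaustive loop over single taxa scales poorly and ignores sets of rogues that only help when pruned together, which is the kind of case an efficient dedicated algorithm is needed for.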
235 views reported since publication in 2012.