An Improved Bipartition Cover Bound for the Multispecies Coalescent Model
TLDR
This paper presents improved topology-free upper bounds on the number of loci needed for bipartition cover in the multispecies coalescent model.
Key contributions
- Derives improved topology-free upper bounds for bipartition cover under the multispecies coalescent.
- Extends practical applicability of bounds to a broader range of biologically realistic parameter settings.
- Develops new asymptotics for these bounds and absorption times under Kingman's coalescent.
Why it matters
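This paper tightens the existing topology-free upper bounds on the number of loci needed for bipartition cover under the multispecies coalescent model. Because the new bounds stay below biologically realistic numbers of loci across a broader range of parameter settings, the finite-sample guarantees for summary methods such as ASTRAL become more usable on empirical datasets. The analysis also sharpens the theoretical understanding of coalescence under the model.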
Original Abstract
Bipartition cover probabilities quantify whether a collection of gene trees contains every bipartition of the underlying species tree, a condition that underlies finite-sample guarantees for summary methods such as ASTRAL. We study this problem under the multispecies coalescent (MSC) model and derive topology-free upper bounds on the number of loci required to obtain a bipartition cover with prescribed confidence, improving upon the existing bounds of Uricchio et al. (2016). Practically, our bounds remain below biologically realistic numbers of loci across a substantially broader range of parameter settings, expanding their usefulness for empirical datasets. Theoretically, our analysis sharpens our understanding of coalescence under the MSC model and develops new asymptotics for these bounds and absorption times under Kingman's coalescent in the natural short branch regime. We further compare our new bounds with existing work using simulations under a variety of different species-tree topologies.
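To make the definition in the abstract concrete, here is a minimal Python sketch that checks whether a collection of gene trees forms a bipartition cover of a species tree. It illustrates the cover condition only and is not the paper's method or its bounds. The names `make_bipartition` and `is_bipartition_cover`, the five-taxon example, and the representation of each tree as a set of splits are all assumptions made for illustration. Extracting splits from actual tree files is left out.

```python
from typing import FrozenSet, Iterable, Sequence, Set

# A bipartition (split) is stored as an unordered pair of taxon sets,
# so that A|B and B|A compare equal. This representation is an assumption
# of this sketch, not something taken from the paper.
Bipartition = FrozenSet[FrozenSet[str]]


def make_bipartition(side: Iterable[str], taxa: Iterable[str]) -> Bipartition:
    """Return the split side | (taxa - side) in a side-independent form."""
    side_set = frozenset(side)
    other = frozenset(taxa) - side_set
    return frozenset({side_set, other})


def is_bipartition_cover(
    species_splits: Set[Bipartition],
    gene_tree_splits: Sequence[Set[Bipartition]],
) -> bool:
    """True if every species-tree split appears in at least one gene tree."""
    seen: Set[Bipartition] = set().union(*gene_tree_splits)
    return species_splits <= seen


# Hypothetical example on five taxa.
taxa = {"A", "B", "C", "D", "E"}

# Nontrivial splits of the species tree ((A,B),C,(D,E)).
species = {
    make_bipartition({"A", "B"}, taxa),
    make_bipartition({"D", "E"}, taxa),
}

# Two gene trees, each given by its nontrivial splits.
gene_tree_1 = {
    make_bipartition({"A", "B"}, taxa),
    make_bipartition({"C", "D"}, taxa),
}
gene_tree_2 = {
    make_bipartition({"A", "B", "C"}, taxa),  # same split as D,E | A,B,C
    make_bipartition({"A", "C"}, taxa),
}

print(is_bipartition_cover(species, [gene_tree_1, gene_tree_2]))  # True
print(is_bipartition_cover(species, [gene_tree_1]))               # False
```

The first gene tree alone misses the split DE|ABC, so it is not a cover. Adding the second gene tree supplies that split. The bounds discussed in the abstract concern how many loci are needed for this check to succeed with prescribed confidence when gene trees are drawn under the MSC model.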