Tree-Conditioned Edit Flows for Ancestral Sequence Reconstruction
Emil Sharafutdinov, Ingemar André
TLDR
A tree-conditioned edit-flow model reconstructs ancestral protein sequences of variable length, modeling the insertions and deletions that classical ASR methods handle only weakly or not at all.
Key contributions
- Introduces a novel tree-conditioned edit-flow model for ancestral sequence reconstruction (ASR).
- Handles variable-length sequences by modeling insertions and deletions (indels) alongside substitutions.
- Reconstructs ancestors using paired bidirectional edit trajectories.
- Localizes inferred evolutionary change most accurately among the compared methods on a benchmark of natural homologous sequences with abundant indels.
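
To make the idea of paired edit trajectories concrete, here is a minimal Python sketch. It is not the authors' model: the `Edit` type, the helper functions, the toy sequences, and the hand-written trajectories are all hypothetical, and a learned edit flow would propose the edits rather than have them supplied. The sketch only shows what it means for two trajectories, one starting from each descendant, to agree on a common ancestral state.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Edit:
    """One edit operation (hypothetical representation, for illustration only)."""
    op: str            # "sub", "ins", or "del"
    pos: int           # 0-based position in the current sequence
    residue: str = ""  # new residue for "sub" and "ins"; unused for "del"

def apply_edit(seq: str, edit: Edit) -> str:
    """Apply a single substitution, insertion, or deletion to a sequence."""
    if edit.op == "sub":
        return seq[:edit.pos] + edit.residue + seq[edit.pos + 1:]
    if edit.op == "ins":
        return seq[:edit.pos] + edit.residue + seq[edit.pos:]
    if edit.op == "del":
        return seq[:edit.pos] + seq[edit.pos + 1:]
    raise ValueError(f"unknown edit operation: {edit.op}")

def apply_trajectory(seq: str, edits: List[Edit]) -> List[str]:
    """Return every intermediate state, starting with the input sequence."""
    states = [seq]
    for edit in edits:
        seq = apply_edit(seq, edit)
        states.append(seq)
    return states

# Two toy descendants of different lengths (made-up sequences).
left_descendant = "MKTAYIAK"
right_descendant = "MKVYIGAK"

# Hand-written trajectories running from each descendant back toward the ancestor.
left_edits = [Edit("del", 3)]                       # remove the 'A' at index 3
right_edits = [Edit("sub", 2, "T"), Edit("del", 5)]  # V -> T, then remove 'G'

left_states = apply_trajectory(left_descendant, left_edits)
right_states = apply_trajectory(right_descendant, right_edits)

# The pairing constraint: both trajectories must end in the same ancestral state.
trajectories_agree = left_states[-1] == right_states[-1]
print(left_states[-1], right_states[-1], trajectories_agree)
```

Because the edits include insertions and deletions, the two descendants and the ancestor can all have different lengths, and no prior alignment is needed to express the trajectories.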
Why it matters
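Classical ASR methods, typically built on continuous-time Markov substitution models, treat sites largely independently and handle insertions and deletions only weakly or not at all. This model reconstructs variable-length ancestors directly and most accurately localizes inferred evolutionary change on natural sequences with abundant indels. On a substitution-only benchmark of experimentally evolved sequences it does not match the best classical method, so the reported advantage applies to indel-rich data.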
Original Abstract
Ancestral sequence reconstruction (ASR) aims to infer extinct protein sequences at internal nodes of a phylogenetic tree. Classical ASR methods are typically based on continuous-time Markov substitution models, but they treat sites largely independently and handle insertions and deletions only weakly or not at all. We introduce a tree-conditioned edit-flow model for variable-length ASR. Given two descendant sequences and their branch distances to a shared ancestor, the model reconstructs the ancestor through paired bidirectional edit trajectories constrained to agree on a common ancestral state. On a benchmark of experimentally evolved sequences with only context-independent substitutions, the model does not match the accuracy of the best classical method, yet still achieves reasonable performance despite being trained on natural sequences that include insertions, deletions, and substitutions. On a benchmark of natural homologous sequences with abundant insertions and deletions, the model most accurately localizes inferred evolutionary change.
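
For contrast, the classical site-independent approach mentioned in the abstract can be sketched in a few lines. The sketch below assumes an equal-rates continuous-time Markov model over the 20 amino acids (a Jukes-Cantor-style simplification, not the substitution matrices used by production ASR tools) and a uniform prior at the ancestor. The function names are illustrative. For each aligned site with descendant residues x and y at branch distances t_x and t_y, the posterior over the ancestral residue a is proportional to π_a · P(a→x | t_x) · P(a→y | t_y).

```python
import math
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = len(AMINO_ACIDS)  # 20 states

def transition_prob(a: str, b: str, t: float) -> float:
    """P(a -> b | t) under an equal-rates model, with branch length t
    measured in expected substitutions per site."""
    decay = math.exp(-K / (K - 1) * t)
    if a == b:
        return 1 / K + (K - 1) / K * decay
    return (1 - decay) / K

def site_posterior(x: str, y: str, t_x: float, t_y: float) -> Dict[str, float]:
    """Posterior over the ancestral residue at one site, uniform prior."""
    weights = {
        a: (1 / K) * transition_prob(a, x, t_x) * transition_prob(a, y, t_y)
        for a in AMINO_ACIDS
    }
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

def reconstruct(seq_x: str, seq_y: str, t_x: float, t_y: float) -> str:
    """Most probable ancestor, site by site. Requires equal-length,
    gap-free aligned inputs: this simplified model has no notion of indels."""
    if len(seq_x) != len(seq_y):
        raise ValueError("site-independent model needs aligned, equal-length sequences")
    ancestor = []
    for x, y in zip(seq_x, seq_y):
        posterior = site_posterior(x, y, t_x, t_y)
        ancestor.append(max(posterior, key=posterior.get))
    return "".join(ancestor)

# Where the descendants disagree, the residue on the shorter branch is favored.
posterior = site_posterior("A", "G", t_x=0.1, t_y=0.5)
print(round(posterior["A"], 3), round(posterior["G"], 3))
print(reconstruct("MKTA", "MKVA", t_x=0.1, t_y=0.5))
```

The length check in `reconstruct` makes the limitation explicit: each site is scored on its own, and sequences of different lengths cannot be compared without first committing to an alignment.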