ArXiv TLDR

The fitness landscape of overlapping genes

2604.00602

Orson Kirsch, Nicole Wood, Steven A Redford, Kabir Husain

q-bio.BM, physics.bio-ph, q-bio.PE

TLDR

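This paper maps the joint fitness landscape of overlapping genes, finding widespread compatibility between protein families, a fundamental trade-off between the two encoded proteins, and a natural genetic code that is uniquely well-suited to supporting overlaps.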

Key contributions

  • Developed a computational method to design de novo sequences for two overlapping proteins.
  • Identified a fundamental trade-off between the two overlapping proteins and a simple criterion for when overlap is feasible.
  • Showed widespread compatibility between protein families, with one class of reading frames markedly more permissive than the others.
  • Demonstrated the natural genetic code's unique suitability for supporting overlapping genes.
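
To make the notion of reading frames concrete, here is a minimal Python sketch that translates one DNA sequence in each of its three same-strand frames using the standard genetic code. It is an illustration only, not the paper's design method: the `translate` helper and the example sequence are hypothetical, and only same-strand frames are shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate `dna` starting at offset `frame` (0, 1, or 2).

    Hypothetical helper for illustration: trailing bases that do not
    form a complete codon are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Made-up example sequence: the same nucleotides spell a different
# amino-acid string in each frame.
dna = "ATGGCCAAATGA"
frames = {f: translate(dna, f) for f in range(3)}
for f, protein in frames.items():
    print(f"frame {f}: {protein}")
```

Because every nucleotide belongs to one codon in each frame, a single mutation can change both encoded proteins at once, which is why the two sets of functional constraints are coupled.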

Why it matters

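This paper clarifies the design principles that govern overlapping genes. It shows that protein fitness landscapes are flexible enough to accommodate two proteins encoded in the same DNA sequence, that the natural genetic code is uniquely well-suited to such overlaps, and that sequence-diverse overlapped genes can be connected by near-neutral mutations. These results bear on both genome evolution and protein design.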

Original Abstract

Natural genomes sometimes encode two different proteins in staggered reading frames of the same DNA sequence. Despite the prevalence of these 'overlapping genes' across the tree of life, it remains unknown whether arbitrary protein pairs can overlap, to what extent such overlaps are feasible, or what design principles govern them. Here, we study compatibility, frustration, and connectivity in the fitness landscape of overlapping genes. We computationally design sequences de novo that satisfy the dual functional constraints of two distinct protein families. The joint fitness landscape, inferred via Potts models from multiple sequence alignments, reveals a fundamental trade-off between the two proteins and provides a simple criterion for when overlap is feasible. We find widespread compatibility between protein families, with one class of reading frames markedly more permissible than others. By exploring alternative genetic codes, we find that the natural genetic code is uniquely well-suited to support overlapping genes. Constructing mutational paths between sequences, we find that sequence-diverse overlapped genes can be connected via a network of near-neutral mutations. Overall, our results suggest that protein fitness landscapes are sufficiently flexible so as to accommodate the stringent, orthogonal requirements of overlapping genes.
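
For readers unfamiliar with Potts models, the standard form of the energy assigned to an amino-acid sequence is sketched below. The abstract does not state the paper's exact parameterization or how the two proteins' scores are combined, so the additive joint energy in the second line is an assumption made for illustration, as are the symbols $T_A$ and $T_B$ (translation of the nucleotide sequence $s$ in the frame of protein A or B).

```latex
% Standard Potts energy for a protein family X fitted to a multiple sequence alignment:
% h_i are per-site fields, J_{ij} are pairwise couplings, a_i is the residue at site i.
E_X(\mathbf{a}) = -\sum_{i} h_i(a_i) \;-\; \sum_{i<j} J_{ij}(a_i, a_j)

% Assumed (illustrative) joint energy of a nucleotide sequence s encoding both proteins:
E_{\mathrm{joint}}(s) = E_A\big(T_A(s)\big) + E_B\big(T_B(s)\big)
```

Under this convention a lower energy corresponds to a sequence that looks more like a natural member of the family, so a trade-off appears whenever lowering $E_A$ forces $E_B$ to rise.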
