A Combinatorial Optimisation Approach to Multi-factorial Gap-filling in Genome-scale Metabolic Models (GEMs)

arXiv:2604.25233

Philip Kilby, Sevvandi Kandanaarachchi, Matthew J. Morgan, Amy M. Paten, Mariana Velasque + 2 more

math.OC, q-bio.GN

TLDR

This paper presents a metaheuristic combinatorial optimization method for multi-factorial gap-filling in GEMs that outperforms conventional approaches by an average of 7.3% on Kendall Tau and 13.3% on RMS Error.

Key contributions

  • Introduces a novel combinatorial optimization method for multi-factorial gap-filling in Genome-scale Metabolic Models (GEMs).
  • Employs classic metaheuristic approaches that require solving only continuous Linear Programming problems, avoiding the large Integer Linear Programming problem used by traditional gap-filling.
  • Simultaneously gap-fills GEMs across 9-28 media conditions, improving overall predictive accuracy.
  • Outperforms conventional methods by an average of 7.3% (Kendall Tau) and 13.3% (RMS Error) when gap-filling GEMs for three bacterial strains.
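
The reaction-selection idea in the contributions above can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a plain bit-flip hill climb, one of the simplest classic metaheuristics, and every name in it (`local_search`, `toy_score`, `OBSERVED`, `ENABLES`, the `R_*` reaction labels) is hypothetical. The scoring function is a toy stand-in; in a real setting, each evaluation would presumably solve continuous Linear Programming problems (one per medium) to predict growth.

```python
import random
from typing import Callable, FrozenSet, Iterable, Set, Tuple

# Hypothetical toy data: observed growth per medium, and the single
# reaction that (in this toy) enables growth on that medium.
OBSERVED = {"glucose": True, "lactose": True, "citrate": False}
ENABLES = {"glucose": "R_glc", "lactose": "R_lac", "citrate": "R_cit"}


def toy_score(selected: FrozenSet[str]) -> float:
    """Stand-in for the LP-based evaluation (an assumption, not the paper's).

    Counts media where predicted growth agrees with the observation,
    minus a small parsimony penalty per added reaction.
    """
    agree = sum((ENABLES[m] in selected) == grows for m, grows in OBSERVED.items())
    return agree - 0.01 * len(selected)


def local_search(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
    max_sweeps: int = 100,
    seed: int = 0,
) -> Tuple[Set[str], float]:
    """Bit-flip hill climbing over subsets of candidate reactions."""
    rng = random.Random(seed)
    pool = sorted(candidates)
    current: Set[str] = set()
    best = score(frozenset(current))
    for _ in range(max_sweeps):
        improved = False
        rng.shuffle(pool)                 # visit candidates in random order
        for rxn in pool:
            current ^= {rxn}              # toggle the reaction in or out
            trial = score(frozenset(current))
            if trial > best:
                best, improved = trial, True
            else:
                current ^= {rxn}          # no gain: undo the toggle
        if not improved:                  # local optimum reached
            break
    return current, best


chosen, best = local_search(["R_glc", "R_lac", "R_cit", "R_extra"], toy_score)
print(sorted(chosen), best)
```

In this toy, adding `R_cit` would predict growth on a medium where none was observed, so the search rejects it; that mirrors, in miniature, the goal of fitting several media at once without introducing unrealistic predictions. Single-flip hill climbing can stall when a gain needs several reactions added together, which is one reason richer metaheuristics are used in practice.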

Why it matters

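Traditional computational gap-filling corrects a GEM for one medium at a time by solving a large Integer Linear Programming problem, which is slow and, when applied sequentially across media, often introduces unrealistic predictions in other media. This paper instead gap-fills across many media simultaneously, treating reaction selection as a combinatorial optimization problem that needs only continuous Linear Programming solves, and reports better predictive accuracy than conventional approaches on three bacterial strains.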

Original Abstract

Genome-Scale Metabolic Models (GEMs) describe the interactions between genes, proteins, and the biochemical reactions that underpin an organism's metabolism, aiming to computationally simulate functions at the cellular level. While many metabolic reactions can be inferred from genome analysis, constructing GEMs often involves incorporating reactions unsupported by genomic data to improve prediction accuracy. This is known as gap-filling, a process that can be performed manually (a time-consuming task) or computationally. Traditional computational gap-filling approaches aim to correct GEM predictions for a single environmental condition (medium) by solving a large Integer Linear Programming problem. Sequential application across multiple media can produce a more robust model, but often introduces unrealistic predictions in other media. These approaches are also slow to run. In this paper, we study multi-factorial gap-filling, which aims to gap-fill GEMs across typically 10 or more input media simultaneously, while improving their overall predictive accuracy and minimising unrealistic behaviour. We view the selection of the set of reactions as a combinatorial optimisation problem, and describe a method based on classic metaheuristic approaches which requires the solution of continuous Linear Programming problems only. This paper provides an introduction to this problem for an audience whose speciality lies outside biology, and suggests a practical first-cut solution method. We demonstrate the method by gap-filling GEMs for three bacterial strains, selecting 3000 to 4000 reactions from a database of more than 11000 reactions, while attempting to match the empirically measured performance on 9 to 28 separate media conditions. We show that our method outperforms conventional approaches on multiple metrics, including Kendall Tau and RMS Error, by an average of 7.3% and 13.3%, respectively.
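
For readers unfamiliar with the two metrics named at the end of the abstract, the following is a small stdlib-only sketch of how they are commonly computed. It assumes the comparison is between predicted and measured values across media; the abstract does not say which Kendall Tau variant is used, so the version below (tau-a, with tied pairs counted as neither concordant nor discordant) is an assumption, and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)      # +1 concordant, -1 discordant, 0 tie
    n_pairs = len(x) * (len(x) - 1) // 2
    return s / n_pairs


def rms_error(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square error between predictions and observations."""
    if len(pred) != len(obs) or not pred:
        raise ValueError("need two equal-length, non-empty sequences")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))
```

Kendall Tau rewards getting the ranking of media right (higher is better), while RMS Error penalises the size of the numerical mismatch (lower is better), so the two metrics capture different aspects of predictive accuracy.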
