ArXiv TLDR

Commit-Aware Learning-Based Test Case Prioritization for Continuous Integration

2604.25363

Lorenzo Abbondante, Gerardo Canfora

cs.SE

TLDR

This paper introduces a commit-aware, learning-based test case prioritization method for CI that leverages structural code changes to detect faults earlier.

Key contributions

  • Develops a commit-aware, learning-based test case prioritization (TCP) method for CI.
  • Integrates structural properties of code diffs, test coverage, and historical data into a unified model.
  • Estimates, for each new commit, the probability that each test suite reveals at least one failure, and orders test execution by that estimate.
  • Outperforms non-commit-aware baselines in both classification and prioritization effectiveness on five Defects4J projects.
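
The prioritization step described in the bullets above can be sketched as follows. This is a minimal illustration, not the authors' model: the feature names, the weights, the bias, and the logistic scoring function are all assumptions chosen for the example, standing in for whatever predictor the paper actually trains.

```python
import math

# Hypothetical feature weights (assumption). A real model would learn these
# from diff structure, coverage relations, and execution history.
WEIGHTS = {
    "changed_lines_covered": 0.08,   # diff lines that the test covers
    "covered_files_touched": 0.9,    # files in the diff that the test covers
    "recent_failure_rate": 2.5,      # share of recent runs that failed
}
BIAS = -3.0  # hypothetical intercept


def failure_probability(features):
    """Logistic score: estimated probability that the test fails on this commit."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def prioritize(test_features):
    """Return test names ordered from most to least likely to fail."""
    scores = {test: failure_probability(f) for test, f in test_features.items()}
    # Sort by descending probability; break ties by name for a stable order.
    return sorted(scores, key=lambda test: (-scores[test], test))


# Hypothetical per-test features for a single incoming commit.
commit_view = {
    "ParserTest": {"changed_lines_covered": 40, "covered_files_touched": 2,
                   "recent_failure_rate": 0.3},
    "IoTest": {"changed_lines_covered": 0, "covered_files_touched": 0,
               "recent_failure_rate": 0.05},
    "CacheTest": {"changed_lines_covered": 5, "covered_files_touched": 1,
                  "recent_failure_rate": 0.0},
}
order = prioritize(commit_view)
print(order)
```

Tests that cover the changed code and have failed recently move to the front of the queue, so a CI run that is cut short still executes the suites most likely to expose a fault.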

Why it matters

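Regression testing in CI grows more expensive as test suites get larger and run more often. This paper's TCP approach adds structural information from commit diffs to the usual history and coverage signals, and it detects faults earlier than non-commit-aware baselines on five Defects4J projects. Because the evaluation is cross-project (leave-one-project-out), the results suggest the model generalizes to projects it was not trained on.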

Original Abstract

Regression testing in Continuous Integration (CI) pipelines is increasingly costly due to the growing size and execution frequency of test suites. Test Case Prioritization (TCP) mitigates this problem by reordering tests to expose faults earlier. However, most existing techniques rely primarily on historical execution data and coverage metrics, neglecting the rich structural information contained in code changes. This paper proposes a commit-aware, learning-based TCP method that combines structural properties of version-control diffs, test coverage relations, and historical execution behavior into a unified predictive model. Given a new commit, the method estimates the probability that each test suite will reveal at least one failure and prioritizes test execution accordingly. We evaluate our method on five Defects4J projects using a leave-one-project-out cross-project validation setting. Results show that the commit-aware TCP significantly outperforms non-commit-aware baselines in both classification and prioritization effectiveness. Our findings show that including commit structural semantics substantially enhances regression fault detection and enables robust, generalizable learning-based TCP in CI environments.
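
The leave-one-project-out setting mentioned in the abstract can be illustrated with a short sketch. The sample layout and the project names below are placeholders (assumptions), since the abstract does not name the five Defects4J projects or describe the data format.

```python
def leave_one_project_out(samples):
    """Yield (held_out_project, train, test) once per project.

    Each sample is a dict with at least a "project" key. The model is
    trained on every project except one and tested on the one left out.
    """
    projects = sorted({s["project"] for s in samples})
    for held_out in projects:
        train = [s for s in samples if s["project"] != held_out]
        test = [s for s in samples if s["project"] == held_out]
        yield held_out, train, test


# Hypothetical samples: one row per (commit, test suite) pair.
samples = [
    {"project": "project_a", "suite": "ParserTest", "failed": 1},
    {"project": "project_a", "suite": "IoTest", "failed": 0},
    {"project": "project_b", "suite": "CacheTest", "failed": 0},
    {"project": "project_c", "suite": "MathTest", "failed": 1},
]
folds = list(leave_one_project_out(samples))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Because no sample from the held-out project appears in the training split, each fold measures how well the model transfers to a project it has never seen.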
