
Enhancing Program Repair with Specification Guidance and Intermediate Behavioral Signals

2604.11770

Minh Le-Anh, Cuong Chi Le, Tien N. Nguyen

cs.SE

TLDR

SpecTune improves automated program repair by using specification-guided intermediate behavioral signals, enabling precise fault localization and targeted patches.

Key contributions

  • Incorporates intermediate behavioral reasoning into Automated Program Repair (APR) using localized postconditions.
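  • Decomposes the repair task into suspicious regions connected by execution checkpoints, and evaluates postconditions at those checkpoints to produce micro-level debugging signals.
  • Introduces two complementary signals: `alpha`, which estimates the consistency of LLM-generated postconditions using partially passing test cases, and `beta`, which detects violations of validated postconditions during execution.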
  • Achieves more precise fault localization and improved APR effectiveness compared to baseline methods.
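
To make the checkpoint idea concrete, here is a minimal sketch of how localized postconditions could point at a suspicious region. It is an illustration under stated assumptions, not SpecTune's implementation: the function `buggy_mean_of_positives`, the checkpoint names, the hand-written `POSTCONDITIONS`, and the helper `first_violation` are all hypothetical, and in the paper's setting the postconditions would be generated by an LLM rather than written by hand.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, object]
Postcondition = Callable[[State], bool]


def buggy_mean_of_positives(xs: List[float], trace: List[Tuple[str, State]]) -> float:
    """Hypothetical buggy program, instrumented with three execution checkpoints."""
    positives = [x for x in xs if x >= 0]  # bug: zero is kept, should be x > 0
    trace.append(("after_filter", {"positives": positives}))

    total = sum(positives)
    trace.append(("after_sum", {"positives": positives, "total": total}))

    result = total / len(positives) if positives else 0.0
    trace.append(("after_divide", {"positives": positives, "total": total, "result": result}))
    return result


# Hypothetical localized postconditions, one per checkpoint.
POSTCONDITIONS: Dict[str, Postcondition] = {
    "after_filter": lambda s: all(x > 0 for x in s["positives"]),
    "after_sum": lambda s: s["total"] == sum(s["positives"]),
    "after_divide": lambda s: (not s["positives"])
    or s["result"] == s["total"] / len(s["positives"]),
}


def first_violation(xs: List[float]) -> Optional[str]:
    """Run the program and return the first checkpoint whose postcondition fails."""
    trace: List[Tuple[str, State]] = []
    buggy_mean_of_positives(xs, trace)
    for checkpoint, state in trace:
        if not POSTCONDITIONS[checkpoint](state):
            return checkpoint
    return None


print(first_violation([0.0, 2.0, 4.0]))
print(first_violation([1.0, 3.0]))
```

An end-to-end test on `[0.0, 2.0, 4.0]` would only report a wrong mean. The checkpoint-level check instead names the region that ends at `after_filter`, which is the kind of finer-grained signal the contributions above describe.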

Why it matters

LLM-based program repair typically relies on coarse end-to-end test outcomes, which give little insight into where a program's internal logic goes wrong. SpecTune adds specification-guided reasoning about intermediate program states, mimicking how humans debug. This yields more precise fault localization and more targeted patches, advancing automated software maintenance.

Original Abstract

Automated Program Repair (APR) has recently benefited from large language models (LLMs). However, most LLM-based APR approaches still rely primarily on coarse end-to-end signals from test-suite outcomes to guide repair, providing limited insight into where a program's internal logic deviates from its intended behavior. In contrast, human debugging often relies on intermediate reasoning about program states through localized correctness conditions or assertions. Inspired by this observation, we propose SpecTune, a specification-guided debugging framework that incorporates intermediate behavioral reasoning into APR. SpecTune decomposes the repair task into suspicious regions connected by execution checkpoints and derives localized postconditions representing expected program behaviors at those points. By executing the buggy program and evaluating these postconditions, SpecTune produces micro-level debugging signals that indicate mismatches between observed and intended behaviors, enabling more precise fault localization and targeted patch generation. To address the potential unreliability of LLM-generated postconditions, we introduce two complementary signals: a specification validation signal alpha, which estimates the consistency of generated postconditions using partially passing test cases, and a discriminative signal beta, which detects violations of validated postconditions during execution. With these signals, SpecTune safely leverages automatically generated specifications for APR. Experimental results show that SpecTune improves fault localization and APR effectiveness over the baselines.
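
The abstract does not give formulas for alpha and beta, so the following is only a rough sketch of one way the two signals could fit together. Everything here is an assumption made for illustration: alpha is taken to be the fraction of checkpoint states from passing tests on which a candidate postcondition holds, a postcondition counts as validated when alpha reaches a threshold, and beta is taken to be whether a validated postcondition is violated in any state from a failing test. The function names, the threshold, and the sample states are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Postcondition = Callable[[State], bool]


def alpha(post: Postcondition, passing_states: List[State]) -> float:
    """Assumed validation signal: share of passing-test states where the postcondition holds."""
    if not passing_states:
        return 0.0
    return sum(1 for s in passing_states if post(s)) / len(passing_states)


def beta(post: Postcondition, failing_states: List[State]) -> bool:
    """Assumed discriminative signal: True if any failing-test state violates the postcondition."""
    return any(not post(s) for s in failing_states)


def debugging_signals(
    candidates: Dict[str, Postcondition],
    passing_states: List[State],
    failing_states: List[State],
    threshold: float = 1.0,  # hypothetical cutoff for treating a postcondition as validated
) -> Dict[str, Dict[str, object]]:
    """Compute alpha for every candidate, and beta only for validated candidates."""
    signals: Dict[str, Dict[str, object]] = {}
    for name, post in candidates.items():
        a = alpha(post, passing_states)
        validated = a >= threshold
        signals[name] = {
            "alpha": a,
            "validated": validated,
            "beta": validated and beta(post, failing_states),
        }
    return signals


# Hypothetical states captured at one checkpoint of a binary-search-like routine.
passing = [{"lo": 0, "hi": 4, "mid": 2}, {"lo": 1, "hi": 5, "mid": 3}]
failing = [{"lo": 3, "hi": 4, "mid": 7}]

candidates: Dict[str, Postcondition] = {
    "mid_in_range": lambda s: s["lo"] <= s["mid"] <= s["hi"],
    "mid_is_even": lambda s: s["mid"] % 2 == 0,  # a spurious generated postcondition
}

signals = debugging_signals(candidates, passing, failing)
print(signals)
```

In this toy setup the spurious `mid_is_even` postcondition is filtered out because it already fails on a passing test, so only `mid_in_range` is allowed to flag the failing execution. That mirrors the stated goal of using generated specifications safely, though the paper's actual definitions may differ.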
