TDD Governance for Multi-Agent Code Generation via Prompt Engineering
Tarlan Hasanli, Shahbaz Siddeeq, Bishwash Khanal, Pyry Kotilainen, Tommi Mikkonen + 1 more
TLDR
An AI-native TDD framework uses prompt-level and workflow-level governance to enforce TDD principles, with the aim of improving stability and reproducibility in LLM-assisted software development.
Key contributions
- Presents an AI-native TDD framework operationalizing TDD principles via prompt and workflow governance.
- Formalizes TDD principles into a machine-readable manifesto for structured enforcement across development stages.
- Introduces a layered architecture separating LLM proposals from deterministic engine authority.
- Enforces phase ordering, bounded repair loops, validation gates, and atomic mutation control for stability.
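The enforcement mechanisms listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `run_tdd_cycle`, its parameters, the status strings, and the toy proposers are all hypothetical names chosen for illustration. The sketch assumes the model only proposes file changes, while a deterministic engine decides whether they are committed.

```python
from typing import Callable, Dict, Tuple

Workspace = Dict[str, str]  # file path -> file contents (hypothetical representation)


def run_tdd_cycle(
    workspace: Workspace,
    propose_test: Callable[[Workspace], Workspace],
    propose_fix: Callable[[Workspace, int], Workspace],
    tests_pass: Callable[[Workspace], bool],
    max_repairs: int = 3,
) -> Tuple[Workspace, str]:
    """Run one Red-Green cycle; the engine, not the model, decides what is kept."""
    # Phase ordering: the RED phase always runs first. The proposed test is
    # applied to a copy, so the committed workspace is never half-modified.
    red = {**workspace, **propose_test(workspace)}

    # Validation gate: a new test that already passes proves nothing.
    if tests_pass(red):
        return workspace, "rejected: new test did not fail"

    # GREEN phase with a bounded repair loop: at most max_repairs proposals.
    for attempt in range(max_repairs):
        candidate = {**red, **propose_fix(red, attempt)}
        if tests_pass(candidate):
            # Atomic mutation: test and code are committed together.
            return candidate, "green"

    # Budget exhausted: roll back to the original workspace.
    return workspace, "rejected: repair budget exhausted"


# Toy stand-ins for the model and the test runner (assumptions for the demo).
def toy_tests_pass(ws: Workspace) -> bool:
    return "return a + b" in ws.get("add.py", "")


def toy_propose_test(ws: Workspace) -> Workspace:
    return {"test_add.py": "assert add(2, 3) == 5"}


def toy_propose_fix(ws: Workspace, attempt: int) -> Workspace:
    body = "return a - b" if attempt == 0 else "return a + b"
    return {"add.py": f"def add(a, b): {body}"}


final, status = run_tdd_cycle({}, toy_propose_test, toy_propose_fix, toy_tests_pass)
print(status)  # prints "green" (the second repair attempt passes)
```

In this sketch the model's first fix is wrong and the second is accepted. With `max_repairs=1` the same run would be rolled back, which is the kind of bounded, reproducible behavior the framework aims for.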
Why it matters
Left unconstrained, LLMs tend to produce unstable, non-deterministic output and adhere weakly to development discipline. This paper addresses the problem by building classical TDD principles directly into LLM workflows as enforceable process constraints, rather than treating tests as auxiliary inputs. The authors describe the architecture and present it as a promising direction for more reliable and reproducible LLM-assisted development.
Original Abstract
Large language models (LLMs) accelerate software development but often exhibit instability, non-determinism, and weak adherence to development discipline in unconstrained workflows. While test-driven development (TDD) provides a structured Red-Green-Refactor process, existing LLM-based approaches typically use tests as auxiliary inputs rather than enforceable process constraints. We present an AI-native TDD framework that operationalizes classical TDD principles as structured prompt-level and workflow-level governance mechanisms. Extracted principles are formalized in a machine-readable manifesto and distributed across planning, generation, repair, and validation stages within a layered architecture that separates model proposal from deterministic engine authority. The system enforces phase ordering, bounded repair loops, validation gates, and atomic mutation control to improve stability and reproducibility. We describe architecture and discuss encoding software engineering discipline directly into prompt orchestration, which we think offers a promising direction for reliable LLM-assisted development.
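As a rough illustration of what a machine-readable manifesto distributed across stages might look like, the sketch below stores rules as data and injects only the rules for the current stage into a prompt. The rule identifiers, rule wording, and the helpers `rules_for` and `build_prompt` are assumptions made for this example; the abstract does not specify the manifesto's actual format or contents.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    rule_id: str
    stage: str
    text: str


# The four stage names come from the abstract; the rules themselves are invented.
STAGES: Tuple[str, ...] = ("planning", "generation", "repair", "validation")

MANIFESTO: Tuple[Rule, ...] = (
    Rule("R1", "planning", "Derive one failing test per requirement before any code."),
    Rule("R2", "generation", "Write only the code needed to make the failing test pass."),
    Rule("R3", "repair", "Change implementation files only; never edit the tests."),
    Rule("R4", "validation", "Accept a change only when the full test suite passes."),
)


def rules_for(stage: str) -> List[Rule]:
    """Return the manifesto rules that govern a single stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    return [rule for rule in MANIFESTO if rule.stage == stage]


def build_prompt(stage: str, task: str) -> str:
    """Render a stage-specific prompt that carries only that stage's rules."""
    lines = [f"Stage: {stage}", "Rules:"]
    lines += [f"- [{rule.rule_id}] {rule.text}" for rule in rules_for(stage)]
    lines.append(f"Task: {task}")
    return "\n".join(lines)


print(build_prompt("repair", "fix add()"))
```

Keeping the rules as data rather than free-form prompt text means the same manifesto can both drive the prompts and be checked by a deterministic engine, which matches the separation of model proposal from engine authority described in the abstract.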