APITestGenie: Generating Web API Tests from Requirements and API Specifications with LLMs

April 2, 2026 · arXiv:2604.02039

André Pereira, Bruno Lima, João Pascoal Faria

cs.SE

TLDR

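APITestGenie uses LLMs and RAG to generate Web API integration tests from business requirements and OpenAPI specifications, producing valid test scripts for 89% of the requirements tested and revealing previously unknown API defects.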

Key contributions

Introduces APITestGenie, an LLM-powered tool for generating API integration tests from requirements.
Generated valid test scripts for 89% of the business requirements tested across 10 real-world APIs, 8 of them industrial, within at most three attempts.
Identified previously unknown API defects, including integration issues, demonstrating practical value.
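Found that API complexity and the level of detail in business requirements are the primary factors influencing test generation success, with the level of detail in API documentation also playing a role.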

Why it matters

Writing API tests is still largely manual, time-consuming, and error-prone. This paper introduces APITestGenie, which automates much of that work with LLMs, reducing manual effort and improving the alignment between tests and requirements. The defects it uncovered in real APIs and the strong interest from industry practitioners highlight its practical impact on software quality.

Original Abstract

Modern software systems rely heavily on Web APIs, yet creating meaningful and executable test scripts remains a largely manual, time-consuming, and error-prone task. In this paper, we present APITestGenie, a novel tool that leverages Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and prompt engineering to automatically generate API integration tests directly from business requirements and OpenAPI specifications. We evaluated APITestGenie on 10 real-world APIs, including 8 APIs comprising circa 1,000 live endpoints from an industrial partner in the automotive domain. The tool was able to generate syntactically and semantically valid test scripts for 89% of the business requirements under test after at most three attempts. Notably, some generated tests revealed previously unknown defects in the APIs, including integration issues between endpoints. Statistical analysis identified API complexity and level of detail in business requirements as primary factors influencing success rates, with the level of detail in API documentation also affecting outcomes. Feedback from industry practitioners confirmed strong interest in adoption, substantially reducing the manual effort in writing acceptance tests, and improving the alignment between tests and business requirements.
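To make the workflow in the abstract more concrete, here is a minimal Python sketch of a retrieve-then-generate loop bounded to three attempts. It is an illustration under stated assumptions, not the authors' implementation. The function names (`retrieve_endpoints`, `generate_with_retries`), the word-overlap ranking that stands in for RAG, and the `generate` and `validate` callbacks are all hypothetical. Only the three-attempt bound and the use of requirements plus an OpenAPI specification come from the abstract.

```python
from typing import Callable, Optional

# The abstract reports valid scripts "after at most three attempts".
MAX_ATTEMPTS = 3


def retrieve_endpoints(requirement: str, spec: dict, top_k: int = 2) -> list[str]:
    """Rank OpenAPI operations by naive word overlap with the requirement.

    Hypothetical stand-in for the retrieval step; a real RAG pipeline
    would more likely use embeddings over the specification.
    """
    words = set(requirement.lower().split())
    scored = []
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            summary_words = set(op.get("summary", "").lower().split())
            scored.append((len(words & summary_words), f"{method.upper()} {path}"))
    # Stable sort: operations with equal scores keep their spec order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]


def generate_with_retries(
    requirement: str,
    spec: dict,
    generate: Callable[[str, list[str], str], str],
    validate: Callable[[str], tuple[bool, str]],
) -> Optional[str]:
    """Call the generator up to MAX_ATTEMPTS times, feeding back errors.

    `generate` (assumed to wrap an LLM call) receives the requirement, the
    retrieved endpoints, and the previous validation feedback. `validate`
    (assumed to check the script) returns (is_valid, feedback).
    """
    context = retrieve_endpoints(requirement, spec)
    feedback = ""
    for _ in range(MAX_ATTEMPTS):
        script = generate(requirement, context, feedback)
        is_valid, feedback = validate(script)
        if is_valid:
            return script
    return None  # no valid script within the attempt budget
```

In this sketch a requirement counts as a success only if some attempt passes validation, which mirrors how the 89% figure is phrased in the abstract. How the tool actually validates scripts or builds its prompts is not specified there.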
