Improving LLM-Driven Test Generation by Learning from Mocking Information
Jamie Lee, Flynn Teh, Hengcheng Zhu, Mengzhen Li, Mattia Fazzini, et al.
TLDR
MOCKMILL improves LLM-driven unit test generation by learning from developer-defined mocking information in existing test suites.
Key contributions
- Proposes MOCKMILL, an LLM-based tool that generates unit tests using extracted mocking information.
- Guides test generation by exploiting stubbings and interaction expectations from developer-written tests.
- Employs an iterative generation-and-repair process to ensure the executability of generated tests.
- Achieves higher code coverage and mutant killing compared to existing tests and baseline LLM approaches.
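The two kinds of mocking information MOCKMILL extracts are stubbings (canned return values for a test double) and interaction expectations (checks on how the code under test invoked a collaborator). The paper works with Java test suites, but the same concepts can be sketched with Python's `unittest.mock`; the repository and function names below are hypothetical and only illustrate the idea.

```python
from unittest.mock import Mock

def load_user_name(repository, user_id):
    # Hypothetical code under test: depends on a repository collaborator.
    user = repository.find_by_id(user_id)
    return user["name"] if user else None

# Stubbing: the test double is configured to return a canned value,
# isolating the code under test from the real repository.
repo = Mock()
repo.find_by_id.return_value = {"id": 7, "name": "Ada"}

name = load_user_name(repo, 7)  # name == "Ada"

# Interaction expectation: verify the collaborator was called
# exactly once, with the expected argument.
repo.find_by_id.assert_called_once_with(7)
```

In this framing, stubbings tell a test generator which return values make a dependency usable, while interaction expectations tell it which calls a correct test should verify.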
Why it matters
LLMs show promise for unit test generation, but they often lack context about the dependencies that developer-written tests replace with test doubles. This paper enhances LLM-driven test generation by reusing the mocking information already encoded in existing test suites, yielding tests that cover lines and kill mutants that both existing tests and baseline LLM approaches miss.
Original Abstract
Large Language Models (LLMs) have recently shown strong potential for automated unit test generation. This has motivated us to investigate whether developer-defined test doubles (commonly referred to as mocks) available in existing test suites can be leveraged to improve LLM-driven test generation. To this end, we propose MOCKMILL, an LLM-based technique and tool that generates test cases by exploiting mocking information automatically extracted from developer-written tests. MOCKMILL targets components that are replaced by test doubles in existing tests and uses the encoded stubbings and interaction expectations to guide test generation, combined with an iterative generation-and-repair process to ensure executable tests. We evaluated MOCKMILL on 10 open-source classes from six Java projects using four LLMs, and compared the generated tests with existing project tests and tests produced by baseline approaches. The results show that MOCKMILL's tests cover lines of code and kill mutants that existing tests and baseline-generated tests miss. Overall, our findings provide preliminary evidence that leveraging mocking information is a complementary and effective way to enhance LLM-based test generation.