FunFuzz: An LLM-Powered Evolutionary Fuzzing Framework
Mario Rodríguez Béjar, B. Romera-Paredes, Jose L. Hernández-Ramos
TLDR
FunFuzz is an LLM-powered evolutionary fuzzer that runs several isolated searches ("islands") in parallel to broaden exploration, achieving higher compiler coverage and more unique compiler-internal failures than previous LLM-driven baselines on GCC and Clang.
Key contributions
- Introduces FunFuzz, a multi-island evolutionary fuzzing framework powered by LLMs.
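- Reduces sensitivity to prompt initialization and sampling variance by running parallel, isolated searches and periodically migrating high-value candidates between them.
- Derives initial prompts from documentation, seeds each island with topic-specific instructions, and adapts prompts through feedback-guided selection.
- Achieves higher compiler coverage and discovers more unique failure-triggering inputs than previous LLM-driven baselines across repeated 24-hour campaigns on GCC and Clang.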
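
To make the feedback signal concrete, here is a minimal Python sketch of prioritizing candidates by incremental coverage. It is an illustration under assumptions, not FunFuzz's implementation: the function name `rank_by_incremental_coverage`, the representation of coverage as sets of integer edge IDs, and the greedy ordering are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set, Tuple


def rank_by_incremental_coverage(
    candidates: Dict[str, Set[int]], seen: Set[int]
) -> List[Tuple[str, int]]:
    """Greedily order candidates by how many not-yet-covered edges each adds.

    `candidates` maps a (hypothetical) program name to the set of compiler
    edges it exercised; `seen` holds edges already covered by the campaign.
    """
    remaining = dict(candidates)
    covered = set(seen)
    ranking: List[Tuple[str, int]] = []
    while remaining:
        # Largest gain first; iterating in sorted order makes ties deterministic.
        name = max(sorted(remaining), key=lambda n: len(remaining[n] - covered))
        gain = len(remaining[name] - covered)
        ranking.append((name, gain))
        covered |= remaining.pop(name)
    return ranking


if __name__ == "__main__":
    programs = {"a.c": {1, 2, 3}, "b.c": {3, 4}, "c.c": {1, 2}}
    for name, gain in rank_by_incremental_coverage(programs, seen={1}):
        print(name, gain)
```

A candidate that only re-exercises already covered edges ends up with a gain of zero, which is one simple way to push redundant inputs to the back of the queue.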
Why it matters
LLM-driven fuzzing is sensitive to prompt initialization and sampling variance, which can reduce exploration efficiency and produce redundant inputs. FunFuzz counters this with a multi-island evolutionary search that keeps the candidate pool diverse, and in the reported GCC and Clang campaigns it improves both coverage and failure discovery over earlier LLM-driven fuzzers. The results suggest that this kind of search structure can make LLM-powered fuzzing more robust on complex targets such as compilers.
Original Abstract
Modern fuzzers increasingly use Large Language Models (LLMs) to generate structured inputs, but LLM-driven fuzzing is sensitive to prompt initialization and sampling variance, which can reduce exploration efficiency and lead to redundant inputs. We present FunFuzz, a multi-island evolutionary fuzzing framework that runs several isolated searches in parallel and periodically migrates high-value candidates to maintain diversity. FunFuzz derives initial generation prompts from documentation and initializes islands with topic-specific instructions, then continuously adapts prompts using feedback-guided selection. During fuzzing, candidates are prioritized by incremental compiler coverage, while compiler-internal failure signals are used to identify crash-inducing inputs. We evaluate FunFuzz on compiler fuzzing, where inputs are source programs and success is measured by compiler coverage and unique compiler-internal failures. Across repeated 24-hour campaigns on GCC and Clang, FunFuzz achieves higher compiler coverage than previous LLM-driven baselines and discovers more unique failure-triggering inputs.
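
The multi-island search with periodic migration described in the abstract can be sketched as a generic island-model loop. This is a toy sketch under stated assumptions rather than the authors' algorithm: candidates are plain integers, `fitness` stands in for the coverage signal, `mutate` stands in for LLM-based generation, and the ring-shaped migration topology, `run_islands`, and `migrate_every` are hypothetical names and choices introduced for this example.

```python
import random
from typing import Callable, List


def run_islands(
    islands: List[List[int]],
    fitness: Callable[[int], float],
    mutate: Callable[[int, random.Random], int],
    generations: int,
    migrate_every: int,
    rng: random.Random,
) -> List[List[int]]:
    """Evolve isolated populations and periodically migrate the best candidates."""
    for gen in range(1, generations + 1):
        # Each island evolves on its own: mutate, then keep the fittest.
        for i, pop in enumerate(islands):
            children = [mutate(parent, rng) for parent in pop]
            merged = sorted(pop + children, key=fitness, reverse=True)
            islands[i] = merged[: len(pop)]
        if gen % migrate_every == 0:
            # Snapshot every island's best before any replacement happens.
            best = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                # Ring topology: island i receives from island i - 1.
                migrant = best[(i - 1) % len(islands)]
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = migrant
    return islands


if __name__ == "__main__":
    start = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
    final = run_islands(
        [list(pop) for pop in start],
        fitness=lambda x: x,
        mutate=lambda x, r: x + r.randint(0, 2),
        generations=6,
        migrate_every=2,
        rng=random.Random(0),
    )
    print([max(pop) for pop in final])
```

Keeping islands isolated between migrations lets each one follow its own starting point, while the occasional migrant spreads high-value candidates without collapsing all islands into a single population.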