LLM-Guided Prompt Evolution for Password Guessing

April 14, 2026 · arXiv:2604.12601

Vladimir A. Mazin, Mikhail A. Zorin, Dmitrii S. Korzh, Elvir Z. Karimov, Dmitrii A. Bolokhov + 1 more

cs.CR, cs.AI

TLDR

This paper uses LLM-driven evolutionary computation to optimize prompts for password guessing, raising cracking rates from 2.02% to 8.48%.

Key contributions

Introduces LLM-driven evolutionary computation for optimizing password guessing prompts.
Utilizes OpenEvolve, combining MAP-Elites and an island population model for prompt evolution.
Raises cracking rates from 2.02% to 8.48% on a RockYou-derived test set.
Demonstrates evolved prompts generate statistically more realistic passwords.
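
As a rough illustration of how MAP-Elites and an island population model fit together, here is a minimal toy sketch. It is not OpenEvolve's implementation: the function `map_elites_islands`, its parameters, and the ring migration scheme are all assumptions made for this example. In the paper's setting, a candidate would be a prompt and the fitness would be the cracking rate it achieves; here, integers stand in for prompts so the sketch runs on its own.

```python
import random

def map_elites_islands(fitness, descriptor, mutate, seed, n_islands=3,
                       generations=200, migrate_every=50, rng=None):
    """Toy MAP-Elites with an island model (hypothetical, not OpenEvolve's code)."""
    rng = rng or random.Random(0)
    # Each island keeps its own archive: descriptor cell -> (score, candidate).
    islands = [{descriptor(seed): (fitness(seed), seed)} for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        for archive in islands:
            # Pick a random elite, mutate it, and keep the child only if it
            # beats the current occupant of its descriptor cell.
            _, parent = rng.choice(list(archive.values()))
            child = mutate(parent, rng)
            cell, score = descriptor(child), fitness(child)
            if cell not in archive or score > archive[cell][0]:
                archive[cell] = (score, child)
        if gen % migrate_every == 0:
            # Ring migration: each island offers its best elite to its neighbour.
            bests = [max(a.values(), key=lambda e: e[0]) for a in islands]
            for i, (score, cand) in enumerate(bests):
                target = islands[(i + 1) % n_islands]
                cell = descriptor(cand)
                if cell not in target or score > target[cell][0]:
                    target[cell] = (score, cand)
    return islands

# Toy usage: integers stand in for prompts; the score peaks at 42.
islands = map_elites_islands(
    fitness=lambda x: -(x - 42) ** 2,
    descriptor=lambda x: x % 5,   # five behaviour cells
    mutate=lambda x, rng: x + rng.randint(-3, 3),
    seed=0,
)
best_score, best_candidate = max(e for a in islands for e in a.values())
```

The archive keeps one elite per descriptor cell, which preserves diverse candidates, while the separate islands explore independently and exchange their best candidates only occasionally.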

Why it matters

This research offers a novel, low-barrier method to enhance LLM-based password auditing by automating prompt optimization. It highlights the potential for evolutionary computation to strengthen security testing and model attacker behavior. This approach can lead to more robust password policies.

Original Abstract

Passwords remain a dominant authentication method, yet their security is routinely subverted by predictable user choices and large-scale credential leaks. Automated password guessing is a key tool for stress-testing password policies and modeling attacker behavior. This paper applies LLM-driven evolutionary computation to automatically optimize prompts for the LLM password guessing framework. Using OpenEvolve, an open-source system combining MAP-Elites quality-diversity search with an island population model, we evolve prompts that maximize cracking rate on a RockYou-derived test set. We evaluate three configurations: a local setup with Qwen3 8B, a single compact cloud model (Gemini-2.5 Flash), and a two-model ensemble of frontier LLMs. The approach raises the cracking rates from 2.02% to 8.48%. Character distribution analysis further confirms that evolved prompts produce statistically more realistic passwords. Automated prompt evolution is a low-barrier yet effective way to strengthen LLM-based password auditing, and it underlines how attack pipelines tend to improve through automation.
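
The cracking rate that the evolved prompts maximize can be sketched as the share of test-set passwords matched by at least one generated guess. The abstract does not spell out the exact definition, so this is an assumption: the helper `cracking_rate` is hypothetical, and whether the paper counts duplicate test passwords separately is not stated.

```python
def cracking_rate(guesses, test_passwords):
    """Fraction of test-set passwords matched by at least one guess.

    Assumed definition for illustration; each test entry is counted once
    per occurrence, so duplicates in the test set carry extra weight.
    """
    guess_set = set(guesses)          # O(1) membership checks
    targets = list(test_passwords)
    if not targets:
        return 0.0                    # avoid division by zero on an empty set
    hits = sum(1 for pw in targets if pw in guess_set)
    return hits / len(targets)

# Toy usage with made-up strings: two of the four targets are guessed.
rate = cracking_rate(
    ["123456", "password", "iloveyou"],
    ["123456", "qwerty", "letmein", "password"],
)
```

Under this definition, the reported improvement from 2.02% to 8.48% corresponds to roughly a 4.2-fold increase in matched passwords.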
