AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification
Paschal C. Amusuo, Ricardo Calvo, Dharun Anandayuvaraj, Taylor Le Lievre, Kevin Kolyakov + 3 more
TLDR
AutoSOUP automates component-level memory-safety verification using Safety-Oriented Unit Proofs and a hybrid LLM-as-function-call architecture.
Key contributions
- Introduces AutoSOUP, a system for automated component-level memory-safety verification.
- Formalizes Safety-Oriented Unit Proofs encoding verification choices like scope and loop bounds.
- Presents three techniques for automatically deriving these unit proofs.
- Uses LLM-As-Function-Call, a hybrid LLM/synthesis architecture, for proof generation.
Why it matters
Memory-safety errors are a major source of vulnerabilities, especially in embedded systems where current verification is largely manual. AutoSOUP automates this complex process, making robust memory-safety verification more accessible and practical.
Original Abstract
Memory-safety errors remain a persistent source of zero-day vulnerabilities in low-level software. The problem is especially acute in embedded systems, where hardware protections are often limited and dynamic analysis is difficult to apply effectively. Memory-safety verification can provide stronger assurance by proving the absence of such errors or exposing violations when they exist. However, current verification workflows remain largely manual and require substantial specialized expertise, limiting their adoption in practice. We present AutoSOUP, a system for automating component-level memory-safety verification through Safety-Oriented Unit Proofs. We formalize these unit proofs as artifacts that encode verification choices (scope, loop bounds, and environment models) for verifying safety properties, and introduce three techniques for deriving them automatically. To overcome the limitations of existing automation approaches, we further introduce LLM-As-Function-Call, a hybrid architecture that combines deterministic program synthesis with LLMs to automate these techniques and produce justifiable unit proofs. We evaluate AutoSOUP by assessing its ability to automate memory-safety verification and expose vulnerabilities in verified components, and we characterize the assumptions and guarantees of the resulting proofs.
📬 Weekly AI Paper Digest
Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.