Breaking the Dependency Chaos: A Constraint-Driven Python Dependency Resolution Strategy with Selective LLM Imputation
Kowshik Chowdhury, Dipayan Banik, Shazibul Islam Shamim
TLDR
SMT-LLM resolves Python dependency conflicts by combining formal constraint solving with selective LLM imputation, significantly outperforming prior LLM-only approaches.
Key contributions
- SMT-LLM combines formal constraint solving with selective LLM imputation for Python dependency resolution.
- Employs deterministic AST analysis, vermin, and a five-tier PyPI-first resolver to build a constraint graph.
- Uses the Z3 SMT solver to find consistent version assignments, with an LLM imputing dependencies only for packages whose metadata is missing.
- Achieves an 83.6% resolution rate (vs. 54.8% for PLLM), 6.3x faster resolution, and 11x fewer LLM calls.
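The deterministic import-extraction step listed above can be sketched with Python's built-in `ast` module. This is a minimal illustration under the assumption that only top-level, absolute imports feed the resolver; it is not the paper's actual implementation.

```python
import ast

def extract_imports(source: str) -> set[str]:
    """Collect top-level module names from a snippet via AST analysis.

    Minimal sketch: walks the syntax tree and records the root module of
    every `import` and absolute `from ... import` statement, skipping
    relative imports (which refer to the snippet's own package).
    """
    tree = ast.parse(source)
    modules: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
    return modules

snippet = "import numpy as np\nfrom sklearn.linear_model import LogisticRegression"
print(sorted(extract_imports(snippet)))  # ['numpy', 'sklearn']
```

In the full pipeline each extracted module name would then pass through the five-tier import-to-package resolver (e.g. mapping `sklearn` to the PyPI package `scikit-learn`) before any LLM is consulted.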
Why it matters
Python dependency resolution is a major source of build failures. By integrating formal constraint solving with selective LLM use, SMT-LLM sharply improves resolution rates and speed while cutting costly LLM calls, making builds more reliable and cheaper to automate.
Original Abstract
Dependency resolution is the task of selecting package versions that can be installed together without conflicts. It accounts for a significant share of build failures in modern software projects. In the Python ecosystem, this task is especially challenging due to Python 2/3 incompatibilities, deprecated packages, and widespread missing metadata. Recent work, such as PLLM, tackles this problem by using large language models (LLMs) to infer Python and package versions from code and iteratively repairing them based on build errors. We present SMT-LLM, a hybrid system that replaces LLM-only version guessing with formal constraint solving. SMT-LLM uses deterministic import extraction and Python version detection via abstract syntax tree (AST) analysis, the vermin tool to infer minimum Python versions, and a five-tier import-to-package resolver that queries PyPI before any LLM call. We construct a constraint graph from PyPI metadata and LLM-imputed dependencies for packages with missing metadata, then solve for consistent version assignments using a Z3 satisfiability modulo theories (SMT) solver. On the HG2.9K benchmark using Gemma2:9B (10 GB VRAM), SMT-LLM resolves 83.6% of snippets compared to PLLM's 54.8%, while reducing median resolution time from 151.5 s to 23.9 s (6.3x faster) and average LLM calls from ~24.9 to 2.26 per snippet (11x reduction).