Early-Stage Product Line Validation Using LLMs: A Study on Semi-Formal Blueprint Analysis
Viet-Man Le, Thi Ngoc Trang Tran, Sebastian Lubos, Alexander Felfernig, Damian Garber
TLDR
This paper tests whether LLMs can perform early-stage product line validation directly on semi-formal textual blueprints. The best reasoning-optimized models reach 88-89% average accuracy against a solver-based oracle.
Key contributions
- Evaluates 12 state-of-the-art LLMs on 16 standard feature model analysis operations over semi-formal textual blueprints in Software Product Lines, using the solver-based oracle FLAMA as ground truth.
- Finds that reasoning-optimized LLMs (e.g., Grok 4 Fast Reasoning, Gemini 2.5 Pro) achieve 88-89% average accuracy across all evaluated blueprints and operations, approaching solver correctness.
- Identifies systematic errors in LLM structural parsing and constraint reasoning.
- Provides insights into accuracy-cost trade-offs for selecting LLMs in validation tasks.
Why it matters
This research suggests that LLMs can serve as lightweight assistants for early variability validation in software product lines. Because the best models approach solver correctness on semi-formal blueprints, they could support validation during product line scoping, when only a concise textual description of features and constraints is available. The reported accuracy-cost trade-offs also inform practical model selection.
Original Abstract
We study whether Large Language Models (LLMs) can perform feature model analysis operations (AOs) directly on semi-formal textual blueprints, i.e., concise constrained-language descriptions of feature hierarchies and constraints, enabling early validation in Software Product Line scoping. Using 12 state-of-the-art LLMs and 16 standard AOs, we compare their outputs against the solver-based oracle FLAMA. Results show that reasoning-optimized models (e.g., Grok 4 Fast Reasoning, Gemini 2.5 Pro) achieve 88-89% average accuracy across all evaluated blueprints and operations, approaching solver correctness. We identify systematic errors in structural parsing and constraint reasoning, and highlight accuracy-cost trade-offs that inform model selection. These findings position LLMs as lightweight assistants for early variability validation.
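The evaluation idea in the abstract, comparing a model's answers to analysis operations against an oracle and reporting accuracy, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy feature model, the three operations shown, and the hard-coded "LLM" answers are hypothetical, and a brute-force enumeration stands in for the solver-based oracle (it is not FLAMA and does not use its API).

```python
from itertools import product

# Hypothetical toy blueprint (not from the paper):
#   Phone is the root, Calls is mandatory, GPS and Camera are optional,
#   and there is one cross-tree constraint: GPS requires Camera.
FEATURES = ["Phone", "Calls", "GPS", "Camera"]


def is_valid(cfg):
    """Check one configuration (dict: feature -> bool) against the toy rules."""
    if not cfg["Phone"]:                  # the root must always be selected
        return False
    if not cfg["Calls"]:                  # Calls is mandatory under Phone
        return False
    if cfg["GPS"] and not cfg["Camera"]:  # GPS requires Camera
        return False
    return True


def valid_configurations():
    """Brute-force stand-in for a solver: enumerate every valid configuration."""
    configs = []
    for values in product([False, True], repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, values))
        if is_valid(cfg):
            configs.append(cfg)
    return configs


def core_features(configs):
    """Features selected in every valid configuration."""
    return {f for f in FEATURES if all(c[f] for c in configs)}


def dead_features(configs):
    """Features selected in no valid configuration."""
    return {f for f in FEATURES if not any(c[f] for c in configs)}


def accuracy(llm_answers, oracle_answers):
    """Fraction of operations where the LLM answer equals the oracle answer."""
    matches = sum(llm_answers.get(op) == expected
                  for op, expected in oracle_answers.items())
    return matches / len(oracle_answers)


configs = valid_configurations()
oracle = {
    "configurations_number": len(configs),
    "core_features": core_features(configs),
    "dead_features": dead_features(configs),
}

# Hypothetical model output: the count ignores the requires constraint.
llm = {
    "configurations_number": 4,
    "core_features": {"Phone", "Calls"},
    "dead_features": set(),
}

score = accuracy(llm, oracle)
```

In this sketch the hypothetical model miscounts the configurations because it overlooks the cross-tree constraint, which is one plausible instance of the constraint reasoning errors the abstract mentions. Exhaustive enumeration is only practical for very small models; a real setup would rely on a solver-backed tool.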