Asking What Matters: Reward-Driven Clarification for Software Engineering Tasks
Sanidhya Vijayvargiya, Vijay Viswanathan, Graham Neubig
TLDR
CLARITI, an 8B-parameter module, uses reward-driven clarification for software engineering tasks, matching GPT-5's resolution rate while asking 41% fewer questions.
Key contributions
- Quantified how different types of missing information affect task success, and which questions simulated users can realistically answer, in real software engineering tasks.
- Identified task relevance and user answerability as key properties for effective clarification.
- Operationalized these properties as multi-stage reinforcement learning rewards for training.
- Developed CLARITI (8B-param) matching GPT-5's resolution rate while asking 41% fewer questions.
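The multi-stage reward in the third contribution can be sketched as a scalar combining the two identified properties. The paper's exact formulation is not reproduced in this digest, so the function below is a plausible shape only: the weights, score names, and per-question cost are all illustrative assumptions, not CLARITI's actual reward.

```python
def multi_stage_reward(relevance, answerability, n_questions,
                       relevance_w=1.0, answerability_w=1.0,
                       cost_per_question=0.05):
    """Hypothetical clarification reward: combine a task-relevance score
    and a user-answerability score (both assumed in [0, 1]) with a
    per-question cost that penalizes verbose clarification.
    All weights and the cost are illustrative, not from the paper."""
    return (relevance_w * relevance
            + answerability_w * answerability
            - cost_per_question * n_questions)
```

Under this shape, extra questions lower the reward unless they buy enough relevance or answerability, which is the pressure that would let a model match a stronger baseline's resolution rate while asking fewer questions.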
Why it matters
This paper improves the efficiency of AI assistants in software engineering by teaching them to ask relevant, answerable clarifying questions. It addresses the common problem of underspecified tasks, showing that empirically grounded reward design yields clarification that is both more effective and less verbose.
Original Abstract
Humans often specify tasks incompletely, so assistants must know when and how to ask clarifying questions. However, effective clarification remains challenging in software engineering tasks as not all missing information is equally valuable, and questions must target information users can realistically provide. We study clarification in real software engineering tasks by quantifying which types of information most affect task success and which questions elicit useful responses from simulated users. Using Shapley attribution and distributional comparisons, we identify two key properties of effective clarification: task relevance (which information predicts success) and user answerability (what users can realistically provide). We operationalize these properties as multi-stage reinforcement learning rewards to train CLARITI, an 8B-parameter clarification module that matches GPT-5's resolution rate on underspecified issues while generating 41% fewer questions. Our results suggest that grounding reward design in empirical analysis of information impact and user answerability improves clarification efficiency.