Asking What Matters: Reward-Driven Clarification for Software Engineering Tasks
Sanidhya Vijayvargiya, Vijay Viswanathan, Graham Neubig
TLDR
CLARITI, an 8B-parameter module, uses reward-driven clarification for software engineering tasks, matching GPT-5's resolution rate while asking 41% fewer questions.
Key contributions
- Quantified how different types of missing information affect task success, and which questions simulated users can realistically answer, in real software engineering tasks.
- Identified task relevance and user answerability as key properties for effective clarification.
- Operationalized these properties as multi-stage reinforcement learning rewards for training.
- Developed CLARITI (8B-param) matching GPT-5's resolution rate while asking 41% fewer questions.
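The multi-stage reward in the third contribution can be sketched as a scalar combining the two identified properties. The paper's exact formulation is not reproduced in this digest, so the function below is a plausible shape only: the weights, score names, and per-question cost are all illustrative assumptions, not CLARITI's actual reward.

```python
def multi_stage_reward(relevance, answerability, n_questions,
                       relevance_w=1.0, answerability_w=1.0,
                       cost_per_question=0.05):
    """Hypothetical clarification reward: combine a task-relevance score
    and a user-answerability score (both assumed in [0, 1]) with a
    per-question cost that penalizes verbose clarification.
    All weights and the cost are illustrative, not from the paper."""
    return (relevance_w * relevance
            + answerability_w * answerability
            - cost_per_question * n_questions)
```

Under this shape, extra questions lower the reward unless they buy enough relevance or answerability, which is the pressure that would let a model match a stronger baseline's resolution rate while asking fewer questions.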
Why it matters
This paper improves the efficiency of AI assistants in software engineering by teaching them to ask relevant, answerable clarifying questions. It addresses the common problem of underspecified tasks, showing that empirically grounded reward design yields clarification that is both more effective and less verbose.
Original Abstract
Humans often specify tasks incompletely, so assistants must know when and how to ask clarifying questions. However, effective clarification remains challenging in software engineering tasks as not all missing information is equally valuable, and questions must target information users can realistically provide. We study clarification in real software engineering tasks by quantifying which types of information most affect task success and which questions elicit useful responses from simulated users. Using Shapley attribution and distributional comparisons, we identify two key properties of effective clarification: task relevance (which information predicts success) and user answerability (what users can realistically provide). We operationalize these properties as multi-stage reinforcement learning rewards to train CLARITI, an 8B-parameter clarification module that matches GPT-5's resolution rate on underspecified issues while generating 41% fewer questions. Our results suggest that grounding reward design in empirical analysis of information impact and user answerability improves clarification efficiency.