Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, et al.
TLDR
Self-consistency is a new decoding strategy that improves chain-of-thought reasoning in language models by sampling diverse reasoning paths and selecting the most consistent answer.
Key contributions
- Introduces self-consistency decoding to replace greedy decoding in chain-of-thought prompting.
- Samples multiple reasoning paths and marginalizes over them to find the most consistent answer.
- Demonstrates significant performance improvements across various arithmetic and commonsense reasoning benchmarks.
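The sampling-and-marginalization procedure above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `sample_fn` is a hypothetical stand-in for one temperature-sampled chain-of-thought completion, and marginalizing out the reasoning paths reduces to a majority vote over the final answers.

```python
from collections import Counter

def self_consistency(prompt, sample_fn, n_samples=10):
    """Sample multiple reasoning paths and return the most consistent answer.

    sample_fn(prompt) is a hypothetical stand-in for one temperature-sampled
    chain-of-thought completion; it returns (reasoning_text, final_answer).
    """
    answers = []
    for _ in range(n_samples):
        _reasoning, answer = sample_fn(prompt)
        answers.append(answer)
    # Marginalizing out the sampled reasoning paths reduces to a
    # majority vote over the extracted final answers.
    best_answer, _count = Counter(answers).most_common(1)[0]
    return best_answer

# Toy stand-in sampler: three of four sampled paths agree on "18".
_fake_paths = iter([("path A", "18"), ("path B", "18"),
                    ("path C", "26"), ("path D", "18")])
print(self_consistency("Q: ...", lambda p: next(_fake_paths), n_samples=4))
# prints 18
```

In contrast, greedy decoding commits to the single highest-probability path, so one early misstep fixes the final answer; voting across diverse paths lets occasional wrong paths be outvoted.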
Why it matters
By leveraging multiple reasoning trajectories rather than relying on a single greedy path, this work addresses a key limitation of chain-of-thought prompting. The result is more robust and accurate reasoning in large language models, substantially advancing their ability to solve complex tasks that require multi-step logical inference.
Original Abstract
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. It first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out the sampled reasoning paths. Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer. Our extensive empirical evaluation shows that self-consistency boosts the performance of chain-of-thought prompting with a striking margin on a range of popular arithmetic and commonsense reasoning benchmarks, including GSM8K (+17.9%), SVAMP (+11.0%), AQuA (+12.2%), StrategyQA (+6.4%) and ARC-challenge (+3.9%).