Think Before you Write: QA-Guided Reasoning for Character Descriptions in Books
Argyrios Papoudakis, Mirella Lapata, Frank Keller
TLDR
This paper introduces a QA-guided reasoning framework that improves character description generation from long narratives by decoupling reasoning from generation.
Key contributions
- LLMs struggle with character description generation from long narratives; surprisingly, they perform better when built-in reasoning is disabled (an empty reasoning trace).
- Proposes a novel training framework that decouples reasoning (via a structured QA trace) from the final generation step.
- A reasoning model produces a structured QA trace, which then conditions a separate generation model.
- Achieves improved faithfulness, informativeness, and grounding over strong long-context baselines.
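The decoupled pipeline described above can be sketched as a two-stage function composition. The code below is purely illustrative: the model functions are hypothetical stand-ins for the paper's reasoning and generation LLMs, and all names and interfaces are assumptions, not the authors' actual implementation.

```python
from dataclasses import dataclass

@dataclass
class QAPair:
    """One entry in the structured QA reasoning trace."""
    question: str
    answer: str

def reasoning_model(narrative: str, character: str) -> list[QAPair]:
    # Hypothetical stand-in for the reasoning model: in practice an LLM
    # would answer character-focused questions grounded in the long narrative.
    questions = [
        f"What role does {character} play in the story?",
        f"How do {character}'s relationships evolve over time?",
    ]
    return [QAPair(q, "<answer extracted from narrative>") for q in questions]

def generation_model(character: str, trace: list[QAPair]) -> str:
    # Hypothetical generation model: conditions only on the structured QA
    # trace (not on free-form chain-of-thought) to produce the description.
    evidence = "; ".join(f"{p.question} -> {p.answer}" for p in trace)
    return f"{character}: description grounded in {len(trace)} QA pairs ({evidence})"

# Stage 1 (reasoning) and stage 2 (generation) are trained and run separately.
trace = reasoning_model("...full novel text...", "Elizabeth")
description = generation_model("Elizabeth", trace)
```

The key design point the sketch illustrates is that the generation model never sees the raw narrative or an unstructured reasoning chain, only the QA trace, which is what the paper credits for improved faithfulness and grounding.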
Why it matters
This paper addresses a significant challenge in narrative AI: generating accurate character descriptions from long texts. By decoupling reasoning from generation, it offers a novel way to improve LLM performance in complex tasks where direct reasoning falls short. This method has broad implications for story analysis, summarization, and character-driven simulations.
Original Abstract
Character description generation is an important capability for narrative-focused applications such as summarization, story analysis, and character-driven simulations. However, generating accurate character descriptions from long-form narratives (e.g., novels) is challenging: models must track evolving attributes (e.g., relationships and events), integrate evidence scattered across the text, and infer implicit details. Despite the success of reasoning-enabled LLMs on many benchmarks, we find that for character description generation their performance improves when built-in reasoning is disabled (i.e., an empty reasoning trace). Motivated by this, we propose a training framework that decouples reasoning from generation. Our approach, which can be applied on top of long-context LLMs or chunk-based methods, consists of a reasoning model that produces a structured QA reasoning trace and a generation model that conditions on this trace to produce the final character description. Experiments on two datasets (BookWorm and CroSS) show that QA-guided reasoning improves faithfulness, informativeness, and grounding over strong long-context baselines.