Learning Evidence Highlighting for Frozen LLMs
Shaoang Li, Yanhang Shi, Yufei Li, Mingfu Liang, Xiaohan Wei, et al.
TLDR
HiLight is a framework that trains a lightweight actor to highlight key evidence in long contexts for frozen LLMs, improving reasoning without modifying the LLM.
Key contributions
- Introduces HiLight, an Evidence Emphasis framework for frozen LLMs.
- Trains a lightweight Actor to highlight pivotal evidence spans in unaltered context.
- Optimized via RL using only task reward, requiring no evidence labels or Solver access.
- Achieves performance gains and zero-shot transferability across LLM families.
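The paper does not publish its tagging implementation, but the core emphasis step, wrapping selected spans in lightweight markers while leaving the rest of the context byte-for-byte unaltered, can be sketched as follows. The `<hl>` tag name and character-offset span format are assumptions for illustration, not the paper's actual format:

```python
def emphasize(context: str, spans: list[tuple[int, int]],
              open_tag: str = "<hl>", close_tag: str = "</hl>") -> str:
    """Insert highlight tags around (start, end) character spans.

    The surrounding text is reproduced unchanged, so no evidence is
    compressed, rewritten, or discarded -- only annotated.
    """
    out, pos = [], 0
    for start, end in sorted(spans):
        out.append(context[pos:start])            # untouched prefix
        out.append(open_tag + context[start:end] + close_tag)  # emphasized span
        pos = end
    out.append(context[pos:])                     # untouched suffix
    return "".join(out)

ctx = "Alice moved to Paris in 2019. Bob stayed in Rome."
print(emphasize(ctx, [(15, 20)]))
# → Alice moved to <hl>Paris</hl> in 2019. Bob stayed in Rome.
```

The emphasized string is then passed to the frozen Solver as an ordinary prompt; the Solver needs no modification to benefit from the markers.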
Why it matters
LLMs often miss key evidence in long, noisy contexts. HiLight addresses this by letting a frozen LLM focus on the relevant spans without costly fine-tuning or altering the underlying input. Its zero-shot transfer across Solver families suggests a robust, reusable approach to improving LLM reasoning.
Original Abstract
Large Language Models (LLMs) can reason well, yet often miss decisive evidence when it is buried in long, noisy contexts. We introduce HiLight, an Evidence Emphasis framework that decouples evidence selection from reasoning for frozen LLM solvers. HiLight avoids compressing or rewriting the input, which can discard or distort evidence, by training a lightweight Emphasis Actor to insert minimal highlight tags around pivotal spans in the unaltered context. A frozen Solver then performs downstream reasoning on the emphasized input. We cast highlighting as a weakly supervised decision-making problem and optimize the Actor with reinforcement learning using only the Solver's task reward, requiring no evidence labels and no access to or modification of the Solver. Across sequential recommendation and long-context question answering, HiLight consistently improves performance over strong prompt-based and automated prompt-optimization baselines. The learned emphasis policy transfers zero-shot to both smaller and larger unseen Solver families, including an API-based Solver, suggesting that the Actor captures genuine, reusable evidence structure rather than overfitting to a single backbone.
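The abstract's "weakly supervised" framing means the Actor is trained from the Solver's task reward alone, with no evidence labels or gradients from the Solver. A toy policy-gradient (REINFORCE) loop illustrates the idea; the bandit setup, span candidates, and reward function here are hypothetical stand-ins for the black-box Solver, not the paper's training recipe:

```python
import math
import random

random.seed(0)

# Hypothetical setup: the Actor chooses one of 4 candidate spans to highlight;
# the (black-box) Solver answers correctly only when span 2 is emphasized.
NUM_SPANS, GOOD_SPAN = 4, 2
logits = [0.0] * NUM_SPANS  # Actor's policy parameters

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

lr, baseline = 0.5, 0.0
for _ in range(300):
    probs = softmax(logits)
    a = sample(probs)
    reward = 1.0 if a == GOOD_SPAN else 0.0   # task reward from the Solver only
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    adv = reward - baseline
    # REINFORCE: d log p(a) / d logit_i = 1[i == a] - probs[i]
    for i in range(NUM_SPANS):
        grad = (1.0 if i == a else 0.0) - probs[i]
        logits[i] += lr * adv * grad

probs = softmax(logits)
best = max(range(NUM_SPANS), key=lambda i: probs[i])
print("learned span:", best)  # the policy should come to favor GOOD_SPAN
```

The key property mirrored here is that the reward signal is a scalar from the Solver's output: no Solver weights, gradients, or evidence annotations are ever touched, which is what makes the learned policy portable across Solver backbones.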