ArXiv TLDR

Neuro-RIT: Neuron-Guided Instruction Tuning for Robust Retrieval-Augmented Language Model

arXiv:2604.02194

Jaemin Kim, Jae O Lee, Sumyeong Ahn, Seo Yeon Park

cs.CL cs.AI

TLDR

Neuro-RIT improves the robustness of Retrieval-Augmented Language Models to noisy retrieved contexts by disentangling neurons tied to relevant vs. irrelevant contexts and functionally deactivating those that process noise.

Key contributions

  • Introduces Neuro-RIT, a neuron-guided instruction tuning framework for robust RALMs.
  • Disentangles neurons for relevant vs. irrelevant contexts using attribution-based mining.
  • Employs a two-stage tuning strategy for noise suppression and evidence distillation.

Why it matters

RALMs degrade when retrieval returns irrelevant or noisy contexts, limiting their real-world applicability. This paper introduces a fine-grained, neuron-level approach to robustness, moving beyond coarse-grained layer- or module-level updates. By precisely identifying and deactivating noise-processing neurons, Neuro-RIT significantly improves RALM performance on knowledge-intensive tasks.

Original Abstract

Retrieval-Augmented Language Models (RALMs) have demonstrated significant potential in knowledge-intensive tasks; however, they remain vulnerable to performance degradation when presented with irrelevant or noisy retrieved contexts. Existing approaches to enhance robustness typically operate via coarse-grained parameter updates at the layer or module level, often overlooking the inherent neuron-level sparsity of Large Language Models (LLMs). To address this limitation, we propose Neuro-RIT (Neuron-guided Robust Instruction Tuning), a novel framework that shifts the paradigm from dense adaptation to precision-driven neuron alignment. Our method explicitly disentangles neurons that are responsible for processing relevant versus irrelevant contexts using attribution-based neuron mining. Subsequently, we introduce a two-stage instruction tuning strategy that enforces a dual capability for noise robustness: achieving direct noise suppression by functionally deactivating neurons exclusive to irrelevant contexts, while simultaneously optimizing targeted layers for evidence distillation. Extensive experiments across diverse QA benchmarks demonstrate that Neuro-RIT consistently outperforms strong baselines and robustness-enhancing methods.
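To make the core idea concrete, here is a minimal toy sketch of neuron-level noise mining and deactivation. It is not the paper's implementation: attribution is approximated here by mean absolute activation magnitude per neuron, and all function names, shapes, and thresholds are hypothetical.

```python
import numpy as np

def mine_noise_neurons(act_relevant, act_irrelevant, top_k=2):
    """Toy attribution-based neuron mining: score each neuron by its
    mean absolute activation under each context type, then flag the
    neurons whose attribution is dominated by irrelevant contexts."""
    score_rel = np.abs(act_relevant).mean(axis=0)    # (num_neurons,)
    score_irr = np.abs(act_irrelevant).mean(axis=0)  # (num_neurons,)
    # Neurons "exclusive" to irrelevant contexts: high attribution under
    # noise, low attribution under relevant evidence.
    exclusivity = score_irr - score_rel
    return np.argsort(exclusivity)[-top_k:]

def deactivation_mask(num_neurons, noise_neurons):
    """Binary mask that functionally deactivates the mined neurons
    (multiply it into the hidden activations during tuning)."""
    mask = np.ones(num_neurons)
    mask[noise_neurons] = 0.0
    return mask

# Toy hidden activations: 4 examples x 6 neurons.
rng = np.random.default_rng(0)
act_rel = rng.normal(size=(4, 6))
act_irr = rng.normal(size=(4, 6))
act_irr[:, [1, 4]] += 5.0           # neurons 1 and 4 fire on noise

noise = mine_noise_neurons(act_rel, act_irr)
mask = deactivation_mask(6, noise)
print(sorted(noise.tolist()))       # → [1, 4]
```

In the actual framework, the attribution scores would come from gradient-based attribution over an LLM's neurons, and the mask would feed a two-stage tuning objective (noise suppression, then evidence distillation) rather than a one-shot multiply.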
