ReRec: Reasoning-Augmented LLM-based Recommendation Assistant via Reinforcement Fine-tuning
Jiani Huang, Shijie Wang, Liangbo Ning, Wenqi Fan, Qing Li
TLDR
ReRec is a reinforcement fine-tuning framework that enhances LLM-based recommendation assistants with improved multi-step reasoning for complex queries.
Key contributions
- Dual-Graph Enhanced Reward Shaping integrates recommendation metrics such as NDCG@K with Query Alignment and Preference Alignment Scores to provide fine-grained reward signals.
- Reasoning-aware Advantage Estimation decomposes LLM outputs into reasoning segments and penalizes incorrect steps.
- Online Curriculum Scheduler dynamically adjusts the training curriculum based on query difficulty to ensure stable learning.
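The reward-shaping idea in the first bullet can be sketched as follows. The paper does not publish its exact combination formula here, so the linear weighting of NDCG@K with the two alignment scores (and the weights `w_q`, `w_p`) is an illustrative assumption, not the authors' definition:

```python
import math

def ndcg_at_k(ranked_items, relevant_items, k):
    """Standard NDCG@K over a ranked recommendation list (binary relevance)."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(relevant_items))))
    return dcg / ideal if ideal > 0 else 0.0

def shaped_reward(ranked_items, relevant_items, query_align, pref_align,
                  k=10, w_q=0.5, w_p=0.5):
    """Combine NDCG@K with query/preference alignment scores.

    query_align and pref_align are assumed to be scores in [0, 1]
    from the dual-graph alignment modules; the additive combination
    and the weights are hypothetical placeholders.
    """
    return ndcg_at_k(ranked_items, relevant_items, k) + w_q * query_align + w_p * pref_align
```

Any monotone combination would serve the same purpose: a ranking that is both accurate (NDCG) and aligned with the query and user preferences receives a larger scalar reward for the RFT optimizer.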
Why it matters
This paper introduces ReRec, a novel framework that significantly improves LLM-based recommendation systems' ability to handle complex, reasoning-driven queries. By addressing multi-step reasoning challenges, ReRec paves the way for more intelligent and personalized recommendation assistants.
Original Abstract
With the rise of LLMs, there is an increasing need for intelligent recommendation assistants that can handle complex queries and provide personalized, reasoning-driven recommendations. LLM-based recommenders show potential but face challenges in multi-step reasoning, underscoring the need for reasoning-augmented systems. To address this gap, we propose ReRec, a novel reinforcement fine-tuning (RFT) framework designed to improve LLM reasoning in complex recommendation tasks. Our framework introduces three key components: (1) Dual-Graph Enhanced Reward Shaping, which integrates recommendation metrics like NDCG@K with Query Alignment and Preference Alignment Scores to provide fine-grained reward signals for LLM optimization; (2) Reasoning-aware Advantage Estimation, which decomposes LLM outputs into reasoning segments and penalizes incorrect steps to enhance recommendation reasoning; and (3) Online Curriculum Scheduler, which dynamically assesses query difficulty and organizes the training curriculum to ensure stable learning during RFT. Experiments demonstrate that ReRec outperforms state-of-the-art baselines and preserves core abilities like instruction-following and general knowledge. Our codes are available at https://github.com/jiani-huang/ReRec.
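Component (2) can be illustrated with a minimal sketch. Assuming a GRPO-style group-normalized baseline and a fixed per-step penalty (both are assumptions for illustration; the paper's exact estimator is not reproduced here), penalizing segments judged incorrect looks like this:

```python
from statistics import mean, pstdev

def reasoning_aware_advantages(rewards, step_correctness, penalty=0.2):
    """Group-normalized advantages with per-segment reasoning penalties.

    rewards: final scalar reward for each sampled response in a group.
    step_correctness: for each response, a list of booleans marking
    whether each decomposed reasoning segment is judged correct.
    The normalization scheme and the fixed penalty value are
    illustrative assumptions, not the paper's published estimator.
    """
    mu, sigma = mean(rewards), pstdev(rewards)
    advantages = []
    for reward, steps in zip(rewards, step_correctness):
        adv = (reward - mu) / (sigma + 1e-8)
        # Subtract a penalty for every reasoning segment marked incorrect,
        # so responses with flawed intermediate steps are discouraged even
        # when their final recommendation happens to score well.
        adv -= penalty * sum(1 for ok in steps if not ok)
        advantages.append(adv)
    return advantages
```

The design point is that the advantage signal is attributed below the level of the whole response: two responses with equal final reward receive different advantages if one contains incorrect reasoning segments.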