GraphRAG-IRL: Personalized Recommendation with Graph-Grounded Inverse Reinforcement Learning and LLM Re-ranking
Siqi Liang, Xiawei Wang, Yudi Zhang, Jiaying Zhou
TLDR
GraphRAG-IRL is a hybrid recommender using knowledge graphs, inverse reinforcement learning, and LLM re-ranking to improve personalization and accuracy.
Key contributions
- Proposes GraphRAG-IRL, a hybrid framework for robust personalized recommendation.
- Constructs a heterogeneous knowledge graph for rich feature extraction and context retrieval.
- Leverages Maximum Entropy Inverse Reinforcement Learning for calibrated pre-ranking.
- Applies persona-guided LLM re-ranking on short candidate lists to fuse semantic judgments.
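The last two steps above amount to fusing a calibrated IRL pre-ranking with an LLM's ordering of a short candidate list. A minimal sketch of such a score fusion follows; the function name, the min-max normalization, the reciprocal-rank conversion, and the `alpha` weight are all illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def fuse_rankings(irl_scores, llm_ranking, alpha=0.5):
    """Hypothetical fusion of IRL pre-ranking scores with an LLM re-ranking.

    irl_scores  : scores for a short candidate list from the IRL pre-ranker
    llm_ranking : permutation of candidate indices, best first, from the LLM
    alpha       : weight on the IRL component (assumed, not from the paper)
    """
    irl_scores = np.asarray(irl_scores, dtype=float)
    # Min-max normalize IRL scores to [0, 1] so the two signals are comparable.
    span = irl_scores.max() - irl_scores.min()
    irl_norm = (irl_scores - irl_scores.min()) / (span if span > 0 else 1.0)
    # Convert the LLM's ordering into reciprocal-rank scores (1, 1/2, 1/3, ...).
    llm_scores = np.zeros_like(irl_norm)
    for rank, idx in enumerate(llm_ranking):
        llm_scores[idx] = 1.0 / (rank + 1)
    fused = alpha * irl_norm + (1 - alpha) * llm_scores
    return np.argsort(-fused)  # candidate indices, best first
```

For example, `fuse_rankings([2.0, 0.5, 1.0], [1, 2, 0])` keeps the IRL favorite (index 0) on top even though the LLM ranked it last, because the IRL score gap dominates at `alpha=0.5`.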
Why it matters
This paper addresses the known limitations of LLMs as standalone rankers — poor calibration, order sensitivity, and popularity bias — by confining them to semantic re-ranking inside a robust hybrid system. The framework significantly improves ranking accuracy, handles sparse feedback, and offers a practical path for LLM-enhanced recommenders.

Original Abstract
Personalized recommendation requires models that capture sequential user preferences while remaining robust to sparse feedback and semantic ambiguity. Recent work has explored large language models (LLMs) as recommenders and re-rankers, but pure prompt-based ranking often suffers from poor calibration, sensitivity to candidate ordering, and popularity bias. These limitations make LLMs useful semantic reasoners, but unreliable as standalone ranking engines. We present **GraphRAG-IRL**, a hybrid recommendation framework that combines graph-grounded feature construction, inverse reinforcement learning (IRL), and persona-guided LLM re-ranking. Our method constructs a heterogeneous knowledge graph over items, categories, and concepts, retrieves both individual and community preference context, and uses these signals to train a Maximum Entropy IRL model for calibrated pre-ranking. An LLM is then applied only to a short candidate list, where persona-guided prompts provide complementary semantic judgments that are fused with IRL rankings. Experiments show that GraphRAG-IRL is a strong standalone recommender: IRL-MLP with GraphRAG improves NDCG@10 by 15.7% on MovieLens and 16.6% on KuaiRand over supervised baselines. The results also show that IRL and GraphRAG are superadditive, with the combined gain exceeding the sum of their individual improvements. Persona-guided LLM fusion further improves ranking quality, yielding up to 16.8% NDCG@10 improvement over the IRL-only baseline on MovieLens ml-1m, while score fusion on KuaiRand provides consistent gains of 4–6% across LLM providers.
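To make the Maximum Entropy IRL pre-ranking step concrete, here is a minimal sketch of one training step, assuming a linear reward over item features and a MaxEnt softmax choice model over each candidate slate. The function name, the per-slate training setup, and the synthetic data are assumptions for illustration, not the authors' code:

```python
import numpy as np

def maxent_irl_step(theta, slate_features, chosen, lr=0.1):
    """One gradient step of Maximum Entropy IRL on a single candidate slate.

    Assumes a linear reward r(i) = theta . phi(i) and the MaxEnt choice
    model P(i) proportional to exp(r(i)). The log-likelihood gradient is
    the chosen item's features minus the model's expected features.
    """
    scores = slate_features @ theta
    probs = np.exp(scores - scores.max())  # stable softmax
    probs /= probs.sum()
    grad = slate_features[chosen] - probs @ slate_features
    return theta + lr * grad

# Toy usage: a 3-item slate with 2-d features (synthetic, for illustration).
rng = np.random.default_rng(0)
phi = rng.normal(size=(3, 2))
theta = np.zeros(2)
for _ in range(200):
    theta = maxent_irl_step(theta, phi, chosen=0)
# After training, the consistently chosen item gets the highest model score,
# and the softmax over scores gives the calibrated pre-ranking probabilities.
```

The same update extends to a reward network (the paper's IRL-MLP) by backpropagating this gradient through the feature-to-score mapping instead of a linear `theta`.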