ArXiv TLDR

Bridging Textual Profiles and Latent User Embeddings for Personalization

arXiv: 2605.06981

Zhaoxuan Tan, Xiang Zhai, Yan Zhu, Meng Jiang, Mohamed Hammad

cs.IR cs.CL

TLDR

BLUE unifies interpretable textual user profiles with discriminative latent embeddings using reinforcement learning for personalized recommendations.

Key contributions

  • Introduces BLUE, an RL framework unifying textual user profiles and latent embeddings.
  • Uses embedding-model reward signals to align LLM-generated textual profiles with recommendation objectives.
  • Achieves state-of-the-art results in zero-shot sequential and cross-domain recommendation.
  • Generated profiles offer superior personalized context for question answering.

Why it matters

This paper addresses the trade-off between interpretable textual profiles and effective latent embeddings in personalization, and BLUE's unification of the two offers a powerful new approach. Its strong performance in zero-shot and cross-domain settings, along with the improved context it provides for QA, highlights its practical utility for personalization that is both transparent and effective.

Original Abstract

Personalized systems rely on user representations to connect behavioral history with downstream recommendation applications. Existing methods typically employ either supervised latent user embeddings, which are effective for retrieval but difficult to interpret, or textual user profiles, which are interpretable but challenging to optimize for downstream utility due to lack of direct supervision. To bridge this gap, we present BLUE, a reinforcement learning framework that unifies these two forms of user representation by aligning language-based user profiles with embedding-based recommendation objectives. Given a user interaction history, BLUE leverages a profiler Large Language Model (LLM) to generate textual profiles, while an embedding model provides reward signals. This encourages the resulting textual representations to move closer to positive items and farther from negative ones in the embedding space. We further introduce a text-space supervision signal based on next-item prediction, ensuring the learned profiles remain both semantically meaningful and highly effective for downstream retrieval. Experiments on Amazon Reviews 2023 and Google Local Reviews in zero-shot sequential recommendation settings demonstrate that BLUE consistently outperforms strong baselines under both frozen and trainable embedding conditions. Notably, BLUE achieves clear gains in cross-domain transfer, highlighting the strong generalization ability of the learned user profiles. Furthermore, these generated profiles provide superior personalized context for question answering compared to raw user histories or alternative profile optimization methods. Overall, these results show that BLUE provides an effective way to unify interpretable textual profiling with discriminative latent embeddings for personalization.
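The abstract describes the core reward signal: an embedding model scores each generated textual profile by how close it lands to positive items and how far from negative items in embedding space. A minimal sketch of such a contrastive reward is below; the function name, the cosine-similarity choice, and the mean-over-items aggregation are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def embedding_reward(profile_emb: np.ndarray,
                     pos_item_embs: np.ndarray,
                     neg_item_embs: np.ndarray) -> float:
    """Contrastive reward for a generated profile: higher when the
    profile embedding is similar to positive items and dissimilar
    from negative items.

    profile_emb:    (d,)   embedding of the textual user profile
    pos_item_embs:  (P, d) embeddings of positively interacted items
    neg_item_embs:  (N, d) embeddings of sampled negative items
    """
    def cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        # Row-wise cosine similarity between a (1, d) and b (k, d).
        a = a / np.linalg.norm(a, axis=-1, keepdims=True)
        b = b / np.linalg.norm(b, axis=-1, keepdims=True)
        return a @ b.T

    pos_sim = cosine(profile_emb[None, :], pos_item_embs).mean()
    neg_sim = cosine(profile_emb[None, :], neg_item_embs).mean()
    # Reward is positive when the profile sits closer to positives
    # than to negatives in the embedding space.
    return float(pos_sim - neg_sim)
```

In an RL setup along the lines the abstract sketches, this scalar would serve as (part of) the reward for a policy-gradient update of the profiler LLM, alongside the text-space next-item supervision the paper also introduces.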
