RecGPT-Mobile: On-Device Large Language Models for User Intent Understanding in Taobao Feed Recommendation
Bin Zhang, Weipeng Huang, Dimin Wang, Jialin Zhu, Yuning Jiang + 7 more
TLDR
RecGPT-Mobile deploys lightweight LLMs directly on mobile devices to understand user intent in real-time, improving e-commerce recommendations.
Key contributions
- Deploys lightweight LLMs directly on mobile devices for real-time user intent understanding.
- Addresses high inference costs and latency of cloud-based LLMs in mobile e-commerce.
- Captures evolving user interests faster, enabling real-time recommendation adjustments.
- Demonstrates significant gains in recommendation accuracy via extensive offline analyses and online experiments.
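The paper itself does not publish code, but the pipeline the contributions describe — serialize recent behaviors into a prompt, run a lightweight on-device LLM to predict the next query, then adjust recommendations — can be sketched as below. All names (`Interaction`, `build_intent_prompt`, `predict_next_query`, `rerank`) and the token-overlap scoring are illustrative assumptions, not the authors' implementation; the LLM is a caller-supplied callable standing in for a quantized on-device model.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """One recent user behavior, e.g. a click or a search."""
    action: str
    item: str

def build_intent_prompt(history):
    # Serialize recent behaviors into a prompt for the intent-understanding LLM.
    lines = [f"- {ev.action}: {ev.item}" for ev in history]
    return ("Recent user behaviors:\n" + "\n".join(lines)
            + "\nPredict the user's next search query:")

def predict_next_query(history, llm):
    # `llm` is any callable prompt -> text. On device this would be a
    # quantized lightweight model; here it is a caller-supplied stub.
    return llm(build_intent_prompt(history)).strip()

def rerank(candidates, predicted_query):
    # Toy relevance signal (an assumption, not the paper's method):
    # prefer candidate items sharing tokens with the predicted query.
    tokens = set(predicted_query.lower().split())
    return sorted(candidates,
                  key=lambda item: len(tokens & set(item.lower().split())),
                  reverse=True)
```

For example, with a stub `llm = lambda p: "trail running shoes"`, a click/search history about shoes would rerank shoe items above unrelated candidates — the real system would replace the stub with on-device inference and feed the prediction into the recommendation pipeline.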
Why it matters
By moving intent understanding from the cloud onto the device, the approach cuts inference cost and latency while improving recommendation accuracy and user experience, offering a practical, scalable path for integrating LLMs into production mobile recommendation systems.
Original Abstract
Predicting a user's next search query from recent interaction behaviors is a critical problem in modern e-commerce systems, particularly in scenarios where user intent evolves rapidly. Large Language Models (LLMs) offer strong semantic reasoning capabilities and have recently been adopted to enhance training data construction for next-query prediction. However, due to resource constraints on mobile devices, existing applications are deployed on cloud servers, resulting in high inference costs. In this paper, we propose RecGPT-Mobile, a framework that designs a lightweight LLM-based intent understanding agent to improve recommendation quality in mobile e-commerce scenarios. By deploying LLMs directly on mobile devices, our approach can capture evolving interests of users more quickly and adjust the recommendation results in real time. Extensive offline analyses and online experiments demonstrate that our method significantly improves the accuracy of recommendation results, laying a practical path for LLM deployment in production-scale recommendation systems on mobile devices, as well as a scalable solution for integrating LLMs into real-world next-query prediction systems.