Unified Value Alignment for Generative Recommendation in Industrial Advertising
Xinxun Zhang, Yuling Xiong, Jiale Zhou, Zhengkai Guo, Zhennan Pang + 11 more
TLDR
UniVA enhances generative recommendation for advertising by aligning commercial value signals across tokenization, decoding, and online serving.
Key contributions
- Introduces a Commercial SID tokenizer to embed value-related attributes into item representations.
- Develops a Generation-as-Ranking SID Decoder, optimized with eCPM-aware RL, to fuse value into item generation.
- Designs a value-guided personalized beam search for online serving, reusing generation-as-ranking logits as value guidance and a personalized trie to restrict decoding to request-valid SID paths.
Why it matters
Existing generative recommendation (GR) pipelines are largely semantics-centric, while an advertising system must optimize commercial value as well as user interest. UniVA offers a practical framework that aligns value signals across tokenization, decoding, and online serving. On the Tencent WeChat Channels advertising platform, the authors report a 37.04% improvement in offline Hit Rate@100 over the baseline and a 1.5% GMV lift in online A/B tests.
Original Abstract
Generative Recommendation (GR) reformulates recommendation as a next-token generation problem and has shown promise in industrial applications. However, extending GR to industrial advertising is non-trivial because the system must optimize not only user interest but also commercial value. Existing GR pipelines remain largely semantics-centric, making it difficult to align value signals across tokenization, decoding, and online serving. To address this issue, we propose UniVA, a Unified Value Alignment framework for advertising recommendation. We first introduce a Commercial SID tokenizer that injects value-related attributes into SID construction, yielding value-discriminative item representations. We then develop a Generation-as-Ranking SID Decoder jointly optimized by supervised learning and eCPM-aware reinforcement learning, which fuses value scores into next-item SID generation to perform generation and ranking in one decoding process. Finally, we design a value-guided personalized beam search that reuses generation-as-ranking logits as online value guidance and applies a personalized trie tree to constrain decoding to request-valid SID paths. Experiments on the Tencent WeChat Channels advertising platform show that UniVA achieves a 37.04% improvement in offline Hit Rate@100 over the baseline and a 1.5% GMV lift in online A/B tests.
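Illustrative sketch
The trie-constrained, value-guided beam search described in the abstract can be pictured with a small Python sketch. This is not the authors' implementation: the function names, the additive scoring rule, and the weight `alpha` are all assumptions made for illustration, and the abstract does not say how UniVA actually combines generation logits with value guidance. The sketch only shows the general idea of expanding beams along SID paths that exist in a per-request trie while adding a value term to each step's score.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Prefix = Tuple[int, ...]
# Hypothetical scorer type: (prefix so far, candidate next token) -> score.
StepScorer = Callable[[Prefix, int], float]


def build_trie(valid_paths: Sequence[Prefix]) -> Dict:
    """Build a nested-dict prefix trie over the SID paths valid for one request."""
    root: Dict = {}
    for path in valid_paths:
        node = root
        for tok in path:
            node = node.setdefault(tok, {})
    return root


def value_guided_beam_search(
    trie: Dict,
    gen_logprob: StepScorer,   # assumed: log-probability from the generator
    value_score: StepScorer,   # assumed: per-step value guidance term
    depth: int,                # number of SID tokens per item
    beam_width: int,
    alpha: float = 1.0,        # assumed weight on the value term
) -> List[Tuple[Prefix, float]]:
    """Return up to beam_width complete SID paths with their combined scores."""
    beams = [((), 0.0, trie)]  # (prefix, cumulative score, current trie node)
    for _ in range(depth):
        candidates = []
        for prefix, score, node in beams:
            # Only children present in the trie are expanded, so every
            # surviving beam is a prefix of some request-valid SID path.
            for tok, child in node.items():
                step = gen_logprob(prefix, tok) + alpha * value_score(prefix, tok)
                candidates.append((prefix + (tok,), score + step, child))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return [(prefix, score) for prefix, score, _ in beams]


# Toy example with made-up scores for three valid two-token SID paths.
valid_paths = [(1, 2), (1, 3), (4, 5)]
gen_table = {((), 1): -0.1, ((), 4): -2.0,
             ((1,), 2): -0.2, ((1,), 3): -1.5, ((4,), 5): -0.1}
value_table = {((1,), 3): 2.0}  # only path (1, 3) carries extra value

top = value_guided_beam_search(
    build_trie(valid_paths),
    gen_logprob=lambda p, t: gen_table[(p, t)],
    value_score=lambda p, t: value_table.get((p, t), 0.0),
    depth=2,
    beam_width=2,
)
print(top)
```

In this toy setup the value term lifts path `(1, 3)` above `(1, 2)`, which would rank first on generation log-probability alone (`alpha=0.0`). The sketch assumes every valid path has exactly `depth` tokens; shorter paths are dropped when their trie node has no children.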