TokenFormer: Unify the Multi-Field and Sequential Recommendation Worlds
Yifeng Zhou, Yuehong Hu, Zhixiang Feng, Junwei Pan, Kaihui Wu + 7 more
TLDR
TokenFormer unifies multi-field and sequential recommendation, overcoming Sequential Collapse Propagation with novel attention and interaction representations.
Key contributions
- Unifies multi-field and sequential recommendation, addressing Sequential Collapse Propagation (SCP).
- Introduces TokenFormer, a novel architecture for robust unified recommendation modeling.
- Proposes Bottom-Full-Top-Sliding (BFTS) attention for efficient and effective sequence processing.
- Utilizes Non-Linear Interaction Representation (NLIR) for enhanced feature discriminability.
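The BFTS idea — full self-attention in the bottom layers, then a sliding window that shrinks with depth — can be sketched as an attention-mask schedule. This is a minimal illustration only: the layer split (`full_layers`), the halving shrink rule, and the base window size are assumptions, not the paper's reported configuration.

```python
import numpy as np

def bfts_mask(seq_len, layer, full_layers=2, base_window=8):
    """Return a boolean [seq_len, seq_len] mask; True = token i may attend to j.

    Assumed schedule: full attention below `full_layers`, then a sliding
    window that halves with each deeper layer (floored at 1).
    """
    if layer < full_layers:
        # Bottom layers: full self-attention over the whole sequence.
        return np.ones((seq_len, seq_len), dtype=bool)
    # Top layers: shrinking-window sliding attention.
    shrink = layer - full_layers
    window = max(base_window >> shrink, 1)
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window
```

Under these assumptions, layer 0 attends everywhere, while deeper layers see progressively narrower local neighborhoods.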
Why it matters
This paper tackles Sequential Collapse Propagation, a key challenge in unifying multi-field and sequential recommendation. TokenFormer's novel attention and representation schemes achieve state-of-the-art performance, improving dimensional robustness. This is crucial for building more effective and versatile recommender systems.
Original Abstract
Recommender systems have historically developed along two largely independent paradigms: feature interaction models for modeling correlations among multi-field categorical features, and sequential models for capturing user behavior dynamics from historical interaction sequences. Although recent trends attempt to bridge these paradigms within shared backbones, we empirically reveal that naively unifying the two branches may lead to a failure mode of Sequential Collapse Propagation (SCP): interaction with dimensionally ill non-sequence fields leads to dimensional collapse of the sequence features. To overcome this challenge, we propose TokenFormer, a unified recommendation architecture with the following innovations. First, we introduce a Bottom-Full-Top-Sliding (BFTS) attention scheme, which applies full self-attention in the lower layers and shrinking-window sliding attention in the upper layers. Second, we introduce a Non-Linear Interaction Representation (NLIR) that applies one-sided non-linear multiplicative transformations to the hidden states. Extensive experiments on public benchmarks and Tencent's advertising platform demonstrate state-of-the-art performance, while detailed analyses confirm that TokenFormer significantly improves dimensional robustness and representation discriminability under unified modeling.
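A "one-sided non-linear multiplicative transformation" can be read as gating the hidden states with a non-linear projection of themselves, multiplying on one side only. The sketch below uses a sigmoid gate and a single learned projection; both are illustrative assumptions, and the paper's exact NLIR form may differ.

```python
import numpy as np

def nlir(h, W):
    """One-sided non-linear multiplicative transform of hidden states h.

    h: [tokens, dim] hidden states; W: [dim, dim] projection (assumed learned).
    Only one operand passes through the non-linearity (here, a sigmoid).
    """
    gate = 1.0 / (1.0 + np.exp(-(h @ W)))  # non-linear projection of h
    return h * gate                        # multiplicative interaction, one-sided

rng = np.random.default_rng(0)
h = rng.standard_normal((4, 16))           # 4 tokens, 16-dim hidden states
W = rng.standard_normal((16, 16)) / 4.0
out = nlir(h, W)                           # same shape as h
```

The multiplicative form lets each hidden dimension be rescaled by a data-dependent factor, which is one plausible route to the improved feature discriminability the digest highlights.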