RAG-Enhanced Large Language Models for Dynamic Content Expiration Prediction in Web Search
Tingyu Chen, Wenkai Zhang, Li Gao, Lixin Su, Ge Chen, et al.
TLDR
This paper introduces an LLM-based framework for dynamic content expiration prediction in web search, improving freshness and user experience.
Key contributions
- Introduces a novel LLM-based framework for query-aware dynamic content expiration prediction in web search.
- Extracts fine-grained temporal contexts and deduces query-specific "validity horizons" for content obsolescence.
- Deployed in Baidu search, integrating robust hallucination mitigation strategies for reliability.
- Demonstrated significant improvements in search freshness and user experience via A/B testing on live traffic.
Why it matters
This paper tackles the critical problem of content expiration in web search, moving beyond static filtering to dynamic, query-aware timeliness. By leveraging LLMs at an industrial scale, it significantly enhances search freshness and user experience, demonstrating a practical application of advanced AI.
Original Abstract
In commercial web search, aligning content freshness with user intent remains challenging due to the highly varied lifespans of information. Traditional industrial approaches rely on static time-window filtering, resulting in "one-size-fits-all" rankings where content may be chronologically recent but semantically expired. To address this limitation, we present a novel Large Language Model (LLM)-based Query-Aware Dynamic Content Expiration Prediction Framework deployed in Baidu search, reformulating timeliness as a dynamic validity inference task. Our framework extracts fine-grained temporal contexts from documents and leverages LLMs to deduce a query-specific "validity horizon": a semantic boundary defining when information becomes obsolete based on user intent. Integrated with robust hallucination mitigation strategies to ensure reliability, our approach has been evaluated through offline and online A/B testing on live production traffic. Results demonstrate significant improvements in search freshness and user experience metrics, validating the effectiveness of LLM-driven reasoning for solving semantic expiration at an industrial scale.
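The core idea of a query-specific "validity horizon" can be sketched as an intent-dependent expiration check. The sketch below is purely illustrative: the intent categories, horizon values, and function names are assumptions for exposition, not the paper's actual implementation, which derives the horizon via LLM reasoning rather than a static lookup table.

```python
from datetime import date, timedelta

# Hypothetical intent -> validity-horizon table. In the paper's framework,
# an LLM infers this horizon per query; here we hard-code examples to show
# why the same document can be fresh for one intent and expired for another.
HORIZONS = {
    "breaking_news": timedelta(days=1),     # e.g., "earthquake today"
    "event_schedule": timedelta(days=30),   # e.g., "conference dates"
    "evergreen": timedelta(days=3650),      # e.g., "how photosynthesis works"
}

def is_expired(publish_date: date, query_intent: str, today: date) -> bool:
    """Return True if the document has passed its validity horizon
    for the given query intent."""
    horizon = HORIZONS.get(query_intent, HORIZONS["evergreen"])
    return today - publish_date > horizon
```

A four-day-old article would be expired for a breaking-news query but still valid for an evergreen one, which is exactly the "one-size-fits-all" failure mode that a single static time window cannot capture.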