ArXiv TLDR

FAVOR: Efficient Filter-Agnostic Vector ANNS Based on Selectivity-Aware Exclusion Distances

arXiv:2605.07770

Junjie Song, Yu Liu, Guoyu Hu, Zhongle Xie, Ming Yang + 2 more

cs.IR

TLDR

FAVOR is a new ANNS method that efficiently integrates complex attribute filtering, achieving stable high throughput across varying selectivity levels.

Key contributions

  • Integrated architecture unifies selectivity estimation and filtered ANNS execution for hybrid queries.
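  • HNSW-based inline-filtering algorithm uses an exclusion distance mechanism to reshape the vector distance distribution, pushing non-target vectors away from the query and promoting valid candidates toward it, without compromising generality or graph connectivity.
  • Selectivity-driven search selector estimates query selectivity and routes each query to a pre-filtering brute-force algorithm for low-selectivity cases or to the optimized HNSW-based search otherwise.
  • Achieves 1.3-5x higher QPS at Recall@10 = 95% than state-of-the-art methods for arbitrary filtering conditions, and stays competitive with tailored solutions under some filtering conditions.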
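
The routing idea in the third contribution can be illustrated with a minimal Python sketch. It is not FAVOR's implementation: the sampling-based estimator, the 0.01 threshold, and the names estimate_selectivity, prefilter_brute_force, route_query, and graph_search are all assumptions made for illustration, and graph_search is a placeholder for whatever filtered graph search a real system would provide.

```python
import math
import random
from typing import Any, Callable, List, Sequence, Tuple

Vector = Sequence[float]
Predicate = Callable[[Any], bool]


def estimate_selectivity(attrs: Sequence[Any], predicate: Predicate,
                         sample_size: int = 256, seed: int = 0) -> float:
    """Estimate the fraction of items passing the filter (hypothetical estimator).

    Scans everything when the collection is small, otherwise a random sample.
    """
    n = len(attrs)
    if n == 0:
        return 0.0
    if n <= sample_size:
        indices = range(n)
    else:
        indices = random.Random(seed).sample(range(n), sample_size)
    hits = sum(1 for i in indices if predicate(attrs[i]))
    return hits / len(indices)


def prefilter_brute_force(vectors: Sequence[Vector], attrs: Sequence[Any],
                          predicate: Predicate, query: Vector, k: int) -> List[int]:
    """Apply the filter first, then rank the few survivors by exact distance."""
    candidates = [i for i, a in enumerate(attrs) if predicate(a)]
    candidates.sort(key=lambda i: math.dist(vectors[i], query))
    return candidates[:k]


def route_query(vectors: Sequence[Vector], attrs: Sequence[Any],
                predicate: Predicate, query: Vector, k: int,
                graph_search: Callable[[Vector, Predicate, int], List[int]],
                threshold: float = 0.01) -> Tuple[str, List[int]]:
    """Pick a search path from the estimated selectivity (threshold is an assumption)."""
    selectivity = estimate_selectivity(attrs, predicate)
    if selectivity < threshold:
        return "brute_force", prefilter_brute_force(vectors, attrs, predicate, query, k)
    return "graph", graph_search(query, predicate, k)


# Toy data: 200 points on a line, with the index doubling as the attribute.
vectors = [(float(i), 0.0) for i in range(200)]
attrs = list(range(200))
query = (3.2, 0.0)


def graph_search_stub(q: Vector, predicate: Predicate, k: int) -> List[int]:
    """Stand-in for a filtered graph search; here it is just an exact filtered scan."""
    return prefilter_brute_force(vectors, attrs, predicate, q, k)


rare_route, rare_ids = route_query(vectors, attrs, lambda a: a < 1, query, 3,
                                   graph_search_stub)
common_route, common_ids = route_query(vectors, attrs, lambda a: a % 2 == 0, query, 3,
                                       graph_search_stub)
print(rare_route, rare_ids)
print(common_route, common_ids)
```

The design point is that when very few items pass the filter, ranking those few survivors exactly is cheap, while a graph traversal would mostly visit non-matching nodes. Where the crossover lies depends on the data and the index, so the threshold above should be read as a tunable placeholder.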

Why it matters

Modern retrieval systems need efficient hybrid queries that combine vector search with attribute filtering. FAVOR addresses this with a filter-agnostic method that maintains high throughput and stable performance across selectivity levels, including the low-selectivity cases where existing HNSW-based inline-filtering approaches struggle. This makes filtered ANNS more practical for applications such as RAG and recommendation systems.

Original Abstract

Modern retrieval systems increasingly require integrating approximate nearest neighbor search (ANNS) with complex attribute filtering to handle hybrid queries in applications such as recommendation systems and retrieval-augmented generation (RAG). While HNSW-based inline-filtering methods show promise, existing approaches struggle to deliver high throughput under low-selectivity scenarios while balancing search efficiency, filtering generality, and index connectivity. To address these challenges, we propose FAVOR, an efficient filter-agnostic vector ANNS that supports arbitrary filtering conditions while maintaining stable performance across varying selectivity levels. FAVOR introduces three novel features: (1) an integrated architecture that unifies selectivity estimation and filtered ANNS execution, providing a cohesive solution for hybrid vector-attribute queries; (2) a HNSW-based inline-filtering algorithm that introduces an exclusion distance mechanism to dynamically reshape the vector distance distribution, pushing non-target vectors away from the query while promoting valid candidates toward the query, thus improving search efficiency without compromising generality or graph connectivity; and (3) a selectivity-driven search selector that estimates query selectivity and dynamically routes queries between a pre-filtering brute-force algorithm for low-selectivity cases and an optimized HNSW-based search algorithm for other scenarios, ensuring consistent performance. Extensive experiments on real-world datasets demonstrate that FAVOR achieves a 1.3-5$\times$ higher QPS at $Recall@10 = 95\%$ compared to state-of-the-art methods for arbitrary filtering conditions, while maintaining competitive performance even against tailored solutions in some filtering conditions.
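
The abstract does not give the formula for the exclusion distance, so the following Python sketch only illustrates the general idea of inline filtering with a distance penalty. The additive constant exclusion, the expansion budget ef, and the function name filtered_graph_search are assumptions, and the traversal is a simplified single-layer best-first search rather than HNSW itself.

```python
import heapq
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def filtered_graph_search(graph: Dict[int, List[int]], vectors: Sequence[Vector],
                          passes: Callable[[int], bool], query: Vector, k: int,
                          entry: int, ef: int = 8, exclusion: float = 1.0) -> List[int]:
    """Best-first search in which non-matching nodes are penalized, not pruned.

    Non-matching nodes stay traversable, so the graph remains connected, but
    the added `exclusion` term (an assumed additive penalty) ranks them lower
    in the frontier. Only matching nodes are ever returned.
    """
    def adjusted(i: int) -> float:
        d = math.dist(vectors[i], query)
        return d if passes(i) else d + exclusion

    visited = {entry}
    frontier: List[Tuple[float, int]] = [(adjusted(entry), entry)]  # min-heap
    results: List[Tuple[float, int]] = []  # max-heap on true distance, via negation
    expansions = 0
    while frontier and expansions < ef:
        _, node = heapq.heappop(frontier)
        expansions += 1
        if passes(node):
            heapq.heappush(results, (-math.dist(vectors[node], query), node))
            if len(results) > k:
                heapq.heappop(results)  # drop the farthest matching node
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                heapq.heappush(frontier, (adjusted(neighbor), neighbor))
    # Return matching nodes ordered from nearest to farthest.
    return [i for _, i in sorted((-neg_d, i) for neg_d, i in results)]


# Toy chain graph 0-1-2-3-4-5 on a line; only even-numbered nodes pass the filter.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
vectors = [(float(i),) for i in range(6)]
result = filtered_graph_search(graph, vectors, lambda i: i % 2 == 0,
                               query=(2.2,), k=2, entry=0, ef=6)
print(result)
```

In this toy chain every path between matching nodes runs through a non-matching one, which is why pruning non-matching nodes outright would cut the search off at the entry point. Penalizing them instead keeps those paths available while still favoring valid candidates.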
