Efficient Logic Gate Networks for Video Copy Detection

April 23, 2026
arXiv:2604.21694

cs.CV, cs.AI, cs.IR

TLDR

This paper introduces a video copy detection framework built on differentiable Logic Gate Networks (LGNs). It reports accuracy that matches or exceeds prior models, with far smaller descriptors and much faster inference.

Key contributions

Introduces a video copy detection framework based on differentiable Logic Gate Networks (LGNs).
Combines frame miniaturization, binary preprocessing, and a trainable LGN embedding model.
LGNs discretize into Boolean circuits for extremely fast and memory-efficient inference.
Achieves competitive or superior accuracy with descriptors several orders of magnitude smaller and inference speeds exceeding 11k samples per second.
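
The preprocessing steps listed above can be illustrated with a minimal sketch. It is not the paper's implementation: the 8x8 target size, the mean-threshold binarization, the Hamming-based similarity, and the function names `miniaturize`, `binarize`, and `hamming_similarity` are all assumptions chosen for illustration, and the trainable LGN embedding stage is left out entirely.

```python
import numpy as np

def miniaturize(frame: np.ndarray, size: int = 8) -> np.ndarray:
    """Shrink a grayscale frame to size x size by block averaging (assumed scheme)."""
    h, w = frame.shape
    bh, bw = h // size, w // size
    cropped = frame[: bh * size, : bw * size]  # drop rows/cols that do not fill a block
    return cropped.reshape(size, bh, size, bw).mean(axis=(1, 3))

def binarize(small: np.ndarray) -> np.ndarray:
    """Threshold at the frame's own mean; one of many possible binarization schemes."""
    return (small > small.mean()).astype(np.uint8).ravel()

def hamming_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Fraction of matching bits between two equal-length bit vectors."""
    return float(np.mean(a == b))

# A synthetic 64x64 frame and a brightened, contrast-stretched copy of it.
frame = np.arange(64 * 64, dtype=float).reshape(64, 64)
copy = frame * 1.2 + 10.0

bits_original = binarize(miniaturize(frame))
bits_copy = binarize(miniaturize(copy))
similarity = hamming_similarity(bits_original, bits_copy)
```

Because the threshold is the frame's own mean, a uniform brightness or contrast change leaves the bits unchanged in this toy setup, which is the kind of robustness a copy detector needs. Real distortions such as cropping or overlays are harder, and handling them is presumably where the learned embedding comes in.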

Why it matters

Deep neural networks for video copy detection are computationally expensive. This paper offers a scalable, resource-efficient alternative using logic-based models. It significantly reduces descriptor size and boosts inference speed, making large-scale deployment feasible.

Original Abstract

Video copy detection requires robust similarity estimation under diverse visual distortions while operating at very large scale. Although deep neural networks achieve strong performance, their computational cost and descriptor size limit practical deployment in high-throughput systems. In this work, we propose a video copy detection framework based on differentiable Logic Gate Networks (LGNs), which replace conventional floating-point feature extractors with compact, logic-based representations. Our approach combines aggressive frame miniaturization, binary preprocessing, and a trainable LGN embedding model that learns both logical operations and interconnections. After training, the model can be discretized into a purely Boolean circuit, enabling extremely fast and memory-efficient inference. We systematically evaluate different similarity strategies, binarization schemes, and LGN architectures across multiple dataset folds and difficulty levels. Experimental results demonstrate that LGN-based models achieve competitive or superior accuracy and ranking performance compared to prior models, while producing descriptors several orders of magnitude smaller and delivering inference speeds exceeding 11k samples per second. These findings indicate that logic-based models offer a promising alternative for scalable and resource-efficient video copy detection.
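
The step from a trainable model to a purely Boolean circuit can be sketched for a single gate. This follows the common formulation of differentiable logic gate networks: a softmax mixture over real-valued relaxations of the 16 two-input Boolean functions, collapsed to the most likely function after training. The abstract does not spell out the paper's parameterization or how interconnections are learned, so the details below, including the names `GATES`, `soft_gate`, and `hard_gate`, are assumptions.

```python
import numpy as np

# Real-valued relaxations of the 16 two-input Boolean functions.
# For a, b in {0, 1} each one reproduces the corresponding truth table.
GATES = [
    lambda a, b: np.zeros_like(a),          # FALSE
    lambda a, b: a * b,                     # a AND b
    lambda a, b: a - a * b,                 # a AND NOT b
    lambda a, b: a,                         # a
    lambda a, b: b - a * b,                 # NOT a AND b
    lambda a, b: b,                         # b
    lambda a, b: a + b - 2 * a * b,         # a XOR b
    lambda a, b: a + b - a * b,             # a OR b
    lambda a, b: 1 - (a + b - a * b),       # a NOR b
    lambda a, b: 1 - (a + b - 2 * a * b),   # a XNOR b
    lambda a, b: 1 - b,                     # NOT b
    lambda a, b: 1 - b + a * b,             # a OR NOT b
    lambda a, b: 1 - a,                     # NOT a
    lambda a, b: 1 - a + a * b,             # NOT a OR b
    lambda a, b: 1 - a * b,                 # a NAND b
    lambda a, b: np.ones_like(a),           # TRUE
]

def soft_gate(a: np.ndarray, b: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Training-time gate: softmax-weighted mixture of all 16 relaxed functions."""
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return sum(w * gate(a, b) for w, gate in zip(weights, GATES))

def hard_gate(a: np.ndarray, b: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Inference-time gate: keep only the function with the largest logit."""
    return GATES[int(np.argmax(logits))](a, b)

# All four input combinations, and logits that strongly prefer XOR (index 6).
a = np.array([0.0, 0.0, 1.0, 1.0])
b = np.array([0.0, 1.0, 0.0, 1.0])
logits = np.zeros(16)
logits[6] = 50.0

soft_out = soft_gate(a, b, logits)
hard_out = hard_gate(a, b, logits)
```

With untrained, uniform logits the soft gate returns 0.5 for every Boolean input pair, since exactly half of the 16 functions are true for any input. Once the logits have sharpened, the hard gate is a single Boolean operation, which is what lets a discretized network run without floating-point arithmetic.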


