ArXiv TLDR

Efficient Logic Gate Networks for Video Copy Detection

2604.21694

Katarzyna Fojcik

cs.CV, cs.AI, cs.IR

TLDR

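This paper introduces a video copy detection framework built on differentiable Logic Gate Networks (LGNs) that matches or exceeds the accuracy of prior models while using far smaller descriptors and running at over 11k samples per second.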

Key contributions

  • Introduces a video copy detection framework based on differentiable Logic Gate Networks (LGNs).
  • Combines frame miniaturization, binary preprocessing, and a trainable LGN embedding model.
  • LGNs discretize into Boolean circuits for extremely fast and memory-efficient inference.
  • Achieves competitive accuracy with orders of magnitude smaller descriptors and 11k+ samples/sec inference.
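
To make the "differentiable, then discretized" idea concrete, here is a minimal sketch of a single logic-gate neuron. It is not the paper's implementation. It assumes the common formulation in which each neuron holds a softmax distribution over real-valued relaxations of two-input Boolean gates during training and keeps only the most probable gate afterwards. The gate subset, the logits, and all function names are hypothetical and chosen for illustration only.

```python
import math

# Real-valued relaxations of a few two-input Boolean gates. Inputs are
# treated as independent probabilities in [0, 1]. Assumption: only four
# gates are shown here; the choice of subset is illustrative.
SOFT_GATES = {
    "AND":  lambda a, b: a * b,
    "OR":   lambda a, b: a + b - a * b,
    "XOR":  lambda a, b: a + b - 2 * a * b,
    "NAND": lambda a, b: 1 - a * b,
}
# Exact Boolean counterparts used after discretization (inputs are 0 or 1).
HARD_GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}
GATE_NAMES = list(SOFT_GATES)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_neuron(a, b, logits):
    """Training-time output: softmax-weighted mixture of relaxed gates."""
    weights = softmax(logits)
    return sum(w * SOFT_GATES[name](a, b)
               for w, name in zip(weights, GATE_NAMES))


def discretize(logits):
    """After training, keep only the gate with the largest logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return GATE_NAMES[best]


def hard_neuron(a, b, gate_name):
    """Inference-time output: a single Boolean operation on two bits."""
    return HARD_GATES[gate_name](a, b)


# Hypothetical learned logits that favour XOR.
logits = [0.1, 0.2, 3.0, -1.0]
gate = discretize(logits)
print(gate, hard_neuron(1, 0, gate), round(soft_neuron(1.0, 0.0, logits), 3))
```

Once every neuron has been reduced to one fixed gate, inference needs only bitwise operations, which is what makes the discretized circuit fast and compact.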

Why it matters

Deep neural networks for video copy detection are computationally expensive and produce large descriptors, which limits their use in high-throughput systems. This paper offers a logic-based alternative: descriptors shrink by several orders of magnitude and inference exceeds 11k samples per second, making large-scale deployment more practical.

Original Abstract

Video copy detection requires robust similarity estimation under diverse visual distortions while operating at very large scale. Although deep neural networks achieve strong performance, their computational cost and descriptor size limit practical deployment in high-throughput systems. In this work, we propose a video copy detection framework based on differentiable Logic Gate Networks (LGNs), which replace conventional floating-point feature extractors with compact, logic-based representations. Our approach combines aggressive frame miniaturization, binary preprocessing, and a trainable LGN embedding model that learns both logical operations and interconnections. After training, the model can be discretized into a purely Boolean circuit, enabling extremely fast and memory-efficient inference. We systematically evaluate different similarity strategies, binarization schemes, and LGN architectures across multiple dataset folds and difficulty levels. Experimental results demonstrate that LGN-based models achieve competitive or superior accuracy and ranking performance compared to prior models, while producing descriptors several orders of magnitude smaller and delivering inference speeds exceeding 11k samples per second. These findings indicate that logic-based models offer a promising alternative for scalable and resource-efficient video copy detection.
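
The preprocessing steps named in the abstract (frame miniaturization and binarization) and a similarity measure on binary vectors can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's pipeline: block averaging, mean thresholding, and Hamming similarity are choices made here for simplicity, and the abstract does not say which binarization schemes or similarity strategies the authors use. In the paper the binarized frame is fed to the LGN embedding model; the sketch compares the binarized frames directly. All function names are hypothetical.

```python
def miniaturize(frame, out_h, out_w):
    """Downscale a grayscale frame (list of rows) by averaging equal blocks.

    Assumes the input height and width are multiples of out_h and out_w.
    """
    in_h, in_w = len(frame), len(frame[0])
    bh, bw = in_h // out_h, in_w // out_w
    small = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            block = [frame[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def binarize(small):
    """Threshold each pixel against the frame mean (one possible scheme)."""
    pixels = [p for row in small for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming_similarity(bits_a, bits_b):
    """Fraction of matching bits between two equal-length binary vectors."""
    matches = sum(a == b for a, b in zip(bits_a, bits_b))
    return matches / len(bits_a)


# Toy 4x4 frame and a uniformly brightened copy of it.
frame = [[10, 12, 200, 210],
         [11, 13, 205, 215],
         [220, 230, 20, 25],
         [225, 235, 22, 27]]
brighter = [[min(255, p + 15) for p in row] for row in frame]

bits = binarize(miniaturize(frame, 2, 2))
bits_copy = binarize(miniaturize(brighter, 2, 2))
print(bits, hamming_similarity(bits, bits_copy))
```

Mean thresholding is used in this toy example because a uniform brightness shift moves the threshold along with the pixels, so the brightened copy maps to the same bits as the original.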
