Neural Network Pruning via QUBO Optimization
Osama Orabi, Artur Zagitov, Hadi Salloum, Viktor A. Lobachev, Kasymkhan Khubiev, et al.
TLDR
This paper introduces a Hybrid QUBO framework for neural network pruning, combining heuristic importance estimation with global combinatorial optimization to outperform greedy and traditional L1-based QUBO pruning.
Key contributions
- Proposes a Hybrid QUBO framework for neural network pruning, merging heuristic importance with global optimization.
- Integrates gradient-aware sensitivity (Taylor/Fisher) and data-driven activation similarity into the QUBO objective (see the sketch after this list).
- Introduces a dynamic capacity-driven search that strictly enforces the target sparsity without distorting the optimization landscape.
- Employs a two-stage pipeline with a Tensor-Train (TT) Refinement stage, a gradient-free optimizer that fine-tunes the QUBO-derived solution directly against the true evaluation metric.
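As a rough illustration of the second and third contributions, the sketch below assembles a hybrid QUBO matrix: gradient-aware importance scores populate the linear (diagonal) term, pairwise activation similarity fills the quadratic term, and a soft capacity penalty stands in for the paper's dynamic capacity-driven search (which enforces the budget strictly rather than via a fixed penalty weight). All names (`taylor_importance`, `activation_sim`, `target_k`, `lam`) and the weighting are assumptions, not the paper's exact formulation.

```python
# Illustrative hybrid QUBO assembly for filter pruning (hypothetical
# names and weights; the paper's exact normalization may differ).
import numpy as np

def build_hybrid_qubo(taylor_importance, activation_sim, target_k, lam=1.0):
    """Build Q so that minimizing x^T Q x over binary x (1 = keep filter)
    trades off individual relevance against pairwise redundancy."""
    n = len(taylor_importance)
    Q = np.zeros((n, n))
    # Linear term: keeping an individually important filter lowers the cost.
    Q[np.diag_indices(n)] -= np.asarray(taylor_importance, dtype=float)
    # Quadratic term: keeping two functionally similar filters adds cost.
    Q += np.triu(np.asarray(activation_sim, dtype=float), k=1)
    # Soft budget penalty lam * (sum(x) - target_k)^2, expanded into QUBO
    # form using x_i^2 = x_i for binary variables (constant term dropped).
    Q[np.diag_indices(n)] += lam * (1.0 - 2.0 * target_k)
    Q += 2.0 * lam * np.triu(np.ones((n, n)), k=1)
    return Q
```

Any standard QUBO solver (exhaustive search at small n, simulated annealing, or an annealing device) can then minimize x^T Q x; the resulting binary vector is the keep/prune mask.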
Why it matters
This work addresses the limitations of current pruning methods with a principled, hybrid optimization approach. On the SIDD image denoising benchmark it significantly outperforms both greedy Taylor pruning and traditional L1-based QUBO, demonstrating robust and scalable neural network compression.
Original Abstract
Neural network pruning can be formulated as a combinatorial optimization problem, yet most existing approaches rely on greedy heuristics that ignore complex interactions between filters. Formal optimization methods such as Quadratic Unconstrained Binary Optimization (QUBO) provide a principled alternative but have so far underperformed due to oversimplified objective formulations based on metrics like the L1-norm. In this work, we propose a unified Hybrid QUBO framework that bridges heuristic importance estimation with global combinatorial optimization. Our formulation integrates gradient-aware sensitivity metrics - specifically first-order Taylor and second-order Fisher information - into the linear term, while utilizing data-driven activation similarity in the quadratic term. This allows the QUBO objective to jointly capture individual filter relevance and inter-filter functional redundancy. We further introduce a dynamic capacity-driven search to strictly enforce target sparsity without distorting the optimization landscape. Finally, we employ a two-stage pipeline featuring a Tensor-Train (TT) Refinement stage - a gradient-free optimizer that fine-tunes the QUBO-derived solution directly against the true evaluation metric. Experiments on the SIDD image denoising dataset demonstrate that the proposed Hybrid QUBO significantly outperforms both greedy Taylor pruning and traditional L1-based QUBO, with TT Refinement providing further consistent gains at appropriate combinatorial scales. This highlights the potential of hybrid combinatorial formulations for robust, scalable, and interpretable neural network compression.
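The TT Refinement stage described in the abstract fine-tunes the binary mask directly against the true evaluation metric. As a stand-in for the paper's Tensor-Train optimizer, the hedged sketch below uses a plain single-bit-flip hill climb; `eval_metric` is a hypothetical callable (e.g., validation PSNR of the pruned denoiser), and the loop is only meant to show the gradient-free interface, not the paper's method.

```python
# Simplified gradient-free refinement of a QUBO-derived pruning mask.
# The paper uses a Tensor-Train optimizer; this greedy bit-flip loop is
# only an illustration of tuning the mask against the true metric.
import numpy as np

def refine_mask(mask, eval_metric, max_rounds=3):
    """Flip single keep/prune bits, keeping flips that raise the metric."""
    mask = np.asarray(mask, dtype=np.int64).copy()
    best = eval_metric(mask)
    for _ in range(max_rounds):
        improved = False
        for i in range(len(mask)):
            mask[i] ^= 1                 # tentatively flip filter i
            score = eval_metric(mask)
            if score > best:             # keep an improving flip
                best, improved = score, True
            else:
                mask[i] ^= 1             # revert a harmful flip
        if not improved:                 # local optimum reached
            break
    return mask, best
```

A budget-preserving variant would swap a kept and a pruned filter per move instead of flipping single bits, so the sparsity target set by the capacity search is never violated during refinement.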