Neural Network Pruning via QUBO Optimization
Osama Orabi, Artur Zagitov, Hadi Salloum, Viktor A. Lobachev, Kasymkhan Khubiev, et al.
TLDR
This paper introduces a Hybrid QUBO framework for neural network pruning, combining heuristic importance estimation with global combinatorial optimization to outperform greedy and traditional L1-based QUBO pruning.
Key contributions
- Proposes a Hybrid QUBO framework for neural network pruning, merging heuristic importance with global optimization.
- Integrates gradient-aware sensitivity (Taylor/Fisher) and data-driven activation similarity into the QUBO objective (see the sketch after this list).
- Introduces a dynamic capacity-driven search that strictly enforces the target sparsity without distorting the optimization landscape.
- Employs a two-stage pipeline with a Tensor-Train (TT) Refinement stage, a gradient-free optimizer that fine-tunes the QUBO-derived solution directly against the true evaluation metric.
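As a rough illustration of the second and third contributions, the sketch below assembles a hybrid QUBO matrix: gradient-aware importance scores populate the linear (diagonal) term, pairwise activation similarity fills the quadratic term, and a soft capacity penalty stands in for the paper's dynamic capacity-driven search (which enforces the budget strictly rather than via a fixed penalty weight). All names (`taylor_importance`, `activation_sim`, `target_k`, `lam`) and the weighting are assumptions, not the paper's exact formulation.

```python
# Illustrative hybrid QUBO assembly for filter pruning (hypothetical
# names and weights; the paper's exact normalization may differ).
import numpy as np

def build_hybrid_qubo(taylor_importance, activation_sim, target_k, lam=1.0):
    """Build Q so that minimizing x^T Q x over binary x (1 = keep filter)
    trades off individual relevance against pairwise redundancy."""
    n = len(taylor_importance)
    Q = np.zeros((n, n))
    # Linear term: keeping an individually important filter lowers the cost.
    Q[np.diag_indices(n)] -= np.asarray(taylor_importance, dtype=float)
    # Quadratic term: keeping two functionally similar filters adds cost.
    Q += np.triu(np.asarray(activation_sim, dtype=float), k=1)
    # Soft budget penalty lam * (sum(x) - target_k)^2, expanded into QUBO
    # form using x_i^2 = x_i for binary variables (constant term dropped).
    Q[np.diag_indices(n)] += lam * (1.0 - 2.0 * target_k)
    Q += 2.0 * lam * np.triu(np.ones((n, n)), k=1)
    return Q
```

Any standard QUBO solver (exhaustive search at small n, simulated annealing, or an annealing device) can then minimize x^T Q x; the resulting binary vector is the keep/prune mask.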
Why it matters
This work addresses the limitations of current pruning methods with a principled, hybrid optimization approach. On the SIDD image denoising benchmark it significantly outperforms both greedy Taylor pruning and traditional L1-based QUBO, demonstrating robust and scalable neural network compression.
Original Abstract
Neural network pruning can be formulated as a combinatorial optimization problem, yet most existing approaches rely on greedy heuristics that ignore complex interactions between filters. Formal optimization methods such as Quadratic Unconstrained Binary Optimization (QUBO) provide a principled alternative but have so far underperformed due to oversimplified objective formulations based on metrics like the L1-norm. In this work, we propose a unified Hybrid QUBO framework that bridges heuristic importance estimation with global combinatorial optimization. Our formulation integrates gradient-aware sensitivity metrics - specifically first-order Taylor and second-order Fisher information - into the linear term, while utilizing data-driven activation similarity in the quadratic term. This allows the QUBO objective to jointly capture individual filter relevance and inter-filter functional redundancy. We further introduce a dynamic capacity-driven search to strictly enforce target sparsity without distorting the optimization landscape. Finally, we employ a two-stage pipeline featuring a Tensor-Train (TT) Refinement stage - a gradient-free optimizer that fine-tunes the QUBO-derived solution directly against the true evaluation metric. Experiments on the SIDD image denoising dataset demonstrate that the proposed Hybrid QUBO significantly outperforms both greedy Taylor pruning and traditional L1-based QUBO, with TT Refinement providing further consistent gains at appropriate combinatorial scales. This highlights the potential of hybrid combinatorial formulations for robust, scalable, and interpretable neural network compression.
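The TT Refinement stage described in the abstract fine-tunes the binary mask directly against the true evaluation metric. As a stand-in for the paper's Tensor-Train optimizer, the hedged sketch below uses a plain single-bit-flip hill climb; `eval_metric` is a hypothetical callable (e.g., validation PSNR of the pruned denoiser), and the loop is only meant to show the gradient-free interface, not the paper's method.

```python
# Simplified gradient-free refinement of a QUBO-derived pruning mask.
# The paper uses a Tensor-Train optimizer; this greedy bit-flip loop is
# only an illustration of tuning the mask against the true metric.
import numpy as np

def refine_mask(mask, eval_metric, max_rounds=3):
    """Flip single keep/prune bits, keeping flips that raise the metric."""
    mask = np.asarray(mask, dtype=np.int64).copy()
    best = eval_metric(mask)
    for _ in range(max_rounds):
        improved = False
        for i in range(len(mask)):
            mask[i] ^= 1                 # tentatively flip filter i
            score = eval_metric(mask)
            if score > best:             # keep an improving flip
                best, improved = score, True
            else:
                mask[i] ^= 1             # revert a harmful flip
        if not improved:                 # local optimum reached
            break
    return mask, best
```

A budget-preserving variant would swap a kept and a pruned filter per move instead of flipping single bits, so the sparsity target set by the capacity search is never violated during refinement.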