ArXiv TLDR

Learn&Drop: Fast Learning of CNNs based on Layer Dropping

arXiv:2604.23403

Giorgio Cruciata, Luca Cruciata, Liliana Lo Presti, Jan Van Gemert, Marco La Cascia

cs.CV · cs.AI · cs.NE

TLDR

Learn&Drop dynamically drops CNN layers during training, more than halving training time and cutting forward-pass FLOPs without significantly impacting accuracy.

Key contributions

  • Dynamically drops CNN layers during training based on parameter change scores.
  • Reduces forward propagation operations and trainable parameters for faster learning.
  • More than halves training time and cuts forward-pass FLOPs by up to 83.74% (ResNet-152) with minimal accuracy loss.
  • Especially beneficial for fine-tuning or online training of convolutional models.

Why it matters

This paper introduces a novel training strategy that significantly accelerates CNN learning by dynamically dropping layers from the forward propagation during training. Unlike network compression, which targets the inference phase, the method focuses on training efficiency. This makes it especially valuable for fine-tuning or online training of convolutional models, where retraining speed matters.

Original Abstract

This paper proposes a new method to improve the training efficiency of deep convolutional neural networks. During training, the method evaluates scores to measure how much each layer's parameters change and whether the layer will continue learning or not. Based on these scores, the network is scaled down such that the number of parameters to be learned is reduced, yielding a speed up in training. Unlike state-of-the-art methods that try to compress the network to be used in the inference phase or to limit the number of operations performed in the backpropagation phase, the proposed method is novel in that it focuses on reducing the number of operations performed by the network in the forward propagation during training. The proposed training strategy has been validated on two widely used architecture families: VGG and ResNet. Experiments on MNIST, CIFAR-10 and Imagenette show that, with the proposed method, the training time of the models is more than halved without significantly impacting accuracy. The FLOPs reduction in the forward propagation during training ranges from 17.83% for VGG-11 to 83.74% for ResNet-152. These results demonstrate the effectiveness of the proposed technique in speeding up learning of CNNs. The technique will be especially useful in applications where fine-tuning or online training of convolutional models is required, for instance because data arrive sequentially.
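
To make the mechanism concrete, here is a minimal PyTorch sketch of the general idea under assumed details, not the authors' implementation: the per-layer score is taken to be the relative L2 change of a layer's parameters between epochs, and a shape-preserving residual block whose score falls below a threshold is replaced with nn.Identity so it is skipped in forward propagation. The ResBlock class, the score formula, and the threshold are illustrative assumptions.

```python
# Simplified sketch of the layer-dropping idea, NOT the authors' exact algorithm.
# Assumed (hypothetical) details: the per-layer score is the relative L2 change of
# the layer's parameters between epochs, and a shape-preserving block whose score
# falls below a threshold is replaced by nn.Identity, so it no longer contributes
# to the forward (or backward) pass.
import torch
import torch.nn as nn


class ResBlock(nn.Module):
    """Toy residual block that preserves the input shape."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return torch.relu(x + self.conv2(torch.relu(self.conv1(x))))


def change_score(block, prev_params):
    """Relative L2 change of a block's parameters since the last snapshot."""
    cur = torch.cat([p.detach().flatten() for p in block.parameters()])
    old = torch.cat([p.flatten() for p in prev_params])
    return ((cur - old).norm() / (old.norm() + 1e-12)).item()


def drop_stable_blocks(blocks, snapshots, thr=1e-3):
    """Replace blocks whose parameters have stopped changing with nn.Identity,
    removing their cost from forward propagation in later epochs."""
    for i, block in enumerate(blocks):
        if isinstance(block, nn.Identity):
            continue  # already dropped
        if change_score(block, snapshots[i]) < thr:
            blocks[i] = nn.Identity()


# Usage inside a training loop: snapshot parameters once per epoch, train,
# then score each block and drop the ones that have effectively stopped learning.
blocks = nn.Sequential(*[ResBlock(16) for _ in range(4)])
snapshots = {i: [p.detach().clone() for p in b.parameters()] for i, b in enumerate(blocks)}
# ... run one epoch of training with `blocks` as part of a larger CNN ...
drop_stable_blocks(blocks, snapshots, thr=1e-3)
```

In the paper, the scores also decide whether a layer will keep learning, and the strategy is validated on the VGG and ResNet families; dropping layers that change the tensor shape (as in VGG) would need more care than this shape-preserving example.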
