Dynamic Model Pruning with Feedback

Tao Lin; Sebastian U. Stich; Luis Barba; Daniil Dmitriev; Martin Jaggi

Keywords: compression, imagenet, memory, model compression, pruning

Wed Session 2 (08:00-10:00 GMT)

Wed Session 3 (12:00-14:00 GMT)

Abstract: Deep neural networks often have millions of parameters. This can hinder their deployment to low-end devices, not only because of high memory requirements but also because of increased inference latency. We propose a novel model compression method that generates a sparse trained model without additional overhead: by (i) allowing dynamic allocation of the sparsity pattern and (ii) incorporating a feedback signal to reactivate prematurely pruned weights, we obtain a performant sparse model in a single training pass (retraining is not needed, but can further improve performance). We evaluate the method on CIFAR-10 and ImageNet and show that the resulting sparse models can reach the state-of-the-art performance of dense models, and that they outperform all previously proposed pruning schemes (which come without feedback mechanisms).
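The two ingredients named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It assumes magnitude-based pruning, a toy least-squares objective, and plain gradient descent. The names `magnitude_mask`, `gradient`, `train_with_feedback`, and `remask_every` are hypothetical, chosen only for this illustration. The idea shown is that a dense copy of the weights is kept throughout, the mask is recomputed periodically (dynamic allocation), and the gradient evaluated at the pruned model is applied to the dense weights, so a pruned weight keeps receiving updates and can re-enter the mask later (feedback).

```python
import random

def magnitude_mask(w, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude entries of w."""
    keep = round((1.0 - sparsity) * len(w))  # number of weights to keep
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    kept = set(order[:keep])
    return [1.0 if i in kept else 0.0 for i in range(len(w))]

def gradient(X, y, w):
    """Gradient of the toy loss 0.5 * mean((Xw - y)^2) with respect to w."""
    n = len(y)
    residuals = [sum(x * wi for x, wi in zip(row, w)) - yi
                 for row, yi in zip(X, y)]
    return [sum(r * row[j] for r, row in zip(residuals, X)) / n
            for j in range(len(w))]

def train_with_feedback(X, y, sparsity=0.5, lr=0.1, steps=300,
                        remask_every=10, seed=0):
    """Toy single-pass sparse training; all hyperparameters are assumptions."""
    rng = random.Random(seed)
    # Dense weights are kept for the whole run; only the mask is sparse.
    w = [rng.gauss(0.0, 0.1) for _ in range(len(X[0]))]
    mask = magnitude_mask(w, sparsity)
    for t in range(steps):
        if t % remask_every == 0:
            # Dynamic allocation: the sparsity pattern is recomputed from
            # the current dense weights instead of being fixed once.
            mask = magnitude_mask(w, sparsity)
        w_sparse = [m * wi for m, wi in zip(mask, w)]
        # Feedback: the gradient is evaluated at the pruned model but is
        # applied to the dense weights, so pruned entries can grow back.
        g = gradient(X, y, w_sparse)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    mask = magnitude_mask(w, sparsity)
    return [m * wi for m, wi in zip(mask, w)], mask
```

Calling `train_with_feedback(X, y, sparsity=0.5)` on a list-of-lists design matrix `X` and a target list `y` returns the pruned weights together with the final mask. Because the mask is only ever applied when computing the forward model, no separate retraining phase is required in this sketch.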

Similar Papers

Picking Winning Tickets Before Training by Preserving Gradient Flow

Chaoqi Wang, Guodong Zhang, Roger Grosse,

Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers

Junjie LIU, Zhe XU, Runbin SHI, Ray C. C. Cheung, Hayden K.H. So,

Scalable Model Compression by Entropy Penalized Reparameterization

Deniz Oktay, Johannes Ballé, Saurabh Singh, Abhinav Shrivastava,

Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks

Leopold Cambier, Anahita Bhiwandiwalla, Ting Gong, Oguz H. Elibol, Mehran Nekuii, Hanlin Tang,