EfficientDNNs
A collection of recent methods on DNN compression and acceleration. There are mainly five kinds of methods for efficient DNNs (a short illustrative code sketch follows the list):
- neural architecture re-design or search (NAS)
  - maintain accuracy, less cost (e.g., #Params, #FLOPs): MobileNet, ShuffleNet, etc.
  - maintain cost, more accuracy: Inception, ResNeXt, Xception, etc.
- pruning (including structured and unstructured)
- quantization
- matrix/low-rank decomposition
- knowledge distillation (KD)
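To make the taxonomy above concrete, below is a minimal, illustrative sketch (not taken from any paper in this list) of three techniques this repo focuses on: unstructured magnitude pruning, post-training dynamic quantization, and a standard knowledge-distillation loss. It assumes PyTorch; the toy model, sparsity ratio, temperature `T`, and weight `alpha` are arbitrary choices for illustration, not recommendations.

```python
# Illustrative sketch only: magnitude pruning, dynamic quantization, and a KD loss.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune

# A toy model standing in for a real network.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# (1) Unstructured pruning: zero out the 50% smallest-magnitude weights per Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # fold the mask into the weight tensor

# (2) Post-training dynamic quantization: Linear weights stored and used in int8 at inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# (3) Knowledge distillation: student matches the teacher's softened outputs
#     plus the usual cross-entropy on the hard labels.
def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Quick smoke test with random data.
x = torch.randn(8, 784)
teacher_logits = torch.randn(8, 10)   # stand-in for a real teacher's outputs
labels = torch.randint(0, 10, (8,))
print(quantized(x).shape)                          # torch.Size([8, 10])
print(kd_loss(quantized(x), teacher_logits, labels).item())
```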
Note: this repo is mainly about pruning (with the lottery ticket hypothesis, or LTH, as a sub-topic), KD, and quantization. For other topics such as NAS, see the more comprehensive collections listed in the Related Repos and Websites section at the end of this file. Welcome to send a pull request if you'd like to add any pertinent papers.
Other repos:
- LTH (lottery ticket hypothesis) and its broader version, pruning at initialization (PaI), are now at the frontier of network pruning. We single out the PaI papers into a dedicated repo. Welcome to check it out!
- Awesome-Efficient-ViT for a curated list of efficient vision transformers.
About abbreviations: in the list below, `o` stands for oral, `s` for spotlight, `b` for best paper, and `w` for workshop.
Surveys
- 1993-TNN-Pruning Algorithms -- A survey
- 2017-Proceedings of the IEEE-Efficient Processing of Deep Neural Networks: A Tutorial and Survey [2020 Book: Efficient Processing of Deep Neural Networks]
- 2017.12-A survey of FPGA-based neural network accelerator
- 2018-FITEE-Recent Advances in Efficient Computation of Deep Convolutional Neural Networks
- 2018-IEEE Signal Processing Magazine-Model compression and acceleration for deep neural networks: The principles, progress, and challenges [arXiv extension]
- 2018.8-A Survey on Methods and Theories of Quantized Neural Networks
- 2019-JMLR-Neural Architecture Search: A Survey
- 2020-MLSys-What is the State of Neural Network Pruning?
- 2019.02-The State of Sparsity in Deep Neural Networks
- 2021-TPAMI-Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks
- 2021-IJCV-Knowledge Distillation: A Survey
- 2020-Proceedings of the IEEE-Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey
- 2020-Pattern Recognition-Binary neural networks: A survey
- 2021-TPDS-The Deep Learning Compiler: A Comprehensive Survey
- 2021-JMLR-Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
- 2022-IJCAI-Recent Advances on Neural Network Pruning at Initialization
- 2021.6-Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Papers [Pruning and Quantization]
1980s, 1990s
- 1988-NIPS-A back-propagation algorithm with optimal use of hidden units
- 1988-NIPS-Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment
- 1988-NIPS-What Size Net Gives Valid Generalization?
- 1989-NIPS-Dynamic Behavior of Constrained Back-Propagation Networks
- 1988-NIPS-Comparing Biases for Minimal Network Construction with Back-Propagation
- 1989-NIPS-Optimal Brain Damage
- 1990-NN-A simple procedure for pruning back-propagation trained neural networks
- 1992-NIPS-Second order derivatives for network pruning: Optimal Brain Surgeon
- 1993-ICNN-Optimal Brain Surgeon and general network pruning
2000s
- 2001-JMLR-Sparse Bayesian learning and the relevance vector machine
- 2007-Book-The minimum description length principle
2011
- 2011-JMLR-Learning with Structured Sparsity
- 2011-NIPSw-Improving the speed of neural networks on CPUs
2013
- 2013-NIPS-Predicting Parameters in Deep Learning
- 2013.08-Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
2014
- 2014-BMVC-Speeding up convolutional neural networks with low rank expansions
- 2014-INTERSPEECH-1-Bit Stochastic Gradient Descent and its Application to Data-Parallel Distributed Training of Speech DNNs
- 2014-NIPS-Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
- 2014-NIPS-Do deep neural nets really need to be deep
- 2014.12-Memory bounded deep convolutional networks
2015
- 2015-ICLR-Speeding-up convolutional neural networks using fine-tuned CP-decomposition
- 2015-ICML-Compressing neural networks with the hashing trick
- 2015-INTERSPEECH-A Diversity-Penalizing Ensemble Training Method for Deep Learning
- 2015-BMVC-Data-free parameter pruning for deep neural networks
- 2015-BMVC-Learning the structure of deep architectures using l1 regularization
- 2015-NIPS-Learning both Weights and Connections for Efficient Neural Network
- 2015-NIPS-Binaryconnect: Training deep neural networks with binary weights during propagations
- 2015-NIPS-Structured Transforms for Small-Footprint Deep Learning
- 2015-NIPS-Tensorizing Neural Networks
- 2015-NIPSw-Distilling Intractable Generative Models
- 2015-NIPSw-Federated Optimization: Distributed Optimization Beyond the Datacenter
- 2015-CVPR-Efficient and Accurate Approximations of Nonlinear Convolutional Networks [2016 TPAMI version: Accelerating Very Deep Convolutional Networks for Classification and Detection]
- 2015-CVPR-Sparse Convolutional Neural Networks
- 2015-ICCV-An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections
- 2015.12-Exploiting Local Structures with the Kronecker Layer in Convolutional Networks
2016
- 2016-ICLR-Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding [Best paper!]
- 2016-ICLR-All you need is a good init [Code]
- 2016-ICLR-Data-dependent Initializations of Convolutional Neural Networks [Code]
- 2016-ICLR-Convolutional neural networks with low-rank regularization [Code]
- 2016-ICLR-Diversity networks
- 2016-ICLR-Neural networks with few multiplications
- 2016-ICLR-Compression of deep convolutional neural networks for fast and low power mobile applications
- 2016-ICLRw-Randomout: Using a convolutional gradient norm to win the filter lottery
- 2016-CVPR-Fast algorithms for convolutional neural networks
- 2016-CVPR-Fast ConvNets Using Group-wise Brain Damage
- 2016-BMVC-Learning neural network architectures using backpropagation
- 2016-ECCV-Less is more: Towards compact CNNs
- 2016-EMNLP-Sequence-Level Knowledge Distillation
- 2016-NIPS-Learning Structured Sparsity in Deep Neural Networks [Caffe Code]
- 2016-NIPS-Dynamic Network Surgery for Efficient DNNs [Caffe Code]
- 2016-NIPS-Learning the Number of Neurons in Deep Neural Networks
- 2016-NIPS-Memory-Efficient Backpropagation Through Time
- 2016-NIPS-PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
- 2016-NIPS-LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
- 2016-NIPS-CNNpack: packing convolutional neural networks in the frequency domain
- 2016-ISCA-Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks
- 2016-ICASSP-Learning compact recurrent neural networks
- 2016-CoNLL-Compression of Neural Machine Translation Models via Pruning
- 2016.03-Adaptive Computation Time for Recurrent Neural Networks
- 2016.06-Structured Convolution Matrices for Energy-efficient Deep Learning