- See Also
- Links
- “Pruning Compact ConvNets for Efficient Inference”, Et Al 2023
- “Lottery Tickets on a Data Diet: Finding Initializations With Sparse Trainable Networks”, Et Al 2022
- “PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression”, Et Al 2022
- “Data-Efficient Structured Pruning via Submodular Optimization”, Et Al 2022
- “The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks”, Et Al 2022
- “Sparsity Winning Twice: Better Robust Generalization from More Efficient Training”, Chen Et Al 2022
- “Fortuitous Forgetting in Connectionist Networks”, Et Al 2022
- “How Many Degrees of Freedom Do We Need to Train Deep Networks: a Loss Landscape Perspective”, Et Al 2021
- “Prune Once for All: Sparse Pre-Trained Language Models”, Zafrir Et Al 2021
- “DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models”, Chen Et Al 2021
- “HALP: Hardware-Aware Latency Pruning”, Et Al 2021
- “On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis”, Et Al 2021
- “Block Pruning For Faster Transformers”, Et Al 2021
- “Scaling Laws for Deep Learning”, Rosenfeld 2021
- “A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness”, Et Al 2021
- “Chasing Sparsity in Vision Transformers: An End-to-End Exploration”, Chen Et Al 2021
- “On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning”, Et Al 2021
- “Sifting out the Features by Pruning: Are Convolutional Networks the Winning Lottery Ticket of Fully Connected Ones?”, 2021
- “Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch”, Et Al 2021
- “Postnatal Connectomic Development of Inhibition in Mouse Barrel Cortex”, Et Al 2021
- “ES-ENAS: Blackbox Optimization over Hybrid Spaces via Combinatorial and Continuous Evolution”, Et Al 2021
- “A Primer in BERTology: What We Know about How BERT Works”, Et Al 2020
- “Optimal Subarchitecture Extraction For BERT”, 2020
- “Pruning Neural Networks at Initialization: Why Are We Missing the Mark?”, Et Al 2020
- “Logarithmic Pruning Is All You Need”, Et Al 2020
- “On the Predictability of Pruning Across Scales”, Rosenfeld Et Al 2020
- “Pruning Neural Networks without Any Data by Iteratively Conserving Synaptic Flow”, Et Al 2020
- “Movement Pruning: Adaptive Sparsity by Fine-Tuning”, Et Al 2020
- “Bayesian Bits: Unifying Quantization and Pruning”, Et Al 2020
- “Lite Transformer With Long-Short Range Attention”, Et Al 2020
- “On the Effect of Dropping Layers of Pre-trained Transformer Models”, Sajjad Et Al 2020
- “Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers”, Et Al 2020
- “Sparse Networks from Scratch: Faster Training without Losing Performance”, 2019
- “Playing the Lottery With Rewards and Multiple Languages: Lottery Tickets in RL and NLP”, Et Al 2019
- “SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers”, Et Al 2019
- “Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned”, Et Al 2019
- “Stabilizing the Lottery Ticket Hypothesis”, Frankle Et Al 2019
- “The State of Sparsity in Deep Neural Networks”, Gale Et Al 2019
- “Differential Contribution of Cortical Thickness, Surface Area, and Gyrification to Fluid and Crystallized Intelligence”, Et Al 2019
- “A Closer Look at Structured Pruning for Neural Network Compression”, Crowley Et Al 2018
- “The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks”, Frankle & Carbin 2018
- “Efficient Neural Audio Synthesis”, Et Al 2018
- “Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks”, Mittal Et Al 2018
- “Learning to Prune Filters in Convolutional Neural Networks”, Et Al 2018
- “Faster Gaze Prediction With Dense Networks and Fisher Pruning”, Et Al 2018
- “Automated Pruning for Deep Neural Network Compression”, Et Al 2017
- “Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method”, Et Al 2017
- “NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm”, Et Al 2017
- “To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression”, 2017
- “Bayesian Sparsification of Recurrent Neural Networks”, Et Al 2017
- “Structured Bayesian Pruning via Log-Normal Multiplicative Noise”, Et Al 2017
- “Exploring Sparsity in Recurrent Neural Networks”, Et Al 2017
- “Variational Dropout Sparsifies Deep Neural Networks”, Et Al 2017
- “Iterative Magnitude Pruning: Learning Both Weights and Connections for Efficient Neural Networks”, Han Et Al 2015
- “Flat Minima”, Hochreiter & Schmidhuber 1997
- “Optimal Brain Surgeon and General Network Pruning”, Hassibi Et Al 1993
- “Optimal Brain Damage”, LeCun Et Al 1989
- Miscellaneous
- Link Bibliography
See Also
Links
“Pruning Compact ConvNets for Efficient Inference”, Et Al 2023
“Pruning Compact ConvNets for Efficient Inference”, 2023-01-11 (similar)
“Lottery Tickets on a Data Diet: Finding Initializations With Sparse Trainable Networks”, Et Al 2022
“Lottery Tickets on a Data Diet: Finding Initializations with Sparse Trainable Networks”, 2022-06-02 (similar)
“PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression”, Et Al 2022
“PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression”, 2022-03-16 (similar)
“Data-Efficient Structured Pruning via Submodular Optimization”, Et Al 2022
“Data-Efficient Structured Pruning via Submodular Optimization”, 2022-03-09 (similar)
“The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks”, Et Al 2022
“The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks”, 2022-03-09 (similar)
“Sparsity Winning Twice: Better Robust Generalization from More Efficient Training”, Chen Et Al 2022
“Sparsity Winning Twice: Better Robust Generalization from More Efficient Training”, 2022-02-20 (similar; bibliography)
“Fortuitous Forgetting in Connectionist Networks”, Et Al 2022
“Fortuitous Forgetting in Connectionist Networks”, 2022-02-01 (similar)
“How Many Degrees of Freedom Do We Need to Train Deep Networks: a Loss Landscape Perspective”, Et Al 2021
“How many degrees of freedom do we need to train deep networks: a loss landscape perspective”, 2021-11-20 (similar)
“Prune Once for All: Sparse Pre-Trained Language Models”, Zafrir Et Al 2021
“Prune Once for All: Sparse Pre-Trained Language Models”, 2021-11-10 (similar; bibliography)
“DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models”, Chen Et Al 2021
“DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models”, 2021-10-30 (similar; bibliography)
“HALP: Hardware-Aware Latency Pruning”, Et Al 2021
“HALP: Hardware-Aware Latency Pruning”, 2021-10-20 (similar)
“On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis”, Et Al 2021
“On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis”, 2021-10-04 (similar)
“Block Pruning For Faster Transformers”, Et Al 2021
“Block Pruning For Faster Transformers”, 2021-09-10 (similar)
“Scaling Laws for Deep Learning”, Rosenfeld 2021
“Scaling Laws for Deep Learning”, 2021-08-17 (backlinks; similar; bibliography)
“A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness”, Et Al 2021
“A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness”, 2021-06-16 (similar)
“Chasing Sparsity in Vision Transformers: An End-to-End Exploration”, Chen Et Al 2021
“Chasing Sparsity in Vision Transformers: An End-to-End Exploration”, 2021-06-08 (similar; bibliography)
“On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning”, Et Al 2021
“On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning”, 2021-05-04 (similar)
“Sifting out the Features by Pruning: Are Convolutional Networks the Winning Lottery Ticket of Fully Connected Ones?”, 2021
“Sifting out the features by pruning: Are convolutional networks the winning lottery ticket of fully connected ones?”, 2021-04-27 (backlinks; similar)
“Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch”, Et Al 2021
“Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch”, 2021-02-08 (similar)
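The N:M scheme above constrains every group of m consecutive weights to hold at most n nonzeros (e.g. 2:4), a pattern that sparsity-aware hardware can accelerate. As a rough illustration only (the paper itself learns the sparse pattern from scratch during training rather than pruning a trained dense network once), a magnitude-based 2:4 mask can be computed in a few lines of PyTorch; `nm_prune_mask` is a name invented for this sketch:

```python
import torch

def nm_prune_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Return a {0,1} mask keeping the n largest-magnitude weights in every
    group of m consecutive input weights (e.g. 2:4 sparsity)."""
    out_features, in_features = weight.shape
    assert in_features % m == 0, "input dimension must be divisible by m"
    groups = weight.abs().reshape(out_features, in_features // m, m)
    keep = groups.topk(n, dim=-1).indices       # n largest |w| per group of m
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, keep, 1.0)                # mark the survivors
    return mask.reshape(out_features, in_features)

w = torch.randn(8, 16)
sparse_w = w * nm_prune_mask(w)                 # exactly 2 nonzeros per group of 4
```

In practice the mask would be recomputed or learned during training so the network can adapt to the constraint.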
“Postnatal Connectomic Development of Inhibition in Mouse Barrel Cortex”, Et Al 2021
“Postnatal connectomic development of inhibition in mouse barrel cortex”, 2021-01-29 (backlinks; similar)
“ES-ENAS: Blackbox Optimization over Hybrid Spaces via Combinatorial and Continuous Evolution”, Et Al 2021
“ES-ENAS: Blackbox Optimization over Hybrid Spaces via Combinatorial and Continuous Evolution”, 2021-01-19 (similar)
“A Primer in BERTology: What We Know about How BERT Works”, Et Al 2020
“A Primer in BERTology: What we know about how BERT works”, 2020-11-09 (similar)
“Optimal Subarchitecture Extraction For BERT”, 2020
“Optimal Subarchitecture Extraction For BERT”, 2020-10-20 (similar)
“Pruning Neural Networks at Initialization: Why Are We Missing the Mark?”, Et Al 2020
“Pruning Neural Networks at Initialization: Why are We Missing the Mark?”, 2020-09-18 (backlinks; similar)
“Logarithmic Pruning Is All You Need”, Et Al 2020
“Logarithmic Pruning is All You Need”, 2020-06-22 (similar)
“On the Predictability of Pruning Across Scales”, Rosenfeld Et Al 2020
“On the Predictability of Pruning Across Scales”, 2020-06-18 (backlinks; similar; bibliography)
“Pruning Neural Networks without Any Data by Iteratively Conserving Synaptic Flow”, Et Al 2020
“Pruning neural networks without any data by iteratively conserving synaptic flow”, 2020-06-09 (similar)
“Movement Pruning: Adaptive Sparsity by Fine-Tuning”, Et Al 2020
“Movement Pruning: Adaptive Sparsity by Fine-Tuning”, 2020-05-15 (similar)
“Bayesian Bits: Unifying Quantization and Pruning”, Et Al 2020
“Bayesian Bits: Unifying Quantization and Pruning”, 2020-05-14 (similar)
“Lite Transformer With Long-Short Range Attention”, Et Al 2020
“Lite Transformer with Long-Short Range Attention”, 2020-04-24 (backlinks; similar)
“On the Effect of Dropping Layers of Pre-trained Transformer Models”, Sajjad Et Al 2020
“On the Effect of Dropping Layers of Pre-trained Transformer Models”, 2020-04-08 (similar; bibliography)
“Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers”, Et Al 2020
“Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers”, 2020-02-26 (backlinks; similar)
“Sparse Networks from Scratch: Faster Training without Losing Performance”, 2019
“Sparse Networks from Scratch: Faster Training without Losing Performance”, 2019-07-10 (similar)
“Playing the Lottery With Rewards and Multiple Languages: Lottery Tickets in RL and NLP”, Et Al 2019
“Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP”, 2019-06-06 (similar)
“SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers”, Et Al 2019
“SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers”, 2019-05-28 (backlinks; similar)
“Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned”, Et Al 2019
“Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned”, 2019-05-23 (similar)
“Stabilizing the Lottery Ticket Hypothesis”, Frankle Et Al 2019
“Stabilizing the Lottery Ticket Hypothesis”, 2019-03-05 (similar; bibliography)
“The State of Sparsity in Deep Neural Networks”, Gale Et Al 2019
“The State of Sparsity in Deep Neural Networks”, 2019-02-25 (similar; bibliography)
“Differential Contribution of Cortical Thickness, Surface Area, and Gyrification to Fluid and Crystallized Intelligence”, Et Al 2019
“Differential Contribution of Cortical Thickness, Surface Area, and Gyrification to Fluid and Crystallized Intelligence”, 2019 (similar)
“A Closer Look at Structured Pruning for Neural Network Compression”, Crowley Et Al 2018
“A Closer Look at Structured Pruning for Neural Network Compression”, 2018-10-10 (similar; bibliography)
“The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks”, Frankle & Carbin 2018
“The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks”, 2018-03-09 (similar)
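For context, the lottery-ticket procedure is: train the dense network, prune the smallest-magnitude weights, rewind the survivors to their original initialization, and retrain, repeating for several rounds. A minimal PyTorch sketch of that loop, assuming a hypothetical `train_fn(model)` closure that trains the model in place and pruning only weight matrices:

```python
import copy
import torch

def find_lottery_ticket(model, train_fn, prune_frac=0.2, rounds=3):
    """Iterative magnitude pruning with rewinding to the original initialization."""
    init_state = copy.deepcopy(model.state_dict())            # theta_0, kept for rewinding
    masks = {name: torch.ones_like(p, dtype=torch.bool)
             for name, p in model.named_parameters() if p.dim() > 1}
    for _ in range(rounds):
        train_fn(model)                                       # train the (masked) network to completion
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name not in masks:
                    continue
                alive = p[masks[name]].abs()                  # magnitudes of surviving weights
                threshold = alive.quantile(prune_frac)        # drop the smallest prune_frac of them
                masks[name] &= p.abs() > threshold
            model.load_state_dict(init_state)                 # rewind survivors to their init values...
            for name, p in model.named_parameters():
                if name in masks:
                    p *= masks[name]                          # ...and zero the pruned weights again
    return model, masks
```

A faithful implementation would also re-apply the masks after every optimizer step so that pruned weights cannot be revived mid-training; that bookkeeping is omitted from this sketch.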
“Efficient Neural Audio Synthesis”, Et Al 2018
“Efficient Neural Audio Synthesis”, 2018-02-23 (similar)
“Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks”, Mittal Et Al 2018
“Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks”, 2018-01-31 (similar; bibliography)
“Learning to Prune Filters in Convolutional Neural Networks”, Et Al 2018
“Learning to Prune Filters in Convolutional Neural Networks”, 2018-01-23 (backlinks; similar)
“Faster Gaze Prediction With Dense Networks and Fisher Pruning”, Et Al 2018
“Faster gaze prediction with dense networks and Fisher pruning”, 2018-01-17 (similar)
“Automated Pruning for Deep Neural Network Compression”, Et Al 2017
“Automated Pruning for Deep Neural Network Compression”, 2017-12-05 (similar)
“Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method”, Et Al 2017
“Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method”, 2017-11-17 (similar)
“NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm”, Et Al 2017
“NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm”, 2017-11-06 (similar)
“To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression”, 2017
“To prune, or not to prune: exploring the efficacy of pruning for model compression”, 2017-10-05 (similar)
“Bayesian Sparsification of Recurrent Neural Networks”, Et Al 2017
“Bayesian Sparsification of Recurrent Neural Networks”, 2017-07-31 (similar)
“Structured Bayesian Pruning via Log-Normal Multiplicative Noise”, Et Al 2017
“Structured Bayesian Pruning via Log-Normal Multiplicative Noise”, 2017-05-20 (similar)
“Exploring Sparsity in Recurrent Neural Networks”, Et Al 2017
“Exploring Sparsity in Recurrent Neural Networks”, 2017-04-17 (similar)
“Variational Dropout Sparsifies Deep Neural Networks”, Et Al 2017
“Variational Dropout Sparsifies Deep Neural Networks”, 2017-01-19 (similar)
“Iterative Magnitude Pruning: Learning Both Weights and Connections for Efficient Neural Networks”, Han Et Al 2015
“Iterative Magnitude Pruning: Learning both Weights and Connections for Efficient Neural Networks”, 2015-06-08 (similar)
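The Han et al 2015 pipeline alternates magnitude pruning with fine-tuning of the surviving weights, iterating to reach higher sparsity. A one-shot, global-threshold version of just the pruning step, simplified from the paper's layer-wise recipe and written as a hedged PyTorch sketch rather than the authors' exact code, might look like:

```python
import torch

def global_magnitude_prune(model, sparsity=0.9):
    """Zero the smallest-|w| fraction of all weight-matrix entries;
    the caller then fine-tunes the surviving weights."""
    weights = [p for p in model.parameters() if p.dim() > 1]
    all_mags = torch.cat([p.detach().abs().flatten() for p in weights])
    k = int(sparsity * all_mags.numel())
    if k == 0:
        return model                                  # nothing to prune
    threshold = all_mags.kthvalue(k).values           # k-th smallest magnitude overall
    with torch.no_grad():
        for p in weights:
            p *= (p.abs() > threshold)                # keep only weights above the cutoff
    return model
```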
“Flat Minima”, Hochreiter & Schmidhuber 1997
“Flat Minima”, 1997 (similar)
“Optimal Brain Surgeon and General Network Pruning”, Hassibi Et Al 1993
“Optimal Brain Surgeon and general network pruning”, 1993-03-28 (backlinks; similar; bibliography)
“Optimal Brain Damage”, LeCun Et Al 1989
“Optimal Brain Damage”, 1989 (backlinks; similar)
Miscellaneous
Link Bibliography
- https://arxiv.org/abs/2202.09844: “Sparsity Winning Twice: Better Robust Generalization from More Efficient Training”, Tianlong Chen, Zhenyu Zhang, Pengjun Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zhangyang Wang
- https://arxiv.org/abs/2111.05754: “Prune Once for All: Sparse Pre-Trained Language Models”, Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- https://arxiv.org/abs/2111.00160: “DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models”, Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Chen, Zhangyang Wang, Ahmed Hassan Awadallah
- https://arxiv.org/abs/2108.07686: “Scaling Laws for Deep Learning”, Jonathan S. Rosenfeld
- https://arxiv.org/abs/2106.04533: “Chasing Sparsity in Vision Transformers: An End-to-End Exploration”, Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
- https://arxiv.org/abs/2006.10621: “On the Predictability of Pruning Across Scales”, Jonathan S. Rosenfeld, Jonathan Frankle, Michael Carbin, Nir Shavit
- https://arxiv.org/abs/2004.03844: “On the Effect of Dropping Layers of Pre-trained Transformer Models”, Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- https://arxiv.org/abs/1903.01611: “Stabilizing the Lottery Ticket Hypothesis”, Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
- https://arxiv.org/abs/1902.09574: “The State of Sparsity in Deep Neural Networks”, Trevor Gale, Erich Elsen, Sara Hooker
- https://arxiv.org/abs/1810.04622: “A Closer Look at Structured Pruning for Neural Network Compression”, Elliot J. Crowley, Jack Turner, Amos Storkey, Michael O’Boyle
- https://arxiv.org/abs/1801.10447: “Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks”, Deepak Mittal, Shweta Bhardwaj, Mitesh M. Khapra, Balaraman Ravindran
- 1993-hassibi.pdf: “Optimal Brain Surgeon and General Network Pruning”, Babak Hassibi, David G. Stork, Gregory J. Wolff