“‘NN Sparsity’ Tag”, 2019-12-07:
Bibliography for tag ai/nn/sparsity, most recent first: 7 related tags, 88 annotations, & 22 links.
- Links
- “Convolutional Differentiable Logic Gate Networks”, Petersen et al 2024
- “LoRA vs Full Fine-Tuning: An Illusion of Equivalence”, 2024
- “On the Complexity of Neural Computation in Superposition”, 2024
- “GSoC 2024: Differentiable Logic for Interactive Systems and Generative Music”
- “CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models”, 2024
- “Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?”, 2024
- “ReFT: Representation Finetuning for Language Models”, Wu et al 2024
- “Mechanistic Design and Scaling of Hybrid Architectures”, Poli et al 2024
- “LTE: Training Neural Networks from Scratch With Parallel Low-Rank Adapters”, 2024
- “Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet”
- “Exponentially Faster Language Modeling”, 2023
- “DiLoCo: Distributed Low-Communication Training of Language Models”, Douillard et al 2023
- “Language Models Are Super Mario (DARE): Absorbing Abilities from Homologous Models As a Free Lunch”, Yu et al 2023
- “ProSG: Using Prompt Synthetic Gradients to Alleviate Prompt Forgetting of RNN-Like Language Models”, 2023
- “The Impact of Depth and Width on Transformer Language Model Generalization”, 2023
- “Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time”, Liu et al 2023
- “Fast Feedforward Networks”, 2023
- “Any Deep ReLU Network Is Shallow”, 2023
- “JaxPruner: A Concise Library for Sparsity Research”, 2023
- “Reusing Deep Neural Network Models through Model Re-Engineering”, 2023
- “MUX-PLMs: Pre-Training Language Models With Data Multiplexing”, Murahari et al 2023
- “DataMUX: Data Multiplexing for Neural Networks”, Murahari et al 2022
- “Deep Differentiable Logic Gate Networks”, Petersen et al 2022
- “The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers”, Li et al 2022
- “Noise Transforms Feed-Forward Networks into Sparse Coding Networks”, 2022
- “Exploring Low Rank Training of Deep Neural Networks”, 2022
- “Monolith: Real Time Recommendation System With Collisionless Embedding Table”, 2022
- “More ConvNets in the 2020s: Scaling up Kernels Beyond 51×51 Using Sparsity (SLaK)”, Liu et al 2022
- “Building Machine Translation Systems for the Next Thousand Languages”, 2022
- “Monarch: Expressive Structured Matrices for Efficient and Accurate Training”, Dao et al 2022
- “Efficient Language Modeling With Sparse All-MLP”, 2022
- “NeuPL: Neural Population Learning”, 2022
- “Datamodels: Predicting Predictions from Training Data”, Ilyas et al 2022
- “Spiking Neural Networks and Their Applications: A Review”, 2022
- “Persia: An Open, Hybrid System Scaling Deep Learning-Based Recommenders up to 100 Trillion Parameters”, 2021
- “EvilModel: Hiding Malware Inside of Neural Network Models”, 2021
- “LoRA: Low-Rank Adaptation of Large Language Models”, Hu et al 2021
- “On the Distribution, Sparsity, and Inference-Time Quantization of Attention Values in Transformers”, 2021
- “The Neural Basis of Intelligence in Fine-Grained Cortical Topographies”, 2021
- “Clusterability in Neural Networks”, Filan et al 2021
- “Sparsity in Deep Learning: Pruning and Growth for Efficient Inference and Training in Neural Networks”, Hoefler et al 2021
- “Scaling down Deep Learning”, 2020
- “Extreme Model Compression for On-Device Natural Language Understanding”, 2020
- “Training Independent Subnetworks for Robust Prediction”, Havasi et al 2020
- “EventProp: Event-Based Backpropagation Can Compute Exact Gradients for Spiking Neural Networks”, 2020
- “On Linear Identifiability of Learned Representations”, Roeder et al 2020
- “Rethinking Parameter Counting in Deep Models: Effective Dimensionality Revisited”, Maddox et al 2020
- “Bayesian Deep Learning and a Probabilistic Perspective of Generalization”, 2020
- “Neural Arithmetic Units”, 2020
- “Linear Mode Connectivity and the Lottery Ticket Hypothesis”, Frankle et al 2019
- “Learning to Seek: Autonomous Source Seeking With Deep Reinforcement Learning Onboard a Nano Drone Microcontroller”, 2019
- “Does Learning Require Memorization? A Short Tale about a Long Tail”, 2019
- “Weight Agnostic Neural Networks”, 2019
- “StyleNAS: An Empirical Study of Neural Architecture Search to Uncover Surprisingly Fast End-To-End Universal Style Transfer Networks”, An et al 2019
- “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks”, 2019
- “Superposition of Many Models into One”, Cheung et al 2019
- “Playing Atari With Six Neurons”, Cuccu et al 2018
- “Measuring the Intrinsic Dimension of Objective Landscapes”, Li et al 2018
- “SqueezeNext: Hardware-Aware Neural Network Design”, Gholami et al 2018
- “Wide Compression: Tensor Ring Nets”, 2018
- “Intriguing Properties of Randomly Weighted Networks: Generalizing While Learning Next to Nothing”, 2018
- “Fix Your Classifier: the Marginal Value of Training the Last Weight Layer”, Hoffer et al 2018
- “Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition”, 2017
- “3D Semantic Segmentation With Submanifold Sparse Convolutional Networks”, Graham et al 2017
- “XUnit: Learning a Spatial Activation Function for Efficient Image Restoration”, 2017
- “Natural Language Processing With Small Feed-Forward Networks”, Botha et al 2017
- “ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices”, Zhang et al 2017
- “Submanifold Sparse Convolutional Networks”, 2017
- “Shake-Shake Regularization of 3-Branch Residual Networks”, 2017
- “Using the Output Embedding to Improve Language Models”, 2016
- “Deep Residual Learning for Image Recognition”, He et al 2015
- “Tensorizing Neural Networks”, Novikov et al 2015
- “Eight Pairs of Descending Visual Neurons in the Dragonfly Give Wing Motor Centers Accurate Population Vector of Prey Direction”, Gonzalez-Bellido et al 2013
- “The Cat Is out of the Bag: Cortical Simulations With 10⁹ Neurons, 10¹³ Synapses”, Ananthanarayanan et al 2009
- “On the Computational Power of Threshold Circuits With Sparse Activity”, Uchizawa et al 2006
- “Networks of Spiking Neurons: The Third Generation of Neural Network Models”, 1997
- “Characteristics of Sparsely Encoded Associative Memory”, 1989
- “Kronecker Decomposition for GPT Compression” (arXiv:2110.08152)
- “Higher Accuracy on Vision Models With EfficientNet-Lite”
- “Something Weird Is Happening With LLMs and Chess”, 2024
- “Delivering Real-Time AI in the Palm of Your Hand”
- “Sparsity-Aware Deep Learning Inference Runtime for CPUs”
- “neuralmagic/sparseml: Libraries for Applying Sparsification Recipes to Neural Networks With a Few Lines of Code, Enabling Faster and Smaller Models”
- “An Estimation of the Absolute Number of Axons Indicates That Human Cortical Areas Are Sparsely Connected”
- “Creating a 17 KB Style Transfer Model With Layer Pruning and Quantization”, 2024
- “BERT-Large: Prune Once for DistilBERT Inference Performance”
- “Circuits in Superposition: Compressing Many Small Neural Networks into One”
- “Measuring the Intrinsic Dimension of Objective Landscapes [Video]”