GSoC 2024: Differentiable Logic for Interactive Systems and Generative Music
CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models
Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
LTE: Training Neural Networks from Scratch with Parallel Low-Rank Adapters
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
DiLoCo: Distributed Low-Communication Training of Language Models
Language Models are Super Mario (DARE): Absorbing Abilities from Homologous Models as a Free Lunch
ProSG: Using Prompt Synthetic Gradients to Alleviate Prompt Forgetting of RNN-like Language Models
The Impact of Depth and Width on Transformer Language Model Generalization
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Reusing Deep Neural Network Models through Model Re-engineering
MUX-PLMs: Pre-training Language Models with Data Multiplexing
The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers
Noise Transforms Feed-Forward Networks into Sparse Coding Networks
Monolith: Real Time Recommendation System With Collisionless Embedding Table
More ConvNets in the 2020s: Scaling up Kernels Beyond 51×51 using Sparsity (SLaK)
Building Machine Translation Systems for the Next Thousand Languages
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters
On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers
The neural basis of intelligence in fine-grained cortical topographies
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Extreme Model Compression for On-device Natural Language Understanding
EventProp: Event-Based Backpropagation can compute Exact Gradients for Spiking Neural Networks
Rethinking Parameter Counting in Deep Models: Effective Dimensionality Revisited
Bayesian Deep Learning and a Probabilistic Perspective of Generalization
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller
Does Learning Require Memorization? A Short Tale about a Long Tail
StyleNAS: An Empirical Study of Neural Architecture Search to Uncover Surprisingly Fast End-to-End Universal Style Transfer Networks
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Intriguing Properties of Randomly Weighted Networks: Generalizing while Learning Next to Nothing
Fix your classifier: the marginal value of training the last weight layer
Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
3D Semantic Segmentation with Submanifold Sparse Convolutional Networks
xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
Natural Language Processing with Small Feed-Forward Networks
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
Eight pairs of descending visual neurons in the dragonfly give wing motor centers accurate population vector of prey direction
The cat is out of the bag: cortical simulations with 10⁹ neurons, 10¹³ synapses
On the Computational Power of Threshold Circuits with Sparse Activity
Networks of spiking neurons: The third generation of neural network models
Neuralmagic/sparseml: Libraries for Applying Sparsification Recipes to Neural Networks With a Few Lines of Code, Enabling Faster and Smaller Models
An Estimation of the Absolute Number of Axons Indicates That Human Cortical Areas Are Sparsely Connected
Creating a 17 KB Style Transfer Model With Layer Pruning and Quantization
BERT-Large: Prune Once for DistilBERT Inference Performance
Circuits in Superposition: Compressing Many Small Neural Networks into One
Measuring the Intrinsic Dimension of Objective Landscapes [Video]
Bapna et al. 2022, Figure 2: Google Translate neural machine translation scaling by language corpus size
https://ai.facebook.com/blog/a-highly-efficient-real-time-text-to-speech-system-deployed-on-cpus/
https://blog.roblox.com/2020/05/scaled-bert-serve-1-billion-daily-requests-cpus/
https://cprimozic.net/blog/growing-sparse-computational-graphs-with-rnns/
https://old.reddit.com/r/slatestarcodex/comments/1201v68/10word_quote_a_short_and_simple_failure_mode_of/jdjsx43/
https://research.google/blog/an-all-neural-on-device-speech-recognizer/
https://research.google/blog/auto-generated-summaries-in-google-docs/
https://research.google/blog/custom-on-device-ml-models-with-learn2compress/
https://research.google/blog/efficient-sequence-modeling-for-on-device-ml/
https://research.google/blog/grammar-correction-as-you-type-on-pixel-6/
https://research.google/blog/training-machine-learning-models-more-efficiently-with-dataset-distillation/
https://tech.pic-collage.com/distillation-of-clip-model-and-other-experiments-f8394b7321ce
https://www.lesswrong.com/posts/7fxusXdkMNmAhkAfc/finding-sparse-linear-connections-between-features-in-llms
https://www.lesswrong.com/posts/qmQFHCgCyEEjuy5a7/lora-fine-tuning-efficiently-undoes-safety-training-from
https://www.quantamagazine.org/sparse-neural-networks-point-physicists-to-useful-data-20230608/
https://www.reddit.com/r/LocalLLaMA/comments/18luk10/wait_llama_and_falcon_are_also_moe/
https://arxiv.org/abs/2202.07415#deepmind
https://arxiv.org/abs/2106.09685#microsoft
https://greydanus.github.io/2020/12/01/scaling-down/
Wikipedia Bibliography: