- See Also
Links
- “Organic Reaction Mechanism Classification Using Machine Learning”, 2023
- “A High-performance Speech Neuroprosthesis”, Et Al 2023
- “Melting Pot 2.0”, Et Al 2022
- “VeLO: Training Versatile Learned Optimizers by Scaling Up”, Et Al 2022
- “Legged Locomotion in Challenging Terrains Using Egocentric Vision”, Et Al 2022
- “Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities”, Et Al 2022
- “Semantic Scene Descriptions As an Objective of Human Vision”, Et Al 2022
- “Benchmarking Compositionality With Formal Languages”, Et Al 2022
- “PI-ARS: Accelerating Evolution-Learned Visual-Locomotion With Predictive Information Representations”, Et Al 2022
- “Spatial Representation by Ramping Activity of Neurons in the Retrohippocampal Cortex”, Et Al 2022
- “Neural Networks and the Chomsky Hierarchy”, Et Al 2022
- “BYOL-Explore: Exploration by Bootstrapped Prediction”, Et Al 2022
- “AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos”, Et Al 2022
- “Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, Et Al 2022
- “Simple Recurrence Improves Masked Language Models”, Et Al 2022
- “Sequencer: Deep LSTM for Image Classification”, 2022
- “Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers”, Et Al 2022
- “Block-Recurrent Transformers”, Et Al 2022
- “Learning by Directional Gradient Descent”, Et Al 2022
- “Retrieval-Augmented Reinforcement Learning”, Et Al 2022
- “General-purpose, Long-context Autoregressive Modeling With Perceiver AR”, Et Al 2022
- “End-to-end Algorithm Synthesis With Recurrent Networks: Logical Extrapolation Without Overthinking”, Et Al 2022
- “Data Scaling Laws in NMT: The Effect of Noise and Architecture”, Et Al 2022
- “Active Predictive Coding Networks: A Neural Solution to the Problem of Learning Reference Frames and Part-Whole Hierarchies”, 2022
- “Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild”, Et Al 2022
- “Inducing Causal Structure for Interpretable Neural Networks (IIT)”, Et Al 2021
- “Evaluating Distributional Distortion in Neural Language Modeling”, 2021
- “Gradients Are Not All You Need”, Et Al 2021
- “An Explanation of In-context Learning As Implicit Bayesian Inference”, Et Al 2021
- “Minimum Description Length Recurrent Neural Networks”, Et Al 2021
- “S4: Efficiently Modeling Long Sequences With Structured State Spaces”, Et Al 2021
- “A Connectome of The Drosophila Central Complex Reveals Network Motifs Suitable for Flexible Navigation and Context-dependent Action Selection”, Et Al 2021
- “LSSL: Combining Recurrent, Convolutional, and Continuous-time Models With Linear State-Space Layers”, Et Al 2021
- “Recurrent Model-Free RL Is a Strong Baseline for Many POMDPs”, Et Al 2021
- “Photos Are All You Need for Reciprocal Recommendation in Online Dating”, Neve & McConville 2021
- “Perceiver IO: A General Architecture for Structured Inputs & Outputs”, Et Al 2021
- “Unbiased Gradient Estimation in Unrolled Computation Graphs With Persistent Evolution Strategies”, Et Al 2021
- “Shelley: A Crowd-sourced Collaborative Horror Writer”, Et Al 2021
- “Ten Lessons From Three Generations Shaped Google’s TPUv4i”, Et Al 2021
- “RASP: Thinking Like Transformers”, Et Al 2021
- “Scaling Laws for Acoustic Models”, 2021
- “Scaling End-to-End Models for Large-Scale Multilingual ASR”, Et Al 2021
- “Efficient Transformers in Reinforcement Learning Using Actor-Learner Distillation”, 2021
- “Finetuning Pretrained Transformers into RNNs”, Et Al 2021
- “Pretrained Transformers As Universal Computation Engines”, Et Al 2021
- “Perceiver: General Perception With Iterative Attention”, Et Al 2021
- “When Attention Meets Fast Recurrence: Training SRU++ Language Models With Reduced Compute”, 2021
- “Predictive Coding Is a Consequence of Energy Efficiency in Recurrent Neural Networks”, Et Al 2021
- “Distilling Large Language Models into Tiny and Effective Students Using PQRNN”, Et Al 2021
- “Meta Learning Backpropagation And Improving It”, 2020
- “On the Binding Problem in Artificial Neural Networks”, Et Al 2020
- “A Recurrent Vision-and-Language BERT for Navigation”, Et Al 2020
- “Towards Playing Full MOBA Games With Deep Reinforcement Learning”, Et Al 2020
- “Adversarial Vulnerabilities of Human Decision-making”, Et Al 2020
- “Learning to Summarize Long Texts With Memory Compression and Transfer”, Et Al 2020
- “Human-centric Dialog Training via Offline Reinforcement Learning”, Et Al 2020
- “AFT: An Attention Free Transformer”, 2020
- “Deep Reinforcement Learning for Closed-Loop Blood Glucose Control”, Et Al 2020
- “HiPPO: Recurrent Memory With Optimal Polynomial Projections”, Et Al 2020
- “Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size”, Et Al 2020
- “Matt Botvinick on the Spontaneous Emergence of Learning Algorithms”, 2020
- “DeepSinger: Singing Voice Synthesis With Data Mined From the Web”, Et Al 2020
- “High-performance Brain-to-text Communication via Imagined Handwriting”, Et Al 2020
- “Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention”, Et Al 2020
- “The Recurrent Neural Tangent Kernel”, Et Al 2020
- “Untangling Tradeoffs between Recurrence and Self-attention in Neural Networks”, Et Al 2020
- “Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing”, Et Al 2020
- “Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models”, 2020
- “Syntactic Structure from Deep Learning”, 2020
- “Agent57: Outperforming the Human Atari Benchmark”, Et Al 2020
- “Machine Translation of Cortical Activity to Text With an Encoder-decoder Framework”, Et Al 2020
- “Learning-based Memory Allocation for C++ Server Workloads”, Et Al 2020
- “Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks”, Et Al 2020
- “Scaling Laws for Neural Language Models”, Et Al 2020
- “Placing Language in an Integrated Understanding System: Next Steps toward Human-level Performance in Neural Language Models”, Et Al 2020
- “Estimating the Deep Replicability of Scientific Findings Using Human and Artificial Intelligence”, Et Al 2020
- “Single Headed Attention RNN: Stop Thinking With Your Head”, 2019
- “Excavate”, 2019
- “MuZero: Mastering Atari, Go, Chess and Shogi by Planning With a Learned Model”, Et Al 2019
- “Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks”, Et Al 2019
- “High Fidelity Video Prediction With Large Stochastic Recurrent Neural Networks”, Et Al 2019
- “SEED RL: Scalable and Efficient Deep-RL With Accelerated Central Inference”, Et Al 2019
- “Mixed-Signal Neuromorphic Processors: Quo Vadis?”, Et Al 2019
- “R2D3: Making Efficient Use of Demonstrations to Solve Hard Exploration Problems”, Et Al 2019
- “Language Modelling State-of-the-art Leaderboards”, Paperswithcode.com 2019
- “Metalearned Neural Memory”, Et Al 2019
- “Generating Text With Recurrent Neural Networks”, Et Al 2019
- “XLNet: Generalized Autoregressive Pretraining for Language Understanding”, Et Al 2019
- “Playing the Lottery With Rewards and Multiple Languages: Lottery Tickets in RL and NLP”, Et Al 2019
- “Reinforcement Learning, Fast and Slow”, Et Al 2019
- “MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalizing Flows”, Et Al 2019
- “Meta-learners’ Learning Dynamics Are unlike Learners’”, 2019
- “Speech Synthesis from Neural Decoding of Spoken Sentences”, Et Al 2019
- “Good News, Everyone! Context Driven Entity-aware Captioning for News Images”, Et Al 2019
- “Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context”, Et Al 2019
- “High Fidelity Video Prediction With Large Stochastic Recurrent Neural Networks: Videos”, Et Al 2019
- “Bayesian Layers: A Module for Neural Network Uncertainty”, Et Al 2018
- “Meta-Learning: Learning to Learn Fast”, 2018
- “Piano Genie”, Et Al 2018
- “Learning Recurrent Binary/Ternary Weights”, Et Al 2018
- “R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning”, Et Al 2018
- “Adversarial Reprogramming of Text Classification Neural Networks”, Et Al 2018
- “This Time With Feeling: Learning Expressive Musical Performance”, Et Al 2018
- “Character-Level Language Modeling With Deeper Self-Attention”, Al-Rfou Et Al 2018
- “General Value Function Networks”, Et Al 2018
- “Universal Transformers”, Et Al 2018
- “Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme”, Et Al 2018
- “Accurate Uncertainties for Deep Learning Using Calibrated Regression”, Et Al 2018
- “Neural Ordinary Differential Equations”, Et Al 2018
- “Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data”, Et Al 2018
- “Hierarchical Neural Story Generation”, Et Al 2018
- “Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context”, Et Al 2018
- “A Tree Search Algorithm for Sequence Labeling”, Et Al 2018
- “An Analysis of Neural Language Modeling at Multiple Scales”, Et Al 2018
- “Reviving and Improving Recurrent Back-Propagation”, Et Al 2018
- “Learning Memory Access Patterns”, Et Al 2018
- “Learning Longer-term Dependencies in RNNs With Auxiliary Losses”, Et Al 2018
- “One Big Net For Everything”, 2018
- “Efficient Neural Audio Synthesis”, Et Al 2018
- “Deep Contextualized Word Representations”, Et Al 2018
- “M-Walk: Learning to Walk over Graphs Using Monte Carlo Tree Search”, Et Al 2018
- “ULMFiT: Universal Language Model Fine-tuning for Text Classification”, 2018
- “Large-scale Comparison of Machine Learning Methods for Drug Target Prediction on ChEMBL”, Et Al 2018
- “A Flexible Approach to Automated RNN Architecture Generation”, Et Al 2017
- “Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition”, Et Al 2017
- “Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent”, Et Al 2017
- “Evaluating Prose Style Transfer With the Bible”, Et Al 2017
- “Breaking the Softmax Bottleneck: A High-Rank RNN Language Model”, Et Al 2017
- “Neural Speed Reading via Skim-RNN”, Et Al 2017
- “Generalization without Systematicity: On the Compositional Skills of Sequence-to-sequence Recurrent Networks”, 2017
- “Mixed Precision Training”, Et Al 2017
- “To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression”, 2017
- “N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning”, Et Al 2017
- “Why Pay More When You Can Pay Less: A Joint Learning Framework for Active Feature Acquisition and Classification”, Et Al 2017
- “SRU: Simple Recurrent Units for Highly Parallelizable Recurrence”, Et Al 2017
- “Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks”, 2017
- “Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks”, Et Al 2017
- “Twin Networks: Matching the Future for Sequence Generation”, Et Al 2017
- “Revisiting Activation Regularization for Language RNNs”, Et Al 2017
- “Bayesian Sparsification of Recurrent Neural Networks”, Et Al 2017
- “On the State of the Art of Evaluation in Neural Language Models”, Et Al 2017
- “Controlling Linguistic Style Aspects in Neural Language Generation”, 2017
- “Device Placement Optimization With Reinforcement Learning”, Et Al 2017
- “Language Generation With Recurrent Generative Adversarial Networks without Pre-training”, Et Al 2017
- “Biased Importance Sampling for Deep Neural Network Training”, 2017
- “Deriving Neural Architectures from Sequence and Graph Kernels”, Et Al 2017
- “A Deep Reinforced Model for Abstractive Summarization”, Et Al 2017
- “A Neural Network System for Transformation of Regional Cuisine Style”, Et Al 2017
- “Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU”, 2017
- “Adversarial Neural Machine Translation”, Et Al 2017
- “Learning to Reason: End-to-End Module Networks for Visual Question Answering”, Et Al 2017
- “Exploring Sparsity in Recurrent Neural Networks”, Et Al 2017
- “DeepAR: Probabilistic Forecasting With Autoregressive Recurrent Networks”, Et Al 2017
- “Recurrent Environment Simulators”, Et Al 2017
- “Learning to Generate Reviews and Discovering Sentiment”, Et Al 2017
- “I2T2I: Learning Text to Image Synthesis With Textual Data Augmentation”, Et Al 2017
- “Improving Neural Machine Translation With Conditional Sequence Generative Adversarial Nets”, Et Al 2017
- “Learned Optimizers That Scale and Generalize”, Et Al 2017
- “Parallel Multiscale Autoregressive Density Estimation”, Et Al 2017
- “Tracking the World State With Recurrent Entity Networks”, Et Al 2017
- “Optimization As a Model for Few-Shot Learning”, 2017
- “Neural Combinatorial Optimization With Reinforcement Learning”, Et Al 2017
- “Tuning Recurrent Neural Networks With Reinforcement Learning”, Et Al 2017
- “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, Et Al 2017
- “Neural Data Filter for Bootstrapping Stochastic Gradient Descent”, Et Al 2017
- “SampleRNN: An Unconditional End-to-End Neural Audio Generation Model”, Et Al 2016
- “Improving Neural Language Models With a Continuous Cache”, Et Al 2016
- “Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation”, Et Al 2016
- “Learning to Learn without Gradient Descent by Gradient Descent”, Et Al 2016
- “RL2: Fast Reinforcement Learning via Slow Reinforcement Learning”, Et Al 2016
- “DeepCoder: Learning to Write Programs”, Et Al 2016
- “Bidirectional Attention Flow for Machine Comprehension”, Et Al 2016
- “Neural Architecture Search With Reinforcement Learning”, 2016
- “QRNNs: Quasi-Recurrent Neural Networks”, Et Al 2016
- “Hybrid Computing Using a Neural Network With Dynamic External Memory”, Et Al 2016
- “Using Fast Weights to Attend to the Recent Past”, Et Al 2016
- “Achieving Human Parity in Conversational Speech Recognition”, Et Al 2016
- “HyperNetworks”, Et Al 2016
- “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”, Et Al 2016
- “Pointer Sentinel Mixture Models”, Et Al 2016
- “Deep Learning Human Mind for Automated Visual Classification”, Et Al 2016
- “Decoupled Neural Interfaces Using Synthetic Gradients”, Et Al 2016
- “Full Resolution Image Compression With Recurrent Neural Networks”, Et Al 2016
- “LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks”, Et Al 2016
- “Learning to Learn by Gradient Descent by Gradient Descent”, Et Al 2016
- “Adventures in Narrated Reality, Part II”, Goodwin 2016
- “Iterative Alternating Neural Attention for Machine Reading”, Et Al 2016
- “Deep Reinforcement Learning for Dialogue Generation”, Et Al 2016
- “Programming With a Differentiable Forth Interpreter”, Et Al 2016
- “Training Deep Nets With Sublinear Memory Cost”, Et Al 2016
- “Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex”, 2016
- “Improving Sentence Compression by Learning to Predict Gaze”, Et Al 2016
- “Adaptive Computation Time for Recurrent Neural Networks”, 2016
- “Adventures in Narrated Reality”, Goodwin 2016
- “Dynamic Memory Networks for Visual and Textual Question Answering”, Et Al 2016
- “PlaNet—Photo Geolocation With Convolutional Neural Networks”, Et Al 2016
- “Learning Distributed Representations of Sentences from Unlabeled Data”, Et Al 2016
- “Exploring the Limits of Language Modeling”, Et Al 2016
- “Pixel Recurrent Neural Networks”, Et Al 2016
- “Persistent RNNs: Stashing Recurrent Weights On-Chip”, Et Al 2016
- “Deep-Spying: Spying Using Smartwatch and Deep Learning”, 2015
- “On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models”, 2015
- “Neural GPUs Learn Algorithms”, 2015
- “Sequence Level Training With Recurrent Neural Networks”, Et Al 2015
- “Generating Sentences from a Continuous Space”, Et Al 2015
- “Generative Concatenative Nets Jointly Learn to Write and Classify Reviews”, Et Al 2015
- “Generating Images from Captions With Attention”, Et Al 2015
- “Semi-supervised Sequence Learning”, 2015
- “RNN Metadata for Mimicking Author Style”, 2015
- “Deep Recurrent Q-Learning for Partially Observable MDPs”, 2015
- “Scheduled Sampling for Sequence Prediction With Recurrent Neural Networks”, Et Al 2015
- “Visualizing and Understanding Recurrent Networks”, Et Al 2015
- “The Unreasonable Effectiveness of Recurrent Neural Networks”, 2015
- “Deep Neural Networks for Large Vocabulary Handwritten Text Recognition”, 2015
- “Reinforcement Learning Neural Turing Machines—Revised”, 2015
- “End-To-End Memory Networks”, Et Al 2015
- “Inferring Algorithmic Patterns With Stack-Augmented Recurrent Nets”, Et Al 2015
- “DRAW: A Recurrent Neural Network For Image Generation”, Et Al 2015
- “Neural Turing Machines”, Et Al 2014
- “Learning to Execute”, 2014
- “Neural Machine Translation by Jointly Learning to Align and Translate”, Et Al 2014
- “Distributed Representations of Sentences and Documents”, 2014
- “One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling”, Et Al 2013
- “Generating Sequences With Recurrent Neural Networks”, 2013
- “Large Language Models in Machine Translation”, Et Al 2007
- “Learning to Learn Using Gradient Descent”, Et Al 2001
- “Long Short-Term Memory”, 1997
- “Flat Minima”, 1997
- “Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity”, 1995
- “A Focused Backpropagation Algorithm for Temporal Pattern Recognition”, 1995
- “Learning Complex, Extended Sequences Using the Principle of History Compression”, 1992
- “Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks”, 1992
- “Untersuchungen Zu Dynamischen Neuronalen Netzen [Studies of Dynamic Neural Networks]”, 1991
- “Finding Structure In Time”, 1990
- “Connectionist Music Composition Based on Melodic, Stylistic, and Psychophysical Constraints [Technical Report CU-CS–495–90]”, 1990
- “A Learning Algorithm for Continually Running Fully Recurrent Neural Networks”, 1989b
- “A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks”, 1989
- “Experimental Analysis of the Real-time Recurrent Learning Algorithm”, 1989
- “A Sticky-Bit Approach for Learning to Represent State”, 1988
- “The Utility Driven Dynamic Error Propagation Network (RTRL)”, 1987
- “Serial Order: A Parallel Distributed Processing Approach”, 1986
- “Attention and Augmented Recurrent Neural Networks”
- “Deep Learning for Assisting the Process of Music Composition (part 3)”
- Wikipedia
- Miscellaneous
- Link Bibliography
See Also
Links
“Organic Reaction Mechanism Classification Using Machine Learning”, 2023
“Organic reaction mechanism classification using machine learning”, 2023-01-25 ( ; similar; bibliography)
“A High-performance Speech Neuroprosthesis”, Et Al 2023
“A high-performance speech neuroprosthesis”, 2023-01-21 ( ; similar)
“Melting Pot 2.0”, Et Al 2022
“Melting Pot 2.0”, 2022-11-24 ( ; similar)
“VeLO: Training Versatile Learned Optimizers by Scaling Up”, Et Al 2022
“VeLO: Training Versatile Learned Optimizers by Scaling Up”, 2022-11-17 ( ; similar)
“Legged Locomotion in Challenging Terrains Using Egocentric Vision”, Et Al 2022
“Legged Locomotion in Challenging Terrains using Egocentric Vision”, 2022-11-14 ( ; similar; bibliography)
“Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities”, Et Al 2022
“Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities”, 2022-11-10 ( ; similar)
“Semantic Scene Descriptions As an Objective of Human Vision”, Et Al 2022
“Semantic scene descriptions as an objective of human vision”, 2022-09-23 ( ; similar; bibliography)
“Benchmarking Compositionality With Formal Languages”, Et Al 2022
“Benchmarking Compositionality with Formal Languages”, 2022-08-17 ( ; similar)
“PI-ARS: Accelerating Evolution-Learned Visual-Locomotion With Predictive Information Representations”, Et Al 2022
“PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations”, 2022-07-27 ( ; similar)
“Spatial Representation by Ramping Activity of Neurons in the Retrohippocampal Cortex”, Et Al 2022
“Spatial representation by ramping activity of neurons in the retrohippocampal cortex”, 2022-07-26 ( ; similar)
“Neural Networks and the Chomsky Hierarchy”, Et Al 2022
“Neural Networks and the Chomsky Hierarchy”, 2022-07-05 ( ; similar)
“BYOL-Explore: Exploration by Bootstrapped Prediction”, Et Al 2022
“BYOL-Explore: Exploration by Bootstrapped Prediction”, 2022-06-16 ( ; similar)
“AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos”, Et Al 2022
“AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos”, 2022-06-14 ( ; similar)
“Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, Et Al 2022
“Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, 2022-05-28 ( ; similar)
“Simple Recurrence Improves Masked Language Models”, Et Al 2022
“Simple Recurrence Improves Masked Language Models”, 2022-05-23 ( ; similar)
“Sequencer: Deep LSTM for Image Classification”, 2022
“Sequencer: Deep LSTM for Image Classification”, 2022-05-04 (similar; bibliography)
“Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers”, Et Al 2022
“Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers”, 2022-04-22 ( ; similar)
“Block-Recurrent Transformers”, Et Al 2022
“Block-Recurrent Transformers”, 2022-03-11 ( ; backlinks; similar; bibliography)
“Learning by Directional Gradient Descent”, Et Al 2022
“Learning by Directional Gradient Descent”, 2022-02-17 (similar)
“Retrieval-Augmented Reinforcement Learning”, Et Al 2022
“Retrieval-Augmented Reinforcement Learning”, 2022-02-17 ( ; similar)
“General-purpose, Long-context Autoregressive Modeling With Perceiver AR”, Et Al 2022
“General-purpose, long-context autoregressive modeling with Perceiver AR”, 2022-02-15 ( ; similar; bibliography)
“End-to-end Algorithm Synthesis With Recurrent Networks: Logical Extrapolation Without Overthinking”, Et Al 2022
“End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking”, 2022-02-11 (similar)
“Data Scaling Laws in NMT: The Effect of Noise and Architecture”, Et Al 2022
“Data Scaling Laws in NMT: The Effect of Noise and Architecture”, 2022-02-04 ( ; similar)
“Active Predictive Coding Networks: A Neural Solution to the Problem of Learning Reference Frames and Part-Whole Hierarchies”, 2022
“Active Predictive Coding Networks: A Neural Solution to the Problem of Learning Reference Frames and Part-Whole Hierarchies”, 2022-01-21 ( ; similar)
“Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild”, Et Al 2022
“Learning robust perceptive locomotion for quadrupedal robots in the wild”, 2022-01-19 ( ; backlinks; similar; bibliography)
“Inducing Causal Structure for Interpretable Neural Networks (IIT)”, Et Al 2021
“Inducing Causal Structure for Interpretable Neural Networks (IIT)”, 2021-12-01 ( ; similar)
“Evaluating Distributional Distortion in Neural Language Modeling”, 2021
“Evaluating Distributional Distortion in Neural Language Modeling”, 2021-11-16 ( ; similar)
“Gradients Are Not All You Need”, Et Al 2021
“Gradients are Not All You Need”, 2021-11-10 ( ; similar)
“An Explanation of In-context Learning As Implicit Bayesian Inference”, Et Al 2021
“An Explanation of In-context Learning as Implicit Bayesian Inference”, 2021-11-03 ( ; backlinks; similar)
“Minimum Description Length Recurrent Neural Networks”, Et Al 2021
“Minimum Description Length Recurrent Neural Networks”, 2021-10-31 ( ; similar)
“S4: Efficiently Modeling Long Sequences With Structured State Spaces”, Et Al 2021
“S4: Efficiently Modeling Long Sequences with Structured State Spaces”, 2021-10-31 ( ; backlinks; similar; bibliography)
“A Connectome of The Drosophila Central Complex Reveals Network Motifs Suitable for Flexible Navigation and Context-dependent Action Selection”, Et Al 2021
“A connectome of the Drosophila central complex reveals network motifs suitable for flexible navigation and context-dependent action selection”, 2021-10-26 ( ; backlinks; similar; bibliography)
“LSSL: Combining Recurrent, Convolutional, and Continuous-time Models With Linear State-Space Layers”, Et Al 2021
“LSSL: Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers”, 2021-10-26 ( ; backlinks; similar)
“Recurrent Model-Free RL Is a Strong Baseline for Many POMDPs”, Et Al 2021
“Recurrent Model-Free RL is a Strong Baseline for Many POMDPs”, 2021-10-11 ( ; backlinks; similar)
“Photos Are All You Need for Reciprocal Recommendation in Online Dating”, Neve & McConville 2021
“Photos Are All You Need for Reciprocal Recommendation in Online Dating”, 2021-08-26 ( ; similar)
“Perceiver IO: A General Architecture for Structured Inputs & Outputs”, Et Al 2021
“Perceiver IO: A General Architecture for Structured Inputs & Outputs”, 2021-07-30 ( ; similar; bibliography)
“Unbiased Gradient Estimation in Unrolled Computation Graphs With Persistent Evolution Strategies”, Et Al 2021
“Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies”, 2021-07-01 ( ; similar; bibliography)
“Shelley: A Crowd-sourced Collaborative Horror Writer”, Et Al 2021
“Shelley: A Crowd-sourced Collaborative Horror Writer”, 2021-06-15 ( ; similar; bibliography)
“Ten Lessons From Three Generations Shaped Google’s TPUv4i”, Et Al 2021
“Ten Lessons From Three Generations Shaped Google’s TPUv4i”, 2021-06-14 ( ; similar; bibliography)
“RASP: Thinking Like Transformers”, Et Al 2021
“RASP: Thinking Like Transformers”, 2021-06-13 ( ; backlinks; similar; bibliography)
“Scaling Laws for Acoustic Models”, 2021
“Scaling Laws for Acoustic Models”, 2021-06-11 ( ; similar; bibliography)
“Scaling End-to-End Models for Large-Scale Multilingual ASR”, Et Al 2021
“Scaling End-to-End Models for Large-Scale Multilingual ASR”, 2021-04-30 ( ; similar)
“Efficient Transformers in Reinforcement Learning Using Actor-Learner Distillation”, 2021
“Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation”, 2021-04-04 ( ; backlinks; similar)
“Finetuning Pretrained Transformers into RNNs”, Et Al 2021
“Finetuning Pretrained Transformers into RNNs”, 2021-03-24 ( ; backlinks; similar)
“Pretrained Transformers As Universal Computation Engines”, Et Al 2021
“Pretrained Transformers as Universal Computation Engines”, 2021-03-09 ( ; backlinks; similar)
“Perceiver: General Perception With Iterative Attention”, Et Al 2021
“Perceiver: General Perception with Iterative Attention”, 2021-03-04 ( ; similar; bibliography)
“When Attention Meets Fast Recurrence: Training SRU++ Language Models With Reduced Compute”, 2021
“When Attention Meets Fast Recurrence: Training SRU++ Language Models with Reduced Compute”, 2021-02-24 ( ; backlinks; similar)
“Predictive Coding Is a Consequence of Energy Efficiency in Recurrent Neural Networks”, Et Al 2021
“Predictive coding is a consequence of energy efficiency in recurrent neural networks”, 2021-02-16 ( ; similar)
“Distilling Large Language Models into Tiny and Effective Students Using PQRNN”, Et Al 2021
“Distilling Large Language Models into Tiny and Effective Students using pQRNN”, 2021-01-21 ( ; similar)
“Meta Learning Backpropagation And Improving It”, 2020
“Meta Learning Backpropagation And Improving It”, 2020-12-29 ( ; similar)
“On the Binding Problem in Artificial Neural Networks”, Et Al 2020
“On the Binding Problem in Artificial Neural Networks”, 2020-12-09 ( ; similar)
“A Recurrent Vision-and-Language BERT for Navigation”, Et Al 2020
“A Recurrent Vision-and-Language BERT for Navigation”, 2020-11-26 ( ; similar)
“Towards Playing Full MOBA Games With Deep Reinforcement Learning”, Et Al 2020
“Towards Playing Full MOBA Games with Deep Reinforcement Learning”, 2020-11-25 ( ; similar; bibliography)
“Adversarial Vulnerabilities of Human Decision-making”, Et Al 2020
“Adversarial vulnerabilities of human decision-making”, 2020-11-04 ( ; similar)
“Learning to Summarize Long Texts With Memory Compression and Transfer”, Et Al 2020
“Learning to Summarize Long Texts with Memory Compression and Transfer”, 2020-10-21 ( ; similar)
“Human-centric Dialog Training via Offline Reinforcement Learning”, Et Al 2020
“Human-centric Dialog Training via Offline Reinforcement Learning”, 2020-10-12 ( ; similar)
“AFT: An Attention Free Transformer”, 2020
“AFT: An Attention Free Transformer”, 2020-09-28 ( ; similar)
“Deep Reinforcement Learning for Closed-Loop Blood Glucose Control”, Et Al 2020
“Deep Reinforcement Learning for Closed-Loop Blood Glucose Control”, 2020-09-18 ( ; backlinks; similar)
“HiPPO: Recurrent Memory With Optimal Polynomial Projections”, Et Al 2020
“HiPPO: Recurrent Memory with Optimal Polynomial Projections”, 2020-08-17 ( ; backlinks; similar; bibliography)
“Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size”, Et Al 2020
“Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size”, 2020-08-16 ( ; backlinks; similar)
“Matt Botvinick on the Spontaneous Emergence of Learning Algorithms”, 2020
“Matt Botvinick on the spontaneous emergence of learning algorithms”, 2020-08-12 ( ; backlinks; similar; bibliography)
“DeepSinger: Singing Voice Synthesis With Data Mined From the Web”, Et Al 2020
“DeepSinger: Singing Voice Synthesis with Data Mined From the Web”, 2020-07-09 ( ; similar)
“High-performance Brain-to-text Communication via Imagined Handwriting”, Et Al 2020
“High-performance brain-to-text communication via imagined handwriting”, 2020-07-02 ( ; similar)
“Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention”, Et Al 2020
“Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention”, 2020-06-29 ( ; backlinks; similar)
“The Recurrent Neural Tangent Kernel”, Et Al 2020
“The Recurrent Neural Tangent Kernel”, 2020-06-18 (similar)
“Untangling Tradeoffs between Recurrence and Self-attention in Neural Networks”, Et Al 2020
“Untangling tradeoffs between recurrence and self-attention in neural networks”, 2020-06-16 ( ; backlinks; similar)
“Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing”, Et Al 2020
“Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing”, 2020-06-05 ( ; backlinks; similar)
“Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models”, 2020
“Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models”, 2020-04-30 ( ; similar)
“Syntactic Structure from Deep Learning”, 2020
“Syntactic Structure from Deep Learning”, 2020-04-22 ( ; backlinks; similar)
“Agent57: Outperforming the Human Atari Benchmark”, Et Al 2020
“Agent57: Outperforming the human Atari benchmark”, 2020-03-31 ( ; backlinks; similar; bibliography)
“Machine Translation of Cortical Activity to Text With an Encoder-decoder Framework”, Et Al 2020
“Learning-based Memory Allocation for C++ Server Workloads”, Et Al 2020
“Learning-based Memory Allocation for C++ Server Workloads”, 2020-03-16 ( ; similar)
“Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks”, Et Al 2020
“Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks”, 2020-02-05 ( ; backlinks; similar)
“Scaling Laws for Neural Language Models”, Et Al 2020
“Scaling Laws for Neural Language Models”, 2020-01-23 ( ; similar; bibliography)
“Placing Language in an Integrated Understanding System: Next Steps toward Human-level Performance in Neural Language Models”, Et Al 2020
“Placing language in an integrated understanding system: Next steps toward human-level performance in neural language models”, 2020 ( ; backlinks; similar)
“Estimating the Deep Replicability of Scientific Findings Using Human and Artificial Intelligence”, Et Al 2020
“Estimating the deep replicability of scientific findings using human and artificial intelligence”, 2020 ( ; backlinks; similar)
“Single Headed Attention RNN: Stop Thinking With Your Head”, 2019
“Single Headed Attention RNN: Stop Thinking With Your Head”, 2019-11-26 ( ; similar)
“Excavate”, 2019
“Excavate”, 2019-11-22 ( ; backlinks; similar)
“MuZero: Mastering Atari, Go, Chess and Shogi by Planning With a Learned Model”, Et Al 2019
“MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model”, 2019-11-19 ( ; similar)
“Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks”, Et Al 2019
“Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks”, 2019-11-05 (backlinks; similar; bibliography)
“High Fidelity Video Prediction With Large Stochastic Recurrent Neural Networks”, Et Al 2019
“High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks”, 2019-11-05 ( ; similar)
“SEED RL: Scalable and Efficient Deep-RL With Accelerated Central Inference”, Et Al 2019
“SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference”, 2019-10-15 ( ; similar; bibliography)
“Mixed-Signal Neuromorphic Processors: Quo Vadis?”, Et Al 2019
“Mixed-Signal Neuromorphic Processors: Quo vadis?”, 2019-10-14 ( ; similar)
“R2D3: Making Efficient Use of Demonstrations to Solve Hard Exploration Problems”, Et Al 2019
“R2D3: Making Efficient Use of Demonstrations to Solve Hard Exploration Problems”, 2019-09-03 ( ; similar)
“Language Modelling State-of-the-art Leaderboards”, Paperswithcode.com 2019
“Language Modelling State-of-the-art leaderboards”, 2019-08-28 ( ; backlinks)
“Metalearned Neural Memory”, Et Al 2019
“Metalearned Neural Memory”, 2019-07-23 ( ; similar)
“Generating Text With Recurrent Neural Networks”, Et Al 2019
“Generating Text with Recurrent Neural Networks”, 2019-07-16 ( ; similar)
“XLNet: Generalized Autoregressive Pretraining for Language Understanding”, Et Al 2019
“XLNet: Generalized Autoregressive Pretraining for Language Understanding”, 2019-06-19 ( ; backlinks; similar)
“Playing the Lottery With Rewards and Multiple Languages: Lottery Tickets in RL and NLP”, Et Al 2019
“Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP”, 2019-06-06 ( ; similar)
“Reinforcement Learning, Fast and Slow”, Et Al 2019
“Reinforcement Learning, Fast and Slow”, 2019-05-16 ( ; similar)
“MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalizing Flows”, Et Al 2019
“MoGlow: Probabilistic and controllable motion synthesis using normalizing flows”, 2019-05-16 ( ; backlinks; similar)
“Meta-learners’ Learning Dynamics Are unlike Learners’”, 2019
“Meta-learners’ learning dynamics are unlike learners’”, 2019-05-03 ( ; similar; bibliography)
“Speech Synthesis from Neural Decoding of Spoken Sentences”, Et Al 2019
“Speech synthesis from neural decoding of spoken sentences”, 2019-04-24 ( ; similar)
“Good News, Everyone! Context Driven Entity-aware Captioning for News Images”, Et Al 2019
“Good News, Everyone! Context driven entity-aware captioning for news images”, 2019-04-02 ( ; backlinks; similar)
“Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context”, Et Al 2019
“Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context”, 2019-01-09 ( ; backlinks; similar)
“High Fidelity Video Prediction With Large Stochastic Recurrent Neural Networks: Videos”, Et Al 2019
“High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks: Videos”, 2019 ( )
“Bayesian Layers: A Module for Neural Network Uncertainty”, Et Al 2018
“Bayesian Layers: A Module for Neural Network Uncertainty”, 2018-12-10 ( ; similar)
“Meta-Learning: Learning to Learn Fast”, 2018
“Meta-Learning: Learning to Learn Fast”, 2018-11-30 ( ; similar)
“Piano Genie”, Et Al 2018
“Piano Genie”, 2018-10-11 ( ; similar)
“Learning Recurrent Binary/Ternary Weights”, Et Al 2018
“Learning Recurrent Binary/Ternary Weights”, 2018-09-28 ( ; similar)
“R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning”, Et Al 2018
“R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning”, 2018-09-27 ( ; similar; bibliography)
“Adversarial Reprogramming of Text Classification Neural Networks”, Et Al 2018
“Adversarial Reprogramming of Text Classification Neural Networks”, 2018-09-06 ( ; backlinks; similar)
“This Time With Feeling: Learning Expressive Musical Performance”, Et Al 2018
“This Time with Feeling: Learning Expressive Musical Performance”, 2018-08-10 ( ; similar)
“Character-Level Language Modeling With Deeper Self-Attention”, Al-Rfou Et Al 2018
“Character-Level Language Modeling with Deeper Self-Attention”, 2018-08-09 ( ; backlinks; similar)
“General Value Function Networks”, Et Al 2018
“General Value Function Networks”, 2018-07-18 (similar)
“Universal Transformers”, Et Al 2018
“Universal Transformers”, 2018-07-10 ( ; similar)
“Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme”, Et Al 2018
“Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme”, 2018-07-10 ( ; backlinks; similar)
“Accurate Uncertainties for Deep Learning Using Calibrated Regression”, Et Al 2018
“Accurate Uncertainties for Deep Learning Using Calibrated Regression”, 2018-07-01 ( ; similar)
“Neural Ordinary Differential Equations”, Et Al 2018
“Neural Ordinary Differential Equations”, 2018-06-19 (backlinks; similar)
“Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data”, Et Al 2018
“Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data”, 2018-05-31 (backlinks; similar)
“Hierarchical Neural Story Generation”, Et Al 2018
“Hierarchical Neural Story Generation”, 2018-05-13 ( ; backlinks; similar)
“Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context”, Et Al 2018
“Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context”, 2018-05-12 (backlinks; similar)
“A Tree Search Algorithm for Sequence Labeling”, Et Al 2018
“A Tree Search Algorithm for Sequence Labeling”, 2018-04-29 ( ; similar)
“An Analysis of Neural Language Modeling at Multiple Scales”, Et Al 2018
“An Analysis of Neural Language Modeling at Multiple Scales”, 2018-03-22 (similar)
“Reviving and Improving Recurrent Back-Propagation”, Et Al 2018
“Reviving and Improving Recurrent Back-Propagation”, 2018-03-16 ( ; similar)
“Learning Memory Access Patterns”, Et Al 2018
“Learning Memory Access Patterns”, 2018-03-06 ( ; backlinks; similar)
“Learning Longer-term Dependencies in RNNs With Auxiliary Losses”, Et Al 2018
“Learning Longer-term Dependencies in RNNs with Auxiliary Losses”, 2018-03-01 ( ; similar)
“One Big Net For Everything”, 2018
“One Big Net For Everything”, 2018-02-24 ( ; similar)
“Efficient Neural Audio Synthesis”, Et Al 2018
“Efficient Neural Audio Synthesis”, 2018-02-23 ( ; similar)
“Deep Contextualized Word Representations”, Et Al 2018
“Deep contextualized word representations”, 2018-02-15 (similar)
“M-Walk: Learning to Walk over Graphs Using Monte Carlo Tree Search”, Et Al 2018
“M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search”, 2018-02-12 ( ; similar)
“ULMFiT: Universal Language Model Fine-tuning for Text Classification”, 2018
“ULMFiT: Universal Language Model Fine-tuning for Text Classification”, 2018-01-18 (similar)
“Large-scale Comparison of Machine Learning Methods for Drug Target Prediction on ChEMBL”, Et Al 2018
“Large-scale comparison of machine learning methods for drug target prediction on ChEMBL”, 2018 ( ; similar)
“A Flexible Approach to Automated RNN Architecture Generation”, Et Al 2017
“A Flexible Approach to Automated RNN Architecture Generation”, 2017-12-20 ( ; backlinks; similar)
“Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition”, Et Al 2017
“Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition”, 2017-12-14 ( ; similar)
“Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent”, Et Al 2017
“Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent”, 2017-11-21 ( ; similar)
“Evaluating Prose Style Transfer With the Bible”, Et Al 2017
“Evaluating prose style transfer with the Bible”, 2017-11-13 ( ; backlinks; similar)
“Breaking the Softmax Bottleneck: A High-Rank RNN Language Model”, Et Al 2017
“Breaking the Softmax Bottleneck: A High-Rank RNN Language Model”, 2017-11-10 (backlinks; similar)
“Neural Speed Reading via Skim-RNN”, Et Al 2017
“Neural Speed Reading via Skim-RNN”, 2017-11-06 (backlinks; similar)
“Generalization without Systematicity: On the Compositional Skills of Sequence-to-sequence Recurrent Networks”, 2017
“Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks”, 2017-10-31 (backlinks; similar)
“Mixed Precision Training”, Et Al 2017
“Mixed Precision Training”, 2017-10-10 ( ; similar)
“To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression”, 2017
“To prune, or not to prune: exploring the efficacy of pruning for model compression”, 2017-10-05 ( ; similar)
“N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning”, Et Al 2017
“N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning”, 2017-09-18 ( ; backlinks; similar)
“Why Pay More When You Can Pay Less: A Joint Learning Framework for Active Feature Acquisition and Classification”, Et Al 2017
“Why Pay More When You Can Pay Less: A Joint Learning Framework for Active Feature Acquisition and Classification”, 2017-09-18 ( ; backlinks; similar)
“SRU: Simple Recurrent Units for Highly Parallelizable Recurrence”, Et Al 2017
“SRU: Simple Recurrent Units for Highly Parallelizable Recurrence”, 2017-09-08 (backlinks; similar; bibliography)
“Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks”, 2017
“Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks”, 2017-09-01 ( ; similar)
“Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks”, Et Al 2017
“Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks”, 2017-08-22 (backlinks; similar)
“Twin Networks: Matching the Future for Sequence Generation”, Et Al 2017
“Twin Networks: Matching the Future for Sequence Generation”, 2017-08-22 (similar)
“Revisiting Activation Regularization for Language RNNs”, Et Al 2017
“Revisiting Activation Regularization for Language RNNs”, 2017-08-03 (similar)
“Bayesian Sparsification of Recurrent Neural Networks”, Et Al 2017
“Bayesian Sparsification of Recurrent Neural Networks”, 2017-07-31 ( ; similar)
“On the State of the Art of Evaluation in Neural Language Models”, Et Al 2017
“On the State of the Art of Evaluation in Neural Language Models”, 2017-07-18 (similar)
“Controlling Linguistic Style Aspects in Neural Language Generation”, 2017
“Controlling Linguistic Style Aspects in Neural Language Generation”, 2017-07-09 ( ; backlinks; similar)
“Device Placement Optimization With Reinforcement Learning”, Et Al 2017
“Device Placement Optimization with Reinforcement Learning”, 2017-06-13 ( ; similar)
“Language Generation With Recurrent Generative Adversarial Networks without Pre-training”, Et Al 2017
“Language Generation with Recurrent Generative Adversarial Networks without Pre-training”, 2017-06-05 ( ; backlinks; similar)
“Biased Importance Sampling for Deep Neural Network Training”, 2017
“Biased Importance Sampling for Deep Neural Network Training”, 2017-05-31 ( ; backlinks; similar)
“Deriving Neural Architectures from Sequence and Graph Kernels”, Et Al 2017
“Deriving Neural Architectures from Sequence and Graph Kernels”, 2017-05-25 (backlinks; similar)
“A Deep Reinforced Model for Abstractive Summarization”, Et Al 2017
“A Deep Reinforced Model for Abstractive Summarization”, 2017-05-11 ( ; backlinks; similar)
“A Neural Network System for Transformation of Regional Cuisine Style”, Et Al 2017
“A neural network system for transformation of regional cuisine style”, 2017-05-06 ( ; similar)
“Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU”, 2017
“Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU”, 2017-05-04 ( ; backlinks; similar)
“Adversarial Neural Machine Translation”, Et Al 2017
“Adversarial Neural Machine Translation”, 2017-04-20 ( ; backlinks; similar)
“Learning to Reason: End-to-End Module Networks for Visual Question Answering”, Et Al 2017
“Learning to Reason: End-to-End Module Networks for Visual Question Answering”, 2017-04-18 (backlinks; similar; bibliography)
“Exploring Sparsity in Recurrent Neural Networks”, Et Al 2017
“Exploring Sparsity in Recurrent Neural Networks”, 2017-04-17 ( ; similar)
“DeepAR: Probabilistic Forecasting With Autoregressive Recurrent Networks”, Et Al 2017
“DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks”, 2017-04-13 ( ; backlinks; similar)
“Recurrent Environment Simulators”, Et Al 2017
“Recurrent Environment Simulators”, 2017-04-07 ( ; similar)
“Learning to Generate Reviews and Discovering Sentiment”, Et Al 2017
“Learning to Generate Reviews and Discovering Sentiment”, 2017-04-05 ( ; similar)
“I2T2I: Learning Text to Image Synthesis With Textual Data Augmentation”, Et Al 2017
“I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation”, 2017-03-20 ( ; backlinks; similar)
“Improving Neural Machine Translation With Conditional Sequence Generative Adversarial Nets”, Et Al 2017
“Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets”, 2017-03-15 ( ; backlinks; similar)
“Learned Optimizers That Scale and Generalize”, Et Al 2017
“Learned Optimizers that Scale and Generalize”, 2017-03-14 ( ; backlinks; similar)
“Parallel Multiscale Autoregressive Density Estimation”, Et Al 2017
“Parallel Multiscale Autoregressive Density Estimation”, 2017-03-10 ( ; similar)
“Tracking the World State With Recurrent Entity Networks”, Et Al 2017
“Tracking the World State with Recurrent Entity Networks”, 2017-03-03 (similar)
“Optimization As a Model for Few-Shot Learning”, 2017
“Optimization as a Model for Few-Shot Learning”, 2017-03-01 ( ; similar)
“Neural Combinatorial Optimization With Reinforcement Learning”, Et Al 2017
“Neural Combinatorial Optimization with Reinforcement Learning”, 2017-02-17 ( ; backlinks; similar)
“Tuning Recurrent Neural Networks With Reinforcement Learning”, Et Al 2017
“Tuning Recurrent Neural Networks with Reinforcement Learning”, 2017-02-14 ( ; backlinks; similar)
“Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, Et Al 2017
“Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, 2017-01-23 ( ; similar)
“Neural Data Filter for Bootstrapping Stochastic Gradient Descent”, Et Al 2017
“Neural Data Filter for Bootstrapping Stochastic Gradient Descent”, 2017-01-20 ( ; backlinks; similar)
“SampleRNN: An Unconditional End-to-End Neural Audio Generation Model”, Et Al 2016
“SampleRNN: An Unconditional End-to-End Neural Audio Generation Model”, 2016-12-22 ( ; backlinks; similar)
“Improving Neural Language Models With a Continuous Cache”, Et Al 2016
“Improving Neural Language Models with a Continuous Cache”, 2016-12-13 ( ; similar)
“Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation”, Et Al 2016
“Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation”, 2016-11-14 ( ; backlinks; similar)
“Learning to Learn without Gradient Descent by Gradient Descent”, Et Al 2016
“Learning to Learn without Gradient Descent by Gradient Descent”, 2016-11-11 ( ; similar)
“RL2: Fast Reinforcement Learning via Slow Reinforcement Learning”, Et Al 2016
“RL2: Fast Reinforcement Learning via Slow Reinforcement Learning”, 2016-11-09 ( ; similar)
“DeepCoder: Learning to Write Programs”, Et Al 2016
“DeepCoder: Learning to Write Programs”, 2016-11-07 ( ; similar)
“Bidirectional Attention Flow for Machine Comprehension”, Et Al 2016
“Bidirectional Attention Flow for Machine Comprehension”, 2016-11-05 (backlinks; similar)
“Neural Architecture Search With Reinforcement Learning”, 2016
“Neural Architecture Search with Reinforcement Learning”, 2016-11-05 ( ; similar)
“QRNNs: Quasi-Recurrent Neural Networks”, Et Al 2016
“QRNNs: Quasi-Recurrent Neural Networks”, 2016-11-05 ( ; similar)
“Hybrid Computing Using a Neural Network With Dynamic External Memory”, Et Al 2016
“Hybrid computing using a neural network with dynamic external memory”, 2016-10-27 ( ; similar)
“Using Fast Weights to Attend to the Recent Past”, Et Al 2016
“Using Fast Weights to Attend to the Recent Past”, 2016-10-20 ( ; similar)
“Achieving Human Parity in Conversational Speech Recognition”, Et Al 2016
“Achieving Human Parity in Conversational Speech Recognition”, 2016-10-17 (similar)
“HyperNetworks”, Et Al 2016
“HyperNetworks”, 2016-09-27 ( ; similar)
“Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”, Et Al 2016
“Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”, 2016-09-26 ( ; similar)
“Pointer Sentinel Mixture Models”, Et Al 2016
“Pointer Sentinel Mixture Models”, 2016-09-26 ( ; backlinks; similar)
“Deep Learning Human Mind for Automated Visual Classification”, Et Al 2016
“Deep Learning Human Mind for Automated Visual Classification”, 2016-09-01 ( ; similar)
“Decoupled Neural Interfaces Using Synthetic Gradients”, Et Al 2016
“Decoupled Neural Interfaces using Synthetic Gradients”, 2016-08-18 ( ; similar)
“Full Resolution Image Compression With Recurrent Neural Networks”, Et Al 2016
“Full Resolution Image Compression with Recurrent Neural Networks”, 2016-08-18 ( ; similar)
“LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks”, Et Al 2016
“LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks”, 2016-06-23 ( ; similar)
“Learning to Learn by Gradient Descent by Gradient Descent”, Et Al 2016
“Learning to learn by gradient descent by gradient descent”, 2016-06-14 ( ; backlinks; similar)
“Adventures in Narrated Reality, Part II”, Goodwin 2016 (https://medium.com/artists-and-machine-intelligence/adventures-in-narrated-reality-part-ii-dc585af054cb)
“Iterative Alternating Neural Attention for Machine Reading”, Et Al 2016
“Iterative Alternating Neural Attention for Machine Reading”, 2016-06-07 (backlinks; similar)
“Deep Reinforcement Learning for Dialogue Generation”, Et Al 2016
“Deep Reinforcement Learning for Dialogue Generation”, 2016-06-05 ( ; backlinks; similar)
“Programming With a Differentiable Forth Interpreter”, Et Al 2016
“Programming with a Differentiable Forth Interpreter”, 2016-05-21 ( ; similar)
“Training Deep Nets With Sublinear Memory Cost”, Et Al 2016
“Training Deep Nets with Sublinear Memory Cost”, 2016-04-21 ( ; backlinks; similar)
“Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex”, 2016
“Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex”, 2016-04-13 ( ; similar)
“Improving Sentence Compression by Learning to Predict Gaze”, Et Al 2016
“Improving sentence compression by learning to predict gaze”, 2016-04-12 ( )
“Adaptive Computation Time for Recurrent Neural Networks”, 2016
“Adaptive Computation Time for Recurrent Neural Networks”, 2016-03-29 ( ; backlinks; similar)
“Adventures in Narrated Reality”, Goodwin 2016 (https://medium.com/artists-and-machine-intelligence/adventures-in-narrated-reality-6516ff395ba3)
“Dynamic Memory Networks for Visual and Textual Question Answering”, Et Al 2016
“Dynamic Memory Networks for Visual and Textual Question Answering”, 2016-03-04 (similar)
“PlaNet—Photo Geolocation With Convolutional Neural Networks”, Et Al 2016
“PlaNet—Photo Geolocation with Convolutional Neural Networks”, 2016-02-17 ( ; similar)
“Learning Distributed Representations of Sentences from Unlabeled Data”, Et Al 2016
“Learning Distributed Representations of Sentences from Unlabeled Data”, 2016-02-10 (similar)
“Exploring the Limits of Language Modeling”, Et Al 2016
“Exploring the Limits of Language Modeling”, 2016-02-07 ( ; similar)
“Pixel Recurrent Neural Networks”, Et Al 2016
“Pixel Recurrent Neural Networks”, 2016-01-25 ( ; similar)
“Persistent RNNs: Stashing Recurrent Weights On-Chip”, Et Al 2016
“Persistent RNNs: Stashing Recurrent Weights On-Chip”, 2016-01 ( ; similar)
“Deep-Spying: Spying Using Smartwatch and Deep Learning”, 2015
“Deep-Spying: Spying using Smartwatch and Deep Learning”, 2015-12-17 ( ; backlinks; similar)
“On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models”, 2015
“On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models”, 2015-11-30 ( ; similar)
“Neural GPUs Learn Algorithms”, 2015
“Neural GPUs Learn Algorithms”, 2015-11-25 (similar)
“Sequence Level Training With Recurrent Neural Networks”, Et Al 2015
“Sequence Level Training with Recurrent Neural Networks”, 2015-11-20 ( ; similar)
“Generating Sentences from a Continuous Space”, Et Al 2015
“Generating Sentences from a Continuous Space”, 2015-11-19 (similar)
“Generative Concatenative Nets Jointly Learn to Write and Classify Reviews”, Et Al 2015
“Generative Concatenative Nets Jointly Learn to Write and Classify Reviews”, 2015-11-11 ( ; backlinks; similar)
“Generating Images from Captions With Attention”, Et Al 2015
“Generating Images from Captions with Attention”, 2015-11-09 (backlinks; similar)
“Semi-supervised Sequence Learning”, 2015
“Semi-supervised Sequence Learning”, 2015-11-04 ( ; backlinks; similar)
“RNN Metadata for Mimicking Author Style”, 2015
“RNN Metadata for Mimicking Author Style”, 2015-09-12 ( ; backlinks; similar; bibliography)
“Deep Recurrent Q-Learning for Partially Observable MDPs”, 2015
“Deep Recurrent Q-Learning for Partially Observable MDPs”, 2015-07-23 ( ; similar)
“Scheduled Sampling for Sequence Prediction With Recurrent Neural Networks”, Et Al 2015
“Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks”, 2015-06-09 ( ; similar)
“Visualizing and Understanding Recurrent Networks”, Et Al 2015
“Visualizing and Understanding Recurrent Networks”, 2015-06-05 ( ; similar)
“The Unreasonable Effectiveness of Recurrent Neural Networks”, 2015
“The Unreasonable Effectiveness of Recurrent Neural Networks”, 2015-05-21 ( ; backlinks; similar)
“Deep Neural Networks for Large Vocabulary Handwritten Text Recognition”, 2015
“Deep Neural Networks for Large Vocabulary Handwritten Text Recognition”, 2015-05-13 ( ; backlinks; similar)
“Reinforcement Learning Neural Turing Machines—Revised”, 2015
“Reinforcement Learning Neural Turing Machines—Revised”, 2015-05-04 ( ; backlinks; similar)
“End-To-End Memory Networks”, Et Al 2015
“End-To-End Memory Networks”, 2015-03-31 (similar)
“Inferring Algorithmic Patterns With Stack-Augmented Recurrent Nets”, Et Al 2015
“Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets”, 2015-03-03 ( ; similar)
“DRAW: A Recurrent Neural Network For Image Generation”, Et Al 2015
“DRAW: A Recurrent Neural Network For Image Generation”, 2015-02-16 ( ; backlinks; similar)
“Neural Turing Machines”, Et Al 2014
“Neural Turing Machines”, 2014-10-20 ( )
“Learning to Execute”, 2014
“Learning to Execute”, 2014-10-17 ( ; similar)
“Neural Machine Translation by Jointly Learning to Align and Translate”, Et Al 2014
“Neural Machine Translation by Jointly Learning to Align and Translate”, 2014-09-01 ( ; backlinks; similar)
“Distributed Representations of Sentences and Documents”, 2014
“Distributed Representations of Sentences and Documents”, 2014-05-16 (similar)
“One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling”, Et Al 2013
“One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling”, 2013-12-11 ( ; backlinks; similar)
“Generating Sequences With Recurrent Neural Networks”, 2013
“Generating Sequences With Recurrent Neural Networks”, 2013-08-04 (backlinks)
“Large Language Models in Machine Translation”, Et Al 2007
“Large Language Models in Machine Translation”, 2007-06 ( ; similar)
“Learning to Learn Using Gradient Descent”, Et Al 2001
“Learning to Learn Using Gradient Descent”, 2001-08-17 ( ; backlinks; similar)
“Long Short-Term Memory”, 1997
“Long Short-Term Memory”, 1997-12-15 (similar)
“Flat Minima”, 1997
“Flat Minima”, 1997 ( ; similar)
“Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity”, 1995
“Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity”, 1995 (backlinks; similar)
“A Focused Backpropagation Algorithm for Temporal Pattern Recognition”, 1995
“A Focused Backpropagation Algorithm for Temporal Pattern Recognition”, 1995 (backlinks; similar)
“Learning Complex, Extended Sequences Using the Principle of History Compression”, 1992
“Learning Complex, Extended Sequences Using the Principle of History Compression”, 1992 ( ; similar)
“Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks”, 1992
“Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks”, 1992 ( ; backlinks; similar)
“Untersuchungen Zu Dynamischen Neuronalen Netzen [Studies of Dynamic Neural Networks]”, 1991
“Untersuchungen zu dynamischen neuronalen Netzen [Studies of dynamic neural networks]”, 1991-06-15 (backlinks; similar; bibliography)
“Finding Structure In Time”, 1990
“Finding Structure In Time”, 1990-04-01 (backlinks; similar)
“Connectionist Music Composition Based on Melodic, Stylistic, and Psychophysical Constraints [Technical Report CU-CS–495–90]”, 1990
“Connectionist Music Composition Based on Melodic, Stylistic, and Psychophysical Constraints [Technical report CU-CS–495–90]”, 1990 ( ; backlinks; similar)
“A Learning Algorithm for Continually Running Fully Recurrent Neural Networks”, 1989b
“A Learning Algorithm for Continually Running Fully Recurrent Neural Networks”, 1989-06-01 (backlinks; similar)
“A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks”, 1989
“A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks”, 1989 (backlinks; similar)
“Experimental Analysis of the Real-time Recurrent Learning Algorithm”, 1989
“Experimental Analysis of the Real-time Recurrent Learning Algorithm”, 1989 (similar; bibliography)
“A Sticky-Bit Approach for Learning to Represent State”, 1988
“A Sticky-Bit Approach for Learning to Represent State”, 1988-09-06 (backlinks)
“The Utility Driven Dynamic Error Propagation Network (RTRL)”, 1987
“The Utility Driven Dynamic Error Propagation Network (RTRL)”, 1987-11-04 (backlinks; similar)
“Serial Order: A Parallel Distributed Processing Approach”, 1986
“Serial Order: A Parallel Distributed Processing Approach”, 1986-05 (backlinks; similar)
“Attention and Augmented Recurrent Neural Networks”
“Deep Learning for Assisting the Process of Music Composition (part 3)”
Wikipedia
Miscellaneous
Link Bibliography
- 2023-bures.pdf: “Organic Reaction Mechanism Classification Using Machine Learning”, Jordi Burés, Igor Larrosa
- https://arxiv.org/abs/2211.07638: “Legged Locomotion in Challenging Terrains Using Egocentric Vision”, Ananye Agarwal, Ashish Kumar, Jitendra Malik, Deepak Pathak
- https://arxiv.org/abs/2209.11737: “Semantic Scene Descriptions As an Objective of Human Vision”, Adrien Doerig, Tim C. Kietzmann, Emily Allen, Yihan Wu, Thomas Naselaris, Kendrick Kay, Ian Charest
- https://arxiv.org/abs/2205.01972: “Sequencer: Deep LSTM for Image Classification”, Yuki Tatsunami, Masato Taki
- https://arxiv.org/abs/2203.07852: “Block-Recurrent Transformers”, DeLesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- https://arxiv.org/abs/2202.07765#deepmind: “General-purpose, Long-context Autoregressive Modeling With Perceiver AR”
- 2022-miki.pdf: “Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild”, Takahiro Miki, Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, Marco Hutter
- https://arxiv.org/abs/2111.00396: “S4: Efficiently Modeling Long Sequences With Structured State Spaces”, Albert Gu, Karan Goel, Christopher Ré
- https://elifesciences.org/articles/66039: “A Connectome of the Drosophila Central Complex Reveals Network Motifs Suitable for Flexible Navigation and Context-dependent Action Selection”
- https://arxiv.org/abs/2107.14795#deepmind: “Perceiver IO: A General Architecture for Structured Inputs & Outputs”
- https://proceedings.mlr.press/v139/vicol21a.html: “Unbiased Gradient Estimation in Unrolled Computation Graphs With Persistent Evolution Strategies”, Paul Vicol, Luke Metz, Jascha Sohl-Dickstein
- 2021-delul.pdf: “Shelley: A Crowd-sourced Collaborative Horror Writer”, Pinar Yanardag Delul, Manuel Cebrian, Iyad Rahwan
- 2021-jouppi.pdf: “Ten Lessons From Three Generations Shaped Google’s TPUv4i”
- https://arxiv.org/abs/2106.06981: “RASP: Thinking Like Transformers”, Gail Weiss, Yoav Goldberg, Eran Yahav
- https://arxiv.org/abs/2106.09488#amazon: “Scaling Laws for Acoustic Models”, Jasha Droppo, Oguz Elibol
- https://arxiv.org/abs/2103.03206#deepmind: “Perceiver: General Perception With Iterative Attention”, Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, Joao Carreira
- https://arxiv.org/abs/2011.12692#tencent: “Towards Playing Full MOBA Games With Deep Reinforcement Learning”
- https://arxiv.org/abs/2008.07669: “HiPPO: Recurrent Memory With Optimal Polynomial Projections”, Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, Christopher Re
- https://www.lesswrong.com/posts/Wnqua6eQkewL3bqsF/matt-botvinick-on-the-spontaneous-emergence-of-learning: “Matt Botvinick on the Spontaneous Emergence of Learning Algorithms”, Adam Scholl
- https://www.deepmind.com/blog/agent57-outperforming-the-human-atari-benchmark: “Agent57: Outperforming the Human Atari Benchmark”, Adrià Puigdomènech, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Charles Blundell
- https://arxiv.org/abs/2001.08361#openai: “Scaling Laws for Neural Language Models”
- https://openreview.net/forum?id=HyxlRHBlUB: “Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks”, Aaron R. Voelker, Ivana Kajić, Chris Eliasmith
- https://arxiv.org/abs/1910.06591#deepmind: “SEED RL: Scalable and Efficient Deep-RL With Accelerated Central Inference”, Lasse Espeholt, Raphaël Marinier, Piotr Stanczyk, Ke Wang, Marcin Michalski
- https://arxiv.org/abs/1905.01320#deepmind: “Meta-learners’ Learning Dynamics Are unlike Learners’”, Neil C. Rabinowitz
- https://openreview.net/forum?id=r1lyTjAqYX#deepmind: “R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning”, Steven Kapturowski, Georg Ostrovski, John Quan, Remi Munos, Will Dabney
- https://arxiv.org/abs/1709.02755: “SRU: Simple Recurrent Units for Highly Parallelizable Recurrence”, Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, Yoav Artzi
- https://arxiv.org/abs/1704.05526: “Learning to Reason: End-to-End Module Networks for Visual Question Answering”, Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Kate Saenko
- rnn-metadata: “RNN Metadata for Mimicking Author Style”, Gwern Branwen
- 1991-hochreiter.pdf: “Untersuchungen zu dynamischen neuronalen Netzen [Studies of Dynamic Neural Networks]”, Sepp Hochreiter
- 1989-williams.pdf: “Experimental Analysis of the Real-time Recurrent Learning Algorithm”, Ronald J. Williams, David Zipser