“‘RNN’ Tag”, 2019-08-30:
Bibliography for tag ai/nn/rnn, most recent first: 5 related tags, 368 annotations, & 40 links (parent).
- See Also
- Gwern
- Links
- “Hymba: A Hybrid-Head Architecture for Small Language Models”, et al 2024
- “State-Space Models Can Learn In-Context by Gradient Descent”, et al 2024
- “Were RNNs All We Needed?”, et al 2024
- “The Mamba in the Llama: Distilling and Accelerating Hybrid Models”, et al 2024
- “handwriter.ttf: Handwriting Synthesis With Harfbuzz WASM”, 2024
- “Learning to (Learn at Test Time): RNNs With Expressive Hidden States”, et al 2024
- “An Empirical Study of Mamba-Based Language Models”, et al 2024
- “State Soup: In-Context Skill Learning, Retrieval and Mixing”, et al 2024
- “Grokfast: Accelerated Grokking by Amplifying Slow Gradients”, et al 2024
- “Attention As an RNN”, et al 2024
- “XLSTM: Extended Long Short-Term Memory”, et al 2024
- “Megalodon: Efficient LLM Pretraining and Inference With Unlimited Context Length”, et al 2024
- “The Illusion of State in State-Space Models”, et al 2024
- “An Accurate and Rapidly Calibrating Speech Neuroprosthesis”, et al 2024
- “Does Transformer Interpretability Transfer to RNNs?”, et al 2024
- “Mechanistic Design and Scaling of Hybrid Architectures”, et al 2024
- “GLE: Backpropagation through Space, Time, and the Brain”, et al 2024
- “ZigMa: Zigzag Mamba Diffusion Model”, et al 2024
- “RNNs Are Not Transformers (Yet): The Key Bottleneck on In-Context Retrieval”, et al 2024
- “MambaByte: Token-Free Selective State Space Model”, et al 2024
- “MoE-Mamba: Efficient Selective State Space Models With Mixture of Experts”, et al 2024
- “Evolving Reservoirs for Meta Reinforcement Learning”, et al 2023
- “Zoology: Measuring and Improving Recall in Efficient Language Models”, et al 2023
- “Mamba: Linear-Time Sequence Modeling With Selective State Spaces”, 2023
- “Diffusion Models Without Attention”, et al 2023
- “Learning Few-Shot Imitation As Cultural Transmission”, et al 2023
- “Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks”, et al 2023
- “HGRN: Hierarchically Gated Recurrent Neural Network for Sequence Modeling”, et al 2023
- “On Prefrontal Working Memory and Hippocampal Episodic Memory: Unifying Memories Stored in Weights and Activation Slots”, et al 2023
- “GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling”, 2023
- “ProSG: Using Prompt Synthetic Gradients to Alleviate Prompt Forgetting of RNN-Like Language Models”, et al 2023
- “Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study With Linear Models”, et al 2023
- “Generalization in Sensorimotor Networks Configured With Natural Language Instructions”, 2023
- “Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors”, et al 2023
- “Parallelizing Non-Linear Sequential Models over the Sequence Length”, et al 2023
- “A High-Performance Neuroprosthesis for Speech Decoding and Avatar Control”, et al 2023
- “Learning to Model the World With Language”, et al 2023
- “Retentive Network: A Successor to Transformer for Large Language Models”, et al 2023
- “Using Sequences of Life-Events to Predict Human Lives”, et al 2023
- “Thought Cloning: Learning to Think While Acting by Imitating Human Thinking”, 2023
- “RWKV: Reinventing RNNs for the Transformer Era”, et al 2023
- “Emergence of Belief-Like Representations through Reinforcement Learning”, et al 2023
- “Model Scale versus Domain Knowledge in Statistical Forecasting of Chaotic Systems”, 2023
- “Resurrecting Recurrent Neural Networks for Long Sequences”, et al 2023
- “SpikeGPT: Generative Pre-Trained Language Model With Spiking Neural Networks”, et al 2023
- “Organic Reaction Mechanism Classification Using Machine Learning”, 2023
- “A High-Performance Speech Neuroprosthesis”, et al 2023
- “Hungry Hungry Hippos: Towards Language Modeling With State Space Models”, et al 2022
- “Pretraining Without Attention”, et al 2022
- “A 64-Core Mixed-Signal In-Memory Compute Chip Based on Phase-Change Memory for Deep Neural Network Inference”, et al 2022
- “Melting Pot 2.0”, et al 2022
- “VeLO: Training Versatile Learned Optimizers by Scaling Up”, et al 2022
- “Legged Locomotion in Challenging Terrains Using Egocentric Vision”, et al 2022
- “Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities”, et al 2022
- “Perfectly Secure Steganography Using Minimum Entropy Coupling”, et al 2022
- “Transformers Learn Shortcuts to Automata”, et al 2022
- “Omnigrok: Grokking Beyond Algorithmic Data”, et al 2022
- “Semantic Scene Descriptions As an Objective of Human Vision”, et al 2022
- “Benchmarking Compositionality With Formal Languages”, et al 2022
- “Learning to Generalize With Object-Centric Agents in the Open World Survival Game Crafter”, et al 2022
- “PI-ARS: Accelerating Evolution-Learned Visual-Locomotion With Predictive Information Representations”, et al 2022
- “Spatial Representation by Ramping Activity of Neurons in the Retrohippocampal Cortex”, et al 2022
- “Neural Networks and the Chomsky Hierarchy”, et al 2022
- “BYOL-Explore: Exploration by Bootstrapped Prediction”, et al 2022
- “AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos”, et al 2022
- “Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, et al 2022
- “Simple Recurrence Improves Masked Language Models”, et al 2022
- “Sequencer: Deep LSTM for Image Classification”, 2022
- “Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers”, et al 2022
- “Semantic Projection Recovers Rich Human Knowledge of Multiple Object Features from Word Embeddings”, et al 2022
- “Block-Recurrent Transformers”, et al 2022
- “All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL”, et al 2022
- “Retrieval-Augmented Reinforcement Learning”, et al 2022
- “Learning by Directional Gradient Descent”, et al 2022
- “General-Purpose, Long-Context Autoregressive Modeling With Perceiver AR”, et al 2022
- “End-To-End Algorithm Synthesis With Recurrent Networks: Logical Extrapolation Without Overthinking”, et al 2022
- “Data Scaling Laws in NMT: The Effect of Noise and Architecture”, et al 2022
- “Active Predictive Coding Networks: A Neural Solution to the Problem of Learning Reference Frames and Part-Whole Hierarchies”, 2022
- “Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild”, et al 2022
- “Inducing Causal Structure for Interpretable Neural Networks (IIT)”, et al 2021
- “Evaluating Distributional Distortion in Neural Language Modeling”, 2021
- “Gradients Are Not All You Need”, et al 2021
- “An Explanation of In-Context Learning As Implicit Bayesian Inference”, et al 2021
- “S4: Efficiently Modeling Long Sequences With Structured State Spaces”, et al 2021
- “Minimum Description Length Recurrent Neural Networks”, et al 2021
- “LSSL: Combining Recurrent, Convolutional, and Continuous-Time Models With Linear State-Space Layers”, et al 2021
- “A Connectome of the Drosophila Central Complex Reveals Network Motifs Suitable for Flexible Navigation and Context-Dependent Action Selection”, et al 2021
- “Recurrent Model-Free RL Is a Strong Baseline for Many POMDPs”, et al 2021
- “Photos Are All You Need for Reciprocal Recommendation in Online Dating”, Neve & McConville 2021
- “Perceiver IO: A General Architecture for Structured Inputs & Outputs”, et al 2021
- “PES: Unbiased Gradient Estimation in Unrolled Computation Graphs With Persistent Evolution Strategies”, et al 2021
- “Shelley: A Crowd-Sourced Collaborative Horror Writer”, et al 2021
- “Ten Lessons From Three Generations Shaped Google’s TPUv4i”, et al 2021
- “RASP: Thinking Like Transformers”, et al 2021
- “Scaling Laws for Acoustic Models”, 2021
- “Scaling End-To-End Models for Large-Scale Multilingual ASR”, et al 2021
- “Sensitivity As a Complexity Measure for Sequence Classification Tasks”, et al 2021
- “ALD: Efficient Transformers in Reinforcement Learning Using Actor-Learner Distillation”, 2021
- “Finetuning Pretrained Transformers into RNNs”, et al 2021
- “Pretrained Transformers As Universal Computation Engines”, et al 2021
- “Perceiver: General Perception With Iterative Attention”, et al 2021
- “When Attention Meets Fast Recurrence: Training SRU++ Language Models With Reduced Compute”, 2021
- “Generative Speech Coding With Predictive Variance Regularization”, et al 2021
- “Predictive Coding Is a Consequence of Energy Efficiency in Recurrent Neural Networks”, et al 2021
- “Deep Residual Learning in Spiking Neural Networks”, et al 2021
- “Distilling Large Language Models into Tiny and Effective Students Using PQRNN”, et al 2021
- “Meta Learning Backpropagation And Improving It”, 2020
- “On the Binding Problem in Artificial Neural Networks”, et al 2020
- “A Recurrent Vision-And-Language BERT for Navigation”, et al 2020
- “Towards Playing Full MOBA Games With Deep Reinforcement Learning”, et al 2020
- “Multimodal Dynamics Modeling for Off-Road Autonomous Vehicles”, et al 2020
- “Adversarial Vulnerabilities of Human Decision-Making”, et al 2020
- “Learning to Summarize Long Texts With Memory Compression and Transfer”, et al 2020
- “Human-Centric Dialog Training via Offline Reinforcement Learning”, et al 2020
- “AFT: An Attention Free Transformer”, 2020
- “Deep Reinforcement Learning for Closed-Loop Blood Glucose Control”, et al 2020
- “HiPPO: Recurrent Memory With Optimal Polynomial Projections”, et al 2020
- “Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size”, et al 2020
- “Matt Botvinick on the Spontaneous Emergence of Learning Algorithms”, 2020
- “Cultural Influences on Word Meanings Revealed through Large-Scale Semantic Alignment”, et al 2020
- “DeepSinger: Singing Voice Synthesis With Data Mined From the Web”, et al 2020
- “High-Performance Brain-To-Text Communication via Imagined Handwriting”, et al 2020
- “Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention”, et al 2020
- “The Recurrent Neural Tangent Kernel”, et al 2020
- “Untangling Tradeoffs between Recurrence and Self-Attention in Neural Networks”, et al 2020
- “Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing”, et al 2020
- “Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models”, 2020
- “Syntactic Structure from Deep Learning”, 2020
- “Agent57: Outperforming the Human Atari Benchmark”, et al 2020
- “Machine Translation of Cortical Activity to Text With an Encoder-Decoder Framework”, et al 2020
- “Learning-Based Memory Allocation for C++ Server Workloads”, et al 2020
- “Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving”, et al 2020
- “Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks”, et al 2020
- “Scaling Laws for Neural Language Models”, et al 2020
- “Estimating the Deep Replicability of Scientific Findings Using Human and Artificial Intelligence”, et al 2020
- “Placing Language in an Integrated Understanding System: Next Steps toward Human-Level Performance in Neural Language Models”, et al 2020
- “Measuring Compositional Generalization: A Comprehensive Method on Realistic Data”, et al 2019
- “SimpleBooks: Long-Term Dependency Book Dataset With Simplified English Vocabulary for Word-Level Language Modeling”, 2019
- “Single Headed Attention RNN: Stop Thinking With Your Head”, 2019
- “Excavate”, 2019
- “MuZero: Mastering Atari, Go, Chess and Shogi by Planning With a Learned Model”, et al 2019
- “CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning”, et al 2019
- “High Fidelity Video Prediction With Large Stochastic Recurrent Neural Networks”, et al 2019
- “Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks”, et al 2019
- “SEED RL: Scalable and Efficient Deep-RL With Accelerated Central Inference”, et al 2019
- “Mixed-Signal Neuromorphic Processors: Quo Vadis?”, et al 2019
- “Restoring Ancient Text Using Deep Learning (Pythia): a Case Study on Greek Epigraphy”, et al 2019
- “Mogrifier LSTM”, et al 2019
- “R2D3: Making Efficient Use of Demonstrations to Solve Hard Exploration Problems”, et al 2019
- “Language Modeling State-Of-The-Art Leaderboards”, paperswithcode.com 2019
- “Metalearned Neural Memory”, et al 2019
- “Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank”, et al 2019
- “Generating Text With Recurrent Neural Networks”, et al 2019
- “XLNet: Generalized Autoregressive Pretraining for Language Understanding”, et al 2019
- “Playing the Lottery With Rewards and Multiple Languages: Lottery Tickets in RL and NLP”, et al 2019
- “MoGlow: Probabilistic and Controllable Motion Synthesis Using Normalizing Flows”, et al 2019
- “Reinforcement Learning, Fast and Slow”, et al 2019
- “Meta-Learners’ Learning Dynamics Are unlike Learners’”, 2019
- “Speech Synthesis from Neural Decoding of Spoken Sentences”, et al 2019
- “Good News, Everyone! Context Driven Entity-Aware Captioning for News Images”, et al 2019
- “Surrogate Gradient Learning in Spiking Neural Networks”, et al 2019
- “On the Turing Completeness of Modern Neural Network Architectures”, et al 2019
- “Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context”, et al 2019
- “Natural Questions: A Benchmark for Question Answering Research”, et al 2019
- “High Fidelity Video Prediction With Large Stochastic Recurrent Neural Networks: Videos”, et al 2019
- “Bayesian Layers: A Module for Neural Network Uncertainty”, et al 2018
- “Meta-Learning: Learning to Learn Fast”, 2018
- “Piano Genie”, et al 2018
- “Learning Recurrent Binary/Ternary Weights”, et al 2018
- “R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning”, et al 2018
- “HotpotQA: A Dataset for Diverse, Explainable Multi-Hop Question Answering”, et al 2018
- “Adversarial Reprogramming of Text Classification Neural Networks”, et al 2018
- “Object Hallucination in Image Captioning”, et al 2018
- “This Time With Feeling: Learning Expressive Musical Performance”, et al 2018
- “Character-Level Language Modeling With Deeper Self-Attention”, Al-Rfou et al 2018
- “General Value Function Networks”, et al 2018
- “Deep-Speare: A Joint Neural Model of Poetic Language, Meter and Rhyme”, et al 2018
- “Universal Transformers”, et al 2018
- “Accurate Uncertainties for Deep Learning Using Calibrated Regression”, et al 2018
- “The Natural Language Decathlon: Multitask Learning As Question Answering”, et al 2018
- “Neural Ordinary Differential Equations”, et al 2018
- “Know What You Don’t Know: Unanswerable Questions for SQuAD”, et al 2018
- “DVRL: Deep Variational Reinforcement Learning for POMDPs”, et al 2018
- “Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data”, et al 2018
- “Hierarchical Neural Story Generation”, et al 2018
- “Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context”, et al 2018
- “Newsroom: A Dataset of 1.3 Million Summaries With Diverse Extractive Strategies”, et al 2018
- “A Tree Search Algorithm for Sequence Labeling”, et al 2018
- “An Analysis of Neural Language Modeling at Multiple Scales”, et al 2018
- “Reviving and Improving Recurrent Back-Propagation”, et al 2018
- “Learning Memory Access Patterns”, et al 2018
- “Learning Longer-Term Dependencies in RNNs With Auxiliary Losses”, et al 2018
- “One Big Net For Everything”, 2018
- “Efficient Neural Audio Synthesis”, et al 2018
- “Deep Contextualized Word Representations”, et al 2018
- “M-Walk: Learning to Walk over Graphs Using Monte Carlo Tree Search”, et al 2018
- “Overcoming the Vanishing Gradient Problem in Plain Recurrent Networks”, et al 2018
- “ULMFiT: Universal Language Model Fine-Tuning for Text Classification”, 2018
- “Large-Scale Comparison of Machine Learning Methods for Drug Target Prediction on ChEMBL”, et al 2018
- “A Flexible Approach to Automated RNN Architecture Generation”, et al 2017
- “The NarrativeQA Reading Comprehension Challenge”, et al 2017
- “Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition”, et al 2017
- “Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent”, et al 2017
- “Evaluating Prose Style Transfer With the Bible”, et al 2017
- “Breaking the Softmax Bottleneck: A High-Rank RNN Language Model”, et al 2017
- “Neural Speed Reading via Skim-RNN”, et al 2017
- “Unsupervised Machine Translation Using Monolingual Corpora Only”, et al 2017
- “Generalization without Systematicity: On the Compositional Skills of Sequence-To-Sequence Recurrent Networks”, 2017
- “Mixed Precision Training”, et al 2017
- “To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression”, 2017
- “Dynamic Evaluation of Neural Sequence Models”, et al 2017
- “Online Learning of a Memory for Learning Rates”, et al 2017
- “Why Pay More When You Can Pay Less: A Joint Learning Framework for Active Feature Acquisition and Classification”, et al 2017
- “N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning”, et al 2017
- “SRU: Simple Recurrent Units for Highly Parallelizable Recurrence”, et al 2017
- “Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks”, 2017
- “Twin Networks: Matching the Future for Sequence Generation”, et al 2017
- “Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks”, et al 2017
- “Revisiting Activation Regularization for Language RNNs”, et al 2017
- “Bayesian Sparsification of Recurrent Neural Networks”, et al 2017
- “On the State-Of-The-Art of Evaluation in Neural Language Models”, et al 2017
- “Controlling Linguistic Style Aspects in Neural Language Generation”, 2017
- “Device Placement Optimization With Reinforcement Learning”, et al 2017
- “Six Challenges for Neural Machine Translation”, 2017
- “Towards Synthesizing Complex Programs from Input-Output Examples”, et al 2017
- “Language Generation With Recurrent Generative Adversarial Networks without Pre-Training”, et al 2017
- “Biased Importance Sampling for Deep Neural Network Training”, 2017
- “Deriving Neural Architectures from Sequence and Graph Kernels”, et al 2017
- “A Deep Reinforced Model for Abstractive Summarization”, et al 2017
- “TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension”, et al 2017
- “DeepTingle”, et al 2017
- “A Neural Network System for Transformation of Regional Cuisine Style”, et al 2017
- “Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU”, 2017
- “Adversarial Neural Machine Translation”, et al 2017
- “SearchQA: A New Q&A Dataset Augmented With Context from a Search Engine”, et al 2017
- “Learning to Reason: End-To-End Module Networks for Visual Question Answering”, et al 2017
- “Exploring Sparsity in Recurrent Neural Networks”, et al 2017
- “Get To The Point: Summarization With Pointer-Generator Networks”, et al 2017
- “DeepAR: Probabilistic Forecasting With Autoregressive Recurrent Networks”, et al 2017
- “Bayesian Recurrent Neural Networks”, et al 2017
- “Recurrent Environment Simulators”, et al 2017
- “Learning to Generate Reviews and Discovering Sentiment”, et al 2017
- “Learning Simpler Language Models With the Differential State Framework”, Ororbia II et al 2017
- “I2T2I: Learning Text to Image Synthesis With Textual Data Augmentation”, et al 2017
- “Improving Neural Machine Translation With Conditional Sequence Generative Adversarial Nets”, et al 2017
- “Learned Optimizers That Scale and Generalize”, et al 2017
- “Parallel Multiscale Autoregressive Density Estimation”, et al 2017
- “Tracking the World State With Recurrent Entity Networks”, et al 2017
- “Optimization As a Model for Few-Shot Learning”, 2017
- “Neural Combinatorial Optimization With Reinforcement Learning”, et al 2017
- “Frustratingly Short Attention Spans in Neural Language Modeling”, et al 2017
- “Tuning Recurrent Neural Networks With Reinforcement Learning”, et al 2017
- “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-Of-Experts Layer”, et al 2017
- “Neural Data Filter for Bootstrapping Stochastic Gradient Descent”, et al 2017
- “Learning the Enigma With Recurrent Neural Networks”, 2017
- “Your TL;DR by an AI: A Deep Reinforced Model for Abstractive Summarization”, 2017
- “SampleRNN: An Unconditional End-To-End Neural Audio Generation Model”, et al 2016
- “Improving Neural Language Models With a Continuous Cache”, et al 2016
- “NewsQA: A Machine Comprehension Dataset”, et al 2016
- “Neural Combinatorial Optimization With Reinforcement Learning”, et al 2016
- “Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation”, et al 2016
- “Learning to Learn without Gradient Descent by Gradient Descent”, et al 2016
- “RL2: Fast Reinforcement Learning via Slow Reinforcement Learning”, et al 2016
- “DeepCoder: Learning to Write Programs”, et al 2016
- “QRNNs: Quasi-Recurrent Neural Networks”, et al 2016
- “Neural Architecture Search With Reinforcement Learning”, 2016
- “Bidirectional Attention Flow for Machine Comprehension”, et al 2016
- “Hybrid Computing Using a Neural Network With Dynamic External Memory”, et al 2016
- “Scaling Memory-Augmented Neural Networks With Sparse Reads and Writes”, et al 2016
- “Using Fast Weights to Attend to the Recent Past”, et al 2016
- “Achieving Human Parity in Conversational Speech Recognition”, et al 2016
- “VPN: Video Pixel Networks”, et al 2016
- “HyperNetworks”, et al 2016
- “Pointer Sentinel Mixture Models”, et al 2016
- “Multiplicative LSTM for Sequence Modeling”, et al 2016
- “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”, et al 2016
- “Image-To-Markup Generation With Coarse-To-Fine Attention”, et al 2016
- “Hierarchical Multiscale Recurrent Neural Networks”, et al 2016
- “Deep Learning Human Mind for Automated Visual Classification”, et al 2016
- “Using the Output Embedding to Improve Language Models”, 2016
- “Full Resolution Image Compression With Recurrent Neural Networks”, et al 2016
- “Decoupled Neural Interfaces Using Synthetic Gradients”, et al 2016
- “Clockwork Convnets for Video Semantic Segmentation”, et al 2016
- “Layer Normalization”, et al 2016
- “LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks”, et al 2016
- “Learning to Learn by Gradient Descent by Gradient Descent”, et al 2016
- “Iterative Alternating Neural Attention for Machine Reading”, et al 2016
- “Deep Reinforcement Learning for Dialogue Generation”, et al 2016
- “Programming With a Differentiable Forth Interpreter”, et al 2016
- “Training Deep Nets With Sublinear Memory Cost”, et al 2016
- “Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex”, 2016
- “Improving Sentence Compression by Learning to Predict Gaze”, et al 2016
- “Adaptive Computation Time for Recurrent Neural Networks”, 2016
- “Dynamic Memory Networks for Visual and Textual Question Answering”, et al 2016
- “PlaNet—Photo Geolocation With Convolutional Neural Networks”, et al 2016
- “Learning Distributed Representations of Sentences from Unlabeled Data”, et al 2016
- “Exploring the Limits of Language Modeling”, et al 2016
- “PixelRNN: Pixel Recurrent Neural Networks”, et al 2016
- “Persistent RNNs: Stashing Recurrent Weights On-Chip”, et al 2016
- “Exploring the Limits of Language Modeling § 5.9: Samples from the Model”
- “Deep-Spying: Spying Using Smartwatch and Deep Learning”, 2015
- “On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models”, 2015
- “Neural GPUs Learn Algorithms”, 2015
- “Sequence Level Training With Recurrent Neural Networks”, et al 2015
- “Neural Programmer-Interpreters”, 2015
- “Generating Sentences from a Continuous Space”, et al 2015
- “Generative Concatenative Nets Jointly Learn to Write and Classify Reviews”, et al 2015
- “Generating Images from Captions With Attention”, et al 2015
- “Semi-Supervised Sequence Learning”, 2015
- “BPEs: Neural Machine Translation of Rare Words With Subword Units”, et al 2015
- “Training Recurrent Networks Online without Backtracking”, et al 2015
- “Deep Recurrent Q-Learning for Partially Observable MDPs”, 2015
- “Teaching Machines to Read and Comprehend”, et al 2015
- “Scheduled Sampling for Sequence Prediction With Recurrent Neural Networks”, et al 2015
- “Visualizing and Understanding Recurrent Networks”, et al 2015
- “The Unreasonable Effectiveness of Recurrent Neural Networks”, 2015
- “Deep Neural Networks for Large Vocabulary Handwritten Text Recognition”, 2015
- “Reinforcement Learning Neural Turing Machines—Revised”, 2015
- “End-To-End Memory Networks”, et al 2015
- “LSTM: A Search Space Odyssey”, et al 2015
- “Inferring Algorithmic Patterns With Stack-Augmented Recurrent Nets”, et al 2015
- “DRAW: A Recurrent Neural Network For Image Generation”, et al 2015
- “Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews”, et al 2014
- “Neural Turing Machines”, et al 2014
- “Learning to Execute”, 2014
- “Neural Machine Translation by Jointly Learning to Align and Translate”, et al 2014
- “Identifying and Attacking the Saddle Point Problem in High-Dimensional Non-Convex Optimization”, et al 2014
- “GRU: Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation”, et al 2014
- “doc2vec: Distributed Representations of Sentences and Documents”, 2014
- “A Clockwork RNN”, et al 2014
- “One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling”, et al 2013
- “Generating Sequences With Recurrent Neural Networks”, 2013
- “On the Difficulty of Training Recurrent Neural Networks”, et al 2012
- “Recurrent Neural Network Based Language Model”, et al 2010
- “Large Language Models in Machine Translation”, et al 2007
- “Learning to Learn Using Gradient Descent”, et al 2001
- “Long Short-Term Memory”, 1997
- “Flat Minima”, 1997
- “Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity”, 1995
- “A Focused Backpropagation Algorithm for Temporal Pattern Recognition”, 1995
- “Learning Complex, Extended Sequences Using the Principle of History Compression”, 1992
- “Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks”, 1992
- “Untersuchungen Zu Dynamischen Neuronalen Netzen [Studies of Dynamic Neural Networks]”, 1991
- “Finding Structure In Time”, 1990
- “Connectionist Music Composition Based on Melodic, Stylistic, and Psychophysical Constraints [Technical Report CU-CS–495–90]”, 1990
- “A Learning Algorithm for Continually Running Fully Recurrent Neural Networks”, 1989b
- “Recurrent Backpropagation and Hopfield Networks”, 1989b
- “Backpropagation in Perceptrons With Feedback”, 1989
- “Experimental Analysis of the Real-Time Recurrent Learning Algorithm”, 1989
- “A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks”, 1989
- “A Sticky-Bit Approach for Learning to Represent State”, 1988
- “Generalization of Backpropagation With Application to a Recurrent Gas Market Model”, 1988
- “Generalization of Back-Propagation to Recurrent Neural Networks”, 1987
- “The Utility Driven Dynamic Error Propagation Network (RTRL)”, 1987
- “A Self-Optimizing, Non-Symmetrical Neural Net for Content Addressable Memory and Pattern Recognition”, 1986
- “Programming a Massively Parallel, Computation Universal System: Static Behavior”, 1986b
- “Serial Order: A Parallel Distributed Processing Approach”, 1986
- “Hypernetworks [Blog]”, 2024
- “Safety-First AI for Autonomous Data Center Cooling and Industrial Control”
- “Attention and Augmented Recurrent Neural Networks”
- “BlinkDL/RWKV-LM: RWKV Is an RNN With Transformer-Level LLM Performance. It Can Be Directly Trained like a GPT (parallelizable). So It’s Combining the Best of RNN and Transformer—Great Performance, Fast Inference, Saves VRAM, Fast Training, “Infinite” Ctx_len, and Free Sentence Embedding.”
- “Efficient, Reusable RNNs and LSTMs for Torch”
- “Updated Training?”
- “Minimaxir/textgenrnn: Easily Train Your Own Text-Generating Neural Network of Any Size and Complexity on Any Text Dataset With a Few Lines of Code.”
- “Deep Learning for Assisting the Process of Music Composition (part 3)”
- “Metalearning or Learning to Learn Since 1987”
- “Stream Seaandsailor”
- “Composing Music With Recurrent Neural Networks”
- Wikipedia
- Miscellaneous
- Bibliography