See Also

Links

- “Diversifying AI: Towards Creative Chess With AlphaZero”, Zahavy et al 2023
- “CausalLM Is Not Optimal for In-context Learning”, Ding et al 2023
- “Self Expanding Neural Networks”, Mitchell et al 2023
- “One Step of Gradient Descent Is Provably the Optimal In-Context Learner With One Layer of Linear Self-Attention”, Mahankali et al 2023
- “Teaching Arithmetic to Small Transformers”, Lee et al 2023
- “Trainable Transformer in Transformer”, Panigrahi et al 2023
- “Pretraining Task Diversity and the Emergence of Non-Bayesian In-context Learning for Regression”, Raventós et al 2023
- “How Well Do Large Language Models Perform in Arithmetic Tasks?”, Yuan et al 2023
- “BiLD: Big Little Transformer Decoder”, Kim et al 2023
- “Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery”, Wen et al 2023
- “Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent As Meta-Optimizers”, Dai et al 2022
- “Unnatural Instructions: Tuning Language Models With (Almost) No Human Labor”, Honovich et al 2022
- “FWL: Meta-Learning Fast Weight Language Models”, Clark et al 2022
- “What Learning Algorithm Is In-context Learning? Investigations With Linear Models”, Akyürek et al 2022
- “VeLO: Training Versatile Learned Optimizers by Scaling Up”, Metz et al 2022
- “BLOOMZ/mT0: Crosslingual Generalization through Multitask Finetuning”, Muennighoff et al 2022
- “ProMoT: Preserving In-Context Learning Ability in Large Language Model Fine-tuning”, Wang et al 2022
- “SAP: Bidirectional Language Models Are Also Few-shot Learners”, Patel et al 2022
- “AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”, Soltan et al 2022
- “What Can Transformers Learn In-Context? A Case Study of Simple Function Classes”, Garg et al 2022
- “Few-shot Adaptation Works With UnpredicTable Data”, Chan et al 2022
- “Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling”, Nguyen & Grover 2022
- “Offline RL Policies Should Be Trained to Be Adaptive”, Ghosh et al 2022
- “TabPFN: Meta-Learning a Real-Time Tabular AutoML Method For Small Data”, Hollmann et al 2022
- “Goal-Conditioned Generators of Deep Policies”, Faccio et al 2022
- “Prompting Decision Transformer for Few-Shot Policy Generalization”, Xu et al 2022
- “RHO-LOSS: Prioritized Training on Points That Are Learnable, Worth Learning, and Not Yet Learnt”, Mindermann et al 2022
- “NOAH: Neural Prompt Search”, Zhang et al 2022
- “Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, Caccia et al 2022
- “Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions”, Jiang et al 2022
- “Towards Learning Universal Hyperparameter Optimizers With Transformers”, Chen et al 2022
- “CT0: Fine-tuned Language Models Are Continual Learners”, Scialom et al 2022
- “Instruction Induction: From Few Examples to Natural Language Task Descriptions”, Honovich et al 2022
- “Gato: A Generalist Agent”, Reed et al 2022
- “Unifying Language Learning Paradigms”, Tay et al 2022
- “Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers”, Chan et al 2022
- “Tk-Instruct: Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks”, Wang et al 2022
- “What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?”, Wang et al 2022
- “Effective Mutation Rate Adaptation through Group Elite Selection”, Kumar et al 2022
- “DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning”, Wang et al 2022
- “Searching for Efficient Neural Architectures for On-Device ML on Edge TPUs”, Akin et al 2022
- “Can Language Models Learn from Explanations in Context?”, Lampinen et al 2022
- “Auto-Lambda: Disentangling Dynamic Task Relationships”, Liu et al 2022
- “In-context Learning and Induction Heads”, Olsson et al 2022
- “Evolving Curricula With Regret-Based Environment Design”, Parker-Holder et al 2022
- “HyperPrompt: Prompt-based Task-Conditioning of Transformers”, He et al 2022
- “Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?”, Min et al 2022
- “All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL”, Arulkumaran et al 2022
- “NeuPL: Neural Population Learning”, Liu et al 2022
- “The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention”, Irie et al 2022
- “Learning Synthetic Environments and Reward Networks for Reinforcement Learning”, Ferreira et al 2022
- “From Data to Functa: Your Data Point Is a Function and You Should Treat It like One”, Dupont et al 2022
- “Active Predictive Coding Networks: A Neural Solution to the Problem of Learning Reference Frames and Part-Whole Hierarchies”, Gklezakos & Rao 2022
- “Environment Generation for Zero-Shot Compositional Reinforcement Learning”, Gur et al 2022
- “Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild”, Miki et al 2022
- “HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning”, Zhmoginov et al 2022
- “In Defense of the Unitary Scalarization for Deep Multi-Task Learning”, Kurin et al 2022
- “Automated Reinforcement Learning (AutoRL): A Survey and Open Problems”, Parker-Holder et al 2022
- “Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning”, Curry et al 2022
- “The Curse of Zero Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence”, Miranda et al 2021
- “A Mathematical Framework for Transformer Circuits”, Elhage et al 2021
- “PFNs: Transformers Can Do Bayesian Inference”, Müller et al 2021
- “Learning to Prompt for Continual Learning”, Wang et al 2021
- “How to Learn and Represent Abstractions: An Investigation Using Symbolic Alchemy”, AlKhamissi et al 2021
- “Noether Networks: Meta-Learning Useful Conserved Quantities”, Alet et al 2021
- “A General Language Assistant As a Laboratory for Alignment”, Askell et al 2021
- “A Rational Reinterpretation of Dual-process Theories”, Milli et al 2021
- “A Modern Self-Referential Weight Matrix That Learns to Modify Itself”, Irie et al 2021
- “A Survey of Generalisation in Deep Reinforcement Learning”, Kirk et al 2021
- “Gradients Are Not All You Need”, Metz et al 2021
- “An Explanation of In-context Learning As Implicit Bayesian Inference”, Xie et al 2021
- “Procedural Generalization by Planning With Self-Supervised World Models”, Anand et al 2021
- “MetaICL: Learning to Learn In Context”, Min et al 2021
- “Logical Activation Functions: Logit-space Equivalents of Probabilistic Boolean Operators”, Lowe et al 2021
- “Shaking the Foundations: Delusions in Sequence Models for Interaction and Control”, Ortega et al 2021
- “Meta-learning, Social Cognition and Consciousness in Brains and Machines”, Langdon et al 2021
- “T0: Multitask Prompted Training Enables Zero-Shot Task Generalization”, Sanh et al 2021
- “Embodied Intelligence via Learning and Evolution”, Gupta et al 2021
- “Replay-Guided Adversarial Environment Design”, Jiang et al 2021
- “Transformers Are Meta-Reinforcement Learners”, Anonymous 2021
- “Scalable Online Planning via Reinforcement Learning Fine-Tuning”, Fickinger et al 2021
- “Is Curiosity All You Need? On the Utility of Emergent Behaviors from Curious Exploration”, Groth et al 2021
- “Dropout’s Dream Land: Generalization from Learned Simulators to Reality”, Wellmer & Kwok 2021
- “Bootstrapped Meta-Learning”, Flennerhag et al 2021
- “The Sensory Neuron As a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning”, Tang & Ha 2021
- “FLAN: Finetuned Language Models Are Zero-Shot Learners”, Wei et al 2021
- “The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning”, Zheng et al 2021
- “Dataset Distillation With Infinitely Wide Convolutional Networks”, Nguyen et al 2021
- “Open-Ended Learning Leads to Generally Capable Agents”, Team et al 2021
- “Why Generalization in RL Is Difficult: Epistemic POMDPs and Implicit Partial Observability”, Ghosh et al 2021
- “PonderNet: Learning to Ponder”, Banino et al 2021
- “Multimodal Few-Shot Learning With Frozen Language Models”, Tsimpoukelli et al 2021
- “LHOPT: A Generalizable Approach to Learning Optimizers”, Almeida et al 2021
- “Towards Mental Time Travel: a Hierarchical Memory for Reinforcement Learning Agents”, Lampinen et al 2021
- “A Full-stack Accelerator Search Technique for Vision Applications”, Zhang et al 2021
- “Reward Is Enough”, Silver et al 2021
- “Bayesian Optimization Is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020”, Turner et al 2021
- “CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP”, Ye et al 2021
- “Podracer Architectures for Scalable Reinforcement Learning”, Hessel et al 2021
- “BLUR: Meta-Learning Bidirectional Update Rules”, Sandler et al 2021
- “Asymmetric Self-play for Automatic Goal Discovery in Robotic Manipulation”, OpenAI et al 2021
- “OmniNet: Omnidirectional Representations from Transformers”, Tay et al 2021
- “Linear Transformers Are Secretly Fast Weight Programmers”, Schlag et al 2021
- “Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”, Reynolds & McDonell 2021
- “ES-ENAS: Blackbox Optimization over Hybrid Spaces via Combinatorial and Continuous Evolution”, Song et al 2021
- “Training Learned Optimizers With Randomly Initialized Learned Optimizers”, Metz et al 2021
- “Evolving Reinforcement Learning Algorithms”, Co-Reyes et al 2021
- “Meta Pseudo Labels”, Pham et al 2021
- “Meta Learning Backpropagation And Improving It”, Kirsch & Schmidhuber 2020
- “Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design”, Dennis et al 2020
- “Scaling down Deep Learning”, Greydanus 2020
- “Reverse Engineering Learned Optimizers Reveals Known and Novel Mechanisms”, Maheswaranathan et al 2020
- “Dataset Meta-Learning from Kernel Ridge-Regression”, Nguyen et al 2020
- “MELD: Meta-Reinforcement Learning from Images via Latent State Models”, Zhao et al 2020
- “Meta-trained Agents Implement Bayes-optimal Agents”, Mikulik et al 2020
- “Learning Not to Learn: Nature versus Nurture in Silico”, Lange & Sprekeler 2020
- “Prioritized Level Replay”, Jiang et al 2020
- “Learning from the Past: Meta-Continual Learning With Knowledge Embedding for Jointly Sketch, Cartoon, and Caricature Face Recognition”, Zheng et al 2020b
- “Tasks, Stability, Architecture, and Compute: Training More Effective Learned Optimizers, and Using Them to Train Themselves”, Metz et al 2020
- “Grounded Language Learning Fast and Slow”, Hill et al 2020
- “Matt Botvinick on the Spontaneous Emergence of Learning Algorithms”, Scholl 2020
- “Discovering Reinforcement Learning Algorithms”, Oh et al 2020
- “Deep Reinforcement Learning and Its Neuroscientific Implications”, Botvinick 2020
- “Meta-Learning through Hebbian Plasticity in Random Networks”, Najarro & Risi 2020
- “Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions”, Chang et al 2020
- “Learning to Learn With Feedback and Local Plasticity”, Lindsey & Litwin-Kumar 2020
- “Rapid Task-Solving in Novel Environments”, Ritter et al 2020
- “FBNetV3: Joint Architecture-Recipe Search Using Predictor Pretraining”, Dai et al 2020
- “GPT-3: Language Models Are Few-Shot Learners”, Brown et al 2020
- “Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search”, Rawal et al 2020
- “Automatic Discovery of Interpretable Planning Strategies”, Skirzyński et al 2020
- “Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks”, Schoettler et al 2020
- “A Comparison of Methods for Treatment Assignment With an Application to Playlist Generation”, Fernández-Loría et al 2020
- “Approximate Exploitability: Learning a Best Response in Large Games”, Timbers et al 2020
- “Meta-Learning in Neural Networks: A Survey”, Hospedales et al 2020
- “Designing Network Design Spaces”, Radosavovic et al 2020
- “Agent57: Outperforming the Atari Human Benchmark”, Badia et al 2020
- “Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and Their Solutions”, Wang et al 2020
- “Accelerating and Improving AlphaZero Using Population Based Training”, Wu et al 2020
- “Meta-learning Curiosity Algorithms”, Alet et al 2020
- “AutoML-Zero: Evolving Machine Learning Algorithms From Scratch”, Real et al 2020
- “AutoML-Zero: Open Source Code for the Paper: "AutoML-Zero: Evolving Machine Learning Algorithms From Scratch"”, Real et al 2020
- “Effective Diversity in Population Based Reinforcement Learning”, Parker-Holder et al 2020
- “AI Helps Warehouse Robots Pick Up New Tricks: Backed by Machine Learning Luminaries, Covariant.ai’s Bots Can Handle Jobs Previously Needing a Human Touch”, Knight 2020
- “Smooth Markets: A Basic Mechanism for Organizing Gradient-based Learners”, Balduzzi et al 2020
- “AutoML-Zero: Evolving Code That Learns”, Real & Liang 2020
- “Learning Neural Activations”, Minhas & Asif 2019
- “Meta-Learning without Memorization”, Yin et al 2019
- “MetaFun: Meta-Learning With Iterative Functional Updates”, Xu et al 2019
- “Procgen Benchmark: We’re Releasing Procgen Benchmark, 16 Simple-to-use Procedurally-generated Environments Which Provide a Direct Measure of How Quickly a Reinforcement Learning Agent Learns Generalizable Skills”, Cobbe et al 2019
- “Leveraging Procedural Generation to Benchmark Reinforcement Learning”, Cobbe et al 2019
- “Increasing Generality in Machine Learning through Procedural Content Generation”, Risi & Togelius 2019
- “Optimizing Millions of Hyperparameters by Implicit Differentiation”, Lorraine et al 2019
- “Learning to Predict Without Looking Ahead: World Models Without Forward Prediction [blog]”, Freeman et al 2019
- “Learning to Predict Without Looking Ahead: World Models Without Forward Prediction”, Freeman et al 2019
- “Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning”, Yu et al 2019
- “Solving Rubik’s Cube With a Robot Hand”, OpenAI et al 2019
- “Solving Rubik’s Cube With a Robot Hand [blog]”, OpenAI 2019
- “Gradient Descent: The Ultimate Optimizer”, Chandra et al 2019
- “Multiplicative Interactions and Where to Find Them”, Jayakumar et al 2019
- “Data Valuation Using Reinforcement Learning”, Yoon et al 2019
- “ANIL: Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML”, Raghu et al 2019
- “Emergent Tool Use From Multi-Agent Autocurricula”, Baker et al 2019
- “Meta-Learning With Implicit Gradients”, Rajeswaran et al 2019
- “A Critique of Pure Learning and What Artificial Neural Networks Can Learn from Animal Brains”, Zador 2019
- “AutoML: A Survey of the State-of-the-Art”, He et al 2019
- “Metalearned Neural Memory”, Munkhdalai et al 2019
- “Algorithms for Hyper-Parameter Optimization”, Bergstra et al 2019
- “Evolving the Hearthstone Meta”, Silva et al 2019
- “Meta Reinforcement Learning”, Weng 2019
- “One Epoch Is All You Need”, Komatsuzaki 2019
- “Compositional Generalization through Meta Sequence-to-sequence Learning”, Lake 2019
- “Risks from Learned Optimization in Advanced Machine Learning Systems”, Hubinger et al 2019
- “ICML 2019 Notes”, Abel 2019
- “SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers”, Fedorov et al 2019
- “AI-GAs: AI-generating Algorithms, an Alternate Paradigm for Producing General Artificial Intelligence”, Clune 2019
- “Alpha MAML: Adaptive Model-Agnostic Meta-Learning”, Behl et al 2019
- “Reinforcement Learning, Fast and Slow”, Botvinick et al 2019
- “Meta Reinforcement Learning As Task Inference”, Humplik et al 2019
- “Learning Loss for Active Learning”, Yoo & Kweon 2019
- “Meta-learning of Sequential Strategies”, Ortega et al 2019
- “Searching for MobileNetV3”, Howard et al 2019
- “Meta-learners’ Learning Dynamics Are unlike Learners’”, Rabinowitz 2019
- “Ray Interference: a Source of Plateaus in Deep Reinforcement Learning”, Schaul et al 2019
- “AlphaX: EXploring Neural Architectures With Deep Neural Networks and Monte Carlo Tree Search”, Wang et al 2019
- “Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables”, Rakelly et al 2019
- “The Omniglot Challenge: a 3-year Progress Report”, Lake et al 2019
- “FIGR: Few-shot Image Generation With Reptile”, Clouâtre & Demers 2019
- “Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions”, Wang et al 2019
- “Meta-Learning Neural Bloom Filters”, Rae 2019
- “Malthusian Reinforcement Learning”, Leibo et al 2018
- “Quantifying Generalization in Reinforcement Learning”, Cobbe et al 2018
- “Meta-Learning: Learning to Learn Fast”, Weng 2018
- “An Introduction to Deep Reinforcement Learning”, Francois-Lavet et al 2018
- “Evolving Space-Time Neural Architectures for Videos”, Piergiovanni et al 2018
- “Understanding and Correcting Pathologies in the Training of Learned Optimizers”, Metz et al 2018
- “WBE and DRL: a Middle Way of Imitation Learning from the Human Brain”, Branwen 2018
- “BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning”, Chevalier-Boisvert et al 2018
- “Deep Reinforcement Learning”, Li 2018
- “Searching for Efficient Multi-Scale Architectures for Dense Image Prediction”, Chen et al 2018
- “Backprop Evolution”, Alber et al 2018
- “Learning Dexterous In-Hand Manipulation”, OpenAI et al 2018
- “LEO: Meta-Learning With Latent Embedding Optimization”, Rusu et al 2018
- “Automatically Composing Representation Transformations As a Means for Generalization”, Chang et al 2018
- “Human-level Performance in First-person Multiplayer Games With Population-based Deep Reinforcement Learning”, Jaderberg et al 2018
- “Guided Evolutionary Strategies: Augmenting Random Search With Surrogate Gradients”, Maheswaranathan et al 2018
- “RUDDER: Return Decomposition for Delayed Rewards”, Arjona-Medina et al 2018
- “Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning”, Pang et al 2018
- “Fingerprint Policy Optimisation for Robust Reinforcement Learning”, Paul et al 2018
- “Meta-Gradient Reinforcement Learning”, Xu et al 2018
- “AutoAugment: Learning Augmentation Policies from Data”, Cubuk et al 2018
- “Prefrontal Cortex As a Meta-reinforcement Learning System”, Wang et al 2018
- “Meta-Learning Update Rules for Unsupervised Representation Learning”, Metz et al 2018
- “Reviving and Improving Recurrent Back-Propagation”, Liao et al 2018
- “Kickstarting Deep Reinforcement Learning”, Schmitt et al 2018
- “Reptile: On First-Order Meta-Learning Algorithms”, Nichol et al 2018
- “Some Considerations on Learning to Explore via Meta-Reinforcement Learning”, Stadie et al 2018
- “One Big Net For Everything”, Schmidhuber 2018
- “Machine Theory of Mind”, Rabinowitz et al 2018
- “Evolved Policy Gradients”, Houthooft et al 2018
- “One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning”, Yu et al 2018
- “Rover Descent: Learning to Optimize by Learning to Navigate on Prototypical Loss Surfaces”, Faury & Vasile 2018
- “ScreenerNet: Learning Self-Paced Curriculum for Deep Neural Networks”, Kim & Choi 2018
- “Population Based Training of Neural Networks”, Jaderberg et al 2017
- “BlockDrop: Dynamic Inference Paths in Residual Networks”, Wu et al 2017
- “Learning to Select Computations”, Callaway et al 2017
- “Efficient K-shot Learning With Regularized Deep Networks”, Yoo et al 2017
- “Online Learning of a Memory for Learning Rates”, Meier et al 2017
- “Supervising Unsupervised Learning”, Garg & Kalai 2017
- “One-Shot Visual Imitation Learning via Meta-Learning”, Finn et al 2017
- “Learning With Opponent-Learning Awareness”, Foerster et al 2017
- “SMASH: One-Shot Model Architecture Search through HyperNetworks”, Brock et al 2017
- “Stochastic Optimization With Bandit Sampling”, Salehi et al 2017
- “A Simple Neural Attentive Meta-Learner”, Mishra et al 2017
- “Reinforcement Learning for Learning Rate Control”, Xu et al 2017
- “Metacontrol for Adaptive Imagination-Based Optimization”, Hamrick et al 2017
- “Research Ideas”, Gwern 2017
- “Deciding How to Decide: Dynamic Routing in Artificial Neural Networks”, McGill & Perona 2017
- “Prototypical Networks for Few-shot Learning”, Snell et al 2017
- “Learned Optimizers That Scale and Generalize”, Wichrowska et al 2017
- “MAML: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks”, Finn et al 2017
- “Meta Networks”, Munkhdalai & Yu 2017
- “Optimization As a Model for Few-Shot Learning”, Ravi & Larochelle 2017
- “Understanding Synthetic Gradients and Decoupled Neural Interfaces”, Czarnecki et al 2017
- “Learning to Optimize Neural Nets”, Li & Malik 2017
- “Learning to Superoptimize Programs”, Bunel et al 2017
- “Discovering Objects and Their Relations from Entangled Scene Representations”, Raposo et al 2017
- “An Actor-critic Algorithm for Learning Rate Learning”, Xu et al 2016
- “Learning to Reinforcement Learn”, Wang et al 2016
- “Learning to Learn without Gradient Descent by Gradient Descent”, Chen et al 2016
- “RL2: Fast Reinforcement Learning via Slow Reinforcement Learning”, Duan et al 2016
- “Designing Neural Network Architectures Using Reinforcement Learning”, Baker et al 2016
- “Using Fast Weights to Attend to the Recent Past”, Ba et al 2016
- “HyperNetworks”, Ha et al 2016
- “Decoupled Neural Interfaces Using Synthetic Gradients”, Jaderberg et al 2016
- “Learning to Learn by Gradient Descent by Gradient Descent”, Andrychowicz et al 2016
- “Matching Networks for One Shot Learning”, Vinyals et al 2016
- “Learning to Optimize”, Li & Malik 2016
- “One-shot Learning With Memory-Augmented Neural Networks”, Santoro et al 2016
- “Adaptive Computation Time for Recurrent Neural Networks”, Graves 2016
- “On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models”, Schmidhuber 2015
- “Gradient-based Hyperparameter Optimization through Reversible Learning”, Maclaurin et al 2015
- “Machine Teaching: an Inverse Problem to Machine Learning and an Approach Toward Optimal Education”, Zhu 2015b
- “Human-level Concept Learning through Probabilistic Program Induction”, Lake 2015
- “Deep Learning in Neural Networks: An Overview”, Schmidhuber 2014
- “Practical Bayesian Optimization of Machine Learning Algorithms”, Snoek et al 2012
- “Optimal Ordered Problem Solver (OOPS)”, Schmidhuber 2002
- “Learning to Learn Using Gradient Descent”, Hochreiter et al 2001
- “On the Optimization of a Synaptic Learning Rule”, Bengio et al 1997
- “Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks”, Schmidhuber 1992
- “Interactions between Learning and Evolution”, Ackley & Littman 1992
- “Learning a Synaptic Learning Rule”, Bengio et al 1991
- “Reinforcement Learning: An Introduction §Designing Reward Signals”, Sutton & Barto 2023 (page 491)
- “Universal Search § OOPS and Other Incremental Variations”
- “The Lie Comes First, the Worlds to Accommodate It”
- “AlphaStar: Mastering the Real-Time Strategy Game StarCraft II”
- “Prefrontal Cortex As a Meta-reinforcement Learning System [blog]”
- “Optimal Learning: Computational Procedures for Bayes-Adaptive Markov Decision Processes”
Sort By Magic
Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.
Beginning with the newest annotation, it uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics. A minimal sketch of this nearest-neighbor ordering follows the tag list below.
- Meta-RL
- learning-to-predict
- Meta-Continual-Learning
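To make the mechanism concrete, here is a minimal sketch of such an embedding-based ordering. It is an illustrative assumption, not the site's actual implementation: the `titles` list and the random `emb` matrix stand in for the real annotations and their text embeddings, and the ordering is a simple greedy hop to the most cosine-similar unvisited annotation, starting from the newest one.

```python
# Toy sketch of nearest-neighbor annotation ordering (assumed, not the real code).
import numpy as np

rng = np.random.default_rng(0)
titles = ["Diversifying AI", "CausalLM", "Self Expanding Neural Networks",
          "MAML", "Reptile"]                       # stand-ins for annotations
emb = rng.normal(size=(len(titles), 8))            # placeholder embeddings
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit vectors, so dot = cosine

order = [0]                                        # begin with the newest annotation
remaining = set(range(1, len(titles)))
while remaining:
    last = emb[order[-1]]
    # greedy hop: pick the unvisited annotation most similar to the one just placed
    nxt = max(remaining, key=lambda i: float(emb[i] @ last))
    order.append(nxt)
    remaining.remove(nxt)

print([titles[i] for i in order])                  # a topic-ordered progression
```

Under this greedy scheme, embedding quality determines how cleanly the chain segments into topic clusters like the tags above; clustering the chain into sections and auto-labeling each one is a separate step not shown here.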
Wikipedia
Miscellaneous
- http://lukemetz.com/exploring-hyperparameter-meta-loss-landscapes-with-jax/#google
- https://blog.research.google/2021/10/introducing-flan-more-generalizable.html
- https://blog.research.google/2021/11/permutation-invariant-neural-networks.html
- https://blog.research.google/2021/12/training-machine-learning-models-more.html
- https://blog.waymo.com/2020/04/using-automated-data-augmentation-to.html#google
- https://lilianweng.github.io/lil-log/2020/01/29/curriculum-for-reinforcement-learning.html#openai
- https://lilianweng.github.io/lil-log/2020/08/06/neural-architecture-search.html#openai
- https://pages.ucsd.edu/~rbelew/courses/cogs184_w10/readings/HintonNowlan97.pdf
- https://www.deepmind.com/publications/open-ended-learning-leads-to-generally-capable-agents
- https://www.lesswrong.com/posts/FkgsxrGf3QxhfLWHG/risks-from-learned-optimization-introduction
- https://www.lesswrong.com/posts/bC5xd7wQCnTDw7Kyx/getting-up-to-speed-on-the-speed-prior-in-2022
- https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-mode-collapse-due-to-rlhf
- https://www.quantamagazine.org/researchers-build-ai-that-builds-ai-20220125/
- https://www.youtube.com/watch?v=QyJGXc9WeNo&list=PLOXw6I10VTv9HODt7TFEL72K3Q6C4itG6&index=3
Link Bibliography
- https://arxiv.org/abs/2307.03381: “Teaching Arithmetic to Small Transformers”, Nayoung Lee, Kartik Sreenivasan, Jason D. Lee, Kangwook Lee, Dimitris Papailiopoulos
- https://arxiv.org/abs/2304.02015#alibaba: “How Well Do Large Language Models Perform in Arithmetic Tasks?”, Zheng Yuan, Hongyi Yuan, Chuanqi Tan, Wei Wang, Songfang Huang
- https://arxiv.org/abs/2212.02475#google: “FWL: Meta-Learning Fast Weight Language Models”, Kevin Clark, Kelvin Guu, Ming-Wei Chang, Panupong Pasupat, Geoffrey Hinton, Mohammad Norouzi
- https://arxiv.org/abs/2209.14500: “SAP: Bidirectional Language Models Are Also Few-shot Learners”, Ajay Patel, Bryan Li, Mohammad Sadegh Rasooli, Noah Constant, Colin Raffel, Chris Callison-Burch
- https://arxiv.org/abs/2208.01448#amazon: “AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”
- https://arxiv.org/abs/2208.01066: “What Can Transformers Learn In-Context? A Case Study of Simple Function Classes”, Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant
- https://arxiv.org/abs/2207.01848: “TabPFN: Meta-Learning a Real-Time Tabular AutoML Method For Small Data”, Noah Hollmann, Samuel Müller, Katharina Eggensperger, Frank Hutter
- https://arxiv.org/abs/2206.13499: “Prompting Decision Transformer for Few-Shot Policy Generalization”, Mengdi Xu, Yikang Shen, Shun Zhang, Yuchen Lu, Ding Zhao, Joshua B. Tenenbaum, Chuang Gan
- https://arxiv.org/abs/2206.07137: “RHO-LOSS: Prioritized Training on Points That Are Learnable, Worth Learning, and Not Yet Learnt”
- https://arxiv.org/abs/2205.13320#google: “Towards Learning Universal Hyperparameter Optimizers With Transformers”
- https://arxiv.org/abs/2205.12393: “CT0: Fine-tuned Language Models Are Continual Learners”, Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- https://arxiv.org/abs/2205.06175#deepmind: “Gato: A Generalist Agent”
- https://arxiv.org/abs/2205.05131#google: “Unifying Language Learning Paradigms”
- https://arxiv.org/abs/2204.07705: “Tk-Instruct: Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks”
- https://arxiv.org/abs/2203.00759: “HyperPrompt: Prompt-based Task-Conditioning of Transformers”
- https://arxiv.org/abs/2202.12837#facebook: “Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?”, Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
- https://arxiv.org/abs/2202.07415#deepmind: “NeuPL: Neural Population Learning”, Siqi Liu, Luke Marris, Daniel Hennes, Josh Merel, Nicolas Heess, Thore Graepel
- 2022-miki.pdf: “Learning Robust Perceptive Locomotion for Quadrupedal Robots in the Wild”, Takahiro Miki, Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, Marco Hutter
- https://arxiv.org/abs/2112.10510: “PFNs: Transformers Can Do Bayesian Inference”, Samuel Müller, Noah Hollmann, Sebastian Pineda Arango, Josif Grabocka, Frank Hutter
- https://arxiv.org/abs/2112.00861#anthropic: “A General Language Assistant As a Laboratory for Alignment”
- https://arxiv.org/abs/2111.01587#deepmind: “Procedural Generalization by Planning With Self-Supervised World Models”
- https://arxiv.org/abs/2106.00958#openai: “LHOPT: A Generalizable Approach to Learning Optimizers”, Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba
- https://www.sciencedirect.com/science/article/pii/S0004370221000862#deepmind: “Reward Is Enough”, David Silver, Satinder Singh, Doina Precup, Richard S. Sutton
- https://arxiv.org/abs/2104.06272#deepmind: “Podracer Architectures for Scalable Reinforcement Learning”, Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt
- https://arxiv.org/abs/2103.01075#google: “OmniNet: Omnidirectional Representations from Transformers”
- https://arxiv.org/abs/2003.10580#google: “Meta Pseudo Labels”, Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le
- https://greydanus.github.io/2020/12/01/scaling-down/: “Scaling down Deep Learning”, Sam Greydanus
- https://www.lesswrong.com/posts/Wnqua6eQkewL3bqsF/matt-botvinick-on-the-spontaneous-emergence-of-learning: “Matt Botvinick on the Spontaneous Emergence of Learning Algorithms”, Adam Scholl
- https://openai.com/research/procgen-benchmark: “Procgen Benchmark: We’re Releasing Procgen Benchmark, 16 Simple-to-use Procedurally-generated Environments Which Provide a Direct Measure of How Quickly a Reinforcement Learning Agent Learns Generalizable Skills”, Karl Cobbe, Christopher Hesse, Jacob Hilton, John Schulman
- https://arxiv.org/abs/1906.06669: “One Epoch Is All You Need”, Aran Komatsuzaki
- https://david-abel.github.io/notes/icml_2019.pdf: “ICML 2019 Notes”, David Abel
- https://arxiv.org/abs/1905.01320#deepmind: “Meta-learners’ Learning Dynamics Are unlike Learners’”, Neil C. Rabinowitz
- https://arxiv.org/abs/1904.11455#deepmind: “Ray Interference: a Source of Plateaus in Deep Reinforcement Learning”, Tom Schaul, Diana Borsa, Joseph Modayil, Razvan Pascanu
- https://arxiv.org/abs/1806.07857: “RUDDER: Return Decomposition for Delayed Rewards”
- https://arxiv.org/abs/1805.09501#google: “AutoAugment: Learning Augmentation Policies from Data”, Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le
- https://arxiv.org/abs/1804.00222#google: “Meta-Learning Update Rules for Unsupervised Representation Learning”, Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Sohl-Dickstein
- https://arxiv.org/abs/1803.02999#openai: “Reptile: On First-Order Meta-Learning Algorithms”, Alex Nichol, Joshua Achiam, John Schulman
- https://arxiv.org/abs/1708.05344: “SMASH: One-Shot Model Architecture Search through HyperNetworks”, Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston
- idea: “Research Ideas”, Gwern
- 2015-zhu-2.pdf: “Machine Teaching: an Inverse Problem to Machine Learning and an Approach Toward Optimal Education”, Xiaojin Zhu
- https://arxiv.org/abs/cs/0207097#schmidhuber: “Optimal Ordered Problem Solver (OOPS)”, Juergen Schmidhuber
- 1991-bengio.pdf: “Learning a Synaptic Learning Rule”, Yoshua Bengio, Samy Bengio, Jocelyn Cloutier