“Learning to Learn Using Gradient Descent”, Sepp Hochreiter, A. Steven Younger, Peter R. Conwell, 2001-08-17:

This paper introduces the application of gradient descent methods to meta-learning. The concept of “meta-learning”, i.e. of a system that improves or discovers a learning algorithm, has been of interest in machine learning for decades because of its appealing applications. Previous meta-learning approaches have been based on evolutionary methods and, therefore, have been restricted to small models with few free parameters.

We make meta-learning in large systems feasible by using recurrent neural networks with their attendant learning routines as meta-learning systems.
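As a rough illustration of what using a recurrent network as a meta-learning system can look like, the sketch below builds one training episode and runs a small recurrent network over it. The abstract does not specify the architecture or the input encoding, so everything here is an assumption made for illustration: the task family `y = slope * x`, the input format `(x_t, y_{t-1})`, the Elman-style network, and the names `make_episode` and `rnn_forward`. The point it tries to show is that the weights stay fixed within an episode, so any adaptation to the sampled task has to happen in the hidden state; gradient descent on the weights across many such episodes would be the meta-learning step, and is omitted here.

```python
import math
import random


def make_episode(slope, length, rng):
    """Build one episode for a hypothetical task family y = slope * x.

    At step t the network sees (x_t, y_{t-1}) and the target is y_t.
    Feeding back the previous target is what lets the network infer
    the task from within the episode.
    """
    inputs, targets = [], []
    prev_y = 0.0  # no feedback is available before the first step
    for _ in range(length):
        x = rng.uniform(-1.0, 1.0)
        y = slope * x
        inputs.append((x, prev_y))
        targets.append(y)
        prev_y = y
    return inputs, targets


def rnn_forward(inputs, w_in, w_rec, w_out):
    """Run a tiny Elman-style RNN with fixed weights over one episode."""
    n_hidden = len(w_rec)
    h = [0.0] * n_hidden  # hidden state: the only thing that changes in-episode
    outputs = []
    for x, prev_y in inputs:
        h = [
            math.tanh(
                w_in[i][0] * x
                + w_in[i][1] * prev_y
                + sum(w_rec[i][j] * h[j] for j in range(n_hidden))
            )
            for i in range(n_hidden)
        ]
        outputs.append(sum(w_out[i] * h[i] for i in range(n_hidden)))
    return outputs


rng = random.Random(0)
n_hidden = 4

# Untrained random weights (assumption: sizes and scales are arbitrary).
w_in = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(n_hidden)]
w_rec = [[rng.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
w_out = [rng.gauss(0, 0.5) for _ in range(n_hidden)]

# Each episode samples a new task; the outer loop (not shown) would
# adjust the weights by gradient descent to lower this episode loss.
inputs, targets = make_episode(slope=rng.uniform(-2, 2), length=10, rng=rng)
outputs = rnn_forward(inputs, w_in, w_rec, w_out)
episode_loss = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)
```

With untrained weights the episode loss is not meaningful; the sketch only fixes the shape of the problem that an outer gradient-descent loop would optimize.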

Our system derived complex well-performing learning algorithms from scratch. In this paper we also show that our approach performs non-stationary time series prediction.