“Matt Botvinick on the Spontaneous Emergence of Learning Algorithms”, 2020-08-12 (; backlinks; similar):
Matt Botvinick is Director of Neuroscience Research at DeepMind. In this interview, he discusses results from a 2018 paper which describe conditions under which reinforcement learning algorithms will spontaneously give rise to separate full-fledged reinforcement learning algorithms that differ from the original. Here are some notes I gathered from the interview and paper:
Initial Observation
At some point, a group of DeepMind researchers in Botvinick’s group noticed that when they trained a RNN using RL on a series of related tasks, the RNN itself instantiated a separate reinforcement learning algorithm. These researchers weren’t trying to design a meta-learning algorithm—apparently, to their surprise, this just spontaneously happened. As Botvinick describes it, they started “with just one learning algorithm, and then another learning algorithm kind of… emerges, out of, like out of thin air”:
“What happens… it seemed almost magical to us, when we first started realizing what was going on—the slow learning algorithm, which was just kind of adjusting the synaptic weights, those slow synaptic changes give rise to a network dynamics, and the dynamics themselves turn into a learning algorithm.”
Other versions of this basic architecture—eg. using slot-based memory instead of RNNs—seemed to produce the same basic phenomenon, which they termed “meta-RL.” So they concluded that all that’s needed for a system to give rise to meta-RL are three very general properties: the system must 1. have memory, 2. whose weights are trained by a RL algorithm, 3. on a sequence of similar input data.
From Botvinick’s description, it sounds to me like he thinks [learning algorithms that find/instantiate other learning algorithms] is a strong attractor in the space of possible learning algorithms:
“…it’s something that just happens. In a sense, you can’t avoid this happening. If you have a system that has memory, and the function of that memory is shaped by reinforcement learning, and this system is trained on a series of interrelated tasks, this is going to happen. You can’t stop it.”
…The account detailed by Botvinick and Wang et al strikes me as a relatively clear example of mesa-optimization, and I interpret it as tentative evidence that the attractor toward mesa-optimization is strong.
View External Link:
Matt Botvinick on the Spontaneous Emergence of Learning Algorithms