“Reward Is Enough”, David Silver, Satinder Singh, Doina Precup & Richard S. Sutton (2021-05-24):

In this article we hypothesize that intelligence, and its associated abilities, can be understood as subserving the maximization of reward.

Accordingly, reward is enough to drive behaviour that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalisation and imitation. This is in contrast to the view that specialised problem formulations are needed for each ability, based on other signals or objectives.

Furthermore, we suggest that agents that learn through trial-and-error experience to maximise reward could learn behaviour that exhibits most if not all of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence.
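The trial-and-error learning the authors invoke can be made concrete with the simplest reinforcement-learning algorithm. The sketch below (not from the paper; the chain environment and all parameter values are illustrative assumptions) runs tabular Q-learning on a 5-state chain where only reaching the right end pays reward; the agent, told nothing but the scalar reward, discovers the goal-directed policy of always moving right:

```python
import random

def step(state, action, n=5):
    # Deterministic 1-D chain: action 1 moves right, action 0 moves left.
    # Taking "right" from the last state (n-1) pays reward 1 and ends the episode.
    if action == 1 and state == n - 1:
        return state, 1.0, True
    nxt = min(state + 1, n - 1) if action == 1 else max(state - 1, 0)
    return nxt, 0.0, False

def q_learning(n=5, episodes=300, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    # Tabular Q-learning: pure trial and error, driven only by the reward signal.
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n)]       # q[state][action]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)          # explore
            else:
                best = max(q[s])              # exploit, breaking ties at random
                a = rng.choice([i for i in (0, 1) if q[s][i] == best])
            s2, r, done = step(s, a, n)
            target = r + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

q = q_learning()
# Greedy policy extracted from the learned values: move right in every state.
policy = [q[s].index(max(q[s])) for s in range(5)]
print(policy)
```

The point of the illustration is the paper's point in miniature: nothing in the code specifies a goal, a subtask, or a representation; maximizing a single scalar reward by trial and error is what produces the purposeful behaviour.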

[Keywords: artificial intelligence, artificial general intelligence, reinforcement learning, reward]

[cf.: convergent instrumental drives; AI-GAs; the Bitter Lesson; meta-reinforcement learning; the blessings of scale, the pretraining paradigm, and the scaling hypothesis.]