“Why Didn’t DeepMind Build GPT-3?”, Jonathan Godwin, 2023-02-27:

…As someone professionally interested in how you build extraordinary scientific teams, there are 3 things that strike me quite profoundly about GPT-3.

The first is that there is no real evaluation metric or target for GPT-3. Nothing was “solved” when GPT-3 was released, in the way that Go or protein folding was “solved”. Nobody knew in advance how long you’d have to train GPT-3 before it would start to count, and the eerie experience of interacting with GPT-3 is not in any way captured by question answering benchmarks. This lack of easily quantifiable measurement is striking in its departure from previous grand challenges in AI.

The second is that there are comparatively few people with traditional elite academic machine learning backgrounds on the GPT-3 author list (PhDs in machine learning, people with many first- or last-author papers)—an organizational departure from the prevailing wisdom on how to build teams pursuing AGI.

The third is the scale of organizational-level risk taking involved in building GPT-3. It seems obvious now, but it was in no way clear in 2019 that reducing the language modelling loss on the whole of the internet would lead to the amazing properties we see in large language models. There was substantial risk it wouldn’t work out, and the costs—opportunity and financial—to OpenAI would have been substantial.

These points are related. They stem from strong organizational, almost philosophical, differences. OpenAI is an exceptionally engineering-focused research company, concerned first and foremost with how to build systems that appear to have intelligence when interacted with. This stands in stark contrast to most academic machine learning, which is focused more on algorithmic understanding than system performance. Engineering-focused papers often have a hard time getting into conferences, with reviewers saying “clear reject” because of “lack of novelty”—it’s “just engineering”, after all.

Finally, in 2019 OpenAI had something to prove. They were commonly viewed as a company without clear focus. Now the shoe is on the other foot: DeepMind (and Google) have to respond.

…In attempting to answer this question I’m not primarily interested in whether DeepMind had the technical ability or resources to build and serve large language models—clearly they did and still do. My former DeepMind colleagues are extraordinarily talented and not to be underestimated. The race isn’t won yet.