“Google’s Newest AI Model Uses Nearly 5× More Text Data for Training Than Its Predecessor”, Jennifer Elias, 2023-05-17:

Google’s new large language model, which the company announced last week, uses almost 5× as much training data as its predecessor from 2022, allowing it to perform more advanced coding, math and creative writing tasks, CNBC has learned.

PaLM 2, the company’s new general-use large language model (LLM) that was unveiled at Google I/O, is trained on 3.6 trillion tokens, according to internal documentation viewed by CNBC.

…Since unveiling PaLM 2, Google has said the new model is smaller than prior LLMs, which is important because it means the company’s technology is becoming more efficient while accomplishing more sophisticated tasks. PaLM 2, according to internal documents, is trained on 340 billion parameters, an indication of the complexity of the model. The initial PaLM was trained on 540 billion parameters.
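For context on those figures: the article’s “trained on 340 billion parameters” presumably means the model *has* ~340B parameters, and the “nearly 5×” headline follows from comparing token counts, since the original PaLM was reported (in the PaLM paper, not in this article) as trained on roughly 780 billion tokens. A quick back-of-the-envelope check of the ratios, with the 780B figure treated as an outside assumption:

```python
# Rough arithmetic behind the article's figures. The PaLM 1 token count
# (~780B) is taken from the PaLM paper, not from the CNBC article.
palm1_params = 540e9    # parameters, per the article
palm1_tokens = 780e9    # training tokens, per the PaLM paper (assumption)
palm2_params = 340e9    # parameters, per the leaked internal documents
palm2_tokens = 3.6e12   # training tokens, per the leaked internal documents

print(f"data increase: {palm2_tokens / palm1_tokens:.1f}x")       # ~4.6x ("nearly 5x")
print(f"PaLM 1 tokens/param: {palm1_tokens / palm1_params:.1f}")  # ~1.4
print(f"PaLM 2 tokens/param: {palm2_tokens / palm2_params:.1f}")  # ~10.6
```

So PaLM 2 is both smaller and trained on far more data per parameter, which is what the next paragraph’s “compute-optimal scaling” language is pointing at.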

Google said in a blog post about PaLM 2 that the model uses a “new technique” called “compute-optimal scaling.” [ie. Chinchilla?] That makes the LLM “more efficient with overall better performance, including faster inference, fewer parameters to serve, and a lower serving cost.”
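The bracketed guess looks right in spirit: “compute-optimal scaling” reads like the Chinchilla result (Hoffmann et al. 2022), which says that for a fixed training-compute budget, parameter count and token count should be scaled up together, with a rule of thumb of roughly 20 tokens per parameter. A minimal sketch of that heuristic (the 20:1 ratio and the C ≈ 6·N·D FLOPs approximation come from the Chinchilla paper, not from Google’s blog post):

```python
# Minimal sketch of Chinchilla-style compute-optimal scaling (Hoffmann et
# al. 2022): for a training budget C ≈ 6 * N * D FLOPs, loss is roughly
# minimized by scaling params N and tokens D together, with D ≈ 20 * N.
def compute_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Return (params, tokens) that roughly exhaust `compute_flops`."""
    # C = 6 * N * D and D = r * N  =>  N = sqrt(C / (6 * r))
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Example: the compute implied by the leaked PaLM 2 numbers (340B params,
# 3.6T tokens) is ~6 * 340e9 * 3.6e12 ≈ 7.3e24 FLOPs; a strict 20:1
# Chinchilla allocation of that budget would be somewhat smaller and
# trained on somewhat more tokens.
n, d = compute_optimal(6 * 340e9 * 3.6e12)
print(f"params ≈ {n / 1e9:.0f}B, tokens ≈ {d / 1e12:.1f}T")  # ≈247B params, ≈4.9T tokens
```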


CNBC leaks PaLM2-L training config, says it is: