“AI Search: The Bitter-Er Lesson”, 2024-06-10:
[cf 2020] …In 2019, a team of researchers built a cracked chess computer. She was called Leela Chess Zero—‘zero’ because she started knowing only the rules. She learned by playing against herself billions of times. She made moves that overturned centuries of human chess canon. She was inventive and made long-term sacrifices. Leela played with her food and exhibited weird human tendencies. She won the world championship. And then she was utterly destroyed by Stockfish.
I loved Leela. I had sunk years into studying, benchmarking, and researching her. As a kid, I always wondered what it would be like to meet superintelligent aliens and have them tell us how they play chess. There was a moment, watching Leela play, when I realized I had just found out.
Leela’s magic, of course, was in deep learning. By teaching herself, she gained deeper chess representations than humans could ever hard-code. Years later, I still think Leela is the best example of The Bitter Lesson. Leela put aside human arrogance and figured stuff out by herself.
Leela also proved scaling laws before they were cool. In 2018, others on the team and I noticed that larger networks consistently outperformed smaller ones, position for position. We even observed remarkable emergent properties—larger networks seemed to ‘look ahead’ several moves without explicit instruction or search.
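Scaling laws of the sort described above are usually summarized as a power law: loss falls roughly as a * N^(-b) in model size N, so a log-log plot of loss against parameters is close to a straight line. Here is a minimal sketch of fitting such a curve; the data points and the variable names are made up for illustration, not Leela’s actual metrics.

```python
import math

# (parameter count, self-play loss) — illustrative numbers, not real Leela data
data = [(1e6, 1.20), (1e7, 0.95), (1e8, 0.75), (1e9, 0.60)]

# In log-log space the power law loss = a * N**(-b) becomes a line:
#   log(loss) = log(a) - b * log(N)
# so an ordinary least-squares line fit recovers a and b.
xs = [math.log(n) for n, _ in data]
ys = [math.log(loss) for _, loss in data]
n = len(data)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = -slope                      # power-law exponent
a = math.exp(my - slope * mx)   # power-law coefficient

print(f"loss ≈ {a:.2f} * N^(-{b:.3f})")
```

With these toy numbers the fit recovers an exponent around 0.1, i.e. each 10x in parameters shaves roughly 20% off the loss—the qualitative shape of the curves the team was watching.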
So, in 2020, the Leela team raced to train larger networks. The team sourced compute from corporate donors and friends’ GTX 1070s. We feverishly tracked self-play metrics the way many people track WandB loss curves today. Just before the world championship, Leela’s largest model came out of the oven. And then she brutally lost.
…Stockfish had better search…Leela shook chess up because she threw out human chess knowledge and learned for herself. At the time, Stockfish’s ability to grind out billions of positions didn’t matter because its understanding of each position was kneecapped by its human creators.
To fix this, the Stockfish team heisted Leela’s deep learning techniques and trained a model hundreds of times smaller than the top Leela model.
After they trained their tiny model, they threw it into their search pipeline, and Stockfish crushed Leela overnight.
The Stockfish team utterly rejected scaling laws. They went backward and made a smaller model. But, because their search algorithm was more efficient, took better advantage of hardware, and saw further ahead, they won.
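The search half of this recipe can be sketched in a few lines: negamax with alpha-beta pruning, where a cheap evaluation function is called at the leaves. This is a toy illustration of the general technique, not Stockfish’s actual code—the tree, the evaluation, and all the names are mine. In an engine like Stockfish, the `evaluate` call at the leaves is where the small neural network sits, so a faster, deeper search multiplies whatever the model knows.

```python
def negamax(node, depth, alpha, beta, evaluate, children):
    """Best score for the side to move at `node`, searching `depth` plies."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)  # in an NNUE-style engine, a tiny NN runs here
    best = float("-inf")
    for child in kids:
        # Negate: a good position for the opponent is bad for us.
        score = -negamax(child, depth - 1, -beta, -alpha, evaluate, children)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:  # prune: the opponent would never allow this line
            break
    return best

# Toy game tree: inner nodes are dicts of moves, leaves are raw scores
# (from the perspective of the side to move at the leaf's parent).
tree = {
    "a": {"a1": 3, "a2": 5},
    "b": {"b1": 6, "b2": 9},
}
evaluate = lambda node: node  # leaves already carry their scores
children = lambda node: list(node.values()) if isinstance(node, dict) else []

print(negamax(tree, 2, float("-inf"), float("inf"), evaluate, children))  # → 6
```

The root picks branch "b" because even the opponent’s best reply there (6) beats the best it can force in branch "a" (3)—the same logic, run billions of times per second over a real board representation, is what let a small model see further than a big one.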
The Bitter-er Lesson is that, in a world of fancy deep learning, you shouldn’t discount the power of AI search.