“DeepMind’s Latest AI Breakthrough Is Its Most Important Yet: Google-Owned DeepMind’s Go-Playing Artificial Intelligence Can Now Learn without Human Help… or Data”, Matt Burgess, 2017-10-18:

DeepMind’s human-conquering AlphaGo AI just got even smarter. The firm’s latest Go-playing system not only defeated all previous versions of the software, it did it all by itself. “The most striking thing for me is we don’t need any human data anymore”, says Demis Hassabis, the CEO and co-founder of DeepMind. While the first version of AlphaGo needed to be trained on data from more than 100,000 human games, AlphaGo Zero can learn to play from a blank slate. Not only has DeepMind removed the need for the initial human data input, Zero is also able to learn faster than its predecessor.

David Silver, the main programmer on DeepMind’s Go project, says the original AlphaGo that defeated 18-time world champion Lee Sedol 4–1 required several months of training. “We reached a superior level of performance after training for just 72 hours with AlphaGo Zero”, he says. Only 4.9 million simulated games were needed to train Zero, compared to the original AlphaGo’s 30 million. After the 3 days of learning Zero was able to defeat the Lee Sedol-conquering version 100–0. After it had been playing the game for 40 days, Zero defeated DeepMind’s previous strongest version of AlphaGo, called Master, which defeated Chinese master Ke Jie in May.

…When Zero played a game against itself, it was given feedback from the system. A +1 is given if it wins and a −1 if it loses. After each game the neural network behind Zero automatically reconfigures to a new, theoretically better, version. On average the system took 0.4 seconds of thinking time before making a move.
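The +1/−1 feedback described above can be illustrated with a toy sketch. This is not DeepMind's code: the game of Nim stands in for Go, a uniformly random move picker stands in for Zero's neural-network-guided play, and the function name `self_play_nim` is hypothetical. It shows only how one finished self-play game might be turned into labelled training examples.

```python
import random


def self_play_nim(stones=10, rng=None):
    """Play one game of Nim against itself and label every position.

    Rules (toy stand-in for Go): players alternate taking 1-3 stones,
    and whoever takes the last stone wins.
    Returns a list of (stones_remaining, outcome) pairs, where outcome is
    +1 if the player to move at that position went on to win, else -1.
    """
    if stones < 1:
        raise ValueError("need at least one stone to play")
    rng = rng or random.Random()
    history = []  # (stones_remaining, player_to_move)
    player = 0
    winner = None
    while stones > 0:
        history.append((stones, player))
        # Assumption: a random move replaces the learned policy.
        take = rng.randint(1, min(3, stones))
        stones -= take
        if stones == 0:
            winner = player  # this player took the last stone
        player = 1 - player
    # Feedback step: +1 for the winner's positions, -1 for the loser's.
    return [(s, 1 if p == winner else -1) for s, p in history]


if __name__ == "__main__":
    for position, outcome in self_play_nim(10, random.Random(0)):
        print(position, outcome)
```

In a real system these labelled positions would be used to update the network before the next round of self-play; that update step is omitted here.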

“In the original version, we tried this a couple of years ago and it would collapse”, Hassabis says. He cites DeepMind’s “novel” reinforcement algorithms for Zero’s new ability to learn without prior knowledge. Additionally, the new system uses only one neural network instead of 2, and runs on 4 of Google’s AI processors compared to the 48 needed to beat Lee. During the development of Zero, Hassabis says the system was trained on hardware that cost the company as much as $35 million in 2017 dollars (about $42.62 million adjusted for inflation). The hardware is also used for other DeepMind projects.