“U.S. vs. China Rivalry Boosts Tech—And Tensions: Militarized AI Threatens a New Arms Race”, 2021-12-28:
…A year later, with much less fanfare, the Beijing Academy of Artificial Intelligence (BAAI) released an even larger model, Wu Dao 2.0, with 10× as many parameters (the neural network values that encode information). While GPT-3 boasts 175 billion parameters, Wu Dao 2.0’s creators claim it has a whopping 1.75 trillion. Moreover, the model can generate not only text, as GPT-3 does, but also images from textual descriptions, like OpenAI’s 12-billion-parameter DALL·E 1 model, and it uses a scaling strategy similar to Google’s 1.6-trillion-parameter Switch Transformer model.
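Both Wu Dao 2.0 and Switch Transformer reportedly reach such parameter counts through sparse mixture-of-experts layers: a router sends each token to just one of many expert feed-forward networks, so total parameters grow with the number of experts while compute per token stays roughly flat. Below is a minimal PyTorch sketch of top-1 expert routing in that spirit; the class and parameter names are hypothetical, and this is not the actual Wu Dao or Switch Transformer code.

```python
# Minimal sketch of Switch-Transformer-style top-1 mixture-of-experts routing.
# Hypothetical, illustrative code; not taken from Wu Dao 2.0 or Google's implementation.
import torch
import torch.nn as nn

class Top1MoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8):
        super().__init__()
        # Router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is an ordinary feed-forward block; total parameters scale with n_experts.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        gate = torch.softmax(self.router(x), dim=-1)   # routing probabilities
        probs, expert_idx = gate.max(dim=-1)           # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e
            if mask.any():
                # Scale by the gate probability so the router stays trainable by backprop.
                out[mask] = probs[mask].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: 10 tokens routed across 8 experts; each token touches only one expert's weights.
layer = Top1MoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```

With 8 experts this layer stores roughly 8× the feed-forward weights of a dense layer yet runs each token through only one of them, which is how headline parameter counts can climb into the trillions without a proportional rise in per-token compute.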
Tang Jie, the Tsinghua University professor leading the Wu Dao project, said in a recent interview that the group built an even bigger, 100-trillion-parameter model in June, though the group has not trained it to “convergence”, the point at which the model stops improving. “We just wanted to prove that we have the ability to do that”, Tang said…Tang says his group is now working on video, with the goal of generating realistic video from text descriptions. “Hopefully, we can make this model do something beyond the Turing test”, he says, referring to an assessment of whether a computer can generate text indistinguishable from that created by a human. “That’s our final goal.”
…Geoffrey Hinton instead helped to put deep learning on the map in 2012 with a now-famous neural net called AlexNet when he was at the University of Toronto. But Hinton was also in close contact with the Microsoft Research Lab in Redmond, Wash., before and after his group validated AlexNet, according to one of Hinton’s associates there, Li Deng, then principal researcher and manager and later chief scientist of AI at Microsoft.
In 2009 and 2010, Hinton and Deng worked together at Microsoft on speech recognition. Deng, then Editor-in-Chief of the IEEE Signal Processing Magazine, was invited in 2011 to lecture at several academic organizations in China, where he said he shared the published success of deep learning in speech processing. Deng said he was in close contact with former Microsoft colleagues at Baidu, a Chinese search engine and AI giant, and a company called iFlyTek, a spin-off from Deng’s undergraduate alma mater.
When Hinton’s group achieved its deep-learning breakthrough with AlexNet in 2012, he sent an email to Deng in Washington, and Deng said he shared it with Microsoft executives, including Qi Lu, who led the development of the company’s search engine, Bing. Deng said he also sent a note to his friends at iFlyTek, which quickly adopted the strategy and became an AI powerhouse, famously demonstrated in 2017 with a convincing video of then-President Donald Trump speaking Chinese.
Qi Lu went on to become COO of Baidu, where, Deng said, another Microsoft alum, Kai Yu, who also knew Hinton well, had already seized on Hinton’s breakthrough. Literally within hours of Hinton’s results, according to Deng, researchers in China were working on repeating his success.
See Also:
“ChinAI #141: The PanGu Origin Story: Notes from an informative Zhihu Thread on PanGu”
“Turing-NLG: A 17-billion-parameter language model by Microsoft”
“CPM: A Large-scale Generative Chinese Pre-trained Language Model”
“ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation”
“Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning”
“Microsoft announces new supercomputer, lays out vision for future AI work”