“Microsoft Announces New Supercomputer, Lays out Vision for Future AI Work”, 2020-05-19 (; backlinks; similar):
Microsoft has built one of the top five publicly disclosed supercomputers in the world, making new infrastructure available in Azure to train extremely large artificial intelligence models, the company is announcing at its Build developers conference.
Built in collaboration with and exclusively for OpenAI, the supercomputer hosted in Azure was designed specifically to train that company’s AI models. It represents a key milestone in a partnership announced last year to jointly create new supercomputing technologies in Azure.
…It’s also a first step toward making the next generation of very large AI models and the infrastructure needed to train them available as a platform for other organizations and developers to build upon. “The exciting thing about these models is the breadth of things they’re going to enable”, said Microsoft Chief Technical Officer Kevin Scott, who said the potential benefits extend far beyond narrow advances in one type of AI model. “This is about being able to do a hundred exciting things in natural language processing at once and a hundred exciting things in computer vision, and when you start to see combinations of these perceptual domains, you’re going to have new applications that are hard to even imagine right now”, he said…Microsoft is also exploring large-scale AI models that can learn in a generalized way across text, images and video. That could help with automatic captioning of images for accessibility in Office, for instance, or improve the ways people search Bing by understanding what’s inside images and videos.
…The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server. Compared with other machines listed on the TOP500 supercomputers in the world, it ranks in the top five, Microsoft says. Hosted in Azure, the supercomputer also benefits from all the capabilities of a robust modern cloud infrastructure, including rapid deployment, sustainable datacenters and access to Azure services.
[As of 2023-01-08, OA still uses this cluster: “Yes, but evolved. This was used to train the original GPT-3. Lots of enhancements since.”]