Nvidia has announced a “multi-year collaboration” with Microsoft to build “one of the most powerful AI supercomputers in the world,” designed to handle the massive computing workloads needed to train and scale AI. The collaboration will see Nvidia use Microsoft’s scalable virtual machine instances to accelerate advances in generative AI models like DALL-E.
Based on Microsoft’s Azure cloud infrastructure, the AI supercomputer will use tens of thousands of Nvidia’s powerful H100 and A100 data center GPUs along with its Quantum-2 InfiniBand networking platform. According to Nvidia, the combination of Microsoft’s Azure cloud platform with Nvidia’s GPUs, networking, and full stack of AI software will allow more enterprises to train, deploy, and scale AI — including large, state-of-the-art models. The two companies will also collaborate on optimizing DeepSpeed, Microsoft’s deep learning optimization software.
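For context on what “deep learning optimization software” means in practice, the sketch below shows roughly how DeepSpeed wraps an ordinary PyTorch model for memory-efficient, multi-GPU training. The model, configuration values, and single-process setup are illustrative assumptions for this article, not details from the Nvidia or Microsoft announcement.

```python
# Minimal sketch: wrapping a PyTorch model with DeepSpeed's ZeRO optimizer.
# In real use this is launched across GPUs with the `deepspeed` CLI launcher.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a much larger transformer

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},               # mixed precision on the GPU
    "zero_optimization": {"stage": 2},       # partition optimizer state across GPUs
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns an engine that handles distributed training,
# mixed precision, and memory partitioning behind a familiar training loop.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

inputs = torch.randn(32, 1024).to(model_engine.device)
loss = model_engine(inputs).sum()
model_engine.backward(loss)   # engine-managed backward pass
model_engine.step()           # engine-managed optimizer step
```

The appeal of this kind of software on hardware like the Azure supercomputer described above is that the same training loop scales from one GPU to thousands without the model code changing.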
The explosive growth of AI has increased demand for supercomputers capable of scaling with it
In a statement, Nvidia said the supercomputer could be used to “research and further accelerate advances in generative AI,” a relatively new class of AI models — like DALL-E and Stable Diffusion — that use self-learning algorithms to create a diverse range of content, such as text, code, digital images, video, and audio. These models have seen rapid growth in recent years, which has significantly raised demand for powerful computing infrastructure capable of scaling alongside their development.
“AI technology advances as well as industry adoption are accelerating. The breakthrough of foundation models has triggered a tidal wave of research, fostered new startups and enabled new enterprise applications,” said Nvidia vice president of enterprise computing Manuvir Das. “Our collaboration with Microsoft will provide researchers and companies with state-of-the-art AI infrastructure and software to capitalize on the transformative power of AI.”