NVIDIA’s 2024 GTC event, running through March 21, delivered the plethora of announcements one would expect from a major tech conference. One stood out in founder and CEO Jensen Huang’s keynote: the next-generation Blackwell GPU architecture, enabling organisations to build and run real-time generative AI on trillion-parameter large language models.
“The future is generative… which is why this is a brand new industry,” Huang told attendees. “The way we compute is fundamentally different. We created a processor for the generative AI era.”
Yet this was not the only ‘next-gen’ announcement to come out of the San Jose gathering.
NVIDIA unveiled a blueprint for constructing the next generation of data centres, promising ‘highly efficient AI infrastructure’ with the support of partners ranging from Schneider Electric to data centre infrastructure firm Vertiv and simulation software provider Ansys.
The data centre, billed as fully operational, was demoed on the GTC show floor as a digital twin in NVIDIA Omniverse, a platform for building 3D tools, applications, and services. NVIDIA also announced Omniverse Cloud APIs, which help developers integrate core Omniverse technologies directly into existing design and automation software applications for digital twins.
The latest NVIDIA AI supercomputer is based on the NVIDIA GB200 NVL72 liquid-cooled system. It comprises two racks, each containing 18 NVIDIA Grace CPUs and 36 NVIDIA Blackwell GPUs, connected by fifth-generation NVIDIA NVLink switches.
Cadence, another partner cited in the announcement, plays a particular role thanks to its Cadence Reality digital twin platform, also unveiled at GTC as the ‘industry’s first comprehensive AI-driven digital twin solution to facilitate sustainable data centre design and modernisation.’ The upshot is a claimed improvement in data centre energy efficiency of up to 30%.
The platform was used in this demonstration for multiple purposes. Engineers unified and visualised multiple CAD (computer-aided design) datasets with ‘enhanced precision and realism’, and used Cadence’s Reality Digital Twin solvers to simulate airflow alongside the performance of the new liquid-cooling systems. Ansys’ software helped bring simulation data into the digital twin.
“The demo showed how digital twins can allow users to fully test, optimise, and validate data centre designs before ever producing a physical system,” NVIDIA noted. “By visualising the performance of the data centre in the digital twin, teams can better optimise their designs and plan for what-if scenarios.”
For all the promise of the Blackwell GPU platform, it needs somewhere to run – and the biggest cloud providers are very much involved in offering the NVIDIA Grace Blackwell platform. “The whole industry is gearing up for Blackwell,” as Huang put it.
NVIDIA Blackwell on AWS will ‘help customers across every industry unlock new generative artificial intelligence capabilities at an even faster pace’, a statement from the two companies noted. AWS has offered NVIDIA GPU instances since as far back as re:Invent 2010, and Huang appeared alongside AWS CEO Adam Selipsky in a noteworthy cameo at last year’s re:Invent.
The stack includes AWS’ Elastic Fabric Adapter networking, Amazon EC2 UltraClusters, and the AWS Nitro System virtualisation infrastructure. Exclusive to AWS is Project Ceiba, a collaboratively built AI supercomputer for the use of NVIDIA’s internal R&D team, which will also run on the Blackwell platform.
Microsoft and NVIDIA, expanding their longstanding collaboration, are also bringing the GB200 Grace Blackwell processor to Azure. The Redmond firm claims a first for Azure in integrating with Omniverse Cloud APIs. A demonstration at GTC showed how, using an interactive 3D viewer in Power BI, factory operators can see real-time factory data overlaid on a 3D digital twin of their facility.
Healthcare and life sciences are being touted as key industries for both AWS and Microsoft. The former is teaming up with NVIDIA to ‘expand computer-aided drug discovery with new AI models’, while the latter is promising that myriad healthcare stakeholders ‘will soon be able to innovate rapidly across clinical research and care delivery with improved efficiency.’
Google Cloud, meanwhile, has Google Kubernetes Engine (GKE) to its advantage. The company is integrating NVIDIA NIM microservices into GKE to help speed up generative AI deployment in enterprises, as well as making it easier to deploy the NVIDIA NeMo framework across its platform via GKE and Google Cloud HPC Toolkit.
Yet, fitting the ‘next-gen’ theme, it is not the case that only hyperscalers need apply. NexGen Cloud is a cloud provider focused on sustainable infrastructure as a service; its Hyperstack platform, powered by 100% renewable energy, offers self-service, on-demand GPU as a service. The NVIDIA H100 GPU is the flagship offering, and the company made headlines in September by touting a $1 billion European AI supercloud promising more than 20,000 H100 Tensor Core GPUs at completion.
NexGen Cloud announced that NVIDIA Blackwell platform-powered compute services will be part of the AI supercloud. “Through Blackwell-powered solutions, we will be able to equip customers with the most powerful GPU offerings on the market, empowering them to drive innovation, whilst achieving unprecedented efficiencies,” said Chris Starkey, CEO of NexGen Cloud.