The A3 supercomputer's scale can provide up to 26 exaFlops of AI performance, Google says.

Google Cloud announced a new supercomputer virtual-machine series aimed at rapidly training large AI models. Unveiled at the Google I/O conference, the new A3 supercomputer VMs are purpose-built to handle the considerable resource demands of a large language model (LLM).

"A3 GPU VMs were purpose-built to deliver the highest-performance training for today's ML workloads, complete with modern CPU, improved host memory, next-generation Nvidia GPUs and major network upgrades," the company said in a statement.

Each instance is powered by eight Nvidia H100 GPUs, Nvidia's newest GPU, which began shipping earlier this month, as well as Intel's 4th Generation Xeon Scalable processors, 2TB of host memory, and 3.6 TB/s of bisectional bandwidth between the eight GPUs via Nvidia's NVSwitch and NVLink 4.0 interconnects.

Altogether, Google claims these machines can provide up to 26 exaFlops of performance. That figure is the cumulative performance of the entire supercomputer, not of each individual instance. Still, it blows away the record held by the fastest traditional supercomputer, Frontier, which comes in at just a little over one exaFlop.

According to Google, A3 is the first production-level deployment of its GPU-to-GPU data interface, which Google calls the infrastructure processing unit (IPU). It allows data to be shared at 200 Gbps directly between GPUs without having to go through the CPU. The result is a ten-fold increase in available network bandwidth for A3 virtual machines compared with prior-generation A2 VMs.

A3 workloads will run on Google's specialized Jupiter data center networking fabric, which the company says "scales to tens of thousands of highly interconnected GPUs and allows for full-bandwidth reconfigurable optical links that can adjust the topology on demand."

Google will offer the A3 in two ways: customers can run it themselves, or they can use it as a managed service where Google handles most of the work. Customers who opt to run it themselves deploy the A3 VMs on Google Kubernetes Engine (GKE) or Google Compute Engine (GCE), as sketched below; with the managed service, the VMs run on Vertex, the company's managed machine-learning platform.

The A3 virtual machines are available in preview, which requires filling out an application to join the Early Access Program. Google makes no promises you will get a spot in the program.
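For customers who take the self-managed route on Compute Engine, requesting an A3 instance looks much like creating any other GPU VM. The sketch below uses the google-cloud-compute Python client to do so; the a3-highgpu-8g machine-type name, zone, boot image, and maintenance settings are assumptions for illustration, not details confirmed in Google's announcement, so check the Compute Engine documentation for the values that apply to your project.

```python
# Minimal sketch: requesting an A3 VM with the google-cloud-compute client.
# Assumptions (not from Google's announcement): the "a3-highgpu-8g" machine
# type name, the zone, the boot image, and the maintenance policy.
from google.cloud import compute_v1


def create_a3_instance(project_id: str, zone: str, name: str) -> None:
    instance = compute_v1.Instance()
    instance.name = name
    # A3 machine types are assumed to bundle the eight H100 GPUs, so no
    # separate guest_accelerators block is attached in this sketch.
    instance.machine_type = f"zones/{zone}/machineTypes/a3-highgpu-8g"

    # Boot disk; the image family here is a placeholder.
    boot_disk = compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            source_image="projects/debian-cloud/global/images/family/debian-12",
            disk_size_gb=200,
        ),
    )
    instance.disks = [boot_disk]

    # Attach the default VPC network.
    instance.network_interfaces = [
        compute_v1.NetworkInterface(network="global/networks/default")
    ]

    # GPU VMs generally cannot live-migrate, so terminate on host maintenance.
    instance.scheduling = compute_v1.Scheduling(on_host_maintenance="TERMINATE")

    operation = compute_v1.InstancesClient().insert(
        project=project_id, zone=zone, instance_resource=instance
    )
    operation.result()  # block until the create operation finishes


if __name__ == "__main__":
    create_a3_instance("my-project", "us-central1-a", "a3-demo-vm")
```

On GKE the equivalent step would be requesting the same machine type for a node pool; either way, actual availability depends on acceptance into the Early Access Program described above.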