Google says its TPU v4 supercomputer is more powerful and efficient than ever thanks to optical circuit switching technology and architecture, and that it poses a challenge to Nvidia.

A new white paper from Google details the company's use of optical circuit switches in its machine learning training supercomputer, arguing that the TPU v4 model with those switches in place offers better performance and greater energy efficiency than general-purpose processors. Google's Tensor Processing Units, the basic building blocks of the company's AI supercomputing systems, are essentially ASICs, meaning that their functionality is built in at the hardware level, as opposed to the general-purpose CPUs and GPUs used in many AI training systems.

The white paper details how, by interconnecting more than 4,000 TPUs through optical circuit switching, Google has been able to achieve speeds 10 times faster than previous models while consuming less than half as much energy.

Aiming for AI performance, price breakthroughs

The key, according to the white paper, is the way optical circuit switching (performed here by switches of Google's own design) enables dynamic changes to the interconnect topology of the system. Compared with a system like InfiniBand, which is commonly used in other HPC areas, Google says that its system is cheaper, faster, and considerably more energy efficient.

"Two major architectural features of TPU v4 have small cost but outsized advantages," the paper said.
"The SparseCore [data flow processors] accelerates embeddings of [deep learning] models by 5x-7x by providing a dataflow sea-of-cores architecture that allows embeddings to be placed anywhere in the 128 TiB physical memory of the TPU v4 supercomputer."

According to Peter Rutten, research vice president at IDC, the efficiencies described in Google's paper are in large part due to the inherent characteristics of the hardware being used: well-designed ASICs are almost by definition better suited to their specific task than general-purpose processors attempting the same thing.

"ASICs are very performant and energy efficient," he said. "If you hook them up to optical circuit switches where you can dynamically configure the network topology, you have a very fast system."

While the system described in the white paper is for Google's internal use only at this point, Rutten noted that the lessons of the technology involved could have broad applicability for machine learning training.

"I would say it has implications in the sense that it offers them a sort of best-practices scenario," he said. "It's an alternative to GPUs, so in that sense it's definitely an interesting piece of work."

Google-Nvidia comparison is unclear

While Google also compared TPU v4's performance to systems using Nvidia's A100 GPUs, which are common HPC components, Rutten noted that Nvidia has since released the much faster H100 processors, which may shrink any performance difference between the systems.

"They're comparing it to an older-gen GPU," he said. "But in the end it doesn't really matter, because it's Google's internal process for developing AI models, and it works for them."
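The reconfigurability at the heart of Google's design can be pictured with a toy model. The sketch below is purely illustrative and is not Google's hardware or software: it treats an optical circuit switch as a programmable patch panel that maps ports to ports, so the same physical wiring can realize different topologies on demand. The class name, port counts, and pairings are all invented for the example.

```python
class OpticalCircuitSwitch:
    """Toy model of an optical circuit switch (OCS).

    An OCS sets up point-to-point light paths between ports, so each
    port is connected to at most one peer. Changing the topology means
    reprogramming the mapping, not rewiring hardware.
    """

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.circuits = {}  # port -> peer port

    def configure(self, pairs):
        """Tear down all circuits and establish a new topology in one step."""
        circuits = {}
        for a, b in pairs:
            if not (0 <= a < self.num_ports and 0 <= b < self.num_ports):
                raise ValueError("port out of range")
            if a in circuits or b in circuits:
                raise ValueError("port already in use")
            circuits[a] = b
            circuits[b] = a
        self.circuits = circuits

    def peer(self, port):
        """Return the port currently connected to `port`, or None."""
        return self.circuits.get(port)


# The same 8-port switch realizing two different topologies:
ocs = OpticalCircuitSwitch(8)
ocs.configure([(0, 1), (2, 3), (4, 5), (6, 7)])
assert ocs.peer(0) == 1
ocs.configure([(0, 7), (1, 2), (3, 4), (5, 6)])  # reconfigured, no rewiring
assert ocs.peer(0) == 7
```

In this simplified picture, the dynamic topology changes the white paper describes amount to calls to `configure`: the interconnect can be reshaped to suit the job at hand, which is the property Rutten highlights when he notes that ASICs hooked to dynamically configurable optical switches make for a very fast system.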