Nvidia announces server “superchips”, with and without GPUs


At its GPU Technology Conference (GTC) last year, Nvidia announced that it would be releasing its own server chip called Grace based on the Arm Neoverse v9 server architecture. At the time, details were scarce, but this week Nvidia revealed the details, and they are remarkable.

With Grace, customers have two options, both of which Nvidia calls superchips. The first is the Grace Hopper Superchip, which was formally introduced last year but described only in broad terms. It consists of a 72-core CPU and a Hopper H100 GPU tightly connected by Nvidia’s new high-speed NVLink-C2C chip-to-chip interconnect, which has a transfer speed of 900 GB/s.

The second, announced this week, is the Grace CPU Superchip, which has no GPU. Instead, it has two 72-core CPU chips linked together via NVLink. Even without the H100 GPU, the Grace CPU Superchip posts strong benchmark numbers. Nvidia claims SPECrate2017_int_base performance more than 1.5 times that of the dual high-end AMD Epyc “Rome” processors that already ship in Nvidia’s DGX A100 server.

The two superchips will serve two different markets, according to Paresh Kharya, senior director of product management and marketing at Nvidia. The Grace Hopper Superchip is intended for giant-scale AI and HPC, with a focus on the bottleneck of CPU system memory, he said.

“Bandwidth is limited, and when you connect the CPU and GPU in a traditional server, the flow of data from system memory to the GPU is impeded by the PCIe slot,” he said. “So, by putting the two chips together and interconnecting them with our NVLink interconnect, we can unlock this memory.”

The Grace CPU Superchip and Grace Hopper Superchip eschew standard DRAM memory sticks in favor of LPDDR5X memory. The memory sits in the chip package, physically right next to the processors themselves, rather than on memory sticks in DIMM slots. This direct connection offers up to 1 TB/s of bandwidth while supporting in-memory error correction. Kharya said the memory performance is up to 30 times faster than Nvidia’s current Ampere technology, which uses traditional DIMM memory.

With the Grace CPU Superchip, Nvidia has a different focus. First, it puts both processors and the LPDDR5X memory in a single package with a power draw of 500 watts, which Nvidia says is twice as energy efficient as mainstream processors. It may be more than that. A dual-socket x86 server will easily exceed 500 watts and have nowhere near as many cores. And that does not take into account the power draw of the memory.

The memory bandwidth of the Grace CPU Superchip will benefit a range of applications not yet accelerated for GPUs.

Another potential market for the Grace CPU Superchip is AI inference. Some inference tasks require a lot of pre- and post-processing that must happen on the CPU, while other parts of the application run on the GPU. Kharya also cited data analytics as a big potential market.

“There is a long queue of applications that have not yet been accelerated on GPUs. They would benefit immediately. They will really appreciate the high memory bandwidth to process faster as well as the speed of the CPU cores,” Kharya said.

Nvidia said the Grace CPU Superchip and Grace Hopper Superchip are expected to ship by the end of this year or early next year.


Copyright © 2022 IDG Communications, Inc.

