Nvidia GTC: New 400 Gbps InfiniBand networking platform - Embedded.com

New at Nvidia GTC is a 400 Gbps InfiniBand switch and networking platform that delivers secure, cloud-native, multi-tenant, bare-metal performance for AI, data analytics, and HPC applications.

Among its several announcements at Nvidia GTC this week was the new Nvidia Quantum-2, which the company said is the next generation of its InfiniBand networking platform for delivering the performance, accessibility and strong security needed by cloud computing providers and supercomputing centers.

The Quantum-2 is an advanced 400Gbps InfiniBand end-to-end networking platform that consists of the Quantum-2 switch, the ConnectX-7 network adapter, the BlueField-3 data processing unit (DPU) and all the software that supports the new architecture.

[Image: The NVIDIA Quantum-2 platform — Quantum-2 switch, ConnectX-7 adapters, LinkX cables]

The introduction of Quantum-2 comes as supercomputing centers are increasingly opening to multitudes of users, many from outside their organizations. At the same time, the world’s cloud service providers are beginning to offer more supercomputing services to their millions of customers. Hence Quantum-2 includes key features required for demanding workloads running in either arena. Supercharged by cloud-native technologies, it provides high performance with 400 gigabits per second of throughput and advanced multi-tenancy to accommodate many users.

“The requirements of today’s supercomputing centers and public clouds are converging,” said Gilad Shainer, senior vice president of networking at Nvidia. “They must provide the greatest performance possible for next-generation HPC, AI and data analytics challenges, while also securely isolating workloads and responding to varying demands of user traffic. This vision of the modern data center is now real with Nvidia Quantum-2 InfiniBand.”

Cloud-native capabilities
At 400Gbps, Nvidia said the Quantum-2 InfiniBand platform doubles the network speed and triples the number of network ports. It accelerates performance by 3x and reduces the number of data center fabric switches needed by 6x, while cutting both data center power consumption and data center space by 7 percent. Multi-tenant performance isolation keeps the activity of one tenant from disturbing others, using an advanced telemetry-based congestion control system with cloud-native capabilities that ensures reliable throughput regardless of spikes in users or workload demands.

The Quantum-2 SHARPv3 in-network computing technology provides 32x more acceleration engines for AI applications compared with the previous generation. Advanced InfiniBand fabric management for data centers, including predictive maintenance, is enabled with the Nvidia UFM cyber-AI platform.

A nanosecond-precision timing system integrated into Quantum-2 can synchronize distributed applications, like database processing, helping to reduce the overhead of wait and idle times. This new capability allows cloud data centers to become part of the telecommunications network and host software-defined 5G radio services.

InfiniBand switch
At the heart of the new platform is the new Quantum-2 InfiniBand switch. With 57 billion transistors on 7-nanometer silicon, it is slightly bigger than the Nvidia A100 GPU with its 54 billion transistors. It features 64 ports at 400Gbps or 128 ports at 200Gbps and will be offered in a variety of switch systems scaling up to 2,048 ports at 400Gbps or 4,096 ports at 200Gbps — more than 5x the switching capability of the previous generation, Quantum-1.
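As a quick sanity check on the port figures above, the two per-chip configurations multiply out to the same aggregate switching bandwidth, and the largest quoted systems scale that by 32x. The helper function below is purely illustrative and not an Nvidia API; the port counts and per-port speeds are the ones quoted in the article.

```python
def aggregate_gbps(ports: int, gbps_per_port: int) -> int:
    """One-way aggregate switching bandwidth in Gbps (illustrative only)."""
    return ports * gbps_per_port

# Single switch chip: 64 ports at 400Gbps or 128 ports at 200Gbps —
# both work out to the same 25.6 Tbps of aggregate bandwidth.
assert aggregate_gbps(64, 400) == aggregate_gbps(128, 200) == 25_600

# Largest quoted switch systems: 2,048 ports at 400Gbps or 4,096 at 200Gbps.
print(aggregate_gbps(2048, 400))  # 819200 Gbps, i.e. ~819.2 Tbps
print(aggregate_gbps(4096, 200))  # 819200 Gbps — same aggregate either way
```

In other words, the 200Gbps configurations trade per-port speed for port count at constant aggregate bandwidth, which is why the platform can triple port density while doubling per-port speed.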

The combination of networking speed, switching capability and scalability makes the platform well suited to building the next generation of giant HPC systems.

The Quantum-2 switch is now available from a wide range of leading infrastructure and system vendors around the world, including Atos, DataDirect Networks (DDN), Dell Technologies, Excelero, GIGABYTE, HPE, IBM, Inspur, Lenovo, NEC, Penguin Computing, QCT, Supermicro, VAST Data and WekaIO.

Quantum-2, ConnectX-7 and BlueField-3
The Quantum-2 platform provides two networking end-point options: the ConnectX-7 network adapter and the BlueField-3 InfiniBand DPU.

ConnectX-7, with 8 billion transistors in a 7-nanometer design, doubles the data rate of the world’s current leading HPC networking chip, the ConnectX-6. It also doubles the performance of RDMA, GPUDirect Storage, GPUDirect RDMA and in-network computing. The ConnectX-7 will sample in January.

BlueField-3 InfiniBand, with 22 billion transistors in a 7-nanometer design, offers sixteen 64-bit Arm CPUs to offload and isolate the data center infrastructure stack. BlueField-3 samples in May.
