Xilinx has upped its game in addressing performance bottlenecks in networking and data centers, with a new series in its Versal adaptive compute acceleration platform (ACAP) portfolio that integrates high bandwidth memory (HBM) to enable fast compute acceleration for massive, connected data sets with fewer and lower cost servers.
Its new Versal HBM series integrates advanced HBM2e DRAM, providing 820GB/s of throughput and 32GB of capacity for 8X more memory bandwidth and 63% lower power than DDR5 implementations (a comparison based on a typical system implementation of four DDR5-6400 components). Xilinx said the Versal HBM series is architected to keep up with the higher memory needs of the most compute intensive, memory bound applications for data center, wired networking, test and measurement, and aerospace and defense.
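As a rough sanity check on the 8X figure, the arithmetic can be sketched in a few lines of Python. The 820GB/s HBM throughput and the four DDR5-6400 components come from Xilinx's comparison. The 32-bit data width per DDR5 component is our assumption, since the width was not stated, and it is chosen here because it reproduces the quoted ratio.

```python
# Back-of-the-envelope check of the "8X more memory bandwidth" claim.
# Figures marked ASSUMPTION are ours and were not given by Xilinx.

HBM_BANDWIDTH_GBPS = 820         # GB/s, quoted for the Versal HBM series
DDR5_TRANSFER_RATE_MTPS = 6400   # DDR5-6400 means 6400 mega-transfers per second
DDR5_COMPONENTS = 4              # "four DDR5-6400 components" in the comparison
ASSUMED_BITS_PER_COMPONENT = 32  # ASSUMPTION: data width per component


def ddr5_bandwidth_gbps(transfer_rate_mtps, bus_width_bits, components):
    """Peak theoretical bandwidth in GB/s for a group of DDR5 components."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return transfer_rate_mtps * bytes_per_transfer * components / 1000


ddr5_total = ddr5_bandwidth_gbps(
    DDR5_TRANSFER_RATE_MTPS, ASSUMED_BITS_PER_COMPONENT, DDR5_COMPONENTS
)
ratio = HBM_BANDWIDTH_GBPS / ddr5_total

print(f"DDR5 aggregate: {ddr5_total:.1f} GB/s")
print(f"HBM2e vs DDR5:  {ratio:.2f}x")
```

Under that assumption the four DDR5 components peak at about 102GB/s, which puts the HBM2e figure at roughly eight times higher. A narrower or wider assumed data path would change the ratio proportionally.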
In a briefing with embedded.com, Mike Thompson, the senior product line manager for Xilinx Versal FPGAs, said, “There are three major trends at the moment: exponential growth of network traffic and data to be processed; DDR bandwidth availability which leads to performance bottlenecks; and the third is data security. Versal HBM increases the capacity of each of these three layers, particularly since bandwidth and security requirements are outpacing current processing and memory technologies.”
The Versal HBM series utilizes high-bandwidth memory integrated using stacked silicon interconnect (SSI) based on TSMC’s CoWoS (chip on wafer on substrate) 3D stacking technology. Thompson said this heterogeneous integration is a key part of addressing the so-called end of Moore’s Law. He said traditional architectures are bottlenecked on memory and network access for real-time applications.
The Versal HBM series uses the foundation provided by Xilinx Versal Premium, but replaces one super logic region (SLR) in the device with the HBM2e stack, and another SLR with an integrated HBM controller. This enables an architecture for fast data movement and adaptive processing through the integration of a networked intellectual property (IP) and memory subsystem. Thompson indicated that the Versal HBM integrates the equivalent of 14 FPGAs (compared with Xilinx Virtex UltraScale+), and replaces 32 DDR5 chips with integrated HBM.
The new HBM platform incorporates power-optimized networking cores for high bandwidth, secure connectivity. The Versal HBM series offers 5.6Tb/s of serial bandwidth with 112Gb/s PAM4 transceivers, 2.4Tb/s of scalable Ethernet bandwidth, 1.2Tb/s of line rate encryption throughput, 600Gb/s of Interlaken connectivity, and 1.5Tb/s of PCIe Gen5 bandwidth with built-in DMA, supporting both CCIX and CXL. This broad set of hardened IP provides off-the-shelf, multi-terabit networked connectivity for a breadth of protocols, data rates, and optical standards, enabling optimal power and performance and fast time to market.
As an adaptive, heterogeneous compute platform, the Versal HBM series is engineered to accelerate a wide range of workloads with large data sets, integrating adaptable engines for low-latency hardware parallelism, DSP engines for AI inference and signal processing, and scalar engines for embedded compute, platform management, and secure boot and configuration. Unlike fixed-function accelerators, the Versal HBM series can dynamically reconfigure hardware in milliseconds to adapt to evolving algorithms and emerging protocols, eliminating the need for hardware redesign and re-deployment. Thompson told us, “This with adaptable compute is important for agile design.”
This convergence of adaptable compute with high bandwidth memory and multi-terabit connectivity enables next-generation cloud acceleration and secure networking. Versal HBM ACAPs deliver good performance and power efficiency for big data workloads including fraud detection, recommendation engines, database acceleration, data analytics, financial modeling, and deep learning inference for natural language processing (NLP). By improving runtimes by orders of magnitude over modern server-class CPUs, while supporting 4X larger data sets, users can deploy applications with massive, connected data sets with far fewer and lower cost servers.
Similarly, Versal HBM ACAPs deliver network scalability and performance for 800G routers, switches, and security appliances. A traditional network processing unit (NPU) implementation of an 800G next-generation firewall would require multiple NPU devices and DDR modules, whereas a single Versal HBM ACAP eliminates external memories and performs packet processing, security processing, and adaptable AI-infused anomaly detection at dramatically lower power and at a fraction of the form factor. The series delivers major CapEx and OpEx savings for cloud and network providers by enabling customers to use fewer devices and systems to implement their applications.
Accessible to both hardware and software developers, Versal HBM ACAPs provide a design-entry point for any developer, including Vivado Design Suite for hardware developers, the Vitis unified software platform for software developers, and Vitis AI for data scientists with domain-specific frameworks and acceleration libraries.
The Versal HBM series is built on the foundation of production-proven 7nm Versal devices. Developers can start prototyping on Versal Premium series devices and evaluation boards and readily migrate to the Versal HBM series. The Versal HBM series will begin sampling in the first half of 2022. Documentation is available now and tools will be available in the second half of 2021 via an early access program.