
New Synopsys neural processor IP delivers 3,500 TOPs for AI SoCs

The NPX6 NPU IP includes hardware and software support for multi-NPU clusters of up to eight NPUs to achieve 3,500 TOPS with sparsity.

Synopsys has launched a new neural processing unit (NPU) intellectual property (IP) core and toolchain that delivers 3,500 TOPS to support the performance requirements of increasingly complex neural network models in artificial intelligence (AI) systems on chip (SoCs).

Its new DesignWare ARC NPX6 and NPX6FS NPU IP address the demands of real-time compute with ultra-low power consumption for AI applications. And for application software development on the latest NPU, the company’s new MetaWare MX development toolkit provides a comprehensive compilation environment with automatic neural network algorithm partitioning to maximize resource utilization.

John Koeter, the senior VP for marketing and strategy in the Synopsys solutions group, commented, “Higher resolution images, more cameras in systems, and more complex algorithms are driving AI processing requirements to high TOPS performance. With the new DesignWare ARC NPX6 and NPX6FS NPU IP, as well as MetaWare MX development toolkits, designers can take advantage of the latest neural network models, meet growing performance demands and accelerate time-to-market for their next intelligent SoCs.”

With multiple products in the family, the ARC NPX6 NPU IP covers a broad range of deep learning algorithms, including computer vision tasks such as object detection, image quality improvement, and scene segmentation, as well as broader AI applications such as audio and natural language processing. The architecture is based on individual cores that scale from 4K to 96K MACs, giving a single AI engine over 250 TOPS of performance, or over 440 TOPS with sparsity.

The NPX6 NPU IP includes hardware and software support for multi-NPU clusters of up to eight NPUs to achieve 3,500 TOPS with sparsity. Advanced bandwidth features in hardware and software, and a memory hierarchy (including L1 memory in each core and a high-performance, low-latency interconnect to access a shared L2 memory) make scaling to a high MAC count possible. An optional tensor floating-point unit is available for applications benefiting from BF16 or FP16 inside the neural network.
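
These headline figures can be roughly reproduced from the numbers quoted in this article. The Python sketch below assumes 96K means 96 × 1,024 MACs, counts a multiply-accumulate as two operations, and uses the 1.3 GHz worst-case clock cited later; the sparsity uplift is simply inferred from the quoted 440 TOPS figure, since Synopsys does not detail the mechanism here.

    # Rough peak-throughput arithmetic based only on figures quoted in this article.
    # Assumptions: 96K = 96 * 1024 MACs, 2 ops per MAC (multiply + accumulate),
    # a 1.3 GHz worst-case clock, and up to 8 NPUs per cluster.
    MACS_PER_ENGINE = 96 * 1024
    OPS_PER_MAC = 2
    CLOCK_HZ = 1.3e9
    NPUS_PER_CLUSTER = 8

    dense_tops = MACS_PER_ENGINE * OPS_PER_MAC * CLOCK_HZ / 1e12
    print(f"single engine, dense:  ~{dense_tops:.0f} TOPS")     # ~256 -> "over 250 TOPS"

    # The 440 TOPS "with sparsity" figure implies roughly a 1.7x uplift.
    print(f"implied sparsity gain: ~{440 / dense_tops:.1f}x")

    cluster_tops = 440 * NPUS_PER_CLUSTER
    print(f"8-NPU cluster, sparse: ~{cluster_tops:.0f} TOPS")   # ~3,520 -> "3,500 TOPS"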

Synopsys said its new DesignWare ARC NPX6 and NPX6FS NPU IP, together with the MetaWare MX development toolkit, enable designers to take advantage of the latest neural network models. (Image: Synopsys)

For application software development, the MetaWare MX development toolkit provides a software programming environment that includes a neural network software development kit (NN SDK) and support for virtual models. The NN SDK automatically converts neural networks trained using popular frameworks such as PyTorch, TensorFlow, or ONNX into optimized executable code for the NPX hardware.
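
The exact MetaWare MX compiler interface is not described in the article, but a typical hand-off into such an NN SDK starts from a model exported by one of those frameworks. Below is a minimal, hypothetical PyTorch-to-ONNX export sketch; the network and file name are illustrative only, and no Synopsys API calls are shown.

    # Hypothetical example: export a small trained PyTorch model to ONNX, the kind
    # of artifact an NPU compiler toolchain would then map onto NPX MAC resources.
    import torch
    import torch.nn as nn

    model = nn.Sequential(                      # stand-in for a trained network
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(16, 10),
    )
    model.eval()

    dummy_input = torch.randn(1, 3, 224, 224)   # example input shape
    torch.onnx.export(model, dummy_input, "model.onnx", opset_version=13)
    # "model.onnx" would then be passed to the vendor's NN SDK for compilation
    # into NPU executable code.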

The idea is that the NPX6 NPU processor IP can then be used to create a range of products, from a few TOPS to thousands of TOPS, all programmed with a single toolchain.

Inuitive, a designer of powerful 3D and vision processors for advanced robotics, drones, augmented reality/virtual reality (AR/VR) devices, and other edge AI and embedded vision applications, is one of the users of Synopsys’ existing ARC EV processors. Its CTO, Dor Zepeniuk, said, “Based on our seamless experience integrating the Synopsys DesignWare ARC EV processor IP into our successful NU4000 multi-core SoC, we have selected the new ARC NPX6 NPU IP to further strengthen the AI processing capabilities and efficiency of our products when executing the latest neural network models. In addition, the easy-to-use ARC MetaWare tools help us take maximum advantage of the processor hardware resources, ultimately helping us to meet our performance and time-to-market targets.”

Scalable processor with toolchain

The ARC NPX6 NPU IP addresses a range of application requirements as complex neural network models put greater demands on compute and memory resources, often for safety-critical functions. These include advanced driver assistance systems (ADAS), surveillance, digital TVs and cameras, and other emerging AI applications. Key features include that it:

  • Scales from 4K to 96K MACs.
  • Delivers, in a single instance, up to 250 tera operations per second (TOPS) at 1.3 GHz on 5nm processes in worst-case conditions, or up to 440 TOPS by using new sparsity features, which can increase performance and decrease the energy needed to execute a neural network (see the sketch after this list).
  • Integrates hardware and software connectivity features that enable implementation of multiple NPU instances to achieve up to 3,500 TOPS of performance on a single SoC.
  • Provides more than 50x the performance of the maximum configuration of the ARC EV7x processor IP.
  • Offers optional 16-bit floating point support inside the neural processing hardware, maximizing layer performance and simplifying the transition from GPUs used for AI prototyping to high-volume power- and area-optimized SoCs.
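
As noted in the list above, sparsity helps because multiply-accumulates whose weight operand is zero can be skipped, so the effective operation count, and the energy spent, falls with the fraction of zero weights. The following is a generic conceptual sketch, not a description of the NPX6's actual sparsity hardware.

    # Generic illustration of weight sparsity (not the NPX6 implementation).
    import numpy as np

    rng = np.random.default_rng(0)
    weights = rng.standard_normal((64, 64))
    weights[rng.random(weights.shape) < 0.5] = 0.0   # prune ~50% of weights to zero
    activations = rng.standard_normal(64)

    dense_macs = weights.size                        # every weight costs one MAC
    sparse_macs = np.count_nonzero(weights)          # zero weights can be skipped

    result = weights @ activations                   # output is identical either way
    print(f"dense MACs:  {dense_macs}")
    print(f"sparse MACs: {sparse_macs} (~{sparse_macs / dense_macs:.0%} of dense)")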

In addition, the ARC NPX6FS NPU IP meets stringent random hardware fault detection and systematic functional safety development flow requirements to achieve up to ISO 26262 ASIL D compliance. The processors, with comprehensive safety documentation included, feature dedicated safety mechanisms for ISO 26262 compliance and address the mixed-criticality and virtualization requirements of next-generation zonal architectures.

The ARC MetaWare MX development toolkit includes compilers and a debugger, a neural network software development kit (SDK), a virtual platforms SDK, runtimes and libraries, and advanced simulation models. It provides a single toolchain environment to accelerate application development and automatically partitions algorithms across the MAC resources for efficient processing. For safety-critical automotive applications, the MetaWare MX development toolkit for safety includes a safety manual and a safety guide to help developers meet ISO 26262 requirements and prepare for ISO 26262 compliance testing.

The DesignWare ARC NPX6 NPU IP, NPX6FS NPU IP, and MetaWare MX development toolkit are available to lead customers today.

The company will provide more details about the new NPU IP at the Linley Fall Processor Conference this week, and in a deep-dive session at the Embedded Vision Summit on May 19, where it will discuss how to optimize AI performance and power for advanced neural network applications.

