Imagination launches new multi-core low-bandwidth NNA for ADAS

Imagination Technologies has launched the IMG Series4, a family of neural network accelerators (NNAs) featuring a new multi-core architecture and up to a 90% improvement in data-processing bandwidth efficiency. The company said these features enable up to 600 tera operations per second (TOPS) of performance with low latency for large neural network workloads in advanced driver-assistance systems (ADAS).

Commenting on the new architecture, Andrew Grant, senior director, artificial intelligence, Imagination Technologies, said, “In terms of hardware architectural features, we have created pre-built clusters with on-chip memory (OCM or SRAM), shared OCM, high bandwidth interconnect and a new flexible pipeline which is more efficient.” He continued, “The new IMG Series4 is a multi-core NNA with low-energy, low-bandwidth requirements and ultra-low latency. It has advanced features like Imagination Tensor Tiling (ITT) to reduce bandwidth by up to 90% and it’s designed for automotive safety.”

Its multi-core capability allows workloads to be flexibly allocated and synchronized across the cores. Imagination’s software, which provides fine-grained control and increases flexibility through batching, splitting and scheduling of multiple workloads, can now exploit any number of cores; clusters are available in configurations of 2, 4, 6, or 8 cores.
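To illustrate the batching-and-scheduling idea in general terms, here is a minimal Python sketch of round-robin assignment of inference jobs across a cluster’s cores. The names (Core, schedule_batches) are illustrative, not Imagination’s API, and a real scheduler would also weigh per-core load, weight residency and latency deadlines.

```python
# Minimal sketch of round-robin batch scheduling across a multi-core
# cluster. Core and schedule_batches are illustrative names only.
from dataclasses import dataclass, field

@dataclass
class Core:
    core_id: int
    queue: list = field(default_factory=list)  # pending inference jobs

def schedule_batches(jobs, cores):
    """Split a stream of inference jobs round-robin across the cores."""
    for i, job in enumerate(jobs):
        cores[i % len(cores)].queue.append(job)
    return cores

# Example: ten camera frames spread over a 4-core cluster configuration.
cluster = [Core(core_id=i) for i in range(4)]
for core in schedule_batches([f"frame-{n}" for n in range(10)], cluster):
    print(core.core_id, core.queue)
```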

Imagination Series4 architecture
The diagram shows eight cores connected by a high-bandwidth interconnect, with shared on-chip memory (OCM) that can be used by all the cores. In addition, each core has its own OCM (SRAM), so weights and intermediate data are kept close to the individual cores. Each 8-core cluster delivers over 100 TOPS in a 5 nm process, at over 30 TOPS per watt and over 12 TOPS per square millimetre. (Image: Imagination Technologies)

The ITT function is a patent-pending technology that addresses bandwidth efficiency by splitting input data tensors into multiple tiles for efficient data processing. ITT exploits local data dependencies to keep intermediate data in on-chip memory, minimizing transfers to external memory and reducing bandwidth by up to 90%. ITT is a scalable algorithm with major benefits for networks with large input data sizes.

Imagination said that ITT’s reduced bandwidth requirements also make it possible to split networks and run them on selected cores, enabling very fine-grained control over the workload.
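The exact ITT algorithm is patent-pending and not public, but the general tiling idea can be sketched as follows: split an input into tiles small enough that each tile’s intermediate data fits in a fixed on-chip buffer, and write only final results back to external memory. The buffer size and the toy scale-plus-ReLU “layer” below are assumptions made purely for illustration.

```python
# Illustration of the tiling idea: process one tile at a time so that
# intermediate data never leaves a small on-chip buffer. The buffer size
# and the toy "layer" (scale + ReLU) are placeholders, not ITT itself.
import numpy as np

OCM_BUDGET = 64 * 1024  # assumed on-chip memory budget, in bytes

def process_tiled(x, tile_h=64, tile_w=64):
    out = np.empty_like(x)  # stands in for external (DRAM) memory
    for r in range(0, x.shape[0], tile_h):
        for c in range(0, x.shape[1], tile_w):
            tile = x[r:r + tile_h, c:c + tile_w]     # fetch one input tile
            assert tile.nbytes <= OCM_BUDGET         # tile fits on-chip
            inter = np.maximum(tile * 2.0, 0.0)      # intermediate stays on-chip
            out[r:r + tile_h, c:c + tile_w] = inter  # single write-back per tile
    return out

result = process_tiled(np.random.rand(512, 512).astype(np.float32))
```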

The outcome of these architectural features is ultra-high performance and ultra-low latency, according to Imagination. The low-power NNA architecture is designed to run full network inferencing while meeting functional safety requirements. It executes multiple operations in a single pass to maximize performance per watt and deliver industry-leading energy efficiency.
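“Multiple operations in a single pass” generally means fusing adjacent operations so intermediates never round-trip through external memory. As a rough illustration only, and not Imagination’s pipeline, compare an unfused and a fused matmul-bias-ReLU sequence:

```python
# Unfused vs. fused execution of a matmul + bias + ReLU sequence. In the
# unfused version each intermediate would be written out and re-read; the
# fused version applies all three steps in one traversal.
import numpy as np

def unfused(x, w, b):
    t1 = x @ w                    # pass 1: intermediate goes to memory
    t2 = t1 + b                   # pass 2: read back, add bias, write again
    return np.maximum(t2, 0.0)    # pass 3: read back, apply ReLU

def fused(x, w, b):
    return np.maximum(x @ w + b, 0.0)  # one pass, no stored intermediates

x, w, b = np.ones((4, 8)), np.ones((8, 3)), np.zeros(3)
assert np.allclose(unfused(x, w, b), fused(x, w, b))
```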

The Series4 offers 12.5 TOPS per core at less than one watt. An 8-core cluster can therefore deliver 100 TOPS, and a six-cluster (6×100) solution offers 600 TOPS. For AI inference, a Series4 NNA achieves performance that is over 20x faster than an embedded GPU and 1,000x faster than an embedded CPU.

On the low-latency front, by combining the cores into a 2-, 4-, 6- or 8-core cluster, all of them can be dedicated to executing a single task, reducing latency, and therefore response time, by a corresponding factor: a factor of eight for an 8-core cluster, for example.
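The arithmetic behind the figures in the last two paragraphs can be checked directly; note that the ideal linear latency scaling assumed below depends in practice on how cleanly a network splits across cores.

```python
# Checking the quoted figures: 12.5 TOPS per core, 8 cores per cluster,
# six clusters for the 600 TOPS configuration, and ideal latency scaling.
TOPS_PER_CORE = 12.5

def cluster_tops(cores):
    return TOPS_PER_CORE * cores

def task_latency_ms(single_core_ms, cores):
    # Assumes the task splits perfectly; real speedup depends on the network.
    return single_core_ms / cores

print(cluster_tops(8))          # 100.0 TOPS per 8-core cluster
print(6 * cluster_tops(8))      # 600.0 TOPS across six clusters
print(task_latency_ms(8.0, 8))  # an 8 ms task ideally drops to 1.0 ms
```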

Series4 has already been licensed and will be available on the market in December 2020.

Addressing AI performance for automotive
The automotive industry is on the cusp of a revolution, with new use cases such as self-driving cars and robotaxis demanding new levels of artificial intelligence (AI) performance. To that end, Imagination is already working with leading players and innovators in automotive and other industries where functional safety is valued.

James Hodgson, principal analyst, smart mobility and automotive, ABI Research, said, “While we expect the demand for ADAS to triple by around 2027, the automotive industry is already looking beyond this to full self-driving cars and robotaxis. Wider adoption of neural networks will be an essential factor in the evolution from Level 2 and 3 ADAS to full self-driving at Level 4 and Level 5. These systems will have to cope with hundreds of complex scenarios, absorbing data from numerous sensors, such as multiple cameras and LiDAR, for solutions such as automated valet parking and intersection management, and safely navigating complex urban environments. A combination of high performance, low latency and energy efficiency will be key to scaling highly automated driving.”

Andrew Grant added, “Innovators are already tackling the task of creating the silicon that will support the next generation of ADAS features and autonomous vehicles. Any company or R&D team looking to be a serious player in automotive needs to be integrating this technology into their platforms now.”

Grant also said Series4 enables the safe inference of neural networks without performance degradation. “It incorporates hardware safety mechanisms and is designed following ISO 26262 processes. We have hardware features such as CRCs, watchdogs and performance monitors, and selective parity in logic and RAMs so that we can meet ASIL decomposition requirements. These hardware safety mechanisms protect the compiled network, the execution of the network and the data processing pipeline. The IP safety package also includes documentation.”
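As a rough sketch of how one of the listed mechanisms might be used, the snippet below verifies a compiled network blob with a CRC before handing it off for execution. It assumes hypothetical names (verify_and_run, execute) and uses zlib.crc32 as a stand-in for a hardware CRC unit; it is not Imagination’s implementation.

```python
# Hypothetical use of a CRC safety check: verify a compiled network blob
# before execution. zlib.crc32 stands in for a hardware CRC unit, and
# verify_and_run/execute are invented names for illustration.
import zlib

def verify_and_run(blob: bytes, expected_crc: int, execute) -> None:
    actual = zlib.crc32(blob) & 0xFFFFFFFF
    if actual != expected_crc:
        # In an ASIL context this would raise a fault to the safety
        # supervisor rather than silently continuing.
        raise RuntimeError(f"network blob corrupted: crc {actual:#010x}")
    execute(blob)

compiled = b"\x00\x01\x02\x03"  # placeholder for a compiled network
verify_and_run(compiled, zlib.crc32(compiled) & 0xFFFFFFFF,
               lambda blob: print("running verified network"))
```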

