Arm adds CPU, GPU and ISP for autonomous and vision safety


Arm has introduced a new suite of intellectual property (IP) comprising new CPU, GPU and ISP (image signal processor) to enable scalable, power efficient compute capability for safe, autonomous decision-making across automotive and industrial applications.

The new suite of IP includes the Arm Cortex-A78AE CPU, Arm Mali-G78AE GPU, and Arm Mali-C71AE ISP, all of which are designed to enable silicon providers and OEMs to design for autonomous workloads. These products will be deployed in a range of applications, from enabling more intelligence and configurability in smart manufacturing to enhancing ADAS and digital cockpit applications in automotive.

“Autonomy has the potential to improve every aspect of our lives, but only if built on a safe and secure computing foundation,” said Chet Babla, vice president of the automotive and IoT line of business at Arm. “As autonomous decision-making becomes more pervasive, Arm has designed a unique suite of technology that prioritizes safety while delivering highly scalable, power efficient compute to enable autonomous decision-making across new automotive and industrial opportunities.”

He added, “Integrating these three processing technologies into a system on chip (SoC) will provide the power-efficient and safety-enabled processing needed to unlock the decision-making potential of autonomous systems.”

Cortex-A78AE: 30% performance improvement in safety-critical applications

The new Arm Cortex-A78AE CPU is Arm’s latest and highest-performance safety-capable CPU, able to run different, complex workloads for autonomous applications such as mobile robotics and driverless transportation.

The micro-architecture is revamped on a number of fronts: additional fetch bandwidth, improved branch prediction, a lower mis-predict penalty, wider integer issue, and a memory subsystem with 50% higher bandwidth than the previous generation. Of particular significance is the introduction of the macro-operation cache, a structure that holds decoded instructions and decouples the fetch engine from execution, thereby enabling dynamic code sequence optimization.

Together, these innovations result in over 30% performance improvement on the SPEC CPU2006 benchmark suite – across both integer and floating-point routines. The Cortex-A78AE achieves the Cortex-A76AE’s targeted performance at 60% lower power in up to a 7nm implementation. Alternatively, at the same power envelope, the Cortex-A78AE offers a 25% performance boost, trading the power saving for performance.
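As a rough illustration of what those two operating points imply for efficiency, the short Python sketch below converts them into performance-per-watt ratios relative to the Cortex-A76AE. The function name is hypothetical, and the only inputs are the 60% and 25% figures quoted above; it is back-of-the-envelope arithmetic, not an Arm benchmark methodology.

```python
def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Return performance per watt of the new part relative to the old one.

    perf_ratio  -- new performance divided by old performance
    power_ratio -- new power divided by old power
    """
    return perf_ratio / power_ratio


# Same performance at 60% lower power (figure quoted in the article).
iso_performance_gain = perf_per_watt_gain(perf_ratio=1.0, power_ratio=0.40)

# 25% more performance at the same power (figure quoted in the article).
iso_power_gain = perf_per_watt_gain(perf_ratio=1.25, power_ratio=1.0)

print(f"same performance, 60% less power: {iso_performance_gain:.2f}x performance per watt")
print(f"same power, 25% more performance: {iso_power_gain:.2f}x performance per watt")
```

The two points work out to 2.5x and 1.25x respectively, so no single efficiency number describes both; which one matters depends on whether a design is power-limited or performance-limited.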

Arm said heterogeneity and right-sized compute are important because no single micro-architecture satisfies the power efficiency and compute needs of all automotive and industrial applications. As an example, an autonomous drive platform needs to sense data, perceive obstacles and decide on the right path vector before engaging the vehicle controls. The middle two tasks require an enormous variety of algorithmic execution. To this end, the CPU can be configured with a variety of cache sizes – across L1, L2, and L3 – as well as memory interfaces and types.

The Cortex-A78AE can be paired in heterogeneous compute clusters alongside the Cortex-A65AE and can be coupled with accelerators over the accelerator coherence port. A low-latency peripheral port serves dedicated system interface controllers, while the CMN-600AE and MMU-600AE IPs support CHI-protocol-based NPUs and general-purpose GPU blocks within the coherence domain of the CPU cluster. These products give the system designer the ability to right-size the platform to the task at hand.

The Cortex-A78AE can be paired in heterogeneous compute clusters alongside the Cortex-A65AE. (Image: Arm)

The new CPU supports features needed to meet the relevant automotive and industrial functional safety standards, ISO 26262 and IEC 61508, for applications up to ASIL D / SIL 3. A new enhanced split-lock functionality (hybrid mode) is designed specifically to enable applications that target lower ASIL levels without compromising performance, and to allow the same SoC compute architecture to be deployed in different domain controllers.

Mali-G78AE: Arm’s first safety GPU, with flexible partitioning

The new Mali-G78AE is Arm’s first GPU to be designed for safety, enabling rich user experiences and heterogeneous compute for safety-critical autonomous and industrial applications. Designed to address complex requirements for human-machine interfaces and the heterogeneous compute needed in autonomous systems, it delivers mobile-class performance while supporting automotive and industrial safety standards, helping to meet ASIL B / SIL 2 requirements.

Flexible partitioning is a new feature in Arm’s GPU that allows resources to be dedicated to different GPU workloads, at runtime or at boot time, and shown to be completely separate from each other – instead of the GPU being a single large resource shared by a range of applications. Scheduling can be handled in hardware or software, and jobs are deployed onto the GPU to ensure high utilization of GPU cores and to maximize efficiency: either workloads are mixed together or, as in traditional virtualization, the resources work through one workload at a time, effectively time-slicing the GPU.
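The idea of dedicating separate resources to each workload can be pictured with a small Python model. This is purely a conceptual sketch: the `PartitionedGpu` class, the notion of numbered "slices" and the workload names are all hypothetical and do not represent Arm's driver interface or the Mali-G78AE hardware layout. It only shows the invariant that no resource belongs to two partitions at once.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedGpu:
    """Toy model: a GPU made of numbered slices, each owned by at most one partition."""
    num_slices: int
    partitions: dict = field(default_factory=dict)  # partition name -> set of slice ids

    def assign(self, name, slices):
        """Dedicate the given slices to a partition, rejecting any overlap."""
        requested = set(slices)
        if not requested:
            raise ValueError("a partition needs at least one slice")
        if any(s < 0 or s >= self.num_slices for s in requested):
            raise ValueError("slice id out of range")
        for other, owned in self.partitions.items():
            if other != name and owned & requested:
                raise ValueError(f"slices {sorted(owned & requested)} already belong to {other!r}")
        self.partitions[name] = requested

    def free_slices(self):
        """Return the slice ids not yet dedicated to any partition."""
        used = set().union(*self.partitions.values())
        return set(range(self.num_slices)) - used


# Hypothetical workloads: an instrument cluster and an infotainment display.
gpu = PartitionedGpu(num_slices=4)
gpu.assign("instrument_cluster", [0])
gpu.assign("infotainment", [1, 2])
print(sorted(gpu.free_slices()))
```

In this model, a request that overlaps an existing partition is refused outright rather than shared, which is the contrast with the time-sliced approach described above, where every workload takes turns on the whole GPU.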

Flexible partitioning is a new feature in Arm’s GPU that enables resources to be dedicated to different GPU workloads. (Image: Arm)

Mali-C71AE: safe ISP supporting four real-time cameras

Autonomous workloads need to be aware of their surroundings, often through cameras operating in a wide range of lighting conditions. Supporting both human and machine vision applications such as production line monitoring and ADAS camera systems, Arm has introduced the Mali-C71AE, developed for emerging smart automotive systems and industrial markets. Delivering key visual information to both computer vision (CV) systems and human display for clear and convenient viewing, Mali-C71AE is the first product in the Mali camera series of ISPs with built-in features for functional safety applications.

Mali-C71AE supports up to four real-time camera inputs or 16 camera streams from memory. Because it supports simultaneous inputs from a mix of sensor and CFA (color filter array) types operating in a variety of modes, the system developer can select from a wide range of sensors from various suppliers. With multiple camera inputs managed in hardware, a single instance of Mali-C71AE removes the need for an ISP per sensor. Instead, the single ISP can be used to manage different sensors and frame rates for different scenarios in the vehicle or robot usage cycle.

Features of the Mali-C71AE include support for up to four real-time camera inputs. (Image: Arm)

The key benefit of this capability is that camera inputs can be processed in a range of ways: in as-received order, in a programmed order, or in various other software-defined patterns. This lets the system prioritize based on its needs, such as batch-processing four cameras for surround view or, in an ADAS application, making sure the long-range forward-facing camera input is processed as soon as it is received to minimize latency.
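The three processing orders described above can be sketched in a few lines of Python. Everything here is illustrative: the `Frame` type, the camera names and the three functions are hypothetical and say nothing about how the Mali-C71AE or its driver actually schedules inputs. The sketch only shows how the same queue of frames yields different processing orders under each policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    camera: str   # hypothetical camera name
    arrival: int  # arrival timestamp in arbitrary ticks


def as_received(frames):
    """Process frames strictly in order of arrival."""
    return sorted(frames, key=lambda f: f.arrival)


def programmed(frames, order):
    """Process frames camera by camera, following a fixed, pre-programmed sequence."""
    rank = {camera: i for i, camera in enumerate(order)}
    return sorted(frames, key=lambda f: (rank[f.camera], f.arrival))


def prioritized(frames, urgent):
    """Process frames from one latency-critical camera first, then the rest as received."""
    return sorted(frames, key=lambda f: (f.camera != urgent, f.arrival))


queue = [Frame("rear", 1), Frame("front", 2), Frame("left", 3), Frame("front", 4)]

print([f.camera for f in as_received(queue)])
print([f.camera for f in programmed(queue, order=["front", "left", "rear"])])
print([f.camera for f in prioritized(queue, urgent="front")])
```

A surround-view use case would map naturally to the programmed order, while the forward-facing ADAS camera corresponds to the prioritized policy, where its frames jump ahead of anything else waiting.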

The senior vice president for autonomous driving in the Volkswagen Group and CEO of Artemis, Alexander Hitzinger, said, “The requirements for higher levels of driver automation, electrification and immersive in-vehicle experiences are continually growing, and scalable, heterogenous, safe compute is critical in order to meet the requirements of future vehicle electronics systems. Innovation such as Arm’s new technologies and the extensive ecosystem that supports it will help to accelerate the deployment of next-generation vehicles.”
