Sensing a market opportunity for ARM CPUs in extreme-performance applications in radar systems, backbone network communications, and compute-intensive data centers, Altera took the wraps off its Cortex-A53-based Stratix 10 FPGA family at ARM TechCon.
With this move, the FPGA powerhouse transforms the ARM architecture from an outsider looking in at the very-high-performance market, dominated up to now by Intel’s x86 and IBM’s Power Architecture, into one that must be considered an extreme-performance player of equal capabilities.
Altera has done this by taking Intel’s own 14 nanometer Tri-Gate process-based foundry service, which it uses to build all of its high-density FPGAs, and applying it to a core-based design, the Stratix 10, that combines a 1.5 GHz quad-core 64-bit Cortex-A53 processor with a 1 GHz programmable fabric.
According to Danny Biran, senior vice president, corporate strategy and marketing at Altera, the Cortex-A53 is already one of the most power-efficient and compute-capable of ARM’s application-class processors. “But when delivered on the 14 nm Tri-Gate process it will achieve more than six times more data throughput compared to today’s highest performing SoC FPGAs,” he said.
The Cortex-A53 also delivers important features to the extreme performance segment of the market, including virtualization support, 256TB memory reach and error correction code (ECC) on L1 and L2 caches.
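As a quick sanity check on the 256 TB figure, the arithmetic below assumes it corresponds to a 48-bit address space. The address width is an assumption made for this sketch; the announcement itself does not state it.

```python
# Assumption: the quoted 256 TB "memory reach" corresponds to a 48-bit
# address space. The article does not state the address width.
ADDRESS_BITS = 48

reach_bytes = 1 << ADDRESS_BITS        # 2**48 addressable bytes
reach_tib = reach_bytes // (1 << 40)   # convert bytes to binary terabytes (TiB)

print(f"{ADDRESS_BITS}-bit addressing reaches {reach_tib} TiB")
```

Under that assumption, 2^48 bytes works out to exactly 256 binary terabytes, which matches the quoted reach.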
“Furthermore, the Cortex-A53 core can run in 32-bit mode, which will run Cortex-A9 operating systems and code unmodified,” he said, “allowing a smooth upgrade path from Altera’s 28 nm and 20 nm SoC FPGAs.”
The other ingredient in the secret sauce that Altera brings to the competition in this previously Intel/IBM-dominated market is a cleverly structured FPGA fabric that allows developers to implement clean designs in separate functional layers as performance and capability require.
A logic layer is implemented in 1 GHz programmable fabric that can be used to build custom functions such as hardware accelerators. It gives designers access to the equivalent of four million 4-input lookup tables (LUTs), implemented with six-input LUTs. It would be used for functions such as deep packet inspection, hardware acceleration, and special cryptographic engines.
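For readers unfamiliar with the building block being counted here, a lookup table is a small truth table: a k-input LUT stores 2^k output bits and uses the input bits as an index. The Python model below is only an illustrative sketch. The helper names `make_lut` and `lut_eval` are hypothetical, and it does not describe how Altera's fabric is actually organized.

```python
from itertools import product

def make_lut(num_inputs, func):
    """Build a truth table with 2**num_inputs entries by evaluating
    func once for every combination of input bits."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=num_inputs)]

def lut_eval(table, *bits):
    """Look up the stored output; the first bit is the most significant
    bit of the table index, matching the order used by make_lut."""
    index = 0
    for bit in bits:
        index = (index << 1) | (bit & 1)
    return table[index]

# Hypothetical example functions: 4-input odd parity and a 6-input AND.
parity4 = make_lut(4, lambda a, b, c, d: a ^ b ^ c ^ d)
and6 = make_lut(6, lambda *b: int(all(b)))

print(len(parity4), len(and6))   # table sizes: 2**4 and 2**6 entries
print(lut_eval(parity4, 1, 0, 1, 1))
```

A six-input table holds four times as many entries as a four-input one, which is why vendors quote capacity in 4-input equivalents when comparing devices.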
Another layer is optimized for DSP and contains hardened floating-point DSP blocks that are designed to deliver in excess of 10 teraflops of computational performance in the highest-end devices. Here, said xxx, the designer can implement the DSP-based operations necessary for floating-point computations, matrix manipulations, and waveform processing.
To aid developers in building applications, Biran said the company has combined its own SoC Embedded Design Suite (EDS) and an ARM Development Studio 5 (DS-5) kit optimized for Altera’s FPGA designs with OpenCL programming tools for creating, in a high-level design language, the kind of software support that such extreme-performance heterogeneous designs need.
“With this combination of building blocks Altera Stratix 10 SoCs will have a programmable-logic performance level of more than 1 GHz,” he said, “twice the core performance of current high-end 28 nm FPGAs.”