SANTA CLARA, Calif. — Startup Wave Computing added IP for deep learning to its expanding business model of chips, systems, and services. Its TritonAI 64 packages existing MIPS and dataflow blocks with a new tensor core unit, initially targeting inference jobs at the edge.
Observers expressed surprise that one of the first startups to design accelerators for deep learning would enter a market already well served by established IP players. Wave has yet to reveal specs, performance data, or availability for the new product, leaving analysts unable to make meaningful comparisons with existing blocks from Cadence, Ceva, Nvidia, Synopsys, and others.
“It’s a busy sector, but they have MIPS now, so they have expertise in licensing,” said Linley Gwennap of The Linley Group, referring to Wave’s acquisition in June of the processor IP vendor.
The WaveTensor block, the new element in Triton, is a matrix-multiply unit, a standard feature of most deep-learning accelerators since Google revealed its TPU in 2016. It is made up of multiple 4 × 4 and 8 × 8 kernels gathered into an array capable of up to 8 TOPS/W and more than 10 TOPS/mm² in a 7-nm process.
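As a rough illustration only (not Wave's implementation, whose details remain undisclosed), a large matrix multiply can be decomposed into invocations of a fixed-size kernel, the way an array of small multiply-accumulate tiles handles a big workload. The 4 × 4 tile size below matches the kernel dimension Wave cited; everything else is an assumption.

```python
# Illustrative sketch: decomposing an n x n matrix multiply into fixed-size
# 4x4 kernel invocations, as a tensor-core-style array would tile the work.
# NOT Wave's implementation; tile size matches its stated 4x4 kernels, but
# layout and accumulation order here are assumptions for clarity.

T = 4  # kernel (tile) dimension


def kernel_4x4(a, b, acc):
    """One 4x4 multiply-accumulate: acc += a @ b for 4x4 tiles."""
    for i in range(T):
        for j in range(T):
            for k in range(T):
                acc[i][j] += a[i][k] * b[k][j]


def tiled_matmul(A, B, n):
    """Multiply two n x n matrices (n a multiple of 4) via 4x4 kernels."""
    C = [[0] * n for _ in range(n)]
    for ti in range(0, n, T):          # tile row of C
        for tj in range(0, n, T):      # tile column of C
            for tk in range(0, n, T):  # reduction over tiles
                a = [row[tk:tk + T] for row in A[ti:ti + T]]
                b = [row[tj:tj + T] for row in B[tk:tk + T]]
                acc = [[0] * T for _ in range(T)]
                kernel_4x4(a, b, acc)
                for i in range(T):
                    for j in range(T):
                        C[ti + i][tj + j] += acc[i][j]
    return C
```

In hardware, each kernel invocation maps to a fixed block of multiply-accumulate units running in parallel, which is why vendors quote throughput per watt and per square millimeter rather than raw clock speed.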
One rival called the new unit a step backward into a more conventional AI architecture focused on convolutional neural networks (CNNs). In its first disclosures, Wave described a dataflow processor flexible across a wide range of neural-net jobs.
The dataflow unit still exists in TritonAI, alongside the tensor core and up to six 64-bit MIPS cores that run Google's TensorFlow framework. In practical applications, the tensor units will handle "80% to 90% of the computations" needed for CNNs, said Chris Nicol, chief technologist of Wave, in a talk presenting TritonAI at the Linley Spring Processor Conference here.
The MIPS, tensor, and dataflow blocks are synthesizable cores configurable for different array sizes and caches. Users will need performance data to make informed configuration choices, data that Nicol said the startup's benchmark team is developing.