Tensilica extends BaseBand Engine family with DSP IP core for LTE Advanced - Embedded.com

The ConnX BBE64-128 from Tensilica is a digital signal processor (DSP) intellectual property (IP) core for system-on-chip design. It delivers more than 100 GigaMACs of performance in a 28nm high-performance process technology.

The core was designed to meet the performance requirements of Long-Term Evolution (LTE) Advanced, which requires at least five times more processing power than LTE.

Tensilica has also introduced the ConnX BBE64-UE, which is specifically optimized for the low power and small area requirements of LTE Advanced handsets.

These two products are based on the new ConnX BBE64 architecture, which Tensilica’s customers can use to optimize a DSP core for their particular requirements.

The company's product line also includes DSPs for LTE, such as the popular ConnX BBE16 LTE DSP, as well as the ConnX SSP16, ConnX BSP3, and ConnX Turbo16, also introduced today.

The ConnX BBE64-128 DSP can perform 128 multiply-accumulate operations (MACs) per cycle, for maximum throughput and minimum energy on the most common multiple-input, multiple-output (MIMO) and channel estimation functions, which are used extensively in LTE Advanced software. It is based on a multislot very long instruction word (VLIW) architecture that provides high sustained performance across many applications with dense code and power efficiency.
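The link between a per-cycle MAC count and a headline throughput figure is simple arithmetic. The sketch below assumes that "over 100 GigaMACs" means giga-MACs per second and that all 128 MAC units are busy on every cycle; the 1 GHz clock is a hypothetical value chosen for illustration, not a figure published for the core.

```python
MACS_PER_CYCLE = 128  # peak per-cycle figure quoted for the BBE64-128


def peak_gmacs_per_second(clock_hz, macs_per_cycle=MACS_PER_CYCLE):
    """Peak throughput in GMAC/s, assuming every MAC unit is busy each cycle."""
    return macs_per_cycle * clock_hz / 1e9


def min_clock_hz(target_gmacs, macs_per_cycle=MACS_PER_CYCLE):
    """Clock frequency (Hz) needed to reach target_gmacs at peak utilization."""
    return target_gmacs * 1e9 / macs_per_cycle


# Hypothetical 1 GHz clock, used only to illustrate the arithmetic.
print(peak_gmacs_per_second(1.0e9))   # GMAC/s at 1 GHz
print(min_clock_hz(100) / 1e6)        # MHz needed for 100 GMAC/s
```

Under these assumptions, a clock a little above 780 MHz would be enough to pass 100 GMAC/s at peak utilization; sustained throughput on real code would be lower.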

For non-vector algorithms, high code density can be achieved with modeless switching to Tensilica’s smaller standard 16- and 24-bit instructions. Almost any operation can be performed from any slot in the VLIW format for improved sustained performance, lower energy and denser code.

The BBE64-128 can run 128 multiply-accumulates (MACs) per cycle, which is particularly helpful for the finite impulse response (FIR) filters and matrix operations that dominate LTE Advanced channel estimation and MIMO processing.
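To see why MAC throughput matters for FIR filtering, consider a plain direct-form filter written as a loop of multiply-accumulates. This is a minimal scalar sketch, not Tensilica code: the function names are illustrative, and the cycle estimate assumes an ideal machine that keeps 128 MAC units fully occupied.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k], with x[m] = 0 for m < 0.

    Returns the output samples and the number of MACs performed.
    """
    outputs = []
    mac_count = 0
    for n in range(len(samples)):
        acc = 0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]  # one multiply-accumulate
                mac_count += 1
        outputs.append(acc)
    return outputs, mac_count


def ideal_cycles(mac_count, macs_per_cycle=128):
    """Lower bound on cycles if every MAC unit were used each cycle."""
    return -(-mac_count // macs_per_cycle)  # ceiling division


y, macs = fir_filter([1, 2, 3, 4], [1, 1])
print(y, macs, ideal_cycles(macs))
```

Every output sample costs roughly one MAC per tap, so the work grows with the product of block length and filter length, which is the kind of workload a wide MAC array is meant to absorb.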

To improve performance, the core has 'soft bit' vector data types and operations, including arbitrary field insertion and extraction for complex transmit operations, resulting in over 250 general 10-bit operations per cycle. Parallel register files for 10/20-bit and 40-bit data types ease compilation and deliver higher performance at lower power.
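Field insertion and extraction can be pictured as masking and shifting a narrow value into or out of a wider word. The sketch below is an assumption-laden illustration: the article does not describe the soft-bit format, so signed 10-bit two's-complement fields are assumed, and `insert_field` and `extract_field` are hypothetical helper names rather than ConnX operations.

```python
def insert_field(word, offset, width, value):
    """Return word with `width` bits at bit `offset` replaced by value."""
    mask = (1 << width) - 1
    return (word & ~(mask << offset)) | ((value & mask) << offset)


def extract_field(word, offset, width, signed=True):
    """Read `width` bits at bit `offset`, optionally sign-extending them."""
    mask = (1 << width) - 1
    raw = (word >> offset) & mask
    if signed and raw & (1 << (width - 1)):
        raw -= 1 << width  # two's-complement sign extension
    return raw


# Pack three assumed 10-bit soft values into one word, then read them back.
packed = 0
for i, soft in enumerate([-3, 511, -512]):
    packed = insert_field(packed, 10 * i, 10, soft)
print([extract_field(packed, 10 * i, 10) for i in range(3)])
```

Because offset and width are free parameters, the same two helpers cover fields that do not sit on byte or half-word boundaries.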

Large register files improve performance on complex code, reduce memory bandwidth requirements, reduce power and ease compilation. The core has single-cycle, 16-way complex radix-4 and radix-8 fast Fourier transform (FFT) and discrete Fourier transform (DFT) support for efficiency on the arbitrary-size transforms common to orthogonal frequency-division multiplexing (OFDM) algorithms.
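For readers unfamiliar with the term, a radix-4 butterfly is a 4-point DFT computed with additions, subtractions and multiplications by ±j only. The following sketch shows the textbook butterfly and checks it against a direct DFT; it illustrates the mathematics, not the semantics of the core's instructions, and the function names are illustrative.

```python
import cmath


def radix4_butterfly(a, b, c, d):
    """4-point DFT of (a, b, c, d) using only adds and a multiply by -j."""
    t0 = a + c
    t1 = a - c
    t2 = b + d
    t3 = -1j * (b - d)  # multiplying by -j swaps real/imag parts with a sign flip
    return (t0 + t2, t1 + t3, t0 - t2, t1 - t3)


def direct_dft(x):
    """Reference DFT: X[k] = sum_m x[m] * exp(-2*pi*j*m*k/N)."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


x = [1 + 2j, 3 - 1j, -2 + 0.5j, 0.5 + 4j]
fast = radix4_butterfly(*x)
slow = direct_dft(x)
print(max(abs(f - s) for f, s in zip(fast, slow)))  # tiny rounding error only
```

Larger radix-4 transforms are built by combining such butterflies with twiddle-factor multiplications between stages, which is where the MAC units come into play.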

It has accelerated interleaving for all bit, byte, half-word and word vector types for flexibility and efficiency in hybrid automatic repeat request (HARQ), forward error correction and convolutional coding. Cellular modem acceleration includes optimized max-index search, demap, despread, vector divide, vector reciprocal and square root operations.
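As a reminder of what interleaving does in an error-correction chain, the sketch below implements a generic rectangular block interleaver: data is written row by row and read column by column, so neighboring symbols end up far apart. This is a textbook example only; it is not the permutation specified for LTE, nor a model of the core's interleaving instructions, and the function names are illustrative.

```python
def block_interleave(data, rows, cols):
    """Write data row-wise into a rows x cols grid, read it out column-wise."""
    if len(data) != rows * cols:
        raise ValueError("data length must equal rows * cols")
    return [data[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(data, rows, cols):
    """Inverse of block_interleave for the same rows and cols."""
    if len(data) != rows * cols:
        raise ValueError("data length must equal rows * cols")
    return [data[c * rows + r] for r in range(rows) for c in range(cols)]


original = list(range(6))
mixed = block_interleave(original, rows=2, cols=3)
print(mixed)
print(block_deinterleave(mixed, rows=2, cols=3) == original)
```

Spreading adjacent symbols apart means a burst of channel errors is seen by the decoder as scattered single errors, which convolutional and turbo decoders handle far better.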

Multiple parallel execution units of each type provide greater instruction scheduling flexibility and higher performance on code that uses one execution type heavily.

Expanded vector memory operations ease automatic compilation of complex C code at maximum performance on any data size and placement, while a high-performance AXI interface eases shared-memory connection to memory and other cores.

The core can be tailored to specific needs by adding custom instructions in minutes with Tensilica’s automated tools, and it offers the flexibility to add special memory interfaces, special per-lane SIMD (single instruction, multiple data) lookups or other required functions.

The ConnX BBE64-UE is based on a minimum feature set for minimum energy and latency. It is optimized for interfacing with low-power specialized engines (programmable or hardwired). While excluding such features as the option to run 128 MACs/cycle, this high-efficiency processor can reach approximately 300,000 GMAC/second/Watt in 28nm low-leakage process technology.

The compiler is automatically generated to match the exact configuration options chosen during the design process and features full native DSP data-type support (integer/fractional, real/complex). It automatically infers complex instructions, accelerates and vectorizes legacy code from ConnX BBE16, accelerates legacy code written with industry-standard intrinsic functions, vectorizes loops with complex conditional operations, and performs ANSI C operators on vector datatypes.

Tensilica has also made available for customer evaluation the optimized programmable dataplane processing units of its Atlas Reference Architecture.
