Using FPGAs to improve your wireless subsystem's performance

You can realize significant improvements in the performance of signal processing functions in wireless systems. How? By taking advantage of the flexibility of FPGA fabric and the embedded DSP blocks in current FPGA architectures for operations that can benefit from parallelism.
By offloading operations that require high-speed parallel processing onto the FPGA and leaving operations that require high-speed serial processing on the processor, designers can optimize overall system performance and cost while easing the performance demands placed on the DSP.
The FPGA can be used with a digital signal processor (DSP), serving either as an independent pre-processor (or sometimes post-processor) device, or as a co-processor. In a pre-processing architecture, the FPGA sits directly in the data path and is responsible for processing the signals to a point where they can be efficiently and cost-effectively handed off to a DSP processor for further lower-rate processing.
|Figure 1: In co-processing architectures, the FPGA sits alongside the DSP, which offloads specific algorithmic functions to the FPGA to be processed at significantly higher speeds than what is possible in a DSP processor alone.|
In co-processing architectures, the FPGA sits alongside the DSP, which offloads specific algorithmic functions to the FPGA to be processed at significantly higher speeds than what is possible in a DSP processor alone. The results are passed back to the DSP or sent to other devices for further processing, transmission or storage (Figure 1 above).
The choice of pre-processing, post-processing or co-processing is often governed by the timing margins needed to move data between the processor and FPGA and how that impinges on the overall latency.
Although a co-processing solution is the topology designers most often consider, primarily because the DSP retains more direct control of the data hand-off process, it may not always be the best overall strategy.
|Figure 2: Shown is an LTE example of co-processing data-transfer latency issues.|
Consider, for example, the latest specifications for 3GPP Long Term Evolution, in which the transmission time interval has been reduced to 1ms, down from 2ms for HSDPA and 10ms for W-CDMA. This essentially requires that data be processed from the receiver and through to the output of the media access control (MAC) layer in less than 1,000 microseconds.
Figure 2 above shows that using a serial RapidIO port on the DSP running at 3.125Gbit/s, with 8bit/10bit encoding and a 200bit overhead for the Turbo decode function, results in a DSP-to-FPGA transfer delay of 230µs. Taking into account other expected delays, the Turbo codec performance required to meet these system timings is a very demanding 75.8Mbit/s for 50 users.
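The transfer-time arithmetic behind this figure can be sketched as follows. The 8b/10b efficiency factor is standard for serial RapidIO, but the payload size used here is a hypothetical stand-in, since the article does not state the exact per-TTI workload:

```python
# Sketch of the serial RapidIO transfer-time estimate.
# The payload size is an assumed illustrative value, not the article's workload.

LINK_RATE_BPS = 3.125e9       # raw serial RapidIO lane rate from the text
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b encoding: 8 payload bits per 10 line bits

effective_bps = LINK_RATE_BPS * ENCODING_EFFICIENCY  # 2.5 Gbit/s usable

def transfer_time_us(payload_bits: float) -> float:
    """Time to move a payload across the link, ignoring protocol overhead."""
    return payload_bits / effective_bps * 1e6

# Hypothetical aggregate payload moved to the FPGA within one 1 ms TTI:
payload_bits = 500_000
print(f"effective rate: {effective_bps / 1e9:.2f} Gbit/s")   # 2.50 Gbit/s
print(f"transfer time: {transfer_time_us(payload_bits):.0f} us")  # 200 us
```

At these rates even a few hundred kilobits of payload consumes a fifth of the 1ms budget, which is why the transfer delay dominates the timing analysis.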
Using an FPGA to process the Turbo codecs as a largely independent post-processor not only removes the DSP latency but also saves time by eliminating the high-bandwidth data transfer between the DSP and FPGA.
This reduces the required throughput of the Turbo decoder to 47Mbit/s, a decrease that permits the use of more cost-effective devices and reduces system power dissipation.
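The reasoning can be made concrete with a simple time-budget sketch: removing the DSP-to-FPGA transfer widens the window available for decoding within the 1ms TTI, and the required decoder rate scales inversely with that window. Only the 1ms TTI and the 230µs transfer delay come from the text; the other delay figure and the payload size are illustrative assumptions:

```python
# Time-budget sketch of why eliminating the transfer lowers the required
# Turbo-decoder throughput. OTHER_US and the payload are assumed values.

TTI_US = 1000.0      # LTE transmission time interval (from the text)
TRANSFER_US = 230.0  # DSP-to-FPGA transfer delay (from the text)
OTHER_US = 300.0     # assumed fixed delays elsewhere in the receive chain

def required_rate_mbps(payload_kbits: float, budget_us: float) -> float:
    """Decoder throughput (Mbit/s) needed to finish payload_kbits in budget_us."""
    return payload_kbits * 1e3 / (budget_us * 1e-6) / 1e6

budget_coproc = TTI_US - OTHER_US - TRANSFER_US  # 470 us left for decoding
budget_postproc = TTI_US - OTHER_US              # 700 us with no transfer

payload_kbits = 33.0  # hypothetical aggregate payload per TTI
print(required_rate_mbps(payload_kbits, budget_coproc))    # ~70.2 Mbit/s
print(required_rate_mbps(payload_kbits, budget_postproc))  # ~47.1 Mbit/s
```

The same payload decoded over the wider window needs roughly a third less decoder throughput, which is the effect the article describes.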
Another consideration is whether to use soft or hard embedded processor intellectual property (IP) on the FPGA to offload some of the system processing tasks, which in turn offers the possibility of additional cost, power and footprint reductions.