Altera aids floating-point DSP implementation in FPGAs

Altera is quietly rolling out the ability for DSP designers to move directly from Simulink models to floating-point data paths implemented in Altera FPGAs. The capability, fully described by Altera senior technical marketing manager Michael Parker in a paper at DesignCon 2011, has been integrated into the company’s DSP Builder Simulink-to-VHDL flow.

Parker listed two primary reasons for adding floating-point capability to DSP Builder. The first is the difficulty of moving from algorithm development in floating-point environments to implementation in fixed-point hardware. The translation requires constant diligence to avoid loss of precision inside computations, which in turn requires an intimacy with numerical methods and the algorithms in question that few designers possess. The second reason is that many applications (RADAR signal processing, matrix inversion in MIMO receivers, or control of dynamic predistortion in RF power amplifiers, for example) require such enormous dynamic range that only block-floating-point or true floating-point implementation is feasible.
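To make the precision problem concrete, here is a minimal Python sketch comparing a 16-bit fixed-point format with single-precision floating point across a range of magnitudes. The Q1.15 format and the sample values are assumptions chosen for illustration; they do not come from Parker's paper.

```python
import struct

FRAC_BITS = 15   # hypothetical Q1.15 format, chosen only for illustration
TOTAL_BITS = 16

def to_fixed(x: float) -> int:
    """Quantize x to a signed 16-bit fixed-point integer, saturating on overflow."""
    scaled = round(x * (1 << FRAC_BITS))
    lo = -(1 << (TOTAL_BITS - 1))
    hi = (1 << (TOTAL_BITS - 1)) - 1
    return max(lo, min(hi, scaled))

def from_fixed(q: int) -> float:
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << FRAC_BITS)

def to_float32(x: float) -> float:
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def relative_error(approx: float, exact: float) -> float:
    return abs(approx - exact) / abs(exact)

# Three magnitudes spanning a bit over four decades of dynamic range.
samples = [0.3, 3e-3, 1e-5]
fixed_errors = [relative_error(from_fixed(to_fixed(x)), x) for x in samples]
float_errors = [relative_error(to_float32(x), x) for x in samples]

for x, fe, fl in zip(samples, fixed_errors, float_errors):
    print(f"{x:g}: fixed-point error {fe:.2e}, float32 error {fl:.2e}")
```

In this sketch the fixed-point relative error grows as the value shrinks, and the smallest sample quantizes to zero, while the float32 error stays roughly constant. Avoiding that in fixed-point hardware means rescaling by hand at each stage of the algorithm.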

If the case for implementing hardware floating-point data paths is strong, the options for doing so have been unattractive. Parker’s criteria fit only a small minority of FPGA users, so even at 28 nm the overhead of putting hard floating-point blocks in the chips still doesn’t make business sense. The major vendors have provided library soft blocks for floating-point operations, but these are necessarily limited in flexibility. So designers who need a better power-performance point than they can get from floating-point DSP chips have had to face the daunting task of describing a floating-point data path in synthesizable VHDL or Verilog, inferring instances of the FPGA’s DSP hard blocks, using the block RAMs as best they could, and somehow verifying the result.

This process is greatly complicated by the IEEE 754 floating-point format itself, Parker observed. The format was designed for convenience in computers, and it employs implicit offsets in some fields that wreak havoc on data-path design.

Given this scenario, DSP Builder looked like an ideal solution. The tool, which has been around for some time now, takes in a Simulink model and spits out a black box of synthesizable VHDL, ready to send into Altera’s Quartus II tool chain. Previously, DSP Builder had been limited to fixed-point applications. But in light of the problems with floating-point, Altera has decided to extend the capability to floating-point as well.
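For readers unfamiliar with the implicit offsets in IEEE 754, the following Python sketch pulls apart a single-precision value. It shows the two conventions that a data path has to undo before it can do arithmetic: the exponent is stored with a bias of 127, and the leading 1 of the significand is not stored at all. The function names are illustrative, and the reconstruction handles normal numbers only.

```python
import struct

def decode_float32(x: float):
    """Split a single-precision value into its three stored fields."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                 # 1 bit
    biased_exp = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits, leading 1 is implied
    return sign, biased_exp, fraction

def real_value(sign: int, biased_exp: int, fraction: int) -> float:
    """Rebuild the value of a normal number from its stored fields."""
    significand = (1 << 23) | fraction  # restore the hidden leading 1
    exponent = biased_exp - 127         # remove the exponent bias
    return (-1) ** sign * significand * 2.0 ** (exponent - 23)

fields = decode_float32(-6.5)
print(fields)               # sign, biased exponent, fraction
print(real_value(*fields))  # reconstructs the original value
```

Here -6.5 is 1.625 × 2², so the stored exponent is 2 + 127 = 129 and the stored fraction holds only the 0.625 part.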

On the surface, the tool might look like a fairly straightforward source-code converter, moving from one high-level representation of a data path to another, slightly more hardware-oriented one. But several optimizations raise DSP Builder far above the level of a simple translator.

First, there is the issue of the IEEE 754 format. DSP Builder uses an internal two’s-complement format more friendly to FPGA implementation, with a 754 wrapper that provides format translation at the input and output of the data path. Also, algorithms added to the tool infer Altera’s fixed-point hard multiplier blocks: in obvious ways, such as ganging to handle the mantissas in multiplications, and in less obvious ways, such as replacing FPGA-eating barrel-shifters with multipliers in normalize operations. DSP Builder minimizes the number of normalize operations with an internally developed optimization Altera calls the Fused Data Path. This algorithm organizes operations and extends data widths to eliminate up to three-quarters of the normalize-denormalize stages that would normally surround the major function blocks in the data path, with obvious impact on latency, area, and power.
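The general idea behind these optimizations can be sketched in a few lines of Python. This is not Altera's Fused Data Path algorithm, whose details the article does not give; it is a simplified illustration under stated assumptions. Values pass through a hypothetical input wrapper into a signed integer mantissa plus exponent, alignment shifts are expressed as multiplies by a power of two, and rounding back to IEEE 754 happens once at the output instead of after every addition. Python's unbounded integers stand in for the extended data widths; real hardware would use a fixed, wider word.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def unpack_754(x: float):
    """Input wrapper: float32 -> (signed mantissa, exponent), value = mant * 2**exp.
    Zero and subnormals are flushed to zero in this sketch."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    biased_exp = (bits >> 23) & 0xFF
    if biased_exp == 0:
        return 0, 0
    mant = (1 << 23) | (bits & 0x7FFFFF)  # restore the hidden bit
    if bits >> 31:
        mant = -mant                      # signed (two's-complement style) mantissa
    return mant, biased_exp - 127 - 23

def shift_left_by_multiply(mant: int, k: int) -> int:
    """A left shift by k equals a multiply by 2**k, so a multiplier
    can do the work of a barrel shifter."""
    return mant * (1 << k)

def add_unnormalized(a, b):
    """Add two (mant, exp) pairs without normalizing; the mantissa simply widens."""
    (ma, ea), (mb, eb) = a, b
    e = min(ea, eb)
    return shift_left_by_multiply(ma, ea - e) + shift_left_by_multiply(mb, eb - e), e

def pack_754(mant: int, exp: int) -> float:
    """Output wrapper: the single normalize-and-round step."""
    return to_float32(math.ldexp(mant, exp))

values = [1.0, 2.0 ** -24, 2.0 ** -24]

# Conventional path: round to float32 after every addition.
sequential = 0.0
for v in values:
    sequential = to_float32(sequential + v)

# Deferred path: accumulate exactly, round once at the output.
acc = (0, 0)
for v in values:
    acc = add_unnormalized(acc, unpack_754(v))
fused_sum = pack_754(*acc)

print(sequential, fused_sum)
```

With these inputs, the conventional path loses both small terms to rounding, while the deferred path keeps their combined contribution. That is one intuition for why removing intermediate normalize stages can improve accuracy as well as saving logic.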

On the positive side, these optimizations produce excellent results, Parker claimed. “We have data showing our sum-of-squares error compared to double-precision is four to five times lower than what you get performing the same operations in single-precision IEEE 754 format,” he said.
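For context, a sum-of-squares error against a double-precision reference can be measured along the following lines. This Python sketch covers only the single-precision IEEE 754 side of the comparison, on a hypothetical dot-product workload with a fixed random seed. It does not model DSP Builder's internal format and makes no attempt to reproduce Parker's figures.

```python
import random
import struct

def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot_double(a, b) -> float:
    """Reference result, accumulated in double precision."""
    return sum(x * y for x, y in zip(a, b))

def dot_single(a, b) -> float:
    """Same dot product, rounding to float32 after every multiply and add."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_float32(acc + to_float32(x * y))
    return acc

def sum_of_squares_error(results, references) -> float:
    return sum((r - ref) ** 2 for r, ref in zip(results, references))

rng = random.Random(0)  # fixed seed so the run is repeatable

def random_vector(n: int):
    # Inputs are rounded to float32 so both paths see identical data.
    return [to_float32(rng.uniform(-1.0, 1.0)) for _ in range(n)]

trials = [(random_vector(32), random_vector(32)) for _ in range(50)]
references = [dot_double(a, b) for a, b in trials]
singles = [dot_single(a, b) for a, b in trials]

sse = sum_of_squares_error(singles, references)
print(f"sum-of-squares error, float32 vs double: {sse:.3e}")
```

A comparison like the one Parker describes would compute the same metric for the tool's output and report the ratio of the two errors.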

On the less positive side, the optimizations are so aggressive that you must treat the VHDL as a black box. “You can’t really edit the code without breaking it,” Parker warned. Instead, DSP Builder offers switches to control degree of optimization effort, use of soft logic-cell RAM vs. block RAM, and use of the FPGA’s hard multiplier blocks.

Estimation and debug of the black box can also be issues. DSP Builder gives you estimates of resource consumption and maximum frequency along with your VHDL. But you won’t really know where you stand on power, cell use, or timing until you’ve run the design through Quartus II. So tuning a design for minimum energy could involve iterations through a one- or two-day loop.

The first level of verification, of course, is in Simulink—getting the algorithm right. In principle, if it works in 754 format in Simulink, it should work better in the FPGA. If you are the untrusting type, DSP Builder can invoke ModelSim and initialize it to drive RTL test vectors into the VHDL model. This may be of limited usefulness if you don’t understand the structure of the VHDL model, but at least it is another data point in the verification process.

Recognizing that, even with floating-point DSP Builder, implementing a real application is still a significant undertaking, Altera is building both a math library—the objective is most or all of math.h—and a set of verified reference designs for hard but common problems. The package should be a significant help for designers who, for whatever reason, need both the performance of FPGAs and the dynamic range of floating-point.
