Richard Lyons takes on high speed vector magnitude approximation in another in an on-going series of DSP design hints and tricks
By Richard G. Lyons
The quadrature processing techniques employed in spectrum analysis,
computer graphics, and digital
communications routinely require high
speed determination of the magnitude of a complex number (vector V)
given its real and imaginary parts, i.e., the in-phase part I and the
quadrature-phase part Q. This magnitude calculation requires a square
root operation because the magnitude of V is
Assuming that the sum I2+ Q2 is available, the
problem is to efficiently perform the square root operation.
There are several ways to obtain square roots. The optimum technique
depends on the capabilities of the available hardware and software. For
example, when performing a square root using a high-level software
language, we employ whatever software square root function is
available. Although accurate, software routines can be very slow.
By contrast, if a system must accomplish a square root operation in
50 nanoseconds, high-speed magnitude approximations are required. Let's
look at a neat magnitude approximation scheme that's particularly
hardware efficient. There is a technique called the "alpha max plus
beta min" algorithm for calculating the magnitude of a complex vector.
It's a linear approximation to the vector magnitude problem that
requires the determination of which orthogonal vector, I or Q, has the
greater absolute value. If the maximum absolute value of I or Q is
designated by Max, and the minimum absolute value of either I or Q is
Min, an approximation of |V| using the "alpha max plus beta min"
algorithm is expressed as
There are several pairs for the Alpha and Beta constants that
provide varying degrees of vector magnitude approximation accuracy to
within 0.1 dB. The "alpha max plus beta min" algorithms determine a
vector magnitude at whatever speed it takes a system to perform a
magnitude comparison, two multiplications, and one addition.
But those algorithms require, as a minimum, a 16-bit multiplier to
achieve reasonably accurate results. If, however, hardware multipliers
are not available, all is not lost. By restricting the Alpha and Beta
constants to reciprocals of integer powers of two, Equation (13"6) above lends
itself well to implementation in binary integer arithmetic.
A prevailing application of the "alpha max plus beta min" algorithm
uses Alpha = 1.0 and Beta = 0.5. The 0.5 multiplication operation is
performed by shifting the minimum quadrature vector magnitude, Min, to
the right by one bit.
We can gauge the accuracy of any vector magnitude estimation
algorithm by plotting its error as a function of vector phase angle.
Let's do that. The "alpha max plus beta min" estimate for a complex
vector of unity magnitude, using
over the vector angular range of 0 to 90 degrees, is shown as the
solid curve in Figure 13"7 below.
(The curves in the figure repeat
every 90 degrees.)
An ideal estimation curve for a unity magnitude vector would have a
value of one. We'll use this ideal estimation curve as a yardstick to
measure the merit of various "alpha max plus beta min" algorithms.
Let's make sure we know what the solid curve in Figure 13"7 is telling us. It
indicates that a unity magnitude vector oriented at an angle of
approximately 26 degrees will be estimated by Equation (13"7) to have a magnitude
of 1.118 instead of the correct magnitude of one.
 |
| Figure
13-7. Normalized "alpha max plus beta min"estimates for Alpha = 1; and
Beta = 1/2, 1/4, and 3/8. |
The error then, at 26 degree, is 11.8%, or 0.97 dB. Analyzing the
entire solid curve in Figure 13"7 results in an average error, over the
0 to 90 degree range, of 8.6% (0.71 dB). Figure 13"7 also presents the error
curves for the Alpha = 1 and Beta = 1/4, and the Alpha = 1 and Beta =
3/8 values.
Although the values for Alpha and Beta in Figure 13"7 yield rather accurate
vector magnitude estimates, there are other values for Alpha and Beta
that deserve our attention because they result in smaller error
standard deviations.
Consider the Alpha = 7/8 and Beta = 7/16 pair where its error curve
is the solid curve in Figure 13"8 below. A further improvement can be
obtained using Alpha = 15/16 and Beta = 15/32 having an error shown by
the dashed curve in Figure 13"8 below.
The Alpha = 15/16 and Beta = 15/32 pair give rather good results
when compared to the optimum floating point values of Alpha =
0.96043387 and Beta = 0.397824735, whose error is the dotted curve in Figure 13"8.
 |
| Figure
13-8. Estimates for Alpha = 7/8, Beta = 7/16; Alpha = 15/16, Beta =
15/32; and Alpha = 0.96043387, Beta = 0.397824735. |
Although using Alpha = 15/16 and Beta = 15/32 appears to require two
multiplications and one addition, its digital hardware implementation
can be straightforward, as shown in
Figure
13"9 below. The diagonal lines, \1 for example, denote a
hardwired shift of one
bit to the right to implement a divide-by-two
operation by truncation.
Likewise, the \4 symbol indicates a right shift by four bits to
realize a divide-by-16 operation. The |I|>|Q| control line is TRUE
when the magnitude of I is greater than the magnitude of Q, so that Max
= |I| and Min = |Q|.
This condition enables the registers to apply the values |I| and
|Q|/2 to the adder. When |I|>|Q| is FALSE, the registers apply the
values |Q| and |I|/2 to the adder. Notice that the output of the adder,
Max + Min/2, is the result of Eq. (13"26). Equation (13"29) is
implemented via the subtraction of (Max + Min/2)/16 from Max + Min/2.
In Figure 13"9 below, all
implied multiplications are performed by hardwired bit shifting and the
total execution time is limited only by the delay times associated with
the hardware components.
One thing to keep in mind. Because the various |V| estimations can
exceed the correct normalized vector magnitude value, i.e., some
magnitude estimates are greater than one.
This means that although the correct magnitude value may be within
the system's full-scale word width, an algorithm result may exceed the
word width of the system and cause overflow errors. With "alpha max
plus beta min" algorithms the user must be certain that no true vector
magnitude exceeds the value that will produce an estimated magnitude
greater than the maximum allowable word width.
 |
| Figure
13"9 Hardware implementation using Alpha = 15/16 and Beta = 15/32. |
The penalty we pay for the convenience of having Alpha and Beta as
powers of two is the error induced by the division-by-truncation
process, and we haven't taken that error into account thus far.
The error curves in Figure 13"7 and
Figure 13"8 were obtained using
a software simulation with its floating point accuracy, and are useful
in evaluating different Alpha and Beta values. However, the true error
introduced by the "alpha max plus beta min" algorithm will be somewhat
different, due to division errors when truncation is used with finite
word widths.
For "alpha max plus beta min" schemes, the truncation errors are a
function of the data's word width, the algorithm used, the values of
both |I| and |Q|, and the vector's phase angle. (These errors due to truncation compound
the errors already inherent in our "alpha max plus beta min" algorithms.)
However, modeling has shown that for an 8-bit system (maximum vector magnitude = 255) the
truncation error is less than 1%. As the system word width increases
the truncation errors approach 0%, this means that truncation errors
add very little to the inherent "alpha max plus beta min" algorithm
errors. The relative performance of the various algorithms is
summarized in Table 13"2 below.
 |
| Table
13"2. "Alpha max plus beta min" Algorithm Comparisons |
The last column in Table 13"2
illustrates the maximum allowable true vector magnitude as a function
of the system's full-scale (F.S.) word width to avoid overflow errors.
So, the "Alpha max plus beta min" algorithm enables high speed
vector magnitude computation without the need for math coprocessor or
hardware multiplier chips. Of course, with the recent availability of
high-speed floating point multiplier integrated circuits - with their
ability to multiply in one or two clock cycles - Alpha and Beta may not
always need to be restricted to reciprocals of integer powers of two.
It's also worth mentioning that this algorithm can be nicely
implemented in a single hardware integrated circuit (for example, a field programmable gate array)
affording high speed operation.
Used
with the permission of the publisher, Prentice Hall, this on-going
series of articles is based on copyrighted material from "Understanding
Digital Signal Processing, Second Edition" by Richard G. Lyons. The
book can be purchased on line.
Richard Lyons is a consulting
systems engineer and lecturer with Besser Associates. As a
lecturer with Besser and an instructor for the University of California
Santa Cruz Extension, Lyons has delivered digitasl signal processing
seminars and training course at technical conferences as well at
companies such as Motorola, Freescale, Lockheed Martin, Texas
Instruments, Conexant, Northrop Grumman, Lucent, Nokia, Qualcomm,
Honeywell, National Semiconductor, General Dynamics and Infinion.