Digital Signal Processing Tricks - High-speed vector magnitude approximationThe quadrature processing techniques employed in spectrum analysis, computer graphics, and digital communications routinely require high speed determination of the magnitude of a complex number (vector V) given its real and imaginary parts, i.e., the in-phase part I and the quadrature-phase part Q. This magnitude calculation requires a square root operation because the magnitude of V is
Assuming that the sum I2+ Q2 is available, the problem is to efficiently perform the square root operation.
There are several ways to obtain square roots. The optimum technique depends on the capabilities of the available hardware and software. For example, when performing a square root using a high-level software language, we employ whatever software square root function is available. Although accurate, software routines can be very slow.
By contrast, if a system must accomplish a square root operation in 50 nanoseconds, high-speed magnitude approximations are required. Let's look at a neat magnitude approximation scheme that's particularly hardware efficient. There is a technique called the "alpha max plus beta min" algorithm for calculating the magnitude of a complex vector.
It's a linear approximation to the vector magnitude problem that requires the determination of which orthogonal vector, I or Q, has the greater absolute value. If the maximum absolute value of I or Q is designated by Max, and the minimum absolute value of either I or Q is Min, an approximation of |V| using the "alpha max plus beta min" algorithm is expressed as
There are several pairs for the Alpha and Beta constants that provide varying degrees of vector magnitude approximation accuracy to within 0.1 dB. The "alpha max plus beta min" algorithms determine a vector magnitude at whatever speed it takes a system to perform a magnitude comparison, two multiplications, and one addition.
But those algorithms require, as a minimum, a 16-bit multiplier to achieve reasonably accurate results. If, however, hardware multipliers are not available, all is not lost. By restricting the Alpha and Beta constants to reciprocals of integer powers of two, Equation (13"6) above lends itself well to implementation in binary integer arithmetic.
A prevailing application of the "alpha max plus beta min" algorithm uses Alpha = 1.0 and Beta = 0.5. The 0.5 multiplication operation is performed by shifting the minimum quadrature vector magnitude, Min, to the right by one bit.
We can gauge the accuracy of any vector magnitude estimation algorithm by plotting its error as a function of vector phase angle. Let's do that. The "alpha max plus beta min" estimate for a complex vector of unity magnitude, using
over the vector angular range of 0 to 90 degrees, is shown as the solid curve in Figure 13"7 below. (The curves in the figure repeat every 90 degrees.)
An ideal estimation curve for a unity magnitude vector would have a value of one. We'll use this ideal estimation curve as a yardstick to measure the merit of various "alpha max plus beta min" algorithms. Let's make sure we know what the solid curve in Figure 13"7 is telling us. It indicates that a unity magnitude vector oriented at an angle of approximately 26 degrees will be estimated by Equation (13"7) to have a magnitude of 1.118 instead of the correct magnitude of one.
|Figure 13-7. Normalized "alpha max plus beta min"estimates for Alpha = 1; and Beta = 1/2, 1/4, and 3/8.|
The error then, at 26 degree, is 11.8%, or 0.97 dB. Analyzing the entire solid curve in Figure 13"7 results in an average error, over the 0 to 90 degree range, of 8.6% (0.71 dB). Figure 13"7 also presents the error curves for the Alpha = 1 and Beta = 1/4, and the Alpha = 1 and Beta = 3/8 values.
Although the values for Alpha and Beta in Figure 13"7 yield rather accurate vector magnitude estimates, there are other values for Alpha and Beta that deserve our attention because they result in smaller error standard deviations.
Consider the Alpha = 7/8 and Beta = 7/16 pair where its error curve is the solid curve in Figure 13"8 below. A further improvement can be obtained using Alpha = 15/16 and Beta = 15/32 having an error shown by the dashed curve in Figure 13"8 below.
The Alpha = 15/16 and Beta = 15/32 pair give rather good results when compared to the optimum floating point values of Alpha = 0.96043387 and Beta = 0.397824735, whose error is the dotted curve in Figure 13"8.
|Figure 13-8. Estimates for Alpha = 7/8, Beta = 7/16; Alpha = 15/16, Beta = 15/32; and Alpha = 0.96043387, Beta = 0.397824735.|
Although using Alpha = 15/16 and Beta = 15/32 appears to require two multiplications and one addition, its digital hardware implementation can be straightforward, as shown in Figure 13"9 below. The diagonal lines, \1 for example, denote a hardwired shift of one bit to the right to implement a divide-by-two operation by truncation.
Likewise, the \4 symbol indicates a right shift by four bits to realize a divide-by-16 operation. The |I|>|Q| control line is TRUE when the magnitude of I is greater than the magnitude of Q, so that Max = |I| and Min = |Q|.
This condition enables the registers to apply the values |I| and |Q|/2 to the adder. When |I|>|Q| is FALSE, the registers apply the values |Q| and |I|/2 to the adder. Notice that the output of the adder, Max + Min/2, is the result of Eq. (13"26). Equation (13"29) is implemented via the subtraction of (Max + Min/2)/16 from Max + Min/2.
In Figure 13"9 below, all implied multiplications are performed by hardwired bit shifting and the total execution time is limited only by the delay times associated with the hardware components.
One thing to keep in mind. Because the various |V| estimations can exceed the correct normalized vector magnitude value, i.e., some magnitude estimates are greater than one.
This means that although the correct magnitude value may be within the system's full-scale word width, an algorithm result may exceed the word width of the system and cause overflow errors. With "alpha max plus beta min" algorithms the user must be certain that no true vector magnitude exceeds the value that will produce an estimated magnitude greater than the maximum allowable word width.
|Figure 13"9 Hardware implementation using Alpha = 15/16 and Beta = 15/32.|
The penalty we pay for the convenience of having Alpha and Beta as powers of two is the error induced by the division-by-truncation process, and we haven't taken that error into account thus far.
The error curves in Figure 13"7 and Figure 13"8 were obtained using a software simulation with its floating point accuracy, and are useful in evaluating different Alpha and Beta values. However, the true error introduced by the "alpha max plus beta min" algorithm will be somewhat different, due to division errors when truncation is used with finite word widths.
For "alpha max plus beta min" schemes, the truncation errors are a function of the data's word width, the algorithm used, the values of both |I| and |Q|, and the vector's phase angle. (These errors due to truncation compound the errors already inherent in our "alpha max plus beta min" algorithms.)
However, modeling has shown that for an 8-bit system (maximum vector magnitude = 255) the truncation error is less than 1%. As the system word width increases the truncation errors approach 0%, this means that truncation errors add very little to the inherent "alpha max plus beta min" algorithm errors. The relative performance of the various algorithms is summarized in Table 13"2 below.
|Table 13"2. "Alpha max plus beta min" Algorithm Comparisons|
The last column in Table 13"2 illustrates the maximum allowable true vector magnitude as a function of the system's full-scale (F.S.) word width to avoid overflow errors.
So, the "Alpha max plus beta min" algorithm enables high speed vector magnitude computation without the need for math coprocessor or hardware multiplier chips. Of course, with the recent availability of high-speed floating point multiplier integrated circuits - with their ability to multiply in one or two clock cycles - Alpha and Beta may not always need to be restricted to reciprocals of integer powers of two.
It's also worth mentioning that this algorithm can be nicely
implemented in a single hardware integrated circuit (for example, a field programmable gate array)
affording high speed operation.
Used with the permission of the publisher, Prentice Hall, this on-going series of articles is based on copyrighted material from "Understanding Digital Signal Processing, Second Edition" by Richard G. Lyons. The book can be purchased on line.
Richard Lyons is a consulting
systems engineer and lecturer with Besser Associates. As a
lecturer with Besser and an instructor for the University of California
Santa Cruz Extension, Lyons has delivered digitasl signal processing
seminars and training course at technical conferences as well at
companies such as Motorola, Freescale, Lockheed Martin, Texas
Instruments, Conexant, Northrop Grumman, Lucent, Nokia, Qualcomm,
Honeywell, National Semiconductor, General Dynamics and Infinion.