Last month I heaped praise on the book
Computer Approximations (John Hart et al, second edition, 1978, Malabar, FL: Robert Krieger Publishing Company). This mind-numbingly dull tome is
the Bible for anyone creating an approximation to pretty much any well-known function, from Bessel functions to trig. It's a collection of thousands of polynomials, each of which computes one function to a specific precision over some limited range. Hart and his coauthors also describe the intricacies behind the polynomials in a way only a math teacher would enjoy. The proofs, alas, are left to the student.
If you missed last month's article, here are the key points:
• Compiler vendors provide built-in functions to compute the cosine, square root, and many other functions. When there's no hardware floating-point support, these code chunks always use an approximating polynomial or ratio of polynomials.
• These vendor-supplied approximations may not be optimum for speed-sensitive real-time systems.
• There's an annoying tradeoff between precision, speed, and range. Like the triad "schedule, features, and quality" you can pick only two. Together, the three parameters fight like a trio of American Idol judges.
• Since each approximation works over a limited range, the code must reduce the input argument to something within those bounds. Range reduction is by far the trickiest part of creating an approximation.
I gave code and coefficients for the square and cube roots. In this column, we'll look at exponentiation and logarithms.
Power to the exponents!
The three most common exponentiations we use are powers of two, e, and 10. In a binary computer it's trivial to raise an integer to a power of two, but floating point is much harder. Approximations to the rescue!
Hart's power-of-two exponentials expect an argument in the range of 0.0 to 0.5. That's not terribly useful for most real applications so we must use a bit of code to reduce the input x to that range.
Consider the equation 2x=2aX2(x–a). Well, duh. That appears both contrived and trite. Yet it's the basis for our range-reduction code. The trick is to select an integer a so that x is between 0 and 1. An integer, because it's both trivial and fast to raise 2 to any int. Simply set a to the integer part of x and feed (x-int(x)) into the approximating polynomial. Multiply the result by the easily computed 2a to correct for the range reduction.
But I mentioned that Hart wants inputs in the range of 0.0 to 0.5, not 0.0 to 1.0. If (x-int(a)) is greater than 0.5 use the relation:
(1)
instead of the one given earlier. The code is simpler than the equation. Subtract 0.5 from x and then compute the polynomial for (x-int(x)) as usual. Correct the answer by:
Exponents can be negative as well as positive. Remember that:
(2)
Solve for the absolute value of x and then take the reciprocal.
Though I mentioned it's easy to figure 2a, that glib statement glosses over a bit of complexity. The code in Listing 1 uses the worst possible strategy: it laboriously does floating-point multiplies. That's a general solution devoid of confusing chicanery: although it's clear, it's slow.
Listing 1: Code to reduce the range of an exponential
/***************************************************
*
* reduce_expb - The expb routines require an input argument in
* the range[0.0, 0.5].This routine reduces the argument to that range.
*
* Return values:
* - "arg", which is the input argument reduced to that range
* - "two_int_a", which is 2**(int(arg))
* - "adjustment", which is an integer flag set to zero if the fractional
* part of arg is <=0.5; set to one otherwise
*
* How this range reduction code works:
* (1) 2**x = 2**a * 2**(x - a)
* and,
* (2) 2**x = 2**(1/2) * 2**a * 2**(x - a - 1/2)
*
* Now, this all looks pretty contrived. But the trick is to pick an integer "a"
* such that the term (x - a) is in the range [0.0, 1.0]. If the result is in
* the range we use equation 1. If it winds up in [0.5, 1.0], use equation (2) which
* will get it down to our desired [0.0, 0.5] range.
*
* The return value "adjustment" tells the calling function if we are using the
* first or the second equation.
*
***************************************************/
void reduce_expb(double *arg, double *two_int_a, int *adjustment)
{
int int_arg; // integer part of the input argument
*adjustment = 0; // Assume we're using equation (2)
int_arg = (int) *arg;
if((*arg - int_arg) > 0.5) // if frac(arg) is in [0.5, 1.0]...
{
*adjustment = 1;
*arg = *arg - 0.5; // ... then change it to [0.0, 0.5]
}
*arg = *arg - (double) int_arg; // arg is now just the fractional part
// Now compute 2** (int) arg.
*two_int_a = 1.0;
for(; int_arg!=0; int_arg--)*two_int_a = *two_int_a * 2.0;
};
Other options exist. If the input x never gets very big, use a look-up table of precomputed values. Even a few tens of entries will let the resulting 2
x assume huge values.
Another approach simply adds
a to the number's mantissa. Floating-point numbers are represented in memory using two parts, one fractional (the characteristic), plus an implied 2 raised to the other part, the mantissa. Apply
a to the mantissa to create 2
a. That's highly nonportable but very fast.
Or, you could shift a 1 left through an int or long
a times and then convert that to a float or double.
Computer Approximations lists polynomials for 46 variants of 2
x, with precisions ranging from four to 25 decimal digits. Room and sanity prohibit listing them all, but here are two of the most useful examples. Both assume the input argument is between 0 and 0.5, and both use the following ratio of two polynomials:

(3)
The following coefficients yield a precision of 6.36 decimal digits. Note that
P(
x) is simply an offset, and Q01 is 1, making this a very fast and reasonably accurate approximation:
P00 (– 1) +0.86778 38827 9
Q00 (– 2) +0.25039 10665 03
Q01 (+ 1) +0.1
For 9.85 digits of precision use the somewhat slower (also see Listing 2):
P00 (+ 1) +0.72151 89152 1493
P01 (– 1) +0.57690 07237 31
Q00 (+ 2) +0.20818 92379 30062
Q01 (+ 1) +0.1
Listing 2: 2x to 9.85 digits
/**************************************************
*
* expb1063 - compute 2**x to 9.85 digits accuracy
*
***************************************************/
double expb1063(double arg){
const double P00 = + 7.2152891521493;
const double P01 = + 0.0576900723731;
const double Q00 = +20.8189237930062;
const double Q01 = + 1.0;
const double sqrt2= 1.4142135623730950488; // sqrt(2) for scaling
double two_int_a; // 2**(int(a)); used to scale result
int adjustment; // set to 1 by reduce_expb if must...
// ...adjust the answer by sqrt(2)
double answer; // The result
double Q; // Q(x**2)
double x_P; // x*P(x**2)
int negative=0; // 0 if arg is +; 1 if negative
// Return an error if the input is too large. "Too large" is entirely
// a function of the range of your floating point library and expected inputs
// used in your application.
if(abs(arg>100.0))
{
printf("\nYikes! %d is a big number, fella. Aborting.", arg);
return 0;
}
// If the input is negative, invert it. At the end we'll take
// the reciprocal, since n**(-1) = 1/(n**x).
if(arg<0.0)
{
arg = -arg;
negative = 1;
}
reduce_expb(&arg, &two_int_a, &adjustment); // reduce input to [0.0, 0.5]
// The format of the polynomial is:
// answer=(Q(x**2) + x*P(x**2))/(Q(x**2) - x*P(x**2))
//
// The following computes the polynomial in several steps:
Q = Q00 + Q01 * (arg * arg);
x_P = arg * (P00 + P01 * (arg * arg));
answer= (Q + x_P)/(Q - x_P);
// Now correct for the scaling factor of 2**(int(a))
answer= answer * two_int_a;
// If the result had a fractional part > 0.5, correct for that
if(adjustment == 1)answer=answer * sqrt2;
// Correct for a negative input
if(negative == 1) answer = 1.0/answer;
return(answer);
};
If you're feeling a need for serious accuracy try the following coefficients, which are good to an astonishing 24.78 digits. You'll need a more complex range-reduction algorithm to keep the input between [0, 1/256]. Try the coefficients in this table:
P00 (+ 1) +0.72134 75314 61762 84602 46233 635
P01 (- 1) +0.57762 26063 55921 17671 75
Q00 (+ 2) +0.20813 69012 79476 15341 50743 885
Q01 (+ 1) +0.1
Note that the polynomials, both of degree 1 only, quickly plop out an answer.
Other exponentials
What about the more common natural and decimal exponentiations?
Computer Approximations lists many that are accurate over a variety of ranges. The most useful seem to be those that work between 0.0 to 0.5 and 0.0 to 1.0. Use a range-reduction approach much like I just described, substituting the appropriate base: 10
x=10
aX10
(x–a) or
ex=
eaX
e(x–a).
Using the same ratio of polynomials in Equation 3 the coefficients in this next table give 10
x to 12.33 digits.
P00 (+ 2) +0.41437 43559 42044 8307
P01 (+ 1) +0.60946 20870 43507 08
P02 (- 1) +0.76330 97638 32166
Q00 (+ 2) +0.35992 09924 57256 1042
Q01 (+ 2) +0.21195 92399 59794 679
Q02 (+ 1) +0.1
The downside of using this or similar polynomials is you've got to solve for that pesky 10
a. Though a look-up table is one fast possibility if the input stays in a reasonable range, there aren't many other attractive options. If you're working with an antique IBM 1602 BCD machine perhaps it's possible to left shift by factors of 10 in a way analogous to that I've outlined for exponents of two. In today's binary world there are no left-shift decimal instructions so we must multiply by 10. Instead, combine one of the following relations:

(4)
or:

(5)
with the power-of-two approximations as shown in Listing 3.
Listing 3: Computing 10x using a 2x approximation
/**************************************************
*
* expd_based_on_expb1063 - compute 10**x to 9.85 digits
accuracy
*
* This is one approach to computing 10**x. Note that:
* 10**x = 2** (x / log_base_10(2)), so
* 10**x = expb(x/log_base_10(2))
*
***************************************************/
double expd_based_on_expb1063(double arg)
{
const double log10_2 = 0.30102999566398119521;
return(expb1063(arg/log10_2));
}
Logarithms
There's little magic to logs, other than a different range-reduction algorithm. Again, it's easiest to develop code that works with logarithms of base 2. Note that:

(6)
We pick an
n that keeps
f between 0.5 to 1.0, since the
Computer Approximations most useful approximations require that range. The reduction algorithm, shown in Listing 4, is similar to the one for square roots that I described in the last month's column. Listing 5 gives code to compute log base 2 to 4.14 digits of precision.
Listing 4: Log range-reduction code
/***************************************************
*
* reduce_log2 - The log2 routines require an input argument in
* the range[0.5, 1.0].This routine reduces the argument to that range.
*
* Return values:
* - "arg", which is the input argument reduced to that range
* - "n", which is n from the discussion below
* - "adjustment", set to 1 if the input was < 0.5
*
* How this range reduction code works:
* If we pick an integer n such that x=f x 2**n, and 0.5<= f < 1, and
* assume all "log" functions in this discussion means log(base 2), then:
* log(x) = log(f x 2**n)
* = log(f) + log(2**n)
* = log(f) + n
*
* The ugly "for" loop shifts a one through two_to_n while shifting a long
* version of the input argument right, repeating till the long
* version is all zeroes. Thus, two_to_n and the input argument have this relationship:
*
* two_to_n input argument n
* 1 0.5 to 1 0
* 2 1 to 2 1
* 4 2 to 4 2
* 8 4 to 8 3
* etc.
*
* There's a special case for the argument being less than 0.5.
* In this case we use the relation:
* log(1/x) =log(1) -log(x)
* = -log(x)
* That is, we take the reciprocal of x (which will be bigger than 0.5) and solve
* for the above. To tell the caller to do this, set "adjustment" = 1.
*
***************************************************/
void reduce_log2(double *arg, double *n, int *adjustment)
{
long two_to_n; // divisor we're looking for: 2^(2k)
long l_arg; // long (32 bit) version of input
*adjustment=0;
if(*arg<0.5)
{ // if small arg use the reciprocal
*arg=1.0/(*arg);
*adjustment=1;
}
// shift arg to zero while computing two_to_n as described above
l_arg=(long) *arg; // l_arg is long version of input arg
for(two_to_n=1, *n=0.0; l_arg!=0; l_arg>>=1, two_to_n<<=1, *n+=1.0);
*arg=*arg / (double) two_to_n; // normalize input argument to [0.5, 1]
};
Listing 5: Log base 2 code good to 4.14 digits
/**************************************************
*
* LOG2_2521 - compute log(base 2)(x) to 4.14 digits
accuracy
*
***************************************************/
double log2_2521(double arg)
{
const double P00 = -1.45326486;
const double P01 = +0.951366714;
const double P02 = +0.501994886;
const double Q00 = +0.352143751;
const double Q01 = +1.0;
double n; // used to scale result
double poly; // result from polynomial
double answer; // The result
int adjustment; // 1 if argument was < 0.5
// Return an error if the input is <=0 since log is not real at and below 0.
if(arg<=0.0)
{
printf("\nHoly smokes! %d is too durn small. Aborting.", arg);
return 0;
}
reduce_log2(&arg, &n,
&adjustment); // reduce input to [0.5, 1.0]
// The format of the polynomial is P(x)/Q(x)
poly= (P00 + arg * (P01 + arg * P02)) / (Q00 + Q01 * arg);
// Now correct for the scaling factors
if(adjustment)answer = -n - poly;
else answer=poly + n;
return(answer);
};
Need better results? The following coefficients are good to 8.32 digits over the same range:
P00 (+1) -.20546 66719 51
P01 (+1) -.88626 59939 1
P02 (+1) +.61058 51990 15
P03 (+1) +.48114 74609 89
Q00 (+0) +.35355 34252 77
Q01 (+1) +.45451 70876 29
Q02 (+1) +.64278 42090 29
Q03 (+1) +.1
Though Computer Approximations includes plenty of approximations for common and natural logs, it's probably more efficient to compute log2 and use the change of base formulae:
(7)
and:
(8)
So far I've talked about the approximations' precision imprecisely. What exactly does precision mean? Computer Approximations lists relative precision for roots and exponentials:

(9)
Logarithms are a different game altogether; the book uses absolute precision, defined by:
(10)
To figure decimal digits of precision use:
(11)
The good book
Computer Approximations includes polynomials for all of the standard and inverse trig functions and much more. See www.embedded.com/code or www.ganssle.com/app.htm for implementations of trig functions.
Converting the terse range reduction explanations to useful algorithms does take a bit of work, but the book is an invaluable treasure for people faced with creating approximations. The samples I've shown just touch a fraction of a percent of the approximations contained in the book.
It's pretty cool to crank parameters into these simple equations that use such bizarre-looking coefficients and to see astonishingly correct answers pop out every time.
Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges. Contact him at jack@ganssle.com.
Have you seen the book "An Atlas of Functions" by Spanier and Oldham? They also offer approximations, but fewer than "Computer Approximations". It's a fun book, as they go.
- Fred Williams
Hart is a dangerous book. In my work with a math Ph.D'ed colleague on high speed algorithms for Map Projection, we started comapring Hart's algorithms to processor architecture features. Many of the algorithms in Hart like those given above are rational algorithms and as such have a floating point divide in them.
On most desktop and workstation processors do double precision floating point multiplies in one clock cycle while single or double precision floating point divides and interger divides normally take 32 cycles. Processor pipelines also often stall on the divide operation. Our research showed that in our application domain that one was usually better off using a long polynomial approximation rather than a rational approximation. This may not hold for some of the embedded processors with no FPU's but if you have FPU support, beware that some optimization based on rational functions may be pessimization.
A perusal of the cycle counts for your architecture's floating point operations may be enlightening.
- Cameron Kellough