# A root-finding algorithm

**Crenshaw offers up “The New and Improved World's Best Root Finder” and he's got the code to prove it.**

Three college students are asleep, in three different rooms. Before turning in, each student has a smoke and tosses the cigarette butt into the wastebasket next to his or her bed. Each awakens to find a fire blazing in the basket. Each has a pitcher of water on the bedside table.

The engineering student tosses the contents of the pitcher into the basket and douses the fire. Satisfied that the fire is out, he rolls over and goes back to sleep.

The physics student takes a look at the basket, assesses the size of the fire, and quickly estimates how much water will be needed to douse it. She pours exactly that amount from the pitcher. Satisfied that the fire is indeed out, she rolls over and goes back to sleep.

Likewise, the math major wakes up, takes in the fire, and recognizes that there's enough water in the pitcher to put it out. Satisfied that a solution exists, he rolls over and goes back to sleep.

The story is a little parable about mathematicians' obsession with existence theorems. Sometimes mathematicians find it more interesting to prove or disprove that a problem has a solution than to actually solve the problem.

Despite my reputation as *Embedded Systems Programming* 's math guru, I've never thought of myself as much of a mathematician. If I am one, I'm more of an applied one than a theoretical one. I've always been, I thought, about solving real problems in the real world.

But twice now, in recent months, I fear I've behaved much like that math major with the fire in his wastebasket. Once I satisfied myself that a solution was at hand, I sort of lost interest in producing it.

I'm talking about two pieces of code that I promised to upload to Embedded.com. The first was for the function minimizer I called “minimizer lite,” the end product of months of study on methods for finding the minimum of a function in general, and Brent's method in particular. The second was the code for my new and improved version of The World's Best Root Finder (TWBRF). I never got around to uploading either function. I guess it was enough for me (if not for you) that a solution existed.

I suppose it will come as no shock to learn that my column covers things I'm doing in my day job. The topics I choose tend to be something I'm thinking deeply about at the time. Though I've always been interested in ways to minimize functionsI've been looking for a good algorithm since 1970 or somy interest was heightened by a seeming flurry of problems that needed optimal solutions. Only one of many cases was a gas analyzer that required a real-time estimate of the concentrations of several gases.

More recently, I had occasion to trot out and dust off TWBRF to find an iterative solution to an orbit-mechanics problem. In the process, I had reason to worry that I'd somehow broken the algorithm, so I started digging into it and, in the process, found areas for improvement.

**TWBRF redux**

My first step in that direction has been to post the long-awaited new version of TWBRF, which should now rightly be called The New, Improved, and Even Better Root Finder (TNIEBRF). The changes are both subtle enough and important enough that I need to explain them. So I'm digressing once more, but I promise to get back on topic soon.

TWBRF has had a remarkable history. Developed in the '60s by some unknown genius, it was distributed by IBM as part of its Scientific Subroutine Package. It's intended to locate a root of a nonlinear equation, as in:

Equation 1

*ƒ* (*x* )=0

where *ƒ* (*x* ) can be any function of *x* at all, no matter how horrible. The only restrictions are that *ƒ* (*x* ) is continuous, and *ƒ* (*x* ) crosses zero at least once, somewhere in the specified interval.

These restrictions being satisfied, there has to be at least one value of x such that Equation 1 is satisfied, and TWBRF is guaranteed to find it, within some limit of accuracy. Guaranteed. You don't hear *that* very often in the software world.

TWBRF works by alternating steps of bisection and inverse parabolic interpolation. The bisection method is slow but sure. Other methods, such as Brent's method (not to be confused with the minimizer of the same name), skip the bisections and use only successive iterations. That makes them faster, but I feel the bisection adds stability that helps tremendously in tricky situations. I've never found a better root-finder than TWBRF, and it's recently come to my attention that some highly respected vendors of math software use it, also.

My changes to TWBRF operate in two main areas. First, I've changed the structure to better accommodate multiple tests qualifying the parabolic interpolation. Second, I've changed convergence criterion rather radically. Let's discuss the second one first.

As with most iterative methods, when you use TWBRF you must give it an epsilon (**ε** ) convergence criterion. The intent is that the iteration will stop when the estimated error is within **ε** of zero. In the original IBM algorithm, as with my earlier translations, that test is an absolute one, testing the value of the range, *x _{2} -x_{1} * . Over years of use, I've noticed that the test seems to be pessimistic. If, say, I give the algorithm an

**ε**of 10

^{-6}, I actually get a result that's good to about 10

^{-8}or better.

That never really bothered me much. After all, who's going to complain if the result is better than expected? But the flip side of the coin is, the algorithm has apparently taken more steps of iteration than it needed to.

If you look at the geometry for a test case, it's easy to see why the test is pessimistic. The algorithm maintains two points that straddle the root (meaning that the signs of *ƒ* (*x* ) are different). They'll be called *x _{1} * and

*x*during an iteration, although

_{2}*x*gets redefined as

_{2}*x*for the next step. Typically, the root tends to lie very close to one of the points, say

_{3}*x*.

_{1}When I was working on my minimizer, I tried to keep the solution midway between reference points. That tends to avoid numerical error. But in the root finder, having this imbalance is a Good Thing. If we maintain one of the points very close to the root, we have a better chance of nailing it on the next iteration. This is precisely why the algorithm tends to dive for the root so aggressively once it gets close to it.

Unfortunately, the algorithm is deciding when to stop based on the full range, *x _{2} * –

*x*. It's unaware that one of the points is actually much closer to the root.

_{1}You can define convergence in several other ways. One typical way is to compare the estimated root with the value from the previous iteration. You stop the process when the estimate stops moving. However, it's been my experience that, for some pathological cases, you can get false convergence where you're still far from the root, but just happen to get two nearly equal estimates in a row. Testing the range in *x* may be pessimistic, but it's certainly not going to be fooled by false indications.

To see an even better solution, though, let's look at what the algorithm is supposed to be doing. It's looking for the case where:

Equation 2

*y* =*ƒ* (*x* )=0

If we're trying to decide when to stop, shouldn't we be looking at values of *y* , not *x* ?

Actually, the original IBM algorithm does, in fact, test *y* . But it's a half-hearted and ineffective test. What they did was to multiply the **ε** for *x* by 100 and use that for *y* . Of course, without knowing the nature of *ƒ* (*x* ), that test does little for us. If the slope of *ƒ* (*x* ), in the vicinity of the root, is around 100, it may be okay. But the slope may just as well be 1/100, or 10,000, in which case the test makes no sense. We simply don't know.

Anyhow, a relative test is always better, in my opinion, than an absolute one. It's easy to define a relative test in *x* , because we know the original span of *x* , which is the distance between the input values, *x _{1} * and

*x*. So I can define:

_{3}Equation 3

**ε** _{x} =**ε** (*x _{3} * –

*x*)

_{1}It certainly seems reasonable that we stop when we've narrowed the root down to some small percentage of the original range. One advantage of this approach is that I don't have to change **ε** when I change the problem. Since it's a relative error, we're likely to want the same kind of percentage improvement for the next problem.

It's more difficult to define a relative error in *y* , since we don't know, in advance, the range of *y* . We could always evaluate *ƒ* (*x* ) at *x _{1} * and

*x*, but that could lead to trouble. For all we know,

_{3}*ƒ*(

*x*) goes through massive swings elsewhere in the range, so our test could be impossible to satisfy.

I said that it's more difficult to use values of *y* , rather than *x* , in the convergence criterion. But it's not impossible. The trick is to make the range of *y* dynamic. In my new, improved version of TWBRF, I measure *y* = *ƒ* (*x* ) at every evaluation. I use that to maintain a dynamic range, based on the largest and smallest values of *y* encountered so far. The convergence criterion then compares *y* to:

Equation 4

**ε** * _{y} * =

**ε**(

*y*)

_{max}-y_{min}When I originally developed this approach, I struggled to find some way to combine tests in both *x* and *y* , using Equations 3 and 4. However, I could think of no good way to get around the pessimistic test. Fortunately, the test in *x* doesn't seem to be needed. Equation 4, applied alone, works nicely.

Some months back, reader John Lull sent me a pathological case for which TWBRF failed to converge. It failed because the function was a cubic with two roots close to the origin, but a third way-the-heck-out-there at *x* = -10^{14} . That value led to an even worse value of *y* , at -10^{47} . The new, improved TWBRF now handles this case, though perhaps not in the way John might have preferred.

You see, it's not so much that a root finder must solve every case; one can always invent pathological cases that drive it nuts. But it's very important that the algorithm keep its promise and work as advertised. The old version of TWBRF didn't do that for John's case. The new one does.

Suppose we give it an **ε** of 10^{-10} . Because the function has a y-range of 10^{47} , any value of *x* that yields |*y* |<10^{37} is acceptable. As I said, it may not be the solution John wanted to find, but the function will have kept to its contract with the user.

**A new function**

TWBRF calls a function, *ƒ* (*x* ), several places in its code. Because I'm now maintaining dynamic values for the maximum and minimum values of *ƒ* (*x* ), I found it convenient to write a separate function that maintains these values. Listing 1 shows the function.

**Listing 1: Evaluating ƒ (x )**

// This function evaluates a new point, sets the y range,

// and tests for convergence

double get_y(double x, double (*f)(double), double eps, double &ymax, double &ymin, bool &converged){

double y = f(x);

ymax = max(ymax, y);

ymin = min(ymin, y);

converged = (abs(y) < eps*(ymax-ymin));

return y;

}

** **

As you can see, the function has to maintain several different parameters. I've put them in the calling list as reference variables, rather than using global variables. This makes the root finder reentrant, capable of having multiple instances running around. The function not only evaluates *ƒ* (*x* ) and set the maximum and minimum values, it also tests the convergence criterion and sets the Boolean flag, **converged** , when the criterion is satisfied.

**A new structure**

Remember that TWBRF works by alternating steps of bisection and inverse parabolic interpolation. Architecturally, the flow of control might look like this:

**while(not converged){ bisect(); interpolate();}**

We don't always interpolate, however. There's a criterion used, which is a mathematical way of requiring that the estimated root be between *x _{1} * and

*x*.

_{2}So the architecture really looks like:

**while(not converged){ bisect(); if(test_ok) interpolate(); else dont();}**

This is basically the architecture used in all previous versions of TWBRF, including IBM's original version (though the architecture was well hidden under layers of three-way, arithmetic-if branches).

My problem was, I wanted to have more than one test. For example, I found a caseunhandled in all previous versionsin which one of the parameters could be zero, leading to a divide-by-zero exception. A very serious flaw.

Fortunately, a simple change to the architecture supports multiple testsit's shown in Listing 2.

**Listing 2: The improved architecture**

while(not converged){

bisect();

if(test_1){

if(test_2){

if(test_3){

…

interpolate();

}

}

}

}

** **

Of course, in practice, the function doesn't look as clean as implied by Listing 2, because typically intermediate computations are to be done between tests, and also alternate action if the test fails. Nevertheless, I think you'll find that the new architecture conforms to that of Listing 2.

**A test that failed the test**

In my many, many hand checksmore than I care to rememberof TWBRF, I sometimes came across steps where, although the criterion for parabolic interpolation was satisfied, the estimated root didn't seem to be very good at all. I found that this sometimes happened when the vertex of the parabola got in the way. Conceptually the algorithm tries to fit a parabola opening horizontally to three points. Occasionally I ran into a situation where the vertex of the parabola lay between *y _{1} * and

*y*. The test implemented in all versions of TWBRF doesn't address the location of the vertex at allit only looks at

_{2}*x*-values, not

*y*.

I thought it would surely be better if we rejected the interpolation if the vertex was in the way, and I developed the math for a test on that condition. However, when I included that test in the code, I found that its major effect was to increase the number of iterations. Apparently, a parabolic interpolation, even a bad one, is still better than a simple bisection. So I wrote this idea off as not one of my better ones and ripped the test back out again.

We are now donemaybe foreverwith root finders. You'll find TWBRF at ftp://ftp.embedded.com/pub/2004/05crenshaw. Use it in good health; at this point I don't think it can be improved upon.

**Don Lancaster**

Next month, we'll get back to discussions of logic chips, Karnaugh maps, and other good stuff. Meantime I'll be working to post the remaining code, which is minimizer lite.

I don't have enough room for logic this month, but I would like to get in a word about the fellow to whom I owe a great debt.

Some of you old readers of *Popular Electronics* and *Radio-Electronics* magazines will remember Don Lancaster. Some of you may recognize him from *Computer Shopper* for which he still writes. In the mid- to late sixties, Don kept us all up to date by writing prolifically about electronic gizmos of all sorts. As I understand it, Don would get announcements of all the new devices electronics companies were coming out with. He'd get a sample, take it down to his basement, and put together some circuits. Nothing fancyperhaps a code practice oscillator, an RF oscillator, a modulator or demodulatorthat sort of thing. But each month, he'd have a new article. One month, it might be a new IGFET transistor. The next month, a tunnel diode. Whatever, Don would have practical circuits put together and running, almost before the device was available to the rest of us.

Rumor has it that Don had a unique way of assuring a new article every month. For a few weeks, he'd tinker in his lab, trying things out, and taking notes of what he did. When the article deadline was drawing near, he'd backpack his portable typewriter up into an abandoned fire lookout tower and start writing. He resolved not to come down until the article was done. That would certainly give one an incentive to get on with it.

Whatever his technique, it sure worked. Some months would find articles on different gadgets in two or even three magazines.

One such article was a seminal one on the Fairchild RTL gates. That's how I learned they were affordable to a grad student like me, and I was on the phone ordering parts the very next day.

While I was tinkering around trying to count to four, Don sprang his next breakthrough: a series of decade counters, complete with displays. Don had the same problem I had with seven-segment displays; he solved the problem in a different way, with a vertical array of 10 lights, one for each digit. I used some of these counters in a timer I built for my son's midget race car club. They worked for decades without a glitch. The least reliable part in the whole setup was the light bulbs, but I found that once they got over infant mortality, they were dead reliable. It was those same light bulbs that I sought to use in my abortive ice tray displays.

In those days, magazines such as *Popular Electronics* had arrangements with companies like Micro Instrumentation and Telemetry Systems (MITS) and Southwest Technical Products (SWTP). Working with those authors who cared to participate, these companies would collect up all the parts needed for a given project, and ship them to you as kits, with a copy of the magazine article. Packaging, power supply, and mechanical arrangement were left strictly up to the builder.

SWTP supplied kits for many of Don Lancaster's projects. That's where I got the decade counters.

Some of you might remember MITS. In December 1974, they changed the world forever with a little computer called the Altair 8800. But that's another story.

Not surprisingly, Don Lancaster was thinking about output displays for calculators and computers, same as me. While I was struggling with cursive seven-segment displays la Hewlett-Packard, Don had a better idea. Instead of trying to generate *x* –*y* waveforms, Don decided to sweep the scope raster fashion, and use a character generator to generate dots. Even his method of driving the dots was innovative. Where I was working to drive the z-axis (intensity) of the scope, Don simply drove the waveform off the screen. Clever. You can't see the dot if it's not there.

Many of you will remember his next breakthrough, which made my Teflon ice trays, light bulbs, and scope waveform generators seem like a kid's scooter compared to a Yamaha R-1. Don called it the TV Typewriter, and it was a genuine CRT display, built around a surplus TV monitor and about 100 TTL chips. SWTP sold it, and I bought one. Like most of SWTP's stuff, it worked the first time out. I never did get around to interfacing it with a computer; the Radio Shack TRS-80 made that unnecessary. Even so, Lancaster's TV Typewriter was a breakthrough product and got us all primed for the PC revolution that was roaring down the freeway.

All you young whippersnappers, if you came along too late for all the fun, do a Google search on TV Typewriter, Don Lancaster, SWTP, MITS, and Altair 8800, and find out what things were like when real men soldered chips. We owe guys like Don Lancaster a huge debt of gratitude. Some days, while I'm waiting for my 1.7GHz whizbang to boot, I ponder how things might have turned out if Don had designed a computer. If he had, I'll bet SWTP would have sold it, and it would have worked. Without crashing, ever.

Thanks a lot, Don.

**Jack Crenshaw** is a senior software engineer at Spectrum-Astro and the author of *Math Toolkit for Real-Time Programming* , from CMP Books. He holds a PhD in physics from Auburn University. E-mail him at .