Black Tuesday

This month, Jack discusses the tragedy of September 11, a storm in Florida, and (we kid you not) the final function minimizer.

This is not going to be your typical Crenshaw column. As I type this, it is Tuesday, September 18, 2001, exactly one week, to the hour, after 9/11, a day that will live in infamy.

I'm old enough to have lived through World War II. I was a kid then, but still old enough to be aware of what was going on. Like everyone else who lived through those years, I remember Pearl Harbor, and the horrors that followed, as though they were yesterday. I remember the concern and the sacrifices of the families who sent young men and women off to war; the heartbreak when someone lost a son, brother, or father in the fighting; the hardships and voluntary sacrifices of those of us who were left behind; the excitement of D-Day, and the jubilation of V-E and V-J days. I suspect that people who lived through these times will forever be somewhat different than the Americans who followed later. It was a defining time in the history of this country, and it changed us forever.

World War II ended many debates. We Americans love to argue with each other. It's our freedom that allows us to. Before the war, just as last month, there were many opinions on politics, religion, world affairs, foreign policy, racism, states' rights, and more. You name it, we argued about it. Many argued we should stay out of the war, which was already two years old. Others said our policy towards the Japanese was at fault. Still others thought maybe Hitler had a good idea.

Hours after Pearl Harbor, debate ended. There was only grim-faced, steel-jawed determination. The only words that mattered were “Remember Pearl Harbor.”

The years 1941 to 1945 were hard. Many thousands of young men died: friends, neighbors, sons, and daughters. Many sacrifices were made. Food and fuel were rationed. Tires were almost unobtainable. New car production stopped. Prices were controlled. Speeds were limited to 35 mph, to save tire wear. And we didn't need police to enforce the limit, either. It was the patriotic thing to do.

World War II was arguably our last just war. It was a battle of Good vs. Evil. If we had any doubts as to the righteousness of our cause, we had only to review the images of Chinese peasants, kneeling with heads bowed as they were beheaded by Japanese officers. Or Jews in Europe wearing Star-of-David armbands, and being loaded onto cattle cars for transport to “concentration camps.” We were to learn later it was to gas chambers and ovens. It was a just war. We had to win. It was either that or perish. And to the survivors of the Holocaust, we vowed, “Never again.”

Over 2,400 people, mostly American servicemen, died at Pearl Harbor. More than double that died in New York City, and they weren't servicemen. They were innocent civilians: men, women, and children, whose only crime was to go to work that day, or board an airplane.

How can one explain such madness? Whose fault was this? Did the FBI fail in its mission? The CIA? Was our relationship with Israel to blame?

It doesn't matter anymore. The deed is done, the die is cast, the action has been taken. Make no mistake: we are in another war, like it or not. It will not be an easy one. Sacrifices will be made. Sacrifices have already been made. But it's a just war. If ever you have doubts, review again the images of Afghan women, kneeling with heads bowed as their brains are blown out by Taliban “clerics.” Or picture again, in your mind's eye, the horrifying images of the World Trade Center towers falling. Picture the humanity falling with them, crushed into oblivion, or being carried into them on wings of madness.

The time for debate is over. Put on your face of grim, steel-jawed determination. Adopt the slogan, “Remember 9/11.” And make a vow to the victims of that day: “Never again.”

Moving day

Things are popping here at Crenshaw farms. In all the chaos of Black Tuesday, most of you probably never even heard that Tropical Storm Gabrielle was attacking Florida. Tropical storm. That's meteorologese for “I'll huff, and I'll puff, and I'll drown your chickens, blow all your trees down, and flood your roads.” As near as I've been able to tell, Gabrielle's eye passed right smack over the bedroom of my house. Some 100,000 homes were without power, some of us for two and a half days. In the midst of all this, I'm in the process of moving, lock, livestock, and feed barrels, to a new home near Phoenix, AZ. I'll be working with Spectrum Astro. See www.specastro.com for the company's particulars. These guys are doing everything I ever wanted to do in life (professional life, anyway), and I'm incredibly excited to be a part of it.

In such turbulent days, the development of a function minimizer seems less important than before. But the show must go on, and so must this column.

I've been working like a beaver, as time permitted, running Mathcad simulations of algorithms and trying different approaches. I've also been in contact with a neat bunch of people, who have expressed interest in the minimizer and offered many useful suggestions. I'll get to them later.

Minimizer lite

First, however, I want to discuss a version of the minimizer that I promised to upload to ESP's web site, and didn't get the time to. Perhaps it's just as well, because I want to go over with you some of the decisions and changes to the structure that went into this version.

To refresh your memory, we're seeking a function that will find the minimum of a function f(x), where x and f(x) are both real scalars. The kickoff point of our search was Brent's method, which uses a combination of parabolic interpolation and golden-section bisection. Key to the method is the concept of maintaining a “triplet” of points, with the condition:


x0 < x1 < x2,  y1 ≤ y0,  y1 ≤ y2    (1)

Parabolic interpolation involves fitting a parabola through the three points defined by x0, x1, and x2, and using the quadratic equation of the parabola to calculate a predicted minimum. Except for nagging special cases (such as points with equal y-values, or solutions that repeat a previous point), a little housekeeping allows us to maintain Condition 1, and iterate to find the minimum.

My main beef with Brent's method is its tendency to end up with ill-conditioned triplets, such as the one shown in Figure 1. As we approach the solution, we are pushing the limits of machine precision anyway. Anything that causes a loss of significant bits, due to nearly equal values, can wreak havoc with the accuracy of the solution.

After much tinkering around trying to save the algorithm, I decided that bisection was the problem. I developed a new technique based on doubling or reversing the previous step, reminiscent of the Nelder-Mead “Simplex” method for problems of higher dimensions. Since then, things have gone swimmingly.

Early on, I became convinced that three points are not enough to guarantee robust performance for all kinds of functions. That's why I've been carrying along five, and I continue to do so. I've been using three algorithms for estimating the minimum: two based on parabolic interpolation and one based on the intersection of two straight lines. Last month, I gave you an algorithm that used all three estimators, albeit not very wisely. Since then, I've been developing rules for using them more wisely, and have some outstanding news, which I will reveal later. I'm convinced that the use of multiple estimators adds a lot of robustness to the algorithm.

Nevertheless, for many functions, particularly those with smooth minima looking roughly like parabolas, the three-point interpolation works fine when combined with my backstep/overstep approach. I'm offering you this “minimizer lite” as a workhorse version of the larger minimizer.

You might recall my quotation, last month, from race car great, Harry Miller: “Simplify, and add lightness.” I acknowledged that I wasn't really adding lightness, but building something more like an American LaFrance fire truck than a racing car. Continuing that analogy, you can think of my offering this month as a '32 Ford three-window coupe with the engine souped up, and the fenders and running boards stripped off. I still haven't managed to “add lightness.” I'm definitely stripping down instead. But make no mistake, like the hot rod, this little sucker goes like stink!

Stripping down for action

Let me take a few moments to describe the features of my Minimizer Lite. The most profound change is, as you might expect, the reduction to a single estimation method. Since parabolic interpolation needs only the three data points of a triplet, we can cut back the dimensions of the data set from five to three. This reduction continues through the entire structure of the minimizer, and simplifies it tremendously. Making the change from a larger, more complex program to the lite version is a bit like brain surgery, and took me a while to accomplish, but the algorithm is straightforward enough. Because I've made detailed changes from month to month, I'll repeat the estimation algorithm here to make sure we're on the same page. Given the three data points Pi = {xi, yi}, let:

(2)

and:

(3)

The estimated minimum is then given by:

(4)

In the American LaFrance version of the minimizer, the value of ymin is used in a fairly complicated convergence criterion, designed to be absolutely certain we haven't converged on a false minimum. The stripped-down version uses only x-values, so ymin is not needed.
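
For concreteness, here's a minimal C sketch of one common form of that three-point parabolic estimate. The function name parab_min and the intermediate quantities p and q are my own notation, standing in for Equations 2 through 4 rather than quoting them; and, as just noted, only the x-value of the estimate is returned.

/* Estimate the x of the minimum of the parabola through the triplet
   (x[0],y[0]), (x[1],y[1]), (x[2],y[2]).  One common form of the
   three-point parabolic fit; names are illustrative only.           */
double parab_min(const double x[3], const double y[3])
{
    double p = (x[1] - x[0]) * (x[1] - x[0]) * (y[1] - y[2])
             - (x[1] - x[2]) * (x[1] - x[2]) * (y[1] - y[0]);
    double q = (x[1] - x[0]) * (y[1] - y[2])
             - (x[1] - x[2]) * (y[1] - y[0]);

    if (q == 0.0)          // collinear points: no unique parabola, so punt
        return x[1];       // back to x1; safeguards like wiggle() keep us moving
    return x[1] - p / (2.0 * q);
}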

The overstep/backstep algorithm goes like this: record the value of the step just taken. To illustrate:


(5)

This step may or may not give you a new x1. If the new value of:


ymin = f(xmin)    (6)

is an improvement over the previous minimum, then xmin is your new x1. Now take another step in the same direction. If it's not an improvement, insert the point wherever it goes, and step back from the previous x1.
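
Boiled down to code, the heart of that decision is tiny. The sketch below is my illustration, not the production routine; the doubling on success and the sign flip on failure follow the “doubling or reversing the previous step” idea mentioned earlier.

/* Given the signed step just taken and the function values before and
   after it, decide the next trial step.  Illustrative sketch only.    */
double next_step(double last_step, double y_new, double y_best)
{
    if (y_new < y_best)
        return 2.0 * last_step;    // improvement: overstep, same direction
    else
        return -last_step;         // no improvement: back up the other way
}

Either way, the new point gets inserted into the triplet; only the direction and size of the next probe change.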

Last month, I pointed out that some cases can lead to an estimated minimum that's right on top of a previous point, usually P1. We can't allow this, and Brent invoked a bisection step to change the geometry. I introduced what I think is a much better approach, which is simply to force the next x to be a certain minimum distance from the points of the triplet. To accomplish that, I introduced a function wiggle().

This month, I've altered the function wiggle(), but only slightly. Previously, the function used as its minimum distance a specified percentage of either the left-hand or right-hand segments of the triplet. The new version computes a single distance, a percentage of the span x2 - x0. The code is shown in Listing 1.

Listing 1: Function to limit xmin value

double wiggle(double X)
{
#define R 0.01
    // Minimum distance from points
    double del = R * (x[2] - x[0]);

    // Are we in left or right segment?
    if (X <= x[1])
        X = max(x[0] + del, min(X, x[1] - del));
    else
        X = max(x[1] + del, min(X, x[2] - del));
    return X;
}

The other functions of the minimizer are pretty much the same as before. However, since we now have only three points instead of five, they simplify down quite a bit. The function step(), which finds the initial triplet, hands it off to the rest of the minimizer, and we simply keep updating the triplet each time a new probe of the function is performed.

The proof of the pudding is in the eating. Minimizer lite converges, on our usual test case:


(7)

using 17 function evaluations. That's starting with a very coarse step process using only two steps, which means that we effectively bisect the original interval, 0..1. This is an important issue, because this particular configuration, for this particular test function, yields an initial estimate that's identical to x1. Without function wiggle(), we'd be in trouble from the get-go. With it, the algorithm remains robust.
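
If it helps to see that start-up scan in code, here is my guess at the shape of a step()-like routine: probe the interval in n equal steps and return the triplet around the lowest sample. The name, signature, and details are assumptions, not the actual step() function; with n = 2 on the interval 0..1 the probes land at 0, 0.5, and 1, which is the coarse, bisecting start described above.

#include <stdlib.h>

/* Probe [a, b] at n equal steps (n >= 2) and return, in x[] and y[],
   the triplet bracketing the lowest sample.  Illustrative sketch of
   what an initial scan like step() might do; details are assumed.    */
static void initial_triplet(double (*f)(double), double a, double b, int n,
                            double x[3], double y[3])
{
    double *xs = malloc((n + 1) * sizeof *xs);
    double *ys = malloc((n + 1) * sizeof *ys);
    int i, best = 0;

    for (i = 0; i <= n; i++) {
        xs[i] = a + i * (b - a) / n;
        ys[i] = f(xs[i]);
        if (ys[i] < ys[best])
            best = i;
    }
    if (best == 0) best = 1;        // keep the triplet inside [a, b]
    if (best == n) best = n - 1;
    for (i = 0; i < 3; i++) {
        x[i] = xs[best - 1 + i];
        y[i] = ys[best - 1 + i];
    }
    free(xs);
    free(ys);
}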

To satisfy yourself that my overstep/backstep process works, you need look no further than Figure 2. This is the end configuration of the triplet, at the instant of convergence. It doesn't get any better than this!

Acknowledgements

Several folks have contacted me by e-mail to comment on the progress of my efforts, offer advice, or suggest improvements. I'd be remiss if I didn't acknowledge some of them here. The first name is more of a correction: Jim Moe of Sohnen-Moe Associates. Somehow, I managed to garble Jim's name in a previous column. Sorry, Jim.

Jim has been most diligent in dissecting Brent's method, and figuring out what it does. He's also been very helpful in defining convergence criteria.

Other helpful and interested folks include Hugo Pfoertner, Geoff Evans, Katherine Haynes, and Keith Briggs. We've formed a loose confederation, and try to share ideas. Geoff has a great idea for estimation algorithms which I'll discuss later.

Minimizer heavy

I haven't written any more code for the full-blown minimizer. Instead, I've been concentrating on figuring out how best to use the multiple estimators that five data points allow. Here's the issue in a nutshell: given three (or N) different estimates of where the minimum is, how does one combine the estimates to greatest effect?

One ground-rule to remember: we always assume that the most costly part of the computation is the part that computes f(x). Even if the function itself is simple, we still have the overhead of a function call. Therefore, we tend to use the number of function evaluations as a measure of the performance of the algorithm. Within reasonable limits, we don't care how much computation is involved in forming the estimates themselves, as long as they don't require additional function evaluations.

Two months ago, I gave you a rudimentary version of the minimizer, which merely cycled between three estimators. Though the minimizer worked reasonably well, this was plainly not an optimal approach. My analyses, using Mathcad, have been aimed at finding one that works better. I think I have.

The result may surprise you. Given three estimates of the minimum, the one to choose is the one that requires the greatest movement from where you were before. In a sense, we're trying to avoid the true minimum up until the very end.

This approach may seem counter-intuitive, but trust me, it makes sense. Remember the fundamental problem with minimizers. The function that we're minimizing looks more and more like a straight line as we approach the minimum. We want a triplet that contains the minimum, for sure. And we'd like to have the points of the triplet close together. But not too close, or else we lose all numerical precision, and the minimum along with it.

So rather than diving directly for the minimum, we tend to tiptoe up to it more gingerly, trying to hold a certain spread around the minimum until we're sure we have it cornered. Using the most distant estimate allows us to do that.
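
In code, the selection rule itself is almost trivial. The sketch below uses my own names, and my assumption that “where you were before” means the current best point x1; it simply picks the candidate that moves farthest.

#include <math.h>

/* From n candidate estimates of the minimum, return the one farthest
   from the current best point x1 - the delayed approach described
   above.  Names are illustrative.                                     */
double pick_farthest(const double est[], int n, double x1)
{
    int i, k = 0;
    for (i = 1; i < n; i++)
        if (fabs(est[i] - x1) > fabs(est[k] - x1))
            k = i;
    return est[k];
}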

I've simulated this approach in Mathcad. Because this algorithm doesn't dive straight for the minimum as aggressively as Minimizer Lite, we can expect it to take more function evaluations to converge, and indeed it does, but not many. This one takes 20 evaluations, only three more than before. But take a look at Figure 3, which shows the “points spread” at the instant of convergence. You can't ask for a collection of data points better distributed than this.

Some folks think I've spent an inordinate amount of time tinkering with this problem. If any doubts remain as to whether the time was well spent, a look at Figures 2 and 3 should dispel them. This minimizer combines the following techniques:

  • Multiple estimators
  • Overstep/backstep logic
  • “Wiggle” logic
  • Logic to handle equal y-values
  • Delayed approach to the minimum

These approaches yield an algorithm that's extremely robust, and should be able to handle any function you care to throw at it. What's left are only detail improvements, things like trying other estimation methods, or dropping estimators that don't seem to be helping.

Still to come

I'll mention two possible improvements here. One of the estimators I'm currently using fits two straight lines to the points P0 to P1 and P3 to P4. The point where they cross is the estimated minimum. This estimator was thrown in to handle cases with “pointy” minima with discontinuous derivatives, like abs(x). It nails that function in one iteration. But most functions don't look like abs(x) near the minimum, and I am still not sure how well this estimator works for other cases.
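
For reference, here is a minimal sketch of that estimator, under the reading that it uses the line through P0 and P1 and the line through P3 and P4; the function name and the handling of the parallel-line case are my own choices.

/* Estimate the minimum as the crossing point of the line through
   (x[0],y[0]), (x[1],y[1]) and the line through (x[3],y[3]), (x[4],y[4]).
   Sketch only; name and degenerate-case handling are illustrative.      */
double line_cross(const double x[5], const double y[5])
{
    double mL = (y[1] - y[0]) / (x[1] - x[0]);   // slope of left line
    double mR = (y[4] - y[3]) / (x[4] - x[3]);   // slope of right line

    if (mL == mR)          // parallel lines: no crossing, so fall back
        return x[2];       // on the middle point of the data set
    return (y[3] - y[0] + mL * x[0] - mR * x[3]) / (mL - mR);
}

For abs(x), with points straddling the origin, the two slopes are -1 and +1 and the crossing lands exactly at x = 0, which is why this estimator nails that case in one iteration.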

The decision to use this estimator is important, because in my tests, it usually ends up being the one that estimates the farthest from the actual minimum. Since my delayed approach uses the estimate that's farthest out, this is the one usually used. That result delays the iteration process, and explains the higher function evaluation count.

Each estimator gives not only an estimate of xmin, but ymin as well. I can easily see that we might want to “grade” the estimators based on how well they did on the previous iteration. If the estimated and actual values of x are far apart, we have an indication that the function is vastly different from what the estimator thinks it is. In this case, we might choose to shut down the most poorly performing estimator. I believe this would definitely speed convergence.
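
As a sketch of that grading idea, one might compare each estimator's last prediction against the point that actually turned out best, and switch off any estimator that missed by too much. The threshold, the names, and the choice to measure against the new best x are all illustrative assumptions on my part.

#include <math.h>

/* Disable any estimator whose last predicted minimum landed more than
   some fraction of the current bracket away from the point that turned
   out best.  Illustrative sketch; threshold and names are assumed.     */
void grade_estimators(const double est_x[], int n, double x_best,
                      double span, int enabled[])
{
    const double bad = 0.5 * span;   // illustrative cutoff
    int i;

    for (i = 0; i < n; i++)
        if (fabs(est_x[i] - x_best) > bad)
            enabled[i] = 0;
}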

Geoff Evans has been looking at other estimators, and he suggests a method that shows promise in his tests. He uses four points instead of five. The points would consist of the two bracketing points, plus two inner points: the last estimate of the minimum and the one from the previous step. Geoff has also found, as I did, that a delayed approach to the minimum is best.

In the long run, it's going to be a while, perhaps years, before the best algorithm is found. I'm using seven test functions in my work. Jim Moe is using 25. However, we won't accept the algorithm as production quality until it's been tested with thousands of functions. It should be immediately obvious that neither Jim nor I, nor any other small group of people, can exercise the algorithm in such exhaustive tests.

So I'm moving the algorithm out of this column, and into the real world. You now have Minimizer Lite, which should serve for most well-behaved functions. You will soon have a serviceable version of the full-blown, multi-estimator version. Although you can expect comments, status reports, and new ideas related to the algorithm for some time to come, this will be the last of my columns devoted exclusively to the scalar function minimizer. Next month, we'll move on into other areas. I'm going to be talking about spline interpolation. See you then.

Jack W. Crenshaw is a senior software engineer at Spectrum-Astro in Gilbert, AZ, and a specialist in the application of advanced methods to the solution of real-world, embedded-systems problems. He is the author of Math Toolkit for Real-Time Programming, from CMP Books. He holds a Ph.D. in physics from Auburn University. Jack enjoys contact and can be reached via e-mail at .
