Educated Guessing Games
Engineers can't know everything, but with the approximation techniques described here, we can fill in the blanks.
Engineering is often a guessing game. Radar, high quality audio, precision motion, robotics-wherever we depend on processing to guide or solve problems, we rely on our best guess. And we sometimes have to do a lot of work to guess effectively. Interpolation, prediction, extrapolation, and approximation are just a few of the names we give to the process of producing guesses accurate enough to use in real-world applications.
Every branch of engineering and science has tools for the purpose of guessing. They often go by different names, even though they employ the same technologies. Knowing how other branches of engineering accomplish their guessing-and with what degree of accuracy-is useful whatever your own field. In this and the next few columns, I am going to show you how people in different areas of engineering develop, create, and approximate new data based on available information. You may find a solution here to a problem you are confronted with.
Digital signal processing (DSP) makes frequent use of approximation. In motion control, the DSP might compute a cubic spline to guide a cutting laser over a piece of metal. By computing many intermediate positions-as well as the necessary accelerations and velocities to attain each position and pass through the necessary end points-we can make sure that the motion of the laser is smooth, its use of energy is economical, and the mechanical wear and tear is minimized.
In audio, we often use filters to predict what a data sequence would be like if it only had a certain bandwidth. We also use approximations to create sound, such as the sounds of instruments with their many details. To do this, we use high granularity tables, or polynomials, or iterative algorithms, to predict the next data point with an accuracy that we must know and control to avoid producing unwanted noise.
Statistical processing bases its probability predictions on generalizations of observed data. These generalizations are made using different forms of least squares approximation or even wavelet-based extrapolators. Though these concepts might seem foreign, they have a source in math and can be used in almost any discipline, regardless of where they originated. In fact, Fourier approximations are actually ideal least squares approximations, although you might not think that to look at them. Wavelet theory became prevalent in earthquake analysis, then made its way to audio, where, in many instances, it replaced the discrete cosine transform.
Along with examining some familiar techniques, I hope to show you some ideas you may not have thought of, as well as new ways to look at some traditional methods. As always, though, I'll start with the basics.
Interpolation
Interpolation is usually defined as an effort to estimate an intermediate value-say of a function or series-based on two known values. Linear interpolation is the most common form.
Linear interpolation amounts, in essence, to drawing a straight line between two known data points. The new data point is somewhere along that line based on how close it is in value to either of the initial data points. In the following formula:
The values f(a) and f(b) are the known data points at x = a and x = b .
Let's say that in a hypothetical application we need to solve for various values of the exponential function exp(x) . As is often the case with embedded systems, we'll assume we don't have much memory or processor speed with which to work. Table 1 shows the known values that cover our interval of interpolation, which is rather small.
Table 1: Values of ex | |
x | ex |
0 | 1 |
1 | 2.718 |
2 | 7.389 |
3 | 20.086 |
4 | 54.598 |
5 | 148.413 |
6 | 403.429 |
7 | 1.097×103 |
8 | 2.98×103 |
9 | 8.103×103 |
If we have to find the value for exp(3.8) , linear interpolation works like this:
As you can see in Figure 1, the function ƒ(x) is not a straight line whereas the interpolator is. So, except for the end points, any value we pick will not precisely represent the function. Depending on the desired accuracy, the number of available data points, and their spacing, this may not be a problem; but if the function wiggles between the data points and our desired accuracy requires it, we may need a larger table. Frequency synthesis, for example, usually uses very large wave tables to get alias-free sound, especially when using linear interpolation. The higher the resolution of table, the more accurate your interpolator, assuming that your data is not band limited. But, because of their memory limitations, not all applications can afford a table with so many entries. So how do you increase accuracy while keeping the table small? We have a number of options. One of them is the Lagrange interpolator.
Lagrange
The Lagrange interpolator fits a polynomial to data. It's based on Weierstrass' theorem, which says that if x_{0} , x_{1} , …, x_{n} are distinct, then for any y_{0} , y_{1} , …, y_{n} there exists a unique polynomial of degree n or less, such that:
We use it to find a polynomial of degree n that passes through the n +1 data points. From that viewpoint, we can say that the polynomial is really:
with each data point as a term. And it becomes supremely easy to code because that all resolves to:
(1)
The core of this computation:
(2)
deserves some explanation. Suppose we define a polynomial or function ƒ(xi) = 0 for all sample points except x_{i} = x_{j} and write it:
(3)
We then write the equation for the only point at which it is non-zero:
(4)
Then we use Equation 4 to normalize Equation 3 to give us Equation 1.
To demonstrate how this works, let's take the same exponential function and data points (from Table 1) that we used in the example on linear interpolation.
The results are different. The Lagrange interpolator produces 44.677 compared with the linear interpolator's 47.6956. And these results are both based on the same information. More calculation is required with the Lagrange interpolator, but we'll get to that later. First, let's look at the implementation.
To illustrate the simplicity of the implementation here's a C function:
s = 0;for (i= 0; i <= m;="" i++)="" {="" z="1.0;" for="" (j="0;"><=n; j++="" {="" if="" (i!="j)z=z*(T-x[j])/(x[i]-x[j]);" }="" s="s" +="" z="" *="" f[i];="">
To improve speed, we can reduce the number of computations required by the interpolator by pre-computing the intermediate products in Equation 2. Since all of these values are known ahead of time, they can be stored in a table replacing the table of y_{i} 's. In this case, we just multiply each value by the appropriate g_{i} (x) . This process runs on an 8051 with little trouble.
With a table of 10 entries, this algorithm boasts the accuracies of linear interpolation with a 1,000-entry table. With the known terms pre-computed, it's also fast. For an embedded system whose output must fit a predetermined set of data points, this can ease computation tremendously.
Chebyshev
Many interpolators, Lagrange included, work well in mid-range but begin to diverge as they reach the end points. By re-sampling the indices at intervals controlled by the zeros of an appropriate order Chebyshev polynomial, we can do a great deal to equalize the error across the interpolation interval. This sounds complicated, but with one simple change, you can have an algorithm that is just as fast and efficient as a Lagrange algorithm, but even more consistent and reliable.
A Chebyshev function oscillates around zero on an interval bounded by –1 and 1. Interestingly, this type of function's frequency increases at the end points of the interval. Sampling using the zeros from the Chebyshev polynomial fitted to your interval of interpolation forces more samples at the ends of the interval than in the middle. This can help make up for some of the inaccuracies at the end points in equally spaced methods.
This does not necessarily mean that your results will be more accurate across the interval of interpolation, but it does spread any error over the entire interval. In other words, you can expect a result obtained at the end points of the interval to be as reliable as one in the center.
So how do you do it? First, you determine your interval of interpolation. In the example we have been using, the interval is 0 through 9. We then fit a Chebyshev polynomial to it; if it is a Lagrange we are fitting, the order is one greater than the Lagrange polynomial. In our example, the Lagrange polynomial is of order 9. Therefore, the Chebyshev polynomial will be order 10. Select the sample points with this formula:
In this formula, a is the start of the interval, b is the end of the interval, and c is the order of the required polynomial. The new Chebyshev sample points are shown in Table 2.
Table 2: Chebyshev sample points | ||
x | ex | Chebyshev index point |
0 | 1.057 | 0.055 |
1 | 1.633 | 0.49 |
2 | 30736 | 1.318 |
3 | 11.67 | 2.457 |
4 | 44.525 | 3.796 |
5 | 181.991 | 5.204 |
6 | 694.337 | 6.543 |
7 | 2.169*103 | 7.682 |
8 | 4.692*103 | 8.51 |
9 | 7.666*103 | 8.945 |
Now you have an algorithm that can produce useful results anywhere in the interval of interpolation. It's also fast and does not require a great deal of memory.
So what's the catch?
Lagrange does have some drawbacks. The order of the polynomial depends on the length of the table, which can mean-even with pre-computed elements-a lengthy computation when the number of accurate digits you need is large. Each new value must be computed from scratch, as no part of the previous evaluations can be used. Error is also difficult to calculate in a Lagrange polynomial, and even with Chebyshev it is not a trivial task. But these are not the only interpolation techniques out there.
In my next column, we'll look at other ways to increase the speed and accuracy of interpolation using relatively small tables.
Don Morgan is a senior engineer at Ultra Stereo Labs and a consultant with 25 years of experience in signal processing, embedded systems, hardware, and software. His e-mail address is .
Return to the March 2002 Table of Contents