From theory to practice in one easy lesson. Jack connects the digital and analog worlds with one simple equation.
Just when you think you're doing something right, someone or something comes along to remind you that you could have done it better. I used to think I was good at explaining things to people, and I've been told that I have a talent for it. Recently, I've learned that there's always room for improvement. That was the thrust of my last column. I determined that I was going to explain something in such a way that “not one reader would be left behind.” I picked out a topic-the “Rosetta Stone” equation-and tried to make it as transparent as I could. I wanted everyone to “get it.”
But not everyone did. Among many e-mails I received praising the column, there were two from readers who reminded me that, despite my best intentions, I could have done it better. Both points were well taken, so I'll digress a bit to address them.
It's about motivation
Looking back, I realize that I didn't spell out the motivation for my derivation of the Rosetta Stone equation:
Oh, I could give you the usual argument that this equation is the key relationship connecting the real world of continuous time to the digital world, where time progresses with the ticks of a system clock. But I've already done that. What you need is a concrete example or two.
Say you're controlling some kind of dynamic system-an industrial robot, perhaps. Because it lives in the real world, that system obeys Newton's laws of motion, which can be described using differential equations. A typical, if simplified, one might be:
This equation happens to describe a damped, second-order system under the influence of a forcing function.
Last month, I showed you the D operator, which represents a derivative with respect to time. In that notation, I could have written Equation 2 in the form:
If you're going to be working with embedded systems that interact with-that is, observe and/or control-real-world devices, you need to deal with the describing equations. Further, if you're going to be controlling the devices, you'll want to apply a little dynamics of your own. Perhaps, for example, the real system is underdamped. In the vernacular of Equations 2 and 3, its damping factor, ξ, is too low. You may want to artificially increase it by adding a little extra force to the system.
A popular controller for such systems is the proportional integral derivative (PID) controller. If you deal with computer-controlled systems, you've probably heard of it. The 'D' in the acronym is precisely the same D we've been talking about; that is, the derivative with respect to time. The 'I' is roughly equivalent to D-1 . Bottom line? If you want to write a PID controller, you need to deal with D, and therefore with Equation 1.
Can you write a PID controller without addressing Equation 1? Sure. People do it every day-badly. For example, we can replace a derivative by a divided difference:
where h is the (hopefully constant) time between measurements. This approximation does work-sort of-but it introduces an unwanted and unnecessary lag into the system.
Sometimes, this lag doesn't cause new problems. We can help make sure that it doesn't by making h small. That is, crank up the processing frequency.
Here's a fact of life for your future reference: usually, a digital algorithm-even one that's wrong-can be made to seem right by cranking up the frequency. I can't tell you how many times I've come across programs implemented with wrong algorithms that still seem to work because the high frequency hid the errors.
But wouldn't you rather get the algorithm right in the first place? By doing so, you can often reduce the frequency of the operations, thereby getting the same results with a smaller, cheaper, and less power-consuming processor.
The approximation of Equation 4 can get you in trouble. A long time ago, I was developing a new navigation system based on existing software. Part of the software we inherited was a low-pass filter called the alpha-beta filter (not to be confused with a tracking filter of the same name). It had been supplied to the company by a previous customer, who insisted that it be implemented as written by their systems engineers, with no changes.
In tests, the alpha-beta filter seemed flaky. But several good analysts had looked at the filter, analyzed it based on divided differences, and concluded that it was a valid, if strangely implemented, version of the damped second-order filter of Equation 2. Not only was it valid, but the frequency and damping factor were just what they should have been.
When I got the filter, I repeated the same analysis, with the same result. Then, instead of analyzing it in continous-time space, based on the divided-difference approximation, I analyzed it in z -space. I was quickly able to show that the frequency response was exactly zero at zero frequency (DC). The low-pass filter was, in fact, a noise amplifier, with a gain greater than one for high frequencies, and not passing lows at all!
The second letter gave me yet another reminder: some of you are used to the s-notation of Laplace transforms. For those folks, the D -notation seems foreign. We don't have enough room to cover Laplace transforms, but, fortunately, you don't need to know Laplace transforms to deal with s.
A hugely useful class of differential equations, which includes Equation 2 if the forcing function is set to zero, are called linear, homogenous equations. Even if the forcing function is non-zero, the methods of solution still apply. All such equations respond to an assumed solution of the form:
The nice part about using an exponential is that its derivatives are simple:
Substituting back for x(t): and, using our operator notation:
In short, the s of Laplace transforms is, for all practical purposes, exactly equal to our derivative operator, D (I hedge the statement a bit because Laplace defined it quite differently).
About the same time as I was developing that nav system, I was also dealing with digital filters from other systems engineers. Those with EE backgrounds used to write a filter in the form of a transfer function:
Where P and Q are polynomials in s. The transfer function for a first-order, lowpass filter (a lag filter) is the simple one:
A more effective one might be the third-order filter:
(Here I've normalized the coefficients to unit frequency.) Such a filter can be enormously effective if one gets the coefficients right. The values above work nicely. It can be disastrous if one gets them wrong. What's more, if you're designing this filter by trying different coefficients in your code, you won't live long enough to get them right by trial and error.
Those systems engineers used to think that once they'd given us programmers the transfer functions, their job was over. It was up to us to turn them into digital code. Most of my colleagues did it by using divided differences. I did, too. In the process, we got the coefficients slightly wrong.
As long as the sampling frequency was high enough, no one noticed. But again, it would've been better to do it right. By doing so, we could have cut the sampling frequency and still gotten good performance.
Back to work
I hope by now you're sufficiently motivated. Now let's reduce Equation 1 to something useful. The key concept of operator notation is that we should be able to deal with the operators as though they were algebraic parameters. Therefore you won't blanch when I revert Equation 1 to read:
There is no useful power series expansion for ln z that expands directly in powers of z. Fortunately, however, there's a very nice one that reads:
The good news is this series includes only odd terms, so it should converge fairly quickly. The bad news is the coefficients aren't decreasing much, so the series should converge slowly. The good news is if our sampling rate is reasonably high, we can expect zx to be not much different from x itself. That is to say, adjacent values in a table of values of x should be nearly equal, which means:
In short, even though the coefficients aren't small enough to make the series converge quickly, the value of each term is. Bottom line: A very useful approximation to Equation 12 is gotten from the first term. Substituting this into Equation 11, and replacing D by s, gives the justly popular bilinear transform:
Many of you will recognize this transform immediately. Now you also know where it comes from.
Let's put the linear transform to work for a very specific and practical application-that of implementing the low-pass filter of Equation 9. In this equation, the constant τ is the time constant, which relates to the cutoff frequency by the relationship:
If τ, for example, is 1ms, the cutoff frequency fc will be 159 Hz.
Let's assume that we are going to measure the input values from the A/D converter at some frequency f (not the same as fc -it must be quite a bit higher). Then the step size, h, is 1/f. We'll call the input of the filter y, and the output x. The definition of a transfer function is:
where X(s) and Y(s) are the Laplace transforms of x and y, respectively.
Here, I have to beg you to allow me to be more than a little sloppy in the derivation because, in practice, we don't really take transforms of x and y: We apply the transfer function directly to the data. The reason we can get away with this is not obvious and requires a lot of explanation, but the short explanation is that we assume steady-state conditions, so that initial values of x and y don't affect the result. Applying the transfer function in this way gives us the equation:
For those of you who aren't familiar with the notation, the little flyspeck over that x is a physicist's way of saying s or D. It's yet another way of saying, “derivative with respect to time.” We physicists are a lazy lot, and are always seeking minimalist notation. Sorry about the many different kinds of notation, but if you haven't encountered the dot notation before, stay tuned. I use it all the time. Remember, we're talking about traveling between worlds here.
If we were working in the analog world, and if we had a circuit that could take a derivative (a tall order; they're notoriously unstable), Equation 18 would give us a perfect implementation of the transfer function of Equation 9. But we aren't, and we don't, so it doesn't. We must do the solution in z-space.
In z -space, measurements are not continuous, but discrete. We talk about x and y as though they were sequences of measurements. When we measure y, we're building a sequence of values:
Here's a very important point: You'll note that there is no value in the sequence for yn+1 . That's because its measurement lies in the future. The time to measure yn+1 has not yet occurred. Lacking prescience, we can't know what that value is.
Let's begin with the transfer function of Equation 9 and apply the bilinear transform of Equation 14. We get:
To simplify this equation, let:
Since h must be much smaller than t, we can be assured that r is less than one. Equation 20 now becomes:
Applying this transfer function to x and y gives:
Remember that z is a shift operator; it advances time (and therefore n ) to the next tabular point. But in this case, we have no next point. The last measurement we took was for yn , and we have not yet output xn , much less anything fresher. We can't expect our hardware to be prescient. For this reason, z is not a very useful operator.
Its inverse, 1/z or z-1 , on the other hand, is eminently useful, because it involves moving backwards through the tables. This we know how to do. We can either save the past values in an array, or we can spring for some analog delay hardware. Either way, unlike their future values, past values of x and y are extremely easy to come by.
We can transform Equation 23 to work with z-1 through the simple expedient of dividing both sides of the equation by z. We get:
Applying the operators gives us a perfectly practical algorithm:
This new equation involves only one past value (the last one) for x, and the last two measurements for y. If the coefficients seem strange to you, try this-let:
(The reason for the 2 will become apparent). Equation 21 now gives:
Note that α is smaller than one, and only slightly different from ρ. A little math also shows that:
Equation 25 now takes on a rather familiar form:
This equation says that, to get the new value of x, reduce the old value by the factor 1 — α, and add a times the average of the last two values of y.
Now you know that you didn't have to. The constant is given by Equation 27, and so has a very specific value, determined by the lag time constant, τ, and the sample time, h.
Let's recap. We began with Equation 1, which seems elegant, but “of no practical value.” We used it to derive the bilinear transform of Equation 14, and used that to go from a continuous-time, Laplace-oriented transform, Equation 9, to a discrete-time, z- oriented transform, Equation 22. Then we wrote that in a form, Equation 29, that's trivial to turn into source code.
That's well and good for a simple lag filter, but what about a more complex filter like Equation 10? Well, it's not a lot harder than Equation 29. The computation of the coefficients is horrible, but that part is easy enough with a tool like Mathcad. Once you have the coefficients, the rest is straightforward, differing only from Equation 29 in that you'll need three past values of x and four past values of y. Harder in practice, but not much.
Jack Crenshaw is a senior software engineer at Spectrum-Astro and the author of Math Toolkit for Real-Time Programming, from CMP Books. He holds a PhD in physics from Auburn University. E-mail him at .