
Two Different Worlds

We and computers live, as the old song says, in two different worlds.

In the real, physical world of nature, time flows continuously, like a river, from instant to instant. Clocks—devised by man—may go tick-tock, but we have no problem imagining a time of 3:14:15 o'clock, or even 3:14:15.92653589793 o'clock. Give me two times, and I'll give you a new one in between them. That's the very definition of the word “continuous.” In the 17th century, Newton learned that the behaviors of things in nature could be expressed in terms of ordinary differential equations (ODEs). He wrote out those equations and changed the world forever. His equations made it possible to understand nature to an astonishing new depth; to explain the motion of the Earth, Moon, and planets; and ultimately to visit them.

But when the clocks in digital computers go tick-tock, they're not kidding. Computer states change only on the ticks of their electronic clocks. Fast though they may be, with intervals measured in nanoseconds, their behavior is still expressed only in terms of discrete events.

A digital computer can interact with a real, physical system primarily by measuring voltages using analog-to-digital converters (ADCs) and exerting control using digital-to-analog converters (DACs). It can sample those inputs and emit those outputs only at specific, discrete times determined by the ticks of its clock. If the sampling rate is, say, 100 Hz, the computer might understand times like 3:14:15.01 o'clock and 3:14:15.02 o'clock. But it can never know, through direct measurement, what happened in between. From our perspective, the computer's knowledge of the physical world is incomplete, consisting only of a table of values that represent samples of the system's state. But from the perspective of the computer, the table is all there is. For each measurement, there's a corresponding time. To the computer, times in between simply don't exist.

To those of us who work with real-time, embedded systems, this is the challenge. We must design systems that sample parameters in a dynamic system, infer what the system is really doing, and emit voltages to control it. In short, we build digital control systems.

The connection between the two worlds is the equation I call the Rosetta Stone:

z = e^(hD)   (1)
The name is apt because it is this equation that serves as the basis for translation between the worlds of continuous time and discrete time. In Equation 1, the symbol D represents a time derivative of some parameter in a real, physical system; in other words, a system whose behavior is described by ODEs like Newton's. The z, on the other hand, represents changes in that parameter as measured at discrete time intervals. Used wisely, Equation 1 allows us to build a digital control system that interacts with the physical system in a useful way.

Equation 1 certainly seems simple enough, but then, so does Einstein's equation:

E = mc²   (2)
It may look both simple and profound, but unless you know what it means—unless you understand the physical meaning for each parameter and how to apply the equation to real-world problems—it's useless.

In recent months I've tried to explain the real meaning of Equation 1, at the deepest and most fundamental level: a level where you can not only read and believe the equation, but understand its origin and, most of all, apply it to control new systems and solve new problems. As I've derived Equation 1, I've tried to give you not only its derivation, but also the meaning of each symbol and each term, and why each step in the derivation not only makes sense, but seems almost obvious and intuitive.

I began with the Taylor series, which I'll reproduce here for the sake of completeness in a slightly different and more general form than I've presented before. Since staring at this equation tends to make people go all wobbly, I suggest you just glance at it briefly and move on.

ƒ(x₀ + h) = ƒ(x₀) + h (dƒ/dx)|x=x₀ + (h²/2!)(d²ƒ/dx²)|x=x₀ + (h³/3!)(d³ƒ/dx³)|x=x₀ + …   (3)
The Taylor series gives us a way of predicting values of a function ƒ(x) based upon the value of the function and all its derivatives at some point x = x₀. The funny vertical lines in Equation 3 are there to emphasize the order of operations: we must first take the derivatives of ƒ(x), then evaluate them at x = x₀. The series is an infinite series, meaning that its terms go on forever. In reality, of course, we can't evaluate derivatives to an infinite order, but we can still use the equation in a few important cases.

  • If ƒ(x) is a polynomial, one of its derivatives, and all higher ones, are zero, so the series truncates to a polynomial
  • If higher derivatives of ƒ(x) can be expressed in terms of lower ones, the terms can be collected to make the expression finite
  • If we are doing analytical work, we can use the infinite series directly; though engineers may feel bound to finitely many terms, mathematicians do not
  • If h is small, the series often converges after a few terms (note the factorial in the denominators), so we can approximate the series by truncating it to a polynomial, in some vicinity of x0
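The last point is easy to try for yourself. Here's a quick sketch (my own illustration; sin is just a convenient stand-in for ƒ(x), since its derivatives cycle) that sums the first few terms of the Taylor series and compares against the true value:

```python
import math

def taylor_sin(x0, h, order):
    """Approximate sin(x0 + h) from sin and its derivatives at x0.

    The derivatives of sin cycle with period 4: sin, cos, -sin, -cos.
    """
    derivs = [math.sin(x0), math.cos(x0), -math.sin(x0), -math.cos(x0)]
    total = 0.0
    for n in range(order + 1):
        total += derivs[n % 4] * h**n / math.factorial(n)
    return total

# With a small step h, a handful of terms is already very close.
approx = taylor_sin(1.0, 0.1, 5)
exact = math.sin(1.1)
print(abs(approx - exact))  # error around 1e-9
```

The factorials in the denominators do the work: each extra term shrinks the error by roughly another factor of h.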

Next, I talked about power series in general, and why they're so useful for situations that can't be handled with simple algebra. I introduced the function e^x, where e is Euler's number:

e = 2.718281828…
I showed you the power series for e^x, and how it can be derived from its defining differential equation. That series is:

e^x = 1 + x + x²/2! + x³/3! + x⁴/4! + …   (4)
(As a matter of interest, you might try setting x = 1 in Equation 4 and evaluating it term by term. You should get the value of e¹ = e.)
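That experiment takes only a couple of lines. A sketch (the function name is mine) that sums the series term by term:

```python
import math

def exp_series(x, terms):
    """Sum 1 + x + x^2/2! + x^3/3! + ... for the given number of terms."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turns x^n/n! into x^(n+1)/(n+1)!
    return total

print(exp_series(1.0, 20))  # 2.718281828..., matching math.e
```

Note that each term is built from the previous one, so no factorials ever overflow.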

By far the most significant and profound concept surrounding the Rosetta Stone is the notion of operators. The D operator, for example, represents a derivative, and higher powers of D represent higher-order derivatives:

Dƒ(x) = dƒ/dx,  D²ƒ(x) = d²ƒ/dx²,  …,  Dⁿƒ(x) = dⁿƒ/dxⁿ   (5)
Clearly, D is an operator that makes sense only in a world where functions are continuous, and therefore the notion of a derivative makes sense. On the other hand, z is an operator that makes sense in a world of discrete measurements. If we have a table of numbers {y₀, y₁, y₂, …, yₙ}, the z operator represents the process of advancing forward one step in the table:

z yₙ = yₙ₊₁   (6)
Not surprisingly, the inverse operator represents a step backwards:

z⁻¹ yₙ = yₙ₋₁   (7)
This operator is particularly useful in digital control systems, where the set of measurements, {yₙ}, represents a sequence of measurements taken at discrete times. In such a case, z⁻¹ is equivalent to a delay by one time tick; easy to implement in either hardware or software.
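In software, the z⁻¹ delay really is as simple as it sounds: hold on to the previous sample. A minimal sketch (the class name and the choice of initial value are my own assumptions):

```python
class Delay:
    """One-tick delay: each output is the previous input (z^-1).

    'initial' stands in for the sample before the first one arrives.
    """
    def __init__(self, initial=0.0):
        self.state = initial

    def step(self, y):
        previous, self.state = self.state, y
        return previous

d = Delay()
outputs = [d.step(y) for y in [1.0, 2.0, 3.0]]
print(outputs)  # [0.0, 1.0, 2.0] -- the input stream, one tick late
```

Chaining several of these gives z⁻², z⁻³, and so on, which is exactly how the higher powers are realized in hardware shift registers.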

Armed with these operators, we can begin to make sense of the Rosetta Stone equation. Take the specific case where x represents time, ƒ(x) is some function of time, and {yₙ} represents measurements of ƒ(x), sampled with constant time intervals h. In this case, Equation 1 says that we can predict the next entry in the table:

yₙ₊₁ = e^(hD) yₙ   (8)
I should emphasize again that Equation 8 is identical—identical, I say—to that horrible statement of the Taylor series in Equation 3. We haven't done anything different; we've merely used operators to reduce the notation down to something manageable. In one sense, we don't need the shorthand notation; if we were careful enough and diligent enough, there's no operation on Equation 1 that we can't do using the full-up Taylor series. On the other hand, when the difference in complexity reaches a certain point, it becomes not a difference in degree, but a difference in kind.

The next step is hugely profound. It's a step inspired by Oliver Heaviside, in his concept known as operational calculus. Heaviside noted that Equation 8 works for any function ƒ(x), given some obvious conditions such as the existence of its derivatives. If we can agree that the sequence {yₙ} represents measurements of the function ƒ(x), then we don't need to write down the arguments of each side. We can “divide out” the function ƒ, to get the raw relationship between operators—in other words, the Rosetta Stone:

z = e^(hD)   (9)
Operational calculus takes us one step beyond Taylor's series. It allows us to operate on operators, manipulating them as though they were ordinary algebraic functions. In this way, we can derive relationships like Equation 1 that hold for all well-behaved functions ƒ(x).

It should not surprise you that D and z are not the only operators one can define. I've already defined z⁻¹, which moves us one step backwards in a table. Likewise, if D represents a derivative of a function, then D⁻¹ is its integral. If your background is in control systems, you will recognize D as a euphemism for s. This operator is defined from an entirely different concept, the Laplace transform. However, in casual usage, s represents the derivative of some function, and 1/s its integral.

A couple of other operators are useful. The first is the forward difference operator, Δ. This operator, working in the discrete domain, is the difference between two successive table entries.

Δyₙ = yₙ₊₁ − yₙ   (10)
Similarly, the backward difference operator, ∇ (del), is:

∇yₙ = yₙ − yₙ₋₁   (11)
Note that the only difference between the two operators is that one looks forward in the table, the other backward. In other words:

∇yₙ = Δyₙ₋₁   (12)
And so

∇ = z⁻¹Δ   (13)
In a similar fashion, we can derive quite a number of relationships between the discrete operators, all examples of operational calculus in action. A few are:

z = 1 + Δ
z⁻¹ = 1 − ∇
Δ = z∇
∇ = z⁻¹Δ
Δ − ∇ = Δ∇   (14)
The last line of Equation 14 must surely be one of the few cases where a difference of two things is also equal to their product.
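If that last identity looks suspicious, it's easy to verify on a small table of numbers (the sample values below are arbitrary, my own choice):

```python
# Forward and backward differences on a list of samples.
def fwd(y, i):   # Δy_i = y[i+1] - y[i]
    return y[i + 1] - y[i]

def bwd(y, i):   # ∇y_i = y[i] - y[i-1]
    return y[i] - y[i - 1]

y = [2.0, 3.0, 5.0, 8.0, 13.0]
i = 2
lhs = fwd(y, i) - bwd(y, i)        # (Δ - ∇) applied to y_i
rhs = fwd(y, i) - fwd(y, i - 1)    # Δ∇y_i = Δ(y_i - y_{i-1})
print(lhs == rhs)  # True
```

Both sides collapse to y[i+1] − 2y[i] + y[i−1], the familiar second central difference.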

At last we get to the part where we can use the Rosetta Stone, Equation 1, to manipulate functions and measurements of them to attain useful goals. In the process, we get to apply operational calculus to its most profound effect. Remember, in operational calculus we lean on Heaviside's observation that even though the operators must operate on something, we can manipulate them in equations like Equation 1 just as though they were algebraic variables. In 1885 Heaviside observed that if he did such manipulations naively, as though the operators were variables, he always got the right answer. He suggested that this would always be true, and he was right. As the simplest example, consider an operation we often want to do: estimate the derivative of a function by processing its sampled values. Inverting Equation 1 without regard for the nature of the function ƒ(x) gives, simply:

hD = ln z   (15)
What does this mean? To use the form of Equation 1, we had to understand that we must replace the function e^x by its power series form. Then e^(hD) became a power series in derivatives, which is another way of saying, Taylor series. In a similar way, we must expand the logarithm function into its equivalent power series.

There are two ways we can do this. My table of integrals doesn't give a power series for ln(x), per se, but it does give a series for ln(1 + x):

ln(1 + x) = x − x²/2 + x³/3 − x⁴/4 + …   (16)
But from Equation 14:

z = 1 + Δ   (17)
Making the substitution, we get:

hD = ln(1 + Δ) = Δ − Δ²/2 + Δ³/3 − Δ⁴/4 + …   (18)
And therefore our estimate of the derivative is:

D = (1/h)(Δ − Δ²/2 + Δ³/3 − Δ⁴/4 + …)   (19)
I should mention that the power series for ln(x) is a notoriously slowly converging series. Unlike the series for sin(x), cos(x), or e^x, there are no factorials in the denominators to speed convergence. Fortunately, this isn't really a problem in our case, because we must assume that the sampling rate is high enough that the entries don't change much between samples. For that reason, the differences are small, and higher-order differences smaller yet. In practical cases, you'll find that you get a good estimate with only a few orders of Δ.

It may not seem so to you yet, but Equation 19 is a highly practical way of estimating a derivative. Here's how.

In the “good old days” before computers, astronomers used difference tables to compute the orbits of planets and comets. They would create a table with a list of measurements taken at successive time intervals. That's {yₙ}. Next, they'd take the difference between each entry and the one following it. That column is {Δyₙ}. Not surprisingly, the next column is the second difference, {Δ²yₙ}. Once they had enough columns, they'd apply Equation 19 to get the result.
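The astronomers' procedure is easy to mimic. Here's a sketch that builds the difference columns and applies Equation 19; the cubic test function is my own stand-in, not the one tabulated in the column:

```python
def difference_table(samples, depth):
    """Columns: {y}, {Δy}, {Δ²y}, ... (forward differences)."""
    cols = [list(samples)]
    for _ in range(depth):
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

def forward_derivative(samples, h, order):
    """Estimate f' at the first sample via D = (1/h)(Δ - Δ²/2 + Δ³/3 - ...)."""
    cols = difference_table(samples, order)
    est = 0.0
    for k in range(1, order + 1):
        est += (-1) ** (k + 1) * cols[k][0] / k   # alternating signs, 1/k weights
    return est / h

# Stand-in test function: f(x) = x**3, so f'(1) = 3.
h = 0.1
ys = [(1.0 + i * h) ** 3 for i in range(5)]
print(forward_derivative(ys, h, 3))  # ≈ 3.0 (exact for a cubic, since Δ⁴y = 0)
```

For a cubic, the fourth and higher differences vanish, so the order-3 truncation is not an approximation at all; this is the polynomial case from the bullet list earlier.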

You can build your own tables (can you say, “spreadsheet”?). In Table 1, I've generated the values for a function defined in my book (Math Toolkit for Real-Time Programming, CMP Books, 2000):


For the record, I got the table wrong in the book, tabulating the derivative rather than the function. It's correct in Table 1. The result is exact to nine digits.

There's only one problem with our result: It's rather useless.

Remember, Δ is the forward difference operator, which means that to compute the difference for x = 2, we need the value one step further along. To get higher differences, we need to look even further ahead. This may not be a problem if we're not working in real time. But if we are, then looking ahead means looking into the future, which we can't do.

Fortunately, there's an easy fix. Instead of using the forward difference Δ, use the backward difference, ∇. From Equation 14, we see that:

z⁻¹ = 1 − ∇   (20)
Taking the logarithm of both sides, as before, gives:

hD = ln z = −ln(1 − ∇)   (21)
We can use the same power series as before, substituting −∇ for Δ. The series becomes:

hD = ∇ + ∇²/2 + ∇³/3 + ∇⁴/4 + …   (22)

And the estimate of the derivative is:

D = (1/h)(∇ + ∇²/2 + ∇³/3 + ∇⁴/4 + …)   (23)
The new table looks like Table 2.

The operations we need now involve looking only backwards in the table (that's why ∇ is called the backward difference operator!). This, we can easily do by retaining a few past values of y.
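A sketch of that scheme, retaining a few past values and applying the backward-difference series of Equation 23 (the cubic samples are again my own stand-in):

```python
def backward_derivative(history, h, order):
    """Estimate the derivative at the newest sample, using past values only.

    history holds [y_n, y_{n-1}, y_{n-2}, ...], newest first.
    Implements D = (1/h)(∇ + ∇²/2 + ∇³/3 + ...), truncated at 'order'.
    """
    diffs = list(history)
    est = 0.0
    for k in range(1, order + 1):
        # Each pass replaces diffs with the next backward-difference column.
        diffs = [diffs[i] - diffs[i + 1] for i in range(len(diffs) - 1)]
        est += diffs[0] / k   # all terms positive, weights 1/k
    return est / h

# Stand-in samples of f(x) = x**3 ending at x = 1.0, with h = 0.1.
h = 0.1
hist = [(1.0 - i * h) ** 3 for i in range(5)]
print(backward_derivative(hist, h, 3))  # ≈ 3.0, with no look-ahead at all
```

Nothing in the loop ever touches a sample newer than y_n, which is exactly the property a real-time system needs.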

Forward or backward differences might have worked fine for the old astronomers, but in the modern world, we usually find it better to work directly with z . As you can see from Table 2, using the differences requires us to save not only past values of y but past values of the differences as well. We'd prefer to save only past values of y .

Can we do this? Absolutely. Remember from Equation 14:

∇ = 1 − z⁻¹   (24)
We could have written the estimate of the derivative directly in terms of z, or rather z⁻¹. The equation becomes:
D = (1/h)[(1 − z⁻¹) + (1 − z⁻¹)²/2 + (1 − z⁻¹)³/3 + (1 − z⁻¹)⁴/4 + …]   (25)

Expanding the series to fourth order, we get:

D ≈ (1/h)[(25/12) − 4z⁻¹ + 3z⁻² − (4/3)z⁻³ + (1/4)z⁻⁴]   (26)
Using this formula, we can clearly write the derivative in terms of only present and past values of yₙ. So what's the problem? To see it, try expanding Equation 25 to one less power of z⁻¹. Write:

D ≈ (1/h)[(1 − z⁻¹) + (1 − z⁻¹)²/2 + (1 − z⁻¹)³/3]   (27)
And expand to get:

D ≈ (1/h)[(11/6) − 3z⁻¹ + (3/2)z⁻² − (1/3)z⁻³]   (28)
Our formula is completely different!

This is the essence of the problem with using powers of z. It's true enough that to get a practical algorithm for estimating a derivative, we're going to need to truncate the series to some order. But the pattern of the terms in Equation 23 is easy to see, so we can either truncate the series or extend it to higher order, with little effort. On the other hand, we can't just use Equation 25 as written, because we have no idea how to evaluate the powers of 1 − z⁻¹. To get the result in a useful form, we have no choice but to expand the polynomials and collect terms. As soon as we do so, we get a set of coefficients that is different from the ones for higher or lower orders.
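The expand-and-collect step is mechanical, so we can let the machine do it. This sketch (my own, using exact rational arithmetic) expands the powers of 1 − z⁻¹ and collects coefficients; note how the order-3 and order-4 results share not a single coefficient:

```python
from fractions import Fraction

def z_coefficients(order):
    """Expand sum_{k=1..order} (1 - z^-1)^k / k and collect powers of z^-1.

    Returns [c0, c1, c2, ...] with hD ≈ c0 + c1*z^-1 + c2*z^-2 + ...
    """
    coeffs = [Fraction(0)] * (order + 1)
    poly = [Fraction(1)]                      # start from (1 - z^-1)^0
    for k in range(1, order + 1):
        # Multiply poly by (1 - z^-1): new[i] = poly[i] - poly[i-1].
        poly = [a - b for a, b in zip(poly + [Fraction(0)], [Fraction(0)] + poly)]
        for i, c in enumerate(poly):
            coeffs[i] += c / k
    return coeffs

print([str(c) for c in z_coefficients(4)])  # ['25/12', '-4', '3', '-4/3', '1/4']
print([str(c) for c in z_coefficients(3)])  # ['11/6', '-3', '3/2', '-1/3']
```

Using exact fractions matters here: these coefficients end up multiplying noisy measurements, so you want to round them exactly once, at code-generation time, not accumulate error while deriving them.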

When I have to implement the derivative estimate in a digital control system, do I use ∇ or z⁻¹? Actually, I use z⁻¹. In the control system, I want to do the minimum amount of work and save the minimum number of terms. I get that by using a form like Equation 28. But for analysis, I'm better off with Equation 23. So while I'm doing my design and analysis, I keep the series in powers of ∇. That way, I can keep my options open until I know how many orders I'm going to need. That number will depend on the sample rate and other factors, so determining the order is an integral part of the design.

But once I've completed the design and fixed both the sample rate and the order of approximation, the power series of Equation 23 becomes a truncated power series, like Equation 25 or 27. I'm now safe to expand the polynomials in z⁻¹ and collect terms. That gives me equations like Equation 26 or 28. It's not only feasible, but desirable, to use only functions of z⁻¹ in practice. But as you've seen, the values of the coefficients change if I change the order of the truncation in ∇, so expanding the polynomials is always my last step.
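As a closing sketch, here's how the fixed, fully expanded form might look inside a control loop. The third-order coefficients (11/6, −3, 3/2, −1/3) and the class organization are my own illustration of the approach, not code from the column:

```python
from collections import deque

# Coefficients of a third-order backward estimate, multiplying
# y[n], y[n-1], y[n-2], y[n-3] in turn.
COEFFS = [11.0 / 6.0, -3.0, 3.0 / 2.0, -1.0 / 3.0]

class DerivativeEstimator:
    """Real-time derivative estimate that stores only past samples."""
    def __init__(self, h):
        self.h = h
        self.history = deque(maxlen=len(COEFFS))   # newest sample first

    def update(self, y):
        self.history.appendleft(y)
        if len(self.history) < len(COEFFS):
            return None                            # still priming the history
        return sum(c * yi for c, yi in zip(COEFFS, self.history)) / self.h

# Feed samples of f(t) = t**2 at h = 0.01; the true derivative is 2t.
h = 0.01
est = DerivativeEstimator(h)
out = [est.update((i * h) ** 2) for i in range(5)]
print(out[-1])  # ≈ 0.08, i.e., f'(0.04); exact for a parabola
```

Per tick, the work is one store, four multiplies, three adds, and a divide, which is the "minimum amount of work" the fixed z⁻¹ form buys you.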

Jack Crenshaw is a senior software engineer at General Dynamics and the author of Math Toolkit for Real-Time Programming , from CMP Books. He holds a PhD in physics from Auburn University. E-mail him at .
