More on the Rosetta Stone

Over the last few months, I've been trying to explain, in depth, the origin and usefulness of the equation I like to call the Rosetta Stone:

    z = e^(hD)                                                (1)

This equation connects the continuous-time, real world we live in to the discrete-time world of computers. If you've been following the series, you know that, so far, we've covered the Taylor series, the exponential function, and synthetic division–all skills we'll need in what follows. In case you missed the earlier installments, here's a mini-review.

We started with the Taylor series, which I'm going to write a little differently than before.

    f(x) = Σ (n = 0..∞) (x^n/n!) [ d^n/dx^n f(x) ] at x = 0   (2)

The reason for the change is an ambiguity in the meaning of the term:

    d^n f(0)/dx^n

which is the form I used before. Taken literally, this term seems to imply a sequence:

• First evaluate f (x ) at x = 0

• Then take its derivative

That's incorrect. Since f (0) has a constant value, taking its derivative makes no sense. The real sequence is the opposite one:

• First take the derivative of f (x )

• Then evaluate the derivative at x = 0

In our discussions of the Taylor series, I hesitated to throw more notational changes at you than I had to, so I stuck with the more familiar notation. I hope and think I made the meaning clear. But there's no doubt that the notation of Equation 2 is hard to misinterpret. So we'll use that notation from now on.

The Taylor series, you'll recall, allows us to predict future values of a function–all future values–based on its value and all of its derivatives at some point, x = 0. Of course, if we need to use a value other than x = 0, we can do so with a simple change of variables.

I didn't derive the series using rigorous mathematics, but I did give some explanation, and enough examples to convince you, I hope, that the equation is both reasonable and true.
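In the same spirit of convincing by example, here's a minimal Python sketch (the function name is mine) of what the Taylor series claims: given only the value and derivatives of a function at one point, we can predict its value elsewhere. For f(x) = e^x, every derivative at x = 0 equals 1, so a dozen terms already land very close to e.

```python
from math import factorial, exp

def taylor_predict(derivs, x):
    """Predict f(x) from the value and derivatives of f at 0.

    derivs[n] holds the n-th derivative of f evaluated at x = 0
    (derivs[0] is f(0) itself). This is a truncated Taylor series.
    """
    return sum(d * x**n / factorial(n) for n, d in enumerate(derivs))

# For f(x) = e^x, every derivative at x = 0 equals 1.
approx = taylor_predict([1.0] * 12, 1.0)
print(approx, exp(1.0))
```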

After completing the study of the Taylor series, we looked at the exponential function, e^x, whose fundamental property is the fact that it's its own derivative:

    d/dx e^x = e^x                                            (3)

Based on this definition alone, we managed to use the Taylor series to derive the power series for the exponential:

    e^x = Σ (n = 0..∞) x^n/n! = 1 + x + x^2/2! + x^3/3! + …   (4)

If you're thinking ahead, you can probably see where I'm going with this, which is that the forms of the series in Equations 2 and 4 are the same.

Lastly, I showed you the technique of synthetic division, which allows us to derive relations like:

    1/(1 - x) = 1 + x + x^2 + x^3 + …                         (5)

You won't need this skill to derive the Rosetta Stone equation, but you're going to need it later, when we start applying it. By the way, I should have mentioned last time that the series only converges when |x| < 1. There's a moral in here somewhere, which is that elegant math doesn't protect us from stupidity. If |x| > 1, it's clear that every term in the series is larger than the previous one, so there's no way the series can converge.
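The convergence behavior is easy to check numerically. Here's a quick sketch (the function name is mine): inside the radius of convergence the partial sums settle onto 1/(1 - x); outside it, each term is bigger than the last and the partial sums run away.

```python
def geometric_partial(x, n_terms):
    # Partial sum 1 + x + x^2 + ... , from synthetic division of 1/(1 - x)
    return sum(x**k for k in range(n_terms))

# Inside the radius of convergence (|x| < 1), the sum approaches 1/(1 - x):
print(geometric_partial(0.5, 30), 1 / (1 - 0.5))

# Outside (|x| > 1), adding more terms just makes the sum bigger:
print(geometric_partial(2.0, 10), geometric_partial(2.0, 20))
```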

Smooth operators
We're almost there; I need only tell you about one more trick of mathematics, which is the concept of operators.

If you've ever done any kind of programming, I don't have to tell you what an operator is. “+” is an operator. So are “-”, “*”, and “/”. In words of few syllables, an operator takes one or more arguments and does something to them. The symbol tells us what is to be done to the argument(s). The symbol “+” says to add them. In C++ parlance, the operator may be overloaded. Adding two integers is much like adding two floats or two complex numbers. Adding two vectors or matrices is very different in detail, but the principle is exactly the same.
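The overloading idea works the same way outside C++. As a small sketch (the class and names are mine, for illustration), here's Python's version: the “+” symbol keeps its meaning, while the rule behind it changes with the operand type.

```python
class Vec2:
    """A toy 2-D vector, to show '+' being taught a new rule."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # The rule for '+' on vectors: add component by component.
        return Vec2(self.x + other.x, self.y + other.y)

print(1 + 1)                   # '+' on integers
print(1.23456 + 6.54321)       # '+' on floats
v = Vec2(1, 2) + Vec2(3, 4)    # '+' on vectors, via our overload
print(v.x, v.y)
```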

As fundamental a concept as this is, it's an extremely powerful one. The central idea is that we don't care what the value of the operand is–or even its type. Adding 1+1 is no different, in principle, than adding 1.23456 and 6.54321.

Now, this part is key: associated with the operator is a set of rules for what to do with the operand(s). Once you've got the rules down, it doesn't really matter what symbol you use. I could just as easily use:

x banana y


instead of x + y , as long as we've agreed on how the operator “banana” is to be interpreted.

For this study, I'm particularly interested in the symbol for the derivative of a function. As you know, the operation:

    d f(x)/dx                                                 (6)

implies doing something to f(x)–in this case, measuring its slope for all, or at least a range of, values of x. By now it should be obvious that it really doesn't matter that the symbol for differentiation is:

    d/dx

Here are some synonyms:

    df/dx ,   f′(x) ,   ḟ(x)                                  (7)

(That last one is a special case, used by us physicists when the independent variable is time.) I'll mention in passing that the Taylor series is much more compact if we use one of the other notations. Derivatives of any order can be defined in terms of derivatives of lower order:

    d^n f/dx^n = d/dx [ d^(n-1) f/dx^(n-1) ]                  (8)

By now, I hope we can agree that the symbol doesn't matter; it's only a placekeeper to explain an operation. Now I'm going to introduce yet another synonym, D, which we can agree is defined by:

    D f(x) = df(x)/dx                                         (9)

As before, as long as we can agree that D means “differentiate with respect to the independent variable,” the operation is thoroughly clear. As in Equation 8, we must also agree that:

    D^n f(x) = D [ D^(n-1) f(x) ]                             (10)

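The recursive definition of D^n is easy to make concrete. Here's a sketch (names mine) in which D acts on a polynomial, represented as its list of coefficients; on that representation, differentiation is exact, and D^n falls out of D by plain recursion.

```python
def D(coeffs):
    """Differentiation as an operator on polynomial coefficient lists.

    coeffs[k] is the coefficient of x^k, so D maps
    a0 + a1*x + a2*x^2 + ...  ->  a1 + 2*a2*x + 3*a3*x^2 + ...
    """
    return [k * a for k, a in enumerate(coeffs)][1:] or [0]

def Dn(coeffs, n):
    # Higher-order derivatives, defined recursively: D^n f = D(D^(n-1) f)
    return coeffs if n == 0 else D(Dn(coeffs, n - 1))

print(D([1, 3, 2]))          # d/dx (1 + 3x + 2x^2) = 3 + 4x
print(Dn([0, 0, 0, 1], 3))   # third derivative of x^3 is the constant 6
```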
Armed with this definition, I can now work on Equation 2. It becomes:

    f(x) = Σ (n = 0..∞) (x^n/n!) [ D^n f(x) ] at x = 0        (11)

Now comes the really neat step. Watch carefully, and notice there's nothing up my sleeve. Imagine that I can “factor out” the function itself:

    f(x) = [ Σ (n = 0..∞) (x^n/n!) D^n ] f(x) at x = 0        (12)

Whoa! What happened there? Can this possibly be right? Can we factor out the operand of an operator? Of course we can. It's no different than writing:

    a x + b x = (a + b) x                                     (13)

We've already agreed that an operator can operate on any operand, so the meaning of Equation 12 is precisely identical to that of Equation 11: take multiple derivatives of f (x ), and evaluate them at x = 0. We've only shortened the symbolism a bit.

Now look at the form of the series in brackets, in Equation 12. It's identical to the form of the exponential function. Hence we can write:

    f(x) = [ e^(xD) f(x) ] at x = 0                           (14)

If factoring the operand out of the Taylor series seemed weird to you, taking the exponential of an operator must really blow your mind. Remember, though, that the symbols used don't matter. The expressions of Equations 11 and 14 are identical, because we've already established what the terms mean. Equation 14 doesn't relieve us of any effort. It only allows us to express the chore in a more compact way.
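Remember, e^(xD) is still just the bracketed series of Equation 12 in disguise. As a sketch (the function name is mine): given the derivatives of f at the expansion point, applying the truncated series Σ (x^n/n!) D^n shifts f a distance x. For a cubic, the derivatives run out after D^3, so the series terminates and the shift is exact.

```python
from math import factorial

def exp_xD(derivs, x):
    """Apply e^(xD) via its power series, Σ (x^n/n!) D^n.

    derivs[n] is the n-th derivative of f at the expansion point;
    the result is f evaluated a distance x away from that point.
    """
    return sum(d * x**n / factorial(n) for n, d in enumerate(derivs))

# f(t) = t^3 at t = 1: f = 1, f' = 3, f'' = 6, f''' = 6, all higher are 0.
print(exp_xD([1, 3, 6, 6], 1.0))   # should reproduce f(2) = 8
```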

The importance of being discrete
So much for the continuous world where f (x ) resides. Now it's time to look at the case where things aren't continuous. In the past, I've referred to the Rosetta Stone equation as connecting the worlds of continuous time and discrete time, which it does. It should be clear, though, that the independent variable needn't be time. In the equations so far, I've deliberately used the more general variable, x . We use time only because that's the variable that's most useful inside embedded computers.

Now suppose that we've got an embedded computer hooked up to some real, physical system–the cruise control on a car, perhaps. We can arrange for the speedometer to generate a voltage proportional to the speed, and measure that voltage using an analog-to-digital converter (A/D). The voltage, like the speed, is a continuous function of time, v (t ), as shown in Figure 1.

Although the voltage is continuous, that's not the view our computer gets of it. It can only read the A/D at specific times, controlled by our software. Those measured values are represented by the orange dots in Figure 1. I'll call them y0, y1, …, yn. For the record, the set isn't infinitely long. While it may go back in time a long way, it can't go forward beyond the current time, tn.

In general, it's not enough to know what the measured voltages are at each point; we must also know what time the measurements were taken. In the parlance of mathematics, each of the orange dots in Figure 1 represents an ordered pair, {ti, yi}. To fully represent the measured data, we'd have to store the time values along with the measured voltages. Data of this kind is called time-tagged data.

However, in Figure 1, I've shown the measurements being taken using uniform time intervals, Δt. We almost always do this, if possible, because the math gets very much harder if we allow the data to be taken with unequal time intervals. For all the control systems you're likely to see, the measurement intervals will be as uniform as we can possibly make them. In such cases, we don't need to store the time tags, because we can compute them from:

    ti = t0 + i Δt                                            (15)

When dealing with the Rosetta Stone operators, you can safely assume that the time intervals will always be uniform.
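In code, that's the whole savings: with uniform sampling, the time tag is a one-line function of the index, so only the measurements need to be stored. A minimal sketch (names mine):

```python
def time_tag(i, t0, dt):
    # With uniform sampling there's no need to store time tags;
    # the i-th sample time is reconstructed from the index alone.
    return t0 + i * dt

# Sample times for a 20 Hz loop (dt = 0.05 s) starting at t0 = 0:
print([time_tag(i, 0.0, 0.05) for i in range(4)])
```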

Now, here's the important part: although the values yn represent measurements of the voltage v(t), the two are very, very different animals. The voltage is a continuously varying function of time, represented by the smooth curve. The measurements yn are a discrete set, corresponding only to specific times. To the computer, it makes no sense to talk about intermediate values like y4.5. To the computer, those intermediate values don't exist. The yn are all there is.

On the other hand, we're trying to control a device that's continuously varying. We want to control v (t ), not yn . This, then, is the crux of the relationship between the worlds of continuous time and discrete time. In the computer, we're measuring the voltage only at discrete points, but we must use those measurements to infer something about the continuous function. That's where the Rosetta Stone comes in.

Back to the Taylor Series
Now let's return to the Taylor series, which we've managed to simplify down to the compact formula of Equation 14.

For simplicity, I've consistently assumed that the expansion is about the point x = 0. It should be clear, however, that the formula still works for any starting value. We can make the switch with a simple change of variables. Define:

    t = tn + x                                                (16)

    x = Δt                                                    (17)

Substituting these into Equation 14 gives:

    v(tn + Δt) = [ e^(Δt D) v(t) ] at t = tn                  (18)

We need only three more steps. The first one is trivial. First, we'll replace Δt by the measurement step size, h . We're not doing anything different here, just changing a symbol. Other than the fact that typing one character is easier than typing two, I can't explain why we use h instead of Δt . The reasons are lost in the mists of time. But this is the symbol used in most numerical calculus texts (including mine), so I won't break with tradition here.

The second step is both subtle and profound. Remember that we said that the measured values, {y0, y1, …, yn}, are the values of v(t) at times {t0, t1, …, tn}. That is:

    yn = v(tn)                                                (19)

It would certainly seem reasonable, then, to write Equation 18 as:

    yn+1 = e^(hD) yn                                          (20)

What does it mean?
Never forget how the expression is to be interpreted. Again, we're using notation that is more compact but must be interpreted to mean exactly the same thing as the original Taylor series. Specifically, terms like Dyn are to be interpreted as:

• First take the derivative of v (t )

• Then evaluate the derivative at t = tn

At this point, you should be asking, “But how do I get the derivatives? I'm only measuring voltage.” Aye, there's the rub. In the real world, we don't really know the derivatives of v (t ). Because we can't know those derivatives, Equation 20 might seem rather useless. But it's not, as you'll see in a moment.

Until now, we haven't invented anything new. Equation 18 says exactly the same thing as Equation 2; it just says it in a much more compact way. We haven't gotten out of any work (yet). We can think of Equation 18 simply as shorthand that reminds us of the steps we must take.

Between Equations 18 and 20, however, lies a great gulf; a leap of faith. It's the same leap of faith that Oliver Heaviside took when he invented the method called operational calculus circa 1885. When he first introduced it, operational calculus caused a firestorm of controversy in the halls of mathematics. Pure mathematicians screamed that it was not rigorous, that there was no mathematical proof of its validity. For many decades, scientists were split on the issue. The pure mathematicians continued to argue that it was mathematically unsound, while physicists and applied mathematicians happily used it to get useful results. In the end, someone came up with a rigorous mathematical proof of correctness, which finally stifled the debate.

While working with equations like Equations 18 and 20, Heaviside noticed that, though symbols like D clearly represent operators, he could perform algebra on them just as though they were ordinary algebraic variables. In doing so, he always got the right result. This is the central concept of operational calculus.

Since Heaviside couldn't prove that his manipulations were mathematically correct, you can see why the mathematicians were nervous. But Heaviside didn't seem to mind very much. He was getting results.

Introducing z
I'll get back to the usefulness of Equation 20 in a moment. For now, let's complete the transformation. The z operator is one that operates on tables of discrete data, rather than continuous functions. Its purpose is to advance the “pointer” one step forward in the table. Thus, by definition:

    z yn = yn+1                                               (21)

If z advances us one step farther along in our data set, you should not be surprised to learn that its inverse, z^-1, takes us one step backwards. This inverse turns out to be much more useful than the z operator itself. Since neither we nor our embedded computer is prescient, we can't advance further in the data stream than the last measured point. If yn represents the last measurement, then yn+1 hasn't happened yet. On the other hand, we can easily get yn-1, yn-2, and so on. All we have to do is look at older data in our data stream. Control systems types call z^-1 the unit delay, since it takes us one position backwards in the data array.
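On a stored data array, the unit delay is nothing more than indexing backwards. A minimal sketch (the function name is mine):

```python
def z_inv(samples, n, steps=1):
    """The unit delay z^-1: step backwards through measured data.

    samples[n] is the latest measurement y_n; z^-1 y_n is y_(n-1),
    and applying it 'steps' times gives y_(n-steps).
    """
    return samples[n - steps]

y = [2.0, 4.0, 8.0, 16.0]    # y_0 .. y_3; y_3 is "now"
print(z_inv(y, 3))           # y_2: one unit delay
print(z_inv(y, 3, 2))        # y_1: two unit delays
# z itself (advancing to y_4) is impossible: that sample doesn't exist yet.
```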

We're now in a position to take the final step in the transformation from Taylor series to Rosetta Stone. Substitute the left side of Equation 21 into Equation 20, to get:

    z yn = e^(hD) yn                                          (22)

Now we have an interesting situation. Both z and e^(hD) are operators, operating on the tabular values yn. But the operand is the same on both sides of the equation. So, following in Heaviside's footsteps, we can “factor out” the yn, to get the Rosetta Stone equation in its purest form:

    z = e^(hD)                                                (23)

Manipulating operators
At this point, an important threshold has been crossed. Equation 22 shows two different sets of operations on the argument, yn. But Equation 23 has no argument at all; it contains naked operators without operands. This is the essence of Heaviside's contribution. He suggested that we can manipulate the operators algebraically, without regard for the thing they're operating on. This isn't really a radical concept. After all, we think of operators like “+” in a similar way. But, as it turns out, the suggestion that we can transform the operators to different forms, while retaining the validity of the relation when we use it to operate on something, is both hugely profound and hugely useful.

We'll be talking about some of the uses in future issues. For now, I want to show you one transformation, which is the reason why we aren't worried about the fact that we don't know the derivatives of v (t ).

If Heaviside is right, and we can manipulate the operators, we should be able to solve Equation 22 for D, to get:

    D = (1/h) ln z                                            (24)

Now we're getting somewhere! To see where, let these operators operate on yn as before:

    D yn = (1/h) ln(z) yn                                     (25)

Look at what comes out of the left-hand side: it's those elusive derivatives of yn (or, more precisely, of v(t) evaluated at tn). What we have is a formula for numerical differentiation. Given the measurements yn of v(t), we can figure out what the derivatives are by operating on the tabular data.

Is that cool, or what?

Of course, you're surely wondering how the heck we compute the operator ln(z) operating on yn. Answer: in the same way as we did for Equation 22: expand the function as a power series–this time in z–and use it to operate on yn.
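As a preview of where that leads, here's one common route, sketched in Python (the function names are mine, and the details come later in the series): since z^-1 yn = yn - (yn - yn-1), we can write z^-1 = 1 - ∇, where ∇ is the backward difference. Then ln z = -ln(1 - ∇) = ∇ + ∇²/2 + ∇³/3 + …, and Equation 25 becomes a recipe built entirely from differences of stored samples. For samples of v(t) = t², the series terminates, so the answer comes out exact.

```python
def backward_diffs(y, order):
    """Backward differences at the newest sample: result[k] = ∇^k y_n."""
    diffs, col = [y[-1]], list(y)
    for _ in range(order):
        col = [b - a for a, b in zip(col, col[1:])]
        diffs.append(col[-1])
    return diffs

def derivative(y, h, order=4):
    """Numerical differentiation from hD = ln z = ∇ + ∇²/2 + ∇³/3 + ..."""
    d = backward_diffs(y, order)
    return sum(d[k] / k for k in range(1, order + 1)) / h

# Samples of v(t) = t^2 at t = 0, 1, 2, 3; the exact derivative at t = 3 is 6.
y = [0.0, 1.0, 4.0, 9.0]
print(derivative(y, 1.0, order=3))
```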

Where have we been?
Let's review what we've done so far. I began with the Taylor series and gave you enough examples to, I hope, convince you that it works. Next, I manipulated the form into the compact form of Equation 22 by noting the similarity between the terms in Taylor's series and the power series for ex . Next, inspired by Heaviside's operational calculus, I gave you the “naked operator” form, which is our Rosetta Stone. Finally, I noted that once we get this form, we can manipulate the operators in the same way we manipulate algebraic variables, to get other relationships.

I'll be showing you a number of examples in the coming months. I'll show you the details of applying Equation 25 to perform numerical differentiation. I'll also show you formulas for numerical integration and interpolation, all derived directly from Equation 24. See you then.

Jack Crenshaw is a senior software engineer at General Dynamics and the author of Math Toolkit for Real-Time Programming, from CMP Books. He holds a PhD in physics from Auburn University. E-mail him at .
