# Rosetta Stone explained

As promised, this month we're going to talk about my favorite relationship, which is the operator relationship:

$$z = e^{hD} \tag{1}$$

where *h* is the time step between samples.

This innocent-looking equation pretty much captures everything we need to know about the transition from the universe of continuous time in which we live (where the operator *D* makes sense) to the world of digital computers, digital signal processing, and digital control systems, where *z* lives. Though its development is purely theoretical, Equation 1 has immediate and profound usefulness in all those areas, plus a lot more. Because it connects the analog and digital worlds, I call it the Rosetta Stone.

I've derived the expression a couple of times before in this column, but I always manage to leave some folks behind. Even those who get it, I fear, don't really see how to put the relationship to work to solve their problems. So it remains an elegant, but essentially useless, curiosity—a mantelpiece gimcrack.

I intend to fix that situation, starting now. I'll explain both the derivation and the use of the equation so thoroughly that everyone reading this will get it. To help you understand this concept, I'll start from a point further back where we're all comfortable and work forward in small steps. Then I'll go further forward describing more applications to help you understand how to put the thing to work.

I can't do this in one column, so bear with me; we're embarking on yet another multi-issue study. Grab a bite, drink a Jolt cola, and let's get started.

**The Taylor series**

The jumping-off point for the Rosetta Stone derivation is the Taylor series, which looks like this:

$$f(x+h) = f(x) + h\,\frac{df}{dx} + \frac{h^2}{2!}\,\frac{d^2f}{dx^2} + \frac{h^3}{3!}\,\frac{d^3f}{dx^3} + \cdots \tag{2}$$

What's that you say? You're already lost? Well, duh! What a surprise. Last time, I used this equation as a great example of the kind of thing that makes people's eyes glaze over and causes their brains to shift into neutral. What was I thinking, starting there?

First, we have to step back far enough so that, when I show you Equation 2 again, you'll not only understand it, but accept it as absolute and obvious fact. To do that, let's go back and look at what functions are and how they work.

**Functions**

Remember Moore's Law? It says that computer capabilities double every 18 months. When graphed, as in Figure 1, the law shows that after 10 years, computer capability should have gone up by a factor of something over 100. In 20 years, it should be up by a factor of 10,000 or so, and that prediction is pretty darned close.

**Figure 1: Moore's Law**

The graph in Figure 1 is an example of a *functional relationship*. It says that the computing capacity depends upon the time. Give me a time, and I'll tell you how much computing capacity has increased. It's conventional to emphasize that things like capacity are functions of other things, like time, this way:

$$\text{capacity} = f(\text{time}) \tag{3}$$

The purpose of the parentheses is to tell you what *C* is a function of.

Because we physicists and mathematicians are a lazy lot, we tend to use single-letter, or few-letter, identifiers when we can. Bad practice for programming, but good practice in algebra. A shorter version of Equation 3 is:

$$C = f(t) \tag{4}$$

which we would read, in English, "*C* equals *f* of *t*." Even more concisely, we may simply write:

$$C(t) \tag{5}$$

("*C* of *t*")

Either form makes it crystal clear that *C* is *dependent* on *t*. Not surprisingly, we call *t* the *independent variable*, and *C* the *dependent variable*. When we plot the relationship as a graph, we always plot the independent variable on the x- (horizontal) axis, as in Figure 1, and the dependent variable on the y- (vertical) axis. The reason is purely historical; it's tradition.

Not just any old curve on a graph qualifies to be called a function. The curve of Figure 1 is special in a number of ways. First and foremost, the independent variable—*t* in this case—can take on any value, at least within some specified range. There can be no gaps in the curve because of some restrictions on *t*. In short, the independent variable is *continuous*.

Later on, when we get to digital systems, we'll deal with cases where this assumption isn't true. In fact, the whole purpose of Equation 1 is to let us move between the two universes, those of continuous time and discrete time. But right now we're still in the domain of continuous variables. In computer terms, think floating point rather than integers.

Next, a function must be *single-valued*, which means that, given a value of the independent variable, there can be only one value of the dependent variable. That's certainly the case in Figure 1. Given a value of *t*, only one value of *C* fits the functional relationship.

Finally, the function itself must be *continuous* . We can give a very precise mathematical definition of a continuous function, but intuition works well here, also. In layman's terms, we should be able to draw the graph of the relationship without lifting our pen.

**Figure 2: Piecewise-continuous functions**

The mathematically inclined among you will see right away that what I just said isn't rigorously true. Plenty of useful functions aren't continuous. Figure 2 shows some examples. In this figure, the curves are well-defined, nicely continuous functions, *except* at certain discrete values of *x*—the yellow dots in the figure. Usually the number of such points is relatively small over some specified range. Except at these special points, *x* can take on any value, and there is only one, well-defined value of *y*. We say that such curves are *piecewise-continuous*, and they represent acceptable functions.

**Forecasting the future**

With the benefit of the definitions concerning functions, we're now armed to attack the central problem of this analysis, which is predicting future values of a function. When doing such things, I like to start with the simplest case I can think of (the "Hello, world" of math). Instead of me just giving you the function, let's make a game of it. I'm going to give you a few values of *y*(*x*). Your challenge is to tell me what the next value will be, with the help of Table 1.

**Table 1: Guess my value**

| *x* | *y*(*x*) |
|---|---|
| 0 | 1.23 |
| 1 | 1.23 |
| 2 | 1.23 |
| 3 | 1.23 |
| 4 | 1.23 |
| 5 | 1.23 |

Not sure? Maybe the graph in Figure 3 will help.

**Figure 3: Hello, constant**

So what's the predicted value at *x* = 6? Clearly, the most likely value is (duh!) 1.23.

As silly as this example might seem, there's an important lesson to be learned. In the absence of any other information, the best guess value for the next point is the value the function had last. Simple. Let's put this concept into a mathematical form. Write:

$$y(x+1) = y(x) \tag{6}$$

Since no one's given us a limit on the range of *x*, we might as well assume it can take on any value. In this case, it doesn't matter anyway; Equation 6 is a statement that *y* doesn't really depend on *x* at all.

OK, what's the next step? How about a straight line? Table 2 and Figure 4 break it down for you.

**Table 2: A straight line**

| *x* | *y*(*x*) |
|---|---|
| 0 | 1.23 |
| 1 | 2.69 |
| 2 | 4.15 |
| 3 | 5.61 |
| 4 | 7.07 |
| 5 | 8.53 |

**Figure 4: Hello, line**

From your college algebra, you might remember the concept of a slope. We can find the slope of a straight line by picking two points—it doesn't matter which two, since the line is straight—and plugging the values into the formula:

$$m = \frac{y_2 - y_1}{x_2 - x_1} \qquad \text{(slope)} \tag{7}$$

For no good reason, I'll use:

$$x_1 = 0, \quad y_1 = 1.23$$

$$x_2 = 5, \quad y_2 = 8.53$$

Equation 7 gives:

$$m = \frac{8.53 - 1.23}{5 - 0} = \frac{7.30}{5} = 1.46 \tag{8}$$

The slope of the line is 1.46, which means that every time *x* advances by one unit, *y* will advance by 1.46 units. From this, we can guess that when *x* is 6, *y* will have advanced to:

$$y(6) = y(5) + m = 8.53 + 1.46 = 9.99 \tag{9}$$

More importantly, we can guess what *y* will be for *any* new value of *x*. One form of the equation is based on the *y-intercept*, which is the value of *y* when *x* is zero. The equation is:

$$y(x) = mx + y(0) \tag{10}$$

Plug in *x* = 6 to get:

$$y(6) = 1.46(6) + 1.23 = 9.99 \tag{11}$$

**Hello, curve**

I suppose you can guess what's coming next. I'll step up to the next level, which is a quadratic. No, you won't have to solve the quadratic equation. That's required only if we're going the other way, where the curve has two *x*-values for every *y*. In this case, we're going the "easy way." Table 3 and Figure 5 tell the story.

**Table 3: A quadratic**

| *x* | *y*(*x*) |
|---|---|
| 0 | 1.23 |
| 1 | -1.36 |
| 2 | -2.71 |
| 3 | -2.82 |
| 4 | -1.69 |
| 5 | 0.68 |

**Figure 5: Hello, curve**

As we get into higher and higher orders for the function curve, we're going to find that the math gets messy in a hurry. The rules for fitting a quadratic equation are not outrageously hard, but not trivial, either. Since we've done it before in these pages, I won't go into the details. Instead, I'll just outline the general approach.

One aspect of the solution we could probably have predicted. For the case where the function was constant, we needed only one value: the value of *y* at any value of *x*. For the case of a straight line, we needed two points, denoted by {*x*₁, *y*₁} and {*x*₂, *y*₂}. It probably won't surprise you to learn that we're going to need three points to fit the quadratic. As before, any three will do, as long as their x-values are distinct.

We begin with a general form of the quadratic equation:

$$y(x) = A + Bx + Cx^2 \tag{12}$$

Evaluate this function at three points to get:

$$\begin{aligned}
y_1 &= A + Bx_1 + Cx_1^2 \\
y_2 &= A + Bx_2 + Cx_2^2 \\
y_3 &= A + Bx_3 + Cx_3^2
\end{aligned} \tag{13}$$

These represent three equations in three unknowns: the coefficients *A*, *B*, and *C*. After a little algebra, we get the algorithm:

$$m_1 = \frac{y_2 - y_1}{x_2 - x_1}, \qquad m_2 = \frac{y_3 - y_2}{x_3 - x_2} \tag{14}$$

$$C = \frac{m_2 - m_1}{x_3 - x_1} \tag{15}$$

$$B = m_1 - C(x_1 + x_2) \tag{16}$$

And, finally:

$$A = y_1 - Bx_1 - Cx_1^2 \tag{17}$$

Let's try this solution on our example function. We can choose any three points in Table 3, though it's better if they're far apart. I'll arbitrarily choose:

| *x* | *y* |
|---|---|
| 0 | 1.23 |
| 2 | -2.71 |
| 4 | -1.69 |

Substituting these values into Equation 14 gives:

$$m_1 = \frac{-2.71 - 1.23}{2 - 0} = -1.97, \qquad m_2 = \frac{-1.69 - (-2.71)}{4 - 2} = 0.51 \tag{18}$$

So:

$$C = \frac{0.51 - (-1.97)}{4 - 0} = \frac{2.48}{4} = 0.62 \tag{19}$$

Now that we have *C*, we can solve for *B*. Equation 16 gives:

$$B = -1.97 - 0.62(0 + 2) = -3.21 \tag{20}$$

Finally, from Equation 17, we get:

$$A = 1.23 - B(0) - C(0)^2 \tag{21}$$

This one is easy, because I chose *x*₁ to be zero. The formula reduces to:

$$A = y_1 = y(0) = 1.23 \tag{22}$$

This point deserves emphasis. When *x* = 0, the function is always equal to its y-intercept. Just as with the lower-order cases, this value has special importance. It's important enough to emphasize it by changing our general equation to:

$$y(x) = y(0) + Bx + Cx^2 \tag{23}$$

We should write the equation this way, even if we don't use zero as one of our x-values. However, if we have to evaluate the function there anyway, we might as well insist on using this value for one of the *x*'s. If the range of the function doesn't include *x* = 0, we can always make sure it does with a change of variables.

Now we have all three coefficients, and can write down the fitted function:

$$y(x) = 1.23 - 3.21x + 0.62x^2 \tag{24}$$

The next entry in Table 3 should be:

$$y(6) = 1.23 - 3.21(6) + 0.62(6)^2 = 4.29 \tag{25}$$

From Figure 5, this looks about right.

Before leaving the subject of fitted polynomials, I want to show you a much more elegant way of finding the coefficients. A word of warning, though: it does involve matrix algebra. If you don't feel comfortable with that, you can skip to the next section.

Considering Equation 13 one last time, let's write its three equations in matrix-vector form:

$$\begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{bmatrix}
\begin{bmatrix} A \\ B \\ C \end{bmatrix} =
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \tag{26}$$

Inverting the matrix gives:

$$\begin{bmatrix} A \\ B \\ C \end{bmatrix} =
\begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{bmatrix}^{-1}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \tag{27}$$

Inverting the matrix by hand is not fun, but a math tool like Mathcad (see sidebar) will do it in an eyeblink. If you prefer, Mathcad will even invert the matrix symbolically, so you can write down the solutions for *A*, *B*, and *C* for the general case. Extension of the method to higher orders is obvious.

**A slippery slope**

Let's review where we've been so far. For the three orders of polynomials, our fitting equations turn out to be:

$$y(x) = y(0) \qquad \text{(0th order)} \tag{28}$$

$$y(x) = y(0) + mx \qquad \text{(1st order)} \tag{29}$$

$$y(x) = y(0) + Bx + Cx^2 \qquad \text{(2nd order)} \tag{30}$$

Now we can begin to see a deeper structure behind the solution. All of the equations have the y-intercept, *y*(0), as the constant term. In the case of the constant function, it's the only term, since the function value doesn't change with *x*. In the first-order fit, the next coefficient is the slope, *m*, and it also doesn't change with *x*. For the quadratic, the last coefficient is *C*. As it turns out, this one represents the curvature of the curve, and it doesn't depend on *x*, either. Try the fit for any three points in Table 3, and you'll see that you always get the same value for *C*. Only the middle coefficient, *B*, changes with the choice of *x*'s. If we were to write down the solutions for higher orders, we'd find that this rule continues: the first and last coefficients don't depend on the *x*'s, but the others do.

Let's look a bit harder at that elusive value of *B*. Using nothing more than dimensional analysis, we can see that it always has the character of a slope. That is, regardless of what physical parameters *x* and *y* represent, the units of *B* in Equation 21 must be the same as those of *m* in the linear case, or else the units of *y* won't come out right. The units of *m* are:

$$[m] = \frac{[y]}{[x]}$$

The units of *B* must be the same, since Equation 16 contains *m*₁. And, indeed, the units of the other term in that equation agree, as they must:

$$\left[\,C(x_1 + x_2)\,\right] = \frac{[y]}{[x]^2}\,[x] = \frac{[y]}{[x]}$$

But what, exactly, is this slope called *B*? Clearly, the slope of the curve changes with *x*; that's why it's called a curve. But of all the slopes in Figure 5, which one is *B*?

To answer that question, we're going to have to turn to calculus. Don't panic, however. We don't need *much* calculus; just the rule for differentiating a power of *x*:

$$\frac{d}{dx}\,x^n = n\,x^{n-1} \tag{31}$$

The derivatives of the first few orders are:

$$\frac{d}{dx}(1) = 0, \qquad \frac{d}{dx}(x) = 1, \qquad \frac{d}{dx}\left(x^2\right) = 2x, \qquad \frac{d}{dx}\left(x^3\right) = 3x^2 \tag{32}$$

Let's agree that the slope of the function, at any *x*, is *m*(*x*), which is now not a constant. Using these rules of differentiation, the derivative of the quadratic function is:

$$m(x) = \frac{dy}{dx} = \frac{d}{dx}\left[\,y(0) + Bx + Cx^2\,\right] \tag{33}$$

or

$$m(x) = B + 2Cx \tag{34}$$

What is this slope when *x * = 0? Clearly, it's:

$$m(0) = B \tag{35}$$

Now we see what the value of *B* represents: **it's the slope when x = 0**. A slope-intercept, if you will. Now let's go back to Equation 30, which we can write:

$$y(x) = y(0) + m(0)\,x + Cx^2 \tag{36}$$

Looking at this equation, it's tempting to assume that *C* must represent the "y-intercept" of curvature. That is, it's the second derivative. But that would be wrong, as we can see by differentiating Equation 34 one more time:

$$\frac{d^2y}{dx^2} = \frac{dm}{dx} = 2C \tag{37}$$

So *C* isn't exactly the second derivative; it's only proportional to it. Therefore we must write, for the quadratic:

$$y(x) = y(0) + m(0)\,x + \frac{1}{2}\left(\frac{d^2y}{dx^2}\right)_0 x^2 \tag{38}$$

To extend the formula to higher orders, note that each time we differentiate higher orders, we get some coefficients out front. For example:

$$\frac{d^4}{dx^4}\,x^4 = 4 \cdot 3 \cdot 2 \cdot 1 = 24 \tag{39}$$

Notice anything familiar about that value, 24? Yep, it's the factorial, 4!. And from the steps of the derivations, it's easy to see how this product gets there. Each time we differentiate, we get a coefficient that's one less than the previous one.

Once you see that point, it's not hard to believe that the *n*th derivative of the function relates to its "y-intercept" by a ratio that's equal to a factorial. This inspires us to guess, for the next order:

$$y(x) = y(0) + m(0)\,x + \frac{1}{2!}\left(\frac{d^2y}{dx^2}\right)_0 x^2 + \frac{1}{3!}\left(\frac{d^3y}{dx^3}\right)_0 x^3 \tag{40}$$

Yes, I know, I know. Making a