Using vectors and matrices simplifies math and code tremendously and reduces the chances you'll make programming errors.
This month, we're starting out on a new adventure into the world of vector and matrix math. I'll be writing a lot more C++ software than I've done in a long time.
First, however, there are a couple of loose ends I'd like to tie up regarding the Rosetta Stone equation. Each of them requires some explanation, so bear with me as I wrap them up.
A reader gently chided me for not explaining the bilinear transform. He was absolutely right; this is a transform you need to know about. Here's why.
As I've been emphasizing for months, a digital computer does things in discrete steps. We tell it what to do in the language of discrete time, which for us means the delay operator, z -1 . However, many controls engineers still think and do their design work in the domain of continuous time. The connection, of course, is the Rosetta Stone equation:
Where D is the derivative operator:
Controls engineers don't use D , though. They speak in terms of the Laplace operator s . The mathematical basis behind Laplace transforms is highly theoretical and too complicated for us to discuss here. I covered the theory in depth in my 1994 articles.1, 2
It turns out, however, that we don't need all that theory. In practice, the operators s and D are indistinguishable. So we can simply change the notation and move on.
Why do controls engineers like to use s ? It's because the Laplace transform gives them access to a huge body of knowledge and techniques such as Nyquist diagrams, Bode plots, root locus plots, the Routh criterion, and more. Such tools help immensely in designing a control system or signal filter. Filters are used to modify signals in some desired way. Think “rumble filter” or “treble control” on your stereo, and you won't be far wrong.
Using s , engineers can tailor a filter so it modifies a signal in pretty much any way they like. A filter is neatly described by its transfer function , like the low-pass filter:
Some dark day, you may be given a transfer function based on s . Your challenge is to turn it into an equivalent, discrete-time form using z -1 . That's where the bilinear transform comes in. The transform, in turn, comes straight from the Rosetta Stone equation. Reverting Equation 1 (and substituting s for D ) gives:
Remember the rule, which came from the Taylor series: we can't use functions like ex or ln x directly; we don't know how to take the logarithm of an operator. Instead, we must consider such functions to be simply shorthand for expressing a power series. To proceed, we need a power series for the logarithm.
You'll find that series in any good engineering handbook:
Chop the series off after the first term, and you get the bilinear transform:
In past work with the Rosetta Stone, we strove for more accuracy by using higher-order approximations to the power series. It may seem that use of only the first-order term is a very crude approximation, and in a sense that's true. What saves us, however, is the effect of sampling rate. As the rate goes up, the value of h goes down, but so does the term in z . That's because, as you sample more often, the signal changes less between samples. In other words, samples xn and x n -1 are more nearly equal. For this reason, the approximation implicit in the bilinear transform gets better and better as we increase the sampling rate.
In a nutshell, if you're given an s -domain transfer function like Equation 3, simply substitute for s using the bilinear transform, and simplify the equation. It helps tremendously to have an automated math aid like Mathcad.
Because the result depends on the sampling rate, I can't show you a universal form for the new transfer function. However, for h = 1, it's:
Applying this function to a digital signal gives something like:
Which leads to:
Remember that each power of z -1 is a delay. Given this, we can solve for the output value of y , which is its un-delayed term:
Here's a form you haven't seen before: a relationship involving delayed values of the output as well as the input signals. It should be clear, though, that it's no biggy to generate such delayed values. We need only build one more delay line, and a short one at that. From Equation 10, you should be able to implement the filter without breaking a sweat.
The PID controller
The other item of old business has to do with the PID controller. This popular control law for physical systems seems a natural for implementation using the methods of the Rosetta Stone, and several of you have asked me to show you how to build one. I'm going to do just that, but mostly by explaining why you don't need my help.
Any control system works through feedback. You look at some state variable, compare it with its desired (commanded) value, and calculate an error signal. Then command a restoring action proportional to that error.
Where xc is the commanded value, and k is a gain to be determined (call your friendly controls engineer for the value; this is where all those slick techniques get used).
Equation 11 describes a proportional (P) controller. It works for systems that are described by first-order equations of motion or for dynamical systems that have internal damping (think: the float valve in your toilet). It won't work for undamped systems, which include most physical systems. It will keep the error bounded, alright, but it will not settle down to e = 0; it will oscillate.
To prevent this oscillation, we must add damping of our own. Now, damping is a force that gets greater as the speed goes up. To add the damping, we feed back a second force proportional to the rate of change of the error signal.
There's our old friend, the derivative operator, D . If you can measure the derivative directly–by measuring a shaft speed, for example–do so. Otherwise, you'll have to compute De by numerical differentiation. You now have more than enough tools to do this.
The PD controller of Equation 12 works pretty well and probably represents 90% of all control systems on the planet. But it does have one flaw. If there is some external disturbing force Fx , we need an equal and opposite restoring force to balance it. In the steady state (se = 0), we can only get this by allowing a constant offset error e :
We can make e small by making k 1 large, but we can only go so far with that before stability becomes a problem. Even with a large k 1 , the system never corrects to the exact value we commanded. You've surely experienced this effect yourself, when your car's cruise control encounters a hill.
What we really need is a feedback signal that will adjust the correction slightly if the error persists. This means accumulating–integrating –the error. That's the I in PID. In the parlance of the Rosetta Stone, let:
The presence of the I term eliminates the constant offset by adding more and more restoring force as the error persists. Eventually, it will drive the error to zero.
OK, so now I'm ready to show you a fourth- or fifth-order application of the Rosetta Stone solution, right? Wrong, because it's not necessary . Simply put, a little feedback hides a lot of ills. Think of it this way. Imagine yourself driving on the Interstate. A crosswind comes up, pushing your car to the right. You steer left. End of story. You don't need a perfect estimate of the error and its behavior over time. You're watching the road, continually, we hope, and you can adjust your steering input as needed to keep the error under control.
It's the same with a PID controller. Unless things are happening really, really fast, you don't need highly accurate measures of the I and D terms. First order will do.
Here's a digital PID controller that will almost always do the job:
Vector me home
Enough about Rosettas; let's get on with the new topic.
I've been doing math with vectors and matrices for all of my career, which lately is a really long time. The very first bit of code I ever wrote in a higher-order language was a vector add routine, in Fortran.
Most of the work I do entails solving physical problems that involve motion in two or three dimensions, not just one. For every x , there's also a y and z . Such problems respond extremely well to vector solutions, and that's true in both the math and the software realms. Using vectors and matrices not only simplifies the math tremendously; it simplifies the code. What's more, it greatly reduces the chance of programming errors. If I have, in my back pocket, a canned function to add vectors, I can use it like a black box without concern that it's sometimes going to fail. If my code gives the wrong value for the sum, the error could be in many places, but not in that add function.
How much can vector/matrix notation simplify things? See for yourself. You could write the Equation 16.
Or you could write:
What's my vector?
In college, I was told that a vector is a quantity that has both magnitude and direction. The classic example is the wind. For some physical properties, like temperature and pressure, one number–a scalar–is enough. But a meteorologist can't just say, “wind is 15 miles per hour.” That's not enough information. Instead, he might say something like “15 miles per hour, out of the northwest.”
In a two-dimensional world, like a map or the surface of the earth, it takes two data items to define a vector. The nature of those two data items can vary. For the wind example, a magnitude and heading are just fine. For other vectors, other parameters are better.
Imagine you're visiting a big city, whose streets are neatly laid out in square blocks. You ask someone how to get from Point A to Point B.
Your guide says, “Go five blocks on a heading of 53 degrees, 7 minutes, and 48.36 seconds.” He wouldn't be wrong; he's answered your question correctly. It just isn't much help, unless you're a crow. You're much better off if he says, “Go east four blocks and north three.” The notion of measurements on a square grid, as on a map, is called Cartesian coordinates, after the 17th century French mathematician and philosopher Rene Descartes. For the record, I don't think much of Descartes. He was wrong about more things than he was right. His gem of philosophy, Cogito, ergo sum (I think, therefore I am) sounds cool, but it doesn't solve many problems. He further opined that animals are mere automatons, unable to think, feel emotions, or even feel pain. His disciples used to demonstrate their conviction to this notion by nailing dogs to a wall and skinning them alive. Nice guys. Descartes' ideas set back the science of animal behavior a good 200 years.
But when he voiced the notion of orthogonal (right-angled) coordinates on a square grid, he gave us a genuine, gold-plated winner. Most problems in physics or math can be greatly simplified by working with Cartesian vectors instead of scalars. What's more, the math of such vectors is a natural for computers.
Before we go further, I need to explain some fundamental things.
If we want to depict a vector graphically, we draw it as an arrow, as in Figure 1. The direction is important. Going from A to B is different than going from B to A. The arrowhead captures the direction.
If we want to depict a vector mathematically, we need a way to write down both (or all three) of its components. To do this, it's conventional to use a vertical array, like this:
Note that I've given the vector a name that's a single, lower-case, bold-faced character. This, too, is conventional. Uppercase letters are reserved for matrices.
We can divide the trip into separate segments, as our guide suggested. That's the path of the black arrows in Figure 1, and we can write the segments as:
Even crazier, we could follow the path of the green arrows, and write:
Looking at the figure, it's easy to convince yourself that “all roads lead to B.” That is, there are many routes–some as circuitous as you like–to get where you want to go. We only require that all the x-components add up to four, and the y-components, to three.
This strongly suggests that we should define vector addition as a matter of adding all the x-components, and all the y-components, separately. If we do that, we can write:
Now you are starting to get just the tiniest hint of the value of vector notation. Writing one expression describes multiple operations.
Note that there's no law, other than the need for sidewalks, that says that every motion must be either horizontal or vertical (east-west or north-south in the figure). We can just as well describe the motion as the sum of angled vectors, as shown in the red path of Figure 2.Here, let:
Of course, with the red path you are back to the crow problem. But maybe you're Superman. Or, more likely, you've broken down p and q into their own, separate NSEW motions.
A two-dimensional vector works fine for navigating on a map or in Kansas. But in our example, you may need to climb to a certain floor at your destination. For that, you might want to stretch the vector to include a third, z-component. Most vectors in physics are three-dimensional, for the simple reason that we live in a three-dimensional universe. In that case it will take three data items, not two, to define the vector. Hence the Star Trek lingo, “100 parsecs, heading 37 mark 4.” But as in the two-dimensional example, a vector expressed in Cartesian coordinates is usually the better choice.
By now you must surely be convinced that adding vectors means adding the components, element-by-element. It's time to write some code.
This is a nostalgia trip for me. I'm showing the very first code I ever wrote in Listing 1, simply for its historical value. Hope I can still remember it. Listing 2 shows the very same function, converted into C++.
Wow, three whole lines (not counting the curly brace). This is really complex stuff. But does it work? See for yourself. Listing 3 gives a test driver. Your output vector should be:
Some closing remarks
Needless to say, to make effective use of vector math, we need more operations than addition. But not as many as you might imagine. Subtraction involves a clear and trivial modification. There are two multiplication operators for vectors, no division operator, and a handful of miscellaneous operations, so the package might be smaller than you think.
Now, about style. You'll notice that I use a three-argument scheme, the third one being the destination array. Alternatively, one could use a two-address scheme and put the result back in one of the input arrays, say a[ ] . That would give us the equivalent of a += operator. The problem is that often, we want to preserve the value of a[ ] for use elsewhere. If I have only two arguments, I must copy one of them before overwriting it. On the other hand, if I use the three-address scheme, there's nothing keeping me from calling the function with the same array for a[ ] and c[ ] . So I can get either “+” or “+=” with no cost except the overhead of that third parameter.
It probably has not escaped your notice that the vector addition operator cries out for a C++ implementation of operator overloading. If I want to use vadd( ) to add multiple vectors, as in Equation 22, I have do it one pair at a time, like so:
vadd(a, b, a);
vadd(a, c, a);
vadd(a, d, a);
vadd(a, e, a);
For most of my career, I've longed for a language that would let me write an equation like Equation 22 just as it's written in math. With its support of operator overloading, C++ will let me do just that. But it does so at a fairly expensive cost in efficiency. To get the result it must create and delete temporary objects. In doing so, it can thrash the memory manager.
It seems I can have efficiency or simplicity but not both. Over the years I've cogitated on this issue and tried to decide just what is the optimal mix. I finally realized that I don't need to choose. I can provide super-efficient but crude forms like vadd( ) and the elegant but inefficient forms based on operator overloading. Let the user (usually me) decide which form is more appropriate for the problem.
Finally, I want to say something about modularity. In Equations 16 and 17, I showed you how much more concise the vector-matrix notation can be. Yet I regularly see folks writing software in the style of Equation 16. If you ask them why, they're likely to respond that it's a matter of efficiency. In a real-time system, they'll argue, we can't afford the overhead of function calls.
This, in highly technical terms, is pure bull. It was bull even in the days of the old IBM System 360, when a function call required 180s. It must surely be bull when the overhead is currently measured in nanoseconds–soon to be picoseconds. Unlike the old mainframes, modern microprocessors have a built-in call/return mechanism that's as simple as push an address, then jump. How long can that take?
My position is this: if you're so strapped for clock cycles that you can't afford a function call, you've got problems that go way beyond code efficiency.
What's more, inline code implementing Equation 16 may not even be more efficient than function calls. Certainly it will take more bytes. Further, a naive programmer could write the code so that it requires multiple evaluations of the same trig functions. If he does that, his entire argument gets negated. A highly optimizing compiler won't save him, either, because it won't optimize through function calls.
By now, I hope you can see how the new thread is going to go. In future months, I'll be mixing a heady brew of math, code, and philosophy. In the process, I'll be showing you my own library, which I now feel good about; it's as complete and flexible as I need. I think you're going to enjoy the ride.
Jack Crenshaw is a senior software engineer at General Dynamics and the author of Math Toolkit for Real-Time Programming, from CMP Books. He holds a PhD in physics from Auburn University. E-mail him at .
2. “The Alphabet from S to Z, Part II” Embedded Systems Programming , Vol. 7, #9, September 1994, pp. 64-88. (Not available online.)Back