Making the tough coding decisions - Embedded.com

Making the tough coding decisions

Making the tough coding decisions

When making a library of C++ classes for vector and matrix math, style still puts the art in programming. Jack gives his implementation du jour.

Lately I've been showing you how to create a set of serviceable C++ functions suitable as the basis for a full-up library of C++ classes for vector and matrix arithmetic. I started with some very simple vector functions written in the classical C (some might argue Fortran) style.

One thing has become extremely apparent: a lot of people have a lot of opinions about how to implement the ideas. I can't recall many times when I've received such a flurry of e-mails, most of them telling me that I'm doing it all wrong. All of the opinions carry weight with me, but a few came from people with huge name recognition–people whose opinions I value highly.

Why so many disparate opinions? Partly it's because personal preferences carry weight. Which brand of truck will you choose for your next big, white pickup? What's that you say? You want a black Audi sports sedan?

Sigh.

I've been thinking long and hard about the disparate points of view, and here's my conclusion: opinions vary because the decisions aren't clear cut. As small and simple as the vector functions are, there are many possible ways to do them and many pros and cons to each. What's more, the pros and cons tend to balance each other, leaving no clear winner. Which is perhaps why, after more than four decades of doing this stuff, I still tend to change my mind on alternate Thursdays and full moons. I've mentioned that I've probably written these functions hundreds of times. Why so many? Because each time I write them, I do it a little bit differently.

Speaking of which, it seems that in my last column, I wrote them one time too many. I got them wrong.

In writing and publishing these columns, I and the editorial staff try hard to get them right. Because I use a lot of math equations in this column, it's particularly susceptible to typos. Lots of people look the column over very hard before it gets to you.

In the case of last month's code, however, I'm afraid I got a little too complacent. After all (I reasoned), I've written these functions hundreds of times. They're not exactly the source code to Microsoft Vista; each function has only one to three executable lines of code. How hard can it be? What can go wrong?

Apparently, a lot. Because I didn't exercise the usual due diligence, many errors crept into the code; almost more errors than there were lines of code. Sorry about that. Believe me, I've learned my lesson. The code I'll show you this month is going to be thoroughly tested, as it usually is.

Weighty subject
Back to style decisions. The style one uses to write code depends a lot on the weight one assigns to the different “-ilities.” In the past I've given you my list of attributes that I value. They are, pretty much in order:

  • Correctness (always #1)
  • Ease of use
  • Readability
  • Maintainability
  • Efficiency

You'll note that I almost always place efficiency way down on the list. Yes, it's important, but I try to never obfuscate or complicate the code just to gain efficiency. I've also mentioned, though, that in the case of vector and matrix math, efficiency rises higher on my radar screen than usual. That's because these are low-level routines, which tend to get used way down inside nested loops. Perhaps it's this vague thought that tends to make the choices so difficult, at least for me.

Most regular readers know that it's never my purpose to write code for you, tip the top of your head back, and pour it in. My goal is to show you the idea, give you the pros and cons, and encourage you to make your own decisions and create your own implementations. In this case, as in many others, the best I can do is to show you my implementation (my implementation of the week, that is), and explain the decisions I've made. Your mileage may vary (YMMV).

Let me give you an example. Last time, I showed you the vector add routine, which now looks like this:

// Add two vectors (c = a + b)// Vector c can coincide with a or bvoid vAdd(const double a[ ], const double b[ ], double c[ ], int n){   for(int i=0; i   

(Note the change in spelling.) As I explained, the purpose of the fourth parameter, n , is to allow the vectors to have different lengths. But in most of my scientific work, vectors represent real, physical quantities like position and velocity, so they tend to have length three (or sometimes even two). If every vector in my software has length three, it gets to be a bother always writing:

	vAdd(a, b, c, 3);   

Aside from the bother of the extra parameter, every literal parameter in a calling list is an opportunity to get it wrong. Perhaps, from force of habit dating from First Grade, I write:

	vAdd(a, b, c, d);   

I could spend a lot of time scratching my head over that one. For that reason, I also showed you special functions for the case n = 3.

// Add two 3-vectors (c = a + b)// Vector c can coincide with a or bvoid vAdd(const double a[ ], const double b[ ], double c[ ]){   for(int i=0; i<3; ++i)="" c[i]="a[i]" +="" b[i];="">   

Aside from ease of use, this function has one big advantage: it can be optimized out of existence altogether. You see, any decent compiler will optimize out loops with small, fixed loop counts. In this case, it will replace the loop with:

c[0] = a[0] + b[0];c[1] = a[1] + b[1];c[2] = a[2] + b[2];   

Having done that, it will also recognize that the indices in the three lines are now all constants, so it will optimize these out, as well. Finally, with the keyword inline, you have a much better chance of persuading the compiler to inline the code, sparing you the overhead of the function call altogether (I haven't had much luck getting Microsoft Visual C++ to inline my code using the GUI-based IDE, but I also haven't tried very hard. As usual, YMMV).All of this seems to make a compelling case for providing special functions for n = 3. On the other hand, if I relegate efficiency to its usual place at the end of my list, I can get the same ease of use, plus a file of half the size, simply by using a default parameter, as in:

void vAdd(const double a[ ], const double b[ ], double c[ ], int n = 3)   

With this approach, the compiler will generate the same code for any value of n , but I only have to specify n if it has a value other than three.I'm leaning in this direction this month, fully recognizing that all the optimization advantages I just cited will not be used. Also I'm fully aware that at least one of the gurus who contacted me thinks of this approach as a kludge ranking right up there with goto's. The solution based on this approach does seem to be best for the medium we're using here, where short files are preferred over long ones. I just need you to be fully aware of the tradeoffs, and you certainly won't hurt my feelings if you prefer to stick with the special-case functions like vadd3( ) .

Finishing the file
Enough philosophy; let's get on with wrapping up this set of vector operations. Incredible as it may seem, we're almost done. The operators + , , and the two multiplication operators dot and cross , pretty much exhaust the list of mathematical things one can do using two vectors. There are, however, a few things you can do with one vector. Also some minor little items like input and output. We'll discuss these next.One of the simplest of the missing operations lets us scale a vector, multiplying it by a scalar.

// Multiply a vector by a scalarvoid vScale(double a[ ], double s, int n = 3){   for(int i=0; i   

The cases where s = 0 and s = “1 occur often enough to justify separate functions for them:

// Clear a vectorvoid vClear(double a[ ], double s, int n = 3){   for(int i=0; i   

We will probably need to be able to copy a vector.

// Copy a vector (b = a)void vCopy(const double a[], double b[], int n = 3){   for(int i=0; i   

One last arithmetic operator remains. This is the norm, or absolute value, of a vector. Mathematically, it's defined for a 3-vector by:


(1)Extension to higher dimensions is obvious. If you look at the definition of the dot product, you'll see that the term under the radical is the dot product of a with itself. This is another one of those functions that we can write one of several ways, depending on our choice of tradeoffs. If you want a simple and straightforward implementation (as I usually do), use the dot product function:

// Return the norm of a vectordouble vNorm(const double a[ ], int n = 3){   return sqrt(Dot(a, a, n));}   

Note carefully that if we nest the functions this way, we mustn't forget to pass the size parameter through. Leaving off the n in that last line will give us the 3-vector version of Dot , regardless of the size of a . If this bothers you, breathe easy. What goes on inside vNorm or other nested functions shouldn't be a concern for you.If you draw a vector in three (or more) dimensions, its norm is the same as the scalar length, as measured along the vector itself. It's tempting to name the function vLength( ) , but that usage might cause confusion with the number of elements in the array. In the past, I've also used vAbs( ) or even vAbsoluteValue( ) . But vNorm( ) works for me.I should note that, like other vector operators, the norm tends to get used a lot. Because of this, you may decide that calling another function from inside vNorm( ) is a bit much. If that's the case, it's easy enough to implement the dot product inside.

// A more efficient version of vNorm// Return the norm of a vectordouble vNorm(const double a[ ], int n = 3){	double sum = 0;	for(int i=0; i   

When doing vector math, we often find the need to convert a vector to a unit vector. You unitize a vector by dividing each element by the norm of the vector. Mathematically:


(2)Whenever the topic of unitizing a vector comes up, the obvious question arises: what happens if the vector has a norm of zero? Dividing by zero is not usually such a good idea. I've seen many implementations of vNorm( ) , with a lot of different approaches for handling the zero case. Some implementations ignore the issue completely, which means that they crash if given a zero vector. Others bail if the norm is zero, leaving the vector unchanged. Still others zero out the whole vector, operating upon the dubious theory that:

(3)Both of these latter two solutions violate the “contract” between the function and the user. The function promises to return a vector of unit length but doesn't keep its promise for this special case. The problem is not that we can't deliver a unit vector; it's that we don't know which direction to aim it. For non-zero norms, the direction is well defined: the unitized vector should be parallel with the original vector. But if the input vector has zero length, in which direction does it point?After over 40 years of dinking around with this issue, I finally came up with a simple solution–very obvious once you see it. Since the direction of the input vector is undefined, the direction of the unit vector doesn't matter. One direction is as good as another. So, in this special case, I simply return a vector with a 1 in the first element. Here's the code:

// Convert a vector to a unit vector// (a = a/|a|)// If a = 0, the vector returned will have // a one in the first element.void vUnitize(double a[ ], int n = 3){   double magnitude = vNorm(a, n);   if(magnitude != 0)      vScale(a, 1/magnitude, n);   else   {      vClear(a, n);      a[0] = 1;   }}   

Wrapping up
We're almost done with the file for vector operations in the Fortran/C tradition. I only have a few more functions to add. These include input/output routines and handy functions for comparison tests such as equality, inequality, and less than.

To save space in this column, I've only shown you code for those functions we've discussed. You can find the full listing on our web site, www.embedded.com/code.

Jack Crenshaw is a senior software engineer at General Dynamics and the author of Math Toolkit for Real-Time Programming. He holds a PhD in physics from Auburn University. E-mail him at . For more information about Jack

Reader Response


C/C++ Programmers like zero offset vectors [0..n], but if you do significant linear algebra you will find [1..n] is more convenient and consistent with convention. Check out Numerical Recipies in C (www.nr.com) for a nice discussion of this issue. Also, I find giving the programmer the choice between functional and destructive versions of vector and matrix functions is a good idea.

-Andy Skypeck
Software Engineer
Boston Scientific
Andover, MN


Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.