Reusing code and inverting matrices

This month, as promised, I'm continuing to flesh out the C++ matrix class. But first, I want to talk about the value of reusable code and how it's supported by Microsoft's Visual C++ Express compiler. Regular readers may remember how highly I praised this compiler for its ease of use and all its neat features.

Since then, I've had a few problems. This is by no means to suggest that I'm reverting to my former, anti-Microsoft stance. Although I've been doing C++ for a long time, I'm a total newbie to this compiler and perfectly willing to accept that my problems are the result of pilot error. However, having loaded up with books on the order of Microsoft Visual C++ for Dummies, and having contacted folks in the developers' community and Internet forums without much success, I'm still having the problems. In the future, I hope to establish a dialog with folks at Microsoft who can set me straight. In the meantime, I'd like to share my experiences with you. See what you think.

One of the big selling points of object-oriented design and development has always been that it encourages software reuse. If we design our C++ classes carefully, we should be able to generate software packages that can be used again and again, in new projects, without modifications. I stress that last phrase, because you're not really reusing software if you have to modify it for the next application. My incentive to develop classes for vector and matrix operations comes directly from this notion of reuse.

I can't resist reporting that I had my own version of software reuse going on in the 1960s and 1970s. The “libraries” for my software were two sections of my desk drawer: one containing the source code (on IBM punched cards) for reusable functions; the other, the object decks. Reusing those functions was as easy as lifting the object decks out of the drawer. The incentive to not recode was strong.

I should also note in passing that none of the software methodologies, from top-down design to the latest UML-based methods, is very effective at defining candidates for reuse. The reason is simple: you don't get those candidates from any single project-development effort. What you get from such an effort is the specification for the software needed for that particular project. As a ridiculous example, you might find several places in a given project where a vector addition is needed, but none where vector subtraction is. A strict application of formal methods would never lead to a vector subtraction function.

By definition, reusable software is software that's used on multiple projects. You identify candidates for reuse by asking “will I ever need to use this function again?” In the end, the answer comes much more from the experience and foresight of the human designer than from any formal method.

So how does this discussion of software reuse relate to Visual C++ Express? When I was developing the software for the vector class, I created a project that included the files Vectors and SimpleVec (.cpp and .h). These files grew and evolved as the project went forward. At first, all these files were in the main project directory, but eventually I created a library folder for them, which is intended to become the modernized version of my desk drawer.

When I started the development of the matrix class, I created a new project, which included the same files (you can see where this is going, right?). As this project evolved, I found that I needed to make changes to the vector-related files. That worked–no problem.

Then I went back to run the vector project again. I discovered, to my horror, that it was now broken. VC++ Express was telling me that the files were corrupted, so it wouldn't even open the project. If it had opened, perhaps I could have repaired the damage somehow; as it was, I couldn't.

What's up with that? What good is software reuse if you can't actually reuse the software?

I can't tell you what happened–I need a Microsoft guru for that–but I do have a theory. I gather that VC++ Express retains info on the files in a given project, which it uses to rebuild the project as needed. When I made changes to some file in the second project, the date and size info on that file no longer matched what the first project had remembered. So VC++ concluded that the project was broken.

Many Windows applications–Visual SlickEdit and the Matlab editor, for example–easily detect files that have been changed and politely ask you if you want to revert to an older version. Apparently, VC++ Express does not.

I can't say that I've done an exhaustive study of this issue–yet–but I did talk to a few folks in the Microsoft developers' community, from both the online help forums and newsgroups. Without exception, the folks who responded all said that they'd never tried to use files in the way I had. Whenever they “reuse” files, they make new copies in the new project folder.

That path leads to disaster, because you end up with multiple copies of a given file–copies that can be subtly, or not so subtly, different. Sorry, folks, but that ain't reuse.

Of course, you and I both know what's needed here: a good version control system. I guess that, in the warm glow of my delight with VC++ Express, I had put too much faith in its ability to manage software that is shared between projects.

Expect to hear more about this issue. I hope to find some guru, in Microsoft or not, who can tell me how best to handle the problem. In the meantime, just know that it is–or seems to be–a problem.

Back to the Matrix class
Now let's get back to software. As promised, I'm adding a new file, called SimpleMat.cpp, to the project (the code will be available at http://www.eetimes.com/design/embedded/source-code/4200225/SimpleMat-cpp). As the name suggests, this file contains the same kind of low-level functions that are in SimpleVec.cpp (http://www.eetimes.com/design/embedded/source-code/4200228/SimpleVec). In the last column (June 2008, p.14), I finessed the need for these functions by tricking the SimpleVec functions into doing the work. That trick still works, but since the functions involved are so tiny, we lose almost nothing by following the more straightforward and transparent approach. Furthermore, SimpleMat.cpp contains a few new and key functions specific to matrix operations.

The first two functions in SimpleMat.cpp are the console I/O functions mGet and mPut. The mGet function is essentially identical to vGet–the only difference is the range of the for-loop. The mPut function is a little more complicated, so that it will output the matrix row by row. I note again that these functions are rather crude, doing the least amount of work that's practical. For specific applications, you might want to elaborate on them.
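In spirit, they look something like the sketch below. I'm assuming matrices stored as flat, row-major arrays of double, with the dimensions passed separately; the actual code in SimpleMat.cpp may differ in its signatures and details.

```cpp
#include <iostream>
using std::cin; using std::cout; using std::endl;

// Read n*m elements from the console, row by row.
void mGet(double a[], int n, int m){
    for(int i = 0; i < n*m; i++)
        cin >> a[i];
}

// Write the matrix row by row, so it looks like a matrix on the screen.
void mPut(const double a[], int n, int m){
    for(int i = 0; i < n; i++){
        for(int j = 0; j < m; j++)
            cout << a[i*m + j] << '\t';
        cout << endl;
    }
}
```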

At first glance, these new routines seem to be identical in function to the higher-level stream I/O routines I showed last time. In fact, it's tempting to just call the low-level functions from the higher-level ones. But be careful: the functions in SimpleMat.cpp are strictly for console I/O. I use them mainly just for testing. The functions in Matrices.cpp can read from and write to any compatible stream object.

The next few functions in SimpleMat are virtually identical to their counterparts in SimpleVec. Because they perform the same operation on every element of the matrix, only the range of the loop counter changes.
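To make that concrete, here's what one of the element-wise functions might look like, under the same storage assumptions as before. This is a sketch, not necessarily the code as it appears in the file:

```cpp
// Element-wise operations need only one loop over all n*m elements.
// mSub is identical, with '-' in place of '+'.
void mAdd(double c[], const double a[], const double b[], int n, int m){
    for(int i = 0; i < n*m; i++)
        c[i] = a[i] + b[i];
}
```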

One exception relates to the vector function vSet. You might remember that this function sets every element of a vector to the same scalar value s. To be honest, even this vector function has limited value, except when s is 0 or 1. For a matrix, it has no value at all.

On the other hand, the notion of a scalar matrix is fundamental. A scalar matrix has the form:

$$\begin{bmatrix} s & 0 & \cdots & 0 \\ 0 & s & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & s \end{bmatrix} \qquad (1)$$

In SimpleMat, I'm providing two such functions. Function mSet(s) builds the scalar matrix with s's down the diagonal. Function mUnit performs the same function, with s equal to 1. Note that all scalar matrices are square, so only one size parameter is needed.
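A minimal sketch of the pair, assuming the same flat-array storage as before:

```cpp
// Build the n-by-n scalar matrix: s down the diagonal, zeros elsewhere.
void mSet(double a[], double s, int n){
    for(int i = 0; i < n*n; i++)
        a[i] = 0.0;             // clear everything...
    for(int i = 0; i < n; i++)
        a[i*n + i] = s;         // ...then fill the diagonal
}

// The unit matrix is just the scalar matrix with s = 1.
void mUnit(double a[], int n){
    mSet(a, 1.0, n);
}
```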

The functions mAdd and mSub are logical extensions of their vector counterparts. Not so with the next two functions, which have no counterparts in the vector functions.

We've talked about the matrix product before. Remember the “pour row into column” mnemonic that I gave last time? I won't bother to rehash the algorithm here. Function mMult takes two matrices as arguments and yields a matrix result. Note carefully that in this case, the output argument cannot be either of the input arguments. That output argument gets built one element at a time, and that process would corrupt the input matrices.

When you multiply matrices, the dimensions are critically important. If you think back to the “pour rows into columns” mnemonic, you'll see that the number of columns in the first argument matrix must match the number of rows in the second. If the first matrix is, say, m×k, the second matrix must be k by something. For an easy mnemonic, write:

$$(m \times k)\,(k \times n) \Rightarrow (m \times n) \qquad (2)$$

In a sense, the two middle dimensions “cancel” each other. In general, whenever we multiply matrices, we must take care that the dimensions match up. In a way, the function mMult “enforces” this rule, by simply assuming it. In the spirit of the Fortranish, low-level functions of SimpleVec and SimpleMat, no error checking is done. If you violate the rule, mMult will happily give you a product you didn't expect. It can also overwrite your code. Caveat emptor.
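The textbook triple loop captures the idea. Again, this is a sketch under my flat, row-major storage assumption, not necessarily the exact code from SimpleMat.cpp:

```cpp
// c = a*b, where a is n-by-k and b is k-by-m. c is built one element at
// a time -- which is exactly why c must not alias either input. And in
// the SimpleMat spirit, no dimension checking: the caller must get it right.
void mMult(double c[], const double a[], const double b[],
           int n, int k, int m){
    for(int i = 0; i < n; i++){
        for(int j = 0; j < m; j++){
            double sum = 0.0;
            for(int p = 0; p < k; p++)
                sum += a[i*k + p] * b[p*m + j];  // pour row i into column j
            c[i*m + j] = sum;
        }
    }
}
```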

The next function, mInv, is even more interesting. The mathematical entity called vector has no division operator, and the matrix entity doesn't, either. But it has the next best thing: an inversion operator. Consider the nature of the inverse operation for scalars. By definition, $x^{-1}$ has the property that:

$$x\,x^{-1} = x^{-1}x = 1 \qquad (3)$$

For this scalar case, the inverse is clearly:

$$x^{-1} = \frac{1}{x} \qquad (4)$$

From this definition, we can write things like:

$$\frac{y}{x} = x^{-1}y = y\,x^{-1} \qquad (5)$$

Like the scalar inverse, the inverse of a matrix has the property:

$$X\,X^{-1} = X^{-1}X = I \qquad (6)$$

where I is the unit matrix of appropriate size.

That term “appropriate size” is important. Now, take another look at Equation 6. The unit matrix is, by definition, square. What's more, the operations $XX^{-1}$ and $X^{-1}X$ only make sense if the inner indices are equal. That is, the matrix X must also be square. Otherwise it cannot have an inverse.

The rule of equal dimensions also limits how we can use the inverse. While the inverse matrix must be square, the matrix it multiplies need not be. Unlike the case in Equation 3, we can only multiply the inverse in the way that makes sense, dimensionally. Consider the classic matrix form of the linear algebra problem:

$$A\,x = y \qquad (7)$$

We know that A is square (else the system has no unique solution). Let's say that it's n×n. Then x and y must both be column vectors, dimensioned n×1. In that case, the product $yA^{-1}$ makes no sense at all–the dimensions don't line up. The solution only makes sense if we write:

$$x = A^{-1}y \qquad (8)$$

Mathematically, matrix multiplication is not commutative. You can't just switch the order of multiplication, as you can with scalars.

But wait, there's more. Eagle eyes will have seen a gotcha in Equation 3. If the value of x is zero, its inverse doesn't exist–or, rather, is a mathematical infinity. The same is true of the matrix inverse, only more so. Surely the inverse of a matrix like:

$$\mathbf{0} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad (9)$$

cannot exist; there is no matrix we can multiply 0 by to get the unit matrix I. But for matrices, 0 is not by any means the only matrix whose inverse doesn't exist.

But if a normal-looking matrix can fail to have an inverse, how can we tell? What's the rule by which we identify one? To see the answer, you need only think back to the kinds of linear algebra problems that led us to matrices in the first place. If I have some set of linear equations like:

$$\begin{aligned} a_1 x + b_1 y &= c_1 \\ a_2 x + b_2 y &= c_2 \end{aligned} \qquad (10)$$

I can only solve for x and y if the two equations are linearly independent. That is, they can't just be scaled versions of each other. When, in high school, we used to say “n equations in n unknowns,” what we really meant was “n independent equations . . . .”

So how can we be sure that the equations are really independent? The answer turns out to be: the determinant of the coefficient matrix is non-zero. A matrix whose determinant is zero is called singular, and a singular matrix has no inverse.

And how, you ask, can I calculate the determinant? Ah, that's the question, and it's one I won't answer directly, since the definition is horrible. The determinants of small matrices are, however, easy to write down:

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc, \qquad \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg) \qquad (11)$$

Trust me, you do not want to see the formulas for higher-order determinants. But as you might surmise, there exist recursive algorithms to compute them from lower-order determinants. Writing the computer code is easier than writing the math equation.
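For the curious, here's a sketch of that recursive idea (cofactor expansion along the first row). It's for illustration only: the cost grows factorially with n, and mInv gets the determinant far more cheaply, as a by-product of the elimination. The fixed scratch buffer and flat storage are my own simplifications.

```cpp
// Determinant by cofactor expansion along the first row. Fine for small
// matrices; hopeless for large ones (the work grows like n!).
double det(const double a[], int n){
    if(n == 1) return a[0];
    double minorBuf[36];                // scratch minor; assumes n <= 7 here
    double sum = 0.0, sign = 1.0;
    for(int j = 0; j < n; j++){
        int k = 0;                      // build the minor: drop row 0, col j
        for(int i = 1; i < n; i++)
            for(int c = 0; c < n; c++)
                if(c != j) minorBuf[k++] = a[i*n + c];
        sum += sign * a[j] * det(minorBuf, n - 1);
        sign = -sign;
    }
    return sum;
}
```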

The important thing about determinants and singular matrices is that, just as the CPU catches an attempt to divide by zero, we must catch an attempt to invert a singular matrix. In practice, this means we must calculate the determinant as we calculate the inverse. If it ever becomes zero, we must call a halt and report failure.

When a CPU detects an attempt to divide by zero, it issues a hardware exception. In a perfect world, we should be able to throw a software exception for the singular matrix, as well. However, the C++ exception mechanism is complicated and usually not allowed in embedded software. We also can't just send an error message to the console. In general, embedded systems don't have consoles. So instead, I'm returning a flag of sorts: the determinant value. If it's zero, the inversion failed.
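In use, the convention looks something like this. The signature of mInv here (invert in place, return the determinant) is my shorthand for the idea; check SimpleMat.cpp for the real one.

```cpp
double mInv(double a[], int n);    // invert a in place; return determinant
                                   // (assumed signature, for illustration)

bool invertOrFlag(double a[], int n){
    double d = mInv(a, n);         // the determinant doubles as a status
    if(d == 0.0)
        return false;              // singular: no exception, no console
    return true;                   // a now holds its own inverse
}
```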

mInv, the matrix inversion algorithm I'm providing in SimpleMat.cpp, is an oldie but goodie. I cribbed it from the (uncopyrighted) IBM Scientific Subroutine Package, ca. 1960. I've discussed this algorithm at length in earlier issues of Embedded Systems Design, so I won't go into it in detail–perhaps later. The algorithm uses Gauss-Jordan elimination with full pivoting. This makes it something of an oddity in the world of numerical analysis.

About pivoting: as the Gauss-Jordan process unfolds, the key elements turn out to be those down the diagonal of the matrix. We end up dividing by those elements, so needless to say, zeros on the diagonal are bad. In fact, near-zero values are also bad. To get the best numerical accuracy, we'd prefer to have large (positive or negative) elements there. The term “pivoting” refers to a process of seeking out the larger elements and moving them to the diagonal, by row, and perhaps column, transformations.

I have seen, in embedded systems, homebrewed matrix inversion software that did no pivoting at all. This is both a terrible idea and a perfect example of what can happen when embedded programmers go bad. It's perfectly possible for a matrix to have a zero on the diagonal but still be non-singular. By performing no pivoting, the programmers risk having an algorithm fail when it doesn't need to.

Most implementations of the Gauss-Jordan algorithm use partial pivoting: they seek out the largest element, but only in a given column. Full pivoting looks in the remaining elements of both rows and columns, thereby assuring the largest possible element at each step. Full pivoting takes a little longer. When you're moving elements to the diagonal, you're shuffling the matrix, much like a Rubik's cube. As you do so, you must record each shuffle, so that at the end of the process, you can put it back into order. Because full pivoting uses both row and column transformations, the shuffle/unshuffle process takes longer and uses a bit more memory.
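The pivot search itself is simple. The sketch below shows the full-pivot version at elimination step k; the bookkeeping for the swaps, and the unshuffling at the end, are what take the extra time and memory. As before, the names and storage layout are mine, for illustration.

```cpp
#include <cmath>

// Full pivoting at step k: scan the entire remaining submatrix for the
// element of largest magnitude. Partial pivoting would scan only column k.
void findPivot(const double a[], int n, int k, int &pivRow, int &pivCol){
    double big = 0.0;
    pivRow = pivCol = k;
    for(int i = k; i < n; i++){
        for(int j = k; j < n; j++){
            double t = std::fabs(a[i*n + j]);
            if(t > big){ big = t; pivRow = i; pivCol = j; }
        }
    }
}
```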

The upside is, you can get larger elements down that critical diagonal, and therefore the highest accuracy possible. The IBM programmers thought this was worth the extra work, and I agree. Which is why I've been using the algorithm for lo these 40+ years.

I should mention that Gauss-Jordan is not the only way to invert a matrix. More modern methods involve factoring the matrix into two parts. The LU decomposition, for example, involves factoring the matrix into two triangular matrices. This process gives a little better performance if you're really seeking the matrix inverse, as you do to solve Equation 7 via Equation 8. However, if you have to solve many such linear equations, involving the same matrix A, LU decomposition is much faster.
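To show the flavor of it, here's a bare-bones LU sketch: factor once, then solve for as many right-hand sides as you like by cheap substitution. No pivoting here, purely for brevity; as I said above, real code must pivot. The names and layout are mine, for illustration.

```cpp
// Factor a (n-by-n, row-major) in place into L and U (Doolittle form:
// unit lower triangle L stored below the diagonal, U on and above it).
bool luFactor(double a[], int n){
    for(int k = 0; k < n; k++){
        if(a[k*n + k] == 0.0) return false;   // a real version pivots here
        for(int i = k+1; i < n; i++){
            a[i*n + k] /= a[k*n + k];         // multiplier, stored in L part
            for(int j = k+1; j < n; j++)
                a[i*n + j] -= a[i*n + k] * a[k*n + j];
        }
    }
    return true;
}

// Solve A*x = y using the factored matrix: one cheap pass of forward
// substitution (L*z = y), one of back substitution (U*x = z).
void luSolve(const double a[], int n, const double y[], double x[]){
    for(int i = 0; i < n; i++){
        x[i] = y[i];
        for(int j = 0; j < i; j++) x[i] -= a[i*n + j] * x[j];
    }
    for(int i = n-1; i >= 0; i--){
        for(int j = i+1; j < n; j++) x[i] -= a[i*n + j] * x[j];
        x[i] /= a[i*n + i];
    }
}
```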

Several other decomposition algorithms exist. Each has its advantages. However, if you want the matrix inverse explicitly, and your matrix is relatively small (say less than 20×20), you'll find mInv hard to beat.

Still to come
With the functions in SimpleMat.cpp, we have most of the matrix operations we may ever need. Not all, however. The world of matrix algebra is particularly rich in mathematical operations. One that comes to mind is matrix diagonalization, which I'll be adding later. There are other operations that we may never add. Many special classes of matrices exist, including diagonal, triangular, tridiagonal, and symmetric, to name a few. Each of these classes has its own set of specialized operations, and many numerical systems directed at matrix algebra include a host of such operations. For my matrix class, I aim to keep my focus on the general form of the matrix and let the operations on that class suffice.

In this issue, I've concentrated on the new file, SimpleMat.cpp. Next time, we'll get back to the C++ class called Matrix. I think you'll find that, armed with the functions in SimpleMat.cpp (available on www.embedded.com/code/2008code.htm), additions to that class will go smoothly. See you then.

Jack Crenshaw is a systems engineer and the author of Math Toolkit for Real-Time Programming. He holds a PhD in physics from Auburn University. E-mail him at jcrens@earthlink.net.
