The matrix reprogrammed - Embedded.com

The matrix reprogrammed

Over the last few issues, we've been talking about the math entity called a matrix. I've given examples of how matrices are useful and how matrix algebra can simplify complicated problems. A messy problem in simultaneous linear equations becomes almost trivial when cast in terms of matrix algebra. And I've shown you that the rules of matrix algebra aren't rules made up by math professors to keep us awake nights, but rather are rules that come naturally from the nature of the problems and their solutions.

Before we can turn the rules of matrix algebra into C++ code, we need to discuss one more topic: how to resolve the mathematical concepts of matrices with the realities of C++ arrays. Before we go there, however, there's one matter of old business I'd like to add.

Pouring over data
When I gave you the rule concerning matrix multiplication, I made no pretense that it was easy or intuitively obvious. On the contrary, I told you that although it seemed crazy, it had to work that way in order to solve problems in linear algebra. I led you through an example of the “two-finger” approach, where you let one finger move along elements of a matrix row and another down a column in the vector it multiplies.

Reader Dean Carpenter wrote to tell me of a mnemonic device that seems to make the matrix-vector product a little more rational. He wrote:

I harken back to my days at school in Sydney, Australia and the calculus classes taught by Mr. Blazey. He had a very simple method:

Take each horizontal row in the matrix and pour it into the vertical vector. Do your multiplies, add them up. The row points to the location in the result vector.

The act of pouring is quite natural, and it's something that has stuck with me all these eons.

I agree wholeheartedly. In Figure 1, I've tried to depict the process. It helps if you think of the numbers in each row of the matrix as floating in some fluid. We can pour the fluid from any row into the vertical container, but the order of elements is preserved.

View the full-size image

For example, pour the first row into the container–we might even call it an accumulator–next to the column of the vector. Now the elements pair up, term by term. Multiply them and add the products. That single number is the value for the first element of the product vector.

Now repeat this process using the second row, to get the second product element, and so on. Very slick.

Give it a try; I think you'll agree that the mnemonic device helps a lot.

Storage issues
At this point, we have enough information concerning the rules of matrix/vector algebra, and its potential uses, to start implementing a proper C++ matrix class. What's left are issues concerning the way we map the mathematical entities called matrices onto the computer science entities called arrays. Sad to say, most of the high-order languages such as C, C++, and Pascal don't give us much help. Because of this, there are serious implementation decisions to be made.

When we were building the vector class, we had many decisions to make, but the way the vector was stored in memory wasn't one of them. Since a vector is essentially a contiguous array of real numbers, and the computer memory is a contiguous array of bytes, the mapping of the elements of a vector onto the C++ array and then to the physical memory is quite straightforward.

It's not as easy when dealing with matrices, because matrices are two-dimensional. Since no one is likely to build a computer with two-dimensional RAM, we must map the two dimensions of the matrix–and the corresponding array–onto the one-dimensional memory.

Let's begin at the beginning. Mathematically, a matrix is a two-dimensional array of real numbers. For example:

(1)

This matrix has two rows and three columns. As a shorthand, we say that A is a 2×3 matrix. By long-standing convention that far precedes the C language, or even computers, we always give the row number first, then the column.

Any element of the array is referenced using a two-index notation. Following the row-column convention, element a 13 refers to the element in row 1, column 3. Note that it's not the thirteenth element of A . If it makes you feel better, you can separate the indices by a comma, but usually there's no ambiguity, so we leave the comma out.

The assignment statement:

(2)

would give A the new value:

(3)

At first glance, it may seem that C++ supports this notation handily. As you know, C/C++ supports two-dimensional arrays–in fact, arrays with any number of dimensions. The declaration:

double A[2][3] ;

(A)

or even:

typedef double[2][3] Matrix ;
Matrix A;

(B)

also defines a 2×3 matrix. The assignment in Equation 2 would translate to:

A[0][2] = 4.56;

(C)

Yes, you read it right. Remember that C/C++ uses zero-based indexing. So, perversely and confusingly, the reference A[0][2] refers to a 13 .

Since a computer's memory is organized as a sequential array of bytes, it's clear that the compiler is going to have to map the two dimensions of the array onto the single dimension of memory storage. Unless you're into Sudoku puzzles or cryptography, there are only four possible ways to perform this mapping:

• Row-wise, with indices starting at 0.

• Row-wise, with indices starting at 1.

• Columnwise, with indices starting at 0.

• Columnwise, with indices starting at 1.

To our eternal shame, we have managed to use–and propagate–all four conventions. This embarrassment is made even more egregious by the fact that the very first high-order language–Fortran, which predates C by nearly two decades–got it right. For reasons known only to themselves, the designers of Pascal and C chose to change the convention. I'll leave it up to them to explain why.

Some folks like to think of matrices as arrays of vectors. Indeed, we've seen in earlier columns that it can be useful to think of matrices in this way. For our example matrix, we might write:

(4)

where v 1 , v 2 , and v 3 are the three column vectors.

Thinking about arrays in this way is certainly encouraged by the “two-bracket” style of the declaration in Code Example A. Writing the declaration as:

double (A[2])[3] ;

(D)

makes it clear that the array is a 3-dimensional array of 2-vectors. (Putting the left paren in front of the “d” would make it even more clear, but the compiler doesn't think much of this.)

Unfortunately, the reality of C/C++ storage order makes the picture hazy again. If the matrix A is truly an array of 2-vectors, I would certainly be justified in assuming that the first two elements of A were the two elements of v 1 . They aren't.

To support this image, we'd expect the data to be stored in the order:

(5)

or:

(6)

This ordering is called column-major, or columnwise, order. The row index, which is the first one, varies most rapidly. Fortran uses this convention, in keeping with its focus on scientific programming. Unfortunately, C, C++, and Pascal don't. They store things in row-major order, where the column index varies first. In C++, the elements of A are stored as:

(7)

This is row-major order–the column index varies most rapidly. Equation 7 may look neater, solely because of my choice of numeric values in the example. But now the elements of the three column vectors are scrambled together. In view of this, it's hard to maintain the notion that the C/C++ array called A is a vector of vectors as the declaration of Code Example D seems to imply.

Variable dimensions
In the end, the issues concerning storage order pale in comparison with the issue of dealing with matrices of arbitrary dimensions. Declarations such as the one in Code Example A might be useful, if we knew in advance that all matrices would have 2 rows and 3 columns. Of course, we don't know that, and any matrix class worthy of the name should support matrices of arbitrary dimensions. Our matrix class is going to need a robust and efficient way to declare matrices of any size.

We've already been down this road, in the development of the vector class. We ended up with a combination of dynamically allocated storage, together with a class variable to keep track of the current array size.

The issues of the matrix class are going to be similar but moreso, thanks to the need to map the two dimensions onto one.

As it turns out, the need for handling arrays with arbitrary dimensions renders the usual C++ mechanisms for handling arrays pretty much useless. To see why requires delving into the ancient history of high-order languages. Bear with me now while we return to those thrilling days of yesteryear.

I've mentioned before that my very first code in a higher-order language was a vector add routine written in Fortran. Functionally, it's the same as the function vAdd( ) in my file SimpleVector.cpp . See the listing on http://www.eetimes.com/design/embedded/source-code/4200228/SimpleVec. For this discussion, the syntax of the original Fortran version is important. I've shown a modernized version in Listing 1.

That function worked just fine for 3-vectors, which was what I needed at the time. But soon, I was asking how I could extend the function to support vectors of any length. The answer might surprise you. The routine is shown in Listing 2.

See the difference? Note that the vectors are now dimensioned for only size 1, even though the true size is arbitrary.

The secret lies in the fact that the routine doesn't have any local storage for the vectors. They're passed parameters, the equivalent of a C++ array or reference variable. Only the reference is passed; the arrays must be allocated somewhere else. The compiler didn't really need to know how large the arrays really were. It only needed to be told that they were arrays. The dimension statement did that. The compiler ignored the size parameter completely.

Following that epiphany, my next question was, “How do I do the same thing for matrices?” The answer: You don't. You have to tell the compiler that it's a vector, not a matrix. Which means that you have to write your own code to index into the array.

Suppose I have defined an M xN matrix–M rows, N columns–and suppose it's stored in column-major order, as Fortran does it. To the code that declares it, it's a matrix. But in computer memory, it's just a set of words in contiguous locations; in other words, a vector of dimension M•N . We can access the i ,j element of this array–i th row and j th column–by the single index:

(8)

That's how we did it. The calling program thinks tat it's passing a two-dimensional matrix–the data array and its dimensions. The subroutine sees only a linear array and uses Equation 8 to index into it.

The obvious problem with using constructs like dimension A(1) : it's a kludge. Although it worked just fine, we were obviously tricking the compiler into doing what we wanted. And for multidimensional arrays, we had to write code like Equation 8 to access them.

In the mid-'60s, the designers of Fortran compilers got the message: people needed the ability to write general-purpose subroutines that could operate on multidimensional arrays of arbitrary size. They extended the language with a formal mechanism to do just that. Listing 3 shows the structure of a typical routine.

Looking at Equation 8, it's clear that the subroutine has all the information to implement the indexing internally. In addition to i and j , it needs only the value of M , which is passed in. Works equally well. The variable-dimension construct is legal only if the parameters used in the dimension statement are in the parameter list.

As a matter of interest, Equation 8 doesn't use the value of N at all, so the statement:

dimension a(m,1), b(m,1), c(m,1)   

works just as well. We can leave n out of the calling list, too. But now we're back to kludge, so the structure of Listing 3 is much to be preferred.

What, no conformal arrays?
At this point, you're probably wondering why I'm going on so, about features of Fortran. Here's why:

C++ doesn't have them!

In modern computer science parlance, arrays with variable dimensions are called conformal arrays . Neither Pascal, C, nor C++ support them.

I won't mince words here: I consider this fact to be a national–no, international –disgrace. We've already seen that the information is there, at least if one imposes the perfectly reasonable restriction that the dimensions must be provided in the calling list. As we've seen, Fortran has been supporting this feature for 50 years. Wouldn't you think that all the best brains in computer science would have been able to figure it out, also?

Why don't they? I can't speak for them, but I suspect it has to do with robustness. With artifices like dimension A(1) , you're leaving the issue of index range violations entirely in the hands of the programmer. If he chooses to declare an array of dimension 5, and then try to access element 42, Fortran will cheerfully try to do it. I'm guessing that someone decided that if we couldn't have total protection against range violations, we shouldn't support the feature at all. But even this argument falls flat. If we pass both the array A , and its dimensions m and n , the compiler has all the information it needs to do strict range checking. Anyhow, are you going to try to tell me that:

*p++ = 12; (E)

is any safer?

For whatever the reason, we can't pass matrix sizes in C++, which is precisely why so many scientific programmers continue to use Fortran. In C++, we're reduced to falling back on the dimension A(1) artifice and writing our own version of Equation 8. Using row-major storage and 0-based indexing, the equivalent code is:

size_t index(const size_t i, const size_t j,    const size_t cols){  return i*cols + j;}   

And we'll hide this little snippet in the matrix class where it doesn't frighten the horses.

I must admit that when I designed the Matrix class, I toyed with the idea of throwing C convention out the window and sticking to the Fortranish, column-major convention. Fortunately, I took a cold shower instead, which brought me back to my senses.

Now we're ready to define the structure of the matrix class. It borrows very heavily on the vector class, up to and including the use of class variables to hold the dimension information. The class declaration is shown in Listing 4. For this issue of Programmer's Toolbox, I'm limiting the member functions to a few housekeeping functions. The file on http://www.eetimes.com/design/embedded/source-code/4200223/MatTestA contains the full code, including the test driver.

Since we must provide our own indexing into the two-dimensional arrays, we neither need, want, nor can use the C++ indexing style:

A[i][j]   

What I'd like to do, instead, is to overload the [ ] operator, as I did in the vector class. I'd like to be able to write A[j,k] to refer to the element in row j +1, column k +1. I'd like to do it, but I can't. The C++ compiler keeps insisting, perversely enough, that operator [ ] should only have one argument. Silly compiler.

Of course, I can still define a function that gives me a reference to the desired element; I just can't use an overloaded [ ] operator for it. Instead, I've chosen to define the function Cell(i, j) . Calling A.Cell(0,0) returns a reference to element (1,1) of A .

On the other hand, the compiler has no problem with my overloading of operator ( ) . So if I want to use the Fortranish, 1-based indexing, I have no problem writing A(i, j) .

As with the vector class, anyone who is offended by this unconventional overloading of operator ( ) is cheerfully invited to delete the function. I'll be doing my best to avoid its use elsewhere in the code, except in the test driver.

Next time, I'll continue to flesh out the matrix class. See you then.

Jack Crenshaw is a systems engineer and the author of Math Toolkit for Real-Time Programming. He holds a PhD in physics from Auburn University. E-mail him at jcrens@earthlink.net. For more information about Jack click here

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.