20th Anniversary: Forth data structures - Embedded.com

20th Anniversary: Forth data structures

The following article first appeared in the September 1989 issue of Embedded Systems Programming magazine.

Whenever I hear some lout claim that Forth code is unreadable, I'm reminded of the tourist who, on returning from Paris, marveled at the French: “Their children are so smart! Why, even the youngest already speak French!” To the Forth programmer, mathematical expressions using reverse Polish notation are no odder than reading right to left is to an Arab; stack-based parameter passing is no stranger than case declension is to a speaker of Czech or Polish.

Having said all that, I must confess that Forth code isn't always as readable as it could be. Often what makes Forth seem incomprehensible is not the language itself but a programmer's poorly factored code or poorly conceptualized data structures. The latter is especially true of Forth programs for control applications. It doesn't have to be this way, of course, and to prove it I'll be presenting several examples of data-structure conceptualization.

All the examples discussed here follow the Forth 83-Standard, with particular reference to the “vanilla” Laxen and Perry F83 public-domain Forth system unless otherwise noted. I'll presume the reader has a working knowledge of Forth and knows about the CONSTANT and VARIABLE words as well as the CREATE …DOES> construct. As we'll be using examples from control applications, an occasional PC@ or PC! is thrown in to fetch and store bytes in the Intel 808x I/O space.

Real-world nightmares
A few years back I did some consulting for an automotive equipment engineer who had left the shelter of the corporate world for the perils of private entrepreneurship. He had built a marvelously inexpensive and useful piece of automotive repair equipment, then mastered the trigonometric calculations necessary to achieve the desired results. After teaching himself assembly language and beginning to write the program, he switched to Forth to increase his personal productivity. He finally decided to hire a full-time Forth consultant to complete his masterpiece, which was “beginning to be difficult.” When we began factoring the code, I saw that the program had indeed become difficult; it started out with 10 screens of variable declarations!

2 CONSTANT CELL  16-bit processor: CELLS ( n1 --- n2) CELL * ;   

and a word that can create other words:

: ARRAY ( #cell-entries ---)  CREATE CELLS ALLOT  DOES> ( index --- address)  SWAP CELLS + ;   

We then created the array and its access methods:

4 CONSTANT #FORCES0 CONSTANT LEFT1 CONSTANT RIGHT2 CONSTANT UP3 CONSTANT DOWN#FORCES ARRAY FORCE   

Note that we used symbolic constants in place of numeric literals, even at the level of the number of elements we're going to declare in our array. This practice ensures that future modifications to the program entail simply changing a few constant declarations rather than hunting through the code for a specific occurrence of a number.

We could then address the forces in easy-to-read syntax–LEFT FORCE @ or DOWN FORCE ! –to make the code simpler and its intention clearer. After sampling these forces at various points in a cycle, we had to decide whether to declare multiple arrays or factor the data with a two-dimensional array. As Listing 1 shows, adding a word that creates two-dimensional arrays isn't very difficult.

View the full-size image

With the addition of these arrays, we can get and store data using such phrases as START DOWN FORCE @ and MIDDLE RIGHT FORCE ! . Note that declaring arrays in this fashion takes one-based arguments, but the indices to the arrays are zero-based.

(For connoisseurs of exotica or for those concerned with massive interrelations of data samples, Listing 2 contains one of my favorite words, DIMARRAY . This compact CREATE…DOES> word is used to declare arrays of any number of dimensions with any number of elements in each dimension. For dealing with such multidimensional beasts, an optional run-time index limit checker is provided in the word DIMLIMS .)

View the full image

Trimming the overhead
Astute Forth programmers will realize that the indexing process just described consumes run-time cycles. In cases where the desired indices are known at compile time, this runtime overhead may be abolished without seriously impacting the readability of the code. If at some point in the code we'll store the top stack item to the END UP FORCE , we write the phrase:

... [ END UP FORCE ] LITERAL ! ...   

inside a colon definition to cause the array-address calculation to take place at compile time and be stored in the routine as a numeric literal.

Handling I/O data
I/O handlers can benefit from the use of symbolic constants as much as arrays. The statement 13 ECCO PC! , for example, is hardly as readable as CNT-INIT COUNTER-0 PC! . Symbolic constants are often only the beginning, however. Depending on hardware constraints, your input and output routines may require special data structures.

When mapping an eight-bit port to several bit entities, you may wish to create routines that manipulate given bits without disturbing their neighbors. Many modern microprocessors have memory-mapped I/O along with bit-set and bit-clear operations in their instruction sets. Forth systems frequently have bit-set and bit-clear operations that, on such microprocessors, will be code words taking advantage of the bit instructions.

There is a catch, however, when you read the fine print: those instructions tend to be readwrite- modify cycles and therefore unacceptable for use with peripherals for which a read cycle would constitute a special event or a command (consider, for example, DUART chips such as Motorola's MC68681 or the Signetics 2681).

Let's say we have two ports, A and B, located at 0E000 and 0E001 in a memory-mapped I/O configuration. Each has a zigger on bit D7 and a zagger on bit D6 . Listing 3 shows the resulting code. With the addition of the SET-STATE word, we can turn on the port A zigger without disturbing the adjacent zagger. Interpretively you'd use a phrase like ACTIVE ZIGGER PORT-A SET-STATE , which in a colon definition might look like this:

: CYCLE-ZAGGER-B ( ---)  BLETCH-TEST  IF ACTIVE ELSE INACTIVE THEN  ZAGGER PORT-B SET-STATE ;   

View the full-size image

Of course, if BLETCH-TEST were a polite word that returned a pure, true flag (FFFF ), it would be easy to AND the result directly with the bit values and omit the IF… THEN clause from that definition.

Handling tables
The final common data structure we'll look at is the lookup table. Lookup tables are handy for avoiding time-consuming run-time calculations, such as trigonometric functions, or in cases where the data is provided in table form from a component manufacturer, such as an ohms/temperature table documenting a thermistor. Lookup tables may be used singly, if the interval between indices to lookup values is regular for one side of the equation, or in pairs if no convenient integer relationship can be demonstrated among the indexing values. Creating a table is relatively easy, requiring only a name, CREATE , and a list of data. Accessing the data elegantly is almost as easy.

Listing 4 is an example of a single lookup table. In this particular implementation, we left the original reading on the stack and placed on top of it the index of the first lookup that surpassed the reading. Those two items (in conjunction with future readings) could be used to interpolate intermediate readings. For example, if the data consisted of thermistor readings, the indices represented degrees centigrade from some basepoint, and we didn't need interpolation, our conversion would be simple:

25 CONSTANT BASEPOINT: TEMP ( reading -- temp)  LOOKUP BASEPOINT + NIP ;   

View the full-size image

For some users–lab technicians, for example–interactive tests would be as easy as using phrases like CORE PROBE TEMP .

Bottom-line analysis
The moral of these examples is that Forth supports the creation of entirely user-specified data structures with minimal coding effort. A little time spent factoring your data and code words can work wonders in improving the readability and efficiency of your programs. It can also make your development appreciably faster. And best of all, it often translates into happier customers.

When this was written, Jack Woehr was a programmer at Vesta Technology Inc. in Wheat Ridge, Colo. He was also the International Chapter Coordinator for the Forth Interest Group and a GEnie Forth RoundTable sysop.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.