Picking the right code preprocessor for your embedded application - Embedded.com

Important in almost all embedded applications, and absolutely crucial for code written for SoC designs and for programs that must be delivered over slow networks, is the ability to produce code that is concise, small, and fast, with minimal start-up time.

This requires that conceptually constant data be optimized and that any internal preprocessing of them (pre-computing, setup) be moved from runtime to compile time. Unfortunately, these requirements make software maintenance more involved and more susceptible to human error.

To optimize and “constant-ize” data and to automate the maintenance, an external tool is required, as programming languages are not expressive enough. A good preprocessor utility may be the tool of choice, and it can provide additional perks.

Example: displaying status messages on the LCD
As a very simple yet realistic illustration, consider the task of displaying pre-defined “status” messages of up to twenty characters on the LCD according to some status bit array.

A message is displayed for, say, 5 seconds, provided that the corresponding bit is set, and is then replaced by the next message (modulo the number of status bits) whose status bit is set.
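
The rotation rule above can be sketched in C as follows. This is only an illustration: the 32-bit width of the status array and the name `next_status` are assumptions, and the 5-second timing is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_BITS 32u  /* assumed width of the status bit array */

/* Returns the number of the next set bit after `current`, wrapping
 * around modulo STATUS_BITS, or -1 if no status bit is set at all.
 * If `current` is the only set bit, it is returned again. */
static int next_status(uint32_t status, unsigned current)
{
    assert(current < STATUS_BITS);
    for (unsigned step = 1; step <= STATUS_BITS; step++) {
        unsigned bit = (current + step) % STATUS_BITS;
        if (status & (UINT32_C(1) << bit))
            return (int)bit;
    }
    return -1;
}
```

A display task would call `next_status` once per display period and show the message for the returned bit number, if any.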

A “naïve” implementation in, say, C would probably define a const array of pointers to const strings. The ordinal number of the status bit would also be the index into the array of pointers, so that the corresponding message string can be accessed. There are two problems with this idea.

The first is memory consumption. Assuming four-byte pointers, we have five bytes of overhead per string (the pointer plus the terminating null character). For a twenty-byte payload, that is 25%.

If most messages are shorter than 20 characters, the overhead is even higher. For example, if the average message length is 10 characters, the overhead is 50%. That is not counting any data alignment overhead, nor the dummy pointers that correspond to undefined status bits.

The second problem is maintainability. If, for some reason, the bit that corresponds to some status has its number changed from 15 to 4, the array of pointers has to be modified accordingly. If a previously undefined bit gets defined, or a previously defined bit is dropped, then again the array of pointers needs to be updated.
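
For reference, the naïve table described above might look like the sketch below. The names `status_msg` and `naive_message`, the 32-bit array width, and the choice of bits 3 and 8 are assumptions made for illustration; the two message texts are the ones used as examples in this article.

```c
#include <assert.h>
#include <stddef.h>

#define STATUS_BITS 32u  /* assumed width of the status bit array */

/* One pointer per status bit, indexed by bit number. Every bit without
 * a message still costs a (NULL) pointer, and every message carries its
 * own terminating null character. */
static const char *const status_msg[STATUS_BITS] = {
    [3] = "my fault",
    [8] = "mea culpa",
    /* all other entries are implicitly NULL */
};

/* Returns the message for a status bit, or NULL if none is defined. */
static const char *naive_message(unsigned bit)
{
    assert(bit < STATUS_BITS);
    return status_msg[bit];
}
```

Renumbering a status bit means moving its initializer by hand, which is exactly the maintenance burden described above.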

Data optimization and growing maintainability problems
To address the first problem, we can choose a different data structure.

Let's have a (large) string comprising all message strings concatenated together (and even without the terminating nulls).

Let's further have an array of indices into this large string such that the nth element of the array is the index of the beginning of the nth message string. The last (extra) index in the array is the length of our large string. The length of message n is the difference between indices n+1 and n, so no information is lost.

Typically, two bytes are enough to hold an index. Since the large string has no terminating nulls, the overhead of this data structure is 10% (down from 25%) or, with average-length counting, 20% (down from 50%), plus a fixed two-byte expense for the last index.
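
A hand-written version of this packed layout could look like the following sketch. All names are hypothetical, and the array is limited to bits 0..9 to keep the listing short. Note that the C string literal still adds a single terminating null for the whole string, which is a fixed one-byte cost rather than a per-message one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_BITS 10u  /* assumed: bits 0..9 only, for brevity */

/* All messages concatenated, with no separators between them. */
static const char packed_text[] = "my fault" "mea culpa";

/* packed_index[n] is where message n starts; an undefined bit simply
 * has the same index as its successor, i.e. a zero-length message. */
static const uint16_t packed_index[STATUS_BITS + 1] = {
    0, 0, 0,      /* bits 0..2: undefined            */
    0,            /* bit 3: "my fault"               */
    8, 8, 8, 8,   /* bits 4..7: undefined            */
    8,            /* bit 8: "mea culpa"              */
    17,           /* bit 9: undefined                */
    17            /* extra entry: total text length  */
};

/* Returns a pointer to message `bit` (not null-terminated) and stores
 * its length in *len; the length is 0 for an undefined bit. */
static const char *packed_message(unsigned bit, size_t *len)
{
    assert(bit < STATUS_BITS && len != NULL);
    *len = (size_t)(packed_index[bit + 1] - packed_index[bit]);
    return packed_text + packed_index[bit];
}
```

Every number in `packed_index` depends on the lengths of all preceding messages, so a one-character change to any message forces edits throughout the array.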

Great. However, on the maintainability front things just got much worse: maintaining the array of indices by hand is very error-prone. Even if it were not, it would still be yet another thing to maintain, thank you very much.

Preprocessor to the rescue
A zero-maintenance solution to both the original and the newly created maintainability problem is to define a status bit number and the corresponding text message in a single statement, like so:

BeginStatus
DefineStatus(3, "my fault")
DefineStatus(8, "mea culpa")
…
EndStatus

We want these statements to execute at compile time and produce the source code with the bit array definition and the constant data structure we invented previously.

To achieve this goal, we are willing to do some extra work (once!) and describe, to some conversion tool, how to execute those statements and produce the C source snippets that we want. Guess what? We are talking about a preprocessor and about writing macros for it.

To reiterate: we naturally identified a need for a preprocessor in our effort to reduce (to zero if possible) error-prone maintenance work, especially in cases of optimized data structures.

A programming language may already have a built-in preprocessor of its own, as is the case with C and C++. If such a preprocessor exists and is expressive enough for the task, that's wonderful. Otherwise, we've got to use an external preprocessor.
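
To give a feel for how far C's built-in preprocessor can be pushed, here is a sketch of the status example using the common "X-macro" idiom. It is not the macro set this article has in mind: `STATUS_LIST`, the `xm_` names, and the (bit, offset) table layout are all assumptions. Note also that the result is a table searched by bit number, not the bit-indexed array of indices described earlier; producing that array, with its gaps for undefined bits, is considerably harder with plain C macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Single point of maintenance: bit number and text on one line. */
#define STATUS_LIST(X)   \
    X(3, "my fault")     \
    X(8, "mea culpa")

/* Pass 1: one char-array member per message, sized without the null,
 * so that offsetof() yields each message's start in the packed text. */
#define AS_MEMBER(bit, text) char m##bit[sizeof(text) - 1];
struct xm_layout { STATUS_LIST(AS_MEMBER) };

/* Pass 2: adjacent string literals are concatenated by the compiler. */
#define AS_TEXT(bit, text) text
static const char xm_text[] = STATUS_LIST(AS_TEXT);

/* Pass 3: (bit, offset) pairs plus a sentinel holding the total length. */
struct xm_entry { uint8_t bit; uint16_t offset; };
#define AS_ENTRY(bit, text) \
    { bit, (uint16_t)offsetof(struct xm_layout, m##bit) },
static const struct xm_entry xm_table[] = {
    STATUS_LIST(AS_ENTRY)
    { 0xFF, sizeof(xm_text) - 1 }
};

/* The offsets are only valid if the layout struct has no padding. */
_Static_assert(sizeof(struct xm_layout) == sizeof(xm_text) - 1,
               "unexpected padding in xm_layout");

/* Returns message `bit` (not null-terminated) and its length,
 * or NULL and length 0 if the bit has no message. */
static const char *xm_message(unsigned bit, size_t *len)
{
    size_t count = sizeof(xm_table) / sizeof(xm_table[0]) - 1;
    assert(len != NULL);
    for (size_t i = 0; i < count; i++) {
        if (xm_table[i].bit == bit) {
            *len = (size_t)(xm_table[i + 1].offset - xm_table[i].offset);
            return xm_text + xm_table[i].offset;
        }
    }
    *len = 0;
    return NULL;
}
```

The three passes over `STATUS_LIST` stand in for the re-scan capability a more expressive preprocessor would offer directly, and the lookup pays for the simpler macros with a linear search at runtime.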

Some tasks for the preprocessor to do
Here are some of the tasks where a good preprocessor can be of great help:

Tabulated functions. A hard-to-compute function can be tabulated for faster performance. Tabulating at compile time removes the table-generating code from the final build. Additionally, the resulting table resides in ROM, which saves precious RAM and, in some applications, the need to test its integrity.
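
As a rough illustration of moving tabulation to build time, the sketch below is a host-side helper that computes a table and writes it out as C source. CRC-8 with polynomial 0x07 is used here only as a stand-in for a "hard-to-compute" function; the function choice and all names are assumptions, and a suitable preprocessor would do the same job from within the source file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One table entry: CRC-8 (polynomial 0x07, MSB first) of a single byte. */
static uint8_t crc8_entry(uint8_t byte)
{
    uint8_t crc = byte;
    for (int i = 0; i < 8; i++)
        crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                           : (uint8_t)(crc << 1);
    return crc;
}

/* Writes the 256-entry table as a C definition, eight values per line.
 * This runs on the build host; only its output reaches the target. */
static void emit_crc8_table(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "const unsigned char crc8_table[256] = {\n");
    for (int i = 0; i < 256; i++)
        fprintf(out, "0x%02X,%s", (unsigned)crc8_entry((uint8_t)i),
                (i % 8 == 7) ? "\n" : " ");
    fprintf(out, "};\n");
}
```

The target build then contains only the constant `crc8_table`, which the linker can place in ROM, and none of the bit-twiddling that produced it.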

Preprocessed data. More generally, any data set may call for a processing algorithm that requires one-time preprocessing of the data set. Examples include lookup tables, perfect hashes, dictionary trees of all sorts, etc.

When the data set is constant for the project, so is its associated preprocessed (derived) data. In this case, the derived data can be pre-computed at compile time. As with tabulated functions, the challenge is to find a tool capable of sufficiently complex compile-time processing.

Loop unrolling. A decision to unroll a time-critical loop should not be left to the compiler's heuristics: they have no knowledge of time criticality in your application. Unrolling a loop manually eliminates a runtime variable (the loop counter) but creates a maintainability challenge and introduces an implied constant parameter: the number of repetitions of the loop body.
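
With C's own preprocessor, a fixed repetition count can be handled by a small family of helper macros, as in the sketch below. The `REPEAT_n` macros and `sum8` are hypothetical names, not part of any standard header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Each macro pastes its argument a fixed number of times. */
#define REPEAT_2(stmt) stmt stmt
#define REPEAT_4(stmt) REPEAT_2(stmt) REPEAT_2(stmt)
#define REPEAT_8(stmt) REPEAT_4(stmt) REPEAT_4(stmt)

/* Sums exactly 8 samples with no loop counter and no branch. */
static int32_t sum8(const int16_t *p)
{
    int32_t acc = 0;
    assert(p != NULL);
    REPEAT_8(acc += *p++;)
    return acc;
}
```

The loop body is written once, but the repetition count is baked into the macro name: going from 8 to, say, 12 repetitions needs a new macro, which is the kind of limitation a preprocessor with a real compile-time loop avoids.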

Project configuration management. In the context of a project family, a good architecture for software configuration management is project-independent code that processes project definition data, the latter of course being constant for a given project. The project-dependent data have to be shared across disparate languages (e.g., with a C source file and with the linker command file).

Dedicated code generators vs. preprocessors
An extreme case of a preprocessor is a dedicated tool working for a specific data set. For instance, the macros in our example of status messages can take the form

3 "my fault"
8 "mea culpa"
…

All the smarts of converting this to the C source we want are in the tool itself; the data definition has no trace of what needs to be done with it.
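
The core of such a dedicated tool might look like the sketch below, which parses definitions in the form shown above and builds the packed text and index array. Everything here (the names, the 32-bit and 20-character limits, the error policy) is an assumption for illustration; a real generator would go on to print the result as C source and would have to escape special characters in the messages.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_BITS 32   /* assumed number of status bits    */
#define MAX_MSG  20   /* maximum message length, in chars */

struct status_table {
    char     text[MAX_BITS * MAX_MSG + 1];  /* packed messages       */
    uint16_t index[MAX_BITS + 1];           /* start of each message */
};

/* Parses `<bit> "<text>"` pairs from `input` and fills in `t`.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int build_status_table(const char *input, struct status_table *t)
{
    char msg[MAX_BITS][MAX_MSG + 1] = {{0}};
    uint16_t pos = 0;

    assert(input != NULL && t != NULL);
    for (;;) {
        int bit, used = 0;
        char text[MAX_MSG + 1];
        int n = sscanf(input, " %d \"%20[^\"]\"%n", &bit, text, &used);
        if (n == EOF)
            break;                      /* clean end of input */
        if (n != 2 || used == 0)        /* bad syntax or text too long */
            return -1;
        if (bit < 0 || bit >= MAX_BITS || msg[bit][0] != '\0')
            return -1;                  /* out of range or duplicate */
        strcpy(msg[bit], text);
        input += used;
    }
    for (int b = 0; b < MAX_BITS; b++) {
        size_t len = strlen(msg[b]);
        t->index[b] = pos;
        memcpy(t->text + pos, msg[b], len);
        pos = (uint16_t)(pos + len);
    }
    t->index[MAX_BITS] = pos;           /* extra entry: total length */
    t->text[pos] = '\0';
    return 0;
}
```

Even this small parser has its own syntax rules, limits, and error cases to maintain, and it knows about exactly one data design.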

This approach is (or may be) better than none at all but is best avoided if a suitable preprocessor is available. The first reason is that a dedicated code-generating tool (whether written in C++ or Perl or anything else) requires maintenance of its own, or else the data design becomes unjustifiably rigid.

Secondly, there can be (and probably is, right in your project) more than one data definition of this kind, each of which is to produce an entirely different output according to an entirely different data design. Each would therefore require an entirely different code generator; this is very difficult to justify unless all data designs are extremely stable.

Thirdly, it is highly desirable that our macros can be plugged into an otherwise normal source file. This has to do with aesthetics, which should not be underestimated: source code sprinkled with preprocessor statements still preserves the look and feel of the target programming language.

Even more importantly, it has to do with visibility (and linkage) of the generated output. Writing a code generator that supports this feature is no small feat.

Of course, a solution to all these problems is to split the code generator into two pieces: a conceptually simple yet flexible common language to describe how we process our definitions, and a common tool that recognizes and processes these description statements in a perhaps otherwise normal source file.

This (of course) means a normal preprocessor.

What to look for in a preprocessor
When choosing a preprocessor, you may want to consider the following criteria:

1) The language style of the preprocessor. If preprocessor statements are planted in the source file, do they really, really stand out (like those of the C/C++ preprocessor and unlike those of m4)? Can reusable constructs be wrapped in macros and tucked away in an include file, so that they can be invoked on an as-needed basis? Can the same preprocessor language be used for different target programming (and description) languages?

2) Error handling. If the preprocessing results in an error, is there a guarantee that the generated source will not compile? Is there a place where all errors are conveniently collected, even if multiple files are generated?

3) Flexibility and expressive power of the language. Does the preprocessor language meet your realistic needs? For instance, how easy is it, or is it even possible, to tabulate a trig function? How easy is it, or is it even possible, to create a lookup table automatically?

One basic criterion: is it possible to arrange a re-scan of (a compile-time loop over) a segment of the source code? Another: does the language provide sufficient arithmetic capabilities?

4) Integration into the development environment. How easy is it to include the preprocessor in your Integrated Development Environment, provided it supports inclusion of third-party tools? Can the preprocessor search specified include directories? Can the preprocessor output include-file dependencies for a make-driven build process?

5) Ability to output multiple files. In our example with status messages, the status bit array and the message table may have to go to different files, depending on coding policy. Sharing data across different target languages simply requires outputting several files. Can the preprocessor do it?

Conclusion
Maintaining and managing optimized code across a family of projects requires serious attention to the data structures that are constant within a given project build. It is advantageous to use a preprocessor to pre-compute any derived data and to share data among different languages.

Ark Khasin, PhD, is with MacroExpressions, which specializes in development of original software engineering tools, including Unimal, an advanced preprocessor, and related services.
