Program proactively. Write code that's inherently great before some tool reformats it.
English is a complex and wonderfully expressive language, full of oddball peculiarities, homonyms, and perverse punctuations. Anyone who reads Usenet, e-mail, or even (alas) billboards, sees how few Americans know the difference between the words “their,” “there,” and “they're.” “Than” and “then” get confused almost as often as “affect” and “effect.” And is it “its” or is it “it's”?
Lynne Truss humorously pointed out that a Panda “eats shoots & leaves.” Add a comma, the tiniest of all typographical constructs, after “eats” and the phrase now expresses an entirely different idea.
The word “display,” so often used in programs as part of a function name, can be the noun indicating the display hardware or the verb “to show.” Which is it? Better think it through when incorporating it as part of a function or variable name.
Do you lie down or lay down? What does “lay” mean, anyway? Just the verb form has 52 different definitions in The American Heritage Dictionary of the English Language . As a noun it can be a ballad; as an adjective it denotes a secular person.
Then there's “lie” which means reclining, being mendacious, the hiding place of an animal, and more. Sometimes the context makes the meaning clear, which creates this epistemological puzzle that words themselves carry only a fraction of one's intent. Since a sentence may have many words whose meaning recursively depends on context it's a wonder we manage to communicate at all.
C, too, is a wonderfully expressive language, one whose grammar lets us build tortuous though perfectly legal constructions. ****************i=0; is semantically and linguistically correct, yet solving the Poincaré conjecture might be a shade easier than following that chain of indirection. Just a few deft lines of nested ifs can yield a veritable explosion of possible permutations that the compiler accepts with the wisdom of Buddha. But humans and lesser gods cannot decipher all of the inscrutable possibilities.
The International Obfuscated C Code Contest (www.ioccc.org) runs a competition to, uh, “honor” incomprehensible yet working C programs. One winner in 2004 supplied the code in Listing 1, which (somehow) graphs polynomials.
Any computer language that lets us construct such convoluted and unmaintainable bits of agony must be eliminated or tamed. C isn't going away, so it's up to us to use it in a grown-up fashion that eases debug, test, and maintenance.
Post-processing tools like Lint help find bugs but don't ensure we're building clean code that people can understand. Complexity analyzers do flag modules that look difficult, but, again, only work on code we've already written. I ran the program in Listing 1 through a complexity analyzer and the tool sputtered out some warnings and then crashed.
Program proactively. Write code that's inherently good even before it's compiled or processed by some tool. That's the point of a software standard; it defines what sort of operations and constructs are allowed before you press the editor's “save” button. The C language is an entire universe of possibilities whose vastness the standard narrows to a safer, more comprehensible subset.
Enter MISRA, the Motor Industry Software Reliability Association (www.misra.org.uk). This consortium of automotive and related companies was formed to find better ways to build firmware for cars. Why? Because the car companies are terrified of software. Though it adds tremendous value to their products, the cost of defects is staggeringly high. One little bug can initiate a crippling recall. My wonderful hybrid Prius, whose average 52MPG results from smart firmware, was recalled last year due to a software problem. (Yes, the 52 is real, averaged over 37,000 miles, though unlike most of us, my wife and I don't drive as if all the demons of hell are in hot pursuit.)
MISRA members range from AB Automotive to Visteon and include Ford, Lotus, and others. Among other activities, MISRA produced a set of rules for programming in C. The standard is gaining traction in the automotive industry and others. It's available as a PDF from www.misra-c2.com for £10, or as a hardcopy in that annoying and difficult-for-Americans-to-file A4 format from the same site for £25.
For those who have the 1998 version, things have changed. As of 2004, it was substantially updated and improved.
MISRA-C (2004) has 121 mandatory and 20 advisory rules. I guarantee you won't agree with all of them but most are pretty reasonable and worth following. All derive from the following five principles:
- C is incompletely specified. How does process(j++, j); behave? And exactly what is the size of an int? How astounding that such a basic component of any program is undefined!
- Developers make mistakes, and the language does little to point out even the obvious screwups. It's so easy to mix up “=” and “==.”
- Compilers have bugs, or purposely deviate from the ANSI standard. Most 8051 compilers, for instance, have run-time packages that take and return single precision results for trig functions instead of the prescribed doubles.
- C offers little intrinsic support for detecting run-time errors.
- Programmers don't always have a deep knowledge of the language and so make incorrect assumptions.
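The first two principles are easy to see in a few lines. What follows is only a sketch: process() is a stand-in with an assumed two-int signature (the call above never says what it takes), and the other function names are hypothetical.

```c
/* Hypothetical stand-in for process(); the real signature is not given. */
int process(int first, int second)
{
    return (first * 10) + second;
}

/* process(j++, j) both modifies and reads j with no ordering between the
 * two arguments, so the language does not say what gets passed. Writing
 * each step as its own statement removes the ambiguity. */
int call_with_explicit_order(int j)
{
    int first;
    int second;

    first = j;      /* value before the increment */
    j++;
    second = j;     /* value after the increment */
    return process(first, second);
}

/* "if (status = 0)" would assign and then test the result; "==" compares. */
int is_idle(int status)
{
    return (status == 0) ? 1 : 0;
}
```

With the order spelled out, call_with_explicit_order(3) always passes 3 and then 4, whatever compiler is in use.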
The MISRA-C standard doesn't address stylistic issues, like indentations and brace placement. Only the bravest dare propose that his or her brace placement rules were intelligently designed. As the saying goes, put two programmers in a room and expect three very strong opinions.
Some of the rules are just common sense. For instance:
- Rule 1.2: No reliance shall be placed on undefined or unspecified behavior.
- Rule 9.1: All automatic variables shall have been assigned a value before being used.
- Rule 14.1: There shall be no unreachable code.
Other guidelines aren't particularly controversial. For instance:
- Rule 6.3: typedefs that indicate size and signedness should be used in place of the basic types.
Int, long, float, double: their names are legion, the meanings vague. It's much better to use int16_t, for instance, instead of int, and int32_t instead of, well, int or long. Oddly, this rule is advisory only. In my opinion it's an absolute requirement.
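As a rough illustration of the rule, here is a sketch using the fixed-width types from the C99 header <stdint.h> (a C90 project would supply equivalent typedefs of its own). The sample_t record, the function name, and the 5000 mV scale are all made up for the example.

```c
#include <stdint.h>

/* Hypothetical sensor record: every field states its size and signedness. */
typedef struct {
    uint8_t  channel;
    int16_t  raw_counts;
    uint32_t timestamp_ms;
} sample_t;

/* Scale raw counts to millivolts. The cast widens to 32 bits before the
 * multiply, so the intermediate product cannot overflow a 16-bit int.
 * Assumed scale: 5000 mV full scale over 32768 counts. */
int32_t counts_to_millivolts(int16_t raw_counts)
{
    return ((int32_t)raw_counts * 5000) / 32768;
}
```

Because the widths are in the type names, the arithmetic behaves the same on an 8051 as on a 32-bit part.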
- Rule 16.10: If a function returns error information, then that error information shall be tested.
Duh, right? Yet how often do we see an “if” testing malloc()'s return value?
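For the record, the test takes three lines. This is a minimal sketch with a hypothetical helper name; it uses malloc() only because that is the example at hand, even though MISRA-C discourages dynamic allocation elsewhere.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a zeroed buffer, or NULL if malloc() fails. */
uint8_t *make_buffer(size_t length)
{
    uint8_t *buffer = malloc(length);

    if (buffer == NULL) {
        return NULL;    /* pass the failure up; the caller must test it too */
    }
    (void)memset(buffer, 0, length);
    return buffer;
}
```

The caller has the same obligation: check make_buffer()'s result before touching the memory, and free() it when done.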
- Rule 7.1: Octal constants (other than zero) and octal escape sequences shall not be used.
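The octal rule guards against a quiet trap: a leading zero changes a constant's value. A small sketch, with a made-up table of thresholds padded “for alignment”:

```c
/* The leading zeros make the last two entries octal:
 * 064 is 52 and 010 is 8, not 64 and 10. */
static const int thresholds[3] = { 100, 064, 010 };

/* Adds up the table as the compiler actually reads it. */
int threshold_sum(void)
{
    int sum = 0;
    int i;

    for (i = 0; i < 3; i++) {
        sum += thresholds[i];
    }
    return sum;
}
```

The sum is 160 rather than the 174 a casual reader would expect, and no compiler is obliged to complain.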
You may have problems with some of the rules, though. Consider:
- Rule 2.2: Source code shall only use /* … */ style comments.
The rationale is that MISRA-C is based on C90, which doesn't permit the nice double slash comments we've swiped from C++.
- Rule 16.2: Functions shall not call themselves, either directly or indirectly.
Recursion makes my head hurt. It sure is useful for some limited applications, though. The standard does provide a mechanism for breaking rules in exceptional circumstances.
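Many simple recursive routines can be rewritten as loops, which keeps stack use fixed and known in advance. The sketch below does that for factorial; the function name and the convention of returning 0 on overflow are assumptions made for the example.

```c
#include <stdint.h>

/* Iterative factorial: no function calls itself, so stack depth is constant.
 * Returns 0 when the result would not fit in 32 bits (13! and above). */
uint32_t factorial_u32(uint32_t n)
{
    uint32_t result = 1U;
    uint32_t i;

    if (n > 12U) {
        return 0U;
    }
    for (i = 2U; i <= n; i++) {
        result *= i;
    }
    return result;
}
```

Not every algorithm flattens this neatly, which is where the standard's mechanism for breaking rules comes in.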
Some will start plenty of debate:
- Rule 4.2: Trigraphs shall not be used.
Though I'm totally with the MISRA folks on this one, more than a few folks won't give up their trigraphs till pried from their cold, dead hands.
- Rule 5.2: Identifiers in an inner scope shall not use the same name as an identifier in an outer scope, and therefore hide that identifier.
Another reasonable rule that far too much code violates. The last thing we need are English-language-like homonyms obfuscating variable and function names.
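Here is a sketch of the kind of homonym the rule forbids, using hypothetical names. The first function compiles cleanly yet never touches the counter it appears to update.

```c
int error_count = 0;    /* file-scope counter */

/* Violates the rule: the local error_count hides the file-scope one,
 * so the increment is lost when the function returns. */
void record_error_shadowed(void)
{
    int error_count = 0;    /* hides the outer identifier */

    error_count++;
    (void)error_count;
}

/* Follows the rule: only one error_count is in scope. */
void record_error(void)
{
    error_count++;
}
```

After a call to record_error_shadowed() the file-scope counter is still zero; only record_error() moves it.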
- Rule 14.9: This rule is badly worded. So instead here's the original form of the rule from the 1998 version of the standard: The statements forming the body of an if, else if, else, while, do … while, or for statement shall always be enclosed in braces.
The idea, which makes sense to me, is that we often make changes and additions to the statements following these constructs. If we always build a block structure, such changes are easier and less likely to create problems.
Then there are the great ideas that might be impossible to adopt:
- Rule 3.6: All libraries used in production code shall be written to comply with the provisions of this document, and shall have been subject to appropriate validation.
Although a nice idea, most of the popular compilers for embedded applications don't claim to offer run-time libraries that are MISRA-C compliant. What about protocol stacks and other third-party tools?
- Rule 20.4: Dynamic heap memory allocation shall not be used.
This rule also specifically outlaws the use of library functions that dynamically allocate memory. Admittedly, malloc() and its buddies can wreak havoc on deterministic systems that never get cleaned up via a reboot. But it's a whole lot of work to recode libraries that employ these common functions if you even know how the library routines work. What fun we might have stripping all of the malloc() s and free() s from embedded Linux! That would be a jobs-creation program for programmers. Let's see if we can get it stashed in amongst the other 15,000 items of pork Congress awards to their supporters each year.
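In your own code, the usual alternative to the heap is a pool of fixed-size blocks reserved at build time. The following is a minimal sketch of that idea, not anything MISRA prescribes; the names and sizes are hypothetical, the blocks are plain byte buffers (alignment for other types is not addressed), and it is not safe to share between interrupts or tasks.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCKS     4U
#define POOL_BLOCK_SIZE 32U

/* All storage is reserved at link time; nothing comes from the heap. */
static uint8_t pool_storage[POOL_BLOCKS][POOL_BLOCK_SIZE];
static uint8_t pool_in_use[POOL_BLOCKS];    /* 0 = free, 1 = taken */

/* Return a free block, or NULL when every block is taken. */
void *pool_alloc(void)
{
    size_t i;

    for (i = 0U; i < POOL_BLOCKS; i++) {
        if (pool_in_use[i] == 0U) {
            pool_in_use[i] = 1U;
            return pool_storage[i];
        }
    }
    return NULL;
}

/* Release a block. Pointers that are not the start of a block are ignored. */
void pool_free(const void *block)
{
    size_t i;

    for (i = 0U; i < POOL_BLOCKS; i++) {
        if (block == (const void *)pool_storage[i]) {
            pool_in_use[i] = 0U;
        }
    }
}
```

The worst case is known up front: the fifth request fails every time, and there is no fragmentation to creep up after weeks without a reboot.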
MISRA-C makes sense
The 119-page MISRA-C document explains the rationale behind each of the rules. Even if you disagree with some, you'll find the arguments are neither capricious nor poorly thought out. It's worth reading just to make you think about the way you use C.
Yes, 141 commandments sounds like a lot to remember, particularly in today's climate where so many seem unable to manage the 10 that Moses brought down. A great many, though, are so commonsensical that they're already part of your daily practice. With some thought and practice it becomes second nature to comply with the entire batch.
Some tools automatically check source files for MISRA-C compliance. Examples include Programming Research's QA-C (www.programmingresearch.com), Gimpel's Lint (www.gimpel.com), and Parasoft's C++Test (www.parasoft.com).
The MISRA team put years of work and big bucks into exploring and codifying these rules. It's just one part of the armor the auto industry wears to protect itself from company-busting bugs.
For a lousy 10 quid all of us can get all the benefits with none of the work. Talk about a no-brainer!
Jack Ganssle is a lecturer and consultant specializing in embedded systems' development issues.
I thought splint had at least partial MISRA-C checking. Do you know if that is correct?
– Richard Jennings
Jack replies: Richard, MISRA is not mentioned in the splint manual (http://www.splint.org/downloads/manual.pdf).
I bought the MISRA-C and the “Software Readiness for Production (SRfP)” in SEP 2006. A Real Good read!!
I found the MISRA-C mentioned in a “United States Tax Payer funded” document called “JSF Air Vehicle – C++ Coding Standards (Revision C)”
The “JSF Air Vehicle – C++ Coding Standards” document contains a lot of the ideas “put forth” in the MISRA-C document, so if one wants to see what is involved without spending the £10 sterling, download it first.
– Mark Overholser
A nice article. Making code MISRA compatible is a need of the time rather than just another standard.
– Abhijeet Yadav