MISRA C: Write safer, clearer C code - Embedded.com

Embedded developers often bemoan the fact that no programming language is ideal for their particular needs. In a way, this situation is unsurprising, because, although a great many developers are working on embedded applications, they are still only quite a small subset of the world’s programming community. Nevertheless, some languages have been developed with embedded in mind. Notable examples are PL/M, Forth and Ada, all of which have been widely used, but never universally accepted. Other languages, like Rust, are gaining support, but are not yet mainstream. The compromise, which has been adopted almost universally, is C. How can that compromise be made to work most effectively?

The C language is compact, expressive and powerful. It provides a programmer with the means to write efficient, readable and maintainable code. All of these features account for its popularity. Unfortunately, the language also enables the unwary developer to write dangerous, insecure code that can cause serious problems at all stages of a development project and into deployment. For applications where safety and/or security are a major priority, these shortcomings of the language are a major concern.

It was against this background that, in the late 1990s, the Motor Industry Software Reliability Association (MISRA) introduced a set of guidelines for the use of C in vehicle systems, which became known as MISRA C. Since then, the guidelines have been steadily refined, with updates being published from time to time. A similar approach to the use of C++ has also been established. Although the guidelines were originally aimed at developers of software for use in cars, it was quickly realized that they are equally applicable to many other application areas where safety is critical, and the standard is now widely adopted in many industries.

Although MISRA C is not a style guide – indeed many users apply a style guide as well as the standard – numerous rules also promote the writing of clear, readable, maintainable code. This is very beneficial, as code that is straightforward to understand is much less likely to harbor subtle bugs or undefined behavior.

Full details of MISRA C are obtainable from https://misra.org.uk and there are many tools available that support the approach.

I will just give a flavor of the guidelines here. My references are from MISRA C:2012 third edition, first revision. MISRA C is under constant review, with incremental changes addressing clarity and accuracy of the guidelines and support for newer versions of the C language standard. Although details change, the overall philosophy and approach do not.

Rule 13.2 – The value of an expression and its persistent side effects shall be the same under all permitted evaluation orders

The C language standard gives compilers very wide latitude with respect to evaluation order in expressions. Any code that is sensitive to evaluation order is, thus, compiler-dependent, and compiler-dependent code should always be considered unsafe.

For example, the use of the increment and decrement operators may be troublesome:

val = n++ + arr[n];

Which element of arr is accessed? Did the programmer expect the value of n used to index the array to be that before the increment or after? Although it might look as if the increment is performed before the array index, that assumes left-to-right expression evaluation, which is not a valid assumption. In fact, because n is both modified and read with no sequence point in between, the behavior is undefined. So, the code is not clear and should be re-written thus:

val = n + arr[n+1];
n++;

or

val = n++;
val += arr[n];

or even

val = n;
n++;
val += arr[n];

Which of these options you choose depends on personal style. They all perform the same operation, and, in fact, an optimizing compiler would most likely generate exactly the same code.
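As a quick check on the claim that the unambiguous forms are equivalent, here is a minimal sketch. It assumes the intended meaning is "the old value of n plus the element at the incremented index". The helper names form_a, form_b and form_c are hypothetical, and n is passed by pointer only so that the caller can observe the increment.

```c
/* Three unambiguous ways of writing "old n plus arr[n + 1], then n is incremented".
   Helper names are illustrative only and do not come from MISRA C. */

static int form_a(const int arr[], int *n)
{
    int val = *n + arr[*n + 1];   /* uses the old n explicitly */
    (*n)++;
    return val;
}

static int form_b(const int arr[], int *n)
{
    int val = (*n)++;             /* increment completes at the end of this statement */
    val += arr[*n];               /* n already holds the new value here */
    return val;
}

static int form_c(const int arr[], int *n)
{
    int val = *n;
    (*n)++;
    val += arr[*n];
    return val;
}
```

Called with the same array and the same starting index, all three helpers should return the same value and leave n incremented by one.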

A similar problem may occur with multiple function calls used within an expression. A function call might have a side-effect that impacts another. For example:

val = fun1() + fun2();

In this case, if either function can affect the result from the other, the code is ambiguous. To write safe code, any possible ambiguity must be removed:

val = fun1();
val += fun2();

It is now clear that fun1() is executed first.
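To make the hazard concrete, here is a small sketch in which the two functions share state. The bodies of fun1() and fun2(), the counter variable and the helper names are all assumptions made up for illustration; only the call pattern comes from the text above.

```c
/* Hypothetical shared state that both functions modify. */
static int counter = 0;

static void reset_counter(void)
{
    counter = 0;
}

static int fun1(void)
{
    counter += 1;
    return counter;
}

static int fun2(void)
{
    counter *= 10;
    return counter;
}

/* Explicitly sequenced version: fun1() always runs before fun2(). */
static int sum_in_order(void)
{
    int val = fun1();
    val += fun2();
    return val;
}
```

Starting from a counter of zero, the sequenced version yields 1 + 10 = 11. If a compiler evaluated fun2() first in the single expression fun1() + fun2(), the result would be 0 + 1 = 1, which is exactly the kind of compiler dependency that Rule 13.2 is intended to exclude.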

Rule 17.2 – Functions shall not call themselves, either directly or indirectly

From time to time, an elegant way to express an algorithm is through the use of recursion. However, unless the recursion is very tightly controlled, there is a danger of stack overflow, which can, in turn, result in bugs that are very hard to locate. In safety-critical code, recursion should be avoided.
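As an illustration of how a recursive algorithm can often be restated as a loop, here is a sketch using Euclid's greatest common divisor algorithm. The choice of algorithm and the function name gcd_iterative are mine, not taken from the MISRA C document.

```c
/* Recursive form, shown for comparison only (it would violate Rule 17.2):
 *
 *   unsigned int gcd(unsigned int a, unsigned int b)
 *   {
 *       return (b == 0u) ? a : gcd(b, a % b);
 *   }
 */

/* Iterative form: stack usage is fixed, whatever the input values. */
static unsigned int gcd_iterative(unsigned int a, unsigned int b)
{
    unsigned int x = a;   /* work on copies rather than modifying the parameters */
    unsigned int y = b;

    while (y != 0u)
    {
        unsigned int r = x % y;
        x = y;
        y = r;
    }
    return x;
}
```

The loop performs the same sequence of remainder steps as the recursive version, but its stack requirement can be determined by inspection.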

Rule 19.2 – The union keyword should not be used

Although C is a typed language, typing is not very strictly enforced, and developers may be tempted to override typing to “simplify” their code. Adhering to the constraints of data types is essential to create safe code, as any attempts to get around data types can produce undefined results. The union keyword can be used for a number of purposes, which generally result in unclear code, but can also be a means to circumvent typing.

One example would be using a union to “take apart” an unsigned integer, thus:

union e
{
   unsigned int ui;
   unsigned char a[4];
};

In this case, each byte of ui can be accessed as an element of a. However, we cannot be sure whether a[0] is the least or most significant byte, as this is an implementation issue, essentially associated with the endianness of the processor. The alternative might be to use shifting and masking, thus:

unsigned char getbyte(unsigned int input, unsigned int index)
{
  input >>= (index * 8);
  return input & 0xff;
}
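A slightly more defensive variant of the shift-and-mask idea is sketched below. It assumes a fixed 32-bit width from <stdint.h>; the name get_byte_u32 and the range check are my additions, not part of the original example.

```c
#include <stdint.h>

/* Hypothetical fixed-width variant: index 0 is always the least
   significant byte, regardless of the processor's endianness. */
static uint8_t get_byte_u32(uint32_t input, uint32_t index)
{
    uint8_t result = 0u;

    if (index < 4u)   /* guard: shifting a 32-bit value by 32 or more bits is undefined */
    {
        result = (uint8_t)((input >> (index * 8u)) & 0xFFu);
    }
    return result;
}
```

For example, get_byte_u32(0x11223344u, 0u) yields 0x44 and get_byte_u32(0x11223344u, 3u) yields 0x11 on any conforming implementation, whereas the union version gives a[0] == 0x44 only on a little-endian machine.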

It may be argued that these rules (and most, if not all, of MISRA C) are just common sense and any good programmer would take such an approach. This may be true, but a set of clear guidelines leaves less to chance.

3 thoughts on “MISRA C: Write safer, clearer C code”

  1. MISRA C is definitely a useful standard to know about, so thank you for this post. One might argue that these are just common sense for any good programmer, but I’d agree there’s more to it than that.

    First, when the set of rules is codified, it’s possible for a compiler (or a code-validation system) to check them exhaustively, and catch any programming errors that cause non-compliance. It’s also a way to communicate: If I say, “This software library is MISRA C compliant,” you know exactly what I mean. However, if I say, “This software library is written to be safe in embedded systems,” you have no idea whether what I mean by that has anything to do with what your embedded system needs.

    Second, saying these are “common sense” for “any good programmer” really undersells how much many of these rules are domain-specific to embedded programming, or require a strong understanding of details of the C language.

    Looking at your examples, Rule 13.2 tends to apply when subtleties of the C language come into play. A good programmer of any sort would probably tell you that relying on compiler behavior like that is a bad idea, but being able to recognize all the code that does it is a lot harder — which is why a good compiler should have a warning that will tell you.

    Rule 17.2 is, as you note, particular to safety-critical code. On the web-server backend code that I’ve recently been working with, there’s plenty of stack space (and, with guard pages in x86 virtual memory, stack exhaustion is fairly easy to catch), and recursive code is often important to making parts of the codebase comprehensible. The issues I’ve seen are mostly when a code bug causes infinite recursion, or when someone has written a pathological test that intentionally takes up almost all the stack space and then something changes to take just a tiny bit more. On the rare cases when a server backend has a stack overflow, the guard page means that program will simply crash, and then the rest of the system is designed to handle crashes without issue.

    Rule 19.2 is also somewhat particular to code that may run on hardware with multiple endiannesses and integer representations. Outlawing “union” entirely is also a fairly large hammer to solve the problem — after all, if you use “union” pedantically according to the C++ standard, you can’t run into any data-representation problems, because you’re only allowed to read from the same union member that you last wrote. (I’m not familiar with whether C makes the same restriction or not.) In some environments, using “union” or its equivalent safely is worthwhile for the memory savings it can provide.

    I point this out because that’s another reason why having these rules codified is a good idea. A good programmer with a different background won’t necessarily have the “embedded-programming sense” to recognize all of these things as potential problems, regardless of how much programming skill and common sense they have!

    (I also point this out because, although MISRA C is an excellent thing for the domains where it’s appropriate, there are lots of domains where it isn’t. Many of them are now showing up in “embedded” systems, too. So it’s important to be thoughtful about which coding standards to apply, rather than treating any coding standard as one-size-fits-all.)

    1. First off, I like the topic of this article, simply because it raises the idea of knowing the rules that make up the MISRA standard before blindly pledging allegiance to its use. Nothing like setting your static analysis software to “full MISRA” and then seeing a sea of errors and warnings fill the screen.

      One of these rules is 19.2. On that, I agree with Brooks. I’ve had to use unions in embedded code quite often for converting from float or double to individual bytes that are transferred across a communication channel (rx/tx serial, I2C, SPI, etc.). The union method is clear and concise. Because this is embedded code and I have limited code and RAM space, I don’t want to gum it up with yet another 2-line function just to byte shift (extra push onto stack). Yes, this forces me to know what type of “endian-ness” I am dealing with and the number of bytes needed. Yes, I have to know what I am doing and assign the correct number of bytes to match the character array with the number of bytes in the float or double … but, hey, I would have to do that with the loop in the function also.
