The best coding standards eliminate bugs - Embedded.com

The topic of coding standards is an emotive one among software developers, whose divergent opinions raise questions that range from “Why do we need such restrictions?” to “How could we possibly operate without them?”

Software engineering has always wrestled with standards, and the development of the C and C++ languages brought the issue into even sharper focus. These flexible and powerful languages are now deeply rooted in industrial and embedded environments. In the past decade, developers have accepted the need to control and restrict these languages for industrial, commercial, or other safety-conscious purposes.

Many of the early attempts to define coding standards focused on style rather than safety and reliability. However, recent collective efforts such as the Motor Industry Software Reliability Association (MISRA) C and C++ guidelines target bug detection, avoidance, and prevention.

The primary intent behind these modern coding standards is to prevent software misbehavior. Software languages generally contain features that are rich beyond the needs of most software practitioners. Developers are not expected to be experts in the full language feature set, and coding rules help protect them from language danger or misuse.

Undefined or bug?
Language dangers cover a broad range of issues involving language specification and misuse. All languages have undefined outcomes from unplanned usage.

Unfortunately, despite its flexibility and suitability to embedded applications, the C language contains many of these issues. C++ has inherited most of these issues, thanks to its near-complete compatibility with C.

Well-recognized instances of undefined behavior include dereferencing a null pointer, dividing by zero in an expression, and a function returning a handle to nonstatic local data. In C++, this particular danger extends to returns of function parameters:

class A {…};

const A& Bad(const A& a)
{
    return a;   // returns ref to alias:
                // undefined if it exceeds
                // aliased object lifetime
}
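One way to sidestep the lifetime problem is to return by value, so the caller never holds a reference that can outlive the argument (for example, when the argument is a temporary that is destroyed at the end of the calling expression). The sketch below is only illustrative: the `name` member and the function name `Safe` are assumptions, since the original elides the body of `A`.

```cpp
#include <string>

// Hypothetical stand-in for the article's class A; the original elides its body.
struct A {
    std::string name;
};

// Returning by value hands the caller an independent copy, so the result
// stays valid even when the argument was a temporary.
A Safe(const A& a)
{
    return a;
}
```

A call such as `A kept = Safe(A{"temporary"});` leaves `kept` usable after the temporary is gone, at the cost of a copy.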

It is not uncommon for a C/C++ coding standard to include a blanket rule referencing adherence to the language standard and avoidance of any undefined behavior. More specific guidelines, such as testing a pointer's non-null property before each attempted dereference or programmatically ensuring that a divisor cannot be zero, provide more focused approaches.
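The two focused guidelines mentioned here might look like the following in practice. This is a minimal sketch; the helper names `read_checked` and `divide_checked` are hypothetical and do not come from any particular coding standard.

```cpp
#include <climits>

// Hypothetical helper: reads through a pointer only after a null check.
// Returns false and leaves 'out' untouched when 'p' is null.
bool read_checked(const int* p, int& out)
{
    if (p == nullptr) {
        return false;
    }
    out = *p;
    return true;
}

// Hypothetical helper: divides only when the result is defined.
// Besides a zero divisor, INT_MIN / -1 overflows and is also undefined.
bool divide_checked(int numerator, int divisor, int& quotient)
{
    if (divisor == 0) {
        return false;
    }
    if (numerator == INT_MIN && divisor == -1) {
        return false;
    }
    quotient = numerator / divisor;
    return true;
}
```

Reporting failure through the return value keeps the undefined operation from ever being evaluated; how the caller reacts to `false` is left open here.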

A special subclass of language definition involves situations where compiler vendors choose between several behaviors. Does a left-shift protect the leftmost sign bit? How will a larger integer value be represented when cast to an 8-bit character type? Are plain char and bit field types signed or unsigned?

While these behaviors can be established from compiler documentation or configuration settings, a safe practice coding rule would prescribe that such outcomes be documented and restricted to only those that are deterministic. The richest seam of buggy code is thus within the developer's control. This includes a wide variety of coding errors often brought about by suspect coding assumptions or lack of foresight.
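One possible way to document implementation-defined assumptions is to state them in the code itself, so a port to a different compiler fails at build time rather than at run time. The sketch below assumes a platform with 8-bit bytes; `kPlainCharIsSigned` and `narrow_to_u8` are hypothetical names used for illustration.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Documented assumption: the build stops on a platform where it does not hold.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Whether plain char is signed is the compiler's choice; record it
// in one place instead of relying on it silently.
constexpr bool kPlainCharIsSigned = std::numeric_limits<char>::is_signed;

// Hypothetical helper: narrows to 8 bits only when the value fits,
// so the outcome never depends on how the compiler handles overflow.
bool narrow_to_u8(long value, std::uint8_t& out)
{
    if (value < 0 || value > 255) {
        return false;
    }
    out = static_cast<std::uint8_t>(value);
    return true;
}
```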

Data initialization. Data initialization is a sensible and safe practice that is especially important if developers don't take full advantage of C++'s member initialization semantics:

class A
{
public:
    A();    // 'm_i' not init'ed
    int getI() const
        { return m_i; }
private:
    int m_i;
};

int j = A().getI(); // 'm_i' and subsequently j
                    // have indeterminate value

To avoid these initialization issues, the coding rule states that constructors shall initialize (either through initial value or constructor call) each base class and all nonstatic data members.
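A version of the class that follows this rule could look like the sketch below. The `Base` class and the chosen initial values are assumptions added only to show both halves of the rule (base class and data member).

```cpp
// Hypothetical base class, added to illustrate base-class initialization.
class Base
{
public:
    explicit Base(int id) : m_id(id) {}
    int id() const { return m_id; }
private:
    int m_id;
};

class A : public Base
{
public:
    // The initializer list names the base class and every nonstatic member.
    A() : Base(1), m_i(0) {}
    int getI() const { return m_i; }
private:
    int m_i;
};
```

With this constructor, `int j = A().getI();` reads a well-defined value.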

Name hiding. Reusing a variable name in a different scope can cause bugs that are particularly difficult to find. The identifier at the innermost scope hides any matching name in an outer scope, whether intended or not. A coding rule stating that “an identifier in an inner scope shall not hide an outer scope identifier” guards against this. Consider this for-loop example:

void foo(void)
{
    int i = 15;
    int MyArray[10];

    for (int i=0; i<10; ++i) {
        MyArray[i] = 0;
    };
    MyArray[i-1] = 1;   // whatever intended ..
}                       // ..out-of-bounds results

C++ encourages developers to declare the control variable (i) in the loop statement, thus restricting its scope to that block. This example could be legacy C or an early version of C++.
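A rewrite that obeys the no-hiding rule gives each identifier a distinct name. The sketch below assumes the original intent was to zero the array and then mark its last element; the source does not say so, and the names `fill_and_mark_last`, `index`, and `last` are illustrative.

```cpp
#include <array>
#include <cstddef>

// Zero-fills a ten-element array, then sets the last element to 1.
// No inner-scope name hides an outer one, so the final write cannot
// accidentally pick up an unrelated variable.
std::array<int, 10> fill_and_mark_last()
{
    std::array<int, 10> values{};

    for (std::size_t index = 0; index < values.size(); ++index) {
        values[index] = 0;
    }

    const std::size_t last = values.size() - 1;
    values[last] = 1;
    return values;
}
```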

Boolean expressions. With no Boolean type in the most popular version of C (ISO 1990), developers have to work with quasi-Boolean concepts. The resulting lack of type safety can lead to some subtle and pernicious bugs:

x = ((a > b) & (c > d));
/* logical rather than    */
/* bitwise AND intended?  */

y = ((a + b) || (c - d));
/* odd: logical OR of two */
/* arithmetic expressions */

These can be neatly avoided with a coding rule to “prohibit the mixing of arithmetic and logical (effectively Boolean) expressions.”
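Under such a rule, the two statements would be written with logical operators applied only to genuinely Boolean operands. The following is a sketch with hypothetical function names; the last two helpers show why bitwise and logical AND are not interchangeable once operands are quasi-Boolean integers rather than comparison results.

```cpp
// Logical AND applied to two comparisons, both of which are Boolean.
bool both_greater(int a, int b, int c, int d)
{
    return (a > b) && (c > d);
}

// Explicit tests turn each arithmetic result into a Boolean value
// before the logical OR is applied.
bool either_nonzero(int a, int b, int c, int d)
{
    return ((a + b) != 0) || ((c - d) != 0);
}

// Bitwise AND compares bit patterns: 2 & 1 is 0 although both are "true".
bool bitwise_and_nonzero(int lhs, int rhs)
{
    return (lhs & rhs) != 0;
}

// Logical AND asks only whether each operand is nonzero.
bool logical_and(int lhs, int rhs)
{
    return (lhs != 0) && (rhs != 0);
}
```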

Assign in conditional. An assignment in a conditional expression, while legal, can expose a typing error or a more complex logic issue:

// assign or test equality?
if (y=x) {…}

// conditional side-effect
if ((a == b) || (c = d)) {…}

All such behaviors can be elegantly avoided through a coding rule “prohibiting assignment operators in effectively Boolean expressions.”
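Code that follows this rule separates the assignment from the test. The sketch below shows one possible rewrite of each fragment, assuming the assignments were intentional; the function names are hypothetical.

```cpp
// Rule-conforming form of `if (y = x)`: assign first, then test explicitly.
bool assign_then_test(int& y, int x)
{
    y = x;
    return y != 0;
}

// Rule-conforming form of `if ((a == b) || (c = d))`.
// The original short-circuit assigns to c only when a != b;
// writing it out makes that conditional side effect visible.
bool equal_or_assign(int a, int b, int& c, int d)
{
    if (a == b) {
        return true;
    }
    c = d;
    return c != 0;
}
```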

Type conversions. The type system in C has great flexibility in handling conversions. While this enables powerful data manipulation in expressions, it often betrays poor understanding of the underlying compiler actions and occasionally reveals difficult, value-sensitive bugs:

// unsigned 16 and 32 bits
uint16_t u16a = 40000;
uint16_t u16b = 30000;
uint32_t u32a;

// result: 70000 or 4464?
u32a = u16a + u16b;

C's balancing and promotion rules might result in either of these values, depending on the width of int on the target. With a 16-bit int, the addition is carried out in 16 bits and wraps to 4464 before the result is widened, so the result is wrong even though the destination is a 32-bit type.
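A common remedy is to widen the operands explicitly before the addition, so the result no longer depends on the width of int. The sketch below uses hypothetical function names; the second function reproduces the wrapped value by truncating the sum back to 16 bits.

```cpp
#include <cstdint>

// Each operand is converted to 32 bits first, so the addition itself
// is performed in at least 32 bits on every platform.
std::uint32_t add_widened(std::uint16_t a, std::uint16_t b)
{
    return static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
}

// Truncating the sum to 16 bits gives the value a 16-bit addition
// would produce: the sum modulo 65536.
std::uint16_t add_wrapped_to_16_bits(std::uint16_t a, std::uint16_t b)
{
    return static_cast<std::uint16_t>(a + b);
}
```

For the article's inputs, 40000 + 30000 is 70000 when widened and 70000 - 65536 = 4464 when wrapped.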

Switch statements. Conversion issues can lurk in places developers don't expect to be a problem. For example, switch statements are an elegant control-flow mechanism. However, they are not without danger. The switch and case expressions should be of the same type and have the same sign; otherwise, developers might suffer unwanted implicit conversions:

unsigned char c;

switch ( c ) {
    case -1:
        … /* unreachable */
    case 256:
        … /* unreachable */
}

A general coding guideline that “there shall be no unreachable code” provides a degree of protection. A rule guarding against implicit conversions is a more targeted means of avoiding this issue.
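A switch that respects both guidelines keeps every case label inside the range of the controlling type. The following is a minimal sketch; the `Kind` enumeration, the constants, and `classify` are hypothetical names chosen for illustration.

```cpp
#include <climits>

enum class Kind { Zero, Max, Other };

// Case labels are constants of the same type as the switch expression,
// so every label is representable and therefore reachable.
constexpr unsigned char kLowest = 0;
constexpr unsigned char kHighest = UCHAR_MAX;

Kind classify(unsigned char c)
{
    switch (c) {
        case kLowest:
            return Kind::Zero;
        case kHighest:
            return Kind::Max;
        default:
            return Kind::Other;
    }
}
```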

Casting away const. According to ISO C, a pointer can only be assigned to another pointer if “both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all of the qualifiers of the type pointed to by the right.”

In the example, the plain pointer assignment violates this constraint. With an appropriate cast, such an assignment will succeed, although it will be highly dangerous:

// pointer to int:
int *pi;
// pointer to const int:
const int *pci;
…
// constraint error
pi = pci;
// dangerous but permitted
pi = (int *)pci;

Suitable protection is encapsulated in the rule expressing that “no cast shall be allowed that removes const or volatile qualification from the type addressed by a pointer.”
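When code genuinely needs a modifiable value, one alternative that respects this rule is to copy the const object and modify the copy. This is only a sketch of that idea, with a hypothetical function name; whether a copy is acceptable depends on the design.

```cpp
// Works on a private copy instead of casting away const, so the object
// that 'pci' points to is never written through.
int incremented_copy(const int* pci)
{
    int local = *pci;   // copy the value; the original stays const
    int* pi = &local;   // a non-const pointer to the copy needs no cast
    *pi += 1;
    return local;
}
```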

Reliable code protection
As can be seen from these code examples, the C and C++ languages benefit from defense against misuse of their features. These languages' well-publicized and documented undefined behaviors, including null pointer dereference, divide by zero, and out-of-bounds array access, are central to language protection.

However, many other types of vulnerabilities require deep understanding of language syntax and semantics. Examining source code for a wide range of code defects and implementing coding best practices, preferably through an automated tool method, are equally important to achieve a high-quality and robust code base.

Fergus Bolger is CTO at PRQA, based in Hersham, Surrey, the United Kingdom. With nearly 30 years of experience in the hardware and software computing industry, Fergus has filled management and engineering roles in development and advanced testing at PRQA and Amdahl Corporation. He has extensive experience with mainframe, client-server, UNIX, and Windows platforms with special interest in software process, automated tooling, and maintainable software systems. Fergus earned his Master's in Engineering from University College Dublin.
