Using coding standards to improve software quality and security

David and Mike Kleidermacher, Green Hills Software

July 29, 2013

4 Rule 8.11: The static storage class specifier shall be used in definitions and declarations of objects and functions that have internal linkage. Two programmers may use variables of the same name for independent purposes in independent modules within the same program.

If neither definition is declared static, the linker may resolve both names to the same storage, so one module’s modification of the variable will corrupt the other module’s instance and vice versa. Furthermore, global variables may be more visible to attackers (if, for example, the global symbol table for the program is available), opening up opportunities to alter important data with malware. MISRA rule 8.11 is designed to prevent this by enforcing the generally good policy of limiting the scope of declarations to the minimum required.

While MISRA rules 8.9 and 8.11 will prevent many forms of incompatible definition and use errors, they will not prevent all such occurrences. Another example of improper symbolic resolution relates to the unintended use of exported library definitions. Libraries are often used to collect code modules that provide a related set of functions.

In fact, the use of libraries to collect reusable software across projects is worthy of mention in a coding standard. Most operating systems, for example, come with a C library, such as libc.so, that provides support for the C runtime, including string manipulation, memory management, and console input/output functions.

A complex software project is likely to include a variety of project-specific libraries. These libraries export functions that can be called by application code. A reliability problem arises due to the fact that library developers and application developers may not accurately predict or define a priori the library’s exported interfaces.

The library may define globally visible functions intended for use only by other modules within the library. Yet once these functions are added to the global namespace at link time, the linker may resolve references made by applications that were not intended to match the definitions in the library.

For example, let’s consider an application that makes use of a print function. The application developer envisions the use of a printing library provided by the printer management team. However, the font management team created a library, also used by the application developer, that provides a set of font manipulation functions. The font management team defines a print function intended for use by other modules within the font management library.

However, if no facility exists for limiting the name space of libraries (the use of such a facility, if available, should be covered by the coding standard), the font library’s print function may be inadvertently used by the linker to resolve print references made by the application developer, causing the system to fail.

Therefore, this problem may need to be solved by something other than the compiler’s front end. One method is to use a toolchain utility program that hides library definitions so that they are used by the linker when resolving intra-library references but ignored when resolving extra-library references.

The Windows platform employs user-defined library export files to accomplish this separation. When creating Windows DLLs, developers specify which functions are exported. Functions not included in the export file will not be used to resolve application references. Some high-level languages, such as C++ and Ada, do a better job of automatically enforcing type consistency and name spacing than other languages such as C. Language choice may well make certain coding standard rules trivial to enforce.
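As a sketch of the Windows approach, a module-definition (.def) file for the hypothetical font library might list only the functions intended for public use; the internal print helper, omitted from the EXPORTS list, would then never be used to resolve an application’s references:

```
; Hypothetical export file for the font library example.
; Only the listed functions are visible to application code;
; the library's internal print() helper is deliberately absent.
LIBRARY fontlib
EXPORTS
    font_open
    font_render_glyph
```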

5 Rule 16.2: Functions shall not call themselves, either directly or indirectly. While directly recursive functions are easy to detect, and almost always a bad idea in resource-constrained or safety-critical embedded systems due to the risk of stack overflow, indirect recursion can be far more difficult to detect.

Sophisticated applications with complex call graphs and calls through function pointers may contain unnoticed indirect recursion. This is yet another case in which an inter-module analyzer, such as the linker/loader, is required to detect cycles in a program’s call graph. Handling all cases of indirect function calls, such as dynamically assigned function pointers, tables of function pointers, and C++ virtual functions, can be extremely difficult for an automated tool due to the ambiguity of potential functions that may be referenced by these pointers.

A developer should try out simple test cases with a MISRA checker to see what kinds of limitations it has. If a tool vendor is unable to improve or customize the tool to meet specific needs, the developer should consider other tool choices or adopt stricter coding standard rules for limiting the use of problematic forms of indirect function calls.

MISRA C++ was released in 2008 and, as one would expect, includes significant overlap with MISRA C. However, the MISRA C++ standard includes 228 rules, approximately 50% more than the MISRA C standard. The additional ground covers rules related to virtual functions, exception handling, namespaces, reference parameters, access to encapsulated class data, and other facets specific to the C++ language.

6 Rule 9-3-2: Member functions shall not return non-const handles to class data. A simple example of a non-compliant class is as follows:

#include <stdint.h>
class temperature
{
  public:
    int32_t &gettemp(void) { return the_temp; }
  private:
    int32_t the_temp;
};
int main(void)
{
  temperature t;
  int32_t &temp_ref = t.gettemp();
  temp_ref = 10;
  return 0;
}

One of the major design goals of the C++ language is to promote clean and maintainable interfaces by encouraging the use of information hiding and data encapsulation. A C++ class is usually formed with a combination of internal (private) data and class member functions. The functions provide a documented interface for class clients, enabling class implementers to modify internal data structures and member implementations without affecting client portability.

The preceding class member function gettemp returns the address of an internal data structure. The direct access of this internal data by the client violates object-oriented principles of C++. An obvious improvement (and MISRA-compliant) implementation of the preceding sample class would be as follows:

#include <stdint.h>
class temperature
{
  public:
    int32_t gettemp(void) { return the_temp; }
    void settemp(int32_t t) { the_temp = t; }
  private:
    int32_t the_temp;
};
int main(void)
{
  temperature t;
  t.settemp(10);
  return 0;
}

If the temperature class owner decides that only eight bits of data are required to store valid temperatures, then she can modify the internal class without affecting the class clients:

#include <stdint.h>
class temperature
{
  public:
    int32_t gettemp(void) { return the_temp; }
    void settemp(int32_t t) { the_temp = t; }
  private:
    int8_t the_temp;
};


The non-compliant implementation would require modification to the client-side code due to the size change: gettemp could no longer return an int32_t reference to a member that now occupies only eight bits, breaking every client that binds the result to such a reference.

Embedded C++ and secure code
A number of advanced features in C++, such as multiple inheritance, can result in programming that is error prone, difficult to understand and maintain, and unpredictable or inefficient in footprint and execution speed. Because of these drawbacks, a consortium of semiconductor and development tools vendors created a C++ subset specification called Embedded C++ that has been in widespread use for more than a decade.

The goal of Embedded C++ is to provide embedded systems developers who come from a C language background with a programming language upgrade that brings the major object-oriented benefits of C++ without some of its risky baggage. To that end, Embedded C++ removes the following features of C++:

  • Multiple inheritance
  • Virtual base classes
  • New-style casts
  • Mutable specifiers
  • Namespaces
  • Runtime type identification (RTTI)
  • Exceptions
  • Templates

One example of the rationale for Embedded C++ is the difficulty in determining the execution time and footprint of C++ exception handling. When an exception occurs, the compiler-generated exception-handling code invokes a destructor on all automatic objects constructed since the applicable try block was entered.

The number and execution time of this destructor chain may be extremely difficult to estimate in sophisticated applications. Furthermore, the compiler generates exception-handling code to unwind the call stack linking the handler to its original try block. The additional footprint may be significant and difficult to predict. Because the standard C++ runtime is compiled to support exception handling, this feature adds code bloat even to programs that do not make use of the try and catch exception-handling mechanisms.

For this reason, purpose-built runtime libraries supporting the reduced language subset typically accompany an Embedded C++ tool chain. Footprint concerns also led C++ templates to be left out of the Embedded C++ standard; in some cases, the compiler may instantiate a large number of functions from a template, leading to unexpected code bloat.

Of course, some of these removed features can be extremely useful. Careful use of templates can avoid unnecessary code bloat while providing simpler, more maintainable source code interfaces. For this reason, many compilers provide variants of Embedded C++ that enable a development organization to add back features that may be acceptable for security-critical development, especially if those features are used sensibly (such as by enforcing some or all of the rules of MISRA C++).

For example, Green Hills Software’s C++ compiler provides options for allowing the use of templates, exceptions, and other individual features with the Embedded C++ dialect (along with enabling MISRA checking).

Conclusion: Dealing with code complexity
Much has been published regarding the benefits of reducing complexity at the function level. Breaking up a software module into smaller functions makes each function easier to understand, maintain, and test.

One can think of this as meta-partitioning: applying the software componentization paradigm at a lower, programmatic level. A complexity-limitation coding rule is easily enforced at compile time by calculating a complexity metric and generating a compile-time error when the complexity limit is exceeded.

Once again, since the compiler is already traversing the code tree, it does not require significant additional build time to apply a simple complexity computation, such as the popular McCabe metric (http://en.wikipedia.org/wiki/McCabe_Metric). Because the compiler generates an actual error pointing out the offending function, the developer is unable to accidentally create code that violates the rule.

Adopting a coding standard rule that allows a McCabe complexity value of 200 is useless; most legacy code bases will be compliant despite containing spaghetti-like code that is hard to understand, test, and maintain.

The selection of a specific maximum complexity value is open to debate. If an existing code base is well modularized, a value may be selected that allows most of the properly partitioned code to compile; future code will be held to the same stringent standard.

When the complexity metric is applied to a large code base that has previously not been subjected to such an analysis, it is likely that a small number of large functions will fail the complexity test. Management then needs to weigh the risk of changing the code at all.

Modifying a piece of code that, while complex, is well exercised (proven in use) and serves a critical function may reduce reliability by increasing the probability of introducing a flaw. The complexity enforcement tool should provide a capability to allow exceptions to the complexity enforcement rule for specific functions that meet this profile.

Exceptions, of course, should always be approved by management and documented as such. The coding standard should not allow exceptions for code that is developed subsequent to the adoption of the coding rule. These types of coding standard policies conform to their spirit while maximizing efficiency, enabling them to be employed effectively in legacy projects.

David Kleidermacher, Chief Technology Officer of Green Hills Software, joined the company in 1991 and is responsible for technology strategy, platform planning, and solutions design. He is an authority in systems software and security, including secure operating systems, virtualization technology, and the application of high robustness security engineering principles to solve computing infrastructure problems. Mr. Kleidermacher earned his bachelor of science in computer science from Cornell University.


This article is excerpted from Embedded Systems Security by David and Mike Kleidermacher, used with permission from Newnes, a division of Elsevier. Copyright 2012. All rights reserved.
