Beyond Functional Firmware - Embedded.com

Beyond Functional Firmware

Because even bad code can work, you've got to evaluate your firmware vigorously. Jack lists the best criteria and how to meet them.

It's possible to write ugly, obfuscated, horribly convoluted, and undocumented code that still manages to work. God knows there's an awful lot of it out there today, controlling everything from disposable consumer products to mission-critical avionics. One reader who works for an antilock brake company tells me their code is “absolutely hideous.” (I think my next car will be a 1971 VW Beetle.)

It's equally possible and quite a bit easier to write finely crafted functions that are intrinsically maintainable and obviously, transparently correct. Here are some guidelines.

Minimize functionality
Are you an electrical engineer? My informal surveys suggest that around 60% to 70% of all firmware folks have electrical engineering degrees. That background serves us well in understanding both the physics of our applications and the intricacies of the hardware.

Yet most EE curriculums ignore software engineering. Sure, the professors teach you to program, and they expect each student to be proficient at building code. But they provide an education devoid of the critical tenets of software engineering necessary to build reliable systems. The skills needed to create a working 500-line program do not scale to one of 100,000 lines.

Probably the best known yet least used rule of software design is to keep functions short. In my firmware lectures I ask how many attendees enforce a function size limitation. It's rare to see even a single raised hand. Yet we know that good code is virtually impossible with long routines.

If you write a function that exceeds 50 lines—one page—it's too long. Fact is, you probably can't remember a string of more than eight or 10 numeric digits for more than a minute or so. How can you expect to comprehend the thousands of ASCII characters that make up a long function? Trying to follow program flow that extends across page boundaries is tough to impossible; we have to flip pages back and forth to understand what a gaggle of nested loops is doing.

A function should do just one thing. Convoluted code that struggles to manage many disparate activities is too complex to be reliable or maintainable. I see far too many functions that take 50 arguments that select a dozen interacting modes. Few of these work reliably.

Express independent ideas independently, each in its own crystal clear function. One rule of thumb is that if you have trouble determining a meaningful name for a function, it's doing too many different things.
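As a small illustration of that naming rule, here is a sketch in C. The ADC resolution, reference voltage, limit, and both function names are assumptions made up for this example. Each function does one job, and its name says all of it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values: a 10-bit ADC with a 3300 mV reference. */
#define ADC_FULL_SCALE  1023u
#define ADC_REF_MV      3300u
#define OVERVOLT_MV     3000u

/* Does one thing: scale raw ADC counts to millivolts. */
static uint32_t counts_to_millivolts(uint16_t counts)
{
    if (counts > ADC_FULL_SCALE) {
        counts = ADC_FULL_SCALE;    /* clamp out-of-range input */
    }
    return ((uint32_t)counts * ADC_REF_MV) / ADC_FULL_SCALE;
}

/* Does one thing: decide whether a voltage exceeds the limit. */
static bool is_overvoltage(uint32_t millivolts)
{
    return millivolts > OVERVOLT_MV;
}
```

A caller combines the two, as in is_overvoltage(counts_to_millivolts(raw)). Had both jobs been fused into one routine, the honest name would have been something like convert_and_check(), and that "and" is the warning sign.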

Encapsulate
Advocates of object-oriented programming chant the mantra of “encapsulation, inheritance, and polymorphism.” While OOP isn't appropriate as a whole for all applications, encapsulation is.

Encapsulation means binding the data and the code that operates on that data into a single homogeneous entity. It means no other bit of code can directly access that data.

If your application is so ROM-limited that every byte counts, encapsulation may be impossible. But recognize that such applications are inherently expensive to build. Obviously, in a few extremely cost-sensitive applications—an electronic greeting card, for instance—it's critically important to minimize memory needs. But whenever you get into a byte-limited situation, development costs are likely to skyrocket.

Encapsulation is obviously possible in C++ and in Java. It's equally available to C and assembly programmers. Define each global variable within the function or module that uses it, and ensure its scope prevents access by other routines.

But encapsulation is more than just data hiding. A properly encapsulated object has high cohesion. It accomplishes its mission without engaging in any unrelated activities. It's exception-safe and thread-safe. The object or function is a completely functional black box requiring little or no external support.

A properly encapsulated serial handler might require an interrupt service routine to transfer received characters into a circular buffer, a get_data() routine that extracts data from the data structure, and an is_data_available() function that tests for received characters. It also handles buffer overrun, serial dropout, parity errors, and all other possible error conditions. It's reentrant too.
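A minimal sketch of such a handler in C might look like the following. Only get_data() and is_data_available() are named in the text; the signatures, the buffer size, serial_rx_isr(), and serial_overrun_occurred() are assumptions for illustration. To stay hardware-independent, the interrupt routine takes the received byte as a parameter instead of reading a UART register. The sketch covers buffer overrun only; parity errors and dropout would need the same treatment.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_BUF_SIZE 16u   /* assumed size; must be a power of two */

/* File-scope statics: no other module can touch this data. */
static volatile uint8_t rx_buf[RX_BUF_SIZE];
static volatile uint8_t rx_head;     /* written only by the ISR    */
static volatile uint8_t rx_tail;     /* written only by get_data() */
static volatile bool    rx_overrun;  /* set when a byte is dropped */

/* Call from the UART receive interrupt with the byte just read. */
void serial_rx_isr(uint8_t byte)
{
    uint8_t next = (uint8_t)((rx_head + 1u) & (RX_BUF_SIZE - 1u));

    if (next == rx_tail) {
        rx_overrun = true;           /* buffer full: drop the byte */
    } else {
        rx_buf[rx_head] = byte;
        rx_head = next;
    }
}

/* True when at least one received byte is waiting. */
bool is_data_available(void)
{
    return rx_head != rx_tail;
}

/* Returns true and stores one byte in *out if data was waiting. */
bool get_data(uint8_t *out)
{
    bool got = is_data_available();

    if (got) {
        *out = rx_buf[rx_tail];
        rx_tail = (uint8_t)((rx_tail + 1u) & (RX_BUF_SIZE - 1u));
    }
    return got;
}

/* Reports the overrun flag and clears it. */
bool serial_overrun_occurred(void)
{
    bool flag = rx_overrun;

    rx_overrun = false;
    return flag;
}
```

Because the ISR writes only the head index and the reader writes only the tail, this single-producer, single-consumer arrangement needs no lock on processors where byte accesses are atomic. The read-then-clear in serial_overrun_occurred() is not atomic, so a real target would probably wrap it in a brief interrupt-disable.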

A corollary of embracing encapsulation is to delete dependencies. High cohesion must be accompanied by low coupling—little dependence on other activities, in other words. We've all read code where some seemingly simple action is intertwined with a dozen other modules. The simplest design change requires chasing variables and functionality throughout thousands of lines of code, a task sure to drive maintainers nuts.

Remove redundancies
Get rid of redundant code. Researchers at Stanford studied 1.6 million lines of Linux and found that redundancies, even when harmless, correlate highly with bugs. (See www.stanford.edu/~engler/p401-xie.pdf.)

They defined redundancies as code snippets that have no effect, such as assigning a variable to itself, initializing or setting a variable and then never using that value, dead code, or complex conditionals where a subexpression will never be evaluated since its logic is already part of a prior subexpression. They were clever enough to eliminate special cases like setting a memory mapped I/O port, since this sort of operation looks redundant but isn't.

Even harmless redundancies that don't create bugs are problems, since these functions are 50% more likely to contain hard errors than functions that do not have redundant code. Redundancy suggests the developers were confused and likely to make other mistakes nearby.
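To make those categories concrete, here is a contrived C function that contains one of each, followed by a cleaned-up equivalent. Both functions and their names are invented for this illustration and are not taken from the Stanford study. The first version is legal C and returns correct results, which is exactly why this kind of clutter survives.

```c
/* Before: works, but carries four harmless-looking redundancies. */
int clamp_percent_before(int value)
{
    int result = 0;     /* dead store: overwritten on every path */

    value = value;      /* self-assignment: no effect */

    /* The right-hand test can never change the outcome. */
    if (value < 0 || (value < 0 && value < -100)) {
        result = 0;
    } else if (value > 100) {
        result = 100;
    } else {
        result = value;
    }
    return result;
    result = -1;        /* dead code: never reached */
}

/* After: same behavior, nothing left to puzzle over. */
int clamp_percent_after(int value)
{
    int result = value;

    if (value < 0) {
        result = 0;
    } else if (value > 100) {
        result = 100;
    }
    return result;
}
```

Depending on the compiler and warning level, some of these may be flagged (the self-assignment, for example), which is one more reason to leave warnings switched on.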

Watch out for block-copied code. I am a great believer in reuse and encourage the exploitation of previously-tested chunks of source, but all too often developers copy code without sufficiently studying the implications. Are you really sure all of the variables are initialized as expected, even when this chunk is in a different part of the program? Might a subtle assumption about a mutex create a deadlock or race condition?

We copy code to save development time, but there is a cost. Study that code even more carefully than the new stuff you're writing from scratch. And when lint or the compiler warns about unused variables, take heed.[1] It may be a signal that more significant errors are lurking.

Reduce real-time code
Real-time code is error-prone, expensive to write, and even more costly to debug. If it's at all possible, move the time-critical sections to a separate task or section of the program. When time issues infiltrate the entire program, every bit of it becomes hard to debug.

Today we're building bigger and more complex systems than ever with debugging tools whose capabilities are less than those from a decade ago. Before processor speeds zoomed to near infinity and the proliferation of surface-mount packages eliminated the ability of coffee drinkers to probe ICs, the debugger of choice was an in-circuit emulator. These included real-time trace circuits, event timers, and even performance analyzers. Today we're saddled with BDM or JTAG debuggers.[2] Though nice for working on procedural problems, they offer essentially no resources for dealing with problems in the time domain.

Remember also these rules of thumb for scheduling. A system loaded to 90% doubles development time over one at 70% or less; at 95%, the schedule triples. Real-time development projects are expensive, highly loaded ones even more so.

Flow with grace
Flow, don't jump. Avoid continue, goto, break, and early return. These are all useful constructs, but generally reduce a function's clarity. Overused, they are the basic fabric of spaghetti code.
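One way to follow this advice is to fold the exit condition into the loop test, so the function has a single entry and a single exit. The sketch below is only an illustration; find_first() and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first occurrence of key, or -1 if absent. */
int find_first(const uint8_t *data, size_t len, uint8_t key)
{
    int    index = -1;
    size_t i     = 0;

    /* The loop condition says when to stop; no break is needed. */
    while (i < len && index < 0) {
        if (data[i] == key) {
            index = (int)i;
        }
        i++;
    }
    return index;   /* the only exit */
}
```

The version with a break or an early return is a line or two shorter. The payoff of the single exit shows up later, when someone has to add cleanup code or a log statement and needs to be sure it always runs.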

Refactor relentlessly
Extreme Programming and other agile methods emphasize the importance of refactoring (also known as rewriting crummy code). This is not really a new concept. Capers Jones, Barry Boehm, and others have shown that badly-written modules are much more expensive to beat into submission and maintain than ones with a beautiful structure.[3],[4]

Refactoring zealots demand we rewrite any code that can be improved. That's going too far. Our job is to create a viable product in a profitable way; perfection can never be a goal that overrides all other considerations. Yet some functions are so awful they must be rewritten.

If you're afraid to edit a function, if it breaks every time you modify a comment, then it needs to be refactored. Your finely honed sense as a professional developer that, well, we just better leave this particular chunk of code intact because no one dares mess with it, is a signal that it's time to drop everything else and rewrite the code to make it understandable and maintainable.

The second law of thermodynamics tells us that any closed system will head to more disorder; its entropy will increase. A program obeys this depressing truth. Successive maintenance cycles always increase the software's fragility, making each additional change that much more difficult. As Ron Jeffries pointed out, maintenance without refactoring increases the code's entropy by adding a “mess” factor (m) to each release. The cost to produce each release looks something like: (1+m)(1+m)(1+m)..., or (1+m)^n, where n is the number of releases.[5] Maintenance costs grow exponentially as we grapple with more and more hacks and sloppy shortcuts. This explains that bit of programmer wisdom that infuriates management: “the program is too much of a mess to maintain.”

Refactoring incurs its own cost, r. But it eliminates the mess factor, so releases cost 1+r+r+r . . ., which is linear.
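To get a feel for the difference, the two cost models can be put into a few lines of C. This is a sketch under one reading of the formulas: the cost of release n is (1+m) multiplied by itself n times without refactoring, and 1 plus n times r with it. The function names and the sample values of m and r are assumptions.

```c
/* Relative cost of release n when the mess factor m compounds. */
double cost_without_refactoring(double m, unsigned n)
{
    double cost = 1.0;

    for (unsigned i = 0; i < n; i++) {
        cost *= 1.0 + m;    /* each release multiplies the cost */
    }
    return cost;
}

/* Relative cost of release n when each release pays r to refactor. */
double cost_with_refactoring(double r, unsigned n)
{
    return 1.0 + (double)n * r;     /* each release adds a fixed r */
}
```

With m = r = 0.1, the tenth release costs about 2.6 times the baseline without refactoring versus 2.0 with it. By the twentieth release the gap has widened to roughly 6.7 versus 3.0.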

Luke Hohmann advocates “post release entropy reduction.”[6] He recognizes that we too often make some quick hacks to get the product out the door. These entail a maintenance cost, so it's critical we pay off the technical debt incurred from abusing the software. Maintenance is more than cramming in new features; it's reducing accrued entropy.

Refactor to sharpen unclear logic. If the code is a convoluted mess or even somewhat unclear, rewrite it to better demonstrate its meaning. Eliminate deeply nested loops or conditionals. No one is smart enough to understand all permutations of deeply nested IFs. Clarity leads to accuracy.
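Here is one way such a refactoring might look in C. The machine_state_t structure, the limits, and every function name are hypothetical. The first function buries a yes/no answer under four levels of nesting; the second gives each idea a name and then combines the names.

```c
#include <stdbool.h>

/* Hypothetical system state, for illustration only. */
typedef struct {
    bool door_closed;
    bool estop_released;
    int  temperature_c;
    int  supply_mv;
} machine_state_t;

#define MAX_TEMP_C     85
#define MIN_SUPPLY_MV  4500

/* Before: four levels of nesting for one yes/no answer. */
bool may_start_nested(const machine_state_t *s)
{
    bool ok = false;

    if (s->door_closed) {
        if (s->estop_released) {
            if (s->temperature_c <= MAX_TEMP_C) {
                if (s->supply_mv >= MIN_SUPPLY_MV) {
                    ok = true;
                }
            }
        }
    }
    return ok;
}

/* After: each idea gets a name, then the names are combined. */
static bool interlocks_ok(const machine_state_t *s)
{
    return s->door_closed && s->estop_released;
}

static bool environment_ok(const machine_state_t *s)
{
    return s->temperature_c <= MAX_TEMP_C && s->supply_mv >= MIN_SUPPLY_MV;
}

bool may_start(const machine_state_t *s)
{
    return interlocks_ok(s) && environment_ok(s);
}
```

Both versions return the same answer for every input. The second one can be checked by reading it once, and a new interlock has an obvious place to go.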

Employ standards and inspections
Write code according to your company's firmware standard. (This is not art class.) Use formal code inspections to enforce the standard and find bugs. Test only after conducting an inspection.

Inspections find bugs some 20 times more cheaply than debugging. They'll capture entire classes of problems you'll never pick up with conventional testing. Most studies show that traditional debugging checks only about half the code! Without inspections, you're most likely shipping a bug-ridden product.

It's interesting that the DO-178B standards for building safety-critical software rely heavily on the use of tools to ensure every line of code gets executed. These code coverage tools are a wonder, but no substitute for inspections.

I've written much about both of these issues, so rather than repeat myself, I'll refer you to the references at the end of this column.[7],[8],[9] Coding standards and inspections are deeply intertwined; neither will succeed without the other. And without standards and inspections, it's impossible to build great firmware.

Some advice is bad
When I was just starting my career, an older fellow told me what he called The Fundamental Rule of Engineering: if the damn thing works at all, leave it alone. It's an appealing concept, one I've used too many times over the years. The rule seems to work with hardware design, but it's a disaster when applied to firmware.

I believe that part of the software crisis stems from a belief that “pretty much works” is a measure of project success. Professionals, though, understand that developmental requirements are just as important as functional and operational requirements.

Make it work, make it beautiful, and make it maintainable.

Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges. Contact him at .

References
1. Jones, Nigel. “Lint,” Embedded Systems Programming, May 2002, p. 55.
2. Berger, Arnold and Michael Barr. “On-Chip Debug,” Embedded Systems Programming, March 2003, p. 47.
3. Boehm, Barry. Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall, 1982.
4. Jones, Capers. Applied Software Measurement: Assuring Productivity and Quality. New York: McGraw-Hill, 1991.
5. Jeffries, Ron. Extreme Programming Installed. Boston: Addison-Wesley, 2000.
6. Private communication.
7. Ganssle, Jack. “Better, Faster Code,” Embedded Systems Programming, August 1998, p. 117.
8. Ganssle, Jack. “A Firmware Development Standard,” Embedded Systems Programming, March 1998, p. 129.
9. Ganssle, Jack. “Firmware Development Standard, Part 2,” Embedded Systems Programming, April 1998, p. 97.
