How embedded projects run into trouble: Jack’s Top Ten – Number Four
In The Sign of Four we learn that Sherlock Holmes has a cocaine addiction. I have seen number 4 on my list of the top ten ways projects get into trouble so many times I’m tempted to self-medicate as well.
4 - Writing optimistic code
A picture is worth a thousand words. I was buying parts for my sailboat and this popped up:
$84 trillion for some paint and filters? Didn’t sound right to me.
This is an example of optimistic programming: assuming everything will go well, that there’s no chance of an error.
When I learned to program in 1969 we were told to check our goesintas and goesoutas. This was in the era of expensive computer time, limited memory, and carefully-rationed CPU cycles. Yet we wrote our FORTRAN code with a plethora of tests for conditions that could not possibly occur; impossible, because in our teenaged hubris we knew we were such brilliant developers that our code was blessed by the angels.
Alas, the built-in tests showed those were indeed fallen angels.
Today the resources in a $0.50 MCU dwarf those of the $10m Univac of yore. OK – we don’t have a bank of spinning tape drives, but 10 ns cycle times are astonishing compared to a 60s-era mainframe’s 750 ns. We can afford to write code that pessimistically checks for impossible conditions.
When it’s impossible for some condition to occur, it probably will.
For instance, a switch statement without a default case is an example of overly-optimistic programming. Nothing can possibly go wrong – can it? With few exceptions all switch statements need a default case, which, if taken, either logs debugging data or handles the error.
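Here is a minimal sketch of what that might look like in C. The motor commands, the function name, and the error counter are all invented for illustration, and fprintf() stands in for whatever logging your target actually has.

```c
#include <stdio.h>

/* Hypothetical command set, invented for this sketch. */
typedef enum {
    MOTOR_STOP    = 0,
    MOTOR_FORWARD = 1,
    MOTOR_REVERSE = 2
} motor_cmd_t;

/* Counts how many times the "impossible" branch was taken. */
unsigned int bad_cmd_count = 0;

/* Returns +1, -1, or 0 for the direction. The argument is an int on
 * purpose: a corrupted value can hold anything, not just the three
 * enumerators. */
int motor_direction(int cmd)
{
    switch (cmd) {
    case MOTOR_STOP:
        return 0;
    case MOTOR_FORWARD:
        return 1;
    case MOTOR_REVERSE:
        return -1;
    default:
        /* Cannot happen, so record it when it does, then fail safe. */
        bad_cmd_count++;
        fprintf(stderr, "motor_direction: bad cmd %d\n", cmd);
        return 0;
    }
}
```

Calling motor_direction(99) takes the default branch, bumps the counter, and returns the safe "stop" value instead of falling out of the switch silently.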
Exception handlers are exceptionally hard to get right. In fact, the reference (Simple Testing) at the end of this article shows that a large percentage of problems are in error handlers. It makes sense to carefully inspect these and include rock-solid code to handle the exception.
Then there’s the assert macro. Use it. A lot. Seed asserts everywhere. Research (see resources) has shown that code with lots of asserts is less buggy and delivered faster than code without. Asserts check for things that might go wrong. Done right, the assertion will fire close to the source of the bug, making it easy to figure out the problem.
Admittedly, assert does stupid things for an embedded system, but it’s trivial to rewrite it to be more sensible.
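The standard macro typically prints a message to stderr and calls abort(), neither of which means much on a board with no console. One possible rewrite, as a sketch only: the ASSERT name, the record structure, and the ADC example are my own assumptions, and the handler returns here so the code can run on a host. On a real target it would more likely put outputs in a safe state and reset.

```c
#include <stddef.h>
#include <stdint.h>

/* Last failure, kept where a debugger or post-reset code could find it.
 * Hypothetical structure, invented for this sketch. */
typedef struct {
    const char *file;
    int         line;
    uint32_t    count;
} assert_record_t;

volatile assert_record_t g_assert_record = { NULL, 0, 0u };

/* Records where the assertion fired. A real handler might also write the
 * record to nonvolatile memory and force a watchdog reset. */
void assert_failed(const char *file, int line)
{
    g_assert_record.file = file;
    g_assert_record.line = line;
    g_assert_record.count++;
}

#define ASSERT(cond) \
    do { if (!(cond)) { assert_failed(__FILE__, __LINE__); } } while (0)

/* Example use: scale a raw 10-bit ADC reading to millivolts, assuming a
 * 3300 mV reference. */
uint32_t adc_to_millivolts(uint32_t raw)
{
    ASSERT(raw <= 1023u);   /* a 10-bit converter cannot report more */
    return (raw * 3300u) / 1023u;
}
```

Because the record holds the file and line, the bread crumb survives even when there is nowhere to print it.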
Languages like Eiffel and Ada have super-asserts under the rubric of “Design by Contract” (DbC). In DbC it’s assumed that there is a contractually-mediated relationship between callee and caller. Language elements are provided to enforce those relationships and take appropriate action when one is violated.
If there’s one thing I wish we could add to C, it’s DbC. Contracts were proposed for C++20 but were pulled from the draft before the standard was finalized.
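Until the language helps, a pair of macros can approximate preconditions and postconditions. This is only a rough imitation of real DbC, and the REQUIRE and ENSURE names, the failure handler, and the isqrt() example are assumptions of mine rather than any standard facility.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Reports which side of the contract was broken, then stops. */
void contract_failed(const char *kind, const char *expr,
                     const char *file, int line)
{
    fprintf(stderr, "%s violated: %s (%s:%d)\n", kind, expr, file, line);
    abort();
}

/* Precondition: the caller's obligation. */
#define REQUIRE(expr) \
    ((expr) ? (void)0 \
            : contract_failed("precondition", #expr, __FILE__, __LINE__))

/* Postcondition: the function's promise. */
#define ENSURE(expr) \
    ((expr) ? (void)0 \
            : contract_failed("postcondition", #expr, __FILE__, __LINE__))

/* Integer square root: the largest r such that r * r <= n. */
uint32_t isqrt(uint32_t n)
{
    uint32_t r = 0u;

    /* The caller promises a 16-bit input, which keeps the linear
     * search below to at most 255 steps. */
    REQUIRE(n <= 0xFFFFu);

    while ((r + 1u) * (r + 1u) <= n) {
        r++;
    }

    /* The function promises the result really is the integer root. */
    ENSURE(r * r <= n && (r + 1u) * (r + 1u) > n);
    return r;
}
```

The precondition blames the caller and the postcondition blames the function, so a failure message already says which side to debug.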
Unfortunately, C is pretty much devoid of runtime error checking. Add some. Lots.
Another example: I never see malloc()’s return value checked. If malloc() fails, it will let you know. Yet most of us practice a needless bravado, just mallocing that bad boy. Now, we really don’t know how to deal with a malloc() failure, but at the least you can add debugging bread crumbs. Sans those, these sorts of problems are almost impossible to find. Or, if you’re really a pessimist, add the mem utility (see resources). It’s a few hundred lines of code that will capture about a dozen bugs common when using dynamic memory allocation.
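A bread crumb can be as small as a wrapper that notes where the failing request came from. The sketch below assumes a hosted build with stderr available; checked_malloc(), the MALLOC macro, and the failure counter are hypothetical names, not part of the mem utility.

```c
#include <stdio.h>
#include <stdlib.h>

/* Running total of failed allocations, readable from a debugger. */
unsigned long alloc_failures = 0;

/* Calls malloc() and leaves a bread crumb when it fails. The caller
 * still gets NULL back and still has to decide what to do about it. */
void *checked_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);

    /* malloc(0) may legitimately return NULL, so only a nonzero
     * request counts as a failure. */
    if (p == NULL && size > 0u) {
        alloc_failures++;
        fprintf(stderr, "malloc(%lu) failed at %s:%d\n",
                (unsigned long)size, file, line);
    }
    return p;
}

/* Drop-in spelling that captures the call site automatically. */
#define MALLOC(size) checked_malloc((size), __FILE__, __LINE__)
```

Writing buffer = MALLOC(len) in place of a bare malloc(len) costs nothing at the call site, and the first failure names the file and line that asked for the memory.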
Using arrays or pointers? Check indexes and validity. Way back in the 1950s even the earliest compiled languages checked every array reference to see if it was out of bounds. 60 years later, too often we don’t, and our favorite language provides no intrinsic mechanism to do so. (But there’s work being done to add pointer checks to C – see https://www.microsoft.com/en-us/research/uploads/prod/2018/09/checkedc-secdev2018-preprint.pdf).
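Since C will not do the check for us, one option is to route every access through a pair of small functions. This is a sketch under assumptions: the sample buffer, its size, and the function names are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CHANNELS 8u   /* hypothetical buffer size */

static int16_t samples[NUM_CHANNELS];

/* Counts rejected accesses so the problem is visible later. */
unsigned int range_errors = 0;

/* Stores a value only if the index is inside the array. */
bool sample_write(size_t index, int16_t value)
{
    if (index >= NUM_CHANNELS) {
        range_errors++;
        return false;
    }
    samples[index] = value;
    return true;
}

/* Reads a value only if the index and the output pointer are valid. */
bool sample_read(size_t index, int16_t *out)
{
    if (out == NULL || index >= NUM_CHANNELS) {
        range_errors++;
        return false;
    }
    *out = samples[index];
    return true;
}
```

An out-of-range write such as sample_write(8, 1) is refused and counted instead of quietly stomping on whatever lives after the array.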
I call seeding the code with these sorts of constructs proactive debugging. You know there will be bugs – there always are. Smart developers start debugging as they write the code, folding in snippets that will automatically find problems.
The best engineers are utter pessimists. They always do worst-case design. Analyze the circuit assuming that the 1K +/- 5% resistor sits at either end of its tolerance band. And figure the code will fail in mysterious and bizarre ways.
Resources
Investigating the Use of Analysis Contracts to Support Fault Isolation in Object Oriented Code, L. Briand, Y. Labiche, and H. Sun
Building bug-free O-O software: An introduction to Design by Contract (archive.eiffel.com/doc/manuals/technology/contract)
Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems, Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Renna Rodrigues, Xu Zhao, Yongle Zhang, Pranay U. Jain, and Michael Stumm
Mem utility: http://www8.cs.umu.se/~isak/snippets/