In an article a couple of weeks ago I stressed the importance of aiming for zero
defects. A couple of private emails from readers took me to task,
arguing that perfection just isn’t attainable.
They’re right.
And they’re wrong.
Of course it’s impossible to make anything complex perfect.
Software will always be inherently problematic. A product comprising a
million lines of code is built from something like 20 million
keystrokes. Get just one wrong, for an error rate of one part in 2 *
10**7, and the system is defective in some measure. Unfortunately,
infallibility is not part of human nature.
But today’s state of the industry is unacceptable. Consider these
events:
• 2005: Toyota recalls 75,000 Prius hybrids due to a software defect
(mine included).
• 2004: Pontiac recalls the Grand Prix since the software didn’t
understand leap years. 2004 was a leap year.
• 2003: A BMW trapped a Thai politician when the computer crashed.
The door locks, windows, A/C and more were inoperable. Responders
smashed the windshield to get him out.
• 2002: BMW recalls the 745i since the fuel pump would shut off if
the tank was less than 1/3 full.
• 2001: 52,000 Jeeps recalled due to a software error that can shut
down the instrument cluster.
That’s just a handful of recent recalls, only in one industry, due
to buggy code.
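The leap-year recall in the list above is the sort of defect that a few lines of carefully tested code can rule out. What follows is only a minimal sketch of the Gregorian rule, not anything taken from a vendor's firmware; the function names is_leap_year and days_in_month are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Gregorian rule: every fourth year is a leap year, except century
   years, unless the century year is divisible by 400. */
bool is_leap_year(int year)
{
    if (year % 400 == 0) {
        return true;
    }
    if (year % 100 == 0) {
        return false;
    }
    return (year % 4) == 0;
}

/* Number of days in a month; month is numbered 1..12. */
int days_in_month(int year, int month)
{
    static const int days[12] = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };

    assert(month >= 1 && month <= 12);  /* anything else is a caller error */

    if (month == 2 && is_leap_year(year)) {
        return 29;
    }
    return days[month - 1];
}
```

With these definitions 2004 and 2000 are leap years while 1900 and 2003 are not. Exercising exactly those boundary years in a unit test is cheap compared to a recall.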
Then there’s this: I tried to buy some boat parts from Defender.com
and the total came to $84 trillion, more than the world’s GDP.
Happily, though, they didn’t charge for shipping or tax.
It is possible to greatly reduce software defects. We might
chuckle about some of the recalls experienced by the auto industry, but
they are working hard to improve the quality of their code. For
instance, they started the Motor Industry Software Reliability
Association (http://www.misra.org.uk/) to improve the reliability of C
and C++ firmware. I admire their efforts to act decisively. With
recalls costing millions of dollars they’ve wisely rejected the notion
that we can hack our way to success.
We EEs started the embedded world some 30 years ago. Most of us had
no software engineering knowledge at all, but learned assembly language
and cranked out code. A lot of it was awful, but it was possible to
beat the small programs of the 70s into submission using heroics.
That’s not likely with today’s huge apps. In my travels around the
embedded landscape I’m seeing more organizations starting to refine
their software practices; to employ more disciplined approaches in an
effort to tame schedules and rein in bugs. Things are getting better.
But we’ve a long way to go.
The reality is that it’s astonishing how well software and computers
do work. They control practically every aspect of our lives, most of
the time with little fuss. Surely, though, our bosses will demand we
reduce or eliminate the recalls, upgrades, and patches that are so
common now.
Jack G. Ganssle is a lecturer and consultant on embedded
development issues. He conducts seminars on embedded systems and helps
companies with their embedded challenges. Contact him at jack@ganssle.com. His website is www.ganssle.com.
The problem of software unreliability never seems to go away. Normally, software systems become less reliable as they grow in complexity. Even if all "accidental" defects could be corrected by our development tools, there isn't much they can do about the design errors. This type of error is a consequence of what Frederick Brooks calls the "essential complexity of software".
But all is not lost. There is a way to construct software such that the number of design errors decreases with complexity. This approach uses a signal-based, synchronous software model. In this model, logical contradictions are automatically nipped in the bud. By increasing complexity while retaining a fixed functionality, the software developer effectively increases the chances of finding all the logical contradictions in the design. A silver bullet? You bet.
- Louis Savain
"BMW recalls the 745i since the fuel pump would shut off if the tank was less than 1/3 full."
Did nobody notice this in testing? Before the car's release? Before the recall?
- Don Kelley
It's not just a software problem - it is as much a systems problem.
The battery in my friend's BMW 740i went dead. He was totally locked out, since the locks are entirely electric with no mechanical override. The dealership had no alternative except to hire a locksmith to force the internal door mechanism. Someone should get the Homer Simpson Golden Donut award for this idiotic oversight.
- Peter House
If cars have those kinds of problems, I think I'll stick to
riding my motorcycle.
However, with ABS, fuel management systems, and yes, even SRS
(airbags!) creeping in, I'm afraid that's not going to be a
safe respite either (!)
I do give credit to the auto industry for weathering 12/31/99,
after which civilization as we know it was supposed
to end.
I think it would be interesting for you to do an article on software
defects that actually resulted in favorable (but of
course unintended) outcomes. I'm expecting a short list of
course, but it would be interesting.
- Tiger Joe Sallmen
Small is better! The KISS principle could be defined as keep
it simple and smaller. Even if you have a big system, break
it down into smaller parts. And why on earth do cars have electric
door locks with no easy method to apply external backup
power in the event the onboard battery fails? Many bugs are
not software errors per se but system design errors. BMW
engineers successfully protected their in-tank, fuel-cooled fuel
pump from self-destruction in the event of inadequate fuel
to cool the pump, but why did a systems engineer use a
fuel-cooled pump in the first place? I think the idea of a
fuel-cooled fuel pump is just plain dumb.
- Don McCallum
Reminds me of my most recent encounter with a design oops.
I was piloting a friend's boat before dawn for a salmon fishing
contest. The onboard GPS uses soft keys for control of
its LCD screen brightness and contrast. When we shoved off,
I set the brightness low so I could see past the control center.
As daylight approached I went to bring the brightness
back up and "fat-fingered" the softkeys. The result was the
brightness turned down to zero (black display) and no way
to see the soft key controls in order to bring the brightness
back up. It seems like a terrible oversight to allow the
brightness to be able to go to zero when the user relies on
seeing the display in order to control the brightness. Anyway,
I used a pocket flashlight to light up the display enough
from "my side" to see the soft key options and bring back
the brightness enough to see the controls for adjusting the
brightness.
I wonder how many of those GPS units get returned for repairs
just because of that problem. Owner - I punched a few of
the wrong buttons and now the screen is blank. Vendor - We'll
send it back to the factory and see what they say. Factory
- Adjusted display brightness. Please see attached invoice.
- Greg Feneis
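One way to guard against the blank-screen trap described above is to clamp the brightness to a floor that is still readable. This is only a sketch under stated assumptions: the 0..100 scale, the floor of 10, and the names adjust_brightness, BRIGHTNESS_MAX and BRIGHTNESS_MIN_VISIBLE are all hypothetical and do not come from any actual GPS firmware.

```c
#include <assert.h>

#define BRIGHTNESS_MAX          100  /* assumed full-scale value */
#define BRIGHTNESS_MIN_VISIBLE  10   /* assumed lowest level still readable */

/* Apply a change requested by the soft keys, but never let the result
   leave the range in which the user can still read the display. */
int adjust_brightness(int current, int delta)
{
    /* Widen before adding so an extreme delta cannot overflow int. */
    long long requested = (long long)current + delta;
    int result;

    if (requested < BRIGHTNESS_MIN_VISIBLE) {
        result = BRIGHTNESS_MIN_VISIBLE;
    } else if (requested > BRIGHTNESS_MAX) {
        result = BRIGHTNESS_MAX;
    } else {
        result = (int)requested;
    }

    /* Postcondition: the display always stays visible. */
    assert(result >= BRIGHTNESS_MIN_VISIBLE && result <= BRIGHTNESS_MAX);
    return result;
}
```

Under these assumptions, a fat-fingered request such as adjust_brightness(20, -50) bottoms out at 10 rather than 0, so the soft keys stay readable without a flashlight.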
I would love to produce more reliable, well-written software. However, this is just not possible in some companies, as they say:
A) We're spending lots of money on employing expensive engineers, now you want expensive tools as well?!
B) It's only a small change, don't worry about testing... we need it done yesterday.
C) We haven't got time to rewrite legacy code or implement an RTOS and coding rules... we need you to work on this new feature instead.
- Malcolm Humphrey