Software engineering metrics we need
Engineering is about numbers; firmware people need to collect metrics.
In a recent article ("Start collecting metrics now") I stressed the importance of collecting metrics to understand and improve the software engineering process. It's astonishing how few teams take any measurements, which means few have any idea whether they are improving or getting worse. Or whether their efforts are world class or constitute professional malpractice.
Two of the easiest and most important metrics are defect potential and defect removal efficiency. Capers Jones, one of the more prolific, and certainly one of the most important, researchers in software engineering, pioneered these measurements.
Defect potential is the total number of bugs found during development (tracked after the compiler gives a clean compile; ignore the syntax errors it finds) and for the first 90 days after shipping. Every bug reported, every mistake corrected in the code, counts. Count even those that Joe fixes while he is buried in the IDE doing unit tests. No names need be tracked; this is all about the code, not the personalities.
Defect removal efficiency is simply the percentage of those removed prior to shipping. One would hope for 100% but few achieve that.
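To show how the two numbers relate, here is a minimal sketch in Python. It assumes each release keeps nothing more than two tallies: bugs found before shipping and bugs reported in the first 90 days afterward. The `DefectLog` class, its field names, and the sample counts are hypothetical and come from no particular tool or project.

```python
from dataclasses import dataclass


@dataclass
class DefectLog:
    """Hypothetical per-release tally; counts only, no names."""
    found_before_ship: int    # bugs fixed in development, after a clean compile
    found_first_90_days: int  # bugs reported in the first 90 days after shipping

    @property
    def defect_potential(self) -> int:
        # Every bug found in development plus the first 90 days in the field.
        return self.found_before_ship + self.found_first_90_days

    @property
    def removal_efficiency(self) -> float:
        # Percentage of the defect potential removed prior to shipping.
        if self.defect_potential == 0:
            raise ValueError("no defects logged; efficiency is undefined")
        return 100.0 * self.found_before_ship / self.defect_potential


# Made-up numbers, for illustration only.
log = DefectLog(found_before_ship=470, found_first_90_days=30)
print(log.defect_potential)              # 500
print(f"{log.removal_efficiency:.1f}%")  # 94.0%
```

Two counters per release are enough to get started; anything finer-grained (where the bug was found, what kind it was) can be added later to feed the root cause analysis.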
These two metrics are then used to do root cause analysis: Why did a bug get shipped? What process can we change so it doesn't happen again? How can we tune the bug filters to be more effective?
Doing this well typically leads to a 10x reduction in shipped bugs over time. Here's some data from a client I worked with:
Over the course of seven quarters, they reduced the number of shipped bugs by better than an order of magnitude by doing this root cause analysis.
What are common defect potentials? They are all over the map. Malpractice is when we ship 50 bugs/1,000 lines of code (KLOC). 1/KLOC is routinely achieved by disciplined teams, and 0.1/KLOC by world-class outfits.
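To compare your own project against those figures, normalize shipped bugs to KLOC. The sketch below is one way to do it; the function name, the benchmark labels, and the sample project (120 shipped bugs in 80,000 lines) are made up for illustration.

```python
def shipped_density(shipped_bugs: int, lines_of_code: int) -> float:
    """Return shipped bugs per 1,000 lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return shipped_bugs / (lines_of_code / 1000)


# Figures quoted in the text, in shipped bugs per KLOC.
BENCHMARKS = {"world class": 0.1, "disciplined": 1.0, "malpractice": 50.0}

# Hypothetical project: 120 shipped bugs in 80,000 lines.
density = shipped_density(120, 80_000)

for label, level in BENCHMARKS.items():
    relation = "at or below" if density <= level else "above"
    print(f"{density:.2f}/KLOC is {relation} the {label} figure of {level}/KLOC")
```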
According to data Capers Jones shared with me, software in general has a defect removal efficiency of 87%. Firmware scores a hugely better 94%. We embedded people do an amazingly good job. But given that defect injection rates run 5 to 10% (that is, 50,000 to 100,000 bugs in a million LOC), a 94% removal efficiency means we're shipping with 3,000 or more bugs.
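The arithmetic behind that estimate can be sketched in a few lines. This assumes the injection rate is read as defects per line of code written, which is the reading that matches the figure above; the function and parameter names are mine.

```python
def shipped_bugs(lines_of_code: int, injection_rate: float,
                 removal_efficiency: float) -> float:
    """Estimate the bugs that escape to the field.

    injection_rate     -- defects injected per line (0.05 means 5%); assumed reading
    removal_efficiency -- fraction removed before shipping (0.94 means 94%)
    """
    defect_potential = lines_of_code * injection_rate
    return defect_potential * (1.0 - removal_efficiency)


# A million lines at the firmware average of 94% removal efficiency.
for rate in (0.05, 0.10):
    escaped = shipped_bugs(1_000_000, rate, 0.94)
    print(f"{rate:.0%} injection rate: about {escaped:,.0f} shipped bugs")
```

At 5% the estimate is about 3,000 shipped bugs, and at 10% about 6,000. Swap in 0.87 for the general-software average to see how much worse the picture gets.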
What are your numbers? Do you track this, or anything?
Jack G. Ganssle is a lecturer and consultant on embedded development
issues. He conducts seminars on embedded systems and helps companies
with their embedded challenges, and works as an expert witness on
embedded issues. Contact him at firstname.lastname@example.org. His website is