James Grenning (www.renaissancesoftware.net), whose book Test Driven Development in C will be out in the fall, graciously agreed to be interviewed about TDD (test driven development). The first part of our talk ran last month at www.embedded.com/224200702, where you can also see reader comments.
Jack: How do you know if your testing is adequate? TDD people (heck, practically everyone in this industry) don't seem to use MC/DC, npath, or cyclomatic complexity to prove they have run at least the minimum number of tests required to ensure the system has been adequately verified.
Founder, Renaissance Software
James: You are right; TDD practitioners do not generally measure these things. There is nothing said in TDD about these metrics. It certainly does not prohibit them. You know, we have not really defined TDD yet, so here goes. This is the TDD micro cycle:
• Write a small test for code behavior that does not exist
• Watch the test fail, maybe not even compile
• Write the code to make the test pass
• Refactor any messes made in the process of getting the code to pass
• Continue until you run out of test cases
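As an illustration of the micro cycle above, here is a minimal sketch in C. The function clamp_to_percent and the test names are hypothetical, invented for this example, and plain assert stands in for a unit test harness. The comments mark which step of the cycle each piece belongs to.

```c
#include <assert.h>

/* Step 3: the simplest production code that makes the tests below pass.
 * Before this function existed, the tests did not even compile (step 2).
 * clamp_to_percent is a hypothetical example, not code from the interview. */
int clamp_to_percent(int value)
{
    if (value < 0) {
        return 0;
    }
    if (value > 100) {
        return 100;
    }
    return value;
}

/* Step 1: small tests, each one written before the behavior it checks
 * and watched failing before the matching branch above was added. */
void test_value_in_range_is_unchanged(void)
{
    assert(clamp_to_percent(42) == 42);
}

void test_negative_value_clamps_to_zero(void)
{
    assert(clamp_to_percent(-5) == 0);
}

void test_value_above_100_clamps_to_100(void)
{
    assert(clamp_to_percent(250) == 100);
}

/* Step 5: keep adding tests here until you run out of cases. */
void run_clamp_tests(void)
{
    test_value_in_range_is_unchanged();
    test_negative_value_clamps_to_zero();
    test_value_above_100_clamps_to_100();
}
```

Calling run_clamp_tests() from main, or from whatever test runner you use, completes silently when every check passes and aborts at the first failing one.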
Maybe you can see that TDD would do very well with these metrics. Coverage will be very high, measured by line or path coverage.
One reason these metrics are not the focus is that there are some problems with them. It is possible to get a lot of code coverage and not know if your code operates properly. Imagine a test case that executes fully some hunk of code but never checks the direct or indirect outputs of the highly covered code. Sure it was all executed, but did it behave correctly? The metrics won't tell you.
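A small sketch can make this concrete. The function average_of_two and its deliberate bug are hypothetical, made up for this illustration. The first test executes every line of the function, so a line coverage tool would report 100%, yet it can never fail. Only the second one checks an output.

```c
#include <assert.h>

/* Hypothetical production code with a deliberate bug: it should average
 * two readings but forgets to divide by two. */
int average_of_two(int a, int b)
{
    return a + b; /* BUG: should be (a + b) / 2 */
}

/* Executes every line of average_of_two(), so line coverage is complete,
 * but the result is thrown away and the test cannot fail. */
void test_average_covers_but_checks_nothing(void)
{
    (void)average_of_two(10, 20);
}

/* The same call with a check on the output. Returns 1 if the result is
 * correct and 0 if it is not; a real test would assert on it directly. */
int average_check_passes(void)
{
    return average_of_two(10, 20) == 15;
}
```

With the bug in place, test_average_covers_but_checks_nothing() still passes while average_check_passes() returns 0, which is exactly the gap a coverage number cannot show.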
Even though code coverage is not the goal of TDD, it can be complementary. New code developed with TDD should have very high code coverage, along with meaningful checks that confirm the code is behaving correctly. Some practitioners do a periodic review of code coverage, looking for code that slipped through the TDD process. I've found this to be useful, especially when a team is learning TDD.
There has been some research on TDD's impact on cyclomatic complexity. TDD's emphasis on testability, modularity, and readability leads to shorter functions. Generally, code produced with TDD shows reduced cyclomatic complexity. If you Google for “TDD cyclomatic complexity,” you can find articles supporting this conclusion.
Jack: Who tests the tests?
James: In part, the production code tests the test code. Bob Martin wrote a blog post a few years ago describing how TDD is like double entry accounting. Every entry is a debit and a credit. Accounts have to end up balanced or something is wrong. If there is a test failure, it could be due to a mistake in the test or the production code. Copy and paste of test cases is the biggest source of wrong test cases that I have seen. But it's not a big deal because the feedback comes just seconds after the mistake, making it easy to find.
Also, the second step in the TDD micro cycle helps get a test case right in the first place. In that step, we watch the new test case fail prior to implementing the new behavior. Only after seeing that the test case can detect the wrong result do we make the code behave as specified by the test case. So, at first a wrong implementation tests the test case. After that, the production code tests the test case.
Another safeguard is to have others look at the tests. That could be through pair programming or test reviews. Actually, on some teams we've decided that doing test reviews is more important than reviewing production code. The tests are a great place to review interface and behavior, two critical aspects of design.
Jack: As has been observed, all testing can do is prove the presence of bugs, not the absence. A lot of smart people believe we must think in terms of quality gates: multiple independent activities that each filter defects. So that includes requirements analysis, design reviews, inspections, tests, and even formal verification. Is this orthogonal to TDD approaches, and how do TDD practitioners use various quality gates?
James: TDD does not try to prove the presence of bugs; it is a defect prevention technique (www.renaissancesoftware.net/blog/archives/16). People make mistakes regularly during development, but in the TDD micro cycle, the mistakes are immediately brought to the developer's attention. The mistake is not around long enough to ever make it into a bug-tracking system.
I think TDD is only part of the answer. Reviews, inspections, and pair programming are orthogonal and complementary to TDD.
There is another form of TDD, a more requirements-centric activity called Acceptance Test Driven Development (ATDD). In ATDD, the customer representative defines tests that describe the features of the system. Each iteration, the team works to complete specific stories defined by the customer. A story is like a use case, or a specific usage scenario. The acceptance tests describe the definition of done for the story. These acceptance tests are also automated. If the new and all prior tests pass, the story is done. That is an important quality gate. Don't get me wrong, I am a proponent of reviews, but I think that TDD is superior to inspections at preventing defects.
I did a case study on the Zune bug that illustrates my point. This bug caused the 30GB Zune model to freeze on New Year's Eve 2008. My informal research on the bug (www.renaissancesoftware.net/blog/archives/38) showed that most online code pundits who inspected the faulty function did not correctly identify the whole problem. I was in the group that got it almost right; a.k.a. wrong. Then I wrote a test. The test cannot be fooled as easily as a human. So, I think we need both inspections and tests.
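To show the kind of test that catches this class of bug, here is a sketch based on the widely circulated description of the faulty day-to-year loop. It is not the actual Zune source and not the test James wrote: the function names, the 1980 origin year, the 1-based day count, and the max_loops guard (added only so a test can observe the hang instead of freezing) are all assumptions for illustration.

```c
#include <assert.h>

#define ORIGIN_YEAR 1980      /* assumed epoch for this sketch */
#define STUCK (-1)            /* returned when the loop stops making progress */
#define DEC_31_2008 10593     /* 28 years (7 of them leap) after 1 Jan 1980, plus 366 */

static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Faulty shape: when days == 366 in a leap year, neither branch changes
 * days or year, so the loop can never exit. max_loops is a test-only
 * guard that turns the infinite loop into an observable result. */
int faulty_year_from_days(int days, int max_loops)
{
    int year = ORIGIN_YEAR;
    while (days > 365) {
        if (max_loops-- <= 0) {
            return STUCK;
        }
        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}

/* One possible correction: compare against the length of the current year. */
int year_from_days(int days)
{
    int year = ORIGIN_YEAR;
    int length = is_leap_year(year) ? 366 : 365;
    while (days > length) {
        days -= length;
        year += 1;
        length = is_leap_year(year) ? 366 : 365;
    }
    return year;
}

/* The boundary case that inspection tends to miss: the last day of a leap year. */
void test_last_day_of_leap_year(void)
{
    assert(faulty_year_from_days(DEC_31_2008, 1000) == STUCK);
    assert(faulty_year_from_days(DEC_31_2008 - 1, 1000) == 2008);
    assert(year_from_days(DEC_31_2008) == 2008);
    assert(year_from_days(DEC_31_2008 + 1) == 2009);
}
```

The day before and the day after both behave in the faulty version, which may be part of why reading the code is less reliable than running it on the boundary.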
Jack: Some systems are complex or control processes that respond slowly. What happens when it takes hours to run the tests?
James: For TDD to be a productive way to work, the micro cycle has to be very short in duration. This pretty much rules out going to the target during the micro cycle, and it also means that unit test execution must be kept short.
To avoid the target bottleneck, I recommend that TDD practitioners first run their unit tests on their development system. If you are practicing the SOLID design principles, it is natural to manage the dependencies on the hardware and operating system.
If there is a lengthy control process being test driven, we need to take control of the clock. If we are managing dependencies, this is not hard. A time-driven event eventually resolves to a function call. The test fixture can call the event processing code just as well as some operating system or interrupt-based event handler can. If your code needs to ask some time service what the current millisecond is, we can intercept those calls and mimic any time-based scenario we like without any of the real delays slowing the test execution time.
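One way to intercept the clock, sketched here under stated assumptions, is to reach the time service through a function pointer that tests can redirect to a fake. Everything named below (TimeService_SetSource, SoakController, the two-hour soak time) is hypothetical and exists only for this illustration. A link-time substitution of the time service would be an equally valid approach.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical time service. Production code would point this at the real
 * millisecond counter; tests point it at a fake they fully control. */
typedef uint32_t (*MillisecondSource)(void);

static MillisecondSource get_milliseconds;

void TimeService_SetSource(MillisecondSource source)
{
    get_milliseconds = source;
}

/* Hypothetical slow process: a heater that must switch off after two hours. */
#define SOAK_TIME_MS (2u * 60u * 60u * 1000u)

typedef struct {
    uint32_t started_at;
    int heater_on;
} SoakController;

void Soak_Start(SoakController *self)
{
    self->started_at = get_milliseconds();
    self->heater_on = 1;
}

/* In production this would be called from a periodic tick or timer
 * interrupt handler; the test fixture simply calls it directly. */
void Soak_Tick(SoakController *self)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    if (self->heater_on &&
        (uint32_t)(get_milliseconds() - self->started_at) >= SOAK_TIME_MS) {
        self->heater_on = 0;
    }
}

/* Test-side fake clock. */
static uint32_t fake_now;

static uint32_t FakeTime_Get(void)
{
    return fake_now;
}

void test_heater_turns_off_after_soak_time(void)
{
    SoakController soak;

    TimeService_SetSource(FakeTime_Get);
    fake_now = 1000;
    Soak_Start(&soak);

    fake_now += SOAK_TIME_MS - 1;   /* one millisecond short of two hours */
    Soak_Tick(&soak);
    assert(soak.heater_on == 1);

    fake_now += 1;                  /* exactly two hours elapsed */
    Soak_Tick(&soak);
    assert(soak.heater_on == 0);
}
```

The two-hour scenario runs in well under a millisecond on the development system, because the only thing that advances time is the test itself.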
With that said about unit tests, you might have the same issue when it comes to a more thorough integration or system test. If you have automated some of these tests, and you rely on using the real clock, tests could take a long time to run. But that may not be a problem, because the cadence of acceptance and system tests does not need to be as fast as unit tests. We'd like to run these longer tests automatically as part of a continuous integration system.
Jack: Let's move on to my business concerns. Through incremental delivery, TDD promises to produce a product that closely aligns with the customer's needs. That is, at each small release the customer can verify that he's happy with the feature, and presumably can ask for a change if he's not. “Customer” might refer to an end-user, your boss, the sales department, or any other stakeholder. If there's no barrier to changes, how does one manage or even estimate the cost of a project?
James: This is more of an Agile requirements management issue than TDD, but that's OK. Let me start by saying that it is a misconception that there is no barrier to requirements changes and feature creep. For a successful outcome, requirements have to be carefully managed.
In Agile projects there is usually a single person who is responsible for driving the development to a successful delivery. Some refer to this as the customer or the product owner (PO). The product owner might be from marketing, product management, or systems engineering. She usually heads a team of skilled people who know the product domain, the market, the technology, and testing. She is responsible for making trade-offs. Team members advise her, of course.
To manage development, we create and maintain something called the product backlog. The backlog is the list of all the features (we can think of) that should go into the product. There is a strong preference for work that is visible to the PO over work that only engineers understand. It is mostly feature oriented, not engineering-task oriented, focusing on value delivery. We prevent surprises by taking three-month engineering deliverables and splitting them into a series of demonstrable bits of work that our customer cares about.
The product owner's team can add things to the backlog, but in the end, deciding what goes into a specific iteration is the PO's responsibility. For highly technical stories, a hardware engineer might play the role of the customer. For manufacturability stories, built-in test for example, a manufacturing engineer or QA person might play the role of the customer. You can see there may be many “customers,” but the final call on what is worked on at what time is up to the product owner.
You also ask about estimating time and cost. There is no silver bullet here, but there is a realistic process Agile teams use. When an initial backlog is created, all the backlog items or stories are written on note cards and spread out on a table. (A story is not a specification, but rather a name of a feature or part of a feature.) Engineers get together and do an estimation session. Each story is given a relative difficulty on a linear scale. The easiest stories are given the value of one story point. All stories labeled with a one are of about the same difficulty. A story with a value of two is about twice as difficult to implement as a one. A five is about five times as difficult. I am sure you get the idea.
Once all the stories have a relative estimate, we attempt to calibrate the plan by choosing the first few iterations and adding up their story points. We're estimating the team's velocity in story points per iteration. The initial estimate for the project, in iterations, would be the total of all story points divided by the estimated velocity. This will probably tell us that there is no way to make the delivery date. But it's just an estimate; next we'll measure.
As we complete an iteration, we calculate the actual velocity simply by adding the point values of the completed stories. The measured velocity provides feedback that is used to calibrate the plan. We get early warning of schedule problems, rather than 11th-hour surprises. If the projected date is too late for the business needs, managers can use the data to manage the project. The PO can carefully choose stories to do and not do to maximize delivered value. The business could look at adding people before it is too late, or change the date.
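The arithmetic behind the estimate and the recalibration can be sketched in a few lines of C. The function names and all the numbers (a 120-point backlog, an estimated velocity of 20, a first iteration that completes 15 points) are hypothetical, chosen only to show how a measured velocity changes the projection.

```c
#include <assert.h>

/* Iterations needed to finish remaining_points at the given velocity,
 * rounded up to whole iterations. Returns -1 if velocity is not positive. */
int iterations_needed(int remaining_points, int velocity)
{
    if (velocity <= 0) {
        return -1;
    }
    return (remaining_points + velocity - 1) / velocity;
}

/* Measured velocity: the sum of the point values of the stories
 * completed in one iteration. */
int measured_velocity(const int completed_story_points[], int count)
{
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += completed_story_points[i];
    }
    return total;
}

void test_plan_recalibration(void)
{
    const int done_in_first_iteration[] = {5, 3, 3, 2, 2};
    int velocity;

    /* Initial plan: 120 points at an estimated 20 points per iteration. */
    assert(iterations_needed(120, 20) == 6);

    /* The first iteration actually completes 15 points. */
    velocity = measured_velocity(done_in_first_iteration, 5);
    assert(velocity == 15);

    /* 105 points remain; at 15 per iteration that is 7 more iterations,
     * so the projection moves from 6 iterations to 8 in total. */
    assert(iterations_needed(120 - velocity, velocity) == 7);
}
```

In this made-up case the slip from six iterations to eight is visible after the first iteration, which is the early warning described above.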
Jack: Engineering is not a stand-alone activity. While we are designing a product, the marketing people make advertising commitments, tech writers create the user's manual, trade shows are arranged, accounting makes income and expense projections, and a whole host of other activities must come together for the product's launch. TDD says the boss must accept the fact that there's no real schedule, or at least it's unclear which features will be done at any particular time. How do you get bosses to buy into such vague outcomes?
James: Jack, there goes that misconception again on “no real schedule.” There is a schedule, probably a more rigorous and fact-based schedule than most developers are used to working with. The Agile approach can be used to manage to a specific date, or to specific feature content.
TDD is just part of the picture. The team activities should encompass cross-functional needs. While the product is evolving, the team's progress is an open book. The user documentation, marketing materials, etc., can and should be kept up to date. I don't try to get bosses to buy into vague outcomes. I get bosses that are not satisfied with vaguely “working harder/smarter next time.” I get bosses interested who want predictability and visibility into the work. I get bosses that want to see early and steady progress through the development cycle, ones that are not so interested in doing more of the same thing and expecting different results.
Jack: Now for a hardball question: Is it spelled agile or Agile?
James: Saving the toughest for last, setting me up. Someone with greater command of the language better take that one. Like any label, agile is aging and getting diluted. My real interest, and I think yours too, is advancing how we develop embedded software and meet business needs. To me, many of the ideas in Agile Development can really help teams. But it's important to consider it a start, not the destination.
Jack, thanks again for the chat. It's always good talking to you.
Jack: Thanks, James, for your insightful answers. I hope the readers will respond with their thoughts and experiences using TDD in their workplace.
Jack Ganssle is a lecturer and consultant specializing in embedded systems' development issues.