Verifying embedded software functionality: fault localization, metrics and directed testing
To illustrate the comparison of differences, consider the example in Figure 5.12. Recall that diff(π, π') = {33_1, 71_2} and diff(π, π'') = {76_1, 71_4}, as illustrated in the last two columns of Figure 5.12. Comparing {33_1, 71_2} with {76_1, 71_4}, we see that {76_1, 71_4} < {33_1, 71_2}, because statement instance 76_1 occurs after statement instance 33_1 in execution run π.

Summary of Trace Comparison Methods

In summary, which trace comparison metric is chosen and how the traces are compared is a matter of choice. However, based on the metric, we can choose the successful run from a pool of successful runs (say, the test suite of a program). In particular, suppose we have an execution trace α_f that is failed, meaning it shows an unexpected behavior. We want to compare it with another execution trace α_s such that:

1. α_s does not show any unexpected behavior, that is, the program outputs in run α_s are as per the developer's expectations, and
2. α_s is the closest to α_f in terms of the comparison metric being used.

Thus, based on the trace comparison metric being used, we choose the successful run against which we compare a given failed execution run, and report the difference between the two runs as a bug report.
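To make this selection step concrete, the sketch below chooses the closest passing run from a pool. It is a minimal illustration, not the implementation from the text: the trace representation (a list of statement instances), the difference metric (positions at which two traces disagree), and the ordering (a later first point of difference means a longer common prefix, hence a closer run) are all assumptions made here, and the names `diff` and `closest_passing_run` are invented for the example.

```python
def diff(trace_a, trace_b):
    """Illustrative difference metric: the positions (statement instances)
    at which two traces disagree. Any metric from this chapter could be
    substituted here."""
    length = max(len(trace_a), len(trace_b))
    pad = object()                      # sentinel for the shorter trace
    a = trace_a + [pad] * (length - len(trace_a))
    b = trace_b + [pad] * (length - len(trace_b))
    return [i for i in range(length) if a[i] != b[i]]

def closest_passing_run(failing_trace, passing_traces):
    """Pick the passing trace closest to the failing one: prefer a later
    first point of difference (longer common prefix), then fewer differences."""
    def score(passing):
        d = diff(failing_trace, passing)
        first = d[0] if d else len(failing_trace)
        return (-first, len(d))
    return min(passing_traces, key=score)

# Statement instances written as (statement id, occurrence count).
failed = [(10, 1), (33, 1), (71, 2), (76, 1)]
passing_pool = [
    [(10, 1), (33, 1), (71, 2), (80, 1)],   # differs late  -> closer
    [(10, 1), (40, 1), (50, 1), (80, 1)],   # differs early -> farther
]
best = closest_passing_run(failed, passing_pool)
print("bug report (differing instances):", diff(failed, best))
```

The difference between the failing run and the chosen passing run is what would be reported to the developer as the bug report.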
Directed Testing Methods

So far, we have studied different debugging methods which either (a) analyze the dependencies in a failed trace or (b) compare a failed trace with a "chosen" successful trace. These methods can be combined with testing techniques in a post-mortem fashion: given a program P, we generate a test suite (a set of test inputs) for P, and the traces for the test inputs in the test suite are then subjected to analysis. On the other hand, one could envision testing methods that are more directed toward exposing errors.

Conventional software testing methods are often driven by the sole goal of coverage. What do we mean by coverage in the context of test generation? Consider the statement coverage criterion: a set of test inputs S achieves statement coverage if each statement in the program appears in the trace of at least one test input in S. Other coverage criteria, such as branch-edge coverage, can be defined similarly. Standard test coverage criteria such as statement coverage provide only weak guarantees about the software's reliability. Statement coverage merely says that each statement in the program is executed for some test input in the test suite. However, this does not mean that executing the tests in the test suite will expose the bug in a buggy statement: if a statement is buggy, executing it does not guarantee that the bug manifests. Of course, if a buggy statement is executed for some input, there is a better chance of the bug manifesting in the form of an unexpected output.

Ideally, what we would like to do via systematic program testing is to expose the different paths in a program. However, enumerating all paths in a program and finding inputs that exercise these paths is not easy. The number of program paths is exponential in the number of branch instances, and the number of branch instances (individual executions of a branch statement) can itself be very large. Exhaustive testing by trying out all inputs is simply not an option with real-life embedded software, because there are too many inputs to test. Often, there are even infinitely many inputs: consider an image compression program; we cannot test it with all possible input images.
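The gap between statement coverage and bug exposure can be seen on a small, hypothetical example (not taken from the text): a single test achieves full statement coverage of a buggy function, executes the buggy statement, and still passes.

```python
def max_of_three(a, b, c):
    """Return the largest of a, b, c -- with a deliberately seeded bug."""
    m = a
    if b > m:
        m = b
    if c > a:          # BUG: should compare against m, not a
        m = c
    return m

# One test input that executes every statement, including the buggy
# comparison, yet passes because the faulty comparison happens to give
# the right answer for these values.
assert max_of_three(1, 2, 3) == 3

# The bug manifests only for inputs the covering test never tries:
print(max_of_three(2, 5, 3))   # prints 3, although the largest value is 5
```

A test suite consisting only of the first input would report 100% statement coverage while leaving the defect undetected; a directed method would instead try to steer test generation toward inputs such as (2, 5, 3) that expose the error.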

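As a rough, back-of-the-envelope illustration of the path-explosion point above (assuming each branch instance is an independent two-way decision, which is the worst case):

```python
def path_upper_bound(branch_instances):
    """Worst-case number of control-flow paths when every branch instance
    is an independent two-way decision."""
    return 2 ** branch_instances

# A loop whose body contains one branch, executed n times, contributes
# n branch instances -- and up to 2**n distinct paths.
for n in (10, 32, 100):
    print(f"{n} branch instances -> up to {path_upper_bound(n)} paths")
```

Even for a modest number of branch instances, the number of paths outstrips what any test suite can enumerate, which is why directed testing aims to choose paths judiciously rather than exhaustively.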
