# Verifying embedded software functionality: Combining formal verification with testing

**Editor’s Note:** In the final part of a four-part series, Abhik Roychoudhury, author of __Embedded Systems and Software Validation__, **explains the usefulness of formal verification techniques as a complement to traditional software testing techniques.**

As noted in __Part 1__, __Part 2__, and __Part 3__ of this series, dynamic or trace-based checking methods are very useful for testing-oriented debugging. In other words, the software validation flow revolves around program testing—we test a program against selected test cases, and for the failed test cases (the ones for which the program output does not match the programmer’s “expectations”), we analyze the traces for these test cases using dynamic checking methods.

However, program testing, by its very nature, is nonexhaustive. It is not feasible to test a program against all possible inputs. As a result, for safety-critical software it is crucial to employ checking methods that go beyond testing-oriented debugging.

Currently, many functionalities in our daily lives are software controlled—functionalities that were earlier controlled by electrical/mechanical devices. Two specific application domains where software is increasingly being used to control critical functionalities are automotive and avionics.

Naturally, for such software it is critical to give certain guarantees about the software. Unfortunately, testing cannot deliver any guarantees about software behavior. Testing can only be used to show the presence of bugs; it cannot be used to guarantee the absence of bugs.

For safety-critical software, it is necessary to formally check properties of the software and provide guarantees about its behavior. However, this is a tall order. How do we specify the guarantees that we need?

And, how do we formally check programs for such guarantees? Our first thought is to employ automated checking methods such as model checking, which checks a temporal logic property (a property constraining the order in which events can occur) against a finite-state transition system.

Model checking is not directly applicable for software verification, because any program in a modern programming language (such as C or Java) contains variables from infinite domains (integers, floating-point numbers).

The presence of a single integer variable in the program makes it infinite-state, because the variable may take up infinitely many values. The solution to the problem is to come up with abstractions of the variable values.

That is, instead of maintaining the exact values of the program variables, we maintain only a finite abstraction of the variable values. We first give an example of the abstract representation of variable values, and then explain how the abstraction is achieved.

Consider the program fragment in **Figure 5.14 below**. This program contains only one integer variable x. Suppose we want to prove that x == 0 at the end of the program. One possibility is to exactly maintain the control flow and data variable values, and check that x is indeed zero at the end of the program. The problem with this approach is that x can, in general, acquire infinitely many values—in other words, the amount of memory required to store the value of x is unbounded. Indeed, in any implementation of integer data types, a finite number of bits (say 32 or 64 bits) will be allocated to represent an integer.


**Figure 5.14. Program fragment for showing data abstraction.**

In our reasoning, we do not want to be restricted by the exact implementation of integer data types, nor do we need to be. Instead, we can abstract the value of the program variables by maintaining the true/false values of certain propositions about the program variables.

Simply put, we do not maintain the exact values of the program variables; instead we only maintain the true/false answers to certain questions on the program variable values. Thus, in the preceding example, instead of maintaining the exact value of x, we might only maintain whether x == 0.

This may be denoted by a boolean variable [x == 0], which is true if x == 0 and false otherwise. With such an abstraction, building a finite-state transition system is easy.

It is a graph whose nodes are the (abstracted) program states, and each edge denotes a transition between two states via the execution of a program statement. A program fragment and its finite-state transition system are shown in **Figure 5.15 below**.

**Figure 5.15. A program fragment and its abstracted transition system.**

In the finite-state transition system shown in Figure 5.15, each program state (i.e., each node in the graph) corresponds to:

**1 –** The value of the program counter (i.e., where the control flow is), and
**2 –** The value of the boolean variable [x == 0].

In other words, the control flow in the program is represented exactly—we do not employ any abstraction there. On the other hand, while representing the value of the data variables, we do not maintain the exact values—we only maintain the true/false answers to certain questions on the program variables.

In this case, the only program variable is x, whose exact value is not maintained—instead we only maintain whether x is equal to 0. In other words, we abstract a program P with another program P’ such that

**1 –** The control flow in P and P’ is identical, and
**2 –** The only data type in program P’ is boolean; that is, all variables in program P’ are boolean variables.

The boolean variables in program P’ are, of course, related to the data variables in program P. Indeed, each boolean variable in P’ denotes a true/false question about the variable values in program P. Given a set of true/false questions about the variables in P, deriving program P’ from program P is automatic.

Program P’ really represents a finite-state transition system that can be subjected to state space exploration via model checking. The states of the transition system correspond to

**(a)** the value of the program counter, and **(b)** the value of the boolean variables.

Assuming that the number of possible program counter values (roughly corresponding to the number of lines in the program) is finite, and the number of boolean variables is finite, the number of states in the transition system is indeed finite.
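To make the construction concrete, here is a small Python sketch (illustrative only, not from the book) that builds such an abstract transition system for a stand-in three-statement fragment, tracking only the program counter and the single predicate [x == 0]. Note how the state count stays within the bound of (number of pc values) × 2^(number of predicates):

```python
# Illustrative sketch: build the abstract transition system for the
# stand-in fragment  1: x = 0;  2: x = x + 1;  3: x = x - 1,
# tracking only (program counter, value of the predicate [x == 0]).

def post(stmt, x_is_zero):
    """Possible post-values of [x == 0]; a set of outcomes models the
    information lost by the abstraction (non-determinism)."""
    if stmt == "x = 0":
        return {True}                      # x == 0 certainly holds afterwards
    if stmt in ("x = x + 1", "x = x - 1"):
        # from a state where x == 0, the result is certainly nonzero;
        # otherwise the abstraction cannot tell (x may become 0)
        return {False} if x_is_zero else {True, False}
    raise ValueError(stmt)

def build_transition_system(stmts):
    """Explore all abstract states (pc, [x == 0]) reachable from an
    unknown initial value of x."""
    states, edges = set(), set()
    frontier = {(0, True), (0, False)}     # pc 0 = before statement 1
    while frontier:
        pc, p = frontier.pop()
        if (pc, p) in states:
            continue
        states.add((pc, p))
        if pc < len(stmts):
            for q in post(stmts[pc], p):
                edges.add(((pc, p), (pc + 1, q)))
                frontier.add((pc + 1, q))
    return states, edges

stmts = ["x = 0", "x = x + 1", "x = x - 1"]
states, edges = build_transition_system(stmts)
# 4 pc values and 1 predicate give at most 4 * 2**1 = 8 abstract states
```

Exploring this graph is exactly what a model checker does; the non-deterministic outcomes of `post` are also the source of the false alarms discussed later.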

**Figure 5.16 below** shows the overall role a method such as software model checking can play in software validation. Here we assume that only the software is available for validation—we do not have a finite-state machine–like model of the software available for model checking.


**Figure 5.16. Validation flow with software model checking.**

Naturally, this means we need to synthesize a finite-state transition system out of the software itself. This finite-state transition system is used for model checking.

Given a program P, we abstract it to another program P’ in which all data variables have the type boolean. The program P’ implicitly represents a finite-state transition system whose states can be explored via model checking.

The temporal logic properties verified through model checking may be provided by the programmer (where they typically denote desirable properties of the software—typically invariant properties), or they may be found by “specification mining” from the test cases.

Specification mining involves finding a software specification (such as invariant properties) by running the software for different test cases and documenting the observed behavior. The Daikon tool, in particular, is useful for mining invariant properties from observed behavior of a program on test cases.
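The core idea can be sketched in a few lines of Python. This is a simplified, hypothetical illustration of invariant mining—not Daikon’s actual interface—and the program and candidate invariants are made up:

```python
# Simplified sketch of invariant mining (hypothetical; not Daikon's API):
# run the program on test inputs and keep only those candidate invariants
# that hold in every observed execution.

def mine_invariants(fn, test_inputs, candidates):
    surviving = dict(candidates)           # name -> predicate(input, output)
    for x in test_inputs:
        y = fn(x)
        surviving = {name: pred for name, pred in surviving.items()
                     if pred(x, y)}        # drop falsified candidates
    return set(surviving)

def absolute(v):                           # hypothetical program under test
    return v if v >= 0 else -v

candidates = {
    "result >= 0":     lambda x, y: y >= 0,
    "result == input": lambda x, y: y == x,   # falsified by negative inputs
    "result >= input": lambda x, y: y >= x,
}
mined = mine_invariants(absolute, [-3, 0, 5], candidates)
```

Invariants mined this way are only as good as the test suite: a surviving candidate is not proved, merely never observed to fail—which is exactly why such mined properties are then handed to a model checker for verification.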

**Combining Formal Verification with Testing**

Testing and formal verification are two complementary techniques for validating functionality. Testing exposes only a few program behaviors, but any wrong behavior thereby exposed is truly a wrong behavior—there are no false alarms reported by testing.

On the other hand, formal verification exposes all program behaviors, albeit at an abstract level. However, because of the abstractions involved, “bugs” reported by formal verification can be false alarms. In other words, testing and formal verification have complementary strengths and weaknesses.

Testing methods work at the concrete level: They execute the program with concrete inputs. Naturally, they cannot cover all program behaviors—exhaustively covering all concrete inputs of any real-life program is an impossibility!

Directed testing methods (discussed earlier in this series) alleviate this concern somewhat—they try to cover all program paths, rather than all concrete inputs. Thus, one may say that directed testing methods partition the input space of a program, such that two inputs i, i’ are in the same partition if and only if they exercise the same program path.

Given such a partitioning of the input space, directed testing methods try to expose one concrete input per partition.
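For a toy program with two independent branches (a hypothetical example, sketched in Python), the partitioning looks like this:

```python
# Sketch (hypothetical program): partition inputs by the program path they
# exercise, keeping one representative input per partition.

def path_of(x):
    """Return the sequence of branch outcomes taken for input x."""
    path = []
    path.append("x>0" if x > 0 else "!(x>0)")
    path.append("even" if x % 2 == 0 else "odd")
    return tuple(path)

def one_input_per_path(inputs):
    reps = {}
    for x in inputs:
        reps.setdefault(path_of(x), x)     # first input seen for this path
    return reps

# two independent branches -> 2 * 2 = 4 partitions
reps = one_input_per_path(range(-3, 4))
```

Directed testing tools build these partitions on the fly by collecting and negating path constraints, rather than enumerating inputs as this sketch does.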

However, the number of program paths in any real-life program is also huge—even for a program without loops, b branches can lead to anywhere between b + 1 and 2^{b} paths, depending on the control flow (see **Figure 5.21 below**).

Clearly, if we have b branches yielding 2^{b} acyclic paths within a loop with L iterations (where L is large), we will have (2^{b})^{L} paths—a huge number! Hence, covering all program paths simply via testing is also not feasible for much real-life software.
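These path counts can be checked with a little arithmetic—a sketch of the two extremes of branch structure and the loop blow-up:

```python
# Sketch: extremes of acyclic path counts for b two-way branches, and the
# blow-up from iterating a loop body L times.

def paths_independent(b):
    return 2 ** b          # b sequential, independent branches

def paths_chain(b):
    return b + 1           # b branches in an if/else-if chain

def paths_with_loop(paths_per_iteration, iterations):
    return paths_per_iteration ** iterations

print(paths_independent(3))          # 8  (the 2^3 case)
print(paths_chain(3))                # 4  (the 3 + 1 case)
print(paths_with_loop(2 ** 3, 100))  # (2^3)^100 -- astronomically many
```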

**Figure 5.21. An acyclic program with 3 branches having (a) 2^{3} program paths and (b) 3 + 1 program paths.**

Formal verification methods, on the other hand, abstract the behaviors of a program. One example of such an abstraction is predicate abstraction where we abstract the memory store of a program while leaving the control flow unabstracted.

Formal verification methods achieve full coverage of program behaviors in the abstract state space. However, because of the information (about program variable values) lost in the abstraction, the errors reported by covering the abstract state space might not correspond to actual errors in the concrete program. We saw an example earlier in this series, reproduced here for convenience:

```
1 x = 0;
2 x = x + 1;
3 x = x - 1;
4 if (x == 1) {
5    ... /* error */
}
```

Here, by abstracting the program only w.r.t. the predicate [x == 0] and model checking the abstracted program, we are unable to infer that the error location is unreachable. Thus, we report a false bug, a path to the error location, as follows:

(1, [x == 0]), (2, [x == 0]), (3, ![x == 0]), (4, ![x == 0]), (5, …)

This is what we mean by a false positive or a false alarm: a reported bug that does not correspond to any program error. Such a false alarm appears because of the information loss in abstraction and will show up during model checking of the abstract state space.
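The mismatch can be made explicit in a small sketch (illustrative Python mirroring the fragment above): concretely the error is unreachable, yet the [x == 0] abstraction cannot rule it out:

```python
# Sketch: concrete vs. abstract reachability of the error in
#   1 x = 0;  2 x = x + 1;  3 x = x - 1;  4 if (x == 1) error

def concrete_error_reachable():
    x = 0              # line 1
    x = x + 1          # line 2
    x = x - 1          # line 3
    return x == 1      # line 4: guard of the error location

def abstract_error_reachable():
    # possible values of the predicate [x == 0] after each statement
    after_1 = {True}            # x = 0
    after_2 = {False}           # x = x + 1 from a state with [x == 0]
    after_3 = {True, False}     # x = x - 1 from ![x == 0]: cannot tell
    # x == 1 is abstractly possible whenever ![x == 0] is possible,
    # since the abstraction knows nothing else about x
    return False in after_3

print(concrete_error_reachable())   # False -- no real error
print(abstract_error_reachable())   # True  -- the false alarm
```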

Because formal verification (such as model checking) and testing methods have complementary strengths and weaknesses, there have been attempts to combine them and reap the benefits of both. One possibility that has been studied is to use model checking for systematic generation of tests.

Yet another possibility is to use tests (and the observations resulting from running these tests) for learning about the program model, when the program is a black box. Usually these methods can be thought of as model checking helping testing, or vice versa.

In the recent past, software validation methods where testing and model checking simultaneously help each other have also been studied. To discuss the inner workings of such methods, we modify the preceding example program and introduce a program loop that does not modify variable x:

```
1 x = 0;
2 x = x + 1;
3 x = x - 1;
4 for (i = 0; i < 100; i++) {
5    ... /* x is not modified in the loop */
6 }
7 if (x == 0) {
8    ... /* error */
}
9 ... /* end of program */
```

Suppose we want to check whether the error location is reachable. How do we start? We can perform predicate abstraction of the program and perform model checking of the abstracted program.

The property we want to verify states that the error location is never reached—specified as

**G(pc ≠ 8)**

in linear-time temporal logic (LTL). The foregoing property states that the program counter (pc) never goes to line 8 (the error location). A violation of the property will be a counterexample trace that reaches line 8.

To perform predicate abstraction, we need a set of predicates w.r.t. which we abstract. What set of predicates do we choose? As discussed earlier, we can simply start with the null set of predicates.

This is the coarsest possible predicate abstraction, which treats each path in a program’s control-flow graph as feasible (i.e., the abstraction assumes that for each path in the control-flow graph, there is some program input exercising that path). Model checking finds the shortest counterexample trace:

**1, 2, 3, 4, 7, 8**

Because we did predicate abstraction w.r.t. a null set of predicates, we are not keeping track of the program variable values at all in the abstracted finite-state transition system. We only keep track of the program-counter value (also called the control location). We can now analyze this counterexample trace by collecting its path constraint. This will show that the foregoing is a spurious counterexample, because there is no concrete execution along the path

**1, 2, 3, 4, 7, 8**

since the loop needs to be iterated 100 times!
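The infeasibility is easy to confirm by following the path concretely—a minimal sketch that hard-codes the path 1, 2, 3, 4, 7, 8:

```python
# Sketch: the abstract counterexample 1,2,3,4,7,8 demands that control go
# from line 4 directly to line 7, i.e. the loop guard must fail at once.

def counterexample_1234_7_8_feasible():
    x = 0                         # line 1
    x = x + 1                     # line 2
    x = x - 1                     # line 3
    i = 0                         # line 4: loop initialization
    skips_loop = not (i < 100)    # edge 4 -> 7 requires the guard to fail
    return skips_loop and x == 0  # line 7's condition must hold to reach 8

print(counterexample_1234_7_8_feasible())   # False: the trace is spurious
```

Directed testing tools automate exactly this check by collecting the path constraint (here, ¬(0 < 100)) and asking a constraint solver whether it is satisfiable.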

According to the abstraction refinement methodology, we need to abstract our program w.r.t. new predicates and run model checking again. The trouble with this approach is that whatever predicates we abstract our program with, model checking will successively uncover longer and longer paths to line 8 (all of which are spurious, that is, do not correspond to any concrete program execution). These paths differ only in the number of times we iterate through the loop:

**1, 2, 3, 4, 7, 8**
**1, 2, 3, 4, 5, 6, 4, 7, 8**
**1, 2, 3, 4, 5, 6, 4, 5, 6, 4, 7, 8**

This is clearly inefficient. For our program, we would need 100 refinement steps to show that the error location is indeed reachable. If we had a loop iterating 1 million times, we would require 1 million refinement steps!

Fortunately, testing methods can be combined with model checking to solve our problem. Here, we first abstract the program w.r.t. the null set of predicates and run model checking. This produces, as before, the counterexample trace:

**1, 2, 3, 4, 7, 8**

We can now use directed testing to find a concrete test trace that goes through this path (or through as long a prefix of the path as possible). Directed testing finds a concrete trace that follows the prefix 1, 2, 3 but then enters the loop rather than proceeding directly to line 8. The trace is

**1, 2, 3, (4, 5, 6)^{100}, 4, 7, 8**

Note that this is a concrete path that can be exercised by executing the program. Our concrete test trace straightaway shows that the error location is reachable, because it visits the error location (line 8). Thus, we completely avoid the 100 refinement steps that pure abstraction-refinement-oriented verification would need to show this error.
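The winning combination is easy to replay in a sketch (illustrative Python mirroring the loop example; visited line numbers are recorded as the trace):

```python
# Sketch: a single concrete execution of the loop example reaches line 8,
# replacing 100 abstraction-refinement steps with one directed test run.

def run_loop_example():
    trace = []
    x = 0
    trace.append(1)
    x = x + 1
    trace.append(2)
    x = x - 1
    trace.append(3)
    for _ in range(100):
        trace.extend([4, 5, 6])   # loop body; x is not modified
    trace.extend([4, 7])          # final guard test, then the if at line 7
    error_reached = (x == 0)
    if error_reached:
        trace.append(8)           # the error location
    return error_reached, trace

reached, trace = run_loop_example()
print(reached)                    # True: line 8 really is reachable
print(trace[:3], trace[-3:])      # the 1,2,3,(4,5,6)^100,4,7,8 shape
```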

In summary, testing methods try to find witnesses of errors, whereas formal verification methods prove the absence of errors. Both have their unique strengths and weaknesses. By judiciously combining them, we can develop powerful software validation methods that rely on testing to show the presence of bugs, and on verification to show their absence.

To read **Part 1,** go to “Why it’s necessary”.

To read **Part 2,** go to “The power of dynamic slicing.”

To read **Part 3,** go to “Fault localization, metrics, and directed testing.”

**Used with permission from Morgan Kaufmann, a division of Elsevier. Copyright 2009, from “Embedded Systems and Software Validation” by Abhik Roychoudhury. For more information about this title and other similar books, visit www.elsevierdirect.com.**

**Abhik Roychoudhury** is associate professor at the National University of Singapore. He received his Ph.D. in computer science from the State University of New York at Stony Brook. His research interests are system modeling and validation with a specific focus on embedded software and systems.