Verifying embedded software functionality: The power of dynamic slicing

Editor’s Note: In this second installment of a four-part series, Abhik Roychoudhury, author of Embedded Systems and Software Validation, details the ways in which dynamic slicing can be used for assessing software functionality. In this Part 2: The power of dynamic slicing.

Dynamic slicing is a generic method for program debugging and comprehension. The method takes in the following: (a) a program P, (b) an input I of the program P, and (c) a description of the error observed during the execution of P for input I.

The output of the method is a fragment of P that is likely to be responsible for the observed error. Thus, the output of the method is essentially a “bug report” of sorts, an explanation of the cause of the observed error.

To discuss the method, we first need to understand how the observed error is described. Usually, the observed error is presented as a pair (l, v), where l is a line in the program P being debugged and v is a program variable.

This means that when the program P is executed with the given input I, the programmer is unhappy about the value of v observed in line l and would like an explanation for it.

The explanation will be provided by dynamic slicing, which acts as a debugging method. One issue needs to be clarified in this regard. A line l may be executed several times during the execution of program P with input I. Therefore, when the programmer seeks an explanation of the value of v in line l, he or she could be interested in the value of v in the last execution of l, in some specific execution of l, or in all executions of l.

Note that whichever executions of l we are interested in, it makes little difference to the dynamic slicing method itself; it only makes a difference to the initialization of the dynamic slicing method.

Having explained the notion of observed error, we need to discuss the notion of dynamic slice and what it stands for. The dynamic slice stands for a detailed explanation of the observed error. In other words, the dynamic slice can be thought of as a “bug report,” and the task of dynamic slicing as “debugging”—finding the source of a bug that is observed elsewhere.

How is the bug report constructed? Let us say we are performing dynamic slicing of a program P with input I, where the slicing criterion is the value of variable v in the last occurrence of line l. Slicing can now proceed in one of two ways:

1 – Via a backwards traversal of the execution trace of program P for input I; the traversal starts from the last occurrence of line l in the execution trace.

2 – Via a forward execution of the program P for input I.

The first possibility involves storing the execution trace in some form and hence incurs space overheads. However, the computation in this case is goal-directed (i.e., redundancy-free), because we compute the slice of only the slicing criterion we are interested in—the value of variable v in the last occurrence of l.

In the second option, we do not incur any space overhead from storing the program execution trace. We do, however, have to compute and store many slices during the execution. For each executed statement, we compute and store the slices of the variables used in the statement; these slices are used for computing slices of the subsequent variable usages.

Thus, even though the program execution trace does not need to be stored, there is a time and space overhead in computing many dynamic slices (many of which are unrelated to the actual slicing criterion the programmer is interested in). In other words, the dynamic slice computation is not goal-directed in this situation.
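
To make the forward option concrete, here is a minimal C sketch of this bookkeeping, assuming each variable's current dynamic slice is kept as a bitmask of line numbers that is updated at every executed assignment. The three-line straight-line program and all names are invented for illustration, and dynamic control dependencies (which would additionally fold in the slice of the governing branch at each step) are omitted for brevity.

#include <stdio.h>

/* Illustrative only: each variable's current dynamic slice is a set of
   source lines, stored as a bitmask (bit i set => line i is in the slice). */
enum { A, B, C, NVARS };
typedef unsigned LineSet;
#define LINE(i) (1u << (i))

static LineSet slice_of[NVARS];   /* one stored slice per variable */

/* Executing "line: def = f(uses)" sets slice(def) = {line} U slice(uses). */
static void assign(int line, int def, LineSet uses) {
    slice_of[def] = LINE(line) | uses;
}

int main(void) {
    assign(1, A, 0);                          /* 1: a = read();  */
    assign(2, B, slice_of[A]);                /* 2: b = a + 1;   */
    assign(3, C, slice_of[A] | slice_of[B]);  /* 3: c = a * b;   */
    for (int i = 1; i <= 3; i++)              /* slice of c at line 3 */
        if (slice_of[C] & LINE(i))
            printf("line %d is in the slice\n", i);  /* prints 1, 2, 3 */
    return 0;
}

Note how the slice of every assigned variable is computed and stored whether or not it is ever needed; this is precisely the lack of goal-direction described above.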

Whether a dynamic slice is computed by forward or backward traversal of the program execution, what it computes are chains of data and control dependencies. To explain these concepts, let us first consider a simple program fragment, written in C-style (Figure 5.4 below). The program constructs the sum of all even numbers from 1 to N and the product of all odd numbers from 1 to N, where N is a given integer.

We will use this example program to define the necessary concepts—(static and dynamic) control dependencies, (static and dynamic) data dependencies, dynamic dependence graphs, and dynamic program slices.

Figure 5-4. An example program to explain the concepts behind dynamic slicing.
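
The program of Figure 5.4 is not reproduced as an image here. The following C sketch is a reconstruction consistent with the execution trace and the slice discussed below; the exact identifiers are guesses, and the loop test is written as i < N because that is the bound that reproduces the trace shown next.

#include <stdio.h>

int main(void) {
    int N, i, sum, prod;
    scanf("%d", &N);                  /* line 1  */
    i = 1;                            /* line 2  */
    sum = 0;                          /* line 3  */
    prod = 1;                         /* line 4  */
    while (i < N) {                   /* line 5  */
        if (i % 2 == 0)               /* line 6  */
            sum = sum + i;            /* line 7  */
        else
            prod = prod * i;          /* line 8  */
        i = i + 1;                    /* line 9  */
    }
    printf("%d %d\n", sum, prod);     /* line 10 */
    return 0;
}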

Consider an execution of the program for N = 3. The execution trace is as follows:

(1, 2, 3, 4,            // initialization
5, 6, 8, 9,             // first iteration, i = 1
5, 6, 7, 9,             // second iteration, i = 2
5, 10)                  // i = 3, end of execution

Suppose the programmer wants an explanation of the value of sum printed in line 10; thus the slicing criterion is (10, sum). We seek an automated method that can find the fragment of the program which influences the value of sum at line 10; this fragment will be treated as the explanation of the value of sum in line 10.

In constructing the explanation, we try to answer the following questions:

Dynamic data dependence: Which variable assignment was propagated to the value of sum printed in line 10?

Dynamic control dependence: What is the nearest conditional branch statement that enables line 10 to be executed, in the execution trace under consideration?

These questions can be answered by a backwards traversal of the execution trace starting from line 10, the slicing criterion. In particular, the value of sum in line 10 is the value set in the last execution of line 7. As far as dynamic control dependencies go, we observe that the execution of line 10 is unconditional; it does not depend on any conditional branch statement evaluating in a certain direction.

In case the reader is reasoning that line 10 got executed only because the while-loop (line 5) terminated, and that lines 5 and 10 are hence control dependent, here is the way to think about this matter: any program execution that reaches line 5 will eventually pass through line 10, whichever way the loop test evaluates. Hence line 5 does not enable line 10 to be executed.

The dynamic control and data dependencies in an execution trace are typically summarized in a dynamic dependency graph. Because one statement may be executed several times in an execution trace, we distinguish between the different occurrences of the same statement—each occurrence is a separate statement instance.

The nodes of a dynamic dependency graph are the statement instances, and the edges are dynamic dependencies (both data and control dependencies). Part of the dynamic dependency graph for the example program of Figure 5.4 with input N = 3 is given in Figure 5.5 below; dashed edges denote control dependencies, and solid edges denote data dependencies. We show only the part of the dynamic dependency graph that is relevant to our slicing criterion—the value of sum in line 10 of the program.

Each node in the dynamic dependency graph in Figure 5.5 is of the form iⱼ, denoting the jth occurrence of line i in the execution trace of the program in Figure 5.4 with input N = 3. Let us now explain the dependencies shown in Figure 5.5.

Statement instance 10₁ is dynamically data dependent on 7₁, because the definition of sum in 7₁ is used in 10₁. Also, statement instance 7₁ is dynamically control dependent on statement instance 6₂; 6₂ is the nearest enclosing branch statement instance whose evaluation allows 7₁ to be executed. Figure 5.5 shows only a fragment of the dynamic dependency graph—the fragment that is reachable from our slicing criterion, the value of sum in line 10.

Figure 5-5. Portion of dynamic dependency graph that is relevant to the slicing criterion—the variable sum in line 10 of Figure 5.4. We consider an input N = 3.

The slice consists of the statement instances 10₁, 7₁, 6₂, 5₂, 9₁, 3₁, 5₁, 2₁, 1₁, that is, instances of the following statements: {1, 2, 3, 5, 6, 7, 9, 10}. Lines 4 and 8, which manipulate the variable prod, are not in the slice.

Detailing the Methodology
We now formally describe the dynamic slicing method for software debugging. Traditionally, dynamic slicing is performed w.r.t. a slicing criterion (H, l, v), where H represents an execution trace of the program being debugged, l represents a control location in the program, and v is a program variable.

A dynamic slice contains all statement instances (or statements) that have affected the value of variable v referenced at l in the trace H. A dynamic slicing algorithm can proceed by forward or backward exploration of an execution trace.

Here we summarize a backwards slicing algorithm that is goal-directed (w.r.t. the slicing criterion) but requires efficient storage/traversal of the trace. During the trace traversal, which starts from the statement instance in the slicing criterion, a dynamic slicing algorithm maintains the following quantities: (a) the dynamic slice φ, (b) a set of variables ζ whose dynamic data dependencies need to be explained, and (c) a set of statement instances γ whose dynamic control dependencies need to be explained. Initially, we set φ = γ = {the last instance of location l in trace H} and ζ = {v}.

For each statement instance stmt encountered during the backward traversal, the algorithm performs the following two checks. The algorithm terminates when we reach the beginning of the trace.

Check dynamic data dependencies. Let def(stmt) denote the variable defined by stmt. If def(stmt) ∈ ζ, it means that we have found the definition of def(stmt) which the slicing algorithm was looking for. So, def(stmt) is removed from ζ, and the variables used by stmt are inserted into ζ. In addition, stmt is inserted into φ and γ.

Check dynamic control dependencies. If any statement instance in γ is dynamically control dependent on stmt, all statement instances that are dynamically control dependent on stmt are removed from γ. Variables used by stmt are inserted into ζ, and stmt is inserted into φ and γ.

When the dynamic slicing algorithm terminates, the resultant dynamic slice (i.e., the set φ) is reported back to the programmer for inspection.
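
As a concrete illustration, the following C sketch implements the two checks above by a backward traversal of a recorded trace. It is a minimal sketch under stated assumptions, not the implementation of any particular tool: the TraceEntry records (which variable each instance defines and uses, and the trace index of the branch instance it is dynamically control dependent on) are assumed to be produced by instrumentation, and the sets φ, ζ, and γ are kept as plain arrays and flags. The main function encodes the trace of Figure 5.4 for N = 3 with slicing criterion (10, sum).

#include <stdio.h>
#include <string.h>

#define MAX_USES 4
#define MAX_VARS 8

typedef struct {
    int line;                   /* source line executed                  */
    const char *def;            /* variable defined here, or NULL        */
    const char *use[MAX_USES];  /* variables used here, NULL-terminated  */
    int cdep;                   /* trace index of the branch instance    */
                                /* this one is control dependent on, -1  */
} TraceEntry;

static const char *zeta[MAX_VARS];  /* variables awaiting a definition */
static int nzeta;

static int in_zeta(const char *v) {
    for (int k = 0; k < nzeta; k++)
        if (strcmp(zeta[k], v) == 0) return k;
    return -1;
}

static void zeta_add(const char *v) {
    if (v && in_zeta(v) < 0) zeta[nzeta++] = v;
}

/* phi[t] = 1 iff instance t is in the slice; gamma[t] = 1 iff t's
   dynamic control dependence still needs to be explained. */
static void dynamic_slice(const TraceEntry *tr, int crit, const char *v,
                          int *phi, int *gamma)
{
    phi[crit] = gamma[crit] = 1;     /* phi = gamma = {criterion instance} */
    nzeta = 0;
    zeta_add(v);                     /* zeta = {v} */

    for (int t = crit - 1; t >= 0; t--) {     /* backward traversal */
        int take = 0, k;
        /* Check dynamic data dependencies. */
        if (tr[t].def && (k = in_zeta(tr[t].def)) >= 0) {
            zeta[k] = zeta[--nzeta];          /* definition found: remove */
            take = 1;
        }
        /* Check dynamic control dependencies. */
        for (int g = t + 1; g <= crit; g++)
            if (gamma[g] && tr[g].cdep == t) {
                gamma[g] = 0;                 /* now explained */
                take = 1;
            }
        if (take) {                           /* stmt joins the slice */
            for (int u = 0; tr[t].use[u]; u++) zeta_add(tr[t].use[u]);
            phi[t] = gamma[t] = 1;
        }
    }
}

int main(void) {
    /* The trace of Figure 5.4 for N = 3 (instances 0..13). */
    TraceEntry tr[] = {
        { 1, "N",    {0},              -1 },  /* 1: read N     */
        { 2, "i",    {0},              -1 },  /* 2: i = 1      */
        { 3, "sum",  {0},              -1 },  /* 3: sum = 0    */
        { 4, "prod", {0},              -1 },  /* 4: prod = 1   */
        { 5, 0,      {"i", "N", 0},    -1 },
        { 6, 0,      {"i", 0},          4 },
        { 8, "prod", {"prod", "i", 0},  5 },
        { 9, "i",    {"i", 0},          4 },
        { 5, 0,      {"i", "N", 0},    -1 },
        { 6, 0,      {"i", 0},          8 },
        { 7, "sum",  {"sum", "i", 0},   9 },
        { 9, "i",    {"i", 0},          8 },
        { 5, 0,      {"i", "N", 0},    -1 },
        {10, 0,      {"sum", 0},       -1 },  /* 10: print sum */
    };
    int n = sizeof tr / sizeof tr[0];
    int phi[32] = {0}, gamma[32] = {0};

    dynamic_slice(tr, n - 1, "sum", phi, gamma);  /* criterion (10, sum) */
    printf("slice lines:");
    for (int t = 0; t < n; t++)
        if (phi[t]) printf(" %d", tr[t].line);
    printf("\n");
    return 0;
}

Running the sketch marks exactly the instances 1₁, 2₁, 3₁, 5₁, 9₁, 5₂, 6₂, 7₁, 10₁, i.e., the statements {1, 2, 3, 5, 6, 7, 9, 10} reported in the slice above.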

Figure 5.6 below describes how slicing can be made to fit in with software testing and debugging. Usually a program is tested against inputs from a test suite. If the program outputs are as “expected,” the tests are said to pass.


Figure 5-6. Software testing and debugging with slicing as the debugging method

For a failed test (i.e., where the output is “unexpected”), the programmer needs to find the cause of unexpected program behavior. This brings us to debugging, and dynamic slicing is one debugging method. The slicing criterion comes from the failed test case itself; the slicing criterion is (I, l, v), where I is the input for the failed test case, l is the line number where the “unexpected” output is observed, and v is the output variable whose observed value is “unexpected.”

For dynamic slicing, the program is run against the same input (the one leading to a failed test case) in an instrumented fashion—that is, (part of) the execution trace is collected. The execution trace is analyzed via dynamic slicing as described in the preceding. The constructed dynamic slice acts as the bug report.

The programmer can use it for program comprehension and debugging, thereby locating the source of error. Needless to say, only the computation of the slice is automatic. Comprehension and debugging of programs using the slice is a fully manual activity.

Time and Space Complexity
Note that dynamic slicing is an algorithmic framework, and it can be adapted for different programming languages. The complexity of the dynamic slicing algorithm for modern programming languages such as Java is as follows:

1 – Worst-case space complexity is linear in the length of the execution trace, and
2 – Worst-case time complexity is quadratic in the length of the execution trace.

The quadratic time complexity is owing to the dependence computation, which involves checking pairs of operations in the trace. We note that state-of-the-art slicing tools (such as the JSlice tool for Java) employ online compression of the execution trace—where the execution trace is compacted as it is collected—achieving compaction ratios (memory taken up by the original trace versus memory taken up by the compact trace) of 10 to 1000.
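
The compaction schemes used by such tools are grammar-based and beyond the scope of this article. Purely as a toy illustration of the idea of compressing the trace online, while it is being collected, the following sketch run-length encodes the stream of executed line numbers as the events arrive; all names and the input stream are invented.

#include <stdio.h>

typedef struct { int line, count; } Run;

static Run trace[1000];   /* compacted trace: (line, repeat-count) pairs */
static int nruns;

static void record(int line) {             /* called once per executed line */
    if (nruns > 0 && trace[nruns - 1].line == line)
        trace[nruns - 1].count++;          /* extend the current run        */
    else
        trace[nruns++] = (Run){ line, 1 }; /* start a new run               */
}

int main(void) {
    int demo[] = { 5, 6, 6, 6, 7, 5, 5 };  /* invented line-number stream */
    for (int i = 0; i < (int)(sizeof demo / sizeof demo[0]); i++)
        record(demo[i]);
    for (int r = 0; r < nruns; r++)
        printf("line %d x%d\n", trace[r].line, trace[r].count);
    return 0;
}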

Dealing with Large Slices
The reader may be concerned that the dynamic slices of real-life programs are too large for human comprehension. Here we would like to point out that dynamic slicing is a core method of program understanding, debugging, and validation. Many improvements to the dynamic slicing method exist to reduce the slice size, such as not computing the full slice (e.g., some programmers may inspect only the chains of data dependencies).

A recently proposed method called hierarchical dynamic slicing addresses this problem as follows. It is a dynamic slicing method in which the human programmer is exposed to the slice gradually, in a hierarchical fashion, rather than having to inspect a very large slice after it is computed. The key idea is simple—we systematically interleave the slice computation and comprehension steps.

Conventional works on slicing have concentrated only on the computation of the slice, comprehension of the slice being left as a postmortem activity. In hierarchical dynamic slicing, the two activities are integrated in a synergistic fashion as follows:

1 – Computation of the slice is guided (to a limited extent) by the human programmer so that very few control/data dependencies in a large slice need to be explored and inspected.

2 – The programmer’s comprehension of the slice is greatly enhanced by the nature of our slice computation, which proceeds hierarchically. Thus, for programs with long dependence chains, this allows the programmer to gradually zoom in to selected dynamic dependencies.

To understand the potential benefits one can gain from the method, let us examine the factors that make the comprehension of dynamic slices difficult:

1 – Many programs have long dependence chains spanning across loops and function boundaries. These dependence chains are captured in the slice. However, the slice being a (flat) set of statements, much of the program structure (loops/functions) is lost. This makes the slice hard to comprehend.

2 – Programs often also have a great deal of inherent parallelism. So, a slice may capture many different dependence chains.

Let us now discuss how hierarchical computation/exploration of slices can help programmers comprehend large slices exhibiting these two features: (a) long dependence chains, and (b) many different dependence chains. Figure 5.7a shows an example program with a long dependence chain. Consider an execution trace of the program ...3, 4, 5, 6, in which lines 3, 4, 5, 6 of Figure 5.7a below are executed. Slicing this execution trace w.r.t. the criterion (line 6, y) (i.e., the value of y at line 6) yields a slice that contains lines 3, 4, 5, 6 as well as lines inside the bodies of the functions f1, f2, f3.

Figure 5.7. (a) A program with a long dynamic dependence chain. (b) The corresponding phases. Dashed arrows represent dynamic dependencies that a programmer needs to follow for debugging.
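
Figure 5.7a is not reproduced here. A plausible C reconstruction of a program with such a long dependence chain is given below; the function bodies and exact statements are assumptions, with only the roles of lines 3 through 6 taken from the text.

#include <stdio.h>

/* Bodies are invented; in the slice, their internal lines appear too. */
int f1(int v) { return v + 1; }
int f2(int v) { return v * 2; }
int f3(int v) { return v - 3; }

int main(void) {
    int x = 1;           /* earlier lines elided ("...") in the trace */
    int x1 = f1(x);      /* line 3 */
    int x2 = f2(x1);     /* line 4 */
    int x3 = f3(x2);     /* line 5 */
    int y  = x3;         /* line 6: slicing criterion (line 6, y) */
    printf("%d\n", y);
    return 0;
}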

In other words, because the slice is a (flat) set of statements, the program structure is lost in the slice. This structure is explicitly manifested in Figure 5.7b, where we show the dependence chain in a hierarchical fashion; the dependencies inside the functions f1, f2, f3 are not shown. Here, a hierarchical exploration of the dependence chains will clearly be less burdensome to the programmer.

Thus, in Figure 5.7b, by inspecting the dependencies hierarchically, the programmer may find it necessary to inspect the dependencies inside a specific function (say, f2). As a result, we can avoid inspecting the dependence chain(s) inside the other functions (in this case f1, f3). Now, let us consider programs with many different dependence chains.

Figure 5.8a below shows a schematic program with several dependence chains, and hence substantial inherent parallelism. If the slicing criterion involves the value of y in line 6, we need to consider the dependencies between y and x3 and those between y and x2, as well as y and x1. These three dependencies are shown via broken arrows in Figure 5.8b. Again, with the programmer’s intervention, we can rule out some of these dependencies for exploration and inspection.

Figure 5-8. (a) A program with inherent parallelism (several dynamic dependence chains). (b) The corresponding phases. Dashed arrows represent dynamic dependencies that a programmer needs to follow for debugging.
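
Figure 5.8a is likewise not reproduced. A minimal reconstruction consistent with the description, in which three independent dependence chains meet at line 6, could look as follows; the particular computations are assumptions.

#include <stdio.h>

int main(void) {
    int x1 = 10;             /* line 3: one independent dependence chain */
    int x2 = 20;             /* line 4: a second, unrelated chain        */
    int x3 = 30;             /* line 5: a third                          */
    int y = x1 + x2 + x3;    /* line 6: criterion (line 6, y) depends on all three */
    printf("%d\n", y);
    return 0;
}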

In summary, the hierarchical dynamic slicing method works as follows. Given an execution trace (corresponding to a program input) containing an observable behavior that is deemed an “error” by the programmer, we divide the trace into phases. This division is typically done along loop/procedure/loop-iteration boundaries so that each phase corresponds to a logical unit of program behavior.

Only the interphase data and control dependencies are presented to the programmer; the intraphase dependencies are completely suppressed. The programmer then identifies a likely suspicious phase, which is then subjected to further investigation in a similar manner (dividing the phase into subphases, computing dependencies across these subphases, and so on). This process continues until the error is identified.
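
As a sketch of this interleaved loop, the following skeleton divides a trace into equal-sized phases and recursively zooms into the phase the programmer selects. It is only a schematic: real hierarchical slicing cuts phases along loop/procedure/loop-iteration boundaries and presents the interphase data and control dependencies, both of which are elided here, and every name is invented.

#include <stdio.h>

#define NPHASES 4   /* phases shown per level */

static void explore(const int *trace, int lo, int hi) {
    if (hi - lo <= NPHASES) {                /* small enough: show statements */
        printf("inspect trace[%d..%d):", lo, hi);
        for (int t = lo; t < hi; t++) printf(" line %d", trace[t]);
        printf("\n");
        return;
    }
    int step = (hi - lo + NPHASES - 1) / NPHASES;
    for (int p = 0; lo + p * step < hi; p++) /* present this level's phases */
        printf("phase %d: trace[%d..%d)\n", p, lo + p * step,
               lo + (p + 1) * step < hi ? lo + (p + 1) * step : hi);
    int p;
    printf("zoom into which phase? ");
    if (scanf("%d", &p) != 1 || p < 0 || lo + p * step >= hi) return;
    int nlo = lo + p * step;
    explore(trace, nlo, nlo + step < hi ? nlo + step : hi);
}

int main(void) {
    /* The execution trace of Figure 5.4 for N = 3. */
    int trace[] = { 1, 2, 3, 4, 5, 6, 8, 9, 5, 6, 7, 9, 5, 10 };
    explore(trace, 0, (int)(sizeof trace / sizeof trace[0]));
    return 0;
}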

Of course, an underlying assumption here is that the programmer will be able to identify the erroneous statement once it is pointed out to him or her. One may comment that such a hierarchical exploration of dynamic dependencies involves the programmer’s intervention, whereas conventional dynamic slicing is fully automatic.

Here we should note that the process of error detection by using/exploring a dynamic slice involves a huge manual effort; the manual effort in exploring/comprehending the slice simply happens after the computation of the slice. However, in the hierarchical method, we are interleaving the computation and comprehension of dynamic dependencies.

As in dynamic slicing, the computation of the dynamic dependencies is automatic in our method; only the comprehension involves the programmer. Moreover, the programmer is exposed to the complex chain(s) of program dependencies gradually, rather than all at once, thereby allowing better program comprehension.

To read Part 1, go to Why it’s necessary.
Next, in Part 3: Fault localization

Abhik Roychoudhury is associate professor at the National University of Singapore. He received his Ph.D. in computer science from the State University of New York at Stony Brook. His research interests are system modeling and validation with a specific focus on embedded software and systems.

Used with permission from Morgan Kaufmann, a division of Elsevier. Copyright 2009, from “Embedded Systems and Software Validation” by Abhik Roychoudhury. For more information about this title and other similar books, visit www.elsevierdirect.com
