How to use new unit testing tools & techniques to improve software quality

Unit testing has been around almost as long as software development itself. It just makes sense to take each application building block, build it in isolation, and execute it with test data to make sure that it does just what it should do without any confusing input from the remainder of the application.

In the past, the sting came from not being able to simply lift a software unit from its development environment, compile it and run it, let alone supply it with test data.

For that to happen, you need a harness program that acts as a holding mechanism: it calls the unit, specifies any included files, provides “stubs” to handle any procedure calls made by the unit, and supplies any initialization sequences which prepare data structures for the unit under test to act upon.
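As a rough illustration of those harness ingredients, the following minimal Python sketch shows a unit under test, a stub for the procedure it calls, and an initialization sequence. The names `read_sensor`, `average_reading` and `run_harness` are hypothetical and chosen only for this example; a real harness for compiled code would look different.

```python
from unittest import mock


def read_sensor():
    """Called procedure; in the real application this would touch hardware."""
    raise RuntimeError("hardware not available in the test environment")


def average_reading(buffer):
    """Unit under test: append a new reading and return the mean."""
    buffer.append(read_sensor())
    return sum(buffer) / len(buffer)


def run_harness():
    # Initialization sequence: prepare the data structure the unit acts upon.
    buffer = [10.0, 20.0]
    # Stub: temporarily replace the called procedure with a canned value.
    with mock.patch.dict(globals(), {"read_sensor": lambda: 30.0}):
        return average_reading(buffer)


harness_result = run_harness()
print(harness_result)  # mean of 10.0, 20.0 and the stubbed 30.0
```

Even in this tiny case the harness is about as long as the unit itself, which is the overhead the article describes.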

Not only was creating that process laborious, but it took a lot of skill. More often than not, the harness program required at least as much testing as the unit under test.

Perhaps more importantly, a fundamental requirement of software testing is to provide an objective, independent view of the software. The very intimate code knowledge required to manually construct a harness compromised the independence of the test process, undermining the legitimacy of the exercise.

The legacy from high integrity systems
In developing applications for the medical, railway, aerospace and defence industries, unit test is a mandatory part of a software development cycle – a necessary evil.

For these high integrity systems, unit test is compulsory, and the only question is how it might be completed in the most efficient manner possible. It is therefore no coincidence that many of the companies developing tools to provide such efficiency have grown from this niche market.

In non-safety-critical environments, perceived wisdom is that unit testing is a nice idea in principle, but commercially unjustifiable. A significant factor in that stance is the natural optimism which abounds at the beginning of any project. At that stage, why would anyone spend money on careful unit testing? There are great engineers in the team, the design is solid, sound management is in place. What could possibly go wrong?

However, things can, and do, go wrong, and while unit test can't guarantee success, it certainly helps minimise failure. So, if we look at the tools designed and proven to provide quick and easy unit tests in high integrity systems, it makes sense that the same tools would provide a solid solution for those working on commercially developed software as well.

When is unit test justifiable?
Unit testing cannot always be justified. And, sometimes it remains possible to perform unit test from first principles, without the aid of any test tool at all.

There are pragmatic judgements to be made.

Sometimes that judgement is easy. If the software fails, what are the implications? Will anyone be killed, as might be the case in aircraft flight control? Will the commercial implications be disproportionately high, as exemplified by a continuous plastics production plant? Or are the costs of recall extremely high, perhaps in a car's engine controller? In these cases, extensive unit testing is essential and any tools that aid in that purpose make sense.

On the other hand, if software is developed purely for internal use or is perhaps a prototype, then the overhead in unit testing all but the most vital of procedures would be prohibitive.

As you might expect, there is a grey area. Suppose the application software controls a mechanical measuring machine where the quantity of the devices sold is low and the area served is localized. The question becomes: Would the occasional failure be more acceptable than the overhead of unit test?

In these circumstances, it's useful to prioritize the parts of the software which are either critical or complex. If a software error leads to a strangely coloured display or a need for an occasional reboot, it may be inconvenient but not in itself justification for unit test. On the other hand, the unit test of code which generates reports showing whether machined components are within tolerance may be vital.

When are unit test tools justifiable?
Again, it comes down to cost. The later a defect is found in the product development, the more costly it is to fix (Figure 1 below), a concept first established in 1975 with the publication of Brooks' “The Mythical Man-Month” and proven many times since through various studies.

Figure 1: The later a defect is identified, the higher the cost of rectifying it.

The automation of any process changes the dynamic of commercial justification. This is especially true of test tools since they make earlier unit test much more feasible. Consequently, modern unit test almost implies the use of such a tool unless only a handful of procedures are involved.

The primary function of such unit test tools is to automatically generate the harness code which provides the main and associated calling functions or procedures (generically “procedures”). These facilitate compilation and allow unit testing to take place.

The tools not only provide the harness itself, but also statically analyze the source code to provide the details of each input and output parameter or global variable in an easily understood form. Where unit testing is performed on an isolated snippet of code, stubbing of called procedures can be an important aspect of unit testing. This can also be automated to further enhance the efficiency of the approach.
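To give a flavour of what such static analysis involves, the sketch below uses Python's standard `ast` module to list a procedure's input parameters and the procedures it calls, which are the candidates for stubbing. It is only an illustration of the idea, not how any commercial tool works; the `adjust_lighting` source and the `describe_procedure` helper are hypothetical.

```python
import ast

# Hypothetical source text to be analyzed (it is parsed, never executed).
SOURCE = '''
def adjust_lighting(ambient, target):
    level = clamp(target - ambient)
    set_dimmer(level)
    return level
'''


def describe_procedure(source):
    """Return the name, input parameters and called procedures of the
    first function definition found in the source text."""
    tree = ast.parse(source)
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    inputs = [arg.arg for arg in func.args.args]
    calls = sorted({
        node.func.id
        for node in ast.walk(func)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    })
    return {"name": func.name, "inputs": inputs, "calls": calls}


summary = describe_procedure(SOURCE)
print(summary)
```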

This automation makes the assignment of values to the procedure under test a simple process, and one which demands little knowledge of the code on the part of the test tool operator.

This creates that necessary unit test objectivity because it divorces the test process from that of code development where circumstances require it, and from a pragmatic perspective substantially lowers the level of skill required to develop unit tests.

It is this ease of use which means that unit test can now be considered viable for development since each procedure can be tested at the time of writing. When these early unit tests identify weak code, it can be corrected whilst the original intent remains very fresh in the mind of the developer.

Beyond unit test
For some, the terms “unit test” and “module test” are synonymous. For others, the term “unit” implies the testing of a single procedure, whereas “module” suggests a collection of related procedures, perhaps designed to perform some particular purpose within the application.

Using the latter definitions, manually developed module tests are likely to be easier to construct than unit tests, especially if the module represents a functional aspect of the application itself. In this case, most of the calls to procedures are related and the code accesses related data structures which makes the preparation of the harness code more straightforward.

Test tools render the distinction between unit and module tests redundant. It is perfectly possible to test a single procedure in isolation and equally possible to use the exact same processes to test multiple procedures, a file or multiple files of procedures, a class (where appropriate), or a functional subset of an entire system. As a result, the distinction between unit and module test is one which has become increasingly irrelevant to the extent that the term “unit test” has come to include both concepts.

Such flexibility facilitates progressive integration testing. Procedures are first unit tested and then collated as part of the subsystems, which in turn are brought together to perform system tests.

It also provides options when a pragmatic approach is required for less critical applications. A single set of test cases can exercise a specified procedure alone, all procedures called as a result of exercising that procedure (as illustrated in Figure 2 below), or anything in between.

Test cases which prove the functionality of the whole call chain are easily constructed. Again, it is easy to “mix and match” the processes depending on the criticality of the code under review.

Figure 2: A single test case (inset) can exercise some or all of the call chain associated with it. In this example, “AdjustLighting”, the red colouring highlights exercised code.


This all-embracing unit test approach can be extended to multithreaded applications. In a single-threaded application, the execution path is well-defined and sequential, such that no part of the code may be executed concurrently with any other part.

In applications with multiple threads, there may be two or more paths executed concurrently, with interaction between the threads a commonplace feature of the system. Unit test in this environment can ensure that particular procedures behave in an appropriate manner both internally and in terms of their interaction with other threads.
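As a small, hedged example of a unit test that checks interaction between threads, the Python sketch below calls one procedure from several threads at once and checks the combined result. The `Counter` class and `hammer` driver are hypothetical; real multithreaded tests usually need far more care over timing and scheduling than this shows.

```python
import threading


class Counter:
    """Hypothetical unit under test: a counter shared between threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write safe across threads.
        with self._lock:
            self.value += 1


def hammer(counter, threads=4, calls_per_thread=1000):
    """Test driver: call increment() concurrently from several threads."""
    def worker():
        for _ in range(calls_per_thread):
            counter.increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return counter.value


total = hammer(Counter())
print(total)
```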

Sometimes, testing a procedure in isolation is impractical. For instance, if a particular procedure relies on the existence of some ordered data before it can perform its task, then similar data must be in place for any unit test of that procedure to be meaningful.

Just as unit test tools can encompass many different procedures as part of a single test, they can also use a sequence of tests with each one having an effect on the environment for those executed subsequently.

For example, unit testing a procedure which accesses a data structure may be achieved by first implementing a test case to call an initialization procedure within the application, and then a second test case to exercise the procedure of interest.
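A minimal Python sketch of such a sequence follows, assuming a hypothetical module-level list `readings` that one procedure initializes and another relies on. The second test case is only meaningful because the first has already set up the ordered data.

```python
# Hypothetical application state shared by the procedures below.
readings = []


def init_readings():
    """Initialization procedure: load the data and put it in order."""
    readings.clear()
    readings.extend([3, 1, 2])
    readings.sort()


def median_reading():
    """Procedure of interest: relies on readings already being ordered."""
    return readings[len(readings) // 2]


def test_case_1_initialise():
    init_readings()
    return list(readings)


def test_case_2_median():
    # Depends on the environment left behind by test case 1.
    return median_reading()


sequence_results = [test_case_1_initialise(), test_case_2_median()]
print(sequence_results)
```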

Unit test does not imply testing in only the development environment. Integration between test tools and development environments means that unit testing of software can take place seamlessly using the compiler and target hardware.

Retaining the functionality through regression test
Whilst unit testing at the time of development is a sound principle to follow, all too often ongoing development compromises the functionality of software which is considered complete. Such problems are particularly prevalent when adding functionality to code originally written with no knowledge of later enhancements.

Regression testing is what's needed here. By using a test case file to store a sequence of tests created for the original software, it is possible to recall and reapply it to the revised code to prove that none of the original functionality has been compromised.

Once configured, this regression testing can be initiated as a background task and run perhaps every evening. Reports can highlight any changes to the output generated by earlier test runs. In this way, any code modifications leading to unintentional changes in application behaviour can be identified and rectified immediately.
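The mechanism can be sketched in Python as follows: outputs from the original code are stored in a file, then compared against outputs from the revised code. The file format (JSON), the `scale_v1` and `scale_v2` procedures and the helper names are all assumptions made for this example, not the format used by any particular tool.

```python
import json
import os
import tempfile


def scale_v1(value):
    """Hypothetical original procedure."""
    return value * 2


def scale_v2(value):
    """Hypothetical revision: a new upper limit changes some outputs."""
    return min(value * 2, 200)


def run_cases(procedure, inputs):
    return {str(value): procedure(value) for value in inputs}


def save_baseline(path, procedure, inputs):
    with open(path, "w") as handle:
        json.dump(run_cases(procedure, inputs), handle)


def regression_report(path, procedure, inputs):
    """Return {input: (baseline_output, current_output)} for each change."""
    with open(path) as handle:
        baseline = json.load(handle)
    current = run_cases(procedure, inputs)
    return {key: (baseline[key], current[key])
            for key in baseline if baseline[key] != current[key]}


INPUTS = [1, 50, 150]
with tempfile.TemporaryDirectory() as folder:
    baseline_path = os.path.join(folder, "baseline.json")
    save_baseline(baseline_path, scale_v1, INPUTS)
    report = regression_report(baseline_path, scale_v2, INPUTS)
print(report)  # only the input whose output changed is reported
```

A nightly run would keep the baseline file and simply re-run the comparison step.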

Modern unit test tools come equipped with user-friendly, point-and-click graphical user interfaces, which are easy and intuitive to use. However, if faced with thousands of test cases, a GUI is not always the most efficient way to handle the development of test cases.

In recognition of this, test tools are designed to allow these test case files to be directly developed from applications such as Microsoft Excel. As before, the “regression test” mechanism can then be used to run the test cases held in these files.
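One way to picture this is a table of test cases kept in a spreadsheet and saved as CSV, then run in bulk. The Python sketch below assumes that CSV export, and the column names and the `compute_level` procedure are hypothetical; the actual file format expected by a given test tool will differ.

```python
import csv
import io

# Test cases as they might look after saving a spreadsheet as CSV.
CASES_CSV = """ambient,target,expected
40,70,30
90,70,0
0,250,100
"""


def compute_level(ambient, target):
    """Hypothetical procedure under test: clamp the change to 0..100."""
    return max(0, min(100, target - ambient))


def run_csv_cases(text):
    """Run every row as a test case and return a pass/fail flag per row."""
    outcomes = []
    for row in csv.DictReader(io.StringIO(text)):
        actual = compute_level(int(row["ambient"]), int(row["target"]))
        outcomes.append(actual == int(row["expected"]))
    return outcomes


csv_results = run_csv_cases(CASES_CSV)
print(csv_results)
```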

Test-driven development
In addition to proving developed code, unit test tools can also be used to develop test cases for code still in the conception phase – an approach known as test-driven development (TDD).

As portrayed in Figure 3 below, TDD is a software development technique that uses short development iterations based on pre-written unit test cases that define desired improvements or new functions. Each iteration produces code necessary to pass that iteration's tests. The programmer or team refactors the code to accommodate changes.

Figure 3: Unit test tools lend themselves admirably to test-driven development by providing a mechanism to write test cases before any source code is available.
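A single TDD iteration might look like the following Python sketch, which uses the standard `unittest` module. The test class is written first and defines the desired behaviour; the `is_within_tolerance` function is then the minimum code needed to pass it. Both names are hypothetical and borrowed from the machined-component example earlier in this article.

```python
import unittest


# Step 1: the test cases are written before the code exists.
class TestIsWithinTolerance(unittest.TestCase):
    def test_inside_tolerance(self):
        self.assertTrue(is_within_tolerance(10.02, 10.0, 0.05))

    def test_outside_tolerance(self):
        self.assertFalse(is_within_tolerance(10.2, 10.0, 0.05))


# Step 2: just enough code is written to make those tests pass.
def is_within_tolerance(measured, nominal, tolerance):
    return abs(measured - nominal) <= tolerance


# Step 3: run the tests; refactoring follows once they pass.
def run_tests():
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestIsWithinTolerance)
    return unittest.TextTestRunner(verbosity=0).run(suite)


outcome = run_tests()
```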

Unit test and system test in tandem
Traditionally, many applications have been tested by functional means only. The source code is written in accordance with the specification, and then tested to see if it all works. The problem with this approach is that no matter how carefully the test data is chosen, the percentage of code actually exercised can be very limited.

That issue is compounded by the fact that the procedures tested in this way are only likely to handle data within the range of the current application and test environment. If anything changes a little – perhaps in the way the application is used, or perhaps as a result of slight modifications to the code – the application could be running an entirely untested execution path in the field.

Of course, if all parts of the system are unit tested and collated on a piecemeal basis through integration testing, then this will not happen. But what if timescales and resources do not permit such an exercise?

Unit test tools often provide the facility to instrument code. This instrumented code is equipped to “track” execution paths, providing evidence of the parts of the application which have been exercised during execution. Such an approach provides the information to produce data such as that depicted in Figure 2.
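The idea of tracking execution can be illustrated in Python with the standard `sys.settrace` hook, which reports each line as it runs. This is only a sketch of the principle: commercial tools typically instrument the source code itself, and the `classify` procedure, the `lines_executed` helper and the hand-listed `BODY_LINES` are assumptions for this example.

```python
import sys


def classify(value):
    if value < 0:
        return "negative"
    if value == 0:
        return "zero"
    return "positive"


def lines_executed(func, *args):
    """Run func and return the executed lines, as offsets from its def."""
    executed = set()
    code = func.__code__
    previous = sys.gettrace()

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return executed


# Offsets of the five statements in classify(), listed by hand.
BODY_LINES = {1, 2, 3, 4, 5}

# Two test cases: a positive and a negative input.
covered = lines_executed(classify, 5) | lines_executed(classify, -1)
uncovered = BODY_LINES - covered
coverage_percent = 100 * len(BODY_LINES & covered) / len(BODY_LINES)
print(sorted(uncovered), coverage_percent)
```

The uncovered line here is the `return "zero"` branch, which a further test case with an input of 0 would exercise.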

Code coverage is an important part of the testing process in that it shows the percentage of the code that has been exercised and proven during test. Proof that all code has been exercised correctly need not be based on unit tests alone. To that end, some unit tests can be used in combination with system test to provide a required level of execution coverage for a system as a whole.

This means that the system testing of an application can be complemented by unit tests to exercise code which would not normally be exercised in the running of the application. Examples include defensive code (e.g., to prevent crashes due to inadvertent division by zero), exception handlers and interrupt handlers.
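For instance, the two hypothetical Python procedures below each contain a defensive branch that normal running of an application might never reach. A unit test can drive those branches directly by supplying the awkward input.

```python
def safe_ratio(numerator, denominator):
    # Defensive code: avoid a crash on inadvertent division by zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


def parse_speed(text):
    # Exception handler: fall back to -1 when the text is not a number.
    try:
        return int(text)
    except ValueError:
        return -1


# Unit tests reach the defensive paths without needing a system fault.
defensive_results = {
    "zero_denominator": safe_ratio(5, 0),
    "bad_input": parse_speed("fast"),
    "normal_ratio": safe_ratio(6, 3),
}
print(defensive_results)
```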

Automatically generating test cases
Generally, the output data generated through unit tests is an important end in itself, but this is not necessarily always the case. There may be occasions when the fact that the unit tests have successfully completed is more important than the test data itself. This happens when source code is to be tested for robustness.

To provide for such eventualities, it is possible to use test tools to automatically generate test data as well as the test cases. High levels of code execution coverage can be achieved by this means alone, and the resultant test cases can be complemented by means of manually generated test cases in the usual way.
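A much simplified picture of automatic test data generation is given in the Python sketch below, which mixes a few boundary values with seeded random inputs and checks only that the procedure completes and stays within range. Real tools usually derive data from analysis of the code rather than at random; `compute_level`, `generate_cases` and `robustness_run` are hypothetical names.

```python
import random


def compute_level(ambient, target):
    """Hypothetical procedure under test: clamp the change to 0..100."""
    return max(0, min(100, target - ambient))


def generate_cases(count, seed=0):
    """Boundary values plus reproducible pseudo-random inputs."""
    rng = random.Random(seed)
    boundary = [(0, 0), (-1, 1), (2**31 - 1, -(2**31))]
    randoms = [(rng.randint(-1000, 1000), rng.randint(-1000, 1000))
               for _ in range(count)]
    return boundary + randoms


def robustness_run(cases):
    """Record any case that raises or returns an out-of-range value."""
    failures = []
    for ambient, target in cases:
        try:
            level = compute_level(ambient, target)
        except Exception as exc:  # robustness test: any exception counts
            failures.append((ambient, target, repr(exc)))
            continue
        if not 0 <= level <= 100:
            failures.append((ambient, target, level))
    return failures


generated = generate_cases(50)
failures = robustness_run(generated)
print(len(generated), failures)
```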

An interesting application for this technology involves legacy code. Such code is often a valuable asset, proven in the field over many years but likely to have been developed on an experimental, ad hoc basis by a series of “gurus”, expert at getting things done and in the application itself, but not necessarily at complying with modern development practices.

Frequently this “software of unknown pedigree” (SOUP) is required to form the basis of new developments which are obliged to meet modern standards either due to client demands or because of a policy of continuous improvement within the developer organization. This situation may be further exacerbated by the fact that coding standards themselves are the subject of ongoing evolution, as the advent of MISRA C:2004 clearly demonstrates.

If there is a need to redevelop code to meet such standards, then there is a need to not only identify the aspects of the code which do not meet them, but also to ensure that in doing so the functionality of the software is not altered in unintended ways. The existing code may well be the soundest or only documentation available and so a means needs to be provided to ensure that it is dealt with as such.

Automatically generated test cases can be used to address just such an eventuality. By generating test cases using the legacy code and applying them to the rewritten version, it can be proven that the only changes in functionality are those deemed desirable at the outset.
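In outline, the approach treats the legacy code as the specification: its outputs become the expected results for the rewrite. The Python sketch below assumes two hypothetical checksum procedures and a hand-picked list of inputs; a tool would generate the inputs automatically.

```python
def legacy_checksum(data):
    """Hypothetical legacy procedure, written in an older style."""
    total = 0
    i = 0
    while i < len(data):
        total = (total + data[i]) % 256
        i += 1
    return total


def rewritten_checksum(data):
    """Hypothetical rewrite intended to behave identically."""
    return sum(data) % 256


def generate_cases_from(procedure, inputs):
    """Record what the legacy code does: (input, observed output) pairs."""
    return [(value, procedure(value)) for value in inputs]


def apply_cases(procedure, cases):
    """Return every case where the procedure disagrees with the record."""
    mismatches = []
    for value, expected in cases:
        actual = procedure(value)
        if actual != expected:
            mismatches.append((value, expected, actual))
    return mismatches


inputs = [[], [1, 2, 3], [255, 1], list(range(300))]
cases = generate_cases_from(legacy_checksum, inputs)
mismatches = apply_cases(rewritten_checksum, cases)
print(mismatches)  # an empty list means no behavioural difference was seen
```

An empty result only shows agreement on the inputs tried, so the breadth of the generated cases matters.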

In conclusion
The Apollo missions may have seemed irrelevant at the time, and yet hundreds of everyday products were developed or modified using aerospace research – from baby formula to swimsuits.

Formula One racing is considered a rich man's playground, and yet British soldiers are benefiting from the protective qualities of the light, yet strong materials first developed for racing cars. Hospital patients and premature babies may stand a better chance of survival than they would have done a few years ago, thanks to the transfer of F1 know-how to the medical world.

Likewise, unit test has long been perceived to be a worthy ideal – an exercise for those few involved with the development of high-integrity applications with budgets to match. But the latest unit test tools provide slick, efficient mechanisms that optimize the development process for all.

The availability of such tools has made this technology and unit testing itself an attractive proposition for applications where sound, reliable code is a commercial requirement, rather than a life-and-death imperative.

Unit test tools have long provided commercial benefit for the team developing the highest integrity applications. Now these tools can also streamline the efforts of their peers working in less critical environments, even those charged with the ongoing development of undocumented legacy code.

Mark Pitchford has over 25 years' experience in software development for engineering applications. He has worked on many significant industrial and commercial projects in development and management, both in the UK and internationally including extended periods in Canada and Australia. For the past 5 years he has specialised in software test, and works throughout Europe and beyond as a Field Applications Engineer with LDRA Ltd.
