GUI testing: exposing visual bugs
My approach to testing a graphical user interface (GUI) has always been to find the most appropriate access point to manually inject test cases. This article will discuss the challenges of trying to make GUI tests repeatable, and we'll look at a homegrown framework that allows test input to be managed.
To test a GUI, we need a framework that lets us inject test cases easily and observe the output. Ideally, the test system stores that output so that a subsequent test run can be compared with a previous run to provide regression testing. By regression testing, we mean checking that the new version under test has not broken anything that worked in the previous version.
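The store-and-compare idea can be sketched in a few lines of Python. This is only an illustration under assumed names: check_against_baseline and the baseline file are hypothetical, and the output is assumed to be available as text.

```python
from pathlib import Path


def check_against_baseline(output: str, baseline_path: Path) -> str:
    """Compare one run's output with the stored output of an earlier run.

    Returns "recorded" when no baseline existed yet (the output is saved),
    "identical" when the output matches the baseline, and "different"
    otherwise. The function name and return values are illustrative only.
    """
    if not baseline_path.exists():
        # First run: nothing to compare against, so store this output.
        baseline_path.write_text(output, encoding="utf-8")
        return "recorded"
    previous = baseline_path.read_text(encoding="utf-8")
    return "identical" if previous == output else "different"
```

The first run records the baseline, and every later run reports whether its output still matches it. Deciding what a "different" result means is left to the engineer.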
If we consider a non-GUI problem such as message processing, we might inject a test set of messages, after which the application's message store, or inbox, will contain those messages. If the inbox from one run is compared with the inbox from the next, there are two possible outcomes. The first is that the inboxes are identical, which normally means that the new functionality has not broken previous features. In some cases, however, we expect the new functionality to change the output. If the output should have changed but did not, identical inboxes tell us that the new features are not working.
The second possibility is that the outputs are different. If the inbox messages can be expressed in text format, a text difference tool can show the differences, and the engineer can examine them to decide whether the changes match the new functionality. For example, if an urgency field has been added to the messages, a message identifier might change from:
Message index:17, message title: test_185_title
to:
Message index:17, message title: test_185_title, urgency: normal
If the only changes in the output are consistent with this, we have successfully added the urgency feature, and we have not broken anything else.
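As a rough sketch of that check, Python's standard difflib module can list the lines that differ between two runs. The function names changed_lines and only_suffix_added are hypothetical, and each inbox is assumed to be a list of text lines, one per message.

```python
import difflib


def changed_lines(previous_run, current_run):
    """Return only the removed ("- ") and added ("+ ") lines between two runs."""
    return [
        line
        for line in difflib.ndiff(previous_run, current_run)
        if line.startswith(("- ", "+ "))
    ]


def only_suffix_added(previous_run, current_run, suffix):
    """True if every message is unchanged apart from the appended suffix."""
    return len(previous_run) == len(current_run) and all(
        new == old + suffix for old, new in zip(previous_run, current_run)
    )


previous_run = ["Message index:17, message title: test_185_title"]
current_run = ["Message index:17, message title: test_185_title, urgency: normal"]

# Show the engineer what changed, then check that it is the expected change.
for line in changed_lines(previous_run, current_run):
    print(line)
print(only_suffix_added(previous_run, current_run, ", urgency: normal"))
```

The first function gives the engineer something to read. The second automates the specific judgment made above, that the urgency field is the only difference.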
It would be great to apply the same principle to graphics, but the challenge is that the appearance of the display cannot usually be expressed in a human-readable text format, so differences are hard to examine. The output of a GUI test is the appearance of the screen, which is a large number of pixels, each with a color value. While a human being can view the screen and establish whether the text is readable and the layout conforms to requirements, checking whether each pixel has the correct numerical value (or color) is a nontrivial task.
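To make the pixel-level problem concrete, here is a minimal sketch that compares two screen captures pixel by pixel. It assumes each capture is already available as a list of rows of (red, green, blue) tuples. How the capture is obtained is outside this sketch, and the names diff_pixels and bounding_box are hypothetical.

```python
def diff_pixels(baseline, candidate):
    """Return the (x, y) coordinates at which two same-sized frames differ."""
    same_shape = len(baseline) == len(candidate) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(baseline, candidate)
    )
    if not same_shape:
        raise ValueError("frames must have identical dimensions")
    return [
        (x, y)
        for y, (row_a, row_b) in enumerate(zip(baseline, candidate))
        for x, (pixel_a, pixel_b) in enumerate(zip(row_a, row_b))
        if pixel_a != pixel_b
    ]


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) of the changed region, or None."""
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

A comparison like this reports where the pixels differ, but not whether the difference matters. A one-pixel shift in a font can flag thousands of coordinates while the screen still looks correct to a human, which is why exact comparison alone is not enough.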