Bug prediction is an important challenge for software engineering research. It consists in looking for early indicators of the presence of bugs in a software system. Despite the relevance of the problem, most experiments designed to evaluate bug prediction only investigate whether there is a linear relation between the predictor and the presence of bugs, even though it is well known that standard regression models cannot filter out spurious relations.
Therefore, in this paper we describe an experiment to discover more robust evidence of causality between software metrics (as predictors) and the occurrence of bugs. For this purpose, we have relied on the Granger Causality Test, which evaluates whether past changes in a given time series are useful to forecast changes in another series.
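To illustrate the idea behind the test, the following sketch implements the standard bivariate Granger F-test with NumPy: it compares a restricted autoregression of a series on its own past against an unrestricted model that also includes lagged values of a candidate cause. The variable names and the synthetic data are illustrative only and are not taken from the paper; in the paper's setting, `x` would be the time series of a metric and `y` the series of reported defects.

```python
import numpy as np

def granger_f_test(x, y, lags=2):
    """F-statistic testing whether past values of x help predict y
    beyond what y's own past already explains (Granger causality)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(y) - lags                      # number of usable observations
    Y = y[lags:]
    # Build lagged regressor matrices: column i holds the series shifted by i.
    y_lags = np.column_stack([y[lags - i: len(y) - i] for i in range(1, lags + 1)])
    x_lags = np.column_stack([x[lags - i: len(x) - i] for i in range(1, lags + 1)])
    ones = np.ones((n, 1))
    X_r = np.hstack([ones, y_lags])          # restricted: y's own past only
    X_u = np.hstack([ones, y_lags, x_lags])  # unrestricted: plus x's past
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(X_r), rss(X_u)
    df_num, df_den = lags, n - 2 * lags - 1
    return ((rss_r - rss_u) / df_num) / (rss_u / df_den)

# Synthetic example: y is driven by the previous value of x, so the
# F-statistic for "x Granger-causes y" should be large, while the
# reverse direction should be near 1.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.zeros(300)
y[1:] = 0.9 * x[:-1] + rng.normal(scale=0.1, size=299)
```

A large F-statistic (relative to the F-distribution with the stated degrees of freedom) rejects the null hypothesis that x does not Granger-cause y; production analyses would typically use a library implementation such as the one in statsmodels rather than this hand-rolled version.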
As its name suggests, the Granger Test provides a stronger indication of causality between two variables than simple correlation. We present and discuss the results of experiments on four real-world systems evaluated over a time frame of almost four years. In particular, we have been able to discover in the history of metrics the causes, in the sense of the Granger Test, for 64% to 93% of the defects reported for the systems considered in our experiment.
We have also been able to identify, for each defective class, the particular metrics that Granger-caused the reported defects. Finally, as described in other studies, we could not identify a "holy grail" for bug prediction, i.e., a small set of metrics that is universally responsible for most of the defects, regardless of the system considered. Instead, we have found that the metrics Granger-causing bugs can vary significantly from system to system, and also among the different types of bugs within a particular system.
The full paper is available at: http://hal.archives-ouvertes.fr/docs/00/66/81/51/PDF/csmr2012.pdf