Measuring Changes in Software with CLOC
Source Code Evolution
The evolution of software from version to version is a mixture of several different kinds of changes. Figure 1 above shows how the LOC and files of a software project evolve.
There are detailed changes that absolute static measurements of the two versions cannot account for. Specifically, the maintenance and development of the software project can result in some of the original files being removed and other files continuing with either changed lines or with lines being removed.
The SLOC method is not accurate enough to measure software evolution because the subsequent version is not just the result of adding new LOC to the original version.
A software project could have a great deal of refactoring between versions that does not significantly increase the overall SLOC of the project, but still represents a significant amount of effort and evolution.
Even small modifications to existing code can represent a large amount of effort, because they involve understanding the software to be refactored and then performing additional testing [3 pg. 2]. Therefore, it is necessary to take these additional details of software evolution into account.
Changing Lines of Code Measure (CLOC)
The CLOC method eliminates these discrepancies and properly measures the intricate changes involved. The CLOC method counts the number of LOC that have been added, changed, or remain unchanged.
These values are then combined to express the change in software as a rate of growth. The results can also be expressed in terms of the decay of the original code, which can be useful as a measure of how much original intellectual property ("IP") still exists from that original code.
Measurements. The CLOC method relies upon the CodeDiff and FileCount tools in Software Analysis & Forensic Engineering Corporation's (S.A.F.E.) toolset, CodeSuite, combined with a specially developed CLOC spreadsheet.
FileCount is a function that simply counts the files, lines of code, and bytes in a directory tree. CLOC requires that FileCount first be used to count the number of program-specific files and the number of non-blank lines in the software project's directory tree. CodeDiff is a function that exhaustively compares the lines of code in one set of source code files to those in another set of source code files.
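The counting step described above can be sketched in a few lines of Python. This is not the CodeSuite FileCount tool; it is a minimal stand-in, and the function name `file_count`, the `extensions` parameter, and the default C/C++ extension list are assumptions made only for illustration.

```python
import os


def file_count(root, extensions=(".c", ".h", ".cpp", ".hpp")):
    """Count files, non-blank lines, and bytes under `root`.

    Hypothetical stand-in for a FileCount-style tool. `extensions` must be
    a tuple of lowercase suffixes that mark "program-specific" files; the
    default list is an assumption, not something taken from CodeSuite.
    """
    totals = {"files": 0, "non_blank_lines": 0, "bytes": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue  # skip files that are not source code
            path = os.path.join(dirpath, name)
            totals["files"] += 1
            totals["bytes"] += os.path.getsize(path)
            # Undecodable bytes are replaced so odd encodings do not abort the count.
            with open(path, "r", errors="replace") as handle:
                totals["non_blank_lines"] += sum(1 for line in handle if line.strip())
    return totals
```

Calling `file_count` once on each version's directory tree would give the per-version totals that the later steps combine with the comparison results.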
CLOC requires that CodeDiff be used to compare same-name files from the original version to subsequent versions of the software project. Typically, movements of source code between files represent work being performed.
Similarly, a file name change represents work being performed, because file names are not generally changed from version to version unless there is a significant change to the functionality of the file.
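The same-name comparison can be approximated with Python's standard `difflib` module. The article does not describe CodeDiff's internal algorithm, so this sketch swaps in `difflib.SequenceMatcher` as a substitute technique. The function names, the choice to match files by identical relative path, and the rule for splitting a replaced region into changed, added, and removed lines are all assumptions.

```python
import difflib
import os


def read_lines(path):
    """Return the stripped, non-blank lines of a file."""
    with open(path, "r", errors="replace") as handle:
        return [line.strip() for line in handle if line.strip()]


def diff_lines(old_lines, new_lines):
    """Classify lines as unchanged, changed, added, or removed."""
    counts = {"unchanged": 0, "changed": 0, "added": 0, "removed": 0}
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_n, new_n = i2 - i1, j2 - j1
        if tag == "equal":
            counts["unchanged"] += old_n
        elif tag == "replace":
            # Assumption: pair old and new lines one-to-one as "changed";
            # any surplus on either side counts as added or removed.
            paired = min(old_n, new_n)
            counts["changed"] += paired
            counts["added"] += new_n - paired
            counts["removed"] += old_n - paired
        elif tag == "insert":
            counts["added"] += new_n
        elif tag == "delete":
            counts["removed"] += old_n
    return counts


def compare_same_name_files(old_root, new_root):
    """Sum line classifications over files present at the same relative path."""
    totals = {"unchanged": 0, "changed": 0, "added": 0, "removed": 0}
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for name in filenames:
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(new_root, os.path.relpath(old_path, old_root))
            if not os.path.isfile(new_path):
                continue  # removed or renamed files are not compared here
            counts = diff_lines(read_lines(old_path), read_lines(new_path))
            for key in totals:
                totals[key] += counts[key]
    return totals
```

Because only same-name files are compared, code that moves to a differently named file never appears in these totals as unchanged, which matches the idea that such movement represents work.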
The results of the CodeDiff analysis are then exported into a CodeSuite distribution report that contains the statistical information about changes in the files and LOC.
The CodeDiff statistics are then combined with the FileCount numbers in the CLOC spreadsheet to generate the rate of software growth. This is demonstrated in an analysis of the web browser Mozilla Firefox. The CLOC spreadsheet for the Firefox analysis is shown in Table 1 below.
The data elements shown in bold are generated from formulas in the spreadsheet that use the CodeDiff and FileCount data as input, whereas the other numbers are generated automatically by FileCount and CodeDiff.
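The kind of derived value a spreadsheet formula might produce from these inputs can be illustrated as follows. The actual CLOC spreadsheet formulas are not reproduced in this section, so the two ratios below are illustrative assumptions rather than the CLOC definitions, and the function and key names are hypothetical.

```python
def combine_counts(original_loc, unchanged, changed, added):
    """Derive illustrative ratios from line counts.

    Assumed definitions (NOT taken from the CLOC spreadsheet):
      growth_ratio   = (added + changed) / original_loc
      retained_ratio = unchanged / original_loc
    """
    if original_loc <= 0:
        raise ValueError("original_loc must be positive")
    return {
        "growth_ratio": (added + changed) / original_loc,
        "retained_ratio": unchanged / original_loc,
    }
```

Under these assumed definitions, the retained ratio falls as the original code decays, which is the sort of figure that could indicate how much of the original code still survives in a later version.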