
Estimating Program Complexity
by Orv Balcom
Estimating projects is not all crystal ball gazing. Here's a methodology that incorporates function points to help you deliver your next project on time and within budget.
The ability to accurately predict program complexity early in the software development cycle is one of the necessary skills of the software engineer. We all know the effects of underestimating the software development task: missed markets, poor employee morale, and bad performance reviews, to name a few. On the other hand, overestimating the programming tasks can often get a project canceled, or if it's contract programming, lose you the job. Jack Ganssle discussed this problem at length in his column "Lies, Damn Lies, and Schedules," in last December's issue of this magazine. [1]
This article provides a method to improve the inexact science of estimating programming effort for embedded systems. It will take the concept of "function points" as an indication of program complexity from the data processing side of programming and suggest how to apply these concepts to embedded programs. These suggestions will be based on my own experience as an embedded system programmer, mainly in assembly language. We'll then analyze four examples of this approach and draw conclusions.
For this or any estimating approach to work, a definitive functional requirement specification is needed. As Ganssle states, "without a clear project specification, any schedule estimate is nothing more than a stab in the dark" (p. 114). This approach also requires that a definitive software requirements specification be developed before the estimation process begins.
History
Traditionally, program complexity has been specified by lines of code (LOCs), or in the case of programs written in assembler, by bytes of code. Obviously, with LOCs, lines of new code should be specified, because libraries and debugged modules from previous projects need not be included. But do comment or blank lines count? Is Visual BASIC more efficient than QuickBASIC because Visual BASIC forces you to write one long single line where QuickBASIC allows lines to be extended with an underline? Even counting bytes in assembler isn't that reliable, because large tables of ASCII data often make for the simplest display message generation but use lots of ROM.
The concept of estimating the functionality of the program requirements to determine the complexity of the resultant program has been around for 20 years. Much has been written about its application in the data processing field, but little or nothing about its applicability to embedded systems. With some modification to the data processing approach, it can be a valuable tool.
How it Came About
As a consultant, I recently accepted a project to add some new features to a hand-held vacuum meter. The existing code was in MC68HC11 assembler; after reviewing that code, I decided to convert the whole program to use software common to my other projects. At the time, I realized that my fixed-price bid was going to be too low, but I looked at this as an ongoing project with a new client. It wasn't until I finished the software requirements spec (SRS) that I realized how low my bid was! If you'd like to know more about what I include in an SRS, see my article in the February 1997 issue of this magazine. [2]
My daily work schedule is to get started by about 7 a.m. and work straight through until 3 p.m., then go to the gym for some weights or racquetball. With this new project, I found myself missing the gym a lot because I was falling farther and farther behind schedule. So I set up a habit of setting completion goals for each day. If I met the goal, off to the gym; if not, work until it was met (or reassess the day's goal). To set my daily goal, I'd look at the SRS and think part A will take a couple of hours, as will part B; now part C may take six hours, and so on.
This procedure rang a bell: function points. I remembered reading in the trade journals about them as a way of estimating program complexity by the functions that must be completed. I was lucky a few months back when my neighbor, who had left the aerospace industry, had his whole computer science library up for grabs at a yard sale. I bought 43 books for $43 and really filled in my reference library. While scanning the indexes, I found a reference to function points in Software Engineering by Roger Pressman. [3] I read the section and found it quite relevant, even though the author states, "It [function points] may not be relevant to control-oriented or embedded applications in the engineering products and systems domain." I was on to something, though.
Pressman referenced two articles by A. J. Albrecht of IBM concerning the use of function points for estimating program complexity. [4, 5] The first was in an IBM proceedings from 1979, and short of having one of my old mainframe buddies try to get it for me, I was out of luck. The second article was an IEEE document from 1983. I joined the IEEE Computer Society in 1979, so I went off to the garage to my periodical library. (I'll admit I save all my magazines. Who knows when I might need to refer to one?) I had the referenced issue of the Transactions, and the following is the gist of what Albrecht and Gaffney said.
In their world, it seems that the only things of import were inputs, outputs, inquiries, and master files. I then understood why Pressman said that these probably wouldn't fit embedded systems! Albrecht and Gaffney assigned function points as follows:
Four function points for each input
Five function points for each output
Four function points for each inquiry
Ten function points for each master file
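As a rough sketch, the weights listed above reduce to a simple weighted sum. The function name and the sample counts below are hypothetical, chosen only to illustrate the arithmetic; no adjustment for complexity or language is modeled.

```python
# Weights per function type, as listed above (Albrecht and Gaffney).
WEIGHTS = {"inputs": 4, "outputs": 5, "inquiries": 4, "master_files": 10}


def dp_function_points(counts):
    """Return the weighted sum of function-type counts."""
    return sum(WEIGHTS[kind] * n for kind, n in counts.items())


# Hypothetical application: 10 inputs, 8 outputs, 5 inquiries, 3 master files.
sample = {"inputs": 10, "outputs": 8, "inquiries": 5, "master_files": 3}
print(dp_function_points(sample))  # 10*4 + 8*5 + 5*4 + 3*10 = 130
```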
They then analyzed 24 different custom applications: 18 written in COBOL, four in PL/1, and two in DMS. These applications were all written by the IBM DP Services organization. They accounted for about one million lines of code and half a million work-hours. Using the input/output/inquiry/file values above, the number of function points per work-hour was calculated for each application. A "best" straight line for function points vs. work-hours was fitted to the data. It showed very good correlation, with an "r" of 0.9350 (one is perfect, zero is no correlation). While this data shows that IBM could only produce a couple of lines of code an hour, they were very consistent.
The paper went on to discuss ways to "tweak" the function point assignment for complexity and modify the LOCs per function point for the language used. The authors concluded that function points appear to be superior to LOC count in estimating the work-hours in a project.
Function Points Meet Embedded Systems
Obviously, the above method of function point calculations will work with few, if any, embedded systems. I decided to try the following approach. I attempted to look at each discernible task in the SRS and assign a function point "score" based on the complexity of the required functionality. This would essentially be an hour count for what I thought it would take to code and debug it, based on what had to be done. If reusable code was available, only the integration time was counted. I assigned letters with the following weights:
A=1; B=2; C=4; D=8; E=16; F=32
In other words, if I thought it could be done in an hour (I could do eight of this type in a day), I would assign an A. If I could do two a day, I would assign a C. If it would take close to a week, I'd assign an F. Each letter's occurrence was counted and the total number multiplied by the assigned value. These values were then summed to get the function point total.
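The counting rule just described (count each letter, multiply by its weight, sum the products) can be sketched in a few lines. This is only an illustration under the weights given above; the function name and the sample grades are hypothetical.

```python
from collections import Counter

# Letter weights from the text; each is roughly an hour count.
GRADE_WEIGHTS = {"A": 1, "B": 2, "C": 4, "D": 8, "E": 16, "F": 32}


def function_point_total(grades):
    """Count each letter's occurrences, multiply by its weight, and sum."""
    counts = Counter(grades)
    return sum(GRADE_WEIGHTS[grade] * n for grade, n in counts.items())


# Hypothetical SRS section with three A tasks, one C, and one D.
print(function_point_total(["A", "A", "A", "C", "D"]))  # 3*1 + 4 + 8 = 15
```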
An Example
I refer to my section 5.0 of an SRS as "Timing," which consists of the time-dependent code. This includes initialization and usually a task scheduler, both of which are often existing reusable code. Table 1 provides the sub-paragraph headings. Now, I will describe how I assigned the function-point values.
TABLE 1: Typical section 5.
| Task | Function Point Value |
|---|---|
| 5.3.1 Initialization | |
| 5.3.1.1 Memory and Register Relocation | |
| 5.3.1.2 Pre-scaler Initialization | A |
| 5.3.1.3 System Configuration Initialization | |
| 5.3.1.4 Initialize Display | C |
| 5.3.1.5 Self Test | |
| 5.3.1.5.1 Self Test ROM | |
| 5.3.1.5.2 Self Test Internal RAM | A |
| 5.3.1.5.3 Self Test External RAM | |
| 5.3.1.5.4 Self Test Replicated RAM | |
| 5.3.1.5.5 Self Test Error Reporting | A |
| 5.3.1.6 Initialize Peripherals | |
| 5.3.1.6.1 Initialize the Timer | |
| 5.3.1.6.2 Initialize the ADC | A |
| 5.3.1.6.3 Initialize the Parallel Port A | |
| 5.3.1.6.4 Initialize the Serial Comm. Interface | |
| 5.3.1.6.5 Zero Variables | |
| 5.3.1.7 Interrupt System Initialization | A |
| 5.3.1.8 Illegal Interrupt Processing | B |
| 5.3.2 Real Time Interrupt Processing | |
| 5.3.2.1 Service the Compare Register | |
| 5.3.2.2 Clear the Interrupt Flag | A |
| 5.3.2.3 Service the Interrupt Counters | |
| 5.3.2.4 Schedule Foreground Routines | |
| Total Function Points = 6*A + B + C = 12 | |
Paragraphs 5.3.1.1 through 5.3.1.3 were basically existing code and clearly defined in the SRS, so I assigned a single function point value of A to the group. Paragraph 5.3.1.4, "Initialize the Display," was assigned a value of C because it was a new display with a 4-bit interface and existing code had to be modified. Within "Self Test" (Paragraph 5.3.1.5), sub-paragraphs 5.3.1.5.1 through 5.3.1.5.4 were existing code that only required some address modification. Together they were given a value of A. Sub-paragraph 5.3.1.5.5 required some modification of existing code, so it got an A by itself.
The requirements for all of Paragraph 5.3.1.6, "Initialize Peripherals," were clearly defined in the SRS and covered by existing code, so one A covered it. Paragraph 5.3.1.7 was also existing code and was assigned an A. Paragraph 5.3.1.8, "Illegal Interrupt Processing," required some new code and a new display, so it got a B. All of Paragraph 5.3.2, the interrupt service routine task scheduler, was existing code and could be easily integrated, so it was given an A.
The total for Section 5.0 was six A's, a B, and a C for 12 function points. This exemplifies how reusable code can really save development effort. Sections 6, 7, and 8 of the SRS would be analyzed in a similar manner to determine the total function points for the project.
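The Section 5 tally can be checked with a short sketch that records one grade per group of paragraphs, following the grouping described in the text. The data layout and variable names are assumptions made for illustration.

```python
GRADE_WEIGHTS = {"A": 1, "B": 2, "C": 4, "D": 8, "E": 16, "F": 32}

# One entry per graded group of paragraphs, as described in the text.
SECTION_5 = [
    ("5.3.1.1 to 5.3.1.3", "A"),      # existing code
    ("5.3.1.4", "C"),                 # new display with a 4-bit interface
    ("5.3.1.5.1 to 5.3.1.5.4", "A"),  # existing self tests, address changes only
    ("5.3.1.5.5", "A"),               # error reporting, some modification
    ("5.3.1.6 (all)", "A"),           # peripheral initialization, existing code
    ("5.3.1.7", "A"),                 # interrupt system initialization
    ("5.3.1.8", "B"),                 # illegal interrupt processing, some new code
    ("5.3.2 (all)", "A"),             # interrupt-driven task scheduler
]

section_total = sum(GRADE_WEIGHTS[grade] for _, grade in SECTION_5)
print(section_total)  # 6*1 + 2 + 4 = 12
```

Grading a group of reusable paragraphs with a single letter, rather than one letter per paragraph, is what keeps the total this low.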
Function Points and Real Projects
You can argue that the above process is nothing new. Any good manager will go through the SRS, section by section, to estimate the manpower that's required to do the coding. Even so, the reason to use the function point approach is to add formality to the process. As the approach is used on more projects, the ability to assign function point values to the functionality required by the SRS should become more consistent.
Albrecht and Gaffney achieved a good correlation between function points and work-hours. How good a correlation can one expect with my function point approach applied to embedded projects? Since I became a full-time consultant over 25 years ago, I have kept track of my work-hours on time cards. This is mainly to aid in bidding fixed-price projects because it allows me to go back and see how much time was spent on a project. In his article, Ganssle is adamant about the necessity of record keeping, and I agree.
To verify my function point approach, I selected four projects I'd coded over the last 10 years that had the following in common: the SRS for the project was written and approved before the coding started. These were all completely new projects. None of the projects required learning new tools and I did all of the programming.
The first project was a DOS-based PC data acquisition system for collecting and analyzing data in a turbocharger test cell. It's used for development testing by a turbocharger manufacturer. The program was coded in Microsoft QuickBASIC. The resultant programs consisted of 11 .EXE files which were linked using the CHAIN WITH COMMON feature of QuickBASIC. The SRS contained 126 pages. The project was completed in 1988.
The second project was a data acquisition and control system for an array of Quartz Crystal Microbalances (QCMs) which were used for contamination measurement aboard a spacecraft. The CPU was a Z80 and the code was written in assembler with a macro preprocessor. The SRS was 62 pages long and the resultant code was 3,036 bytes. The project was completed in 1987.
The third endeavor was a portable smoke meter which is used to measure the opacity of diesel truck exhaust. The CPU was an MC68HC11, programmed in assembler. The SRS had 130 pages and the program occupied 22,385 bytes. It was completed in 1994.
The last project was the previously mentioned vacuum meter. The CPU was again the MC68HC11, also programmed in assembler. This SRS was 142 pages long and the code took up 25,620 bytes. This project was completed in 1997.
I examined the SRSs and assigned function points using the method described above. Then I went to the time cards and got the hours and counted lines of code in the listings. The results are shown in Table 2.
TABLE 2: Function points, hours, and lines of code.
| Project | Function Points | Lines of Code | Work-Hours | Hours per FP | LOC per FP |
|---|---|---|---|---|---|
| Turbo Test Syst | 380 | 11,124 | 302 | 0.79 | 29 |
| Spacecraft QCMs | 166 | 2,484 | 98 | 0.59 | 15 |
| Smoke Meter | 235 | 10,206 | 212 | 0.90 | 43 |
| Vacuum Meter | 598 | 12,312 | 343 | 0.57 | 21 |
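The two derived columns of Table 2 are plain ratios, which the following sketch recomputes from the raw function point, LOC, and work-hour figures. The helper name `ratios` is hypothetical.

```python
# (project, function points, lines of code, work-hours) from Table 2.
PROJECTS = [
    ("Turbo Test Syst", 380, 11124, 302),
    ("Spacecraft QCMs", 166, 2484, 98),
    ("Smoke Meter", 235, 10206, 212),
    ("Vacuum Meter", 598, 12312, 343),
]


def ratios(fp, loc, hours):
    """Return (hours per function point, lines of code per function point)."""
    return hours / fp, loc / fp


for name, fp, loc, hours in PROJECTS:
    hours_per_fp, loc_per_fp = ratios(fp, loc, hours)
    print(f"{name:16s} {hours_per_fp:.2f} h/FP  {loc_per_fp:.0f} LOC/FP")
```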
I entered the function point, LOC, and work-hour data into Microsoft Excel. The plots of work-hours vs. function points and lines of code vs. function points are shown in Figures 1 and 2. Using the LINEST function of Excel, I calculated the best least-squares-fit straight line for the data. The equations for the lines and the "r's," or coefficients of determination for each plot, are shown in the figures.
Figure 1
Figure 2
The function point vs. work-hours data had an "r" of 0.83. While not the 0.9350 value of the IBM data, it seemed quite good for only four samples, compared to the 24 samples in the IBM data. Table 3 shows how well the derived estimating equation would have predicted the work-hours from the function points. I can think of a lot of reasons why the estimates are off, such as having to deal with buggy early releases of QuickBASIC 4.0. But since the error in estimating is about a work-week or less, I intend to use this approach for future projects.
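The least-squares fit described above does not depend on Excel. As a minimal sketch in plain Python (the function name is an assumption, and this is ordinary least squares rather than a reproduction of LINEST itself), the same line can be fitted to the Table 2 data:

```python
# Function points and work-hours for the four projects in Table 2.
fp = [380, 166, 235, 598]
hours = [302, 98, 212, 343]


def least_squares(x, y):
    """Return (slope, intercept, r_squared) for the line y = slope*x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # coefficient of determination
    return slope, intercept, r_squared


slope, intercept, r_squared = least_squares(fp, hours)
print(f"hours = {slope:.2f} * FP + {intercept:.0f}, r^2 = {r_squared:.2f}")
```

With these four data points the sketch gives a slope near 0.52, an intercept near 60, and a coefficient of determination near 0.83, which suggests the quoted 0.83 is the squared correlation (the correlation itself comes out at about 0.91).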
TABLE 3: Error in work-hours from estimating equation.
| Project | Function Points | Estimated Hours | Actual Hours | Error (hours) | Error as Percent |
|---|---|---|---|---|---|
| Turbo Test Syst | 380 | 258 | 302 | +44 | +17% |
| Spacecraft QCMs | 166 | 146 | 98 | -48 | -33% |
| Smoke Meter | 235 | 182 | 212 | +30 | +16% |
| Vacuum Meter | 598 | 371 | 343 | -28 | -8% |

Estimating Equation: Work-hours = (0.52 * FP) + 60
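The columns of Table 3 follow directly from the estimating equation. In the sketch below, the error is taken as actual minus estimated hours and the percentage is relative to the estimate; that convention is inferred from the table values rather than stated in the text, and the function name is hypothetical.

```python
def estimate_hours(function_points, slope=0.52, intercept=60):
    """Apply the estimating equation from Table 3."""
    return slope * function_points + intercept


# (project, function points, actual work-hours) from Table 3.
PROJECTS = [
    ("Turbo Test Syst", 380, 302),
    ("Spacecraft QCMs", 166, 98),
    ("Smoke Meter", 235, 212),
    ("Vacuum Meter", 598, 343),
]

for name, fp, actual in PROJECTS:
    estimated = round(estimate_hours(fp))
    error = actual - estimated           # positive: the job ran over the estimate
    percent = 100 * error / estimated    # error relative to the estimate
    print(f"{name:16s} est {estimated:3d}  actual {actual:3d}  "
          f"error {error:+d} ({percent:+.0f}%)")
```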
A certain amount of subjectivity exists in assigning function-point values to tasks. I would hope that with practice, I could zero in on a consistent evaluation, as that is all that's necessary. The samples I did for this article were my first try at it. I'm sure that if I analyzed the same projects again, knowing the previous results, I could obtain more consistent results, but that isn't fair. The idea is to develop an approach, over time, that is consistent. Then the analysis of a new project from the SRS, before any code is written, will predict the work-hours required.
What To Do
To predict the work-hours required to code a project using the method I've described, you should do the following:
1. Review previous projects to try to determine the function point score from the requirements
2. Determine the estimating equation from the function points and work-hours of previous projects
3. Write a complete SRS for each new project
4. Use the previous estimating equation to predict the work-hours from the function point value of the new SRS
5. Keep detailed time cards on the programming effort for the new project
6. Evaluate the actual hours vs. the predicted hours and use the results to tune your function-point assignment skills
7. Add the data from the new project to your database
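The record-keeping loop in the steps above could be kept in something as small as the following sketch. The class and method names are hypothetical, the history is seeded with the four projects from Table 2, and the 300-point new project is invented purely for illustration.

```python
class EstimateHistory:
    """Hypothetical record of (function points, actual hours) per finished project."""

    def __init__(self):
        self.records = []

    def add(self, function_points, actual_hours):
        # Steps 1 and 7: store each completed project's score and hours.
        self.records.append((function_points, actual_hours))

    def equation(self):
        # Step 2: least-squares line through the recorded projects.
        # Needs at least two projects with different function point scores.
        n = len(self.records)
        mean_fp = sum(fp for fp, _ in self.records) / n
        mean_h = sum(h for _, h in self.records) / n
        sxx = sum((fp - mean_fp) ** 2 for fp, _ in self.records)
        sxy = sum((fp - mean_fp) * (h - mean_h) for fp, h in self.records)
        slope = sxy / sxx
        return slope, mean_h - slope * mean_fp

    def predict(self, function_points):
        # Step 4: estimate work-hours for a new SRS score.
        slope, intercept = self.equation()
        return slope * function_points + intercept


history = EstimateHistory()
for fp, hours in [(380, 302), (166, 98), (235, 212), (598, 343)]:
    history.add(fp, hours)

# Hypothetical new project scored at 300 function points.
print(round(history.predict(300)))
```

Refitting the line each time a project is added keeps the equation in step with the growing database, at the cost of early estimates shifting as new data arrives.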
Often an estimate of the programming effort is required before the SRS is even started. In this case, I'd suggest marking up an existing SRS from another project to add or delete functional requirements to match the new project's requirements. Then go through the function-point assignment procedure on the marked-up spec. Remember that even though everyone tries it now and then, you can't accurately estimate programming effort until the project's functional requirements are firm.
Try this approach. Tune it. Keep records, and hopefully, major software cost overruns will be a thing of the past.
A Valuable Aid
From a limited sample, it appears that the function point approach can be a valuable aid in estimating program complexity. Prediction was consistent within about a work-week of effort for each project. While this may not seem too good, compare it to the 2:1 overruns that are common in this industry, including my under-bidding of the vacuum meter project.
The test cases were all coded by a single programmer with consistent tools. The correlation between function points and work-hours may not hold with different programmers or inconsistent programming teams. Programming language doesn't seem to have a big effect, probably because the language effect is factored in when the function-point score is assigned to the SRS requirements. I will certainly use function points when estimating future projects.
Orv Balcom, an electronics industry veteran, can be reached at (310) 326-8482.
References
1. Ganssle, Jack, "Lies, Damn Lies, and Schedules," Embedded Systems Programming, December 1997, p. 113.
2. Balcom, Orv, "Requirement Specifications Abet Software Development," Embedded Systems Programming, February 1997, p. 52.
3. Pressman, Roger. Software Engineering, A Practitioner's Approach. Second Edition, New York: McGraw Hill, 1987, p. 92.
4. Albrecht, A. J., "Measuring Application Development Productivity," Proceedings of the IBM Application Development Symposium, Monterey, CA, October 1979, pp. 83-92.
5. Albrecht, A. J. and J. E. Gaffney, "Software Function, Source Lines of Code and Development Effort Prediction: A Software Science Validation," IEEE Transactions on Software Engineering, November 1983, pp. 639-648.
All material on this site Copyright © 2000 CMP Media Inc. All rights reserved.