Measure Twice, Cut Once

John Fusco

Carefully named variables help reduce confusion. Follow this naming convention to track units and avoid mismeasurement.

In the fall of 1999, NASA experienced the failure of its Mars Climate Orbiter. The failure was attributed to miscalculations in the thrust used to correct anomalies in the orbiter's trajectory on its way to Mars. These miscalculations, it was later discovered, were due to one subcontractor using English units instead of metric units. NASA specified the output of a piece of ground software to be in metric units of impulse (Newton-seconds); the software produced output in English units of impulse (pound-seconds).

This article discusses problems associated with the lack of a useful software abstraction for measurement units and presents some simple ideas for dealing with them.

Standard units
The Système International d'Unités (SI, for short) standardizes the units that are used in science and engineering. A feature of SI units is that you can solve all the fundamental equations in physics without any extra manipulation. For instance, the familiar equation F=ma can be solved by using the SI unit of mass (kilograms) and acceleration (meters per second per second). The resulting force is measured in Newtons (N). You could solve this equation using the English units of mass (the pound-mass) and distance (the foot), but the result would not be the English unit of force (the pound) unless you also divided it by a nonintuitive constant (about 32.174).
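
To make the difference concrete, here is a small sketch in C++ (the variable names and values are only illustrations, chosen so that each name carries its unit):

#include <iostream>

int main()
{
    // SI: kilograms times meters-per-second-per-second gives Newtons directly.
    double mass_kg         = 10.0;
    double accel_m_per_s2  = 2.0;
    double force_newtons   = mass_kg * accel_m_per_s2;             // 20 N

    // English units: pounds-mass times feet-per-second-per-second must be
    // divided by gc (about 32.174 lbm-ft/lbf-s^2) to yield pounds of force.
    double mass_lbm        = 10.0;
    double accel_ft_per_s2 = 2.0;
    double force_lbf       = (mass_lbm * accel_ft_per_s2) / 32.174;

    std::cout << force_newtons << " N, " << force_lbf << " lbf" << std::endl;
    return 0;
}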

SI classifies units into two different types: base units and derived units. A base unit is a unit that cannot be decomposed into other units. Length is an example of a base unit, since it cannot be expressed in terms of other units, like time or mass. A derived unit can be expressed in terms of multiple base units and may have a proper name. The SI unit of force is a derived unit that has a proper name. The Newton is derived from the SI units of mass, distance, and time. This can be illustrated using Newton's second law:

Force = mass * acceleration

Alternately:

Newton = kilograms*meters/seconds/seconds
1 Newton = 1 kg-m/s²

In addition, SI provides a vocabulary for scaling the standard units by powers of ten. These familiar prefixes are shown in Table 1.

Table 1: SI standard prefixes
Scale factor    Prefix            Symbol
10⁻⁹            nano              n
10⁻⁶            micro             µ
10⁻³            milli             m
10⁻²            centi             c
10⁻¹            deci              d
10¹             deka (or deca)    da
10²             hecto             h
10³             kilo              k
10⁶             mega              M
10⁹             giga              G

Why not use SI units for everything?
Mandating SI standard units in embedded software is not practical when the inputs and outputs are anything but standard. Such a requirement can directly impact the performance and cost of the system. For instance, a system that measures time in one-microsecond clock ticks would need a data type that can represent 1/1,000,000th of a second in order to keep the data in SI units. Since C does not provide a fixed-point type, the only other choices are to use a floating-point type or an integer in non-standard units.

Cost-sensitive embedded systems avoid floating-point computation whenever possible. Using a CPU without a floating-point coprocessor can cut costs. Floating-point computations on an integer CPU require the use of a floating-point emulation library to take the place of the coprocessor. The emulation library takes up precious memory, which can affect the cost of the system. Floating-point operations are emulated with function calls that may be routed via an exception handler. Operations that would normally compile to a single instruction for an integer are turned into function calls for emulated floating point. Even the fastest integer processor will suffer a significant performance penalty when it is forced to use emulated floating-point operations. When a floating-point coprocessor is available, the space required to store 4-byte floats might be too costly when 2-byte integers will suffice.

For most programmers, the obvious solution to this problem is to use a scaled integer, that is, an integer in non-standard units. For our one-microsecond clock tick, the simple choice is to represent time in units of microseconds. Strictly speaking, this is a standard unit because we are only scaling the time by a power of 10. Scaling SI units by powers of 10 is as simple as putting one of the prefixes shown in Table 1 in front of the unit name.
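
A sketch of that choice (the type and function names here are hypothetical):

#include <stdint.h>

/* Time kept as an integer count of 1-microsecond ticks: the SI second
   scaled by a power of ten.  The _usec suffix carries the unit. */
typedef uint32_t time_usec_t;

uint32_t usec_to_msec(time_usec_t t_usec)
{
    return t_usec / 1000u;    /* 1,000 microseconds per millisecond */
}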

Unfortunately, scaling by a power of 10 is often no help in embedded software. For example, the output of an analog-to-digital converter (ADC) is typically a binary value, meaning the range of output values is a power of two. The input range, on the other hand, can be constrained in a completely arbitrary fashion. So, in general, one count at the output of an ADC represents X/2^y, where X is the span of the input range and y is the number of bits in the output. Even if the actual input (x) is in standard units, your output will be in units of some standard unit divided by a power of two. SI doesn't help much here, since there is no vocabulary for representing standard units scaled by powers of two. That's probably just as well, since it is not very often that your input is an integral multiple of a standard unit either.
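
As a concrete sketch (the 12-bit converter and the 3,300 mV input range here are hypothetical), converting the raw counts back into a familiar unit means multiplying by the input range and dividing by a power of two:

#include <stdint.h>

/* Hypothetical 12-bit ADC measuring 0 to 3,300 millivolts:
   one output count represents 3300/2^12 millivolts. */
#define ADC_BITS      12u
#define ADC_RANGE_MV  3300u

uint32_t adc_counts_to_millivolts(uint16_t adc_counts)
{
    /* Multiply before dividing to preserve precision in integer math. */
    return ((uint32_t)adc_counts * ADC_RANGE_MV) >> ADC_BITS;
}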

Use and abuse of units in software
It's hard to adhere to any convention, let alone SI, in your software, when all your inputs are in non-standard units. Devices like timers, ADCs, and encoders spit out integers in whatever units a hardware designer can dream of. Sometimes these units make sense and sometimes they're just arbitrary, but mostly they're not standard. It's up to the embedded programmer to make some sense out of the chaos.

It would be easy just to blame hardware designers, but there's plenty of blame to go around. Programmers can also play fast and loose with the data and confuse anyone who tries to follow. Many a shortcut taken to save CPU cycles makes the code read like a college textbook where “the proof is left as an exercise for the reader.” While shortcuts are often necessary in embedded software, it is not necessary to confuse everyone who looks at the code.

Some common shortcuts you may have run across include interchanging terms like “frequency” and “period,” or “time” and “distance” in your code. This may seem trivial as long as you are familiar with the design, but it can confuse a newcomer. Another example would be hard-coding scale factors in your code to convert from one system of units to another, with no mention of what is going on.

Sometimes, we hide vital information in a design document that could have easily been included in the code in the form of a comment or an informative name. Programmers will make assumptions without always checking the documentation; that's human nature. Some assumptions may seem obvious to the programmer, but can still be wrong. It may have seemed obvious to the programmers writing the software for the Mars Climate Orbiter that thrust should be expressed in pounds. If they had checked the design document, they would have found otherwise.

The problem of units in software
The C and C++ languages do not give you many choices when it comes to representing scalar variables. You basically have two choices: integer or floating point. The language makes no assumptions about what the data represents, so you get no complaints from the compiler when you do things like the following (all three mistakes are shown in the sketch after this list):

  • Assign units of one type to a variable in different units; for example, “length = time”
  • Add or subtract variables that are not in the same units; for example, “length = length + time”
  • Multiply or divide variables in different units and store the result in a variable in the wrong units; for example, “frequency = time/cycles”
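
A minimal sketch (with hypothetical variable names) shows how quietly all three slip past the compiler:

void unit_mistakes(void)
{
    double length_meters = 0.0;
    double time_seconds  = 5.0;
    double cycles        = 100.0;
    double frequency_hz  = 0.0;

    length_meters = time_seconds;                   /* "length = time": compiles silently          */
    length_meters = length_meters + time_seconds;   /* "length = length + time": compiles silently */
    frequency_hz  = time_seconds / cycles;          /* actually a period, not a frequency:         */
                                                    /* compiles silently                           */
    (void)length_meters;
    (void)frequency_hz;
}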

In a nutshell, what we need to define is a unique scalar type for each unit of measure, such as length or time, that is derived from one of the native scalar types (int, double, float). The goal is to define some simple rules and let the compiler enforce them. Some languages, like Ada, have this ability built-in. Once you define your own scalar type, an Ada compiler won't allow you to mix up the math unless you explicitly instruct it to do so. Unfortunately, this is not so trivial in C and C++.

The typedef keyword, which provides the simplest way to define your own scalar type in C and C++, creates nothing more than an alias. The compiler doesn't care if you mix up an expression using a “meter_type” and a “second_type,” as long as they are both scalars. In most cases you won't get a warning, much less an error. To make matters worse, program checkers like lint don't care about user-defined scalar types either! You may as well just use the preprocessor.
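
A short sketch (the type and variable names are hypothetical) shows how little typedef buys you:

typedef double meter_type;
typedef double second_type;

void typedef_example(void)
{
    meter_type  distance_meters = 100.0;
    second_type elapsed_seconds = 9.8;

    /* Both names are mere aliases for double, so this nonsense assignment
       compiles without a peep from the compiler or from lint. */
    distance_meters = elapsed_seconds;
    (void)distance_meters;
}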

The next logical step is to define a class, but that opens up a new can of worms. In C++ a class cannot inherit from a scalar. For instance, the following C++ statement will not compile:

class METER : double  {};	    // Does not compile

If you want to create your own scalar class, you must settle for something like the following:

class METER { double value; };

This is unfortunate because now the compiler knows absolutely nothing about scalar operations for this class. You must write code for every scalar operation, including several you may not have thought of. For instance, in addition to writing operator methods for “+”, “-“, and so on, you must supply the code to perform all type conversions, which are necessary to evaluate a simple expression like the following:

velocity = (METERS_PER_SEC) meters   / (METERS_PER_SEC) seconds;  

You have to define a method for the METERS_PER_SEC typecast in both the METERS type and the SECONDS type. You also have to provide various constructors and assignment operators before this class will work like a plain scalar. A complete example would take several pages and can be found in the first reference at the end of this article. I'm sure you can find one in your favorite C++ textbook. You will also have to revise these classes regularly to provide access to new derived units. This is not a low maintenance solution.
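
To give a flavor of the boilerplate involved, here is a heavily abridged sketch, not the complete solution from the reference, of what even this one expression demands. The class layout is only illustrative:

class METERS_PER_SEC {
public:
    explicit METERS_PER_SEC(double v = 0.0) : value(v) {}
    METERS_PER_SEC operator/(const METERS_PER_SEC &rhs) const
        { return METERS_PER_SEC(value / rhs.value); }
private:
    double value;
};

class METERS {
public:
    explicit METERS(double v = 0.0) : value(v) {}
    operator METERS_PER_SEC() const { return METERS_PER_SEC(value); }  // enables the cast
private:
    double value;
};

class SECONDS {
public:
    explicit SECONDS(double v = 0.0) : value(v) {}
    operator METERS_PER_SEC() const { return METERS_PER_SEC(value); }  // enables the cast
private:
    double value;
};

void velocity_example()
{
    METERS  meters(100.0);
    SECONDS seconds(9.8);

    // Mirrors the expression above; a real implementation still needs
    // assignment operators, comparisons, accessors, and many more conversions.
    METERS_PER_SEC velocity = (METERS_PER_SEC)meters / (METERS_PER_SEC)seconds;
    (void)velocity;
}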

All we wanted was for the compiler to enforce better type checking of scalars. But to do that we are forced to write lots of extra code that adds no value to the application. As if that weren't bad enough, using a scalar class such as this will produce inefficient code. It forces the compiler to insert a call to a constructor or method where it would normally insert a single machine instruction. Even with inline code and aggressive optimization, I doubt that this code could ever be as efficient as code written with plain scalars.

For me, the ultimate question regarding a solution like this is “does it add value to the code?” I have three simple criteria I use for deciding if a solution adds value:

  • Does it reduce the amount of code that must be written? Fewer lines of code means fewer opportunities for bugs
  • Is it low maintenance? Code that has to be constantly modified to accommodate every application does not help. The act of modifying the code introduces more opportunities for bugs
  • Is it easy to understand? If you can live with high maintenance code, it had better be easy to understand. The more difficult it is to understand the easier it is to make a mistake when “maintaining” it. A kludge solution that works well for only one application is not much of a solution

You will have to answer these questions for yourself. But let's consider our humble goal. I just wanted to create an abstract scalar type that would tell the compiler to prevent me from making stupid mistakes like “length = time” or “seconds = milliseconds.” I also wanted to be able to overload operators so that the answers would come out in the right units automatically. Any programmer who wanted to use this class would have to be familiar with the implementation in order to use different units. So much for low maintenance.

So we have gone to extremes to do what should be a very simple thing, and we still don't have a good solution. This is probably why, when it comes to using the right units, most of us rely on code review and testing to ensure that we get it right.

Getting it right
If you are going to rely on humans to do the job of checking units, you might as well make it easy on them. There are lots of ways to screw things up, so the best you can do is to try to make that less likely. The first line of defense in this area is the names you choose for your variables.

No modern programming languages I know of enforce any rules about how you name your data, or allow you to enforce your own. Many coding conventions out there say how you should name your variables, but they are usually focused on aesthetics. It seems we are more concerned about which letter should be capitalized, or where to put the underscore than we are about whether the name is inconsistent or misleading. Often we don't realize how ambiguous some terms are until we have to revisit the code much later. What seems clear today may be incomprehensible six months from now.

Consider a term like “rate” used to describe a change in some value over time. You might name a variable “transfer_rate” or “flow_rate,” but you are missing a key piece of information: the units. To measure a data transfer rate, you could be talking about bits per second, baud, bytes per second, megabytes per second, and so on. Typically you might put a comment in the declaration to indicate the units, but if that declaration is in another file, chances are good that no one will look for it.

Why not use the name to indicate the convention in use? For example, we could use the name “transfer_rate_bits_per_sec.” This tells us exactly how it was calculated and how to use it. What if you are using system clock ticks as your unit of time measurement? It would be a waste of CPU cycles to scale your clock ticks into seconds. In this case, you could skip the scaling and use a name like “transfer_rate_bytes_per_tick,” making it clear that you are not using seconds for measurement. This is not part of any standard convention, but it communicates an important piece of information that the reader needs to know. Another programmer should at least get a clue from the name that he needs to know what a clock tick is before he makes any assumptions about this value.

Putting units in your variable names may seem clunky at first, but it adds value to your code. For one thing, it gives more meaning to a variable than a generic term like “rate” or “length.” If the name consists of real-world measurements, the reader can infer information about the variable just by knowing about the application. For instance, if a variable contains the speed of a car in miles per hour, you have a fairly good idea what the range of values is for that variable. You also know that an expression like “speed_mph = distance_meters/delta_time_seconds” is wrong. You can see that just by looking at the names. No need to sift through header files or design documents; it's right there for everyone to see.
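
For instance, a sketch of the convention in use (the function name and the conversion constants are just illustrations):

/* The units are spelled out in the names, so the conversion is visible at
   the point of use rather than hidden in a header or a design document. */
#define METERS_PER_MILE   1609.344
#define SECONDS_PER_HOUR  3600.0

double speed_mph(double distance_meters, double delta_time_seconds)
{
    double speed_meters_per_sec = distance_meters / delta_time_seconds;
    return speed_meters_per_sec * SECONDS_PER_HOUR / METERS_PER_MILE;
}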

How Many Ways Can You Tell Time?

The ANSI C clock() function is one of many time-related functions in the ANSI standard whose names leave much to be desired. The time() function is just as vague, but it at least has the advantage of a long legacy, so you can assume that programmers will be familiar with it.

The clock() function is supposed to tell you how much CPU time your program has consumed, but there are a few gotchas. For one thing, the name tells us nothing about the units of the return value, and the return type “clock_t” is equally uninformative. I have seen more than one manual that told me to divide the return value by CLK_TCK to get the time in seconds, but this is wrong. Again, we have a macro with a bad name. The only thing CLK_TCK has in common with clock() is that they use the same units (whatever those are). It seems that the unit of clock_t and the definition of CLK_TCK are implementation defined. Unfortunately, some early implementations of ANSI C left out a critical macro: the one that tells you the units, which is more clearly named CLOCKS_PER_SEC.

So what is CLK_TCK then? CLK_TCK is supposed to tell you how precise the timer used by clock() is. That is, if CLOCKS_PER_SEC tells you that the units of clock() are in microseconds, then CLK_TCK tells you how many microseconds there are between each clock tick. Obvious, isn't it? This gets even more confusing when your implementation incorrectly defines CLK_TCK and CLOCKS_PER_SEC as the same value. This is true of the Cygnus distribution of GCC for Windows (also known as Cygwin B20). Faulty software that uses CLK_TCK instead of CLOCKS_PER_SEC will work until a compiler change makes these macros take on different values. Fortunately, hardly anyone uses this function anyway. I can't imagine why not.

POSIX improved things with the clock_gettime() function. This function takes as its parameters a clock ID and a timespec structure containing two fields, tv_sec and tv_nsec. Notice how the structure element names tell you that the time is in seconds plus nanoseconds. The clock ID allows for implementation-defined clocks, but every implementation is required to provide a CLOCK_REALTIME clock ID. So this function can accommodate high-precision clocks and is a bit more informative than the simpler ANSI functions.
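
A minimal usage sketch (POSIX systems only; error handling is abbreviated, and older toolchains may need to link against librt):

#include <stdio.h>
#include <time.h>

void print_wall_clock(void)
{
    struct timespec now;

    /* CLOCK_REALTIME must be provided by every POSIX implementation.
       The field names themselves say seconds and nanoseconds. */
    if (clock_gettime(CLOCK_REALTIME, &now) == 0) {
        printf("%ld s + %ld ns\n", (long)now.tv_sec, (long)now.tv_nsec);
    }
}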

What I like about this approach is that it is simple to apply and follow. You do not need strict rules to apply this, just common sense. You don't need a dictionary to follow the code as long as you stick to familiar abbreviations. A code review can help alert you to ambiguous abbreviations or unfamiliar terms.

Another benefit of this approach is that incomplete compliance is better than none at all. Even with tens of thousands of lines of legacy code, you can still improve the code by applying this to just the code you add or modify. Unlike some naming conventions that don't work unless they are strictly followed throughout the code, putting units in variable names can be done anywhere at any time.

Naming guidelines
Following a convention like SI goes a long way toward making your code easy to follow, but if a particular convention starts to impact your performance or development cost, it's time to ditch it. That's where the names are most important. You should be free to use whatever units are necessary to get the job done, but use the names to inform the reader.

The goal is to make the code as easy to follow as possible, so that code reviews will catch as many errors as possible. Stick to familiar terms and abbreviations in your names using SI as an example (see Table 1). Keep in mind that the SI abbreviations are case sensitive. For instance, “M” is the abbreviation for “Mega,” but “m” is the abbreviation for “meter” and “milli.” When in doubt, spell it out. This is especially true if you are going to use some of the less common abbreviations. For example:

  • 1/1000th of a volt = “mv” or “mV” are okay, “mVolts” or “milliVolts” are even better
  • 1/100th of a volt = “mvx10,” “mVx10,” “mVoltsx10,” or “milliVoltsx10” are okay, but “centiVolts” would be unusual and “cV” would be cryptic
  • 1/10th of a volt = “mVoltsx100” or “milliVoltsx100” are okay, but “deciVolts” would be unusual and “dV” would be cryptic

These names start to get ugly when you use derived units like “velocity_meters_per_second,” but at least they're not ambiguous. Sometimes you should consider inventing your own units to avoid confusion. For instance, maybe you want to use a pixel as a unit of distance. In that case, don't call it a millimeter, call it a pixel. You can then define a well-named constant like PIXELS_TO_MILLIMETERS that will convert your pixels into appropriate units.
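
For example (the 96-dot-per-inch display assumed here is hypothetical, so the constant's value is only illustrative):

/* One pixel on a hypothetical 96-dot-per-inch display is 25.4/96 mm. */
#define PIXELS_TO_MILLIMETERS  (25.4 / 96.0)

double line_length_mm(int line_length_pixels)
{
    return line_length_pixels * PIXELS_TO_MILLIMETERS;
}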

Another example of when you might want to invent your own units is when you need to manipulate raw data from an ADC. Chances are that the raw data is in arbitrary units, so no SI units apply unless you scale the data first. In this case just call it something like “adcCounts.” Any programmer or reviewer should understand that these are raw counts that need to be scaled before they can be used.

Of course, assigning units to every scalar does not make sense. To paraphrase Freud, sometimes an int is just an int. Examples of this would include loop counters and bit flags. Use good judgement and listen to code review feedback. In the end you should have code that is easier to understand and contains fewer bugs.

John Fusco received his BSEE from Polytechnic University in Brooklyn NY. He has been designing and programming embedded systems for over 10 years. Currently he is with GE Medical Systems where he designs image reconstruction software for CT scanners. You can e-mail him at .

References
1. Eckel, Bruce. Thinking in C++, Vol. 1 (2nd Edition). Englewood Cliffs, NJ: Prentice-Hall, 2000.

2. For more information about standard units, consult the National Institute of Standards and Technology: physics.nist.gov/cuu/Units/index.html. Special Publication 330 discusses SI units in detail.

3. For more information about the details behind the loss of the Mars Climate Orbiter, see the official report available from JPL: ftp://ftp.hq.nasa.gov/pub/pao/reports/1999/MCO_report.pdf.
