Automatic Units Tracking - Embedded.com

# Automatic Units Tracking

The manipulation of measurements in software has been a repeated source of problems for the software community. But this doesn't have to continue. Here are some C++ tips and tricks for tracking, converting, and manipulating data with units.

For a long time, people had to do arithmetic completely by hand. This was time consuming and frequently led to mistakes. Then folks developed calculators and life got a little easier. People who deal with measurements struggle with a similar problem: unit conversions. We use hundreds of units to represent physical quantities such as length, time, and mass; converting between them is time consuming and prone to errors. Fortunately, scientific calculators are available that can do some of this for us. Models such as Hewlett Packard's HP48s can convert between similar unit types (feet to meters or seconds to hours) and back again. Such devices also check that units match before performing addition (or subtraction) or conversion. With the help of an HP48s, one never has to worry about checking if unit conversions are correct.

Why, then, does the software community continue to struggle with this problem? Developers use integer or floating-point numbers and rely on rigorous documentation, variable-labeling schemes, or coding standards to ensure that everything is correct. (See John Fusco's “Measure Twice, Cut Once,” October 2000, p. 36.) Here's a typical code snippet:

double height_in_feet = 6.0;
double clock_time = 10.0; // minutes

Still, when it comes time to convert from one unit type to another, programmers have to do it by hand. We must specify conversion constants and make sure that they are correct, which is both time consuming and prone to errors.

The most likely reason software developers struggle is that units are harder to use than numerical data types. This article demonstrates how to use C++ to create a Units data type that makes unit conversions simple. The proposed solution will also be efficient, error resistant, and (best of all) easy to read.

Terminology

Before creating our Units data type, we should discuss some terminology.

A scalar is any unlabeled number (5, 2.16, -5/9, π, sqrt(-1), 3.1313…, and so on). When students answer a word problem with only a scalar, they usually get points marked off.

A unit is the label attached to a scalar. This is what the students forgot to include in their answers to word problems.

Units may be grouped into categories. The meter and the foot belong to the “length” category; seconds, minutes, and hours belong to the “time” category.

A measurement is a quantity that can be represented using a scalar and a unit in combination. A measurement is not tied to any particular standard unit. The distance between two points is the same whether you represent it in inches, miles, kilometers, or light years. If either the scalar or the unit is missing, it is not a measurement.

For example, imagine that you live in Chicago:
measurement = scalar * units
Distance to San Francisco (XSF) = 2,129 * miles
Distance to Miami (XM) = 2,213 * kilometers

Computers are great at representing scalars but they are not so good at representing measurements. The computer does not need to represent miles or kilometers because they are standards. It simply says “miles” and we know what it means. However, the distance to San Francisco is not commonly known. To represent it, we must compare XSF to a standard unit (such as a mile), and find that it is 2,129 times larger. When we compare XM to kilometers, we find it is 2,213 times larger. By rearranging the prior equation we have:

measurement / units = scalar

or

XSF / miles = 2,129

In other words, the distance to San Francisco is 2,129 times larger than the standard unit of a mile. Although these scalars are good for displaying, they are not very good for comparing measurements. If we compared the above scalars we might erroneously conclude that Chicago is closer to San Francisco than it is to Miami. Therefore, we must always use measurements to do our comparisons and calculations, never scalars.

We can now begin constructing a Units data type. A good Units type should have several features:

• Unit conversions should be performed automatically.
• The software should check that the units are correct.
• The data type should be easy to use correctly and impossible to misuse.
• Data manipulations should be done efficiently.

Automatic conversions

Two steps are involved in automating unit conversions. The first is to tie the unit conversion factor directly to the unit label. For now, we will do this using a simple typedef. The second is to use measurements instead of scalars whenever possible. By using measurements, we must always attach a unit label and, thus, the conversion factor.

Listing 1: A first stab at automatic units management

// define a category
typedef double Length;

// define units of length
const Length Meters = 1.0;
const Length Feet = 0.3048 * Meters;
const Length Miles = 5280.0 * Feet;
const Length Kilometers = 1000.0 * Meters;

// store a measurement
Length DistToSF = 2129.0 * Miles;
// Length DistToSF = 3426.0 * Kilometers; // same results

// Display Length any way we choose
cout << "Meters: " << (DistToSF / Meters) << endl;
cout << "Feet: " << (DistToSF / Feet) << endl;
cout << "Km: " << (DistToSF / Kilometers) << endl;

output:
Meters: 3.42629e+06
Feet: 1.12411e+07
Km: 3426.29

Listing 1 shows how we might define a length category and various units of length, then use these to produce automatic conversions. This code snippet contains several items of interest. First, it doesn't matter how the measurement is entered: using miles, kilometers, or feet. When we want to display the value, we compare the measurement to our chosen units and are left with a friendly scalar. Second, the code documents itself. No units comments are required. When you read the code, is there any doubt how far San Francisco is from Chicago? Is there any question how feet relate to miles? Finally, the constants were defined in terms of meters (Meters = 1.0) but could have been defined in terms of any positive number. We could have replaced the meter definition line with an arbitrary number such as:

const Length Meters = 123.456;

without affecting the rest of the program.
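One way to convince yourself of this is to make the base value a parameter. The sketch below is illustrative only: the function name distToSFInKm and its parameter metersValue are assumptions introduced here, not part of Listing 1.

```cpp
#include <cassert>

typedef double Length;

// Returns the Chicago-to-San Francisco distance as a scalar in kilometers.
// Every unit constant is derived from an arbitrary positive base value.
// (distToSFInKm and metersValue are hypothetical names for illustration.)
double distToSFInKm( double metersValue )
{
    assert( metersValue > 0.0 );

    const Length Meters     = metersValue;
    const Length Feet       = 0.3048 * Meters;
    const Length Miles      = 5280.0 * Feet;
    const Length Kilometers = 1000.0 * Meters;

    Length DistToSF = 2129.0 * Miles;
    return DistToSF / Kilometers;   // the base value cancels in this division
}
```

Calling distToSFInKm( 1.0 ) and distToSFInKm( 123.456 ) should give the same result, up to floating-point rounding, because the base value appears in both the measurement and the unit it is divided by.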

It is hard to tell from the example above, but measurements obey the laws of physics. Because they are “pure,” we do not need to convert them before using them. In the example in Listing 2, SoftballSpeed and CarSpeed may be compared directly even though they were initialized using different units.

Listing 2: An example application

typedef double Time;    // a new category
typedef double Velocity;     // another category

const Time Seconds = 1.0;
const Time Hours = 3600.0 * Seconds;

Velocity CarSpeed = 55.0 * Miles / Hours;
Velocity SoftballSpeed = 40.0 * Feet / Seconds;

if (SoftballSpeed > CarSpeed)
    cout << "Softball is faster." << endl;
else
    cout << "Softball is not faster." << endl;

This simple system works well as long as developers remember to always multiply and divide by the conversion factor stored in the appropriate unit constant. However, if we replace our typedef with a C++ class, we can make it work perfectly every time.

Unit checking

To ensure that units are correct we must track the category (length, velocity, force, and so on) to which a unit or measurement belongs. The scalars that we have used thus far do not know what categories are. Because our units are really just doubles, it is possible to erroneously add data from different unit categories without detecting the error. For example:

// categories are doubles
Length x = 5.0 * Feet;
Time t = 3.4 * Seconds;
// conceptually wrong,
// but no error is flagged
Velocity v = x + t;

To detect this bug automatically, our scheme must understand the relationship between base categories and derived categories. Base categories are the simplest type of unit category and they cannot be broken down. They include but are not limited to mass, length, and time. The process of combining one or more base categories through multiplication and division results in derived categories. These can be completely broken down into their base categories. Table 1 shows several derived categories and their corresponding base category components. The exponent associated with each base category is the important part.

Table 1: Derived categories

| Derived category | Base categories |
|---|---|
| Area | Mass^0 Length^2 Time^0 |
| Volume | Mass^0 Length^3 Time^0 |
| Velocity | Mass^0 Length^1 Time^-1 |
| Force | Mass^1 Length^1 Time^-2 |
| Energy | Mass^1 Length^2 Time^-2 |
| Power | Mass^1 Length^2 Time^-3 |
| Pressure | Mass^1 Length^-1 Time^-2 |

Judging from the table, it seems we can uniquely identify any category by simply identifying the exponents of its base categories. By combining the base category exponents with the scalar conversion constant, we can detect unit errors automatically.
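The exponent bookkeeping can be tried out on its own, before any scalar value is attached. The following is a minimal sketch under stated assumptions: the Category struct and the helper names multiply, divide, sameCategory, and checkTable1Rows are hypothetical and do not come from the article's source code.

```cpp
#include <cassert>

// Exponents of the three base categories (hypothetical helper type).
struct Category {
    int mass;
    int length;
    int time;
};

// Multiplying two quantities adds their exponents.
Category multiply( Category a, Category b )
{
    Category r = { a.mass + b.mass, a.length + b.length, a.time + b.time };
    return r;
}

// Dividing two quantities subtracts their exponents.
Category divide( Category a, Category b )
{
    Category r = { a.mass - b.mass, a.length - b.length, a.time - b.time };
    return r;
}

// Two quantities share a category only if every exponent matches.
bool sameCategory( Category a, Category b )
{
    return a.mass == b.mass && a.length == b.length && a.time == b.time;
}

const Category MassCat   = { 1, 0, 0 };
const Category LengthCat = { 0, 1, 0 };
const Category TimeCat   = { 0, 0, 1 };

// Rebuilds the Velocity and Force rows of Table 1 from the base categories.
void checkTable1Rows()
{
    Category velocity = divide( LengthCat, TimeCat );
    Category force    = divide( multiply( MassCat, velocity ), TimeCat );

    assert( velocity.mass == 0 && velocity.length == 1 && velocity.time == -1 );
    assert( force.mass == 1 && force.length == 1 && force.time == -2 );
}
```

With this in hand, the faulty addition shown earlier (a length plus a time) is detectable: sameCategory( LengthCat, TimeCat ) is false, so an addition operator that checks it first can refuse to proceed.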

Listing 3: Definition of the Units class

#include <cassert>

class Units {
public:
    Units();
    Units( int MassExp, int LengthExp, int TimeExp );
    Units( const Units& );
    const Units& operator=( const Units& );
    Units operator*( const Units& u ) const;
    bool operator<( const Units& u ) const;
    // other operators here
private:
    // builds a result that carries a scalar as well as exponents
    Units( double Value, int MassExp, int LengthExp, int TimeExp );

    int m_MassExponent;
    int m_LengthExponent;
    int m_TimeExponent;
    double m_Value;
};

Units Units::operator*( const Units& u ) const
{
    return Units( m_Value * u.m_Value,
                  m_MassExponent + u.m_MassExponent,
                  m_LengthExponent + u.m_LengthExponent,
                  m_TimeExponent + u.m_TimeExponent );
}

bool Units::operator<( const Units& u ) const
{
    assert( m_MassExponent == u.m_MassExponent );
    assert( m_LengthExponent == u.m_LengthExponent );
    assert( m_TimeExponent == u.m_TimeExponent );

    return m_Value < u.m_Value;
}

typedef Units Length;
typedef Units Mass;
typedef Units Time;
typedef Units Pressure;

Listing 3 shows the outlines of our Units class. Note that the exponents associated with each base category are stored within the object, as data members.

In addition to overloading the assignment operator, we must also overload several operators that are exponent savvy. New multiplication and division operators combine categories to form derived categories. The multiplication operator adds exponents and the division operator subtracts them. The addition, subtraction, and various comparison operators check that categories match by asserting exponent equality.

Easy to use, impossible to misuse

A scalar typecast, shown in Listing 4, is the final and most important operator. It lets data of the Units type interact with other C++ functions and data types. By ensuring that all exponents are zero before returning the scalar, the typecast forces the programmer to differentiate a scalar from a measurement.

Listing 4: Typecast operator

Units::operator double() const
{
    assert( m_MassExponent == 0 );
    assert( m_LengthExponent == 0 );
    assert( m_TimeExponent == 0 );

    return m_Value;
}

This function prevents us from accessing any scalar that still has units associated with it. Only when we divide a measurement by a unit from the same category will the exponents go to zero and the function return the scalar. If the exponents don't go to zero, we can be sure there is a bug in the code.

Listing 5 shows an example of such a bug. The bug is that we forgot to wrap Feet/Seconds in parentheses. Fortunately, the time exponents do not cancel and we get an assertion error at runtime.

Listing 5: An example program with a bug

// exponents for speed are ( mass=0, length=1, time=-1 )
Velocity CarSpeed = 55.0 * Feet / Seconds;

cout << "Car speed = " << endl;

cout << (double)( CarSpeed / (Miles/Hours) ); // ok
cout << " mph" << endl;

// ERROR – next line fails because of wrong units
cout << (double)( CarSpeed / Feet / Seconds );
cout << " ft/s" << endl;

Since measurements cannot be used by standard C++ functions, they must pass through the typecast gateway in order to become scalar. If an error exists, this is where we will catch it.
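To experiment with the gateway idea in isolation, here is a self-contained sketch rather than the article's actual class. It makes two assumptions that differ from the listings: the gateway is a named member, toScalar(), instead of a typecast, and it throws std::logic_error instead of calling assert() so that the failure can be observed without aborting the program. The base units (Feet and Seconds) and the function demoListing5 are also choices made only for this example.

```cpp
#include <cassert>
#include <stdexcept>

class Units {
public:
    // A base unit: scalar value 1 with the given category exponents.
    Units( int massExp, int lengthExp, int timeExp )
        : m_MassExponent( massExp ), m_LengthExponent( lengthExp ),
          m_TimeExponent( timeExp ), m_Value( 1.0 ) {}

    // A plain scalar: all exponents zero. Implicit so that 55.0 * Feet works.
    Units( double scalar )
        : m_MassExponent( 0 ), m_LengthExponent( 0 ),
          m_TimeExponent( 0 ), m_Value( scalar ) {}

    // Multiplication adds exponents.
    friend Units operator*( const Units& a, const Units& b )
    {
        Units r( a.m_MassExponent + b.m_MassExponent,
                 a.m_LengthExponent + b.m_LengthExponent,
                 a.m_TimeExponent + b.m_TimeExponent );
        r.m_Value = a.m_Value * b.m_Value;
        return r;
    }

    // Division subtracts exponents.
    friend Units operator/( const Units& a, const Units& b )
    {
        Units r( a.m_MassExponent - b.m_MassExponent,
                 a.m_LengthExponent - b.m_LengthExponent,
                 a.m_TimeExponent - b.m_TimeExponent );
        r.m_Value = a.m_Value / b.m_Value;
        return r;
    }

    // Scalar gateway: only values whose units have cancelled may pass.
    double toScalar() const
    {
        if ( m_MassExponent != 0 || m_LengthExponent != 0 || m_TimeExponent != 0 )
            throw std::logic_error( "units do not cancel" );
        return m_Value;
    }

private:
    int m_MassExponent;
    int m_LengthExponent;
    int m_TimeExponent;
    double m_Value;
};

const Units Feet( 0, 1, 0 );      // base unit of length in this sketch
const Units Seconds( 0, 0, 1 );   // base unit of time in this sketch
const Units Miles = 5280.0 * Feet;
const Units Hours = 3600.0 * Seconds;

// Mirrors Listing 5: the parenthesized form passes, the buggy form is caught.
void demoListing5()
{
    Units CarSpeed = 55.0 * Feet / Seconds;

    double mph = ( CarSpeed / ( Miles / Hours ) ).toScalar();
    assert( mph > 37.49 && mph < 37.51 );

    bool caught = false;
    try {
        ( CarSpeed / Feet / Seconds ).toScalar();   // time exponent ends at -2
    } catch ( const std::logic_error& ) {
        caught = true;
    }
    assert( caught );
}
```

The design point is the same as in Listing 4: the only way out of the Units world is a single function, so a units mistake anywhere upstream surfaces at that one place.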

Now some of you may be asking yourself, “What prevents me from accidentally adding meters to centimeters?” The answer is “absolutely nothing.” In fact, if it is convenient I hope you do. Both are lengths and adding them together is allowed since the Units label already provides the proper scaling. For example, if we wished to add 300 meters to 75 centimeters, the line might look something like:

Length len = 300.0*Meters +
75.0*Centimeters;

Assuming the base unit for length is meters, the compiler interprets the line as:

Length len = 300.0*1.0 + 75.0*0.01;

or:

Length len = 300.75;
// base units: meters

When we wish to display len in terms of kilometers, it needs to be divided by Kilometers:

cout << (double) (len/Kilometers)
<< " km" << endl;

Again, replacing the Units with their corresponding scalar leaves:

cout << (double) (len/1000.0) <<
" km" << endl;

which correctly prints 0.30075 km.

Now that all of our pieces are in place, it is time for a complete example. Listing 6 calculates the torque generated by a certain hydraulic motor based on the pressure at its intake. First, notice how MotorTorque() performs its calculations using no conversion constants. It doesn't know what unit the calling routine used for a particular variable and it doesn't care. MotorTorque() is as easy to read and validate as the physics equation from which it was derived.

Listing 6: A complete example

Torque
MotorTorque( Pressure Intake, Pressure Charge, Volume Displacement,
             double GearRatio, double Eff )
{
    return (Intake - Charge) * Displacement * GearRatio * Eff;
}

int main()
{
    // Pump characteristics
    const Pressure Charge = 280.0*PSI;
    const double Eff = 0.92;
    const double GearRatio = 3.5;
    const Volume Displacement = 57.0 * Centimeters3 / Revolutions;

    // tell user which units they need to enter
    cout << "Enter pump intake pressure in PSI: ";

    // convert floating point # to Units
    Pressure Intake;
    double temp;
    cin >> temp;
    Intake = temp * PSI;

    Torque output = MotorTorque( Intake, Charge, Displacement, GearRatio, Eff );

    cout << "Output torque = ";
    cout << (double)(output/(Newtons*Meters));
    cout << " Newton Meters" << endl;
}

Second, since doubles aren't as safe as Units variables, their scope should be limited as much as possible. This keeps the label (such as PSI or Newtons*Meters) as close to the double's source (or destination) as possible. The intake pressure remains a double for one line before becoming Units; the output does not become a double until it is used.

Efficiency

The Units class is pretty nifty, but it still has some problems. It requires extra space to tote around the exponents, and simple arithmetic operators have become expensive function calls. By itself, it doesn't make a good candidate for use in systems with limited resources. Therefore, when we finish debugging, we might want to flip a #define switch to revert the Units data type to a typedefed scalar. See the contents of units.h in Listing 7.

Listing 7: Partial listing of units.h

#ifdef CHECK_UNITS

class Units {
    // …
};
#define CategoryBase( var, m, l, t ) \
    const Units var( m, l, t );

#else

typedef double Units;
#define CategoryBase( var, m, l, t ) \
    const double var = 1.0;

#endif

Here, we introduce the macro CategoryBase() to define the base units so that exponents are kept for debugging and thrown away for release. Also note that CategoryBase() and the Units constructor require exponent values, but not a scalar. This is because the scalar base unit of any category is always 1. Other category constants should be constructed by multiplying the base by some constant. This forces programmers to define constants in terms of the base category, and discourages the creation of haphazard constants.

The Units class is also easier to maintain this way. Likely, only the base category will change. Imagine that at the inception of some project, the Units constants were defined as:

CategoryBase( Meters, 0, 1, 0 )
const Length Feet = 0.3048 * Meters;
const Length Inches = Feet / 12.0;

Then later in the project, it is determined (for whatever reason) that the base unit for length needs to be 20cm instead of 1m. Rather than change every constant, we only need to change one line and add another:

CategoryBase( U_20Centimeters,
0, 1, 0 )
const Length Meters =
U_20Centimeters * 5.0;

The value that we store in the scalar has changed but the rest of the application is unaffected. This is not a big deal when there are few unit constants, but becomes very helpful when there are dozens of them.

And since Units are replaced by doubles in the release software, most arithmetic is done ahead of time by the compiler. For example, the compiler resolves:

Length Height = 5.0 * Feet
+ 11.5 * Inches;

to:

double Height = 1.816; // meters

at compile time.
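If a C++11 compiler is available (an assumption; the listings in this article do not depend on it), constexpr and static_assert can make the compile-time evaluation explicit. The sketch below uses the release-mode form of the constants, with the 20cm base unit from the earlier example; the function demoBaseChange is a hypothetical name used only here.

```cpp
#include <cassert>

typedef double Length;

// Release-mode constants built on a 20 cm base unit.
constexpr Length U_20Centimeters = 1.0;
constexpr Length Meters = U_20Centimeters * 5.0;
constexpr Length Feet   = 0.3048 * Meters;
constexpr Length Inches = Feet / 12.0;

// The whole expression is a constant the compiler can evaluate.
constexpr Length Height = 5.0 * Feet + 11.5 * Inches;

// Checked during compilation: about 1.8161 m, whatever the base unit is.
static_assert( Height / Meters > 1.8160 && Height / Meters < 1.8162,
               "height should be about 1.8161 m" );

// Runtime spot check that the derived constants still relate as expected.
void demoBaseChange()
{
    assert( Feet / Inches > 11.999 && Feet / Inches < 12.001 );
    assert( Meters / Feet > 3.2808 && Meters / Feet < 3.2809 );
}
```

The stored scalar for Height is 9.0805 (in 20cm units), yet dividing by Meters still yields roughly 1.8161, which is the point of routing every constant through the category base.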

Integer math

For simplicity, the previous examples used equation-friendly units as the category base and stored them as doubles. This works fine for desktop applications but not so well in embedded systems. Embedded systems are often limited to integer arithmetic and measurements are seldom equation-friendly. However, by simply changing our scalar data type and the category base, our Units data type can handle almost any scenario. For example, let's examine a hypothetical car speedometer.

Imagine our speedometer measures length using a pulse pickup unit (PPU) on the car's wheel. It is designed so that the PPU produces 1,293 pulses for every mile the car travels. Our speedometer measures time using an internal timer interrupt. This particular interrupt fires every 20ms (50Hz). Given this, we might find the code snippet in Listing 8 in part of a car meant for delivery in the U.S.

Listing 8: Speedometer example

class Units {
//    functions go here
int m_MassExp;
int m_LengthExp;
int m_TimeExp;
int m_Value;    // now math is fast!
};

// define categories
typedef Units Length;
typedef Units Time;
typedef Units Velocity;

// define conversion constants
CategoryBase( PPUCount, 0, 1, 0 )
CategoryBase( TimerCount, 0, 0, 1 )
const Length Miles = 1293 * PPUCount;
const Time Seconds = 50 * TimerCount;
const Time Hours = 3600 * Seconds;

Length delta_x = GetPPUCount() * PPUCount;
Time delta_t = GetTimerCount() * TimerCount;
Velocity v = delta_x / delta_t;
ReportMPH( v * Hours / Miles );

ReportMPH() expects an integer and that is exactly what it gets, although the final line does not appear in the form that we have come to expect (measurement/units). As with any integer arithmetic, we must now be conscious of our order of operations. By saving the division operation for last we get the most accurate answer. Similarly, we should avoid defining constants using integer division.

// truncation error induced
// by integer division
const Velocity MPH = Miles / Hours;
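The effect of operation order is easy to see with plain integers standing in for the release-mode Units type. This is a sketch under assumptions: the constants follow Listing 8, but the function names mphDivideFirst and mphDivideLast and the sample readings are invented for illustration, and long is used so the intermediate products fit in 32 bits.

```cpp
#include <cassert>

// Release-mode integer constants, following Listing 8.
const long PPUCount   = 1;
const long TimerCount = 1;
const long Miles   = 1293 * PPUCount;
const long Seconds = 50 * TimerCount;
const long Hours   = 3600 * Seconds;

// Divides first: the pulse/tick ratio truncates before it is scaled.
long mphDivideFirst( long pulses, long ticks )
{
    assert( ticks > 0 );
    long v = ( pulses * PPUCount ) / ( ticks * TimerCount );
    return v * Hours / Miles;
}

// Multiplies first and performs a single division at the end.
long mphDivideLast( long pulses, long ticks )
{
    assert( ticks > 0 );
    return ( pulses * PPUCount * Hours ) / ( ticks * TimerCount * Miles );
}
```

For a car travelling at roughly 60 mph, one second of readings is about 21 pulses in 50 timer ticks. mphDivideFirst( 21, 50 ) yields 0 because 21 / 50 truncates to zero, while mphDivideLast( 21, 50 ) yields 58. The same truncation is why Miles / Hours evaluates to 0 here. Note that forming an intermediate velocity such as delta_x / delta_t is itself a division, so with integer scalars it may be worth keeping the numerator and denominator separate until the final step, provided the products do not overflow.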

Templates

Templates seem to be the least accepted feature of the C++ language. Since few compilers are completely standard-compliant, templates have fallen into ill repute. Templates that work on Compiler A might not work on Compiler B. For this reason, I have saved them for last.

At this point, the remaining limitations of our Units class are:

• Extra space is required to tote around the exponents.
• Exponents need to be frequently checked.
• Arithmetic is slowed by function calls.
• Errors are only detected at runtime.

Because templates implement the exponents as template parameters instead of member variables, the aforementioned problems are eliminated:

template< int MassExp, int LengthExp, int TimeExp >
class Units {
public:
    // operators go here
private:
    double m_Value;
};

typedef Units< 1, 0, 0 > Mass;
typedef Units< 0, 1, 0 > Length;
typedef Units< 0, 0, 1 > Time;

If we inline all of the operators, the standard multiplication, addition, and comparison operations can all be as fast as their scalar equivalents.

Since no memory is required for the template arguments, the size of the Units is the same as the internal scalar. And instead of continuously checking that the exponents match using an assert(), we guarantee that exponents are correct by only defining operators with valid arguments.

Multiplication operations are valid for any category combination:

// multiplication form
template< int M1, int L1, int T1, int M2, int L2, int T2 >
inline Units< M1+M2, L1+L2, T1+T2 >
operator*( Units< M1, L1, T1 >,
           Units< M2, L2, T2 > );

Addition and comparison operators require both argument categories to match:

// addition and comparison form
template< int M, int L, int T >
inline Units< M, L, T >
Units< M, L, T >::operator+( Units< M, L, T > ) const;

template< int M, int L, int T >
inline bool
Units< M, L, T >::operator<( Units< M, L, T > ) const;

The scalar typecast is defined only for exponents that are zero:

// scalar typecast form
inline Units<0,0,0>::operator
double() const;

The implication of all these carefully defined operators is that the compiler can now detect errors at compile time instead of runtime. Let me say that again: errors can now be detected at compile time.
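The pieces above can be assembled into a small working sketch. It is not the downloadable implementation: it assumes a C++11 compiler, replaces the typecast with a named member toScalar() guarded by static_assert, writes multiplication and division as member templates, and picks meters and seconds as base units. The function demoSpeed is a hypothetical name used only here.

```cpp
#include <cassert>
#include <type_traits>

template< int M, int L, int T >
class Units {
    // Other instantiations need access to m_Value in operator* and operator/.
    template< int, int, int > friend class Units;
public:
    explicit Units( double value ) : m_Value( value ) {}

    // Multiplication and division accept any category and combine exponents.
    template< int M2, int L2, int T2 >
    Units< M+M2, L+L2, T+T2 > operator*( Units< M2, L2, T2 > u ) const
    { return Units< M+M2, L+L2, T+T2 >( m_Value * u.m_Value ); }

    template< int M2, int L2, int T2 >
    Units< M-M2, L-L2, T-T2 > operator/( Units< M2, L2, T2 > u ) const
    { return Units< M-M2, L-L2, T-T2 >( m_Value / u.m_Value ); }

    // Attaching a scalar to a unit keeps the category unchanged.
    friend Units operator*( double s, Units u )
    { return Units( s * u.m_Value ); }

    // Addition and comparison exist only for matching categories.
    Units operator+( Units u ) const { return Units( m_Value + u.m_Value ); }
    bool  operator<( Units u ) const { return m_Value < u.m_Value; }

    // Scalar gateway: compiles only when every exponent is zero.
    double toScalar() const
    {
        static_assert( M == 0 && L == 0 && T == 0, "units do not cancel" );
        return m_Value;
    }
private:
    double m_Value;
};

typedef Units< 0, 1,  0 > Length;
typedef Units< 0, 0,  1 > Time;
typedef Units< 0, 1, -1 > Velocity;

const Length Feet( 0.3048 );    // base unit of length: meters
const Time   Seconds( 1.0 );    // base unit of time: seconds
const Length Miles = 5280.0 * Feet;
const Time   Hours = 3600.0 * Seconds;

// The category of a result is part of its type, known to the compiler.
static_assert( std::is_same< decltype( Miles / Hours ), Velocity >::value,
               "length / time should be a velocity" );

void demoSpeed()
{
    Velocity CarSpeed = 55.0 * Feet / Seconds;
    double mph = ( CarSpeed / ( Miles / Hours ) ).toScalar();
    assert( mph > 37.49 && mph < 37.51 );
    // ( CarSpeed / Feet / Seconds ).toScalar();  // rejected by the compiler
    // Feet + Seconds;                            // rejected: no matching operator+
}
```

Uncommenting either of the last two lines in demoSpeed() should stop the build, which is the compile-time counterpart of the assertion failure in Listing 5.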

This is particularly useful if you have only a C compiler for your target system. You can first check units using any template-savvy C++ compiler (like GNU's free g++), then simply flip the #define switch and compile for release using your target C compiler.

Who could ask for anything more? If there were ever a reason to make C++ compilers template-compliant, this is it. You know that if your program compiles using the template implementation, your units are correct. Period. Your unit conversions are correct. Your equations are sound. There aren't even any run-time inefficiencies. No waiting for asserts. No tracking down unit errors in bug-ridden systems. No mistaking pounds for Newtons. And finally (if NASA had implemented such a system) no hunks of useless metal on Mars.

Christopher Rettig is a systems engineer with Vermeer Manufacturing in Pella, IA. He has a BS in electrical engineering from Rose-Hulman Institute of Technology, with a minor in computer science. His experience includes teaching physics and two years of programming embedded controllers.

Resources

Source code for both the class and template implementations can be downloaded from: ftp://ftp.embedded.com/2001/rettig