Multiplication by a Fixed Point Constant

Kyle York

June 16, 2009


Multiplication and division are operations we take for granted when working with desktop computers, but avoid like the plague on embedded devices. They are here to stay, however, and oftentimes we must simply deal with them.

If the CPU you use does not have native multiply and divide support, any use will likely bring in a library, with all of the overhead that implies. If you step back a bit further and insist on using floating point numbers, the overhead can quickly consume all of the memory, leaving precious little space for the real, ``working,'' code.

Luckily, for the most part true floating point math is rarely necessary, especially in small embedded devices. For these, fixed point math is sufficient [1]. Immediately a massive support library has been reduced to only a large one, but even fixed point has its problems, namely overflow. Let us say I need to multiply a number by 3.14. Naively, this would be achieved with something like:

x = x * 314 / 100

If x happens to be an 8-bit number, the multiply will overflow and a larger temporary (more data space) will be required. Luckily there are numerous ways around this (yes, one could use 157/50, but the overall problem still remains).
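To make the cost concrete, here is a minimal C sketch of the naive approach (the function name is mine). An 8-bit x can reach 255 * 314 = 80,070, so the intermediate does not even fit in 16 bits:

```c
#include <stdint.h>

/* Naive fixed-point multiply by 3.14: the intermediate x * 314
   needs a 32-bit temporary, since 255 * 314 = 80070 overflows
   both 8 and 16 bits; this is exactly the extra data space the
   text warns about. */
uint16_t mul_3_14(uint8_t x) {
    uint32_t t = (uint32_t)x * 314u / 100u;
    return (uint16_t)t;
}
```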

Oftentimes, a program needs only one or two multiplies. The canonical example is scaling the result from an ADC for display. Say you have an ADC with 10-bit precision whose reading you would like to display as a percentage. Before display, you use x = x * 100 / 1023. That could very well be the only time a multiply is required, so why incur the overhead of a library for a single use?
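As a sketch (the function name is mine), the scaling step looks like this in C; note that even here the intermediate needs 32 bits, since 1023 * 100 = 102,300:

```c
#include <stdint.h>

/* Scale a 10-bit ADC reading (0..1023) to a percentage (0..100).
   The 32-bit intermediate avoids overflow of 1023 * 100. */
uint8_t adc_to_percent(uint16_t adc) {
    return (uint8_t)((uint32_t)adc * 100u / 1023u);
}
```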

The embedded platform I use is based on Microchip's PIC16F877. For those not familiar, this chip has neither multiply nor divide instructions, its shift instructions move a value only one bit at a time, and both code and data memory are severely limited.

Let us begin with a simple example, integer multiply by 100. For those blessed with native multiply support who think these techniques do not apply to them: they might. Back in another life, even though the processor in use had a native multiply, it was often faster to shift and add. Much faster.

To determine the necessary shifts, start by looking at the number in binary (highest bit on the left):

100: 0 1 1 0 0 1 0 0

Treating each bit as the coefficient of a polynomial results in the following equation:

100x = 64x + 32x + 4x

or, using shifts instead:

t0 = x << 6
t1 = x << 5
x = x << 2
x += t0
x += t1

This clearly works but it requires two temporary variables and a total of 13 shifts. Today I do not have the luxury of multi-bit shifts and I also cannot afford even the two extra variables!
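The sequence above can be written as a C sketch (assuming x is widened to 16 bits so that 64x fits):

```c
#include <stdint.h>

/* Multiply by 100 straight from the bit pattern 01100100:
   100x = 64x + 32x + 4x.  Two temporaries, 13 single-bit shifts. */
uint16_t mul100_naive(uint16_t x) {
    uint16_t t0 = x << 6;  /* 64x */
    uint16_t t1 = x << 5;  /* 32x */
    x <<= 2;               /*  4x */
    x += t0;
    x += t1;
    return x;
}
```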

It is clear that some of the shifts are redundant. Start by factoring the original equation:

100x = 64x + 32x + 4x
     = 4 ( 16x + 8x + x )
     = 4 ( 8 ( x + 2x ) + x )

which yields the following:

t0 = x
x = x << 1 ; 2x
x += t0 ; x + 2x
x = x << 3 ; 8 ( x + 2x )
x += t0 ; 8 ( x + 2x ) + x
x = x << 2 ; 4 ( 8 ( x + 2x ) + x )

Note only a single temporary is used, and the number of shifts is reduced from 13 to 6.
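The factored sequence, again as a C sketch with a 16-bit x:

```c
#include <stdint.h>

/* Multiply by 100 via 100x = 4(8(x + 2x) + x): one temporary,
   six single-bit shifts. */
uint16_t mul100(uint16_t x) {
    uint16_t t0 = x;  /* save the original x */
    x <<= 1;          /* 2x */
    x += t0;          /* 3x  = x + 2x */
    x <<= 3;          /* 24x = 8(x + 2x) */
    x += t0;          /* 25x = 8(x + 2x) + x */
    x <<= 2;          /* 100x */
    return x;
}
```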

Now that I have shown the simple multiply by an integer, I will generalize this to use fixed point values. For this I am going to use the constant 1.234. It appears arbitrary and somewhat useless but is actually a perfect example as it shows each point I want to make.

As mentioned above, begin by converting this value to a 64-bit fixed point binary. When doing this programmatically, I find it easiest to simply use an array of 64 bytes for reasons which should become obvious.

The first 32 bytes in the array represent the whole part and are set by converting the value to an integer, then running through the bits of the integer and setting the appropriate elements to 1. The second 32 bytes are set by first subtracting off the whole part of the value, then repeatedly multiplying by 2. Every time the result reaches or exceeds one, set the corresponding array element to 1 and subtract 1 from the value.
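One way to sketch that conversion in C (the array holds one binary digit per byte; the function name and layout are mine):

```c
/* Expand a positive constant into 64 binary digits: digits[0..31]
   hold the whole part (most significant first), digits[32..63] the
   fraction, produced by repeated doubling. */
void to_fixed_digits(double value, unsigned char digits[64]) {
    unsigned long whole = (unsigned long)value;
    double frac = value - (double)whole;
    int i;
    for (i = 31; i >= 0; i--) {      /* whole part, filled from the right */
        digits[i] = (unsigned char)(whole & 1u);
        whole >>= 1;
    }
    for (i = 32; i < 64; i++) {      /* fraction: double, take the carry */
        frac *= 2.0;
        if (frac >= 1.0) {
            digits[i] = 1;
            frac -= 1.0;
        } else {
            digits[i] = 0;
        }
    }
}
```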

In fixed point binary, 1.234 looks like this:

0001.0011 1011 1110 0111 0110 1100 1000 1011

The first 28 bits are zero, and therefore not shown. Remember that the values on the right of the binary point are negative powers of 2, so this represents 2^0 + 2^-3 + 2^-4 + 2^-5 + 2^-7 ...

At first blush, this appears to be a lot of work: a total of 299 shifts and 19 additions! Luckily there are some tricks that can be used to reduce this. For example, contiguous runs of 1s can be reduced. Say we want to multiply a number by 15, or 1111 in binary. The straightforward way would be:

x = x * 8 + x * 4 + x * 2 + x

giving 6 shifts and 3 adds. Alternately, one could do this:

x = x * 16 - x

giving 4 shifts and a subtract at the expense of a possible overflow. This is represented as follows:

+0001 0000
-0000 0001
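Both forms in C, for comparison (a 16-bit x is assumed so that 16x fits):

```c
#include <stdint.h>

/* Multiply by 15 two ways: as 8x + 4x + 2x + x (6 shifts, 3 adds),
   or as 16x - x (4 shifts, 1 subtract); the latter risks overflow
   in the intermediate 16x. */
uint16_t mul15_adds(uint16_t x) {
    return (uint16_t)((x << 3) + (x << 2) + (x << 1) + x);
}
uint16_t mul15_sub(uint16_t x) {
    return (uint16_t)((x << 4) - x);
}
```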

Experience shows that applying this trick to runs of three or more 1s will usually reduce the resulting code. Rewriting the above, taking runs of 1s into account, yields:

+0001.0100 0100 0000 1000 0110 1100 1000 1011
-0000.0000 1000 0010 0001 0000 0000 0000 0000

The number of shifts required has been reduced to 250 (with 11 additions and three subtractions). Can more be done? Of course. Once the bit stream has been split into additive and subtractive parts as above, occasionally an identity will present itself. Again by example, let us say we end up with the following:

+0000 0100
-0000 0010

which becomes the following equation:

x = 4x - 2x

which is the same as:

x = 2x

Any time an additive bit is immediately followed by a subtractive one (or vice versa), one of the bits can be removed, saving a shift and either an add or a subtract. The above example then becomes:

+0000 0010
-0000 0000

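A sketch of this identity detection in C, operating on the two digit arrays (most significant digit first; the names and guards are my own):

```c
/* Collapse +2^k followed by -2^(k-1) into +2^(k-1), and the mirror
   case.  add[] and sub[] hold one binary digit per byte, MSB first. */
void collapse_identities(unsigned char add[], unsigned char sub[], int n) {
    int i;
    for (i = 0; i + 1 < n; i++) {
        if (add[i] && sub[i + 1] && !add[i + 1]) {
            add[i] = 0;        /* +2^k - 2^(k-1) = +2^(k-1) */
            sub[i + 1] = 0;
            add[i + 1] = 1;
        } else if (sub[i] && add[i + 1] && !sub[i + 1]) {
            sub[i] = 0;        /* -2^k + 2^(k-1) = -2^(k-1) */
            add[i + 1] = 0;
            sub[i + 1] = 1;
        }
    }
}
```

Running this on the 4x - 2x example above leaves a single additive bit for 2x.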
Applying this `identity' detection to the original expression gives:

+0001.0100 0000 0000 1000 0110 1100 1000 1011
-0000.0000 0100 0010 0001 0000 0000 0000 0000

saving 5 more shifts, for a total of 245, a roughly 20% reduction from the original, but still rather hefty. Now consider required precision. The equation above results in a fixed point constant multiply with no loss of precision.

In the ADC example above, if the display is only going to show an integer between 0 and 100 it is silly to calculate the result to ten decimal places. To limit the precision, simply evaluate the expression from left to right, stopping when the result of the expression is within the tolerances and zeroing all of the coefficients past that point. Allowing an error of just 0.01%, the expression becomes:

+0001.0100 0000 000
-0000.0000 0100 001

This represents a multiply by 1.233887 instead of 1.234, for an actual error of 0.009%. This has reduced the number of shifts to 19, yielding:

x + x / 4 - x / 64 - x / 2048

This leads us to the final step, refactoring:

x + ( ( -x / 32 - x ) / 16 + x ) / 4

giving a final total of 11 shifts and 4 adds and subtracts, just 4% of the original 299.
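As a final C sketch, here is the refactored expression with the divisions replaced by shifts. I have folded the signs so every intermediate stays non-negative for unsigned arithmetic; the algebra is unchanged (x + x/4 - x/64 - x/2048):

```c
#include <stdint.h>

/* Multiply by 1.233887 (the truncated 1.234) using 11 single-bit
   shifts: x + x/4 - x/64 - x/2048, factored as in the text but with
   the subtraction hoisted to keep intermediates non-negative. */
uint32_t mul_1_234(uint32_t x) {
    uint32_t t = (x >> 5) + x;  /* x/32 + x */
    t = x - (t >> 4);           /* x - x/512 - x/16 */
    return x + (t >> 2);        /* x + x/4 - x/64 - x/2048 */
}
```

Because the shifts truncate, results for arbitrary x differ slightly from the exact product, just as the text's integer divisions would.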

This is all well and good, if not tedious; I certainly would not want to go through this work for each constant I need. Luckily, computers are really good at doing tedious chores without complaining.

Kyle York is a programmer for Cisco Systems specializing in layer 2 security. He also wrote and maintains the JALv2 compiler for the PIC microcontrollers.

[1] Gordon, Robert. ``A Calculated Look at Fixed-Point Arithmetic,'' June 17, 2003.
