State-Oriented Programming
by Al Schneider
Want to know how to allow small microcontrollers
to be fast, use very little memory, and seem to do everything at once? Read on.
If you're implementing a system based on a small microcontroller that controls a large number of I/O points and must exhibit fast response, here's an approach to programming that produces fast, compact code for small microcontrollers, enabling them to quickly perform a wide variety of tasks. A small microcontroller might have 64 bytes of RAM, 2,048 bytes of ROM, and a 1ms per instruction execution rate. Although you'll
find smaller computers than this, and some that are even faster, this is a good example of the challenge.
The use of state machines for each function or task enables the software to be small yet respond quickly to real world events. Tasks include key debouncing, communications, LED flashing, and so on. Each function consists of a number of steps or states. Each individual state is a reaction or action that requires only a few instructions to perform. The entire program consists of passing control to each
function in turn. Each function only executes part of itself to act on or react to one step. The result is that the system seems to be doing everything at the same time. Consider Figure 1, illustrating the program looping through each function. Each function executes a small part of itself and passes control to the next function.
This article introduces a variety of ways this concept can be implemented, accompanied with snippets of
code. The code contains comments to explain the instructions used. Please note that the code is neither totally accurate nor tested. It's purpose is to serve as illustration.
To clarify this concept and expand upon its use, let's consider an example of a large tank filling with water and emptying repeatedly. Our little computer will control this process.
The drawing in Figure 2 represents a large container with an input pipe in the upper left and an output pipe in the lower right. The input pipe is
controlled by a valve as well as is the output pipe. The valves are output devices in that the computer can open or close them. If the computer writes a 0 to a valve, it is opened and allows water to flow. If the computer writes 1 to the valve, the valve is closed and no water flows.
You can also see floats in the tank. Float A indicates the tank is full and Float B indicates the tank is empty. These floats are input devices that the
computer is able to sense. If a float is up, it sends a value of 1 to the computer. If the float is down, it sends a value of 0 to the computer.
Our purpose in this task is to control the flow of water through the tank. We want to fill up the tank. Once full, we want to empty the tank completely. When it's empty, we want to fill it up again, and so on. The following C program describes how this task can be achieved:
void Main(void)
{
MainLoop; //The program loops between here
MachineOne( );
goto
MainLoop; //and here, calling MachineOne //each time
}
void MachineOne()
{
//The switch statement passes control to one
//of the states depending on the value in
//StateVariable
switch(StateVariable)
{
case FILLING: //Here the StateVariable is equal
//to FILLING
if(FloatA == 1)//if float a = 1 (up) do this code
{
//The upper float moved up, start dumping
ValveA=1;
ValveB=0;
StateVariable=DUMPING;
}
break;
case DUMPING: //Here the
StateVariable is equal
//to DUMPING
if(FloatB == 0)//If float B is 0 (down)
//do this code
{
//The lower float moved down, start
//filling.
ValveA=0;
ValveB=1;
StateVariable=FILLING;
}
break;
}
}
Note these important features of this implementation:
- There is a main control loop which constantly calls the "machines"
- Most of the time nothing is going on, so a "machine" is merely executing the same
"state" over and over again. The if statement in each "state" is executed and finds nothing has happened. This requires but a few instructions
- When an event does occur, only a few instructions are executed to take some action and possibly switch to a new state
The heart of the concept can be described like this: the main control loop calls routines to handle individual tasks in the system. Each routine consists of a state machine. Each state in the routine does one of two things. First, it checks to
see if there is something to do. If not, control merely falls out of the function. If there is something to do, the state does it and changes the state variable to execute another state on the next pass through the main control loop. The process of checking to see if there is something to do requires two or three instructions. The something-to-do can require less than 15 instructions or about 15ms. Therefore, as many as 60 machines could be executed within 1ms.
IMPLEMENTING STATE MACHINES
Let's
continue our study of the state machine concept by using code suitable for microcontrollers. The 8051 family assembly language will be used. Our first task is to express the state machine in assembly language instead of C. The code might appear as shown in Listing 1.
You have two ways to pass control to the appropriate state in a machine. The method of using a vector table-the method in Listing 1-is versatile and can be used with
machines with many states. For machines with a small number of states, a few bits can be used to track what state the machine is in. For example, if a machine has four states, two bits can keep track of them. The example we have been using has two states, so only one bit is necessary. The example in Listing 2 is a rewrite of the above code using that one bit.
STACK USE
In a small microcontroller environment, each
byte of stack space must be carefully planned for. Following a few rules will reduce the system's demands on the stack. First, use the stack only for calls, and not for data storage. Second, keep the number of calls you use to a minimum. A call depth of one is not unusual. In this method of programming, subroutines are not often used. Special needs always exist, so a stack with a call depth of three is acceptable. With only 64 bytes of RAM available, keeping the stack small is highly desirable.
Normally the
stack is also used for handling interrupts. That is, when an interrupt occurs, the interrupt routine saves all important registers, which can amount to more than 10 bytes. The space need not be reserved for interrupts if you use a technique that disables interrupts during normal processing. Then interrupts are allowed during those times when nothing is happening and the registers are essentially not used. Consider this piece of code:
MainControlLoop:
lcall Machine1
setb E0 ;allow interrupts
clr E0 ;enable interrupts
lcall Machine2
setb E0 ;allow interrupts
clr E0 ;enable interrupts
lcall Machine3
setb E0 ;allow interrupts
clr E0 ;enable interrupts
sjmp MainControlLoop
Interrupts are handled between calls to machines. Because the registers are not being used, they don't need to be saved. This technique works because each machine only requires about 15ms to get its job done. The interrupt latency, then, is 15ms, which is acceptable in most systems.
BYTE TIMERS
Timing events within your program is one of your more complex tasks. Often timers are required for a variety of uses, from communication time-outs to debouncing keys and flashing LEDs. The time control method presented here is simple. In this method, a byte is used for each software timer we need. When a timer interrupt occurs, each byte is examined to see if it is 0. If it is not 0, it is decremented:
TimerInterrupt: ;Execute this when interrupt occurs
mov a,TimerA ;Get timer
byte into register a
jz ItsZero ;Jump if TimerA is 0
dec TimerA ;If it's not 0, subtract 1 from it
ItsZero:
iret
If the timer interrupt occurs every 100ms and a timer is set to, say, five, then the timer will time out or become zero between 400 and 500ms.
When a task wishes to use a timer, it stores a value in the timer value. Then that task continues to test that value for zero. When it goes to zero, the task continues, knowing an appropriate time has passed. To demonstrate this,
let's make a machine that flashes an LED (see Listing 3).
MULTIPLE TIMERS IN ONE BYTE
With the timer scheme just presented, each software timer requires one byte. If you have a number of items to time, you could expend a lot of precious memory keeping track of them. Here's a scheme that enables four timers to exist in one byte. The layout of the byte is AABBCCDD.
AA represents two bits for one timer, BB, two bits for
the next, and so on. The idea is that each of these 2-bit timers is checked to see if it is zero. If not, the timer is decremented by one. If a timer is set to three or binary 11, then the timer will time out or go to zero in three timer ticks. If the interrupt timer is ticking every 50ms, then this timer would time out in 100 to 150ms.
Decrementing each 2-bit field in the byte would require about five instructions for each timer. However, one bold method checks and decrements the four timers using four
instructions. In this method, a timer byte is used as an index into a table to retrieve a new timer byte value. The values in the table are such that the appropriate non-zero 2-bit fields are decremented while those that are zero remain zero.
The major problem with this is that 256 bytes of ROM are required. The real gain is that three bytes of RAM can contain 12 timers. With the byte method, 12 bytes would be required. Often you will have plenty of ROM but little RAM, so this scheme can be valuable. The
2-bit timers are set with an OR instruction and tested using an AND instruction.
USING A LARGE NUMBER OF TIMERS
Decrementing all timers with each occurrence of an interrupt would be desirable, for the process could take longer than 15ms if there were a large number of them. Normally the timers need not be decremented on the same pass. In this scheme, the timers are not decremented when the timer interrupt occurs. Instead, the timer interrupt sets TimersState equal to zero, then when the main
control loop calls HandleTimers the timers are managed.
HandleTimers:
;main control loop calls this label
mov DPTR,#TimerCode ;get address of TimerCode into DPTR
mov a,TimersState ;get offset into TimerCode
jmp @a+DPTR ;Jump to the appropriate state
TimerCode;
mov TimersState,#INDEX_TO_TIMERA ;set up for next state
;decrement timer here
ret ;done with this timer
mov TimersState,#INDEX_TO_TIMERB
;decrement timer here
ret
ret ;this is
idle state
In this example, when the INDEX_TO_TIMER value is added to the TimerCode address, the jump indirect (jmp @a+DPTR) instruction will pass control to the appropriate routine. That routine handles the timer and selects the next INDEX_TO_TIMER value. When all timers are handled, only the last state is executed, which simply returns until the timer interrupt resets the state variable to zero.
SWITCH DEBOUNCING
Let's take a look at a real task and apply this technique to
debouncing keys and switches. The debounce strategy is to accept a key closure immediately as a closure but wait for a while to allow bounce to occur before checking to see if the key has been opened. The same occurs when the switch opens. We treat the initial opening as such, but allow some time to pass before checking to see if the key might be closed again. Those engineers who are worried about false signals due to noise on the wire from the key can use an extra state to double check the key to make
sure it's still closed a short time after the initial key closure.
;Debouncing a Key press
;Key 1 is closed if Key1StateBit equals 1
;Key 1 is open if Key1StateBit equals 0
;Key1Input is connected directly to key or switch
KEY1_TIMER_SET_MASK equ 0c0h
;set mask for 2-bit timer
KEY1_TIMER_TEST_MASK equ 3fh
;test mask for 2-bit timer
DoKey1: ;Main control loop calls DoKey1
;jump to appropriate state
jnb Key1StateBit1,Key1ClosedStates
jb Key1StateBit2,Key1OpenState
sjmp Key1JustOpenedState
Key1ClosedStates:
jnb Key1StateBit2,Key1ClosedState
sjmp Key1JustClosedState
Key1OpenState:
;if Key1Input =1, jmp to KeyJustClosed
jb Key1Input,KeyJustClosed
ret ;key not closed, do nothing
KeyJustClosed:
orl KeyTimer,#KEY1_TIMER_SET_MASK
;start debounce timer
set Key1StateBit1 ;closed state
clr Key1StateBit2 ;just closed state
ret
Key1JustClosedState:
mov a,#KEY1_TIMER_TEST_MASK
;check if timer timed out
anl a,KeyTimer
jz CloseDebounced ;if it did do jump
ret ;if not, just return
CloseDebounced:
set Key1StateBit2
;done debouncing, go to closed state
ret
Key1ClosedState:
;if Key1Input =0, jmp to KeyJustOpened
jnb Key1Input,KeyJustOpened
ret ;key not closed, do nothing
KeyJustOpened:
orl KeyTimer,#KEY1_TIMER_SET_MASK
;start debounce timer
clr Key1StateBit1 ;open state
clr Key1StateBit2 ;just opened state
ret
Key1JustOpenedState:
mov a,#KEY1_TIMER_TEST_MASK
;check if timer timed out
anl a,KeyTimer
jz OpenDebounced ;if it did do jump
ret ;if not, just return
OpenDebounced:
set Key1StateBit2
;done debouncing, go to open state
ret
In this routine, you can see the timer in use, which of course is decremented elsewhere in the program, and you can see that each step of the machine requires very few instructions to execute during each pass through the main loop.
DISPLAY DEVICES
Our next example illustrates writing data to an array of LEDs (see Figure 3).
Below is a simplified model made up for the purposes of this explanation. In this model, bits are fed into the DAT line one at a time. As the bits enter, they all move around like a bucket brigade. The dots inside the dotted box represent LEDs that make up a 5-by-7 character array. If a one bit falls on a dot, that LED is lit. If a zero falls on a dot, it is not lit. The CE (chip enable), CLK (clock), and DAT (data) lines are manipulated to get the chips into the array. The CE line is brought high to start
sending bits into the array. To move a bit into the array, the CLK line is raised to one. Then the DAT line is raised to one if a bit is to be moved in, otherwise the DAT line remains zero if a zero bit is to be moved in. Then the CLK line drops to zero to tell the array a bit is on the DAT line. That process is repeated for each bit. Then the CE line is dropped to zero, which tells the array all bits are in and the one bits can be turned on.
The following code is designed to clock one bit in every pass through the main control loop. Note that to start the machine working the code invoking the routine puts the bits to make a character into Data0-4 and sets LedState, the state variable, to LED_START_STATE. When the routine is done entering the bits, the state variable equals LED_IDLE_STATE.
;NOTE: code to vector to the state labels is not ;displayed here
LedIdleState:
;do nothing until someone changes state variable to
;LedStartState
ret
LedStartState:
;Begin to display character in Data0-4
mov r0,#Data0 ;get index to first byte
setb CE ;raise chip enable to write bits to LED array
mov LedState,#LED_BIT_STATE
;switch to display bit state
ret
LedBitState:
setb CLK
;raise the clock letting LED know a bit is coming
mov a,@r0
;retrieve bits to write to LED array
rlc a ;rotate a bit into carry flag
mov DATA,c
;write the carry flag to the data out pin
mov @r0,a
;save the bits to write to LED
array
clr CLK
;drop clock informing LED the bit is done
clr DATA
;clear the data out pin to ready for next pass
djnz BitCnt,StillBitsToDo
;decrement bits to write
;& jump if not done
mov LedState,#LED_BYTE_STATE
;done writing, change state
StillBitsToDo:
ret
LedByteState:
;here we check to see if there are more bytes to do
inc r0
;compare index to end of data area, jump if not equal
cjne r0,#DATA4+1,StillBytesToDo
mov a,#LED_DONE_STATE ;we are done so
mov LedState,a
;change state
ret
StillBytesToDo:
mov BitCnt,#8
;more to do so reset bit count, do it again
mov LedState,#LED_BIT_STATE
;done, change state to send bits
ret
LedDoneState:
clr CE
;drop chip enable causing character to be displayed
mov LedState,#LED_IDLE
;go to idle state because we are done
ret
This routine would display a character in about 10ms. This assumes that for most functions the main control loop calls have nothing to do when they are called.
COMMUNICATIONS
Many
microcontrollers support some sort of communication hardware. This could range from IIC, SPI, SCI, or other serial communication systems. This hardware handles the details of communication while the software acts in a supervisory role. The hardware provides registers for the software through which control is effected.
Here I describe a rough model of the communications environment that contains common elements of these systems. Then I'll present a sample of code that handles communication with the concepts
presented in this article. Here's how the registers and hardware flags may be organized:
- InByte: A register in memory which receives a byte from the outside world
- OutByte: A register in memory which sends a byte to the outside world
- RxFlag: A bit when set by the hardware indicates InByte has received a byte
- TxFlag: A bit when set by the hardware indicates OutByte has been sent
To send a byte out of the system, the software would put a byte into OutByte and the hardware, recognizing the
new data, begins to send the data serially out over the transmission line. When the byte is sent, the processor sets the TxFlag, which may also cause an interrupt. When a byte comes in over the receive line, it is moved into the InByte register. When a complete byte is received, the RxFlag is set which may also cause an interrupt.
In this discussion we will assume all necessary items have been conditioned. A communications clock has been set for the right rate and necessary control bits have been set to
enable communication to proceed.
For this example, assume we are using full duplex. Input always comes in over one wire and output signals go out over another wire. Also assume that the messages are four bytes long. This message must be received in less than 100ms. If that is so, the controller sends an ACK; otherwise, the controller will send a NAK. Also our example does not use interrupts but constantly examines RxFlag and TxFlag, taking action when they are set. Only the input or read routine will be
presented.
The philosophy is the same as before; a function, named GetMessage, handles the job and is called from the main control loop. This function is a state machine. RxState is the state variable. The routine GetMessage accepts 4 input bytes placing them into Data0, Data1, Data2, and Data3, which are locations in RAM.
GetMessage:
;NOTE: the code to vector to the state labels
;has been left out for brevity
RxIdleState:
;this state is constanly executed while waiting ;for input
;When it detects an
input byte it receives the ;byte and prepares to receive the other bytes.
jb RxFlag,GotInput
;If RxFlag set jump to GotInput
ret ;just return to caller, no input yet
GotInput :
mov a,RxByte
;get input byte into register a
mov r0,#DATA0
;initialize index to message area
mov @r0,a
;move register a into message area
inc r0 ;point at next MT slot
mov a,#COMM_TIMEOUT ;fire up time out timer
mov CommTimer,a
mov a,#RECEIVE_STATE
;prepare to receive more bytes
mov RxState,a
;by changing
to the receive state
ret
RxReceiveState:
;This state receives input bytes
;It checks the timer to see if to much time has
;passed. When all bytes have been read it
;changes state to send an ACK mov a,CommTimer
;check to see if we timed out
jnz CheckForNext
;jump if we have not timed out
mov a,#RXTIMEOUT
;fall to here with message timeout
mov RxState,a ;and change to time out state
ret
CheckForNext:
jb RxFlag,GotNext
;if RxFlag set jump to GotNext
ret ;if not set just
return to caller
GotNext:
;Here we received another byte
mov a,RxByte ;move byte to register a
mov @r0,a
;move register a into message area
inc r0
;point at next byte in message area
cjne r0,#Data3+1,Next
;compare index to input data (r0)
;and jump if not equal to Data3+1 address
mov a,#RXACK_STATE ;done receiving bytes,
mov RxState,a ;change state to send ACK
Next:
ret
RxAckState:
;Send ACK and change state to wait for ACK to be sent
mov a,#TX_REQUEST_ACK_SEND
;Setup Tx to send
ACK
mov TxState,a
mov a,#RXACK_WAIT_STATE
;change read state to
mov RxState,a ;wait for ACK sent
ret
RxAckWaitState:
;Wait for ACK to be sent then change state to ;idle state
mov a,#TxState
;if Tx send state NE to idle
cjne a,#TX_IDLE,AckNotSent ;ACK is not sent
mov a,#RX_IDLE
;otherwise fall here and change state
mov RxState,a ;to idle state
AckNotSent:
ret
RxNakState:
;Send NAK and change state to wait for NAK to be ;sent
mov a,#TX_REQUEST_NAK_SEND
;Setup Tx to send
ACK
mov TxState,a
mov a,#RXNAK_WAIT_STATE
mov RxState,a
ret
RxNakWaitState:
;Wait for NAK to be sent, when sent change state ;to idle
mov a,#TxState
cjne a,#TX_IDLE,NakNotSent
mov a,#RX_IDLE
mov RxState,a
NakNotSent
ret
Small Feature Microcontrollers
In general the microcontrollers I've discussed here have been small but have still had several important hardware extras. Here's how to deal with the situation in which you have few or no luxuries at all. Consider, for example,
a microcontroller without stack hardware, indirect addressing capability, or communication specific hardware.
Little Or No Stack
Some small microcontrollers have limited stack capability or none at all. The method I present in this article reduces the need for the stack. The main control loop used in all of the examples uses calls to all of the functions. The calls could be eliminated by simply having all the functions in line. Instead of returning from the function when it completes, the
program could jump to the end of the function where the next function would begin. The stack is also used in many implementations as calls to common routines. The approach taken in this article does not use common routines since each state requires so few instructions. If a piece of code is used in more than one place it is simply duplicated. The reduced need for stack space enables those microcontrollers with limited stack to use it for very important functions.
No Indirect Addressing
The method
presented here depends on the ability to use vector tables (via indirect addressing) to vector to the appropriate state. Here is an alternative. The state variable is decremented until it is zero. At that point the appropriate state is jumped to. Here is a code snippet showing this:
AFunction:
;main control loop calls this function
mov a,StateVariable
;get state variable into register a
jz State0
;if register a is 0 jump to state 0
dec a
;subtract 1 from the register
jz State1
;if reg a
is now 0 jump to state 1
dec a ;and so on
jz State2
The state getting control can then put a value into StateVariable to cause another state to be executed on next pass.
Communication Without Communication Hardware
Now consider a microcontroller that does not have the hardware to support communications. Assume we are building a "machine" that will monitor an input line and accept asynchronous 8-bit messages commonly found in RS-232 communications. We don't need timer
interrupts: the timer can be set up to run free. It can be constantly incremented so that when it reaches its maximum value it will wrap to zero. We'll call the routine to handle all of this DoComm. It's called between each function call in the main control loop. The main control loop might appear as follows:
MainControlLoop:
call KeyBoard
call DoComm
;Do input bit manipulation
call Heater
call DoComm
;Do input bit manipulation
call GetMessage
call DoComm
;Do input bit
manipulation
jmp MainControlLoop
Notice the GetMessage function, which is similar to the communication function we discussed earlier. It's another task, just as it was before. Before, however, the hardware was getting the data for it to handle. Now the DoComm function handles this task. Timing is the critical aspect of this task and is handled as follows. Each state only executes about 15 instructions which should require about 15ms for each function. Then every 15ms, or after each function, we call
DoComm. Upon entry DoComm loops waiting for the exact time or "point of examination" necessary for its processing. The precise time is in DesireTime. The code might appear:
DoComm:
DesiredTime equ R4
;this code snippit assumes an 8048 type architecture
DoCommA:
;loop to here to wait for right time
mov a,T
;get time ticks from reg T into a
;8048 has no compare, use xor,complement
;for that
xrl a,DesiredTime
;compare present time to desired time
cpl a
;compare present time to desired
time
jnz DoCommA
;not yet, jump to DoCommA to try again
;fall here with time to handle communications
add a,#COMMTIME ;set up to get next comm handle time
mov DesiredTime,a
;save the next desired time
;continue processing communications
There are two critical times involved with the reading of an 8-bit byte. One time is the time to accept one bit. We can call that the bit time. The other time is how often we read the input line to check for that bit. That time is called the point of examination
or ticks in this document. The time from point of examination to point of examination is one-tenth of the bit time. Another way to put it is the point of examination occurs 10 times more often than a bit. The code presented above in DoComm enables the "point of examination" to be precisely located. The structure of a byte coming into the read port may be as follows:
- A mark time which is at least 1.5 bit times. Mark means the signal is high or one
- A start bit which is a zero-level signal for
1-bit time
- Then eight bits follow, each bit being a bit time high or low
- After the eight data bits are received, the input line may enter the mark state for another 1.5 bit times or more
DoComm must be written to accept this structure. As before, the DoComm function is divided into states each one doing its small share of the whole operation. The states could be as follows:
- Mark state: this state gets control constantly while the input line is one. When mark state detects a zero input, it assumes
that it is the beginning of a start bit and sets up a counter to count ticks into the middle of the first bit. Then the state is changed to read state
- Read state: when the tick counter goes to zero, read state reads the input line and rotates that value as a bit into a byte holding the input eight bits. The tick counter is set up again to wait until the middle of another bit. This process continues until eight bits have been read. Then the state is changed to done state
- Done state: this state serves
to notify other parts of the program that a byte has been read. The other function can read the state of this function, find the state variable equal to done state and take action to move the new byte in. One of the actions would be to set the state variable here to find mark state
- Find mark state: this state continues reading the input line until it is one for 1.5 bit times. Then it changes the state to mark state. The process is ready to read another byte
DIVIDE AND CONQUER
The point I
want to make in this article is that many tasks can be divided into smaller parts, each part of which is executed sequentially. This creates the illusion that all tasks are performed at the same time. This technique can be applied to almost any other task. Other functions that can be adapted to this technique include multiplying and dividing. These are loop-intensive tasks. Each loop could be a part of a math function that is executed each pass of the main control loop. Managing serial EEPROM is another
application. Each pass through the main control loop could set up or finish a write. Often in manipulating these devices one must wait for some action to be completed within the part. With this technique, instead of waiting, the loop continues and other work is done.
Perhaps some day you will try this idea on a project that has limited capability and rewrite an old favorite routine, finding, like the author, that the performance of these little chips can be impressive.
Al Schneider has a BS in physics. He
has been programming for 26 years, concentrating on embedded systems. Presently he is a contract programmer and has worked on many applications, including pacemakers, airplane guidance systems, industrial HVAC interfaces, and plastic and molten metal injection control systems. If you try using the technique described in this article, Al would like to hear from you. You can reach him at als@citilink.com.
|