Implementing the logic behind complex behavior -

Implementing the logic behind complex behavior

Making sure event-driven, asynchronous systems behave correctly can be a difficult task, usually resulting in complex implementations with hard-to-find and hard-to-fix corner cases. This article introduces some simple techniques that can make the implementation much easier. State machines are used to represent dependencies between events and data. By combining simple state machines into one diagram, you can get a clear picture of the complex behavior and derive transition code from that diagram.

The model problem
The following example shows the problems you'll encounter when designing a controller that initializes and maintains the state of some hardware resource. The controller receives events from three entities:

  • ProcessController sends Enabled and Disabled events, which indicate the status of the software running on the hardware resource;
  • DataManager sends DataAvailable event, which comes with data needed for initialization;
  • Hardware resource sends InitResponse event when initialization is finished.

The hardware can't be initialized until the Enabled and the DataAvailable events (with valid data) have been received. Initialization involves sending an InitRequest with the data to a process running on the hardware, then waiting for the InitResponse, after which the hardware is available. A Disabled event should mark the hardware as unavailable, and the hardware must be reinitialized after another Enabled event. Any DataAvailable event with modified data should cause a reinitialization as well.

Figure 1: UML state diagram modeling an initialization sequence

The logic I've described can be modeled as a state machine. A possible design using the Unified Modeling Language (UML) notation is shown in Figure 1. In order to have a reliable design for an asynchronous system, every event must be handled in every state. The data that comes with an event can also determine the behavior. Invalid data, for example, should mark the resource as unavailable. The state diagram therefore is cluttered with choice points and transitions.

Transition code is not shown in Figure 1 and only adds to the complexity. Of course, other designs are possible but the main problem is that all states, events, and data are tied together. For every event and every state the developer must determine whether it's valid to stay in this state or, if not, which transition to take and what code to execute. Adding new events or more states increases the complexity exponentially since almost everything will be affected. What you need are abstractions.

The state diagram in Figure 1 focuses only on the initialization sequence; other events just have to fit in. The idea is to break up the system into multiple state machines, where each state machine handles a unique subset of related events. The data that's part of an event is important for the behavior. Therefore, data ranges are mapped to a state machine where the states reflect the influence on the behavior. In this example, it's important whether data is valid or invalid.

Initialization can start when isEnabled equals true (local variable indicating that an Enabled event was received) and the data is valid. This specific combination of states can be represented by a state machine as well: a state machine whose state is dependent on the states of other state machines has actions associated with state changes and controls another state machine. The initialization state machine is only interested in whether or not it's okay to start the initialization process. The new state machine represents those preconditions and is an abstraction that implements the dependencies between other state machines. I'll refer to this as a Dependency State Machine (DSM) .

Figure 2: Dependencies

A graphical representation of the dependencies is shown in Figure 2. The lines show the direction in which the dependencies work. Figure 2 also shows that the DataState has a variable configData associated with it.

Implementation details
Implementing this design is language independent, though templates in C++ make the code a bit more readable. First, I'll introduce enumerations for all states as shown in Listing 1. Local variables store the state information as shown in Listing 2.

Listing 1: Enumerations for all states

enum AvailabilityState { enabled, disabled };
enum DataState { invalid, valid };
enum InitializationState { uninitialized, waitForRsp, initialized };
enum DependencyState { active, inactive };

Listing 2: How local variables store the state information

AvailabilityState theAvailabilityState;
DataState theDataState;
InitializationState theInitializationState;
DependencyState theInitDependencyState;

Next we define the functions that handle the events. How events are received doesn't really matter as long as a function will be executed when an event is received. These functions are primarily responsible for determining the new state associated with this event. For example, the function in Listing 3 handles the Enabled and Disabled events. Similar for the DataAvailable event shown in Listing 4.

Listing 3: Function handles Enabled and Disabled events

voidExample1::handleAvailabilityEvent (bool isEnabled){  theAvailabilityState = (isEnabled ? enabled : disabled);  checkInitDependencyState();}  

Listing 4: Function handles DataAvailable event

voidExample1::handleDataAvailableEvent (DataType& data){  if (configData != data)   {    theDataState = invalid;  // moves InitDependencyState to inactive     checkInitDependencyState();  }  configData = data;   if (configData.isValid())   {     theDataState = valid;    checkInitDependencyState();  }}  

The function checkInitDependencyState should be called after every state assignment. This function evaluates the state of the DSM, handles the state changes, and influences the next state machine in the dependency chain—in this case, the InitState shown in Listing 5.

Listing 5: InitState to be checked by checkInitDependencyState

voidExample1::checkInitDependencyState(){  DependencyState newState = (theAvailabilityState == enabled &&      theDataState == valid) ? active : inactive;  
  if (theInitDependencyState != newState) }  { // a state change    if (newState == active)     { // to active  
      initCounter++;  // used for matching request and response      sendInitReq();  
    }    else     {  // transition to inactive: move initState to notInitialized  
      theInitState = notInitialized;  
    }  }  theInitDependencyState = newState;}  

First determine the new state and then check for state changes. The red text in Listing 5 represents the transition code that's executed as part of state changes. For example, a transition of the DSM to active due to an Enabled and a DataAvailable event (in random order) triggers a function call to sendInitReq. This function sends the InitRequest and moves the initialization state to waitForResponse:

sendInitReq(){  //use initCounter as a    transactionId  ::send(configData, initCounter);     // implementation dependent  theInitState = waitForResponse;}  

When an event is received that causes the InitDependencyState to move to inactive (such as a DataAvailable event with invalid data), InitState is forced to notInitialized, therefore ignoring the InitResponse shown in Listing 6.

Listing 6: Function handles InitResponse

voidExample1::handleInitResponse(int transactionId){  if (theInitState == waitForResponse)   { // still waiting for response?    if (transactionId == initCounter)     {      theInitState = initialized;    }     // else: wrong transactionId so wait for another response  }   // else: wrong state}  

The code in Listing 6 shows the use of a transactionId to match InitRequest and InitResponse since multiple InitRequests may be send, but the state machine should only trigger on the InitResponse that matches the InitRequest that was send last. The initCounter is updated before sending a new request.

Automatic dependency check
A different approach is to wrap the enumerated types into a templated class that handles the dependencies in the background by executing callbacks. The interface of the templated State class looks like this:

template class State{  public:    void operator=(const T& );    bool operator==(const T& );    bool operator!=(const T& );    void setDependency(Callback* );  private:    Callback* theCallback;    T theState;};  

The key function here is:

operator=(const T& aState){  if(theState != aState)    theState = aState;    if (theCallback)       (*theCallback)();}  

The callback should point to a checkInitDependencyState- like function, so that after every state assignment, the dependencies will be checked. Note that the Callback class used here is a templated class that stores a pointer to a function. Sample code is shown in Listing 7.

Listing 7: Callback

template class Callback{  public:    Callback(T& obj, void (T::*f)());    void operator()();  private:    void (T::*function)();    T* object;};  
template Callback::Callback(T& obj, void  (T::*f)()){  function = f;  object = &obj;}  
template void Callback::operator()(){  (object->*function)();}  

A class that uses the template would define states as shown in Listing 8. Adding the dependencies between the states is done by specifying the callbacks shown in Listing 9.

Listing 8: Definition of theDataState

State theAvailabilityState;
State theDataState;

Listing 9: Specifying the callbacks

Callback* callback = new Callback}  (*this, &InitExample::  checkInitDependencyState); theAvailabilityState. setDependency(callback);theDataState.setDependency(callback);  

All dependencies can be located in one place, such as the constructor of a class. An stl::list of Callback pointers can be used instead of a pointer if a state machine influences more than one state.

To show how easy it is to add more dependencies, consider an additional requirement that a ClusterId has to be known as well. Simply add a new state machine with states “idKnown” and “idUnknown,” set the callback to the checkInitDependencyState function, and add the state check shown in Listing 10.

Listing 10: Adding more dependencies

voidExample2::checkInitDependencyState(){  DependencyState newState = (theAvailabilityState == enabled &&     theDataState == valid && theClusterIdState == idKnown)      ? active : inactive;  // nothing else changed  ..}  

This is a simple addition and hardly affects existing code. And of course, don't forget to store the ClusterId as well. The DSM prevents this change from affecting other parts of the system. This is a big difference from the state chart shown at the beginning of this article, where most of the transitions and choice points would have been affected.

Figure 3: A generic dependency chain of a state machine

Figure 3 shows a more generic dependency chain of state machines. The outlined box represents a new state machine. It handles only events related to this state machine and reports its state to the DSM (shaded boxes).

Transition code
The interaction between the DSM and the InitState by means of state assignments can be a dangerous thing. Code that depends on state checks will be affected when this logic changes. Is there a way to make sure that state assignments and state checks are in sync? Related to this is writing protective code, such as adding checks for preconditions shown in Listing 11.

Listing 11: Adding checks for preconditions

voidExample1::sendInitReq(){  if (theInitDependencyState == active) // add this just to be sure  {      ::send(configData, initCounter);  // implementation dependent    theInitState = waitForResponse;   // always ok to do this?  }}  

It seems like there's no harm in doing this since the code still works. Apparently a lot of implementations are functionally equivalent. But is this really true?

The following example will make the difference between the various implementations much clearer. With the current implementation, an InitRequest is sent every time the InitDependencyState transitions to active, without waiting for an InitResponse. The receiving process queues up all InitRequests but should only handle the last one received since it contains the latest data. All other requests are just delaying the process. So the following logic is needed: don't send a new request when a response is pending; send a new InitRequest (if needed) when a response is received. Basically, postpone the sending of the request. How hard can that be?

A simple check for the InitState in the sendInitReq function seems enough. If the InitState equals waitForResponse, don't send a request shown in Listing 12.

Listing 12: A simple check for WaitForResponse

voidExample2::sendInitReq(){  if (theInitState != waitForResponse)  {     // send signal    theInitState = waitForResponse;  } }  

When a response is received, decide whether or not to send another request (a “resend check”). This depends on state changes of the InitDependencyState and such changes can be detected by using the transactionId. If the counter doesn't match the transactionId in the InitResponse then the counter was updated due to an InitDependencyState transition to active shown in Listing 13.

Listing 13: InitDependencyState transition to active

voidExample2::handleInitResponse(int transactionId){  if (theInitState == waitForResponse)   {    if (transactionId == initCounter)     {  // InitDependencyState did not change      theInitState = initialized;    }    else     {  // wrong transactionId      sendInitReq();    }   }}  

The function call to sendInitReq is only executed when the InitState equals waitForResponse. The “optimization” is now preventing the correct behavior! I tried moving some code around, changing and adding if statements, and so on. Every time, one or more use cases failed. The question now is how to come up with the transition code and be sure that it is correct without having to go through the iterative implement/test cycle.

Combined states
This article started with creating small state machines for every set of related events and focused on dependencies between them. In order to find transition code, two state machines are combined into one. The picture in Figure 4 shows all combinations of the InitDependencyState (two states) and the InitState (three states).

Figure 4: Combined state diagram for InitDependencyState and InitState

The horizontal transitions are caused by InitDependencyState changes (and therefore indirectly by events), while vertical transitions are triggered by events handled by the InitState (InitResponse) and by state manipulation (assignments) by the InitDependencyState (dotted lines).

The initial state is inactive/notInitialized. The only thing that can happen here is a transition to active. This should trigger an InitRequest to be sent. The sendInitReq then assigns the InitState to waitForResponse causing the transition to active/waitForResponse. The only two events that need to be handled now are inactive and an InitResponse event. An inactive transition brings us to inactive/waitForResponse. The InitState should then be forced to notInitialized, and we're back at the initial state. An InitResponse in active/waitForResponse should bring us to active/initialized. The diagram shows that it's possible to send multiple requests without waiting for a response (InitDependencyState can change at any time). Therefore, responses have to be matched to requests, which is done in the choice point. If the transactionId equals the counter, then that response matches the last request that was sent, so the InitState can transition to initialized. Otherwise, wait for the next response. Finally, when initialized, the only possible event is inactive. The InitState then has to be forced to notInitialized, and we're ready to go from the beginning.

In Figure 5, you can see what happens when we add the optimization.

Figure 5: Combined state diagram for InitDependencyState and InitState (more efficient)

The only difference is that when in the active/waitForResponse state, a transition of InitDependencyState to inactive should not force the InitState to notInitialized state. Now, when in waitForResponse/inactive state, two things can happen. An InitResponse will bring it back to notInitialized, while an active should not do anything besides incrementing the counter, since we're still waiting for a response. The choice point code should, upon receiving an InitRequest, check whether or not to send another InitRequest. Of course, this should be done whenever the transactionId doesn't match the counter since this indicates changes in the InitDependencyState while waiting for a response.

The code can be derived by systematically checking every combination of state and transition. After rearranging the if-statements, it looks like the code in Listing 14.

Listing 14: The code after rearranging the if-statements

voidExample2::checkInitDependencyState(){  DependencyState newState = (theAvailabilityState == enabled &&      theDataState == valid) ? active : inactive;  
  if (theInitDependencyState != newState)   {  // a state change    if (newState == active)     {                // to active      initCounter++;      if (theInitState == notInitialized)       {         sendInitReq();        theInitState = waitForResponse;  // force transition      }     }    else {  // transition to inactive      if (theInitState == initialized)       {        theInitState = notInitialized;  // force transition      }    }  }  theInitDependencyState = newState;}  
voidExample2::handleInitResponse(int transactionId){  if (theInitState == waitForResponse)    if (theInitDependencyState == active) {      if (transactionId == initCounter) {        theInitState = initialized;      }      else       {        sendInitReq()      }     }    else     {  // state = Inactive/WaitForResponse      theInitState = notInitialized;    }   }}  

Notice how the sendInitReq function is called from two different contexts: active/NotInitialized and active/WaitForResponse. So the check for active is not needed in this function. The function is also called when an InitResponse is received, which means that no response is pending so a new request can be send. This implies that the check for waitForResponse is also not needed. Therefore, the state machine ensures the right preconditions when this function is called. Also, moving the InitState to waitForResponse is handled by the transition code of the DSM (forced transition), leaving the sendInitReq function free of any side affects, shown in Listing 15.

Listing 15: No checks or assignments needed in sendInitReq

voidExample2::sendInitReq(){  if ((theInitDependencyState == active) && (theInitState != waitForResponse))  {     ::send(configData, initCounter);  // implementation dependent    theInitState = waitForResponse;  } }  

This example shows that by combining a DSM and a regular state machine, you can focus on only a small portion of the complex behavior. The logic that determines the DSM is completely irrelevant.

Another example
Now that we've covered the basics let's tackle a more complex scenario. Let's say the controller needs to know its ClusterId before it can get configuration data from a database, which has to be up and running. Events indicate the status of the database (up or down). As in the previous example, this data will be used in the initialization process.

Combining everything, the complete dependency diagram looks like the one shown in Figure 6.

Figure 6: Complete dependency diagram

Compared with the previous dependency diagram, four new states are added but they only influence the DataState, which still has a dependency to the InitDependencyState. Therefore, the initialization logic and implementation is not affected by this addition.

Just as in the previous section, a DSM and a regular state machine will be combined into one. For this example, the DataDependencencyState and GetDataState are combined shown in Figure 7.

Figure 7: DataDependencyState and GetDataState are combined

The choice point implements the same kind of resend check as in the previous example. For example, while waiting for the DataResponse, changing the ClusterId means that new data has to be retrieved from the database. This mechanism is highly efficient, especially in systems with slow response times.

Listing 16: Part of code that combines DataDependency State with GetDataState

voidExample3::checkDataDependencyState(){  DependencyState newState = (theClusterIdState == known &&    theDataServerState == up) ? active : inactive;  
  if (theDataDependencyState != newState)   {  // state change    if (newState == active)     {  // change to active      dataCounter++;      if (theDataState == notWaiting)       {        sendDataReq();        theDataState = waitForResponse; // force transition, see diagram      }    }  }  theDataDependencyState = newState;}  
voidExample3::handleDataResponse(DataType& data){  if (theDataDependencyState == active)   {    if (theTransactionId == dataCounter)     {      theDataState = notWaiting;      handleDataAvailable(data);    }    else     {      sendDataReq();    }  }  else {    theDataState = notWaiting;  }}  

Part of the code is shown in Listing 16. Note how the function handleDataAvailable is simply reused from the previous example. This function is no longer called when an event is received, but is now dependent upon a completely different set of preconditions. The initialization part of the code, however, didn't change at all! This shows how existing designs can be extended.

There are thousands of combinations in which the seven input events shown in the previous example can be received. It's nearly impossible to describe all these use cases. But by focusing on dependencies, the logic of the behavior can be modeled much more straightforwardly. Combining a DSM and a regular state machine into one enables you to focus on that specific part of the behavior.

Changing requirements
Requirements will always change, so it's a must to have code that can handle these changes without a major overhaul. Here are some examples:

  • Changing the ClusterId causes the system to retrieve new data followed by reinitializing the hardware. While waiting for this data to arrive, the hardware is still marked as available. How to make it not available? Add a dependency from ClusterIdState to InitDependencyState (also set the callback) shown in Listing 17.

Listing 17: Add dependency from ClusterIdState to InitDependencyState

DependencyState newState = (theAvailabilityState == enabled && }     theDataState == valid && theClusterIdState == idKnown)   
  • The InitState is available even when the system is waiting for updated data, which upon arrival brings the system down, so why mark it as available when you know it'll be unavailable shortly? Add a dependency from GetDataState to InitDependencyState, as shown in Listing 18. This way the initialization process will not start until the response is received with the latest information.

Listing 18: Add dependency from GetDataState to InitDependencyState

DependencyState newState = (theAvailabilityState == enabled &&}     theDataState == valid && theGetDataState == notWaiting)   
  • A reset button that resets (parts of) the system can be implemented with a state machine consisting of two states: “resetting” and “done” where done state is part of a precondition check for DSM. A reset then involves switching to resetting and back to done.

These changes also don't affect the DSM state machines, but only the dependencies. In the object-oriented world, derived classes can even override the code that sets the callbacks and override methods like checkInitDependencyState creating different behavior without affecting the transition code in the base class.

Large-system considerations
For big systems that communicate with dozens of entities, the design can easily be broken up into smaller pieces, which then can run in their own thread if needed and communicate via any Inter Process Communication (IPC) mechanism. State reporting to a DependencyState can be done just as well with events; it doesn't have to be a function call.

The examples given in this article are taken from a large telecommunications product. Initialization of the hardware was only a small fraction of the total dependency web. The ability to reset specific parts of the system proved to be especially useful.

The examples here have only a limited number of events, yet the number of use cases is in the thousands. By breaking up the system into small state machines, and focusing on the dependencies among them, the behavior can be modeled much easier and you, the programmer, are much more in control. The code can be derived and verified from the combined state diagram. State assignments and state checks can be visualized in the state diagram so it's easier to keep them in sync.

I hope this also makes clear what the side effects can be of adding flags to fix corner cases or if-statements to make the code robust.

Jos Dijkstra has a masters in mechanical engineering from the University of Groningen, Holland. He has seven years of software engineering experience and is currently working for Ericsson Wireless Inc in Boulder, CO. You can reach him at

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.