CMP EMBEDDED.COM


Back to the Basics - Practical Embedded Coding Tips: Part 2
Asynchronous Hardware/Firmware, race conditions and solutions



Embedded.com
What options are available?
Fortunately, a number of solutions exist. The easiest is to stop the timer before attempting to read it. With the counter halted, there is no chance of an overflow putting the upper and lower halves of the data out of sync. This is a simple and guaranteed solution, but it costs us time: since the hardware generally counts the processor's clock, or the clock divided by a small number, the timer may lose quite a few ticks during the handful of instructions executed to do the reads.
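As a minimal sketch of the stop-then-read idea, the following C fragment models the timer with plain variables so it can run on a host. The names timer_count, timer_running, timer_stop, timer_start, and read_timer_stopped are hypothetical stand-ins; a real part would use its own memory-mapped registers and control bits.

```c
#include <stdint.h>

/* Hypothetical simulated hardware: on a real part these would be
   device-specific memory-mapped registers. */
static volatile uint16_t timer_count;        /* 16-bit hardware counter */
static volatile int      timer_running = 1;  /* counter run/stop control */
static volatile uint16_t timer_hi;           /* upper half, kept by the overflow ISR */

static void timer_stop(void)  { timer_running = 0; }
static void timer_start(void) { timer_running = 1; }

/* Read the 32-bit time with the counter halted, so the two halves
   cannot fall out of sync. Any ticks that would have arrived while
   the counter is halted are lost. */
uint32_t read_timer_stopped(void)
{
    timer_stop();
    uint32_t low  = timer_count;
    uint32_t high = timer_hi;
    timer_start();
    return (high << 16) | low;
}
```

The window between timer_stop() and timer_start() is where the lost ticks come from, so it should be kept as short as possible.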

The problem will be much worse if an interrupt causes a context switch after disabling the counting. Turning interrupts off during this period will eliminate unwanted tasking, but increases both system latency and complexity.

I just hate disabling interrupts: system latency goes up, and sometimes the debugging tools get a bit funky. When reading code, a red flag goes up if I see a lot of disable-interrupt instructions sprinkled about. Though not necessarily bad, it's often a sign that either the code was beaten into submission (made to work by heroic debugging instead of careful design), or there's something quite difficult and odd about the environment.

Another solution is to read the timer_hi variable, then the hardware timer, and then reread timer_hi. If the two timer_hi values aren't identical, an overflow interrupt occurred between the reads. Iterate until the two reads are equal. The upside: correct data, interrupts stay on, and the system doesn't lose counts.

The downside: in a heavily loaded, multitasking environment, it's possible that the routine could loop for rather a long time before getting two identical reads. The function's execution time is nondeterministic. We've gone from a very simple timer reader to somewhat more complex code that could run for milliseconds instead of microseconds.
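The reread approach might be sketched as follows. This is a host-side simulation under stated assumptions, not code for any particular device: inword_timer and the sim_ variables are hypothetical, and the simulated hardware read injects exactly one rollover (doing what the overflow ISR would do) so that the retry path is exercised.

```c
#include <stdint.h>

static uint16_t          sim_count = 0xFFFF;        /* simulated 16-bit counter */
static volatile uint16_t timer_hi = 0;              /* upper half, kept by the ISR */
static int               sim_overflow_pending = 1;  /* inject one rollover */

/* Hypothetical stand-in for the hardware read. The first call simulates
   a rollover arriving in mid-read: the counter wraps to zero and the
   overflow ISR increments timer_hi. */
static uint16_t inword_timer(void)
{
    if (sim_overflow_pending) {
        sim_overflow_pending = 0;
        sim_count = 0;
        timer_hi++;
    }
    return sim_count;
}

/* Read timer_hi, then the hardware, then timer_hi again; retry until
   timer_hi did not change around the hardware read. */
uint32_t read_timer_retry(void)
{
    uint16_t high, low;
    do {
        high = timer_hi;
        low  = inword_timer();
    } while (high != timer_hi);
    return ((uint32_t)high << 16) | low;
}
```

In this simulation the loop runs twice. On a busy system the iteration count has no fixed bound, which is the nondeterminism described above.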

Another alternative might be to simply disable interrupts around the reads. This will prevent the ISR from gaining control and changing timer_hi after we've already read it, but creates another issue.

We enter read_timer and immediately shut down interrupts. Suppose the hardware timer is at our notoriously-problematic 0xffff, and timer_hi is zero. Now, before the code has a chance to do anything else, the overflow occurs. With context switching shut down we miss the rollover.

The code reads a zero from both the timer register and from timer_hi, returning zero instead of the correct 0x10000, or even a reasonable 0x0ffff. Yet disabling interrupts is probably indeed a good thing to do, despite my rant against this practice.
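The failure can be reproduced in a small host-side simulation. Everything here is a hypothetical model: sim_tick advances a simulated counter, and interrupts_enabled decides whether the overflow ISR runs immediately or is held pending.

```c
#include <stdint.h>

static uint16_t sim_count = 0xFFFF;     /* counter one tick from rollover */
static uint16_t timer_hi = 0;           /* upper half, kept by the ISR */
static int      interrupts_enabled = 1; /* simulated global interrupt flag */
static int      overflow_pending = 0;   /* ISR request held off while disabled */

/* Advance the simulated counter by one tick. On rollover the ISR runs
   only if interrupts are enabled; otherwise it stays pending. */
static void sim_tick(void)
{
    if (++sim_count == 0) {
        if (interrupts_enabled)
            timer_hi++;
        else
            overflow_pending = 1;
    }
}

/* Naive read with interrupts simply disabled around the two reads. */
uint32_t read_timer_naive(void)
{
    interrupts_enabled = 0;
    sim_tick();                 /* the rollover lands right after the disable */
    uint32_t low  = sim_count;  /* reads 0 */
    uint32_t high = timer_hi;   /* still 0: the ISR never ran */
    interrupts_enabled = 1;
    if (overflow_pending) {     /* the ISR finally runs, too late for this read */
        overflow_pending = 0;
        timer_hi++;
    }
    return (high << 16) | low;
}
```

The function returns 0 even though the true time is 0x10000; timer_hi only catches up after the value has already been computed.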

With interrupts on, there's always the chance our reading routine will be suspended by higher priority tasks and other ISRs for perhaps a very long time. Maybe long enough for the timer to roll over several times. So let's try to fix the code. Consider the following:

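long Read_timer(void){
    unsigned int low, high;
    push_interrupt_state;               /* pseudo-C: save the current interrupt state */
    disable_interrupts;                 /* pseudo-C: hold off the overflow ISR */
    low=inword(timer_register);
    high=timer_hi;
    if(inword(timer_overflow))          /* rollover occurred with the ISR held off */
        {++high;                        /* do the ISR's increment by hand */
        low=inword(timer_register);}    /* reread the post-rollover count */
    pop_interrupt_state;                /* pseudo-C: restore the saved state */
    /* the shift needs its own parentheses: + binds tighter than << */
    return ((((ulong)high)<<16) + (ulong)low);
}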

We've made three changes to the RTEMS code. First, interrupts are off, as described. Second, you'll note that there's no explicit interrupt re-enable. Two new pseudo-C statements have appeared, which push and pop the interrupt state. Trust me for a moment—this is just a more sophisticated way to manage the state of system interrupts.
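One common reason to save and restore the interrupt state, rather than unconditionally re-enabling, is nesting: a routine called with interrupts already off should leave them off when it returns. The sketch below simulates that on a host. irq_save, irq_restore, and the interrupts_enabled flag are hypothetical names; real compilers and processors provide their own intrinsics for this.

```c
#include <stdint.h>

static int interrupts_enabled = 1;   /* simulated global interrupt flag */

typedef int irq_state_t;

/* "Push": remember the current state, then disable interrupts. */
static irq_state_t irq_save(void)
{
    irq_state_t previous = interrupts_enabled;
    interrupts_enabled = 0;
    return previous;
}

/* "Pop": put back whatever state was in force before the save. */
static void irq_restore(irq_state_t previous)
{
    interrupts_enabled = previous;
}

static int state_after_inner;        /* recorded for inspection */

static void inner(void)
{
    irq_state_t s = irq_save();
    /* ... short critical section ... */
    irq_restore(s);                  /* restores "off" when called from outer() */
}

void outer(void)
{
    irq_state_t s = irq_save();
    inner();
    state_after_inner = interrupts_enabled;  /* still 0: inner() did not re-enable */
    irq_restore(s);                          /* back to the caller's state */
}
```

Had inner() ended with a plain enable instruction, the rest of outer()'s critical section would have run with interrupts on.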

The third change is a new test that looks at something called "timer_overflow," an input port that is part of the hardware. Most timers have a testable bit that signals an overflow took place. We check this to see if an overflow occurred between turning interrupts off and reading the low part of the time from the device. With the ISR inactive, the variable timer_hi won't properly reflect such an overflow.

We test the status bit and reread the hardware count if an overflow had happened. Manually incrementing the high part corrects for the suspended ISR. The code then concatenates the two fixed values and returns the correct result—every time. With interrupts off we have increased latency. However, there are no loops; the code's execution time is entirely deterministic.
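To check this reasoning on a host, the routine can be run against a simulated timer. This is only a sketch: inword_timer, sim_overflow_flag, and sim_ticks_before_read are hypothetical, interrupts are assumed to be off for the whole call, and the deferred ISR that would later increment timer_hi and clear the status bit is not modeled.

```c
#include <stdint.h>

static uint16_t sim_count;              /* simulated 16-bit counter */
static uint16_t timer_hi;               /* upper half, normally kept by the ISR */
static int      sim_overflow_flag;      /* simulated overflow status bit */
static int      sim_ticks_before_read;  /* ticks injected before the next read */

/* One tick; a rollover sets the status bit. The ISR is held off, so
   timer_hi is not touched here. */
static void sim_tick(void)
{
    if (++sim_count == 0)
        sim_overflow_flag = 1;
}

/* Hypothetical stand-in for the hardware read: lets the counter advance
   by the configured number of ticks before returning its value. */
static uint16_t inword_timer(void)
{
    while (sim_ticks_before_read > 0) {
        sim_ticks_before_read--;
        sim_tick();
    }
    return sim_count;
}

/* Same logic as Read_timer, with interrupts assumed already disabled. */
uint32_t read_timer(void)
{
    uint16_t low  = inword_timer();
    uint16_t high = timer_hi;
    if (sim_overflow_flag) {   /* rollover the ISR has not yet seen */
        ++high;                /* correct the high half by hand */
        low = inword_timer();  /* reread the post-rollover count */
    }
    return ((uint32_t)high << 16) + low;
}
```

Starting from a count of 0xffff with timer_hi at zero and one tick injected before the first read, the routine returns 0x10000 rather than zero. With no rollover it simply concatenates the two halves.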
