Self-testing in embedded systems: Hardware failure

Colin Walls, February 09, 2016

Hard faults are permanent malfunctions and show up in three forms:

  1. Memory does not respond to being addressed at all.
  2. One or more bits are stuck at 0 or 1.
  3. There is cross-talk: addressing one bit has an effect on one or more others.


Form (1) results in a trap, in the same way as the aforementioned peripheral failure. Implementing a suitable trap handler addresses this issue.

The other forms of hard memory failure, (2) and (3), may be detected using self-test code. This testing may be run at start-up and can also be executed as a background task.

Start-up memory testing
There are a couple of reasons why testing memory on start-up makes a lot of sense. As with most electronic devices, the time when memory is most likely to fail is on power-up. So, testing before use is logical. It is also possible to do more comprehensive testing on memory that does not yet contain meaningful data.

A common power-up memory test is called “moving ones”. This tests for cross-talk – i.e. whether setting or clearing one bit affects any others. Here is the logic of a moving ones test:

set every bit of memory to 0
for each bit of memory
    verify that all bits are 0
    set the bit under test to 1
    verify that it is 1
    verify all other bits are 0
    set the bit under test to 0

The same idea may be applied to implement a moving zeros test. Ideally, both tests should be used in succession. Coding these tests needs care. The process should not, itself, use any memory – code should be executed from flash and all working data must be stored in CPU registers.
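As a rough illustration, the moving ones logic above might be sketched in C as follows. The names `moving_bit_test` and `all_bytes_equal` are hypothetical, and the `background` parameter is an assumption added so that one routine covers both variants (0x00 for moving ones, 0xFF for moving zeros). This is a readable sketch rather than a production test: as written it relies on the compiler for where its loop variables live, whereas a real start-up test would need to guarantee that they stay in CPU registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte in mem[0..nbytes) equals `expected`, ignoring the
 * byte at index `skip`. Pass skip == nbytes to check every byte. */
static int all_bytes_equal(const volatile uint8_t *mem, size_t nbytes,
                           uint8_t expected, size_t skip)
{
    for (size_t i = 0; i < nbytes; i++) {
        if (i != skip && mem[i] != expected) {
            return 0;
        }
    }
    return 1;
}

/* background == 0x00 gives a moving ones test, 0xFF a moving zeros test.
 * Returns 0 if no fault was seen, -1 on the first mismatch. */
int moving_bit_test(volatile uint8_t *mem, size_t nbytes, uint8_t background)
{
    /* Set every bit of memory to the background value. */
    for (size_t i = 0; i < nbytes; i++) {
        mem[i] = background;
    }

    for (size_t i = 0; i < nbytes; i++) {
        for (unsigned bit = 0; bit < 8; bit++) {
            /* Background with only the bit under test inverted. */
            uint8_t pattern = (uint8_t)(background ^ (1u << bit));

            /* Verify that all of memory still holds the background. */
            if (!all_bytes_equal(mem, nbytes, background, nbytes)) {
                return -1;
            }

            mem[i] = pattern;

            /* Checks the bit under test and its neighbours in this byte. */
            if (mem[i] != pattern) {
                return -1;
            }
            /* Checks every other byte is undisturbed. */
            if (!all_bytes_equal(mem, nbytes, background, i)) {
                return -1;
            }

            mem[i] = background;
        }
    }

    /* Confirm the final restore left memory in the background state. */
    return all_bytes_equal(mem, nbytes, background, nbytes) ? 0 : -1;
}
```

Calling the routine once with 0x00 and once with 0xFF runs the two tests in succession. The pointer is `volatile` so that the compiler does not discard the read-backs.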

With increasingly large amounts of memory, the time taken to perform these tests escalates rapidly and could result in an unacceptable delay in the start-up time for a device. In the simple form shown above, every bit is checked against every other bit, so the work grows roughly with the square of the memory size: doubling the memory quadruples the test time. Knowledge of memory architecture can enable optimization. For example, cross-talk is more likely within a given memory array. So, if there are multiple arrays, the test can be performed individually on each one. Afterwards, a quick check can be performed to verify that there is no cross-talk between arrays, thus:

fill all of memory with 0s
for each memory array
    fill array with 1s
    verify that other arrays still contain just 0s
    fill array with 0s

This can then be repeated with all of memory starting full of ones.
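The inter-array check could be sketched along the following lines. The `struct mem_array` descriptor and the function names are hypothetical; how arrays are actually delimited depends on the memory architecture of the device. As in the logic above, only the other arrays are verified, on the assumption that each array has already passed its own test.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one memory array: start address and size. */
struct mem_array {
    volatile uint8_t *base;
    size_t size;
};

static void fill_array(const struct mem_array *a, uint8_t value)
{
    for (size_t i = 0; i < a->size; i++) {
        a->base[i] = value;
    }
}

static int array_holds(const struct mem_array *a, uint8_t value)
{
    for (size_t i = 0; i < a->size; i++) {
        if (a->base[i] != value) {
            return 0;
        }
    }
    return 1;
}

/* background == 0x00 starts with memory full of zeros, 0xFF full of ones.
 * Returns 0 if no cross-talk between arrays was seen, -1 otherwise. */
int inter_array_test(const struct mem_array *arrays, size_t count,
                     uint8_t background)
{
    uint8_t inverse = (uint8_t)~background;

    /* Fill all of memory with the background value. */
    for (size_t k = 0; k < count; k++) {
        fill_array(&arrays[k], background);
    }

    for (size_t k = 0; k < count; k++) {
        fill_array(&arrays[k], inverse);

        /* Verify that the other arrays still contain just the background. */
        for (size_t j = 0; j < count; j++) {
            if (j != k && !array_holds(&arrays[j], background)) {
                return -1;
            }
        }

        fill_array(&arrays[k], background);
    }
    return 0;
}
```

Running it with a background of 0x00 and then 0xFF corresponds to the two passes described above.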

Background memory testing
Once “real” code is running, comprehensive memory testing is no longer possible. However, testing of individual bytes/words of memory is possible, so long as tiny interruptions in software execution can be tolerated. Most embedded systems have some idle time or run a background task, when there is no real work to be done. This may be an opportunity to run a memory test.

A simple approach is to write, read and verify a series of bit patterns: all ones, all zeros and alternate one/zero patterns. Here is the logic:

for each byte of memory
    turn off interrupts
    save memory byte contents
    for values 0x00, 0xff, 0xaa, 0x55
        write value to byte under test
        verify value of byte
    restore byte data
    turn on interrupts

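Implementing this code requires a little care, as an optimizing compiler is likely to conclude that some or all of the memory accesses are redundant and optimize them away. The compiler is optimistic about memory integrity. In C, accessing the memory under test through a volatile-qualified pointer forces each write and read to be performed.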
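A minimal C sketch of the background test is shown here. `DISABLE_INTERRUPTS` and `ENABLE_INTERRUPTS` are placeholders, defined as no-ops so the code builds on a host; on a real target they would map to whatever interrupt control the CPU or RTOS provides. The function names are hypothetical, and testing one byte per call is an assumption made to keep each interruption as short as possible.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholders: replace with the target's interrupt control. */
#ifndef DISABLE_INTERRUPTS
#define DISABLE_INTERRUPTS() ((void)0)
#define ENABLE_INTERRUPTS()  ((void)0)
#endif

/* Tests a single byte with the four patterns and restores its contents.
 * Returns 0 if every pattern read back correctly, -1 otherwise. */
int test_byte(volatile uint8_t *addr)
{
    static const uint8_t patterns[] = { 0x00, 0xFF, 0xAA, 0x55 };
    int ok = 1;

    DISABLE_INTERRUPTS();
    uint8_t saved = *addr;              /* save memory byte contents */

    for (size_t i = 0; i < sizeof patterns; i++) {
        *addr = patterns[i];            /* volatile: the write is kept */
        if (*addr != patterns[i]) {     /* volatile: the read is kept  */
            ok = 0;
            break;
        }
    }

    *addr = saved;                      /* restore byte data */
    ENABLE_INTERRUPTS();

    return ok ? 0 : -1;
}

/* Tests the byte at *cursor, then advances the cursor, wrapping at the end
 * of the region. Intended to be called repeatedly from an idle or
 * background task. nbytes must be greater than zero. */
int background_test_step(volatile uint8_t *base, size_t nbytes,
                         size_t *cursor)
{
    int result = test_byte(&base[*cursor]);

    *cursor = (*cursor + 1) % nbytes;
    return result;
}
```

Interrupts stay disabled only for the duration of one byte's test, so the saved value cannot be overwritten by other code between the save and the restore.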

In part two of this two-part series, we'll look at self-testing approaches for mitigating software failures.

Colin Walls has over thirty years' experience in the electronics industry, largely dedicated to embedded software. A frequent presenter at conferences and seminars and author of numerous technical articles and two books on embedded software, Colin is an embedded software technologist with Mentor Embedded [the Mentor Graphics Embedded Software Division], and is based in the UK.
