Debugging hard faults in ARM Cortex-M0 based SoCs

Programmable system-on-chip (PSoC) architectures such as the Cypress PSoC family of MCUs integrate a wide range of capabilities, including MCU cores like the Cortex-M0, programmable analog blocks (PAB), programmable digital blocks (PDB), programmable interconnect and routing, a wide range of interfaces and peripherals, and advanced capabilities such as capacitive touch sensing. These architectures have many advantages over traditional microcontrollers and can substantially reduce design time and system bill of materials (BOM) cost.

As the complexity of programmable system-on-chip architectures and their MCU increases, so do the issues that can occur at each stage of design. One common issue developers face in Cortex-M0-based embedded systems is the hard fault. In some cases, we might get lucky and be able to quickly locate the source of the hard fault. However, most of the time chasing down a hard fault can be very time consuming. In this article, we will discuss some common errors programmers make and how to debug the hard fault caused by these errors.

Hard faults
A hard fault is an exception that occurs because of an error during normal or exception processing. As per the Cortex-M0 Devices Generic User Guide (revision r0p0), the following sources can cause a hard fault:

  • execution of an SVC instruction at a priority equal or higher than SVCall

  • execution of a BKPT instruction without a debugger attached

  • system-generated bus error on a load or store

  • system-generated bus error on a vector fetch

  • execution of an instruction from a location for which the system generates a bus fault

  • execution of an instruction when not in Thumb-State as a result of the T-bit being previously cleared to 0

  • execution of an undefined instruction

  • attempted load or store to an unaligned address

  • execution of an instruction from an XN memory address  

The most common user-created causes for hard fault are:

  • execution of an undefined instruction

  • attempted load or store to an unaligned address

  • execution of an instruction from an XN memory address

Detecting the hard fault
When your system hangs, the first step is to determine the cause. Run your program under the debugger, let it run until the system hangs again, and then halt execution. There are two ways to determine whether the hang is due to a hard fault. The first is to check the Program Counter (PC) register: if a hard fault has occurred, the PC will point into the hard fault handler. The second is to check the Interrupt Program Status Register (IPSR): an exception number of 0x3 indicates a hard fault. Here is the IPSR bit assignment in detail (as per the Cortex-M0 Devices Generic User Guide):


Figure 1: IPSR Register Definition (Source: ARM)

The IPSR is part of the ARM Cortex-M0’s Program Status Register (PSR). The PSR combines three 32-bit registers (APSR, IPSR, and EPSR) as shown in Figure 2. The exception number field of the IPSR is what indicates that a hard fault has occurred. Some IDEs, such as PSoC Creator, refer to the PSR as xPSR.


Figure 2: PSR Register Definition (Source: ARM)  
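
The IPSR check described above can be sketched in a few lines of C. This is a minimal illustration, assuming the raw xPSR value has already been read from the debugger's register window. The function names are hypothetical and not part of any Cypress or ARM library. On the Cortex-M0 the exception number occupies bits [5:0] of the combined register.

```c
#include <stdbool.h>
#include <stdint.h>

/* On the Cortex-M0, the IPSR exception number is bits [5:0] of xPSR. */
#define XPSR_EXCEPTION_MASK   0x3Fu
#define EXCEPTION_HARD_FAULT  3u

/* Extract the active exception number from a raw xPSR value. */
uint32_t xpsrExceptionNumber(uint32_t xpsr)
{
    return xpsr & XPSR_EXCEPTION_MASK;
}

/* True when the xPSR value shows the core is handling a hard fault. */
bool xpsrIsHardFault(uint32_t xpsr)
{
    return xpsrExceptionNumber(xpsr) == EXCEPTION_HARD_FAULT;
}
```

For example, an xPSR reading of 0x61000003 has 3 in its low six bits, so the core is in the hard fault handler. A reading whose low six bits are zero means the core is in Thread mode.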

Execution of undefined instruction
As the name suggests, this type of hard fault is caused when the core attempts to execute an undefined instruction. This can happen when the instruction at the address held in the PC is corrupted and decodes as an undefined instruction, or when the PC itself is corrupted, for example because of stack corruption.

The stack can be corrupted when a pointer to a local variable is passed to a function and the function writes beyond the size of the object the pointer refers to. Here is one such illustration:

#include

/* Application level function */
uint8 i2cRead(void);

/* HAL level function */
void halI2cReadData(uint8 *buffer);

int main()
{
    volatile uint8 data = 0;

    for(;;)
    {
        data = i2cRead();
        CyDelay(1);
    }
}

#define I2C_BUFFER_SZ 64

uint8 i2cRead(void)
{
    /* Only one byte is reserved on the stack here */
    uint8 i2cBuffer;

    halI2cReadData(&i2cBuffer);
    return(i2cBuffer);
}

void halI2cReadData(uint8 *buffer)
{
    uint8 bufIndex = 0;

    /* Read I2C data from hardware */
    /* Fill the buffer: writes I2C_BUFFER_SZ bytes regardless of the caller's buffer size */
    for(bufIndex = 0; bufIndex < I2C_BUFFER_SZ; bufIndex++)
    {
        buffer[bufIndex] = bufIndex;
    }
}

In the above example, i2cBuffer is a single byte on the stack of i2cRead(), but halI2cReadData() writes 64 bytes through the pointer to it. This operation corrupts the stack. The stack contents at any given time are, from top to bottom, the local variables of the current function, the current stack frame context, and the calling stack frame context. The extra writes overwrite the return address saved in the calling stack frame, and the hard fault occurs when the function returns and the core loads that corrupted value into the PC.

When the hard fault occurs in such a scenario, the first step is to detect the cause of the hang up as explained in the section above. Once you ensure that this is a hard fault, the next step is to look at the call stack to get an idea of which function caused the violation, as illustrated in Figure 3.


Figure 3: Registers and call stack window displayed in PSoC Creator IDE on a hard fault (Source: Cypress Semiconductor)

The call stack shows functions in the sequence in which they were called. In this example, the sequence is main() -> i2cRead() -> IntDefaultHandler(). IntDefaultHandler() is the hard fault handler. From this we can infer that an operation in the function i2cRead() caused the hard fault. Once you know the function that is causing the fault, review that section of code thoroughly to identify the issue. You can also rerun the program in debugging mode and step through that function line by line, examining the variables for any violations using the watch window.
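
When the call stack window is not informative, the registers that the core pushed on exception entry carry the same clues. The sketch below assumes a custom hard fault handler (not shown, and not part of the original example) that passes the active stack pointer to a C function. The names faultPc, faultLr, and recordFaultFrame are hypothetical. The frame order r0, r1, r2, r3, r12, LR, PC, xPSR is the standard Cortex-M exception entry layout.

```c
#include <stdint.h>

/* Word offsets of the registers the core stacks on exception entry. */
enum
{
    FRAME_R0 = 0,
    FRAME_R1,
    FRAME_R2,
    FRAME_R3,
    FRAME_R12,
    FRAME_LR,    /* link register of the interrupted code */
    FRAME_PC,    /* address at which execution would resume */
    FRAME_XPSR
};

/* Hypothetical globals that can be inspected in a watch window. */
volatile uint32_t faultPc;
volatile uint32_t faultLr;

/* Copy the most useful stacked registers out of the exception frame. */
void recordFaultFrame(const uint32_t *stackedRegs)
{
    faultPc = stackedRegs[FRAME_PC];
    faultLr = stackedRegs[FRAME_LR];
}
```

A stacked PC that points outside the code region is a strong hint that a return address was overwritten, as in the example above. The stacked LR then indicates which function was executing before the bad return.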

Another scenario where the core might execute an undefined instruction is when you try to execute a function with an undefined function pointer. This may happen when you try to access a function pointer from an array of function pointers beyond the defined array size.
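
One way to guard a function pointer table is to validate the index before the call. The following is a minimal sketch with hypothetical handler names (cmdStart, cmdStop) and a hypothetical dispatchCommand() wrapper. It is not taken from the original firmware.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*cmdHandler_t)(void);

/* Records which handler ran, so the effect is observable. */
uint8_t lastCommand;

void cmdStart(void) { lastCommand = 1u; }
void cmdStop(void)  { lastCommand = 2u; }

static const cmdHandler_t cmdTable[] = { cmdStart, cmdStop };

#define CMD_TABLE_LEN (sizeof(cmdTable) / sizeof(cmdTable[0]))

/* Returns 0 on success, -1 if the index is out of range or the slot is empty. */
int dispatchCommand(size_t index)
{
    if (index >= CMD_TABLE_LEN || cmdTable[index] == NULL)
    {
        return -1;
    }

    cmdTable[index]();
    return 0;
}
```

Deriving the table length with sizeof keeps the check correct when handlers are added or removed.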

Some best practices to avoid these kinds of faults are:

  • Always pass the size along with the pointer address into a function

  • In the function, check that the pointer and the size are valid. Do not access elements that are out of bounds.
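
Applied to the I2C example, these practices might look like the sketch below. It is an illustration only: the names halI2cReadDataSafe and i2cReadSafe are hypothetical, and the standard uint8_t type is used in place of the PSoC Creator uint8 typedef so that the code compiles on any C99 toolchain.

```c
#include <stddef.h>
#include <stdint.h>

#define I2C_BUFFER_SZ 64u

/* Fills at most bufferSize bytes; returns the number of bytes written
 * (0 if the arguments are invalid). */
size_t halI2cReadDataSafe(uint8_t *buffer, size_t bufferSize)
{
    size_t count;
    size_t bufIndex;

    if (buffer == NULL || bufferSize == 0u)
    {
        return 0u;
    }

    /* Never write more than the caller's buffer can hold. */
    count = (bufferSize < I2C_BUFFER_SZ) ? bufferSize : I2C_BUFFER_SZ;

    for (bufIndex = 0u; bufIndex < count; bufIndex++)
    {
        buffer[bufIndex] = (uint8_t)bufIndex;
    }

    return count;
}

uint8_t i2cReadSafe(void)
{
    /* The buffer is now large enough for a full transfer. */
    uint8_t i2cBuffer[I2C_BUFFER_SZ];

    (void)halI2cReadDataSafe(i2cBuffer, sizeof(i2cBuffer));
    return i2cBuffer[0];
}
```

Returning the number of bytes written lets the caller detect a short transfer instead of silently using stale data.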

Attempted load or store to an unaligned address
This type of hard fault occurs when the core tries to read or write to an unaligned address. A common scenario is when you have a packed structure for communication with an external peripheral. Some of the structure elements may be on unaligned addresses. If the address of such an element is passed as a pointer to another function, dereferencing that pointer results in an unaligned access. An example is shown below:

#include

typedef struct __attribute__((packed)){
    uint32 header;
    uint8 packetLength;
    /* Packing places this array at offset 5, an odd address */
    uint16 buffer[64];
}i2cPacket_t;

void parseData(uint16 *buf);

int main()
{
    i2cPacket_t i2cPacket;

    for(;;)
    {
        parseData(&i2cPacket.buffer[0]);
    }
}

void parseData(uint16 *buf)
{
    uint16 intBuf[64];
    uint8 index = 0;

    for(index = 0; index < 64; index++)
    {
        /* 16-bit load through a misaligned pointer */
        intBuf[index] = buf[index];
    }
}

In the above example, the structure element buffer is placed at an odd address (offset 5 within the structure) because the structure is packed. When this buffer is accessed through the uint16 pointer in the function parseData, the 16-bit load results in a hard fault. To avoid this type of error, ensure that 32-bit and 16-bit variables in a packed structure are always aligned. Use padding bytes if needed.

To debug this type of hard fault, halt execution and view the registers. If the xPSR register shows exception number 3, then it is a hard fault. View the call stack window to trace back and identify which function caused the violation. Review the code thoroughly and make the necessary fixes in the firmware.
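
The alignment rule can be checked at compile time. The sketch below assumes a GCC-compatible compiler (for `__attribute__`) and C11 (for `_Static_assert`). The type names, the pad member, and readPackedHalfword are hypothetical. A padding byte changes the packet layout, so it is only an option if the peripheral's protocol allows it. Where it does not, a byte-wise copy is the alignment-safe alternative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Original layout: buffer lands at offset 5 (misaligned). */
typedef struct __attribute__((packed))
{
    uint32_t header;
    uint8_t  packetLength;
    uint16_t buffer[64];
} i2cPacketPacked_t;

/* Padded layout: one extra byte moves buffer to offset 6. The aligned
 * attribute also keeps the structure itself on a word boundary. */
typedef struct __attribute__((packed, aligned(4)))
{
    uint32_t header;
    uint8_t  packetLength;
    uint8_t  pad;
    uint16_t buffer[64];
} i2cPacketPadded_t;

/* Fail the build if the 16-bit array ever becomes misaligned again. */
_Static_assert(offsetof(i2cPacketPadded_t, buffer) % sizeof(uint16_t) == 0,
               "buffer must be halfword aligned");

/* Alignment-safe read from the unpadded layout: copy byte by byte. */
uint16_t readPackedHalfword(const i2cPacketPacked_t *packet, size_t index)
{
    uint16_t value;
    const uint8_t *src = (const uint8_t *)packet
                       + offsetof(i2cPacketPacked_t, buffer)
                       + index * sizeof(value);

    memcpy(&value, src, sizeof(value));
    return value;
}
```

The static assertion turns a run-time hard fault into a build error, which is much cheaper to track down.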


Figure 4: Registers and call stack window displayed in PSoC Creator IDE on a hard fault (Source: Cypress Semiconductor)

Execution of an instruction from an XN memory address
XN stands for Execute Never, a memory attribute that prevents the processor from executing instructions from a region of memory. A hard fault exception is generated upon an attempt to execute an instruction fetched from an XN region. To avoid the hard fault, you should ensure that the core is not fetching instructions from the regions marked as XN in the following table:


Figure 5: Memory Regions (Source: Cypress Semiconductor)

The linker script has the definitions of all the memory regions. Ensure that the code region does not overlap with any of the regions marked with XN in the table above.
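
As a rough aid when inspecting a suspicious PC value or function pointer, the XN check can be expressed in code. The sketch below uses the default memory map from the ARM Cortex-M0 documentation, in which the Peripheral region (0x40000000 to 0x5FFFFFFF) and everything from 0xA0000000 upward (External device, Private Peripheral Bus, and Device regions) are XN. This is an assumption rather than a copy of the table above. The function name is hypothetical, and the device datasheet takes precedence for a specific part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the address lies in a region that the default
 * Cortex-M0 memory map marks as Execute Never (XN). */
bool addressIsExecuteNever(uint32_t address)
{
    /* Peripheral region */
    if (address >= 0x40000000u && address <= 0x5FFFFFFFu)
    {
        return true;
    }

    /* External device, Private Peripheral Bus, and Device regions */
    if (address >= 0xA0000000u)
    {
        return true;
    }

    /* Code, SRAM, and external RAM regions are executable. */
    return false;
}
```

If a stacked PC or a function pointer value fails this check, the fault was caused by a jump into a non-executable region rather than by the instruction stream itself.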

Shashikant Joshi has managed the applications group at Cypress Semiconductor since 2012. He has more than 13 years of experience in the semiconductor industry, specializing in pre- and post-silicon validation and applications engineering. He has worked with MIPS and multiple ARM cores, from M-Series to A-Series, in large multi-core system-on-chip designs. He loves debugging system issues. Apart from these, he works on hobby electronics projects in his free time.

Shruti Hanumanthaiah is a Staff Applications Engineer at Cypress Semiconductor, where she has worked on capacitive touch sensing applications since 2009. Her interest lies in designing embedded system applications. She loves debugging technical issues, including EMI/EMC issues, and working on analog and digital circuits. Apart from these, she is a big travel enthusiast.
