How to use ARM’s data-abort exception

Processors giveth and processors taketh away. They can fetch and store data or they can refuse to do either. When your processor aborts a data access, what can you do? This in-depth article explains the hows and whys of data aborts on the ARM7 family of processors, including working code for a useful data-abort exception handler.

The late Joseph Campbell, well-known scholar of comparative religion and mythology, once expressed his sentiments about computers from his perspective: “Computers are like Old Testament gods; lots of rules and no mercy.” Add to his observations the familiar wisdom “where there are rules, there are also exceptions,” and your Old Testament machine becomes more forgiving.

The data-abort exception (with the help of an exception handler) may be God's gift to ARM programmers. A data-abort exception is a response by a memory system to an invalid data access. The data-abort exception handler is a program that can inform the programmer where in his or her code this exception has occurred (after the application has crashed). The exception handler ought to handle the consequences of the aborted instruction gracefully, rather than forcing the processor to hang in an infinite loop. If you understand the fundamental rules of the ARM architecture and data-abort exception handling, you'll spend less time begging for mercy.

Not all ARM-based systems, however, come with a data-abort exception handler. Exception handlers are usually not provided by the compiler, RTOS, or silicon vendors, since doing so would require quite a high level of integration. Ideally, a perfectly designed system doesn't need an exception handler. However, in striving for perfection, engineers can come across moments when the processor showers them with a slew of undesired abort exceptions. These exceptions may originate in software, such as improper C structures, or appear in code ported from a different processor architecture. Alternatively, exceptions may be signs of improper memory system design or manifestations of environmental effects on a marginal hardware design or on a specific component.

This article is an introduction to programming data-abort exception handlers on the ARM architecture. I'll demonstrate many of the concepts related to exceptions using the LPC 2148 from Philips, which necessitates a side trip through the underworld of the LPC 2000 family's undocumented features. I'll also explore how different implementations of the ARM core in general-purpose MCUs may tell different stories about the causes of a data-abort exception. Finally, I'll dispel some myths associated with exceptions and data alignment and show you how to create a data-abort exception handler for the LPC 2148 (and other ARM processors). The material presented here may help you develop your own exception handler for a similar ARM7TDMI processor and reduce your debugging time.

Off the beaten path
I was motivated to explore the topic of ARM exceptions after a prolonged debugging session when the processor would end up frequently in the default (dummy) data-abort exception handler, implemented by an infinite loop.1 I wanted to know more about the state of affairs prior to this dead-end, glory-free execution of my application, and I suspected there was a way to construct a more efficient exception handler. Despite my large library of ARM literature and 6Mbps connection to the “information super-highway,” I was dismayed to find very little coverage or documentation in the ARM literature related to this broad but important topic of exception handling.

So I started to write my own exception handler. While writing and testing it, I stumbled on some undocumented features of the LPC 2148, such as address aliasing of the USB RAM and the location of corruptible Special Function Registers (SFR).

During my research many side issues arose from my empirical verifications of the researched material. When a result didn't support the theory, I wanted to know why. I hope that some of my findings help you save debugging time, since the data-abort exceptions and memory Reserved Areas are related yet not always accurately documented. You can also use this information as a springboard for developing other exception handlers for prefetch abort and undefined instruction exceptions. The two remaining exceptions—the interrupt request (IRQ) and fast interrupt request (FIQ)—already have very good coverage in the literature.

Ultimately, as you'll see, I accomplished my goal of implementing a data-abort exception handler that provides insight into the fault using a simple RS-232 connection to my PC instead of a high-level debugging tool (such as a debugger connected to the JTAG port).

Exception handlers
Of the seven exception types that an ARM7 processor can raise (which share six priority levels), two are aborts, signaling that the current memory access cannot be completed successfully. The first, the data-abort exception, has the second-highest priority, just after reset, as shown in Figure 1. This exception conveys that a data access transaction was unsuccessful. The second is the prefetch-abort exception, which has the second-lowest priority, just one notch above the software interrupts. This abort is invoked when the processor is unable to fetch an instruction from memory. In this article, I concentrate on explaining why data-abort exceptions occur; the prefetch-abort exceptions are beyond our scope.


Many 32-bit embedded systems use a real-time operating system (RTOS), but not every RTOS comes with a data-abort exception handler, thus putting the responsibility for those aborts in the hands of the programmer. After all, the experts tell us “efficient handlers can dramatically improve system performance.”2

Figure 2 shows a sneak-preview of the data-abort exception handler's output.


The exception handler is a simple UART driver that performs a register dump with the disassembled instruction that caused the data-abort exception. For example:

Processor aborted due to execution of instruction stmeqia located at address 0x0000 0230. Reason: a memory write was initiated at the top of SRAM (register r8=r7, before write) extended into Reserved Area range (register r8, after write).

ARMed with a brief history
During the past decades, the ARM architecture has undergone numerous revisions to the instruction set and hardware design. One of the many significant hardware changes was a move from the von Neumann architecture of ARM7 to the Harvard architecture starting with ARM9. Similarly, the Thumb-2 instruction set is the latest advancement and improvement of the first Thumb instruction set.

ARM processors are used in custom/proprietary system-on-chip (SoC) designs (such as the iPod, cell phones, and hard disks) and in general-purpose off-the-shelf MCUs. In either case, the ARM processor is at the center of a larger system. One of the synonyms for “system” is “complex,” and dealing with data and prefetch aborts can indeed be complex. Yet both are invoked based on the state of only one simple input to the ARM core: the ABORT signal asserted by the memory subsystem, shown in Figure 3. It forces the ARM core to respond to an event that the memory subsystem has evaluated as a fault. In other words, what constitutes a failed memory transaction is decided not by the ARM core but by the memory subsystem: its design and level of intelligence. The more closely the memory controller scrutinizes the ARM core's requested address, the easier it will be for the programmer to pinpoint the fault in the software (or hardware) that causes the abort exception. The ARM core's response to an unsuccessful memory transaction is uniform across all versions. Memory subsystem designs, however, have only the ABORT output in common.

I've used the LPC 2148 as the primary test subject for this article. Because the LPC 2148 silicon lacks any advanced memory controller features, its handling of data-abort exceptions is somewhat limited and challenging, as we'll see soon. Briefly, the key features of the LPC 2148 are:

  • A simple three-stage instruction pipeline
  • No cache, MMU or MPU
  • 512KB internal flash
  • 32KB + 8KB of internal SRAM, USB interface
  • A convenient UART-based boot-loader

Let's begin by taking an inventory of the instructions that are capable of raising a data-abort exception. These instructions are listed in Table 1.


For quick reference, I've included page numbers from David Seal's ARM Architectural Reference Manual (also known as the ARM ARM).3 For completeness, I've listed the two coprocessor-related instructions, LDC and STC of Addressing Mode 5, even though the MCU used in development of this data-abort exception handler doesn't have a coprocessor.

How exceptions are raised
As I said earlier, a data-abort exception is a response from a memory system to an invalid memory access. For our discussion, it's useful to itemize the generic memory access as a “read” (load) and a “write” (store). Under certain circumstances, these instructions behave differently, as I'll explain later. Let's begin with a few simple definitions.

To write to memory means any possible form of store-type (ST) instruction. It may be a single-register store for a byte, a half-word (16 bits), or a word (32 bits); it may also be a multiple-register store or a special instruction “swap” that indivisibly (atomically) reads and writes. This definition applies to all addressing modes and to both states, ARM and Thumb. For a Thumb state, one more instruction, PUSH, expands this list.

Similarly, to read from memory implies that any possible form of load-type (LD) instruction is being executed, as applicable to ARM and Thumb states. For a Thumb state, one more instruction, POP, expands this list.

When can we expect a data-abort exception to be raised on the LPC 2148? These are the three general classes:

  • When executing an unaligned memory access using instructions summarized in Table 1
  • While performing a write to ROM (flash) space
  • When accessing any of the Reserved Areas defined in the LPC 2148 User Manual 4

Unaligned memory access
Actually, this was a trick question. In general, the ARM architecture does not support unaligned accesses. However, due to its design, the LPC 2148 won't inform you that this event took place; rather it will produce an output which I'll analyze in the following section.

An unaligned RAM data access is a read or write that involves an address that's not a multiple of the data size. For a word-aligned memory access, the last two bits of an address have to be zero. Thus, the address's last nibble will be a multiple of four: 0x0, 0x4, 0x8, or 0xC. For a half-word-aligned memory access, the last bit of an address must be zero. The address's last nibble will be a multiple of two: 0x0, 0x2, 0x4, 0x6, and so on up to 0xE. A byte access is always aligned.
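The alignment rule above comes down to testing the low bits of the address. Here is a minimal host-side sketch in Python (not code from the handler itself); the function name `is_aligned` is a hypothetical helper chosen for illustration.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if address is a multiple of the access size (1, 2 or 4 bytes)."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1 (byte), 2 (half-word) or 4 (word)")
    # The sizes are powers of two, so alignment means the low bits are all zero.
    return (address & (size - 1)) == 0


# Walk the four byte addresses inside one word of SRAM.
for addr in (0x40000000, 0x40000001, 0x40000002, 0x40000003):
    print(f"0x{addr:08X}: word={is_aligned(addr, 4)} "
          f"half-word={is_aligned(addr, 2)} byte={is_aligned(addr, 1)}")
```

Only the first address passes the word test, the first and third pass the half-word test, and every address passes the byte test.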

What are the outcomes of an unaligned RAM memory access and when do they occur?

Whether the processor runs in the ARM state as it fetches 32-bit instructions or in the Thumb state fetching 16-bit instructions, the memory controller may allow the processor to access the memory using:

  • A word-aligned address (ldr/str) or non-word-aligned with byte rotation 5 (ldr)
  • A half-word aligned address (ldrh/strh) or non-half-word aligned address yielding unpredictable6 results
  • A byte address (ldrb/strb)

Alternatively, the memory controller may abort all unaligned accesses. The effects of word and half-word unaligned access will become clearer from an actual example, which I'll take you through step by step.

A word on storing words: LPC 2148 is a little-endian7 MCU, which means that the least significant byte (LSB) of a word is stored at the lowest memory address, whereas the most significant byte (MSB) is stored at the highest memory address of a word.

Case 1, word access
Read a word from 0x4000 000n to register r1, where n is the value of the last two bits of the accessed memory address. Table 2 shows you the results using instruction ldr r1, [r0].


You can see in Table 2 that only the first case, n = 0, is a word-aligned access. In the remaining three cases, n = 1, 2, 3, the silicon vendor's implementation of the memory controller decides whether the register is loaded with a byte-rotated value (LPC2000) or whether the memory controller detects the nonaligned address and aborts the read (AT91SAM7S). The byte rotation of the instruction ldr as defined in ARM ARM is depicted in Figure 4 and Table 2.8

For memory writes, such as str r1, [r0], the str instruction will produce a word-aligned address by clearing the last two bits of the address (an AND with 0xFFFFFFFC). This behavior, as I've described, applies to all nine addressing modes for loading and storing a word in Addressing Mode 2.

For instruction ldm (multiple load), the last two bits of the address are ignored, so no rotation of bytes occurs, unlike for the instruction ldr. For instruction stm (multiple store), the last two bits are ignored for unaligned memory access. These are the only two instruction types of the Addressing Mode 4.

No data-abort exceptions are generated by LPC 2148 in any of these cases.
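To make the byte rotation concrete, here is a small Python model of an unaligned ldr on a little-endian ARM7: the core reads the aligned word and rotates it right by eight bits for each byte of misalignment. The function names and the test pattern 0x11223344 are assumptions for this sketch, so the values need not match those in Table 2.

```python
import struct

MASK32 = 0xFFFFFFFF


def ldr_word(memory: bytes, base: int, address: int) -> int:
    """Model ldr: fetch the aligned word, then rotate right by 8 * (address & 3)."""
    aligned = address & 0xFFFFFFFC            # the core ignores bits [1:0] for the fetch
    (word,) = struct.unpack_from("<I", memory, aligned - base)  # little-endian word
    rot = 8 * (address & 3)
    return ((word >> rot) | (word << (32 - rot))) & MASK32


def str_address(address: int) -> int:
    """Model str: the word is written to the address with bits [1:0] cleared."""
    return address & 0xFFFFFFFC


SRAM_BASE = 0x40000000
sram = struct.pack("<I", 0x11223344)          # one word of pretend SRAM
for n in range(4):
    value = ldr_word(sram, SRAM_BASE, SRAM_BASE + n)
    print(f"n={n}: r1=0x{value:08X}")
```

This models only the LPC2000-style behavior; a memory controller that aborts unaligned reads, as described for the AT91SAM7S, would never return a rotated value.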

Case 2, half-word access
Read a half-word from 0x4000 000n to register r1, where n is the value of the last two bits of the accessed memory address. Table 3 shows you the results using instruction ldrh r1, [r0]. This byte rotation of the instruction ldrh, defined by ARM ARM as “unpredictable,” is depicted in Table 3.9 For memory writes using strh r1, [r0], the strh instruction will produce results identical to those described for ldrh. This behavior, as described in Table 3, applies to all six addressing modes for the loading and storing of a half-word in Addressing Mode 3.


To briefly summarize, a word-unaligned access to defined memory space on the LPC 2000 results in a byte-rotated read, as shown and referenced in Table 2. A half-word unaligned access produces an unpredictable result, shown in Table 3. The memory subsystem does not intervene to abort an unaligned data access by raising an exception.

For ARM implementations with more advanced memory systems than the one used on the LPC 2000, unaligned access may yield a data-abort exception.

A write to ROM (flash) space
This scenario is perhaps too obvious to discuss, yet just such a situation arose when I was debugging my data-abort code. I faced a stray pointer that insisted on writing to ROM. The system defended itself by generating data-abort exceptions. I'm not talking about the programming of the flash, but a misdirected write operation at run time. Of the three classes discussed in this article, this is the only one that produces a data-abort exception reliably, with 100% coverage of the flash memory space.

Memory access to reserved areas
ARM is a 32-bit architecture designed with thirty-seven 32-bit registers; its 32-bit program counter (pc) is capable of addressing a memory space of 2^32 = 4,294,967,296 bytes (4GB), with 2^32-1 = 4,294,967,295 as the highest byte address. In the LPC 2000 family this linear memory space is divided into four distinct memory regions:

  • ROM (flash) for the nonvolatile code storage
  • SRAM for volatile data
  • VLSI Peripheral Bus (VPB) and Advanced High-Performance Bus (AHB) peripherals
  • Four Reserved Areas

Examples of VPB peripherals are UART0/1, I2C, SPI, and timers. Examples of AHB peripherals are the Vectored Interrupt Controller (VIC) and the 8KB USB SRAM.

The sum of all four Reserved Areas is 3,757,518,848 bytes (0xDFF7 3000), representing 87% of the 4GB total memory space. The probability of a stray pointer hitting this area is reasonably high. But if we place our data-abort exception handler in its way, we'll debug our program quicker, right? Actually, it's not so simple.
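The share quoted above is easy to double-check. This short Python sketch uses only the Reserved Area total given in the text; the variable names are illustrative.

```python
TOTAL_SPACE = 1 << 32        # bytes addressable with a 32-bit address
RESERVED = 0xDFF73000        # sum of the four Reserved Areas, as quoted in the text

share = RESERVED / TOTAL_SPACE
implemented = TOTAL_SPACE - RESERVED   # everything that is not a Reserved Area

print(f"reserved:    {RESERVED:,} bytes ({share:.1%} of the address space)")
print(f"implemented: {implemented:,} bytes (0x{implemented:08X})")
```

The remainder, a little over 512MB, is what is left for flash, SRAM, and the peripheral regions.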

When I finished my data-abort exception handler, I needed to test it. “There is a huge test area,” I thought, “It will go fast.” Two days later I realized my code was solid, but I ran into some undocumented features of the chip. Have you noticed how a five-minute software upgrade easily turns into many hours (or days) of tweaking?

All of a sudden it's déjà vécu all over again.10 My original test approach to the data-abort exception handler behavior was to intercept the exception, count it, and return to the execution of the next instruction (more on this later).

I set my pointers to read and write the Reserved Areas, expecting to get an exception for every word that accessed the “forbidden” memory region. But it did not happen this way; see the test data in Table 4 and the memory map shown in Figure 5.


For Reserved Areas #1 and #3, there were no surprises; I received an exception for every read and write to those Reserved Areas. Reserved Areas #2 and #4, however, produced what I first called “false positive” exceptions (strictly speaking, missing exceptions): they didn't occur when they should have.

When I quantified the results, I stepped back from the problem and modified the memory map, as shown in Figure 5. I needed to investigate this further.

Hic sunt leones
Now I was entering uncharted territory.11 A few days later, after I finished my memory map and code testing, I came across a document published by Philips.12 A figure in the document marked the memory area from 0x3FFF 8000 to 0x3FFF FFFF as “Special Registers,” which I've labeled on my memory map (Figure 5) as HSL #4a. This area accounted for the unbalanced count of data-abort exceptions generated by reads. But how about writes to this area? I ran into problems when a constant 0x11223344 was written to any address below 0x4000 0000.

Specifically, the debugger's memory window (with a view of the processor's stack RAM) was cleared, and the data-abort exception handler itself experienced a data-abort exception.

It wasn't my goal to find out why my data-abort exception handler couldn't handle this small segment of Reserved Area, but I can attest that debugging this type of problem can be a real challenge.

More lions
The LPC 2148 has a dedicated USB controller with a DMA transfer and a dedicated 8KB RAM. This RAM can also be used as a generic storage area. However, this memory is inaccessible unless the USB controller is also enabled by setting PCONP [31] = 1. The reason for this behavior is that this RAM is a part of the USB controller on the AMBA bus, unlike the standard system SRAM, which resides on the ARM7 local bus.

After setting this bit, the SRAM is enabled and filled with random values. During my attempt to initialize this region, I noticed that when the value is being written to the address 0x7FD0 0000, it is also being “echoed” at address 0x7FD0 2000, 0x7FD0 4000, 0x7FD0 6000, and so forth. Apparently, some address aliasing is taking place. If we pull out the calculator and perform the following math, the result makes sense:

0x7FE0 0000 (top of HSL #2a) - 0x7FD0 0000 (bottom of USB SRAM) = 0x0010 0000.

But the size of the USB RAM is 8KB (0x2000), so we divide 0x100000 by this value. Voila! There are 0x80 (128) segments of 8KB each in the bottom of the Reserved Area #2; only one of them is filled with actual RAM. This explains the peculiar behavior of my data-abort scanning procedure for reporting “false positives” during the test when the SRAM was disabled.
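The aliasing can be modeled as a simple modulo fold of the window onto the one physical 8KB block. The Python sketch below assumes that the fold is uniform across the whole window from 0x7FD0 0000 up to 0x7FE0 0000, which is what the echoed writes suggest but not something the documentation confirms; the names are hypothetical.

```python
USB_RAM_BASE = 0x7FD00000    # bottom of the USB RAM
USB_RAM_SIZE = 0x2000        # 8KB of physical RAM
WINDOW_TOP = 0x7FE00000      # top of the aliased window (exclusive)


def physical_usb_ram_address(address: int) -> int:
    """Fold an address inside the aliased window onto the real 8KB block."""
    if not USB_RAM_BASE <= address < WINDOW_TOP:
        raise ValueError("address is outside the aliased USB RAM window")
    return USB_RAM_BASE + ((address - USB_RAM_BASE) % USB_RAM_SIZE)


segments = (WINDOW_TOP - USB_RAM_BASE) // USB_RAM_SIZE
print(f"{segments} segments of {USB_RAM_SIZE // 1024}KB each")
for alias in (0x7FD02000, 0x7FD04000, 0x7FD06000):
    print(f"0x{alias:08X} -> 0x{physical_usb_ram_address(alias):08X}")
```

Under this assumption, a scan that expects an abort for every word in the window would see none while the USB RAM is enabled, because every address resolves to real memory.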

Review
Let's review our findings so far:

  • An unaligned memory access doesn't generate a data-abort exception on the LPC 2148
  • A write to ROM does generate a data-abort exception
  • A read or write anywhere in the memory segments defined as Reserved Areas #1 and #3 will generate a data-abort exception, whereas a read or write to specific subsets of Reserved Areas #2 and #4 will not. Those subsets are defined in Figure 5.

Effects of data-abort exception
To examine the effects of data-abort exceptions, let's look at some basic background material. There are two models for data abort:

  • Base restored abort model
  • Base updated abort model

Both models are implementation-dependent, so you should refer to your chip's manuals for details. What's the difference between the two?

The ARM7 in general uses the base updated abort model, which ARM ARM defines as “If a Data Abort occurs in an instruction, which specifies base register writeback, the base register writeback still occurs.”13

On the other hand, ARM9 uses the base restored abort model, defined by ARM ARM as “If a Data Abort occurs in an instruction, which specifies base register writeback, the value in the base register is unchanged.”

Notice how quickly we're getting deeper into apparently unrelated issues.

What is a base register writeback? Consider the following instruction; for a moment we step outside of a data-abort exception context:

ldr   r0, [r1, #4]!

This instruction (auto-indexing) loads register r0 with the content of the memory located at the address r1 + 4 and then writes that address back to r1, so r1 ends up incremented by four. For example, if this instruction were placed in a loop, the effect would be an automatic stepping through a lookup table, one word per iteration, driven by the value in r1. This feature is built into the ARM hardware; the exclamation mark in this example simply requests the base register writeback.

Let's return to data-abort exception effects and rules.

There are two cases to be considered: first, how the data abort affects the memory content (write); and second, how the data abort affects the content of registers (read). Furthermore, we can analyze these two cases for a single register transfer and a multiple register transfer. David Seal's treatment of this topic is very precise.14 I won't attempt to re-interpret the rules; rather you can study these rules on your own. However, we'll examine two cases, multiple writes and a single read, to gain some background for my implementation of the data-abort exception handler for the LPC 2148. Let's consider the following instruction for storing multiple registers involving the memory address region of boundary #4:

stmda  r8!, {r0-r7}   //store multiple, decrement after, with base update ('!')

If we let r8=0x7FD0 0010 (pointing to the bottom of the USB RAM range), this instruction would write only five registers15 (r7 to r3) out to memory, followed by a data-abort exception. However, register r8 would be updated as though all eight registers had been transferred to memory; namely, the contents of r8 would be updated to:

0x7FD0 0010 - (8 registers * 4 bytes = 0x20) = 0x7FCF FFF0

as Figure 6 depicts.
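The address arithmetic of this example can be replayed with a short Python model of stmda under the base updated abort model. It is a sketch under stated assumptions: the function and predicate names are hypothetical, the USB RAM is assumed to be enabled, and every address below 0x7FD0 0000 is assumed to abort.

```python
MASK32 = 0xFFFFFFFF


def usb_ram_is_valid(address: int) -> bool:
    """Assumed memory map: only the 8KB USB RAM accepts the write."""
    return 0x7FD00000 <= address < 0x7FD02000


def stmda(base: int, registers: list, is_valid, writeback: bool = True):
    """Model stmda: lowest register at lowest address, highest register at base."""
    count = len(registers)
    start = (base - 4 * count + 4) & MASK32
    stored, aborted = [], []
    for i, reg in enumerate(sorted(registers)):
        address = (start + 4 * i) & MASK32
        (stored if is_valid(address) else aborted).append((reg, address))
    # Base updated abort model: writeback happens even though the access aborted.
    new_base = (base - 4 * count) & MASK32 if writeback else base
    return stored, aborted, new_base


stored, aborted, r8 = stmda(0x7FD00010, list(range(8)), usb_ram_is_valid)
print("stored: ", [f"r{r}@0x{a:08X}" for r, a in stored])
print("aborted:", [f"r{r}@0x{a:08X}" for r, a in aborted])
print(f"r8 after the abort: 0x{r8:08X}")
```

Calling the same function with `writeback=False` leaves the base at its original value, which corresponds to the form of the instruction without the exclamation mark.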

This instruction should cause a data abort, and it does. Register r8 was predictably updated to 0x7FCF FFF0 as if the last write to the memory address at 0x7FCF FFF4 was successful (the “decrement after” mode caused the register r8 to be updated to the next word address after 0x7FCF FFF4, which is 0x7FCF FFF0).

In the example just cited, I've intentionally selected the most important of the six rules for the data-abort effects since this rule specifically addresses the unique property of ARM7TDMI-S. Later, I'll analyze the specifics of this property and its consequences in the remaining three boundary cases in the context of the LPC 2148 implementation. I decided not to process the writeback update condition because it would have an impact on the code size of the data-abort exception handler.

Now, let's look at the outcome when the writeback is not specified (absence of an exclamation mark), as shown here:

stmda  r8, {r0-r7}   //store multiple, decrement after, without base update

Using the same initial conditions as before, the value stored in r8 would be unchanged after the data-abort exception; it would still be 0x7FD0 0010. For a memory read whose destination is a single general-purpose register, the value of that register is unchanged; for example, consider the following code segment:

;Initially r0=0x00000101
ldr  r2, =0x7FE00000  ;address in Reserved Area #2
ldr  r0, [r2]         ;data abort exception raised
;After data abort exception raised, r0=0x00000101

The initial value in r0 was 0x00000101. Execution of these two lines would not affect the register's content, but the data-abort exception would be raised since we attempted to load r0 from an address in Reserved Area #2.

Non-unified theory
There is a schism even in the ARM literature about what ought to be done by the data-abort exception handler. One view states the exception handler should “fix” the error and return to re-execute the instruction that caused the exception.16 The other says that the exception handler should report the error and stop further program execution.17 This divide is understandable because so many variations of hardware are available; there is no silver bullet. The final decision rests with the system programmer and depends on the context of the problem at hand.

I agree with the second view that emphasizes reporting the error and stopping further program execution. However, I do process the content of all 16 registers, the opcode of the instruction causing the exception, and the disassembled opcode.

The bulk of my data-abort exception handler's code uses the exception's native 32-bit ARM state. After all, it is only 104 bytes in code size, so there's no incentive to try to compress it using Thumb. The second reason for using the ARM state for most of the data-abort exception processing is that it gives access to the MSR and MRS instructions, which can modify the control bits of CPSR as needed for switching in and out of the data-abort mode.

The state (and mode) switching is performed for the purpose of obtaining the content of the stack pointer and link register (r13=sp and r14=lr, respectively) at the time of exception; the contents of registers r0 to r12 are transparent to the data-abort mode.
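Which bank of sp and lr to fetch depends on the mode and state recorded in the saved status register at the time of the exception. As a host-side illustration, this Python sketch decodes those fields from a 32-bit status value. It uses the architectural mode encodings in bits [4:0] and the T bit in bit 5; the function name is hypothetical, and the mode the article calls “swi” appears here under its architectural name, svc (Supervisor).

```python
# Architectural mode encodings held in bits [4:0] of a program status register.
MODES = {
    0x10: "usr", 0x11: "fiq", 0x12: "irq", 0x13: "svc",
    0x17: "abt", 0x1B: "und", 0x1F: "sys",
}
T_BIT = 1 << 5               # set when the processor was executing Thumb code


def decode_psr(psr: int):
    """Return (mode, state) for a 32-bit status register value."""
    mode = MODES.get(psr & 0x1F, "unknown")
    state = "Thumb" if psr & T_BIT else "ARM"
    return mode, state


for psr in (0x60000010, 0x00000033, 0x000000D7):
    mode, state = decode_psr(psr)
    print(f"psr=0x{psr:08X}: mode={mode}, state={state}")
```

On the target, the equivalent check would be done in assembly after reading the saved status register with MRS; the sketch only shows the bit layout.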

A myth is being perpetuated by some people working with ARM processors that there are only eight registers when the processor is in a Thumb state. The source of this myth is perhaps rooted in the original ARM block diagram for Thumb state showing only eight registers in any mode.

My first question when I saw this diagram was: where did the remaining registers go? They are there; they didn't vanish. Actually, their names are Lo (r0-r7) and Hi (r8-r15).18 However, only a few instructions can operate directly on the Hi registers, chiefly MOV, ADD, and CMP. Perhaps for the compiler designer, it might be easier not to use those Hi registers, but for someone programming in assembly, any available register can be a great gift.

The data-abort exception handler, therefore, processes and displays the content of all 16 registers, regardless of whether the data-abort exception was raised in an ARM or the Thumb state.

Insert code here
I wanted to write an exception handler that could process data aborts not only from either state but also from any mode (sys, usr, swi, irq, and so on). Furthermore, I wanted to write a modular exception handler, independent of an output device driver. At the end of the exception handler routine, the processor would switch to Thumb state and an additional umbrella-like function would be called that might contain additional processing; in my implementation, it would be a call to the disassembler and to the UART output driver.

For further processing of the crash data, I chose the Thumb state on the merit of its code density; the processing speed is irrelevant because the application has already crashed.

The data-abort exception handler fills a global array of 72 bytes with the contents of all 16 registers at the time of the exception, plus the mode of the processor (in the last five bits of cpsr) and the offending instruction's opcode.

The data-abort exception handler produced the output I described earlier, based on two facts:

  • First, when the data-abort exception is raised, the processor stores the address of the aborted instruction plus eight bytes (pipeline offset) in lr, and then fetches the address of
