Tutorial: On-chip MCU debug and the end of the (off-chip) ICE Age - Embedded.com


In the age of non-volatile memory-based MCUs and devices, one very useful role of on-chip debug is in support of Flash-memory-based software routines. These modern Flash memories typically include an on-chip state machine that automates most of the steps needed to erase pages of Flash or program Flash locations.

During programming and erase operations, the Flash memory is temporarily removed from the memory map, so it can be especially challenging to debug this type of routine. Emulators, including the on-chip ICE, overcome these problems by monitoring and capturing address or data information in real time without interfering with the timing or execution of these programs.

In addition, there are several practical debug scenarios in which on-chip ICE can perform useful functions. The on-chip ICE includes multiple comparators that can be used for hardware breakpoints or single and dual 16-bit address triggers, single 16-bit address plus 8-bit data triggers, and single 16-bit address event triggers. Trigger mode logic determines how bus information is captured relative to a trigger and whether the target MCU stops after the capture buffer is full or continues to run the application software. The debug scenarios show some of the ways the on-chip ICE can be configured to quickly track down application problems.

Debug Scenario 1: Using trace in Flash MCUs
Typical trace techniques do not work well for routines that program or erase Flash memory, because the Flash array is disabled during the actual program or erase operation and this would interfere with debugging. When the debugger tries to read Flash during the operation, invalid data is returned. Also, any attempt to read from Flash in the middle of a programming or erase operation would cause an error and abort the operation.

The on-chip debug system will allow us to trace this routine because it non-intrusively captures bus information during the trace run. We will set a trigger just before the start of the programming operation. The debug system will be configured so that the MCU continues running after the trace buffer is full. An independent breakpoint will be set just after the operation is complete. This allows us to see what happened without disturbing the operation.

We will examine a routine called DoOnStack that is shown in Figure 2, below. This routine resides in the Flash and allows you to program or erase another area of the same Flash even though the Flash memory is removed from the memory map during the Flash operation. It does this by copying a small subroutine called SpSub onto the stack, and calling that subroutine to perform the critical steps of the Flash operation. When the Flash operation is finished, the program returns to DoOnStack in Flash and de-allocates the stack space used by SpSub. The subroutine that actually controls the operation is executing from stack RAM during the Flash programming operation.

Figure 2: DoOnStack and SpSub source code. The red line A in the left column near the bottom of the DoOnStack routine indicates trigger point A has been set at this address. The red arrow, two lines below, indicates a breakpoint is set at this address.

With older MCU Flash memories, the code to perform the Flash operation would use so much stack space that this approach probably wouldn't be desirable. In the HCS08, the stack subroutine uses 24 bytes and in the HCS12 it takes about 12 bytes, which is about the same as a normal interrupt stack frame. Interrupts are blocked just before calling the small SpSub routine on the stack and are unblocked as soon as it is finished.

The execution time to copy this small routine onto the stack is small enough that it easily fits in line with normal application code. Programming a single byte of Flash only requires about 45 microseconds, so there are no problems with extended interrupt latency due to Flash programming. If you program several Flash locations in a series, interrupts are unblocked between operations, so interrupts only need to wait for the current byte to finish.
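The latency argument can be checked with a little arithmetic. The following is only a sketch in C: it takes the figure of about 45 microseconds per byte from the text, ignores the time spent copying SpSub onto the stack, and uses hypothetical function names that do not come from the routines in Figure 2.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate time to program one Flash byte, from the text (about 45 us). */
#define FLASH_BYTE_PROGRAM_US 45u

/* Total time to program n bytes one after another (copy overhead ignored). */
uint32_t flash_total_program_us(uint32_t n_bytes)
{
    assert(n_bytes <= UINT32_MAX / FLASH_BYTE_PROGRAM_US); /* overflow guard */
    return n_bytes * FLASH_BYTE_PROGRAM_US;
}

/* Worst-case added interrupt latency. Interrupts are unblocked between
 * bytes, so a pending interrupt waits for at most one byte to finish,
 * no matter how many bytes are programmed in the series. */
uint32_t flash_worst_case_latency_us(uint32_t n_bytes)
{
    return (n_bytes == 0u) ? 0u : FLASH_BYTE_PROGRAM_US;
}
```

Under these assumptions, programming 64 bytes in a series takes roughly 2880 microseconds in total, but any single interrupt is still delayed by only about 45 microseconds.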

Figure 2, above, shows two source code windows from the CodeWarrior IDE containing the source code for the DoOnStack and SpSub routines. Both of these routines are located in the application program in Flash memory. These routines are designed to allow you to erase a 512-byte page of Flash or program a byte of Flash in the same Flash memory where your application program is located. With these routines, you can easily use one or more pages of the Flash memory as if it were EEPROM.

Refer to the DoOnStack subroutine in the Source:1 window at the top of Figure 2. After copying entry values onto the stack, the SpMoveLoop copies the SpSub routine from Flash onto the stack (RAM).

After this move operation, the stack pointer points at the start of the relocated SpSub routine. The TSX instruction copies the stack pointer into X so we can use a JSR ,X to call the relocated subroutine in stack RAM. We want to block interrupts during the Flash operation and restore the I mask after we return, so there are two slightly different blocks of code depending on whether the I bit was already set or not. After returning from the JSR ,X subroutine call, we de-allocate the stack space we used, adjust the position of error flags in A, and return.

The JSR ,X calls SpSub to complete the last four steps of the Flash operation and return to DoOnStack. Now refer to the SpSub routine in the bottom half of Figure 2. The Flash address that will be operated on was pushed onto the stack before SpSub was copied onto the stack. When SpSub is executed, it retrieves this address from the stack using the offset SpSubSize+4. The < before SpSubSize+4 indicates to the assembler that you want this value to be treated as an 8-bit offset even though SpSubSize is a 16-bit value in the assembler.

Debug Scenario 2: Delayed software routine execution
One place a single 16-bit address trigger is useful would be where the user has a software routine that is performed periodically. However, during alternate execution passes of this software routine, the results from this routine are unexpectedly delayed.

To determine why this unexpected delay is being introduced, the user would first set a single 16-bit address trigger (A only) on the first instruction of the questionable routine. Using a begin capture option, the user can verify the actual versus expected software execution path. Depending on the number of changes-of-flow that occur from the beginning of the software routine until the unexpected delay, the trigger point might need to be repositioned to a point later in the routine.

Using this approach, the user might determine that an unexpected interrupt service routine (ISR) is occurring during every other instance of this routine, which introduces the unexpected delay. From this result, the user could take corrective measures to ensure that interrupts are disabled during a time-critical routine. Further trigger captures could be performed to determine what causes the interrupt to occur.
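To make the begin capture option concrete, here is a small host-side model in C. It is a sketch rather than a description of the actual debug hardware: the bus_event_t type, the function name, and the choice not to record the trigger event itself are all assumptions made for illustration. Only the eight-entry depth comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 8u  /* capture depth described for the HCS08 */

/* One simulated bus event; is_cof marks a change-of-flow address. */
typedef struct {
    uint16_t addr;
    int      is_cof;
} bus_event_t;

/* Begin-trigger model: ignore everything until trigger_a appears on the
 * address bus, then store later change-of-flow addresses until the FIFO
 * is full. The trigger event itself is not stored (an assumption).
 * Returns the number of entries captured. */
size_t capture_begin_trigger(const bus_event_t *trace, size_t n,
                             uint16_t trigger_a, uint16_t fifo[FIFO_DEPTH])
{
    size_t count = 0;
    int armed = 0;

    assert(trace != NULL && fifo != NULL);
    for (size_t i = 0; i < n && count < FIFO_DEPTH; i++) {
        if (!armed) {
            armed = (trace[i].addr == trigger_a);
            continue;
        }
        if (trace[i].is_cof) {
            fifo[count++] = trace[i].addr;
        }
    }
    return count;
}
```

If the captured list contains an address that belongs to an ISR rather than to the routine under test, that is the kind of unexpected detour this scenario is looking for.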

Debug Scenario 3: Suspect runaway code
The dual 16-bit address trigger function in the on-chip debug logic would be useful if the user suspects code runaway is occurring in the application software. Suppose the problem is noticed because a static calibration data table in memory is being corrupted.

Not knowing what part of the software is changing the values in the calibration data table, the user can set a dual 16-bit address trigger to catch the responsible instruction. Because the user does not know where in the calibration data table the corruption is first occurring, the trigger can be made within an address range.

The Inside Range A to B trigger condition can be used for this scenario because it will capture instruction flow leading up to the moment any location within address range A to B is written. Comparator A would be placed at the beginning of the calibration data table. Comparator B would be placed at the end of the calibration data table.

To locate the instruction that is causing the corruption in the calibration data table, the user would configure the capture option to perform an end trigger. This setting would allow the user to see the path that led to the first memory corruption in the calibration data table.
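The combination of an inside-range condition with an end trigger can be modeled as a ring buffer that keeps overwriting its oldest entry until the trigger fires. The C sketch below is a simplified host-side model, not the actual debug logic; the type and function names are hypothetical, and only the eight-entry depth and the A-to-B range check are taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 8u

typedef struct {
    uint16_t addr;      /* address on the bus */
    int      is_write;  /* nonzero for a write cycle */
    int      is_cof;    /* nonzero if addr is a change-of-flow address */
} bus_event_t;

typedef struct {
    uint16_t entry[FIFO_DEPTH];
    size_t   head;   /* next slot to overwrite */
    size_t   count;  /* valid entries, at most FIFO_DEPTH */
} cof_fifo_t;

static void fifo_push(cof_fifo_t *f, uint16_t addr)
{
    f->entry[f->head] = addr;
    f->head = (f->head + 1u) % FIFO_DEPTH;
    if (f->count < FIFO_DEPTH) f->count++;
}

/* Oldest-first access to the captured entries. */
uint16_t fifo_get(const cof_fifo_t *f, size_t i)
{
    assert(i < f->count);
    return f->entry[(f->head + FIFO_DEPTH - f->count + i) % FIFO_DEPTH];
}

/* End-trigger model: keep recording change-of-flow addresses until a write
 * lands inside [range_a, range_b], then freeze the FIFO.
 * Returns the index of the triggering event, or n if none occurred. */
size_t capture_until_range_write(const bus_event_t *trace, size_t n,
                                 uint16_t range_a, uint16_t range_b,
                                 cof_fifo_t *f)
{
    f->head = 0;
    f->count = 0;
    for (size_t i = 0; i < n; i++) {
        if (trace[i].is_write &&
            trace[i].addr >= range_a && trace[i].addr <= range_b) {
            return i;
        }
        if (trace[i].is_cof) fifo_push(f, trace[i].addr);
    }
    return n;
}
```

After the trigger, reading the buffer oldest-first with fifo_get gives the last eight changes of flow that preceded the offending write, which is the path the user wants to inspect.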

Debug Scenario 4: Verifying the software execution path
Full trigger modes that use a single 16-bit address plus 8-bit data trigger allow the user to capture eight software changes-of-flow before or after a specified data value is read or modified at a specific memory address. The direction (before or after) is determined by the capture options. This trigger uses two comparators. The first is set to a single 16-bit address and the second is set to the 8-bit data value being observed.

This kind of capability would be useful if the user wants to verify the execution path in the software after enabling a peripheral. (For this scenario, writing 0x80 to the corresponding peripheral control register enables the peripheral.) The user can use the A AND B (data) trigger in the begin trigger mode to capture the software changes-of-flow following the write of 0x80 to the peripheral control register.

Comparator A would be set to the 16-bit address associated with the peripheral control register and comparator B would be set to the expected data value (0x80) to be written in the peripheral control register. When the write of 0x80 occurs at the peripheral control register, the on-chip ICE would begin recording the software changes of flow. This would provide the user with the data necessary to evaluate the software.

Debug Scenario 5: Overwritten peripheral register
This scenario shows how the on-chip ICE can be used if the user suspects that a peripheral control register is being overwritten unexpectedly. The user's software maintains a 0x55 value in a peripheral control register. However, the software condition statement that checks for the 0x55 value in the peripheral control register unexpectedly fails.

The value appears to have changed because the condition statement is not being satisfied as expected. To determine the cause of the unexpected data value modification, the user can set up an A AND NOT B (data) trigger in end trigger mode.

The end trigger mode will allow the user to record the instructions performed prior to the change-in-value at the peripheral control register. Comparator A will be set to the 16-bit peripheral control register address in question. Comparator B will be set to the 8-bit data value (0x55), which the user expects the register to maintain.

Since this trigger condition is qualified by a write of any value other than 0x55 (NOT B), the first write that changes the peripheral register fires the trigger, and the buffer then holds the previous eight changes-of-flow that led to the erroneous write to the peripheral control register.
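The two address-plus-data conditions, A AND B (data) and A AND NOT B (data), differ only in how the data comparator is used. The C sketch below models them as a predicate over write cycles. It is an illustration under assumptions: the names are hypothetical, only writes are modeled, and the register address 0x0040 in the usage note is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TRIG_A_AND_B, TRIG_A_AND_NOT_B } trig_mode_t;

typedef struct {
    uint16_t addr;  /* address written */
    uint8_t  data;  /* value written */
} write_cycle_t;

/* Comparator A matches the 16-bit address; comparator B holds the 8-bit
 * data value that must match (A AND B) or must differ (A AND NOT B). */
int trigger_matches(trig_mode_t mode, uint16_t comp_a, uint8_t comp_b,
                    write_cycle_t w)
{
    if (w.addr != comp_a) return 0;
    return (mode == TRIG_A_AND_B) ? (w.data == comp_b) : (w.data != comp_b);
}

/* Index of the first write that fires the trigger, or n if none does. */
size_t first_trigger(trig_mode_t mode, uint16_t comp_a, uint8_t comp_b,
                     const write_cycle_t *writes, size_t n)
{
    assert(writes != NULL);
    for (size_t i = 0; i < n; i++) {
        if (trigger_matches(mode, comp_a, comp_b, writes[i])) return i;
    }
    return n;
}
```

With comparator A at a hypothetical register address 0x0040 and comparator B at 0x55, the A AND NOT B mode ignores every write of 0x55 to the register and fires on the first write of anything else, while A AND B with comparator B at 0x80 fires only on the enabling write described in Scenario 4.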

Debug Scenario 6: Verify data stored in memory
The single 16-bit address event trigger records data values rather than software change-of-flow addresses. Unlike the other trigger methods, this trigger allows the user to capture eight event data values read and/or written at a specified 16-bit memory location. Also, this trigger method does not care about the recording direction option (begin/end trigger modes) because it always performs a begin trigger.

For example, suppose the user wants to verify an 8-bit data result stored in memory. If there is any doubt about the accuracy of the software algorithm that calculates the data result, the user can set up an Event Only B trigger. For this example, comparator B would be set to the 16-bit address associated with the 8-bit data result that the user wants to log. This trigger would record the first eight values written at the memory location where the result resides. The user can then verify that the expected data result pattern is satisfied.

A variation on this situation would be if the user wanted to capture accesses to a data result register after the software has completed polling for a status ready flag. For this example, the user can use the A then Event Only B trigger condition, which postpones the recording of data accesses until the software line that indicates transmission-ready status executes. Without this trigger condition, a simple Event Only B trigger could not differentiate data accesses between the idle and ready status periods.

For this condition, comparator A would be set to the 16-bit address that contains the instruction that validates the ready status flag. Comparator B would be set to the 16-bit address that is associated with the data result register. From this trigger, the user can verify that correct data is being transmitted after shifting from the idle to ready state.
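Both event-trigger variants can be modeled with one small routine. The C sketch below is a host-side illustration only: the bus_access_t type, the function name, and the use_a flag are hypothetical, and the model simply records the data byte of every access to the comparator B address once capture is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 8u

typedef struct {
    uint16_t addr;  /* address accessed */
    uint8_t  data;  /* data byte read or written */
} bus_access_t;

/* Event-only capture: store the data byte of each access to comp_b.
 * If use_a is nonzero, capture is postponed until comp_a has been seen
 * on the address bus (the "A then Event Only B" condition).
 * Returns the number of data values stored (at most FIFO_DEPTH). */
size_t capture_events(const bus_access_t *trace, size_t n,
                      int use_a, uint16_t comp_a, uint16_t comp_b,
                      uint8_t fifo[FIFO_DEPTH])
{
    size_t count = 0;
    int armed = !use_a;

    assert(trace != NULL && fifo != NULL);
    for (size_t i = 0; i < n && count < FIFO_DEPTH; i++) {
        if (!armed) {
            if (trace[i].addr == comp_a) armed = 1;
            continue;
        }
        if (trace[i].addr == comp_b) fifo[count++] = trace[i].data;
    }
    return count;
}
```

Calling the routine with use_a set to 0 logs every access to the result location from the start of the run, while setting use_a to 1 skips the accesses made during the idle period and logs only those that follow the comparator A match.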

What's next for on-chip debug
The on-chip bus analysis capability is a significant evolutionary jump in MCU design. The problems of cabling and emulation speed are becoming downright unmanageable using the traditional ICE approach.

On-chip debug offers a practical solution to the impending technical barriers of speed, pin density, and more. It has been demonstrated that this on-chip approach works. Future generations of MCUs could increase the size of the capture buffer, add trigger comparators, and add the ability to trigger on CPU register contents.

For example, several new MCUs, such as the MC9S12E128, MC9S12C32, and MC9S12NE64, have already gone a little further than the HCS08. They increased the depth of the capture FIFO from eight words to 64 words and added two new capture modes called “Loop1” and “Detail.”

Loop1 inhibits multiple captures of the most recent change-of-flow event, as happens in delay loops and loops where you are waiting for a flag to set. This increases the effective depth of the capture buffer.
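The effect of Loop1 can be illustrated with a simple filter. The C sketch below assumes the simplest reading of the description above, namely that a change-of-flow address is dropped when it equals the most recently stored one; the actual hardware comparison may differ, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Loop1-style filter (simplified): drop a change-of-flow address when it
 * equals the most recently stored one, so a tight loop occupies a single
 * entry. Returns the number of entries written to out. */
size_t loop1_filter(const uint16_t *cof, size_t n,
                    uint16_t *out, size_t out_cap)
{
    size_t count = 0;

    assert(cof != NULL && out != NULL);
    for (size_t i = 0; i < n && count < out_cap; i++) {
        if (count > 0 && out[count - 1] == cof[i]) continue;
        out[count++] = cof[i];
    }
    return count;
}
```

In this model, a polling loop that branches back to the same address hundreds of times takes up one buffer entry instead of flushing out the history that led into the loop.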

The Detail mode uses pairs of FIFO words to capture the address and data for every significant bus cycle around the trigger rather than just change-of-flow addresses. This allows you to see values that are loaded into registers, and data values read from or written to memory. In this Detail mode, selective capture is used to ignore “free” cycles where internal CPU operations are being performed and the address and data bus are not used for any meaningful data transfer.
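Because each captured cycle consumes a pair of words, a 64-word FIFO holds at most 32 bus cycles in this mode. The C sketch below models that bookkeeping. It is an illustration under assumptions: the bus_cycle_t type and function name are hypothetical, the address-then-data ordering within a pair is assumed, and the model captures from the start of the trace rather than around a trigger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_WORDS 64u  /* capture FIFO depth given in the text */

typedef struct {
    uint16_t addr;
    uint16_t data;
    int      is_free;  /* nonzero for an internal cycle with no transfer */
} bus_cycle_t;

/* Detail-style capture: each kept bus cycle consumes two FIFO words.
 * Storing the address word before the data word is an assumption.
 * Returns the number of FIFO words used. */
size_t detail_capture(const bus_cycle_t *trace, size_t n,
                      uint16_t fifo[FIFO_WORDS])
{
    size_t used = 0;

    assert(trace != NULL && fifo != NULL);
    for (size_t i = 0; i < n && used + 2u <= FIFO_WORDS; i++) {
        if (trace[i].is_free) continue;  /* selective capture skips free cycles */
        fifo[used++] = trace[i].addr;
        fifo[used++] = trace[i].data;
    }
    return used;
}
```

Skipping free cycles matters here for the same reason Loop1 matters for change-of-flow capture: with only 32 slots available, every entry spent on an idle bus cycle is one less useful transfer in the trace.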

To read Part 1 in this series, go to “Part 1: Why is on-chip debug necessary?”


Jim Sibigtroth has worked for Freescale Semiconductor (formerly the Semiconductor Products Sector of Motorola, Inc.) for more than 27 years and is currently a senior systems engineer in the 8/16-Bit MCU Division of the Transportation and Standard Products Group. Jim defined the original MC68HC11 and wrote the M68HC11 Reference Manual, commonly known as the “Pink Book.” More recently he defined the CPU12 instruction set and the single-wire background debug interface that is on all HCS08 and HCS12 MCUs. Eduardo Montañez has worked for the past five years at Freescale Semiconductor (formerly the Semiconductor Products Sector of Motorola, Inc.) as a Systems & Applications Engineer in the Microcontroller Division.
