CMP EMBEDDED.COM


Code techniques for processor pipeline optimization: Part 3
Optimization for Control-Oriented Operations



Embedded.com

Optimizing Condition Checks
Core instructions can selectively modify the state of the condition codes. When generating code for if...else and loop conditions, it is often beneficial to make use of this feature to set condition codes, thereby eliminating the need for a subsequent compare instruction. Consider the following C statement.

if ((a + b) != 0)
    c = c + 1;

Code generated for the if condition without using an add instruction to set condition codes is:

add         r2, r0, r1
cmp        r2, #0
addne     r3, r3, #1

However, the code can be optimized by letting the add instruction itself set the condition codes:

adds         r2, r0, r1
addne       r3, r3, #1

Condition checking can also be performed on coprocessor registers. The SIMD flags in the wCASF register are updated during execution of Wireless MMX instructions. One of the three flag extraction operations (TANDC, TORC, or TEXTRC) can then be used to update the flags of the XScale core.

This method, called group conditional execution, makes conditional execution depend on one or all of the SIMD fields. It is shown in the following example:

wsubhus     wR1, wR2, wR3
@ Saturating subtraction: the minimum value
@ of each wR1 field is zero
torch       R15
@ Update the core flags with the ORed
@ coprocessor flag values
addeq       r2, r2, #1
@ Now executes conditionally on the
@ coprocessor flags

All of the preceding techniques for using conditional execution effectively also apply to group conditional execution. Group conditional techniques are useful for cases such as peak detection or finding a match in a vector.
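As a rough illustration of what the wsubhus/torch/addeq sequence computes, the following portable C sketch models the behavior in scalar code. It is not Wireless MMX code: the helper names (sat_sub_u16, any_field_zero, count_groups_at_or_below) are hypothetical, and the model assumes four unsigned 16-bit fields per register with the zero flags of all fields ORed together.

```c
#include <stdint.h>

/* Scalar model of an unsigned saturating halfword subtract:
   the result clamps at zero instead of wrapping around. */
static uint16_t sat_sub_u16(uint16_t a, uint16_t b)
{
    return (a > b) ? (uint16_t)(a - b) : 0;
}

/* Hypothetical helper: returns 1 if any of the four fields of (a - b)
   is zero after saturation. This models the ORed zero flag that the
   flag extraction step copies into the core flags. */
int any_field_zero(const uint16_t a[4], const uint16_t b[4])
{
    int zero_seen = 0;
    for (int i = 0; i < 4; i++) {
        zero_seen |= (sat_sub_u16(a[i], b[i]) == 0);
    }
    return zero_seen;
}

/* Counts the groups of four samples that contain at least one value
   at or below the threshold. The increment plays the role of the
   conditional addeq in the assembly example. */
unsigned count_groups_at_or_below(const uint16_t *samples,
                                  unsigned n_groups,
                                  uint16_t threshold)
{
    const uint16_t thr[4] = { threshold, threshold, threshold, threshold };
    unsigned count = 0;
    for (unsigned g = 0; g < n_groups; g++) {
        if (any_field_zero(&samples[4 * g], thr)) {
            count++;
        }
    }
    return count;
}
```

On the real hardware the four comparisons happen in one instruction and the branch-free increment replaces the if statement; the sketch only shows the logic being evaluated.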

The instructions that increment or decrement the loop counter can also be used to modify the condition codes. Modifying the codes eliminates the need for a subsequent compare instruction. A conditional branch instruction can then be used to exit or continue with the next loop iteration.

Consider the following C code segment:

for (i = 10; i != 0; i--) {
    perform inner_kernel;
}

The optimized code generated for the preceding code segment would look like:

.L6:
@equivalent to inner_kernel
    subs     r3, r3, #1
    bne     .L6

Using the above argument, it is also beneficial to rewrite loops whenever possible to make the loop exit conditions check against the value 0. For example, the code generated for the following code segment needs a compare instruction to check for the loop exit condition.

for (i = 0; i < 10; i++) {
    perform inner_kernel;
}

If the loop is rewritten as follows, the code generated avoids using a compare instruction to check for the loop exit condition.

for (i = 9; i >= 0; i--) {
    perform inner_kernel;
}
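A minimal sketch of the two loop forms applied to a concrete kernel follows. The function names and the summing kernel are assumptions chosen for illustration. The rewrite reverses the traversal order, so it is only valid when the result does not depend on the order of iteration, as with a sum. Whether the count-down form actually saves a compare depends on the compiler and optimization level, so it is worth checking the generated assembly.

```c
#include <stdint.h>

/* Count-up form: the exit test compares i against n on every
   iteration, which typically needs an explicit compare. */
int32_t sum_count_up(const int16_t *data, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}

/* Count-down form: the exit test is against 0, so the decrement
   that updates i can also set the condition codes. */
int32_t sum_count_down(const int16_t *data, int n)
{
    int32_t sum = 0;
    for (int i = n - 1; i >= 0; i--) {
        sum += data[i];
    }
    return sum;
}
```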

However, the use of conditional instructions should be considered carefully to ensure it improves performance. To decide when to use conditional instructions over branches, consider this hypothetical code segment:

if (cond)
    if_stmt
else
    else_stmt

Using the following data:

N1B = number of cycles to execute the if_stmt, assuming the use of branch instructions
N2B = number of cycles to execute the else_stmt, assuming the use of branch instructions
P1 = percentage of times the if_stmt is likely to be executed
P2 = percentage of times a branch misprediction penalty is likely to be incurred
N1c = number of cycles to execute the if...else portion using conditional instructions, assuming the if condition is true
N2c = number of cycles to execute the if...else portion using conditional instructions, assuming the if condition is false

Use conditional instructions when:

(N1c * P1/100) + (N2c * (100 - P1)/100) <= (N1B * P1/100) + (N2B * (100 - P1)/100) + (P2/100 * 4)

Here 4 is the branch misprediction penalty in cycles.

The following example illustrates a situation in which it is better to use branches instead of conditional instructions.

    cmp     r0, #0
    bne     L1
    add     r0, r0, #1
    add     r1, r1, #1
    add     r2, r2, #1
    add     r3, r3, #1
    add     r4, r4, #1
    b       L2
L1:
    sub     r0, r0, #1
    sub     r1, r1, #1
    sub     r2, r2, #1
    sub     r3, r3, #1
    sub     r4, r4, #1
L2:

The CMP instruction takes one cycle to execute, the if statement takes seven cycles to execute, and the else statement takes six cycles to execute. If the code were changed to eliminate the branch instructions by using conditional instructions, the if...else statement would take 10 cycles to complete.

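Assume that both paths are equally likely to be taken and that branches are mispredicted 50 percent of the time. The cost of using conditional instructions is then 1 + (0.5 * 10) + (0.5 * 10) = 11 cycles, and the cost of using branches is 1 + (0.5 * 7) + (0.5 * 6) + (0.5 * 4) = 9.5 cycles, where 4 cycles is the misprediction penalty. Branches are the better choice here.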
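The comparison can be captured in a small cost model, sketched below in C. The function names are hypothetical, and the misprediction penalty is passed in as a parameter; a value of 4 cycles is assumed in the usage note because it is the value that makes the 9.5-cycle figure above work out. The one-cycle compare is common to both versions and is added by the caller.

```c
/* Expected cycles for the if...else portion using conditional
   instructions. p1 is the percentage of times the if path is taken. */
double cost_conditional(double n1c, double n2c, double p1)
{
    return n1c * p1 / 100.0 + n2c * (100.0 - p1) / 100.0;
}

/* Expected cycles for the if...else portion using branches.
   p2 is the percentage of mispredicted branches and penalty is the
   misprediction cost in cycles (an assumed input, not measured). */
double cost_branch(double n1b, double n2b, double p1,
                   double p2, double penalty)
{
    return n1b * p1 / 100.0
         + n2b * (100.0 - p1) / 100.0
         + penalty * p2 / 100.0;
}

/* Returns 1 when conditional instructions are expected to be at
   least as cheap as branches, 0 otherwise. */
int prefer_conditional(double n1c, double n2c,
                       double n1b, double n2b,
                       double p1, double p2, double penalty)
{
    return cost_conditional(n1c, n2c, p1)
        <= cost_branch(n1b, n2b, p1, p2, penalty);
}
```

With the figures from the example (10 cycles for either conditional path, 7 and 6 cycles for the branch paths, 50 percent for both probabilities, and a 4-cycle penalty), adding the one-cycle compare gives 11 cycles for the conditional version and 9.5 cycles for the branch version, and prefer_conditional returns 0.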
