CMP EMBEDDED.COM

Login | Register     Welcome Guest  
HOME DESIGN PRODUCTS COLUMNS E-LEARNING CONFERENCES CODE FORUMS/BLOGS NEWSLETTERS CONTACT FEATURES RSS RSS

The basics of programming embedded processors: Part 7
Trace-Driven Performance Analysis



Embedded.com

Optimizing for execution speed
If you need to improve the speed of your program, you can attack the problem at several levels of abstraction. First, make sure that the code really needs to be accelerated. If you are dealing with a large program, the part of the program using the most time may not be obvious.

Profiling the program will help you find hot spots. Many systems are built from multiple programs running as communicating processes; a slow program may not actually limit the overall system performance.

You may be able to redesign your algorithm to improve efficiency. Examining asymptotic performance is often a good guide to efficiency. Doing fewer operations is usually the key to performance. In a few cases, however, brute force may provide a better implementation. A seemingly simple high-level language statement may in fact hide a very long sequence of operations that slows down the algorithm.

Using dynamically allocated memory is one example, since managing the heap takes time but is hidden from the programmer. For example, a sophisticated algorithm that uses dynamic storage may be slower in practice than an algorithm that performs more operations on statically allocated memory.

Finally, you can look at the implementation of the program itself. A few hints on program implementation are summarized below.

  • Try to use registers efficiently. Group accesses to a value together so that the value can be brought into a register and kept there.
  • Make use of page mode accesses in the memory system whenever possible. Page mode reads and writes eliminate one step in the memory access. You can increase use of page mode by rearranging your variables so that more can be referenced contiguously.
  • Analyze cache behavior to find major cache conflicts. Restructure the code to eliminate as many of these as you can as follows:

- For instruction conflicts, if the offending code segment is small, try to rewrite the segment to make it as small as possible so that it better fits into the cache. Writing in assembly language may be necessary. For conflicts across larger spans of code, try moving the instructions or padding with NOPs.
- For scalar data conflicts, move the data values to different locations to reduce conflicts.
- For array data conflicts, consider either moving the arrays or changing your array access patterns to reduce conflicts.

Next in Part 8:  Analysis and optimization of energy, power amd program design
To read Part 1, go to "Program design and analysis."
To read Part 2 , go to " Models of program, assemblers and linkers."
To read Part 3, go to "Basic Compilation Techniques"
To read Part 4, go to "The creation of procedures"
To read Part 5, go to  "Register allocation and scheduling"
To read Part 6, go to "Analysis and optimization of execution time"

Used with the permission of the publisher, Newnes/Elsevier, this series of six articles is based on copyrighted material from "Computers as Components: Principles of Embedded Computer System Design" by Wayne Wolf. The book can be purchased on line.

Wayne Wolf  is currently the Georgia Research Alliance Eminent Scholar holding the Rhesa "Ray" S. Farmer, Jr., Distinguished Chair in Embedded Computer Systems at Georgia Tech's School of Electrical and Computer Engineering (ECE). Previously a professor of electrical engineering at Princeton University, he worked at AT&T Bell Laboratories. He has served as editor in chief of the ACM Transactions on Embedded Computing and of Design Automation for Embedded Systems.

References:
[Gho97] Somnath Ghosh, Margaret Martonosi and Sharad Malik "Cache miss equations: an analytical representation of cache misses." Proceedings of the 11th ACM International Conference on Supercomputing, ACM Press, 1997
[Col97] Robet Collins, "In-circuit emulation," Dr. Dobb's Journal, September, 1997. pp. 111-113.
1 | 2 | 3 | 4

Rate this article: Low High
Current rating
  • .
Embedded.com Career Center
Looking for a new job?
SEARCH JOBS

Browse all jobs

SPONSOR
RECENT JOB POSTINGS





 :