Omniscient code compilation comes to the PIC32 RISC CPU - Embedded.com

Phoenix, Az. – At Microchip Technology's MASTERS Conference here Wednesday, HI-TECH Software will take the wraps off an “omniscient” ANSI C compiler for 32-bit MCU code that it claims boosts real-time response by more than 25% and nearly doubles code density.

The new HI-TECH C PRO compiler for Microchip's PIC32 MCU uses a technique called omniscient code generation (OCG) to optimize stack and register allocation across all code modules before generating the object code. Smaller code generally executes more quickly and requires smaller, less expensive flash memory for storage.

The compiler collects comprehensive data on every register, stack, pointer, object and variable declaration across the entire program. It uses this information to optimize register usage, stack allocations and pointers across the whole program. It also ensures consistent variable and object declarations between modules and deletes unused variables and functions.

According to CEO and company founder Clyde Stubbs, the compiler's performance on the PIC32 bears out the company's belief that the OCG technique should yield even better performance and code density improvements on 32-bit register-based MCUs than those achieved on the 8- and 16-bit MCUs where the company has previously focused its OCG efforts.

Because the PIC32 is based on a MIPS Technologies 32-bit core, he believes that the performance improvements achieved should be repeatable on most other MIPS architectural derivatives, as well as many other RISC-based designs. “Right now we are being somewhat conservative and are confining ourselves to architectures that have a clear and large following in the embedded systems market.”

Next on the company's agenda is the 32-bit RISC ARM architecture, with a particular focus on the ARM Cortex-M3, which is targeted specifically at embedded applications. There, as with most other 32-bit RISC CPUs, said Stubbs, code is most often generated one module at a time, using variations of GNU Compiler Collection (GCC) techniques.

Because GCC generates code one module at a time, he said, no comprehensive cross-module data is available. “But without knowing how objects are used across the whole program, it is impossible to achieve the same level of optimization as an OCG compiler,” said Stubbs.

In code density benchmarks, the company's OCG compiler produced code as much as 40% smaller than that generated by industry-leading GCC-based PIC32 compilers. “The smaller code size can cut device costs by reducing the amount of on-chip flash required,” he said.

Stubbs pointed out that GCC-based 32-bit compilers are constrained as to which registers can be used to store parameters for called functions. “Whenever a function is called from another code module, the parameters of that function are usually stored in the registers,” said Stubbs, specifically in four registers reserved for this purpose in GCC-based compilers.

The problem is that if the function has more than four parameters, the additional parameters must be stored on the stack (in RAM) and passed to the called function from there – a cycle-intensive process that degrades performance and leads to increased RAM usage.
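The four-register limit described above can be sketched as a small model. This is only an illustration, not compiler output: the names `ARG_REGS`, `stack_params` and `sum6` are hypothetical, and the model assumes every parameter is one machine word (on MIPS under the o32 convention, for example, the four argument registers are `$a0` to `$a3`).

```c
#include <assert.h>

/* Assumption: four registers are reserved for parameters, as the
   article describes for GCC-based compilers. */
#define ARG_REGS 4

/* Model only: how many word-sized parameters would travel on the
   stack under a fixed four-register convention. */
int stack_params(int n_params)
{
    assert(n_params >= 0);
    return n_params > ARG_REGS ? n_params - ARG_REGS : 0;
}

/* A six-parameter function: under the fixed convention modeled
   above, e and f would be passed on the stack. */
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}
```

Calling `stack_params(6)` in this model gives 2, matching the two extra parameters of `sum6`, while any function with four or fewer parameters gives 0.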

Faster Interrupt Handling.
By comparison, he said, interrupt-intensive code generated by omniscient code compilation typically requires 26% fewer cycles for the PIC32 to execute than code compiled using a non-OCG compiler.

By reducing the number of CPU cycles spent moving data between the registers and stack, HI-TECH's OCG compiler effectively gives the CPU a 26% performance boost. More important, called functions frequently call other functions, which may, in turn, call other functions.

“This is particularly true for interrupt-intensive applications,” said Stubbs. “For example, if the code calls a function, which then calls a second function, the parameters for the first function will have to be saved to the stack to make room for the parameters for the second function.”

If this second function calls a third function, the parameters for the second function will also have to be saved to the stack to make room for the parameters of the third function.

“Data will have to be shifted continuously between the stack and the registers,” he said. “The penalty for this is at least a cycle every time data is moved to or from the stack, or 8 cycles to move the data for a single four-parameter function to the stack and back to the registers.”
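The arithmetic in that quote can be written out as a rough cost model. The function names below are hypothetical, and the model assumes the minimum cost Stubbs cites (one cycle per move) and that every function in a call chain except the innermost one saves and restores its parameters exactly once.

```c
#include <assert.h>

/* Cycles to move n_params values to the stack and back again,
   assuming cycles_per_move cycles for each store or load. */
unsigned spill_round_trip_cycles(unsigned n_params, unsigned cycles_per_move)
{
    return 2u * n_params * cycles_per_move;
}

/* Lower-bound model for a chain of nested calls: each function
   except the innermost one saves and restores its parameters. */
unsigned chain_spill_cycles(unsigned depth, unsigned n_params,
                            unsigned cycles_per_move)
{
    assert(depth >= 1);
    return (depth - 1u) * spill_round_trip_cycles(n_params, cycles_per_move);
}
```

With four parameters and one cycle per move, `spill_round_trip_cycles(4, 1)` reproduces the 8 cycles in the quote, and a three-deep chain of four-parameter functions costs 16 cycles in this model.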

Even if other registers are available, the GCC compiler allocates the extra parameters to the stack once the fixed set of four registers is full. This process wastes both cycles and RAM. It also results in code bloat due to the extra instructions required to save function parameters to the stack.

In contrast, with OCG compilation, said Stubbs, there is perfect knowledge of the register usage of each function. At any point in the program, the compiler knows which registers are available and which are not, and can optimize register usage without any arbitrary constraints.

“When there are two or three deep function calls, it allocates parameters for different functions into non-overlapping register sets, often eliminating the need to store parameters into memory completely,” he said.

“This results in better utilization of the available registers, fewer cycles wasted moving parameters between the stack and the registers, and less RAM usage. It also contributes to smaller code size by reducing or eliminating the need for code to save registers to the stack.”
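The idea of non-overlapping register sets can be illustrated with a toy allocator. This is a sketch under stated assumptions, not HI-TECH's algorithm, which the article does not describe: `assign_disjoint` is a hypothetical name, registers are modeled as a simple numbered pool, and each function in the call chain simply receives the next free block.

```c
#include <assert.h>

/* Give each function in a call chain its own block of registers.
   first_reg[i] receives the first register index used by function i,
   or -1 if its parameters did not fit in the pool.
   Returns the number of parameters left over for the stack. */
int assign_disjoint(const int *n_params, int n_funcs, int pool_size,
                    int *first_reg)
{
    int next = 0;     /* next free register in the pool */
    int spilled = 0;  /* parameters that must go to the stack */

    assert(n_funcs >= 0 && pool_size >= 0);
    for (int i = 0; i < n_funcs; i++) {
        if (next + n_params[i] <= pool_size) {
            first_reg[i] = next;
            next += n_params[i];
        } else {
            first_reg[i] = -1;
            spilled += n_params[i];
        }
    }
    return spilled;
}
```

For a three-deep chain whose functions take 4, 3 and 2 parameters, a pool of 12 registers leaves nothing for the stack in this model, while a pool of 8 forces the innermost function's 2 parameters to spill.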

With OCG, the HI-TECH C PRO compiler knows the register usage of every function in the entire program, including interrupts and any functions that are called by the interrupt code.

“It also knows exactly which registers need to be saved and restored for each interrupt routine. The OCG compiler saves only those registers that are necessary, reducing the size of the interrupt context switching code, and decreasing the number of cycles required to execute the interrupt routine.”
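One way to picture "saves only those registers that are necessary" is as a walk over the call graph rooted at the interrupt routine, collecting the registers each reachable function modifies. The sketch below assumes a hypothetical table format (`struct func_info`, at most 32 functions and 32 registers, callee indices that are valid table entries); it illustrates the general idea and is not taken from HI-TECH's compiler.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CALLEES 4

/* Hypothetical whole-program record for one function. */
struct func_info {
    uint32_t uses;             /* bit i set: function modifies register i */
    int callees[MAX_CALLEES];  /* indices into the table; -1 ends the list */
};

/* Depth-first walk that ORs together the registers used by f and
   everything it calls. visited guards against recursion. */
static uint32_t collect(const struct func_info *tab, int f, uint32_t *visited)
{
    uint32_t mask;

    if (*visited & (UINT32_C(1) << f))
        return 0;
    *visited |= UINT32_C(1) << f;

    mask = tab[f].uses;
    for (int i = 0; i < MAX_CALLEES && tab[f].callees[i] >= 0; i++)
        mask |= collect(tab, tab[f].callees[i], visited);
    return mask;
}

/* Registers an interrupt routine must save: only those touched by
   the routine itself or by a function reachable from it. */
uint32_t isr_save_mask(const struct func_info *tab, int n_funcs, int isr)
{
    uint32_t visited = 0;

    assert(n_funcs > 0 && n_funcs <= 32 && isr >= 0 && isr < n_funcs);
    return collect(tab, isr, &visited);
}

/* Number of registers in a mask (count of set bits). */
int count_regs(uint32_t mask)
{
    int n = 0;
    while (mask) {
        mask &= mask - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

A function that is never reached from the interrupt routine contributes nothing to the mask, which is the source of the savings: a compiler without whole-program knowledge has to assume a callee may modify any register the calling convention lets it use.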

Improving Memory Optimization.
Since the HI-TECH C PRO compiler knows the usage of every instance of every variable in the program, it can optimize the allocation of every variable between the stack and the registers. The optimization is based on the frequency of use of each variable.

Variables that are used intensively can be allocated permanently to registers, which have no cycle penalty at all. All register and stack allocations are always optimized to elicit the best overall performance for the entire program. This highly refined optimization of memory both boosts performance and minimizes power consumption by keeping frequently used data in locations that have the shortest access time.
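A frequency-based allocation of this kind can be sketched as a simple ranking. The article does not say how HI-TECH C PRO weighs variables, so the following is an assumption for illustration only: `struct var_info`, `allocate_by_frequency` and the use counts are hypothetical, and the policy is simply to give the available registers to the most frequently used variables.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-variable record gathered across the program. */
struct var_info {
    const char *name;
    unsigned use_count;  /* how often the variable is accessed */
    int in_register;     /* 1 = kept in a register, 0 = on the stack */
};

/* qsort comparator: most frequently used variables first. */
static int by_use_desc(const void *a, const void *b)
{
    const struct var_info *x = a;
    const struct var_info *y = b;
    return (x->use_count < y->use_count) - (x->use_count > y->use_count);
}

/* Sort variables by use count and give the n_regs busiest ones a
   register; the rest stay on the stack. */
void allocate_by_frequency(struct var_info *vars, size_t n, size_t n_regs)
{
    if (n == 0)
        return;
    assert(vars != NULL);
    qsort(vars, n, sizeof vars[0], by_use_desc);
    for (size_t i = 0; i < n; i++)
        vars[i].in_register = (i < n_regs);
}
```

Given four variables with use counts 2, 90, 15 and 40 and two free registers, this sketch places the variables used 90 and 40 times in registers and leaves the other two on the stack.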

HI-TECH C PRO for the PIC32 MCU Family is available now through September 30, 2008 for the introductory price of US$1595, after which it will sell for $1995. A fully functional 45-day trial version can be downloaded, free of charge, at HI-TECH’s website.
