Speeding up flash-based embedded applications - Embedded.com

Most modern embedded software applications are stored and executed from flash memory. Flash provides an inexpensive and fast storage medium for microcontroller-based applications. These applications though are often real-time applications where execution time and deterministic behavior are paramount. While flash memory is fast, it’s not as fast as executing code from RAM. In order to speed up the execution time of flash-based applications, developers can selectively choose critical functions and execute them from RAM to get an extra speed boost.

There are typically three steps that a developer needs to follow in order to execute a function from RAM. These include:

  1. Creating a RAM region in the linker for functions
  2. Specifying which functions should be stored in RAM
  3. Copying the functions into RAM at start-up.

Let’s examine this process in detail.

Step #1 – Creating a RAM region in the linker for functions

Each toolchain will have a different syntax for defining memory regions within a microcontroller. For today’s example, I’m going to use the Eclipse-based Code Composer Studio and the linker command file syntax that is used with the Texas Instruments C2000 family because I think it provides a good example.

When we modify the linker file to include functions that will be executed from RAM, we need to create a section that specifies two things: where the functions are loaded from (their location in flash) and where they run (their location in RAM).

The linker file will contain a sections region that specifies important program allocations such as:

  • cinit
  • .text
  • codestart
  • stack
  • constants
  • etc

The developer needs to create a region for their RAM functions. This can be done using something like the following:

ramfuncs            : LOAD = FLASHA,
                      RUN = RAML0,
                      LOAD_START(_RamfuncsLoadStart),
                      LOAD_END(_RamfuncsLoadEnd),
                      RUN_START(_RamfuncsRunStart),
                      LOAD_SIZE(_RamfuncsLoadSize),
                      PAGE = 0

As you can see, this creates a section named ramfuncs. The section is loaded from the FLASHA sector and is specified to run from the RAM region RAML0. The remaining entries define symbols that record where the section starts and ends in flash, where it starts in RAM, and how large it is. These values will be important in Step #3.

Step #2 – Specifying which functions should be stored in RAM

Once we have a RAM section created in the linker to store our functions, we need to tell the toolchain which functions should reside there. The method most often used to do this is a #pragma. In general, we should try to avoid using #pragma in our code because pragmas are compiler-dependent. This means that a developer will most likely have to modify the #pragma line if the compiler is changed. For our purposes today, this is okay, since a new compiler would require a new linker file anyway and we would need to figure out its syntax for placing a function in a memory region.

Functions that access and control flash memory are commonly executed from RAM. The reason is that when we want to write or erase flash, most microcontrollers won’t let you execute code from that same flash at the same time! So, we need to put these functions into RAM anyway. We could put a function such as Flash_Init into our RAM section using code similar to the following:

#pragma CODE_SECTION(Flash_Init, "ramfuncs");

You can see from this statement that we are using the compiler-specific pragma CODE_SECTION to specify that the function Flash_Init should be placed into the ramfuncs section defined in the linker file. This statement would commonly be placed directly above the function definition as a reminder to any developer working on the function that it will be placed into RAM. (This also makes it easier to find if we later decide that the function does not need to be placed in RAM.)
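For projects that might move between toolchains, one option is to hide the placement directive behind a small macro. The following is a minimal sketch rather than TI's reference code: the RAMFUNC macro, the flash_ready flag, the Flash_IsReady helper, and the body of Flash_Init are all hypothetical, and the GCC branch assumes a toolchain that accepts the section attribute together with a linker script that places a matching ramfuncs section.

```c
#include <stdbool.h>

/* Hypothetical portability wrapper around compiler-specific placement. */
#if defined(__TI_COMPILER_VERSION__)
    /* TI compilers: in C, the pragma names the function and must appear
     * before the function is declared, defined, or referenced. */
    #pragma CODE_SECTION(Flash_Init, "ramfuncs")
    #define RAMFUNC
#elif defined(__GNUC__) && !defined(__APPLE__)
    /* GCC-style toolchains: attach the section name to the function. */
    #define RAMFUNC __attribute__((section("ramfuncs")))
#else
    /* Unknown toolchain: no special placement, the function stays in flash. */
    #define RAMFUNC
#endif

/* Hypothetical flag, present only to make this sketch observable. */
static bool flash_ready = false;

/* Placeholder body; a real Flash_Init would configure the flash module. */
RAMFUNC void Flash_Init(void)
{
    flash_ready = true;
}

/* Hypothetical accessor for the flag above. */
bool Flash_IsReady(void)
{
    return flash_ready;
}
```

With this arrangement, only the macro block needs to change when the compiler changes, and the RAMFUNC marker on the definition still tells the reader that the function runs from RAM.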

Step #3 – Copying the functions into RAM at start-up

The final step in the process is to make sure that, during microcontroller start-up, the functions we want to execute from RAM are actually copied into RAM. The easiest way to do this is to use memcpy. I usually perform this copy shortly after configuring the system clock and interrupt vector table but before I initialize onboard peripherals and application code. I mentioned in Step #1 that we defined several symbols that would come in handy later. These were RamfuncsRunStart, RamfuncsLoadStart and RamfuncsLoadSize (the linker file spells them with a leading underscore, which this toolchain adds to C identifiers). We will use these with memcpy to copy the functions into RAM using the following statement:

/* Copy time critical code and Flash setup code to RAM.
 * The RamfuncsLoadStart, RamfuncsLoadSize, and RamfuncsRunStart
 * symbols are created by the linker. Refer to the project.cmd file.
 */
memcpy(&RamfuncsRunStart, &RamfuncsLoadStart, (Uint32)&RamfuncsLoadSize);

It’s that simple. Once this is done, a developer simply calls the function like they normally would, and the function is executed in RAM.
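If the copy is skipped or fails, the run address holds whatever happened to be in RAM, so it can be worth checking the result before any RAM function is called. The sketch below illustrates one way to do that under stated assumptions: copy_ram_functions is a hypothetical helper and not part of TI's support files, and the commented usage assumes the linker symbols are declared as Uint16 objects in the project's headers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a code image from its load address (flash)
 * to its run address (RAM) and confirm the two regions now match.
 * Returns 0 on success, -1 on a NULL argument or a mismatch. */
int copy_ram_functions(void *run_start, const void *load_start, size_t size)
{
    if (run_start == NULL || load_start == NULL) {
        return -1;
    }
    memcpy(run_start, load_start, size);
    return (memcmp(run_start, load_start, size) == 0) ? 0 : -1;
}

/* On the target, the call would use the linker-generated symbols. Their
 * addresses, not their contents, carry the information, for example:
 *
 *   extern Uint16 RamfuncsLoadStart, RamfuncsRunStart, RamfuncsLoadSize;
 *
 *   if (copy_ram_functions(&RamfuncsRunStart, &RamfuncsLoadStart,
 *                          (size_t)&RamfuncsLoadSize) != 0) {
 *       // handle the error before calling any RAM function
 *   }
 */
```

The comparison costs a second pass over the copied region at start-up, which is usually negligible next to the cost of jumping into RAM that was never filled.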


When a developer is executing their application code from flash, they can speed up critical sections of their code by copying those functions into RAM. Executing a function from RAM will increase the execution speed by removing any wait states that may be associated with accessing and loading instructions from the flash memory. This extra boost can ensure that critical functions are able to execute at the fastest speed possible. As we have seen, loading functions into RAM and executing them is straightforward and simple (once you have done it once or twice).

Jacob Beningo is an embedded software consultant, advisor and educator who currently works with clients in more than a dozen countries to dramatically transform their software, systems and processes. Feel free to contact him at jacob@beningo.com or through his website www.beningo.com, and to sign up for his monthly Embedded Bytes Newsletter.

2 thoughts on “Speeding up flash-based embedded applications”

  1. “There are several different ways you can measure the difference.
     1) If you are using an RTOS, you could use Tracealyzer, System View, etc.
     2) If you are running bare-metal, the tried and true toggle an I/O pin before you call the function and after. Do
