A step-by-step process for achieving MPU security

January 30, 2018

Ralph Moore

Most Cortex-M MCUs, both in the field and in development, have Memory Protection Units (MPUs). However, because of tight product schedules and the difficulty of using the Cortex-M MPU, these MPUs are either under-used or not used at all. An additional impediment to adoption in systems with limited memory has been the apparent large waste of memory caused by the requirements that MPU regions be powers of two in size and be aligned on size boundaries.

Yet for these MCUs, the MPU and the SVC instruction are the only means of achieving acceptable security. Therefore, I set out a year and a half ago to determine whether the problems with the MPU could be overcome and whether it was possible to devise a practical way to upgrade post- and late-development projects, as well as new projects, to use MPU security. I have found that it is practical to do this, and MPU-Plus has been developed to ease the process.

Part 1 introduced the key elements of this porting process. You should read that article before continuing with this one, which describes the step-by-step process for completing the conversion.

Step-by-Step MPU Security Conversion

Start by identifying the most untrusted or vulnerable task or partition that you wish to isolate from the rest of the system. This might be a networking partition or third-party SOUP[1]. It could be the site of a recent hack where it has been decided that it is easier to isolate the vulnerable code than to fix it. We recommend an incremental approach to improving system security. Significant gains can be made by isolating one bad actor at a time. However, this approach also works for porting multiple partitions at once, if necessary or desirable. Figure 4 illustrates the conversion process:

Figure 4: Step-By-Step Conversion Process (Source: Micro Digital)


1. Start

To start, put a call to sb_MPUInit() near the beginning of the startup code. This turns on the MPU and enables its background region. Your application should run normally. Note: disable loading sys_code into MPU[5][2] and sys_data into MPU[6] in sb_MPUInit(), since these regions have not yet been defined.

2. System Regions

Next, define .sys_code and .sys_data sections. sys_code should contain all handler and ISR shell[3] code. If an ISR does not use a shell, then the ISR itself must be included. This is done as in the following examples, first for assembly code:

   smx_MPU_BR_ON    ; turns on MPU background region
   …                ; Handler code
   smx_MPU_BR_OFF   ; turns off MPU background region
   cpsid   f
   pop     {pc}

then for C code:

#pragma default_function_attributes = @ ".sys_code"
void sb_OS_ISR0(void)  /* ISR shell */
{
   smx_ISR_ENTER();  /* turns on MPU background region */
   /* ISR body or call ISR here */
   smx_ISR_EXIT();   /* turns off MPU background region */
}
#pragma default_function_attributes =

sys_data contains the System Stack (SS)[4]. The BR macros can be eliminated wherever sys_code and sys_data are all that is needed for an ISR or handler to operate.

Then in the linker command file:

define exported symbol scsz = 0x1000;
define exported symbol sdsz = 0x400;
define block sys_code  with size = 0x1000, alignment = scsz
                                           {ro section .sys_code};
define block sys_data  with size = 0x400,  alignment = sdsz
                                           {block CSTACK, block EVT};

Of course, actual sizes depend upon the application. They should be the next power of two that is large enough. The alignment must equal the size. Now re-enable loading sys_code into MPU[5] and sys_data into MPU[6] in sb_MPUInit(). These are permanent regions that are present for every ptask as shown in Figure 2. They may also be present for utasks, but are not accessible in umode because they are privileged regions.
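
The size and alignment rule is easy to check mechanically. The following is a minimal sketch in C and is not part of MPU-Plus; the helper names next_pow2() and is_size_aligned() are hypothetical, and the 32-byte floor assumes the ARMv7-M minimum region size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: smallest power of two that is >= n, with a floor
   of 32 bytes (the smallest region an ARMv7-M MPU supports).
   Returns 0 if no 32-bit power of two is large enough. */
uint32_t next_pow2(uint32_t n)
{
    uint32_t size = 32u;
    while (size < n) {
        if (size == 0x80000000u)
            return 0u;          /* cannot grow any further in 32 bits */
        size <<= 1;
    }
    return size;
}

/* Hypothetical helper: true if size is a power of two and base is
   aligned on a size boundary, as an MPU region requires. */
bool is_size_aligned(uint32_t base, uint32_t size)
{
    return size != 0u &&
           (size & (size - 1u)) == 0u &&   /* power of two */
           (base & (size - 1u)) == 0u;     /* base aligned on size */
}
```

For instance, if the linker map showed 0xC40 bytes of handler code (an invented figure), next_pow2(0xC40) would give the 0x1000 used for scsz above, and a block placed at 0x20000400 would fail the alignment check for that size.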

3. Super Regions

The next step is to define super regions for SRAM, ROM, DRAM, other memories, and I/O areas in your system. These serve as temporary replacements for BR until partition- or task-specific regions are defined. Consult the linker map to determine the starting address and how much memory is being used in each memory. Then pick the next larger power of two for the size. The following template is an example:

const MPA mpa_tmplt_sr =
{
  {0x20000000 | V | 0, PRW_DATA | N67 | (0x11 << 1) | EN}, /* SRAM in use */
  {0x00200000 | V | 1, PCODE    | N57 | (0x11 << 1) | EN}, /* ROM in use */
  {0xC0000000 | V | 2, PRW_DATA       | (0x10 << 1) | EN}, /* RAM in use */
  {0x40040000 | V | 3, PRW_DATA       | (0x11 << 1) | EN}, /* Synopsys HS */
  {0x40011000 | V | 4, PRW_DATA       | (0x09 << 1) | EN}  /* UART1 */
};
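
The shifted constants in the template appear to follow the ARMv7-M region attribute layout, in which a size field N occupies bits 5:1 and selects a region of 2^(N+1) bytes. Under that assumption, 0x11 gives 256 KB, 0x10 gives 128 KB, and 0x09 gives 1 KB. The sketch below converts in both directions; the helper names are hypothetical and are not MPU-Plus functions.

```c
#include <stdint.h>

/* Hypothetical helper: size field N for a region of size_bytes, where
   size_bytes = 2^(N+1). Returns 0 if size_bytes is not a power of two
   or is below the 32-byte architectural minimum (0 is never a valid N). */
uint32_t mpu_size_field(uint32_t size_bytes)
{
    uint32_t n = 0u;

    if (size_bytes < 32u || (size_bytes & (size_bytes - 1u)) != 0u)
        return 0u;
    while ((size_bytes >> (n + 1u)) != 1u)
        n++;
    return n;
}

/* Hypothetical helper: region size in bytes for a size field N (N <= 31).
   64-bit result because N = 31 means a 4 GB region. */
uint64_t mpu_region_bytes(uint32_t size_field)
{
    return (uint64_t)1 << (size_field + 1u);
}
```

After reading the memory usage from the linker map and rounding it up to a power of two, mpu_size_field() gives the value to place in the (N << 1) position of a template entry.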

Super regions encompass all other regions for each memory or I/O area. Hence it is simpler to use physical addresses and sizes, as shown above, rather than complicating the linker command file by defining super blocks in it. Load this template into the Memory Protection Array (MPA) after creation, for each task being converted. For example:

   smx_Idle = smx_TaskCreate(ainit, PRI_SYS, 500, SMX_FL_LOCK, "idle");
   smx_TaskSet(smx_Idle, SMX_ST_MPA, (u32)&mpa_tmplt_sr);

When a task’s MPA is loaded, its mpav flag is set. Tasks with mpav set run with BR off; all other tasks run with BR on. This allows working with one task at a time. It also allows tasks that are intended to stay in pmode to be left alone.
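
The effect of the mpav flag can be pictured with a small sketch. How smx actually switches BR is not shown in this article; the code below assumes that BR corresponds to the PRIVDEFENA bit of the ARMv7-M MPU_CTRL register, and task_t and mpu_ctrl_for_task() are hypothetical stand-ins, not smx types or functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M MPU_CTRL bits */
#define MPU_CTRL_ENABLE     (1u << 0)  /* MPU on */
#define MPU_CTRL_PRIVDEFENA (1u << 2)  /* background region for privileged code */

/* Hypothetical stand-in for a task control block; only mpav matters here. */
typedef struct {
    const char *name;
    bool        mpav;   /* true once an MPA template has been loaded */
} task_t;

/* Hypothetical: MPU_CTRL value to apply when switching to task t.
   Tasks with mpav set run with BR off; all others keep BR on. */
uint32_t mpu_ctrl_for_task(const task_t *t)
{
    uint32_t ctrl = MPU_CTRL_ENABLE;

    if (!t->mpav)
        ctrl |= MPU_CTRL_PRIVDEFENA;
    return ctrl;
}
```

With this model, a converted task sees only the regions in its MPA, while an unconverted task keeps the default memory map it had before the MPU was turned on.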

Now run the system. The targeted task is likely to get Memory Manage Faults (MMFs). This indicates that it needs access to other things, such as functions, static variables, or peripherals. Dealing with this problem may require enlarging a region more than one would like. However, this is the preferable approach at this time if code changes can be avoided. (Code changes are best left for future passes, when more is known about what is needed.)

For each task running with BR off, a security gain has just been made: handlers, ISRs, and other tasks are running as they were before, but this task is running with reduced memory regions and these regions have strictly controlled attributes (RO, XN, etc.). It is quite possible that latent bugs will start to show up and be fixed – especially if the SOUP is thick and the comments are thin.

4. Pmode Operation

The next step is pmode operation. For simplicity, we will assume that a single task, taskA, is being isolated. Tasks processed in this step must be running in super-regions with BR off. Hence, the sys_code and sys_data regions are required in MPU[5] and MPU[6] to handle exceptions.

The first step is to group code and data into task-specific regions and to define blocks in the linker command file to hold these regions. It is convenient to name them after the task (e.g., taskA_code and taskA_data) or after the partition (e.g., usbh_code and usbh_data).

Next, define common code and data regions to hold RTOS and other system services and to hold common data needed by them. We have named these pcom_code and pcom_data, respectively. At this point, taskA is a ptask, so pcom_code needs to include the RTOS and other system services needed by taskA, and pcom_data needs to include the data needed by these services.

Then, create mpu_tmplt_taskA and add code to load it into the MPA for taskA, as shown in step 3. At this point the mpa_tmplt_sr has been replaced by mpu_tmplt_taskA for this task. taskA is standing alone and is partially isolated from all other tasks. Will it run? This is where “the tire meets the road”. MMFs from taskA are likely to be due to references outside of its regions or due to attribute violations (e.g. writing to ROM). The former indicates that the task or partition needs access to other code or other data than expected.
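
When sorting out MMFs at this stage, it can help to check a faulting address against the task's regions offline. The following is a rough host-side sketch, not an MPU-Plus facility: region_t, classify_access(), and the enum names are hypothetical, every region is assumed readable, and BR is assumed off. It relies on the ARMv7-M rule that, where regions overlap, the highest-numbered region decides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ACC_READ, ACC_WRITE, ACC_EXEC } access_t;
typedef enum { MMF_NONE, MMF_NO_REGION, MMF_ATTRIBUTE } mmf_cause_t;

/* Hypothetical, simplified view of one MPU slot. */
typedef struct {
    uint32_t base;        /* aligned on size */
    uint32_t size;        /* power of two; 0 marks an unused slot */
    bool     writable;    /* false models RO */
    bool     executable;  /* false models XN */
} region_t;

/* Hypothetical: classify an access made with BR off. */
mmf_cause_t classify_access(const region_t *r, size_t count,
                            uint32_t addr, access_t acc)
{
    /* Scan from the highest slot down: the highest-numbered match wins. */
    for (size_t i = count; i-- > 0; ) {
        /* Unsigned wrap-around makes this a single range check. */
        if (addr - r[i].base < r[i].size) {
            if (acc == ACC_WRITE && !r[i].writable)
                return MMF_ATTRIBUTE;
            if (acc == ACC_EXEC && !r[i].executable)
                return MMF_ATTRIBUTE;
            return MMF_NONE;
        }
    }
    return MMF_NO_REGION;   /* reference outside all of the task's regions */
}
```

A result of MMF_NO_REGION points to a missing or undersized region, while MMF_ATTRIBUTE points to a case such as writing to ROM or executing from a data region.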
