POSIX IPC on Cortex-M architectures - Embedded.com

POSIX IPC on Cortex-M architectures

Tyler Gilbert of CoActionOS explains how to implement the system components required to send and receive POSIX-style signals (a form of IPC) between concurrently running tasks on the Cortex-M architecture, and specifically on the Cortex-M3 (CM3).

The ARM Cortex-M architecture is a 32-bit microcontroller core designed to replace many 8-bit and 16-bit devices. This is in contrast to the ARM Cortex-A architecture, an application processor that typically runs Linux, iOS, or Windows 8 (at least in development environments).

The M-series runs lesser-known, embedded operating systems such as FreeRTOS, ThreadX, and my favorite (of course) CoActionOS. These operating systems use the advanced features of the Cortex-M architecture to implement system functionality such as context switching and inter-process communication (IPC).

This article explains how to implement the system components required to send and receive POSIX-style signals – a form of IPC – between concurrently running tasks on the Cortex-M architecture, and deals specifically with the Cortex-M3 (CM3).

Processor interrupt vs. task interrupt
A processor interrupt, when talking in general about interrupts on microcontrollers, is usually triggered by peripheral circuitry such as a timer or input/output port, and causes the processor to suspend its current activities and execute an interrupt service routine (ISR). On the CM3, the Nested Vectored Interrupt Controller (NVIC) handles processor interrupts. By contrast, a task interrupt is when a function is placed on a dormant task’s stack and executed when the task resumes.

For purposes of this discussion, a 'task' is defined as an independent execution context (either a process or thread) that is preemptively managed by a scheduling algorithm. An example of a task interrupt in a POSIX-style operating system is when a task calls the sigqueue() or kill() function. The operating system inserts a function on the stack of the task receiving the signal. When the scheduler resumes execution of the receiving task, the signal handling function is executed before the task resumes.
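As a point of reference, the following is a minimal sketch of what a POSIX-style signal looks like from the application side on a desktop POSIX system such as Linux (it is not CM3 code). The names on_signal(), send_signal_to_self(), and last_value are illustrative only and do not come from CoActionOS. For simplicity the process signals itself; in a single-threaded process with the signal unblocked, POSIX requires the signal to be delivered before sigqueue() returns, so the handler has already run when the function reads last_value.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Value carried by the most recent signal; written only by the handler. */
static volatile sig_atomic_t last_value = -1;

/* Runs in the receiving task's own context, much like a task interrupt. */
static void on_signal(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    last_value = info->si_value.sival_int;
}

/* Install the handler, queue SIGUSR1 to this process, and return the
 * value the handler observed. Returns -1 if a system call fails. */
int send_signal_to_self(int value)
{
    struct sigaction sa;
    union sigval sv;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) {
        return -1;
    }

    sv.sival_int = value;
    if (sigqueue(getpid(), SIGUSR1, sv) != 0) {
        return -1;
    }
    return (int)last_value;
}
```

The rest of this article is about what the operating system has to do underneath sigqueue() so that a function like on_signal() ends up running in the receiving task.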

On the CM3, a processor interrupt has two attributes that make task interrupts useful. First, the CM3’s NVIC always executes ISRs in handler mode rather than process mode. Second, ISR execution is prioritized over normal execution. Using a task interrupt allows application developers to execute the response to an event in process mode and at the same priority level as the task that is responding to the event.

CM3 system support hardware and context switching
Note: To implement task interrupts on the CM3, knowledge of advanced CM3 features as well as context switching is required.

Advanced features of the CM3 include two execution modes and two stack pointers. The execution modes are handler mode and process mode (ARM's documentation calls the latter Thread mode), while the stack pointers are the main stack pointer (MSP) and the process stack pointer (PSP). ISRs always execute in handler mode, which is privileged and uses the MSP. Process mode can be configured to use either the MSP or the PSP, and to be either privileged or unprivileged. Unprivileged code has limited access to critical system registers.

Additionally, the memory protection unit (MPU) can be configured to allow privileged software to have different permissions than unprivileged code. Code executing in process mode must use the SVCall interrupt to switch to handler mode; this allows application developers access to kernel features such as context switching.
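The mode, privilege, and stack-pointer rules described above can be summarized in a few lines of C. This is only a host-side sketch: decode_exec_state() and exec_state_t are hypothetical names, and the function takes the CONTROL and IPSR register values as plain integers rather than reading them from hardware. The bit positions (nPRIV in bit 0 and SPSEL in bit 1 of CONTROL) follow ARM's register description.

```c
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_NPRIV (1u << 0) /* 1 = process-mode code is unprivileged */
#define CONTROL_SPSEL (1u << 1) /* 1 = process-mode code uses the PSP */

typedef struct {
    bool privileged;
    bool uses_psp;
} exec_state_t;

/* control: value of the CONTROL register.
 * ipsr:    active exception number (0 means no exception is active). */
exec_state_t decode_exec_state(uint32_t control, uint32_t ipsr)
{
    exec_state_t state;

    if (ipsr != 0) {
        /* Handler mode: always privileged and always on the MSP. */
        state.privileged = true;
        state.uses_psp = false;
    } else {
        /* Process mode: both properties are configurable. */
        state.privileged = (control & CONTROL_NPRIV) == 0;
        state.uses_psp = (control & CONTROL_SPSEL) != 0;
    }
    return state;
}
```

With CONTROL set to 3 and no active exception, this reports unprivileged execution on the PSP, which is the configuration the code later in this article assumes.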

The CM3 architecture provides two processor interrupts that facilitate context switching: the PendSV and SysTick interrupts. The PendSV interrupt can be triggered by software while the SysTick interrupt is triggered when the SysTick timer counts down to zero. One systems design approach is to use PendSV for FIFO algorithms and SysTick for round robin scheduling.

When an interrupt is handled, the NVIC hardware immediately pushes the following registers (referred to in the figures as the 'hardware stack frame'): r0, r1, r2, r3, r12, LR, PC, and xPSR. By convention, only these registers need to be saved and restored when handling interrupts. For context switching, however, all registers must be saved and restored.

When the NVIC starts execution of the context switcher, the context switcher pushes the remainder of the registers (the 'software stack frame' in the figures): r4, r5, r6, r7, r8, r9, r10, and r11. The context switcher then decides which task to execute next, updates the process stack pointer (PSP), and pops the software stack frame for the new task. When the context switcher exits, the ISR returns, the NVIC pops the hardware stack frame, and execution of the next task resumes. The chronology of a context switch is illustrated in Figure 1 below.



Figure 1: Chronology of a Context Switch

(More details on context switching on the CM3 can be found in this article: “Taking advantage of the Cortex M3's pre-emptive context switches” .)
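The step in which the context switcher "decides which task to execute next" and "updates the PSP" might look roughly like the sketch below. It is a host-side model under stated assumptions, not the CoActionOS scheduler: demo_tasks, demo_current, demo_select_next(), and demo_context_switch() are hypothetical names, the selection policy is plain round robin, and the pushing and popping of the software stack frame (which needs assembly on real hardware) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TASK_COUNT 4

typedef struct {
    uint32_t sp;   /* saved PSP; points at the task's software stack frame */
    bool runnable;
} demo_task_t;

static demo_task_t demo_tasks[DEMO_TASK_COUNT];
static int demo_current = 0;

/* Round robin: the first runnable task after the current one, wrapping
 * around. Falls back to the current task if nothing else can run. */
static int demo_select_next(void)
{
    for (int step = 1; step <= DEMO_TASK_COUNT; step++) {
        int candidate = (demo_current + step) % DEMO_TASK_COUNT;
        if (demo_tasks[candidate].runnable) {
            return candidate;
        }
    }
    return demo_current;
}

/* saved_psp: the outgoing task's PSP after its software frame was pushed.
 * Returns the PSP to load before popping the incoming task's software
 * frame. */
uint32_t demo_context_switch(uint32_t saved_psp)
{
    demo_tasks[demo_current].sp = saved_psp;
    demo_current = demo_select_next();
    return demo_tasks[demo_current].sp;
}
```

Because the only per-task state the switcher keeps here is a saved stack pointer, anything placed on a dormant task's stack is picked up automatically the next time that task is selected. That property is what the next section relies on.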

Implementing Task Interrupts
Implementing a task interrupt on the CM3 architecture involves manipulating the stack of the receiving task (Figure 2). This method assumes the sending task is different than the receiving task. If the sending and receiving tasks are the same, the sending task can simply make a direct call to the task interrupt handling function.



Figure 2: Task Interrupt Stack Timeline

Figure 2 shows a timeline for inserting a task interrupt handler on the receiving task as well as how the stack is restored after the handler returns.

The NVIC and context switcher insert the hardware and software stack frames when switching from the receiving task to some other task.

The sending task moves the receiving task’s stack pointer down to make room for a new hardware stack frame and an uninitialized software stack frame.

When the receiving task resumes, the context switcher pops the new software stack frame, and the NVIC pops the hardware stack frame causing the task interrupt handling function to be executed.

When the task interrupt handling function exits, an interrupt is forced, and the stack is adjusted such that the original hardware as well as software stack frames are restored.

When the forced interrupt exits, the NVIC uses the original hardware stack frame to restore execution of the task to its pre-task-interrupt state.

The sending task adjusts the receiving task’s stack and initializes the new hardware stack frame; the new software stack frame is uninitialized by convention. The registers in the hardware stack frame are crucial to the execution of and return from the task interrupt:

r0-r3 : These are the parameters (a, b, c, and d respectively; see task_interrupt_handler() below) passed to the task interrupt-handling routine.
pc : This is the location of the task interrupt-handling routine.
lr : This is the location of the return routine which restores normal task execution.
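The stack arithmetic in the timeline above can be checked on a desktop machine with a small simulation. The sketch below is not target code: the receiving task's stack is modeled as an array of words, the stack pointer is an index into that array, and the handler and return addresses are plain numbers. All sim_ names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t r0, r1, r2, r3, r12, lr, pc, psr; } sim_hw_frame_t;
typedef struct { uint32_t r4, r5, r6, r7, r8, r9, r10, r11; } sim_sw_frame_t;

#define SIM_STACK_WORDS 64
#define SIM_HW_WORDS (sizeof(sim_hw_frame_t) / sizeof(uint32_t))
#define SIM_SW_WORDS (sizeof(sim_sw_frame_t) / sizeof(uint32_t))

/* Simulated stack of the receiving task; it grows toward index 0. */
static uint32_t sim_stack[SIM_STACK_WORDS];

/* Sending task: place a new hardware frame directly below the saved stack
 * pointer and reserve an uninitialized software frame below that.
 * Returns the receiving task's new saved stack pointer. */
size_t sim_insert_task_interrupt(size_t sp, uint32_t handler_pc,
                                 uint32_t return_lr, const uint32_t arg[4])
{
    sim_hw_frame_t frame = {
        arg[0], arg[1], arg[2], arg[3], /* r0-r3: handler parameters */
        0,                              /* r12 */
        return_lr,                      /* lr: the restore routine */
        handler_pc,                     /* pc: the handler */
        0x21000000u                     /* psr: default value */
    };

    assert(sp >= SIM_HW_WORDS + SIM_SW_WORDS && sp <= SIM_STACK_WORDS);
    memcpy(&sim_stack[sp - SIM_HW_WORDS], &frame, sizeof frame);
    return sp - SIM_HW_WORDS - SIM_SW_WORDS;
}

/* Receiving task resumes: the context switcher pops the (uninitialized)
 * software frame, then the exception return pops the hardware frame. */
sim_hw_frame_t sim_resume(size_t *sp)
{
    sim_hw_frame_t frame;

    assert(*sp + SIM_SW_WORDS + SIM_HW_WORDS <= SIM_STACK_WORDS);
    *sp += SIM_SW_WORDS;
    memcpy(&frame, &sim_stack[*sp], sizeof frame);
    *sp += SIM_HW_WORDS;
    return frame;
}
```

After sim_insert_task_interrupt() followed by sim_resume(), the simulated stack pointer is back at its original value, so the original software and hardware stack frames are still intact above it while the handler runs.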


The following code is an example of how to interrupt a task as well as how to restore task execution after the interrupt-handling function returns. This code assumes process mode on the CM3 is unprivileged and using the PSP.

#include <stdint.h>

void task_interrupt_handler(int a, int b, int c, int d);

//Registers pushed by the NVIC on exception entry, lowest address first
typedef struct {
     uint32_t r0;
     uint32_t r1;
     uint32_t r2;
     uint32_t r3;
     uint32_t r12;
     uint32_t lr;
     uint32_t pc;
     uint32_t psr;
} hw_stack_frame_t;

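void task_restore(void);

//Must be executed in handler mode. task_table[] and sw_stack_frame_t belong to the context switcher (not shown);
//task_table[].sp is assumed to hold the task's saved PSP as a byte address.
int priv_task_interrupt_call(int task_id, int arg[4]){
     hw_stack_frame_t * hw_frame;

     //the new hardware stack frame sits directly below the receiving task's saved stack pointer
     hw_frame = (hw_stack_frame_t *)(task_table[task_id].sp - sizeof(hw_stack_frame_t));

     //make room for the new hardware stack frame plus an uninitialized software stack frame
     task_table[task_id].sp = task_table[task_id].sp - (sizeof(hw_stack_frame_t) + sizeof(sw_stack_frame_t));

     hw_frame->r0 = arg[0];
     hw_frame->r1 = arg[1];
     hw_frame->r2 = arg[2];
     hw_frame->r3 = arg[3];
     hw_frame->r12 = 0;
     hw_frame->pc = (uint32_t)task_interrupt_handler;
     hw_frame->lr = (uint32_t)task_restore;
     hw_frame->psr = 0x21000000; //default PSR value

     return 0;
}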

static void priv_task_restore(void * args){
     uint32_t pstack;

     //discard the current HW stack frame by adjusting the PSP up by sizeof(hw_stack_frame_t) (sw_stack_frame_t is the same size)
     pstack = __get_PSP();
     __set_PSP(pstack + 4 + sizeof(hw_stack_frame_t)); //the extra 4 bytes are added by the handler

     //Load the software context that is on the stack from the pre-interrupted task
     task_load_context();
     //This function will now return to the original execution stack
}

void task_restore(void){
     //handlers inserted with task_interrupt() must call this function when the task completes in order to restore the stack
     core_priv_call(priv_task_restore, NULL);
}

The task_interrupt_handler() function is the prototype for the task interrupt-handling routine. This can be any function that the systems developer wants it to be.

The priv_task_interrupt_call() function adjusts the target task’s stack pointer, which is located in the global task table (task_table[task_id].sp), by the size of the hardware stack frame plus the size of the software stack frame, and initializes the new hardware stack frame.

The task_restore() function is executed when task_interrupt_handler() returns. It executes priv_task_restore() in handler mode by using the SVCall processor interrupt; in a similar manner, priv_task_interrupt_call() must be executed in handler mode.

Restoring the task context includes reading and adjusting the PSP by the size of the hardware stack frame as well as by a compiler/implementation-specific value, in this case four (4). The task_load_context() function uses inline assembly to pop the original software stack frame from the PSP. When priv_task_restore() returns from the processor interrupt, the NVIC uses the hardware stack frame that existed before the task interrupt handler was inserted, and normal execution of the task resumes.

Using the SVCall
The above code uses core_priv_call() to synchronously enter handler mode using the SVCall. On the CM3, the SVCall is designed to allow the application programmer to access system calls. There are several ways to implement the SVCall. One method, suggested by ARM in the Cortex-M3 user guide, is to extract the immediate value of the SVC instruction using the stacked program counter (PC).

Another method is to use the stacked register values to pass parameters to the SVCall handler routine. The former method has two disadvantages: 1) two levels of indirection are required to get the immediate value (look up the stacked PC, then look up the actual immediate value), and 2) the immediate value is only eight bits wide, which limits the number of system calls without using additional resources.

The following code (compiled using GCC) uses the latter method by utilizing the stacked r0 and r1 values to execute any function with a single argument in handler mode.
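For comparison, the former method (which this article does not use) can be sketched as follows. This is a host-side illustration with hypothetical names: svc_immediate() receives the stacked PC as a pointer to 16-bit Thumb instructions, steps back one halfword to the SVC instruction, and masks off its low eight bits. The constant reflects the Thumb encoding of SVC, which is 0xDF00 combined with the 8-bit immediate.

```c
#include <stdint.h>

/* Thumb encoding of "SVC #imm8": 0xDF00 | imm8 */
#define THUMB_SVC_OPCODE 0xDF00u

/* stacked_pc: the return address read from the hardware stack frame (the
 * first look-up). It points at the instruction after the SVC.
 * Returns the immediate (0-255), or -1 if the preceding instruction is
 * not an SVC. */
int svc_immediate(const uint16_t *stacked_pc)
{
    uint16_t instruction = stacked_pc[-1]; /* the second look-up */

    if ((instruction & 0xFF00u) != THUMB_SVC_OPCODE) {
        return -1;
    }
    return (int)(instruction & 0x00FFu);
}
```

Both disadvantages are visible here: the handler has to read the stack and then read code memory, and the result can never exceed 255.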

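//Type of a function executed in handler mode (signature inferred from how call is used below)
typedef void (*core_priv_call_t)(void * args);

void core_priv_call(core_priv_call_t call, void * args) __attribute__((optimize("1")));
void core_priv_call(core_priv_call_t call, void * args){
     //call and args are not referenced here; they are passed to the handler in r0 and r1
     asm volatile("SVC 0\n");
}

void core_svcall_handler(void){
     register uint32_t * frame;
     register core_priv_call_t call;
     register void * args;

     //the caller was using the PSP, so the hardware stack frame is on the process stack
     asm volatile ("MRS %0, psp\n\t" : "=r" (frame) );

     call = (core_priv_call_t)frame[0]; //stacked r0
     args = (void*)(frame[1]); //stacked r1
     call(args);
}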

When core_priv_call() is executed, the call parameter, by convention, is copied to r0, and the args parameter is copied to r1. When the SVC instruction is executed within core_priv_call(), the NVIC pushes the hardware stack frame and executes the core_svcall_handler() routine. The handler routine grabs the stacked r0 and r1 values (frame[0] and frame[1] respectively), then executes the function pointed to by r0 with r1 as the argument. Though not implemented here, r2 and r3 can also be used as arguments if the prototype for core_priv_call() has four arguments.

It is important that the core_priv_call() function prototype disable any optimizations that prevent the call and args values from being assigned to r0 and r1, respectively. Because core_priv_call() does not directly use call or args, many compilers (including GCC when optimization is higher than O1) do not assign r0 and r1 when calling core_priv_call().
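The dispatch step can be exercised on a desktop machine by handing the handler logic a fake stack frame. The sketch below is only a model of that idea: the sim_ names are hypothetical, the frame is an array of uintptr_t so that host-sized pointers fit, and no SVC instruction is involved. It also relies on converting an integer back to a function pointer, which common compilers support but which is implementation-defined in C.

```c
#include <stdint.h>

typedef void (*sim_priv_call_t)(void *);

/* Example privileged routine: increments the int that args points to. */
static void sim_increment(void *args)
{
    *(int *)args += 1;
}

/* Mirrors the handler above: frame[0] is the stacked r0 (the function to
 * call) and frame[1] is the stacked r1 (its argument). */
void sim_svcall_dispatch(const uintptr_t *frame)
{
    sim_priv_call_t call = (sim_priv_call_t)frame[0];
    void *args = (void *)frame[1];

    call(args);
}
```

A caller would fill frame[0] with the address of sim_increment() and frame[1] with the address of a counter, then pass the frame to sim_svcall_dispatch(); the counter is incremented once per dispatch.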

Conclusion
Task interrupts on the CM3 allow developers to respond to events (or signals) asynchronously without compromising system protection of critical registers and memory regions. This is accomplished by executing the task interrupt handler in process mode on the same stack as other user tasks rather than in handler mode, which always uses the main stack pointer and is always privileged. Implementing task interrupts requires a sound understanding of the advanced CM3 hardware features as well as the target system’s context switching approach.

Tyler Gilbert is the lead developer on CoActionOS, an embedded development platform for the ARM Cortex-M architecture (visit www.coactionos.com to learn more). He welcomes your feedback at tgil@coactionos.com.

References: ARM Infocenter
