Layering the kernel
From one point of view, the kernel is a set of subroutines called from
exception handlers. The raw post-exception "exception mode" environment
on a MIPS CPU is all-powerful and very low-overhead but tricky to
program.
So with each entry to the kernel you get something like a
foreshortened bootstrap process, as each "layer" constructs the
environment necessary for the next one. Moreover, as you exit from the
kernel you pass through the same layers again, in reverse order,
passing briefly through exception mode again before the final eret
which returns you to userland.
Different environments in the kernel are built by more or less
elaborate software which makes up for the limitations of the exception
handler environment. Let's list a few starting at the bottom, as the
kernel is entered:
MIPS CPU in Exception Mode
Immediately after taking an exception, the CPU has SR(EXL) set: it's in
exception mode. Exception mode forces the CPU into kernel-privilege
mode and disables interrupts, regardless of the setting of other SR
bits. Moreover, the CPU cannot take a nested exception in exception
mode except in a very peculiar way.
(There are some cunning tricks in
MIPS history that exploit the peculiar behavior of an exception from
exception mode - but Linux doesn't use any of them.)
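To make the effect of SR(EXL) concrete, here is a minimal C sketch that decodes a status register value into "kernel privilege?" and "interrupts possible?". The bit positions are those of the MIPS32/MIPS64 status register; the helper names are hypothetical and invented for illustration, and the sketch ignores the per-interrupt mask bits and debug mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status register (SR) fields, MIPS32/MIPS64 layout. */
#define SR_IE   (1u << 0)   /* global interrupt enable */
#define SR_EXL  (1u << 1)   /* exception level, set by hardware on an exception */
#define SR_ERL  (1u << 2)   /* error level, set on reset, NMI, or cache error */
#define SR_KSU  (3u << 3)   /* privilege field: 0 kernel, 1 supervisor, 2 user */

/* Hypothetical helper: is the CPU running with kernel privilege?
 * EXL (or ERL) forces kernel privilege whatever KSU says. */
static bool sr_kernel_privilege(uint32_t sr)
{
    return (sr & (SR_EXL | SR_ERL)) != 0 || (sr & SR_KSU) == 0;
}

/* Hypothetical helper: could an interrupt be taken at all?
 * EXL (or ERL) blocks interrupts whatever IE says. */
static bool sr_interrupts_enabled(uint32_t sr)
{
    return (sr & SR_IE) != 0 && (sr & (SR_EXL | SR_ERL)) == 0;
}
```

Feeding in a typical user-mode value (IE set, KSU = user) and then the same value with EXL set shows the override: the second value reports kernel privilege and no interrupts, even though the IE and KSU bits are unchanged.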
The first few instructions of an exception handler usually save the
values of the CPU's general-purpose registers, whose values are likely
to be important to the software that was running before the exception.
They're saved on the kernel stack of the process that was running when
the exception hit.
It's in the nature of MIPS that the store instructions that save the
registers need at least one general-purpose register to hold the
address they store to, which is why the registers called k0 and k1 are
reserved for the use of exception handlers.
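The register save can be pictured with a small C simulation. This is only a sketch: real handlers do this in assembly, and the struct layout and function name here are hypothetical, not the kernel's own. The one hard fact it relies on is the register numbering, where k0 and k1 are registers 26 and 27.

```c
#include <stdint.h>

enum { REG_K0 = 26, REG_K1 = 27, NUM_GPRS = 32 };

/* Hypothetical save area, imagined as sitting on the kernel stack. */
struct exc_frame {
    uint32_t gpr[NUM_GPRS];
};

/* Copy the interrupted program's registers into the frame.
 * Register 0 always reads as zero, and k0/k1 hold nothing the
 * interrupted program is entitled to rely on, so all three are skipped. */
static void save_gprs(struct exc_frame *frame, const uint32_t cpu_gpr[NUM_GPRS])
{
    for (int r = 1; r < NUM_GPRS; r++) {
        if (r == REG_K0 || r == REG_K1)
            continue;   /* scratch registers: free for the handler to clobber */
        frame->gpr[r] = cpu_gpr[r];
    }
}
```

In the real handler the pointer that plays the role of `frame` has to live in a register before the first store can be issued, and k0 or k1 is the only place it can go without destroying user state.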
The handler also saves the values of some key CP0 registers: SR will
be changed in the next section of the exception handler, but the whole
at-exception value should be kept intact for when we return. Once
that's done, we're ready to leave exception mode by changing SR, though
we are going to leave interrupts disabled.
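As a rough illustration of that SR change, the following C sketch computes the value a handler might write back to SR in order to leave exception mode while staying in kernel mode with interrupts off. The function name is hypothetical; the bit positions are those of the MIPS32/MIPS64 status register. Clearing the privilege field matters: once EXL is gone, that field decides the privilege level again, and it may still say "user".

```c
#include <stdint.h>

/* Status register (SR) fields, MIPS32/MIPS64 layout. */
#define SR_IE   (1u << 0)   /* global interrupt enable */
#define SR_EXL  (1u << 1)   /* exception level */
#define SR_ERL  (1u << 2)   /* error level */
#define SR_KSU  (3u << 3)   /* privilege field: 0 means kernel */

/* Hypothetical helper: derive the SR value for running kernel code
 * outside exception mode, with interrupts still disabled. The caller
 * is assumed to have saved sr_at_exception already, untouched, so it
 * can be restored before the final eret. */
static uint32_t sr_leave_exception_mode(uint32_t sr_at_exception)
{
    /* Clear the low five bits; every other field is left alone. */
    return sr_at_exception & ~(SR_IE | SR_EXL | SR_ERL | SR_KSU);
}
```

The design choice to show is that the new value is derived from the saved one rather than built from scratch, so fields such as the interrupt mask carry over unchanged.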
A CISC CPU like an x86 has no equivalent of exception mode; the work
done in MIPS exception mode is done by hardware (really by invisible microcode). An
x86 arrives at an interrupt or trap handler with registers already
saved.
The software run in MIPS exception mode can be seen as producing a
virtual machine that looks after saving the interrupted user program's
state immediately after an exception and then restores it while
preparing for the eret, which will take us back again.
Programmers need to be very careful what they do in exception mode.
Exceptions are largely beyond the control of the software locks that
make the kernel thread-safe, so exception code may only interact very
carefully with the rest of the kernel.
In the particular case of the exception used to implement a system
call, it's not really necessary to save GP registers at all (so long
as the exception handler doesn't overwrite the s0-s8 "saved"
registers, that is). In a system
call or any noninterrupt exception, you can call straight out to code
running in thread context.
Some particularly simple exception handlers never leave exception
mode. Such code doesn't even have to save the registers (it just avoids
using most of them). An example is the "TLB refill" exception handler
described later in this series.
It's also possible - though currently unusual - to have an interrupt
handler that runs briefly at exception level, does its minimal
business, and returns. But such an interrupt handler has no real
visibility at the OS level, and at some point will have to cause a
Linux-recognized interrupt to get higher-level software working on its
data.