This is the first part in a series all about how to run the Linux operating system on the MIPS32/MIPS64 architecture. Why? Well, running an operating system is what a CPU is for. A CPU "architecture" is the description of what a useful CPU does, and a useful CPU runs programs under the control of an operating system.
Although many operating systems run on the MIPS architecture, the great
thing about Linux is that it's public. Anyone can download the source
code, so anyone can see how it works.
Any operating system is just a bunch of programs. Ingenious
programs, and - perhaps more than most software - built on a set of
ideas that have been refined and figured out over the years. An
operating system is supposed to be particularly reliable (doesn't
crash) and secure (doesn't let some program do things the OS hasn't
been told to let it do).
Correct usage sees "Linux" as the name of the
operating system kernel originally written by Linus Torvalds, a kernel whose
subsequent history is, well, history. Most of the (much larger) rest of
the system came from projects organized under the "GNU" banner of the Free
Software Foundation. Everybody sometimes forgets and calls
the whole thing "Linux."
Both sides of this process emerged as a reaction to the seminal work
on the UNIX operating system developed
by Bell Laboratories in the 1970s. Probably because Bell saw it as of
no commercial value, it distributed the software widely to academic
institutions under terms that were then unprecedentedly "open."
But it wasn't "open source" - many programmers worked on UNIX at
university, only to find that their contributions were either lost or
were now owned by Bell Labs (and its many successors). Frustration
with this process eventually drove people to write "really free"
replacements.
The last key part was the kernel. Kernels are quite difficult
programs, but the delay was cultural: OS kernels were seen as something
for academic groups, and those groups wanted to go beyond UNIX, not to
recreate it.
The post-UNIX fashion was for a small, modular operating system
built of clearly separated components, but no OS built on that basis
ever found a significant user base. Linux won out because it was a much
more pragmatic project.
(Some claim that Windows NT - and
therefore most modern versions of Microsoft Windows - has a
microkernel. That may be true, but it certainly lost any claims to be
small or modular on its way to world domination.)
Linus and his fellow developers wanted something that worked (on x86
desktops, in the first instance). When the Linux kernel was in
competition with offshoots of the finally free BSD4.4 system, BSD
protagonists insisted with some justification on their superior
engineering. But the Linux community had arrived at an understanding of
a far more "open" development style.
Linux evolved quickly. Sometimes, it evolved quickly because Linux
people were perfectly happy to adapt BSD code. It wasn't long before
Linux triumphed, and the engineering got better, too.
Basic Linux Building Blocks
To get to grips with any artifact you need to attach some good working
meaning to the terms used by its experts, and you are particularly
likely to be confused by terms you already know, but with not quite the
same meaning. The UNIX/Linux heritage is long enough that there are
lots of magic words: thread, file, user mode and system calls,
interrupt context, interrupt service routine (ISR), scheduler, memory
map/address space, thread group, high memory, libraries, and
applications.
Thread:
The best general definition of "thread" I know is "a set of computer
instructions being run in the order specified by the programmer." The
Linux kernel has an explicit notion of a thread (for each thread
there's a struct thread_struct).
The two notions are almost the same thing, but by the terms of my definition a
low-level interrupt handler (for example) is a distinct thread that
happens to have borrowed the environment of the interrupted thread to
run with. Both definitions are valuable, and we'll say "Linux thread"
when necessary.
Linux loves threads (there are currently 134 on the desktop machine
I'm typing this on). Most of those threads correspond to an active
application program - but there are quite a few special-purpose threads
that run only in the kernel, and some applications have multiple
threads. One of the kernel's basic jobs is scheduling - picking which
Linux thread to run next, which will be discussed later.
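One way to see Linux threads from userland is through the /proc filesystem, where each process has a task directory with one numeric entry per Linux thread. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function names count_threads and count_own_threads are invented for this illustration and are not part of any kernel or library interface.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Count the Linux threads belonging to one process by counting the
 * numeric entries in /proc/<pid>/task (one entry per thread ID).
 * Returns -1 if the directory cannot be opened. */
int count_threads(pid_t pid)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int n = 0;
    struct dirent *e;
    while ((e = readdir(dir)) != NULL) {
        /* Skip "." and ".."; thread entries are named by decimal ID. */
        if (e->d_name[0] >= '0' && e->d_name[0] <= '9')
            n++;
    }
    closedir(dir);
    return n;
}

/* Convenience wrapper (hypothetical name): threads in the calling process. */
int count_own_threads(void)
{
    return count_threads(getpid());
}
```

A single-threaded program calling count_own_threads() should see at least one thread: itself. Summing count_threads() over every numeric directory in /proc would give a system-wide figure comparable to the one quoted above.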
File:
A named chunk of data. In GNU/Linux, most of the interactions a program
makes with the world beyond its process are done by reading and writing
files. A file can simply be somewhere you write data and read it back
later.
But there are also special files that lead to device drivers: read
one of those and the data comes from a keyboard; write another and your
data is interpreted as digital audio and sent out to a loudspeaker. The
Linux kernel likes to avoid too many new system calls, so special /proc
files are also used to allow applications to get information about the
kernel.
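Because /proc entries are files, an application can query the kernel with the same open, read, and close calls it would use on any ordinary file. The sketch below assumes a Linux system with /proc mounted; read_file_text is a helper name made up for this example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to cap-1 bytes of a file into buf using plain
 * open/read/close, then NUL-terminate the result.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_file_text(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t total = 0;
    while (total < cap - 1) {
        ssize_t n = read(fd, buf + total, cap - 1 - total);
        if (n < 0) {            /* read error */
            close(fd);
            return -1;
        }
        if (n == 0)             /* end of file */
            break;
        total += (size_t)n;
    }
    close(fd);
    buf[total] = '\0';
    return (ssize_t)total;
}
```

Calling read_file_text("/proc/version", buf, sizeof buf) fills buf with the kernel's version string; the same helper works unchanged on a regular file, which is the point of exposing kernel information this way.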
User mode and system calls:
Linux applications run in user mode, the
lower-privilege state of MIPS CPUs. In user mode, the software can't
directly access the parts of the address space where the kernel lives,
and all the locations it can address are mapped to pages the kernel has
agreed to let the application play with. In user mode, you can't run the
coprocessor zero CPU control instructions.
(GNU/Linux application code that
runs in user mode is frequently referred to as userland.)
To obtain any service from the kernel (most often, to read or write
a file) the application makes a system call. A system call is a
deliberately planted exception, interpreted by the kernel's exception
handler. The exception switches to high-privilege mode.
Through the system call, Linux application threads run quite happily
in the kernel in high-privilege mode (but of course they're running
trusted code there).
When it's done, the return-from-exception code involves an eret
instruction, which makes sure that the change back to user mode and the
return to user mode code are done simultaneously.
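Applications rarely plant the system call exception by hand; the C library does it for them. As a rough illustration, assuming a Linux system with a C library that provides the generic syscall() wrapper, the sketch below requests two kernel services by system call number. The names raw_getpid and raw_write are invented for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our process ID through the generic system call
 * entry point. On MIPS, the library wrapper loads the call number
 * into a register and executes the syscall instruction, which raises
 * the exception the kernel's handler interprets. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Write len bytes from buf to file descriptor fd with a raw write
 * system call. Returns the number of bytes written, or -1 on error. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

raw_getpid() should return the same value as the library's getpid(), and raw_write(1, "hello\n", 6) behaves like the usual write() to standard output. In each case the thread enters the kernel in high-privilege mode, runs trusted kernel code, and comes back to user mode via the return-from-exception path.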