Linux vs. organism
I recently read a fascinating news article about newly published Yale bioinformatics research comparing Linux to genomes. The study highlights the stark difference between the gene regulatory network of the E. coli cell and the Linux kernel's analogous regulatory network: its function call graph. The E. coli network is pyramidal, with a few key master genes at the top influencing a larger number of “workhorses” at the bottom. In contrast, the Linux call graph resembles a reverse pyramid, with a large number of top-level entry points calling down to a small number of bottom-level routines.
The reverse pyramid is failure-prone because changes to the workhorses - the routines most likely to require adaptation over time - force corresponding changes up through the comparatively large caller hierarchy. The researchers noted the rapid churn of code changes originating in the low-level kernel routines. As one of the researchers, Mark Gerstein, commented: “You can easily see why software systems might be fragile, and biological systems robust. Biological networks are built to adapt to random changes. They’re lessons on how to construct something that can change and evolve.”
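To make the structural point concrete, here is a toy C sketch (purely illustrative, not actual kernel code) of that reverse-pyramid shape: many entry points all funnel into one low-level workhorse, so a change to the workhorse's interface ripples upward through every caller.

/* Toy sketch of the reverse pyramid: many top-level entry points depend
 * directly on one low-level workhorse. Illustrative only. */
#include <stdio.h>

/* The workhorse: the routine most likely to need adaptation over time. */
static int workhorse_read_block(int device, long block, char *buf)
{
    (void)device; (void)block;   /* stand-in for real device I/O */
    buf[0] = '\0';
    return 0;
}

/* Top-level entry points. Adding, say, a flags parameter to the workhorse
 * forces a corresponding change in every one of these callers (and in a
 * real kernel there would be hundreds of them). */
int entry_file_read(char *buf)  { return workhorse_read_block(0, 0, buf); }
int entry_page_fault(char *buf) { return workhorse_read_block(0, 1, buf); }
int entry_swap_in(char *buf)    { return workhorse_read_block(1, 2, buf); }

int main(void)
{
    char buf[16];
    printf("%d %d %d\n",
           entry_file_read(buf), entry_page_fault(buf), entry_swap_in(buf));
    return 0;
}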
I have often thought of systems software architecture as analogous to many things, living and not, that share the challenge of balancing two naturally opposed goals: providing an extremely sophisticated service while maintaining extremely high robustness.
Linux’s lack of robustness stems primarily from its structure (as discussed in the research) but is exacerbated by its high rate of modification by a large and diverse group of contributors. Thus, despite following what are considered good commercial software development practices, these developers are statistically guaranteed to introduce critical flaws at a regular, if not increasing, rate. A great overview article that discusses this is here.
There are two major classes of operating systems: monolithic (like Linux, Windows, and Solaris) and microkernel (like INTEGRITY, L4, and Minix). The monolithic approach places a large number of services in a single memory space (the OS kernel), where modules interact through many intricate direct and indirect call pathways and shared memory, and where the sheer volume of code is enormous. As the researchers point out, a single flaw can take down the entire system, making it crash-prone.
Most leading OS experts agree that a microkernel approach is much better for robustness. With a microkernel, the supervisor-mode core provides only a very small set of critical services: memory protection for processes and itself, time scheduling for processes, and event handling (such as reacting to crashes in processes). Other services that are typically thought of as part of the operating system – such as networking stacks and file systems – execute in processes instead of in the kernel. As systems grow in complexity – multimedia, new communications mechanisms, web browsers, and so on – these features are built into separate components that communicate with other components and the kernel through well-defined, auditable interfaces. Each component is given a private memory space and a quota of execution resources (memory, CPU time) that cannot be stolen or corrupted by other applications. Systems are composed of only the minimal components required. This approach yields a more maintainable, debuggable, testable, and robust system.
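As a rough sketch of this shape, the C program below simulates the message-passing pattern in a single process; ipc_send and ipc_receive are hypothetical stand-ins, not the API of INTEGRITY, L4, or any real kernel. The point is that a service like a file system becomes an ordinary server reached only through narrow, auditable messages, rather than code linked directly into the kernel.

#include <stdio.h>
#include <string.h>

/* A fixed, auditable message format: the only way components interact. */
typedef struct {
    int  opcode;            /* e.g. FS_READ */
    int  status;
    char payload[64];
} msg_t;

enum { FS_READ = 1 };

static msg_t mailbox;                                    /* simulated kernel channel */
static void ipc_send(const msg_t *m) { mailbox = *m; }   /* hypothetical primitive */
static void ipc_receive(msg_t *m)    { *m = mailbox; }   /* hypothetical primitive */

/* "User-space" file system server: in a real microkernel system it runs in
 * its own address space, so a crash here leaves the kernel and every other
 * component running; a supervisor can simply restart it. */
static void fs_server_handle(void)
{
    msg_t req, reply = {0};
    ipc_receive(&req);
    if (req.opcode == FS_READ) {
        strcpy(reply.payload, "file contents");
        reply.status = 0;
    } else {
        reply.status = -1;
    }
    ipc_send(&reply);
}

/* Client: what would be a direct in-kernel function call in a monolithic
 * design becomes a bounded message exchange across a protection boundary. */
int main(void)
{
    msg_t req = { .opcode = FS_READ }, reply;
    ipc_send(&req);
    fs_server_handle();
    ipc_receive(&reply);
    printf("status=%d payload=%s\n", reply.status, reply.payload);
    return 0;
}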
The monolithic approach was adopted in older operating systems for performance reasons. But Intel and other microprocessor designers have thankfully taken this objection off the table, and OS designers have also gotten much better at making the messaging and process switching very fast. That microkernels have become commercially successful only over the past decade is testament to this.
Of course, the microkernel remains a single point of failure, but this piece is small and simple enough that it requires little or no change over time, can be exhaustively tested, and is amenable to formal mathematical proof of its safety and security functions. One key capability of microkernels in this domain is the ability to host a virtualized general-purpose OS, like Linux, without impacting the robustness of critical services running directly on the microkernel. Thus, computing can realize the strengths of both worlds - microkernel and monolithic. As an example, Dell sells a specialized desktop PC that uses a microkernel to host multiple virtual PCs, which can securely and simultaneously connect to separate classified and unclassified government networks.
Furthermore, we can improve the overall robustness of a Linux system by moving critical functions out of the bad cells and into the good cells, if you will. For example, if the network security and crypto components are moved out of Linux and onto the supervisory microkernel, then malware that finds its way into Linux via the Internet cannot masquerade as trusted network connections, because those connections can only be established using the isolated crypto software and keying material. Component isolation is the same characteristic that makes some viral infections difficult to thwart: you can kill one infected cell, but the others continue to wreak havoc. Modern ships use multiple watertight compartments to prevent sinking if one compartment is pierced.
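The sketch below illustrates the idea under stated assumptions: partition_call and the toy authentication routine are invented for illustration and are neither a real microkernel API nor a real cipher. The crypto component owns the keying material, and the untrusted Linux side can only request an operation across a narrow interface; there is deliberately no call that hands back the key.

#include <stdio.h>
#include <string.h>

/* --- Inside the isolated crypto partition (conceptually a separate,
 *     kernel-enforced address space; the split here is only illustrative) --- */
static const unsigned char secret_key[] = "sixteen-byte-key";

static void crypto_authenticate(const char *request, char *tag, size_t taglen)
{
    /* Toy tag for illustration only: mixes the request with the key. */
    unsigned int acc = 0;
    for (size_t i = 0; request[i] != '\0'; i++)
        acc = acc * 31u + (unsigned char)request[i]
                        + secret_key[i % (sizeof secret_key - 1)];
    snprintf(tag, taglen, "%08x", acc);
}

/* --- The narrow interface exposed to the untrusted guest: one operation,
 *     and no call that returns the key itself. --- */
static void partition_call(const char *request, char *tag, size_t taglen)
{
    crypto_authenticate(request, tag, taglen);
}

/* --- Untrusted Linux side: even if compromised, in the real system it has
 *     no path to secret_key, only to the request interface. --- */
int main(void)
{
    char tag[16];
    partition_call("connect to ops network", tag, sizeof tag);
    printf("connection request tagged: %s\n", tag);
    return 0;
}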
Another topic that has drawn comparisons between the electronic and the organic is the emergence of cloud computing. The power of the cloud lies in on-demand, remote access to services, the combinatorial potential of services across the cloud, and the ability to rapidly evolve those services, for example via social networking. In other words, the cloud is the antithesis of the age-old computing model in which a user’s digital universe consists of locally stored data and applications. However, if you host the cloud on a small number of all-powerful centralized data centers (e.g. Amazon, Google), then this goes against the natural robustness grain. The cloud shouldn’t be a great cumulonimbus; micro-cloudlets of cirrostratus, perhaps?
Dave Kleidermacher has been developing systems software for high-criticality embedded systems for more than 20 years and is one of the original developers of the INTEGRITY operating system, the first software technology certified to EAL 6+ High Robustness, the highest Common Criteria security level ever achieved for software. He managed INTEGRITY’s development for a decade and now serves as the chief technology officer at Green Hills Software. This is his personal blog; opinions expressed are not necessarily those of GHS.
Copyright (c) David Kleidermacher

