Getting down to basics: Running Linux on a 32-/64-bit RISC architecture – Part 5

Much of the Linux kernel is written in portable C, and a great deal of it is portable to a clean architecture like MIPS with no further trouble. In Part 4 we looked at the obvious machine-dependent code around exceptions and memory management.

This and the next part in this series will look at the other places where MIPS-specific code is needed. We will deal first with cases where most MIPS CPUs have traded off programming convenience for hardware simplicity: first, that MIPS caches often require software management and, second, that the MIPS CP0 (CPU control) operations sometimes require explicit care with pipeline effects.

We'll also take a quick look at what you need to know about MIPS for a symmetric multiprocessor (SMP) Linux system. And lastly there's a glimpse at the use of heroic assembly code to speed up a heavily used kernel routine.

Explicit Cache Management
On x86 CPUs, where Linux was born and grew up, the caches are mostly invisible, with hardware keeping everything just as if you were talking directly to memory.

Not so on MIPS systems, where many MIPS cores have caches with no extra “coherence” hardware of any kind. Linux systems must deal with trouble in several areas.

DMA Device Accesses
DMA controllers write memory (leaving cache contents out-of-date) or read it (perhaps missing cached data not yet written back). On some systems – particularly x86 PCs – the DMA controllers find some way to tell the hardware cache controller about their transfers, and the cache controller automatically invalidates or writes back cache contents as required to make the whole process transparent, just as though the CPU were reading and writing raw memory.

Such a system is called “I/O-cache coherent” or, more often, just “I/O coherent.” Few MIPS systems are I/O-cache coherent. In most cases, a DMA transfer will take place without any notification to the cache logic, and the device driver software must manage the caches to make sure that no stale data in cache or memory is used.

Linux has a DMA API that exports routines to device drivers that manage DMA data flow (many of the routines become null in an I/O-coherent system). You can read about it in the documentation provided with the Linux kernel sources, which includes Documentation/DMA-API.txt.

In fact, if you're writing or porting a device driver, you should read that. When a driver asks to allocate a buffer, it can choose:

“Consistent” memory: Linux guarantees that “consistent” memory is I/O coherent, possibly at some cost to performance. On a MIPS CPU this is likely to be uncached, and the cost to performance is considerable.

But consistent buffers are the best way to handle small memory-resident control structures for complex device controllers.
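As a rough sketch of how a driver might allocate such a control structure (the device, descriptor layout, and ring size here are invented for illustration; dma_alloc_coherent() is the standard kernel call):

```c
#include <linux/dma-mapping.h>

/* Hypothetical descriptor for an imaginary DMA controller. */
struct my_desc {
	u32 buf_addr;	/* bus address of the data buffer */
	u32 flags;
};

static struct my_desc *ring;	/* CPU's (virtual) view of the ring */
static dma_addr_t ring_dma;	/* bus address to program into the device */

static int my_alloc_ring(struct device *dev)
{
	/* "Consistent" memory is I/O coherent by definition.  On a
	 * non-coherent MIPS system it will typically be uncached, so
	 * every CPU access is slow: acceptable for a handful of
	 * descriptors, a bad idea for bulk data buffers. */
	ring = dma_alloc_coherent(dev, 64 * sizeof(*ring),
				  &ring_dma, GFP_KERNEL);
	return ring ? 0 : -ENOMEM;
}
```

The CPU reads and writes the ring through the returned pointer, while the device is given ring_dma; neither side ever needs an explicit cache operation.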

Using nonconsistent memory for buffers: Since consistent memory will be uncached on many MIPS systems, using it for large DMA buffers can lead to very poor performance.

So for most regular DMA, the API offers calls with names like dma_map_xx(). They provide buffers suitable for DMA, but the buffers won't be I/O coherent unless the system makes universal coherence cheap.

The kernel memory allocator makes sure the buffer is in a memory region that DMA can reach, segregates different buffers so they don't share the same cache lines, and provides you with an address in a form usable by the DMA controller.

Since this is not coherent, there are calls that operate on the buffer and do the necessary cache invalidation or write-back operations before or after DMA: They are called dma_sync_xx(), and the API documentation includes instructions on when and how to call these functions.
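A streaming receive path might look roughly like this sketch (the driver shape is invented; dma_map_single(), dma_mapping_error(), and dma_unmap_single() are the standard calls, and on a non-coherent MIPS system they are where the cache invalidations and write-backs actually happen):

```c
#include <linux/dma-mapping.h>

/* Sketch: receive 'len' bytes into 'buf' from an imaginary device. */
static int my_rx(struct device *dev, void *buf, size_t len)
{
	/* Mapping with DMA_FROM_DEVICE invalidates the cache lines
	 * covering 'buf' on a non-coherent system, so the CPU cannot
	 * later read stale pre-DMA data out of its cache. */
	dma_addr_t h = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);

	if (dma_mapping_error(dev, h))
		return -EIO;

	/* ... program the device with bus address 'h', start the
	 * transfer, and wait for its completion interrupt ... */

	/* Hand the buffer back to the CPU.  On I/O-coherent hardware
	 * this call (like the map above) does nothing. */
	dma_unmap_single(dev, h, len, DMA_FROM_DEVICE);
	return 0;
}
```

If the driver needs to touch the buffer between transfers while keeping the mapping, that is where the dma_sync_single_for_cpu()/dma_sync_single_for_device() pair comes in.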

For genuinely coherent hardware, the “sync” functions are null. The language of the API documentation is unfortunate here. There is a little-used extension to the API whose function names contain the word “noncoherent,” but you should not use it unless your system is really strange.

A regular MIPS system, even though it is not I/O coherent, can and should work fine with drivers using the standard API.

This is all moderately straightforward by OS standards. But many driver developers are working on machines that manage this in hardware, where the “sync” functions are just stubs. If they forget to call the right sync function at the right moment, their software will still work: It will work until you port it to a MIPS machine requiring explicit cache management.

So be cautious when taking driver code from elsewhere. The need to make porting more trouble-free is the most persuasive argument for adding some level of hardware cache management in future CPUs.

Writing Instructions for Later Execution
A program that writes instructions for itself can leave the instructions in the D-cache but not in memory, or can leave stale data in the I-cache where the instructions ought to be.

This is not a kernel-specific problem: In fact, it's more likely to be met in applications such as the “just-in-time” translators used to speed up language interpreters. It's beyond the scope of this article to discuss how you might fix this portably, but any fix for MIPS will be built on the synci instruction.
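In practice, portable application code usually reaches synci through the compiler: GCC and Clang provide __builtin___clear_cache(), which on a MIPS32r2-or-later CPU can expand to a synci loop and elsewhere falls back to whatever the platform needs (a no-op on x86, a system call on older MIPS). A minimal sketch, assuming the instructions have just been written into a buffer:

```c
/* After a JIT (or any self-modifying code) writes instructions into
 * 'code', publish them to instruction fetch before jumping there.
 * __builtin___clear_cache is a GCC/Clang builtin; on hardware with
 * coherent I-caches it compiles to nothing. */
void publish_code(char *code, unsigned len)
{
	__builtin___clear_cache(code, code + len);
}
```
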

That's the ideal: synci was only defined in 2003 with the second revision of the MIPS32/64 specifications, and many CPUs without the instruction are still in use.

On such CPUs there must be a special system call to do the necessary D-cache write-back and I-cache invalidation using privileged cache instructions.
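On Linux/MIPS that system call is cacheflush(), declared in <sys/cachectl.h>. This fragment is MIPS-specific and won't build on other architectures; the wrapper function name is invented:

```c
#include <sys/cachectl.h>	/* Linux/MIPS only */

/* Write back D-cache lines and invalidate I-cache lines covering a
 * freshly written code buffer.  BCACHE means "both caches"; ICACHE
 * and DCACHE are also available individually. */
int publish_code_syscall(void *code, int nbytes)
{
	return cacheflush(code, nbytes, BCACHE);
}
```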

Cache/Memory Mapping Problems
Virtual caches (real ones with virtual index and tagging) seem a wonderful free ride, since the whole cache search process can start earlier and run in parallel with page-based address translation.

A plain virtual cache must be emptied out whenever there's a memory map change, which is intolerable unless the cache is very small. But if you use the ASID to extend the virtual address, entries from different processes are disambiguated.

OS programmers know why virtual caches are a bad idea: The trouble with virtual caches is that the data in the cache can survive a change to the page tables.

In general, the virtual cache ought to be checked after any mapping is rescinded. That's costly, so OS engineers try to minimize updates, miss some corner case, and end up with bugs.

In a heroic attempt to make Linux work successfully even with virtual caches, the kernel defines a set of rules and function calls that must be implemented as part of the port to an architecture with troublesome caches.

They're the functions with names starting flush_cache_xxx(), described in the kernel documentation (Documentation/cachetlb.txt). I don't like the word “flush” to describe cache operations: It's been used to mean too many things. So note carefully that in the Linux kernel a “cache flush” is something you do to get rid of cache entries that relate to obsolete memory mappings.

In a system where all caches are physically indexed and tagged, none of these calls needs to do anything.

Fortunately, virtual D-caches are rare on MIPS CPUs. Some recent CPUs have virtual I-caches: Implement the “flush” functions as described in the documentation and you should be all right.

But L1 caches with physical tags but virtual indexes are common on MIPS CPUs. They solve the problems described in this section, but they lead to a different problem called a “cache alias” (read on).

Cache Aliases
We're now getting to something more pernicious. MIPS CPU designers were among the first to realize that the benefits of using the virtual address to index their cache could be combined with the benefit of using the physical address to tag it. This can lead to cache aliases.

The R4000 CPU was the first to use virtually indexed caches. As originally conceived, the CPU always came with an L2 cache (the cache memory was off chip, but the L2 controller was included with the CPU), and it used the L2 cache to detect L1-cache aliases. If you loaded an alias to a line that was already present in the L1, the CPU generated an exception, which could be used to clean up.

But the temptation to produce a smaller, cheaper R4000 variant by omitting the L2 cache memory chips and the pins that wired them up proved too strong. Contemporary UNIX systems had a fairly stylized way of using virtual memory, which meant that you could control memory allocation to avoid ever loading an alias.

In retrospect we can see that generating aliases is a bug, and the careful memory management was a workaround for it. But it worked, and people forgot, and it became a feature.

There are basically two ways to deal with cache aliases. The first is to try to ensure that whenever a page is shared, all the virtual references to it have the same “page color” (that means that the references may be different, but the difference between them is a multiple of the cache set size).

Any data visible twice in same-color pages will be stored at the same cache index and handled correctly. It's possible to ensure that all user-space mappings of a page are of the same color.
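The color computation itself is tiny. This sketch assumes a hypothetical 16 KB direct-mapped, virtually indexed cache and 4 KB pages, giving four colors; the constants are illustrative, not from the article:

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_WAY_SIZE 0x4000u	/* bytes indexed by the virtual address */
#define PAGE_SZ        0x1000u	/* 4 KB pages */

/* The color is the slice of the virtual address that selects the cache
 * index but lies above the page-offset bits.  Two mappings of the same
 * physical page can alias only if their colors differ; same-color
 * mappings land on the same cache lines. */
static unsigned page_color(uintptr_t vaddr)
{
	return (vaddr & (CACHE_WAY_SIZE - 1)) / PAGE_SZ;
}
```

With these figures, virtual addresses 0x1000 and 0x5000 share color 1, so mapping one physical page at both is safe; 0x1000 and 0x2000 do not, so that pairing could alias.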

But unlike the old BSD systems, Linux provides features where correct page coloring is impossible. Those will be cases where you have both a user-space and a kernel mapping to the same page (in many cases, on a MIPS kernel, the kernel “mapping” will be a kseg0 address). So the MIPS port has special code to detect those cases and clean out any old alias mappings.

The Cache/TLB documentation (that's Documentation/cachetlb.txt, as mentioned in the section above) makes a heroic attempt to deal with cache aliases as “just another symptom” of virtual caches in general. It provides some notes on how to configure the kernel to do what it can on page coloring and how to handle kernel/user-space aliases.

Next in Part 6: CP0 pipeline hazards, multiprocessors and coherent caches.
To read Part 4, go to “What we really want.”
To read Part 3, go to “What Happens on a System Call.”
To read Part 2, go to “How hardware and software work together.”
To read Part 1, go to “GNU/Linux from eight miles high.”

This series of articles is based on material from “See MIPS Run Linux,” by Dominic Sweetman, used with the permission of the publisher, Morgan Kaufmann/Elsevier, which retains full copyrights. It can be purchased online.

Dominic Sweetman is a software/hardware boundary expert based in London, England, who previously served as managing director at Algorithmics Ltd.
