I've appreciated the addition of Jim Turley to the pages of Embedded Systems Programming; his reports on the microprocessor business have been informative and accurate. However, I must object to his March 2003 column (“RISCy Business,” p.37); I am astonished at the depth and breadth of misinformation presented.
I have been a professional programmer for more than 20 years, working on everything from ancient military hardware based on 1950s minicomputers to state-of-the-art desktop machines. I have worked on RISC and CISC processors in embedded systems, desktop workstations, and large servers. There are many tradeoffs between RISC and CISC, but Jim has failed to properly characterize those tradeoffs. Let me rebut several points in particular:
“The theory behind RISC is that reducing the number of features within the chip makes it go faster because it's ‘streamlined,’ unencumbered by the features and functions CISC chips had accumulated over the years.”
Any reading of RISC design literature whatsoever would have dispelled this notion immediately. The idea was not to make a “streamlined” chip with a smaller transistor count, but rather to strip off the unused parts of the CISC microcode and dedicate those transistors to making the chip faster. To indulge in an analogy, you don't make a car faster by throwing out the backseat, you make it faster by throwing out the backseat and adding a turbocharger that weighs the same as the backseat.
“This ignores the fact that those features have been added for a reason; CPU companies don't gratuitously add instructions just to waste transistors.”
Actually, some CPU companies did gratuitously add instructions. CISC companies didn't regularly check their users' software for unused instructions. But such checks were one of the major elements of the competing Stanford and Berkeley RISC design programs; both characterized large bodies of existing software to determine what kinds of instructions were actually used.
The majority of software written for any chip is compiled by a relatively small number of compilers, and those compilers tend to use pretty much the same subset of instructions. The UNIX portable C compiler, for example, used less than 30% of the Motorola 68000 instruction set.
RISC processors were designed to throw out only the instructions that were not used in the wild. Jim attempts to back up his shaky arguments by bringing up the ultimate argument for CISC, those lovely x86 string scanning functions. What he either didn't know or didn't disclose is that the REP SCAS instruction is slower than coding the exact same function as separate instructions on a Pentium 4 processor.
The reason for this is disarmingly simple: the Pentium 4 isn't really a CISC processor at all. It's really a very powerful RISC engine that decodes Pentium instructions into one or more RISC microinstructions and executes them at very high speed. Intel has wisely concentrated on optimizing only the x86 instructions that are generated by common compilers and these do not include the REP SCAS and REP STOS instructions.
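To make the comparison concrete, here is a hedged sketch in C of what REP SCASB computes: a scan of memory for the first occurrence of a given byte. The function name `scan_byte` and its signature are illustrative inventions, not anything from the column; the point is that this plain loop is exactly the kind of code a compiler emits instead of the string instruction, and it is this form that Intel's decoder turns into well-optimized micro-operations.

```c
#include <stddef.h>

/* Illustrative sketch: the explicit-loop equivalent of what REP SCASB
 * does in hardware. Scans up to max_len bytes of buf for the byte
 * 'target' and returns the index of the first match, or max_len if
 * no match is found. A compiler targeting a Pentium 4 would emit a
 * loop like this rather than the REP-prefixed string instruction. */
static size_t scan_byte(const unsigned char *buf, size_t max_len,
                        unsigned char target)
{
    size_t i;
    for (i = 0; i < max_len; i++) {
        if (buf[i] == target)
            return i;
    }
    return max_len;
}
```

Because each iteration is an ordinary compare-and-branch, the out-of-order RISC-style core can pipeline and speculate through it freely, which is the mechanism behind the performance paradox described above.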
I'd hate for your readers to be left with the impression that Jim's column was factual or authoritative on the issue of CISC vs. RISC.
Jim Turley Responds:
Thanks for taking the time to write to Embedded Systems Programming. My editor was nice enough to show me your letter.
You make some very good points. Your response was one of the better, more reasoned ones I've received. In the short amount of space allowed, you made some good arguments for RISC and other architectural features.
In my defense, my original draft of the article was a lot longer than the version you saw. I like to think that the longer draft might have addressed, or at least alleviated, some of your concerns.
For example, you're correct in pointing out that compilers focus on a subset of instructions. That's natural. But that doesn't mean the other 30% (or so) of the instruction set serves no purpose. Universally, designers at Motorola, Intel, TI, and other CPU firms added instructions only after extensive code profiling and long, hard looks into performance bottlenecks. Complex new instructions may not be used frequently, but are very useful when they are used. Besides, instructions do not burden a processor, per se. The extra transistors cost approximately nothing. If the compiler doesn't use those instructions, there's no harm done. But if the compiler (or assembly language programmer) does use them, they're terrific.
Most English-language conversations use only about 500 different words. Should we remove the other 150,000 words from our vocabulary? Although the 80/20 rule also applies to C compilers, it's not a compelling argument for removing the “other” 20 percent. It's possible to make a fully functional microprocessor with only four instructions; that doesn't make it a good idea. I think the poor usage ratio observed at Stanford and Berkeley is an indictment of lazy compiler writers, not of over-zealous processor designers.
Your observation about REP SCAS is technically accurate; it is slower than using separate instructions — on a Pentium 4. That misses the point. The REP SCAS instruction pair wasn't developed for Pentium 4; it was created more than a decade ago when x86 chips really were CISC designs. And it was very, very useful. Intel consciously chose to implement that instruction pair poorly because it was inconvenient for the RISC-ified core of the newer chips. Thus, the performance paradox you cite is artificial, a decision by Intel's marketing department and not a reflection of RISC or CISC goodness.
Your analogy about “shifting weight” is a good one. But in an age when Moore's Law is throwing transistors at designers faster than they can use them, there's little point in conserving transistors (e.g., weight) at all. Certainly, features that slow down a chip should be removed. Everyone does that. But to jettison a useful feature, however marginal, makes no sense. Indeed, the past ten years have seen every single RISC architecture gradually restore the very same features they shunned and ridiculed during the RISC “purge” of the 1980s. Unaligned memory access, bit-wise addressing, media instructions, DSP instructions, variable-length instructions, code compression, multi-cycle functions, multiply-accumulate… you name it, they've all come right back. The fad has passed. It seems RISC is “reduced” in name only.
Again, thanks for your kind words and for your dedicated efforts.
An offshore perspective
I am writing in response to Michael Barr's May 2003 editorial on the changes that have been brought about by outsourcing (“Distributed Development,” p. 7). I am presently living in Bangalore, India, working for Cisco Systems. I used to work for an electrical engineering company; the job was stable but not very interesting. At Cisco, I've shifted to embedded systems and software development, and this has completely changed my life. Change is so constant now that I need to learn new technologies almost on a daily basis. My lifestyle has changed as well: I now own a car and a house. This was not possible even after six years in my previous job.
I know that outsourcing has created a lot of changes in the marketplace. Personally, I feel bad about the loss of jobs for my fellow engineers in the U.S. I feel that businesses in the U.S. must go to the next level. They must innovate and evolve their product development processes. And who knows? There might be a new invention like the microcontroller that spawns a new industry, giving rise to millions of new jobs.
What's on your mind?
Embedded Systems Programming welcomes feedback. Please send any comments to editor-in-chief Michael Barr at . Letters to the editor will be considered for publication in any and all media unless the writer requests otherwise. They may be edited for clarity and length.