A call for modern compilers - Embedded.com

A call for modern compilers

The compiler vendors are providing us with the same old crap we've put up with for 20 years.

Snazzy integrated development environments (IDEs) are at least two decades old, though now the GUI versions are a lot prettier than old text-based DOS windows. Source-level debugging appeared about the same time. But compilers haven't improved in any significant way since then.

In the intervening years our projects have changed tremendously. SoC, buried cores, and the adoption of high-powered processors with deep pipelines have made debugging harder. Embedded apps have grown from a few tens of thousands of lines of code to millions.

Yet our tools are about the same as ever.

Compilers still take the same old ins and generate the same old outs they've accepted since Fortran, the first compiled language, appeared in 1957. Big disks have (thankfully!) replaced 80-column punched cards but that's due to the evolution of hardware, not compilers. Cheap processors did kill off batch processing (again, thankfully!) so we're working interactively with the computer. This, too, is a result of Moore's Law and improved operating systems.

All compilers still process plain old unformatted text source files. Once upon a time that made a lot of sense. Punched cards and '60s-era printers could only manage fixed-font uppercase characters. Even into the '70s most of us banged away on ASR-33 teletypes that generated uppercase printout on rolls of paper more like the Dead Sea Scrolls than the modern 8.5 x 11 or A4 documents ejected at furious rates from today's laser printers. Remember filing those yellow mounds of output? Accordion-style folds that never quite lined up yielded ugly and hard-to-manage source listings that we stuffed into filing cabinets.

But those of us working with Microsoft products haven't created a document using fixed fonts since 1992, when the company finally released a version of Windows that worked reasonably well. Apple devotees got MacWrite, a word processor much like any used today, nearly a decade earlier.

Those tools are useless when creating source files, of course. Compilers still only accept plain old unformatted text. That's just plain dumb.

Why can't we format source code? I want italics to emphasize certain comments, bold-faced type where something must stand out, and various Headings to break up sections of code. Sometimes a font change can help describe what the machine we're building does.

Different styles immediately come to mind: the code and the comments probably should look different. Then I might want to use another style for assert macros and yet one more for lint directives.

But no, that's impossible since compilers only accept plain text files.

Why don't the tools spell-check my comments as I type (as does pretty much any word processor)? Give me word-wrap for comments so I don't have to edit all those CR/LFs (carriage return/linefeeds) every time I make an edit! In my opinion the comments are as important as the code, so grammar checking is as important as syntax verification.

I want a compiler that accepts source files that include both code and the documentation, documentation that includes formatted text, drawings, charts, and more. The stuff any user expects in the most limited word processor today.

Why can't we document the transfer equations of a control algorithm with super- and subscripts? Clarity is our goal, and a formula written using asterisks for exponentiation is hard to read.

Shouldn't summation signs look like Σ instead of some laboriously constructed nonstandard text-only description?

One reason it's so hard to change code without injecting errors is that the comments are mind-numbing text that looks just like the C itself. Like driving on a long desert road, our eyes glaze over and we miss important warnings and notes about possible interactions. A formatted warning such as "DANGER: The following three lines of code are highly optimized for speed and must never be changed" leaps out at the maintainer in a way fixed fonts never will.

Give me reviewing tools that support collaboration. If enabled, one can see the changes made to the file by yourself or other authors, and it's easy to insert notes that appear in a different color and different font that explain why a change was made. Or that ask for approval for a change. Word processors have had this fantastic feature for years. But no, we rely on poorly maintained comments and another off-line tool (the version control system) that gets updated only on check-in. So most change descriptions are terse and incomplete.

Frankly, I think one reason comments are so bad today is that they're ugly and dull. Dress 'em up visually and developers will be more inclined to get them right.

Linkages
For nearly 15 years we've lived in a hyperlinked world. A single click takes us to sites and information scattered on a vast planetwide network of computers. Everyone links everything everywhere.

Except in source code, which does not recognize the nature of a hyperlink. Ironically, there's an awful lot of code written to implement links. We give this tool, which has the ability to jump at will through cyberspace, to the entire world yet don't use it in our own work.

Commenting is truly an art form, as is any sort of writing. Masters know which documents belong in the code itself, and which should be kept in other locations. Do we document the intricate details of a complex algorithm in the comments or refer the reader to the original research paper? Judicious use of hyperlinks can tie the code to outside reference material.

Requirements traceability, mandated in many safety-critical applications, means you identify which code satisfies each requirement. A comment might say “The following meets requirement 14.3.2A.” If we could add a hyperlink to section 14.3.2A of the requirements document the developer could instantly pop up the relevant section in another window to compare the code with the spec.
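Even with plain-text source, a small tool can approximate this kind of link. The sketch below scans comments for requirement IDs and pairs each with a URL into the requirements document. The base URL, the `#req-` anchor scheme, and the function name are assumptions made for illustration, not anything an existing IDE provides.

```python
import re

# Hypothetical location of the requirements document; not from the article.
SPEC_URL = "https://example.com/requirements.html"

# Matches IDs such as "14.3.2A": dotted numbers with an optional trailing letter.
REQ_PATTERN = re.compile(r"requirement\s+(\d+(?:\.\d+)*[A-Z]?)", re.IGNORECASE)


def find_requirement_links(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, requirement ID, URL) for each requirement cited."""
    links = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in REQ_PATTERN.finditer(line):
            req_id = match.group(1)
            # The "#req-<id>" anchor format is an assumed convention.
            links.append((lineno, req_id, f"{SPEC_URL}#req-{req_id}"))
    return links


example = """\
/* The following meets requirement 14.3.2A. */
void close_valve(void) { }
"""

for lineno, req_id, url in find_requirement_links(example):
    print(f"line {lineno}: {req_id} -> {url}")
```

An editor plug-in could use the returned line numbers to turn each cited ID into a clickable link; the regular expression would need adjusting to match a project's own numbering scheme.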

Sometimes I just want to hyperlink inside a single file. The IDE should generate a table of contents at the top of each module with links to each function, interactively, as we work. Though we do sometimes have class browsers that give some of this capability, they analyze the program once it's written, not as it's being created.

(I wonder if a new sort of language, one built of links, makes sense. A function call is nothing more than a real-time link, after all.)

One reason software is so difficult is that we're presented with a tiny view of a huge structure. It's like the old story of the blind man trying to identify an elephant. Hyperlinks bring the rest of the code and the rest of the project's documentation immediately and easily into view.

File formats
How do we store all of this information? Text files are inadequate, which is why word processors use a variety of open and proprietary file formats to encode vast amounts of meta-data that go far beyond the words one sees on the screen.

If a vendor decides to create a format to give us all the capability I've described, well, the wise developer will run, fast, to another IDE purveyor. Don't get locked into one particular environment. Vendors go out of business or are assimilated into other organizations. Proprietary formats increase the risk that we won't be able to maintain the code years or decades hence.

Though we need a standard for source files, one as versatile as those used by word processors, I think it would be a mistake for the IEEE or other body to invent a new standard when so many extant ones work so well today.

A couple of standards already exist:

The OpenDocument Format (ODF) is a non-proprietary file format based on the XML format originally created for the Open Office suite of desktop tools.

XML, or Extensible Markup Language, looks vaguely like HTML and is as descriptive of data as Microsoft's forever-changing .doc format. The files are text-based (though they may be compressed) so those precious libraries comprising 100s of megs of source code will be readable next year and next century. Not many binary formats can make the same claim.

XML is seen as an alternative to closed formats such as Microsoft's .doc, .ppt and .xls versions. ODF is gaining a lot of traction; some 13 companies including IBM, Oracle, Google, and Sun have made significant commitments to it. This past September, amid great controversy, the State of Massachusetts decided to standardize on it for many of the State's agencies.

Microsoft, too, is migrating to an XML-derived version. In a recent move the company is seeking to make their Office Open XML format both open and free. They're hoping to get approval in 2006 by the International Organization for Standardization. In fact, today the company's Office XP and 2003 already use a zipped XML format.

Get with the times
I'm taking the compiler vendors to task because they usually provide the entire IDE (at least, for non-Eclipse environments) and the compiler will have to accept much more complex and much richer input files. So it's up to them to effect change.

Long ago Donald Knuth conceived the compelling idea of Literate Programming. Write the software as a story, intertwining both code and description, using a fully graphical text-processing system he called TeX. But the poor state of compiler technology forced Knuth to create tools that split the input file into separate document and source (text, of course) outputs.
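The splitting step described above can be sketched in a few lines. This is a toy illustration, not Knuth's actual tooling: the `@code` and `@end` markers are invented for the example, and real literate-programming systems use their own syntax and do far more (chunk naming, reordering, typesetting).

```python
def tangle(document: str) -> str:
    """Collect only the lines between '@code' and '@end' (hypothetical markers)."""
    code_lines = []
    in_code = False
    for line in document.splitlines():
        marker = line.strip()
        if marker == "@code":
            in_code = True
        elif marker == "@end":
            in_code = False
        elif in_code:
            code_lines.append(line)
    return "\n".join(code_lines)


def weave(document: str) -> str:
    """Produce a reader-facing listing: prose kept as-is, code indented."""
    out = []
    in_code = False
    for line in document.splitlines():
        marker = line.strip()
        if marker == "@code":
            in_code = True
        elif marker == "@end":
            in_code = False
        else:
            out.append("    " + line if in_code else line)
    return "\n".join(out)


story = """\
The filter averages the last two samples to smooth sensor noise.
@code
int smooth(int a, int b) { return (a + b) / 2; }
@end
"""

print(tangle(story))  # the text a plain-text compiler would receive
print(weave(story))   # the text a human reader would receive
```

Both outputs come from one input file, which is the point: the description and the code cannot drift apart as separate files can.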

It's time to extend compilers. Let's keep both graphical docs and source code united, presented in both editing and source-level debugger windows. The function of the code will be much clearer, and the documentation will more likely stay synchronized with the code.

First we need unity on a file format. That will surely happen. Tool vendors and embedded systems developers should clamor for this now.

The “I” in IDE means “integrated” but that's a lie. IDEs today are not integrated. They're a motley collection of random tools (compiler, editor, linker, and so forth) duct-taped together to somewhat ease programming. A truly integrated environment binds the tools tightly together, performing syntax checks as one types, identifying unresolved link issues dynamically, and more.

Basic gave us those capabilities 40 years ago. It's time for our tools to catch up to the state of Office suites.

Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges.

Reader Response


Once upon a time (before MacOS X) the word processor from www.nisus.com stored plain text in the data fork and all formatting information and graphics in the resource fork.

Compilers such as ThinkC, Apple's MPW, and Metrowerks CodeWarrior had no problems compiling fully dressed and formatted Nisus Writer files. It was fun to do once.

More importantly it meant that a non-Nisus word processor could salvage a document's text with minimal effort.

Maybe the problem is with the editor, not the compiler.

–David Kelly
Huntsville, AL


I agree with Jack. For today’s embedded projects and highly complex code it’s sometimes tricky to get the attention of any fellow coder. Though configuration management tools are there as you pointed out, if in 10 lines of comment a particular section is made bold or underlined, this can get the attention of anyone working on the code. Formatting the text can help fellow coders share the logic but leave out just enough information in a smart way. I generally prefer IDE options to color out text then code and so on. It definitely helps to figure out what’s there in source code text.

–ashm
India


“Doctor heal thyself” should be written on the office walls of all tools developers.

Hell, I will be happy even if I get auto word complete in an IDE editor. Visual programming has become such a catch phrase that the current text tools have no hope of any serious upgrade.

–Kalpak Dabir
Proprietor
Polar Systems and Devices
India


I completely agree with you. There's a good editor that you can duct tape to your compiler that does its own on-the-fly _color_and_font_ syntax highlighting, Source Insight by Source Dynamics @ http://sourceinsight.com. If you actually set up a project it will process all the files and cross reference everything for you (makes you a bit lax mentally though).

–A. Perez
Murrietta, CA


Why not 2 panes–code on the left, comments on the right, all kept in sync. Code looks like code, document looks like document. Arrows to the code (once in a while) would be nice.

–Tom Sullivan
Lafayette, IN


I applaud your vision. But first things first.

Like I told the compiler support guy the other day: “I thought crashing compilers went out with gas lights.”

–Steve Vreeland
Fieldbus Inc.
Austin, TX


I realize it's not exactly what you're writing about here, but I use IAR products & their editor does do some of the formatting of which you speak. Comments are italicized and in a different color than source text, and key words are in boldface, so that does help. The editor also allows you to select automatic indenting (although I usually do my own). Perhaps the real problem is one of user inertia?

–Dave Telling
Electronics Engineer
Mr. Gasket, Inc.
Carson City, NV


I've wanted rich text source for years as well. The compilers I was using for 6811s and 8051s didn't have an IDE, so I invented my own–with Microsoft Word and a bag of macros. It turns out that with just a small collection of macros, Word can be quite an amazing editor. You can do bold, and color, and embed most anything into your document. A great example is an Excel spreadsheet representing the analysis leading to a particular implementation. I pasted it into the comment section immediately adjacent to the code. Word could easily reformat your code, so when you typed an opening brace, it created the matching column-aligned brace. I created a custom dictionary for each project, so it not only spell checked my comments, but the functions and variable names too. It colorized, sliced and diced. When you clicked save, it wrote the .doc file and then emitted either a .C, .h, or .asm as you defined. The comments looked a little weird with the ASCII version of an <embedded object> tag, but it gave most of what we are asking for yet today. Yet, I've succumbed to the less visual IDE like the rest of us. A series of Word upgrades in the 90s and its evolving, and not always backward-compatible, Visual Basic created a maintenance problem that wore me down.

I'm also reminded of the lowly Commodore 64, and an operating system and development environment called GEOS. Its word processor was where you wrote code–and it allowed rich formatting. You used its drawing program to create a graphic, which you copied and pasted into the document. Its macro assembler read the document complete with embedded graphics and produced the executable. The graphics were commonly converted into sprites for animation. It was an incredible environment – killed I suspect based on speed and their dependency on the C64.

And of course, while we're on compilers, why don't they emit more reports–call trees, stack depth detail, and even some level of timing detail? I used to count the cycles in the assembly code, but now we simply throw more cost into even small systems to cover the overhead that we can't plan for–run-time library performance and footprint for example.

–Dave Smart
Johnston, IA


It sounds like what you want is a good source editor/IDE. There are many good tools available that can provide many of the “lacking” functions that you mention.

–Dean L. Enoch
Liebert Corp
Senior Project Engineer
Delaware, OH


All good ideas, but before we extend the compilers could we get them to compile correctly and maybe even optimize a little. Next I'd like better diagnostic messages and then perhaps an improved debugger.

–Fred Carter
Systems Engineer
Waters Corporation
Milford, MA


Why does someone use a hammer and chisel to carve a piece of wood instead of a computer controlled milling machine? Because wood carving is an “Art” not a “Science”.

–Tom Szolyga
Palo Alto, CA


Use XML:

<comments>The code does ...<link>http:\blah.comspec</link></comments>
<code>main{...</code>

This could be done in the IDE. Give the code stuff to the compiler!

–Tim Flynn
Houston, TX
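The XML suggestion above is easy to prototype with a standard parser. The sketch below is one possible reading of the idea: the `<source>` root element, the sample content, and the function names are assumptions added for illustration, and only the `<comments>`, `<link>`, and `<code>` element names come from the response.

```python
import xml.etree.ElementTree as ET

# Hypothetical file layout: a <source> root wrapping <comments> and <code>.
RICH_SOURCE = """\
<source>
  <comments>Blink the LED. <link>http://example.com/spec</link></comments>
  <code>int main(void) { return 0; }</code>
</source>
"""


def code_for_compiler(xml_text: str) -> str:
    """Return only the text of the <code> elements, in document order."""
    root = ET.fromstring(xml_text)
    return "\n".join(elem.text or "" for elem in root.iter("code"))


def links_for_editor(xml_text: str) -> list[str]:
    """Return the hyperlink targets the IDE would make clickable."""
    root = ET.fromstring(xml_text)
    return [elem.text or "" for elem in root.iter("link")]


print(code_for_compiler(RICH_SOURCE))
print(links_for_editor(RICH_SOURCE))
```

One practical wrinkle: C source is full of `<` and `&`, which must be escaped or wrapped in CDATA sections before it can live inside XML, so an editor would have to handle that transparently.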


I think the software community is stuck with the idea that “But we have always done it that way.” They just don't want to change the paradigm. I would contend that a Pentium 4 is internally as complex as a very large software/firmware product. Twenty years ago, the chip designer sat at a drafting board and individually drew each transistor and even the metal lines connecting the transistors. Over the next twenty years, the chip design paradigm changed. A lot of proprietary software was developed by the likes of Intel, Motorola, etc. to improve the productivity. Then, the VHDL concept came along and replaced a lot of the proprietary software. Today, VHDL, Verilog, etc. is the standard to develop not only Pentium 4s, but also thousands of custom ASIC chips and has even taken over the design for PLDs and FPGAs.

I don't think we need a better compiler that just allows us to more easily do what has been done in the past. We need a new software paradigm, one that allows us to describe the application in its terms and then creates the code that allows the processor to implement the application.

I am sure that there is enough intellectual talent available to create a new paradigm, but the business issues will be the problem. I don't think there would be a big enough market for some company to want to develop this as a product. And, the tool would not sell for $99 a copy. It would be more like a full seat of ModelSim and Synopsys, starting around $50,000 and going to $250,000 or more with options. The final sad commentary is that we both know the software engineering manager just won't spend that kind of money for a tool, regardless of what it promises. So, we could have a great tool, but nobody buys it.

Are we stuck here for the next 20 years? I sure hope not! There needs to be somebody with a great idea, some means to get started, and then a software community that truly recognizes the benefits and embraces the new ideas.

–Howard Smith
Milwaukee, WI


I just sent a copy of this article to the engineering managers responsible for our IDE and compiler product lines. I think it will give them a lot to think about!

–Colin Walls
Accelerated Technology
United Kingdom


I think a key reason is because people (and companies) trust pure text files not becoming obsolete or beholden to some software company.

Maybe if Eclipse comes up with a reasonable and open solution for software design and code artifacts then that could become a de facto standard.

A similar discussion is happening on comp.lang.ada: http://tinyurl.com/bc3bz

–Mark Taube
Tucson, AZ


Finally, somebody said it!! I always wanted to ask for the all the things that Jack has asked for!!

–Chaitanya
Infineon Tech
India


Everybody has said what I was going to say already! The power should be in the editor. I'd LOVE to see an editor with separate code and comment panes; user-selectable bold, italic, font, and color choices; and footnotes.

–Rich Ries
Software Engineer
Honeywell
Morristown, NJ


I empathise here, Jack.

Context sensitive editors, with inter-file navigation features have been available for some time–these do improve the engineer's ability to interpret, and to quickly move around the total source code.

However, my question is, should a large amount of effort be expended repeatedly tackling problems from the wrong end of the software engineering spectrum–the code/implementation end? I have worked on too many projects where the code was “poor quality code”, the design was merely “poor quality code”, and the requirements were again “poor quality code”. Our industry needs training, techniques and tools to encourage Darwinian evolution, and bring the code-monkeys down out of their trees.

–Martin Allen
United Kingdom


I am quite sure it is relatively easy to take a Word file (or ODF file), extract sections of text marked as [code] or whatever else, pass them to the compiler, get back a list of lines with errors or warnings, and mark these lines inside Word (or, even better, Open Office). Therefore, why don't we write code inside a word processor instead of using a "glorified" plain text editor (which any IDE is)? All an IDE really does is call the compiler, call the linker, process output, call the debugger/simulator, and react to the current combination of program counter/source file/line.

--Slavko Radman
South Africa
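The round trip described in the response above (extract the marked code, compile it, then map diagnostics back to the document) hinges on keeping a line map. The sketch below shows only that bookkeeping on plain text. The closing `[/code]` marker and the function names are assumptions, and reading a real Word or ODF file and invoking a real compiler are left out.

```python
def extract_code(document: str) -> tuple[str, list[int]]:
    """Return the code text plus, for each code line, its 1-based document line."""
    code_lines = []
    doc_line_numbers = []
    in_code = False
    for doc_lineno, line in enumerate(document.splitlines(), start=1):
        tag = line.strip()
        if tag == "[code]":
            in_code = True
        elif tag == "[/code]":  # assumed closing marker
            in_code = False
        elif in_code:
            code_lines.append(line)
            doc_line_numbers.append(doc_lineno)
    return "\n".join(code_lines), doc_line_numbers


def to_document_line(compiler_line: int, doc_line_numbers: list[int]) -> int:
    """Translate a 1-based line number reported by the compiler."""
    return doc_line_numbers[compiler_line - 1]


doc = """\
Design note: the counter wraps at 255.
[code]
unsigned char count;
count = count + 1
[/code]
"""

code, line_map = extract_code(doc)
# Suppose the compiler flags the missing semicolon on line 2 of the extracted code.
print(to_document_line(2, line_map))
```

With the map in hand, the word processor side only has to highlight the returned document line.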


Source Insight does provide a language (of macros) which can be compiled and plugged into the IDE. I think it covers most of the features desired by all.

But, integration/compatibility with other source code editors is still a question.

--Prafulla Harpanhalli
Motorola INDIA
Singapore


While we're at it, why don't we ask for reverse video screen (or something like Classic Borland screen)? A brilliant engineer I've known edits her code in a DOS screen. If you have to look at the screen all the time it really helps to cool down what you're looking at. Just because we print code listing on white paper doesn't mean we like looking at white screens.

And also, with wide screen laptops, we need multi-column views to look at long codes or codes at different places simultaneously.

--Patrick Wong



It might be interesting to look at the xref emacs extension at www.xref-tech.com. It supports automated code refactoring, variable name completion, jump to the declaration of a symbol, jump to header file and a host of other features. It was implemented in 1,000,000 (1e6) lines of C code and costs about $400 a seat, works on C and C++. It will do most of what's asked here. I saw it on slashdot a year or two ago and love it.

--Cameron Kellough
Research Engineer
SRI International
Huntsville, AL


Many of the tricks people are talking about in existing editors are already done (best, in my opinion) in Emacs. Syntax highlights, multi-column views, etc.

The idea of code in one pane, comments in another has been done somewhere.

The important thing that Jack is after and which I have thought about for many years is the idea of integrating formatting by the programmer within the main code, not as decoration on the side or added automatically afterwards by tools.

--Jakob Engblom
Sweden


I suspect a lot of code is written assuming that the author is the only one who will ever see it. (If I seem to be pointing a finger, rest assured there are three more fingers pointing back at me.) Until we overcome that attitude the finest documentation tools will go as unused as a health club membership given as a Christmas gift.

--Dave Hinerman
Wilmington, MA


Everything Jack says can be solved without inventing a new xml format or whatever. He more or less is asking for some rich-editing of source code comments.

I think 60% of what he is asking can be solved currently with doxygen and compelling IDE support.

A more elaborate solution would be a source comments-based documentation generator, based on Wiki syntax. With Wiki syntax on sourcecode comments one can achieve 90% of what he's talking about. The 10% left would be IDE work: adding another mode (other than text mode edition), perhaps based on a RichTextEdit or Html control.

So basically, one would use the "text mode" when editing the code (like when one edits WikiMedia texts), and using another control for viewing (Html control, like when one reads WikiMedia texts).

--Takeshi Miya
Argentina


How about a compiler where you can selectively set a variable as "lockable"? When application logic knows the variable will never change again, code can set it as "locked" until the same code resets it to "unlocked." This kind of invariant condition could be extended to "has min/max value." Compiler flags could cause semantic checking at compile time or at execution time.

--Jack Coleman
Avionics Systems Analyst
Boeing
Wentzville, MO
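The execution-time half of the "lockable" idea above can be approximated today with a small wrapper type. The sketch below is only that: a run-time check in Python with invented names (`Guarded`, `lock`, `unlock`), not a compiler feature, and it does nothing for the compile-time checking the response also asks for.

```python
class Guarded:
    """A value with optional min/max bounds that can be locked against writes."""

    def __init__(self, value, minimum=None, maximum=None):
        self._minimum = minimum
        self._maximum = maximum
        self._locked = False
        self._value = None
        self.set(value)  # the initial value must satisfy the bounds too

    def lock(self):
        self._locked = True

    def unlock(self):
        self._locked = False

    def get(self):
        return self._value

    def set(self, value):
        if self._locked:
            raise RuntimeError("variable is locked")
        if self._minimum is not None and value < self._minimum:
            raise ValueError(f"{value} is below minimum {self._minimum}")
        if self._maximum is not None and value > self._maximum:
            raise ValueError(f"{value} is above maximum {self._maximum}")
        self._value = value


setpoint = Guarded(20, minimum=0, maximum=100)
setpoint.lock()
try:
    setpoint.set(30)  # rejected: the value is locked
except RuntimeError as err:
    print(err)
setpoint.unlock()
setpoint.set(30)  # accepted after unlocking
print(setpoint.get())
```

A compiler-supported version could reject some of these writes before the program ever runs, which is where the real value of the proposal lies.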


Insofar as Jack's suggestions call for enabling tools, I agree. However, despite his title and his headings, I find his suggestions are highly debatable.

Jack suggests code would be clearer if it were WYSIWYG, had a broader symbol set, and it were packaged with metadata--linking to, or interweaving with, documentation.

In theory, that sounds great, but in practice, I know how awful word processing was to the majority of texts out there, let alone ones created by embedded systems developers. Sure, the eyesore of some lousy webpage is acceptable, but the thought of reading someone else's bad code like that? It's really frightening.

What do I suggest instead? Tools. Code to read code, explain code, traverse code. Profilers, callby trees, reverse engineered state diagrams, sequence diagrams, that sort of thing. Do they exist? Yeah. But they suffer the same fate as features that Jack suggests (and already exist). All these efforts are very fragmented, as Jack explained himself. The fragmentation is because nothing that is free succeeds quickly, and nothing that is costly succeeds for long. Either way it will cost you and end up bloated in the wrong hands.

And that's why code is the same. It's laid out simply; it's a fixed grammar, it's already established, and there are relatively few conventions to deal with. Anything higher order ends up becoming a language in and of itself. Anything useful becomes a command-line tool.

--Ed Dench
Fairfax, VA


I believe that Knuth's literate programming is an example of the type of tool you want, and it's pretty old. It was one document that contained both design and code, with two formatters: one that generated the design doc, and the other that generated the source code.

--Mati Sauks
Canada


4 thoughts on “A call for modern compilers”

  1. I was going to sing the praises of Nisus for the Mac but David Kelly has done that already. I still keep an old Mac with MacOS9 running so that I can use Nisus to write LaTeX (a markup language for text formatting rather than a programming language but the

  2. Having read about 30 or more of the comments, they obviously run the gamut. I have heard all the arguments this way and that way. DOxygen is okay but just not enough. Editors/IDEs which preprocess to remove rich text are just not enough. I am definitely wi

