Comments on Comments
Better English means better code: there's nothing so valuable as a good comment. Too bad they're so rare.
According to Henry Petroski, in The Pencil: A History of Design and Circumstance, the first known book about engineering is the 2,000-year-old work De Architectura by Marcus Vitruvius Pollio. It's a fairly complete description of how skilled artisans created their bridges and tunnels in ancient Rome. One historian said of Vitruvius and his book: "He writes in atrocious Latin, but he knows his business." Another wrote: "He has all the marks of one unused to composition, to whom writing is a painful task."
How little things have changed. Even two millennia ago engineers wrote badly, yet were recognized as experts in their field. Perhaps even then we were geeks. (Which begs the question: were engineers from Athens Greek geeks?)
Some developers care little about their writing skills, figuring they interact with machines, not people. And, of course, developers only communicate with other writing-challenged engineers, right?
This is the communications age. The spoken and written word has never been more important. Consider how e-mail has reinvigorated letter writing, when for years leading up to its advent, philologists moaned about the death of letters.
Old timers will remember how engineers could once function perfectly with no typing skills. That seems quaint today, when most of us live with a keyboard all but strapped to our hands. Just as old-fashioned is the idea of a secretary transcribing notes and fixing spelling and grammar. Today it's up to us to express ourselves clearly, with only the assistance of a spellchecker and an annoyingly picky grammar engine.
I write a weekly column on Embedded.com that generates a lot of e-mail feedback. The majority of these responses are well written, giving the lie to the old generalization that engineers are compositionally challenged. But some replies are appalling. Sure, some non-English speakers struggle with our language's idiosyncrasies, but all too many of these confusing, ungrammatical missives come from Joe Smith in Anytown, USA.
Even if you're stuck in a hermetically sealed cubicle never interacting with people and just cranking code all day, I contend that you still have a responsibility to communicate clearly with others. Software is, after all, a mix of computerese (the C or C++ itself) and comments (a verbal description meant for humans, not the computer). If we write perfect C with illegible comments, we're doing a lousy job.
I read a lot of code from a huge range of developers. Consistently well-done comments are rare. Sometimes I can see the enthusiasm of the team at the project's outset. The startup code is fantastic. main()'s flow is clear and well documented. As the project wears on, functions are added and coded with less and less care. Comments like:
/* ???? */
or my favorite:
/* Is this right? */
start to show up. Commenting frequency declines; clarity gives way to short cryptic notes; capitalization descends into chaotic randomness. The initial project excitement, as shown in the careful crafting of early descriptive comments, yields to schedule panic as the developers all but abandon anything that's not executable.
Onerous and capricious schedules are a fact of life in this business. It's natural to chuck everything not immediately needed to make the product work. Few bosses grade on the readability of the source code. Quality, when considered at all, is usually a back-end complaint about all the bugs that keep surfacing in the released product, or the ongoing discovery of defects that pushes the schedule back further and further.
Firmware folks know that quality starts at the front-end, in proper design and implementation, using reasonable processes. Quality also requires fine workmanship. Our profession parallels that of the tradecrafts of centuries ago. The perfect joint in a chair may be almost invisible, but will last forever. A shoddy alternative could be just as hard to see, but is simply not acceptable. Professional pride mandates doing the right thing just because we know it's the best way to build the product.
Most of us create software in secret. I rarely see companies using code inspections, for example, which at least bring our flaws into the cold, harsh light of day. Secrecy naturally breeds laziness. It takes a very strong person to consistently rise above the temptations of expediency to do things right, even when it's not clear that working carefully will be rewarded.
Though embedded people work at the border between hardware and software, where sometimes it's hard to say where one ends and the other starts, even hardware designers work in the spotlight. Their creations are subject to ongoing audits during manufacturing, test, and repair. Technicians work with the schematics daily. Faults glare from the page for everyone to see. Sloppy work can't be hidden.
(Now, though, ASICs, programmable logic, and high-level synthesis can bury lots of evil in the confines of an inscrutable IC. The hardware folks are inheriting all of the perils of software. For more on this trend, check out Jim Turley's column "The Death of Hardware Engineering.")
eXtreme Programming fascinates me, though I shudder at some of the practices it espouses. All of XP's ideas come from four "core values": communication, simplicity, feedback, and courage. No other methodology that I'm aware of derives from values. In America, we talk a lot about values, sometimes so much so that the meaning gets lost in the rhetoric. Yet values are the basis of good behavior. I think the XP folks got it right by deriving the process from values rather than from a collection of good ideas. However, I'd add a fifth to their list: Pride of Workmanship.
In my experience, software created without pride is awful. Shortcuts abound. The limited docs never mirror current reality. Error conditions and exceptions are poorly thought out. For example, Microsoft's various products have garnered a reputation for their susceptibility to buffer overflow attacks. Unix, too, has long suffered the same flaws. Recent posts on the Risks forum (http://catless.ncl.ac.uk/Risks/21.84.html and http://catless.ncl.ac.uk/Risks/21.85.html) suggest that the C language is the source of the problem. Programs written in C usually have no intrinsic array bounds checking; worse, the dynamic nature of pointers makes automatic run-time checks that much more problematic.
I disagree. C is nothing more than a tool, one that should come with an "adults only" warning. Those who use it carelessly are at fault, not the language itself. Index into a data structure without adding the requisite overflow checks and you're playing with dynamite. While smoking. In a puddle of gasoline.
Every programmer knows he or she should run simple sanity checks on all data from untrusted sources. Not doing so is laziness, a lack of Pride in Workmanship. Careful craftsmen spend a few seconds adding these checks to save months of debugging or millions in product recalls.
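A minimal sketch of such a check follows. The names (store_sample, load_sample, sample_buffer, SAMPLE_COUNT) are hypothetical, invented for illustration; the only point is that an index arriving from outside gets verified before it touches the array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_COUNT 16u   // Hypothetical buffer size, chosen for illustration

static uint16_t sample_buffer[SAMPLE_COUNT];

// Store one reading at the slot requested by an untrusted source
// (a serial packet, say). Returns true if the reading was stored and
// false if the index was out of range; the buffer is never touched
// on failure.
bool store_sample(size_t index, uint16_t reading)
{
    // Reject any index that falls outside the array before using it.
    if (index >= SAMPLE_COUNT) {
        return false;
    }

    sample_buffer[index] = reading;
    return true;
}

// Read back a stored reading. A bad index or a null destination
// yields false and leaves *reading unchanged.
bool load_sample(size_t index, uint16_t *reading)
{
    if (reading == NULL || index >= SAMPLE_COUNT) {
        return false;
    }

    *reading = sample_buffer[index];
    return true;
}
```

Each check costs one comparison. Callers must still look at the return value, of course; a check that nobody reads is no check at all.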
My standard for commenting is that someone versed in the functionality of the product, but not the software, should be able to follow the program flow by reading the comments without reference to the code itself. Code implements an algorithm; the comments communicate the code's operation to yourself and others, maybe even to a future version of yourself performing maintenance years from now.
Write every bit of the documentation (in the U.S. at least) in English. Noun, verb. Use active voice. Be concise; don't write the great American novel. Be explicit and complete; assume your reader has not the slightest insight into the solution to the problem. In most cases, I prefer to incorporate an algorithm description in a function's header, even for well-known approaches like Newton's Method. A description that uses your variable names makes a lot more sense than "see any calculus book for a description." And let's face it: once a program is carefully thought out in the comments, it's almost trivial to implement.
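Here is roughly what such a header might look like for a square root computed by Newton's Method. The function name, the iteration cap, and the tolerance are all assumptions made for the sake of the example, not a recommendation for production math code.

```c
#define MAX_ITERATIONS 50       // Assumed cap on refinement passes
#define TOLERANCE      1e-12    // Assumed relative convergence threshold

// newton_sqrt: Approximate the square root of value by Newton's Method.
//
// We want the guess that satisfies guess * guess = value. Starting
// from an initial guess, each pass replaces it with the average of
// guess and value / guess:
//
//     next_guess = (guess + value / guess) / 2
//
// If guess is too big, value / guess is too small, and vice versa,
// so the average lands closer to the true root. We stop when
// next_guess differs from guess by less than TOLERANCE times guess,
// or after MAX_ITERATIONS passes, whichever comes first.
//
// Input:   value - the number whose root we want; must be >= 0
// Returns: the approximate square root, or -1.0 if value is negative
double newton_sqrt(double value)
{
    if (value < 0.0) {
        return -1.0;            // No real root exists
    }
    if (value == 0.0) {
        return 0.0;             // Avoid dividing by a zero guess below
    }

    // Start at or above the root: value itself when value > 1, else 1.
    double guess = (value > 1.0) ? value : 1.0;

    for (int pass = 0; pass < MAX_ITERATIONS; pass++) {
        double next_guess = (guess + value / guess) / 2.0;

        // Size of the change made by this pass, as a positive number.
        double change = next_guess - guess;
        if (change < 0.0) {
            change = -change;
        }

        // Stop once another pass no longer moves the answer meaningfully.
        if (change < TOLERANCE * guess) {
            return next_guess;
        }
        guess = next_guess;
    }

    return guess;               // Best estimate after MAX_ITERATIONS passes
}
```

Note that the header uses the same names as the code (value, guess, next_guess), so a reader can map each sentence onto a line without opening a textbook.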
Capitalize per standard English procedures. IT HASN'T MADE SENSE TO WRITE ENTIRELY IN UPPER CASE SINCE THE TELETYPE DISAPPEARED 25 YEARS AGO. the common c practice of never using capital letters is also obsolete. Worst aRe the DevElopeRs wHo usE rAndOm caSe changeS. Sounds silly, perhaps, but I see a lot of this. And spel al of the wrds gud.
Avoid long paragraphs. Use simple sentences. "start_motor actuates the induction relay after a three-second pause" beats "this function, when called, will start it all off and flip on the external controller but not until a time defined in HEADER.H goes by."
Begin every module and function with a header in a standard format. The format may vary a lot between organizations, but should be consistent within a team. Every module (source file) must start off with a general description of what's in the file, the company name, a copyright message if appropriate, and dates. Start every function with a header that describes what the routine does and how, goes-intas and goes-outas (i.e., parameters), the author's name, date, version, a record of changes with dates, and the name of the programmer who made the change.
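One possible layout is sketched below. The file name, company, version, change-log entries, and delay stub are placeholders made up for this example; only the shape matters. Substitute whatever fields your team standardizes on.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Module:      motor.c (hypothetical)
 * Description: Starts the drive motor by way of the induction relay.
 * Company:     Example Corp. (placeholder)
 * Copyright:   (c) Example Corp. All rights reserved.
 * Created:     (date)
 */

#define START_DELAY_SEC 3u          // Pause before the relay closes, in seconds

static bool relay_closed = false;   // true while the induction relay is actuated
static uint32_t seconds_waited = 0; // Total delay requested so far, in seconds

/*
 * Function:    delay_seconds
 * Description: Stand-in for a real timer wait. It only records the
 *              time requested so the sketch runs on a desktop.
 * Goes-intas:  seconds - how long to wait
 * Goes-outas:  none
 * Author:      (name)
 * Version:     1.0
 * Changes:     (date) (name) - Initial version.
 */
static void delay_seconds(uint32_t seconds)
{
    seconds_waited += seconds;
}

/*
 * Function:    start_motor
 * Description: Actuates the induction relay after a three-second pause.
 * Goes-intas:  none
 * Goes-outas:  true if the relay is closed on return
 * Author:      (name)
 * Version:     1.1
 * Changes:     (date) (name) - Initial version.
 *              (date) (name) - Moved the delay into START_DELAY_SEC.
 */
bool start_motor(void)
{
    delay_seconds(START_DELAY_SEC);
    relay_closed = true;
    return relay_closed;
}
```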
C lends itself to the use of asterisks to delimit comments, which is fine. I see a lot of this:
* comment *
which is a lousy practice. If your comments end with an asterisk as shown, every edit requires fixing the position of the trailing asterisk. Leave it off, as follows:
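A sketch of that style, using a made-up function (clamp_percent) as the subject:

```c
/*
 * clamp_percent: Limit a requested duty cycle to the range 0 to 100.
 * Values below 0 come back as 0 and values above 100 come back as
 * 100. Only the left edge of this comment carries asterisks, so
 * rewording a line never disturbs the formatting.
 */
int clamp_percent(int requested)
{
    if (requested < 0) {
        return 0;
    }
    if (requested > 100) {
        return 100;
    }
    return requested;
}
```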
Most modern C compilers accept C++'s double slash comment delimiters, which are more convenient than the /* */ pair C requires. Start each comment line with the double slash so the difference between comments and code is crystal clear.
Some folks rely on a fancy editor to clean up comment formatting or add trailing asterisks. Don't. Editors are like religion. Everyone has his or her own preference, each of which is configured differently. Someday compilers will accept source files created with a word processor that will let us define editing styles for different parts of the program. Until then, dumb ASCII text formatted with spaces (not tabs) is all we can count on to be portable and reliable.
Enter comments in C at block resolution and when necessary to clarify a line. Don't feel compelled to comment each line. It is much more natural to comment groups of lines that work together to perform a larger function.
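As an illustration, the hypothetical routine below carries one comment per group of related lines rather than one per statement. The function and its purpose are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

// average_without_extremes: Average a set of ADC readings after
// discarding the single highest and single lowest value, which damps
// the effect of a one-off noise spike. Returns 0 when fewer than three
// readings are supplied, since nothing would be left to average.
// Assumes count is small enough that the sum fits in 32 bits.
uint16_t average_without_extremes(const uint16_t *readings, size_t count)
{
    if (readings == NULL || count < 3) {
        return 0;
    }

    // Walk the readings once, accumulating the total while tracking
    // the largest and smallest values seen.
    uint32_t total = 0;
    uint16_t highest = readings[0];
    uint16_t lowest = readings[0];
    for (size_t i = 0; i < count; i++) {
        total += readings[i];
        if (readings[i] > highest) {
            highest = readings[i];
        }
        if (readings[i] < lowest) {
            lowest = readings[i];
        }
    }

    // Remove the two extremes from the total and average what remains.
    total -= highest;
    total -= lowest;
    return (uint16_t)(total / (count - 2));
}
```

A comment on "total += readings[i];" saying "add reading to total" would tell the reader nothing. The two block comments tell the reader why the loop exists and what happens to its results.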
Explain the meaning and function of every variable declaration. Long variable names are merely an aid to understanding; accompany the descriptive name with a meaningful, prose description.
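A short sketch of what that might look like, built around a hypothetical software watchdog. All of the names and the 250 ms limit are assumptions chosen for illustration.

```c
#include <stdint.h>

// Milliseconds since power-up. Incremented once per millisecond by
// timer_tick(); wraps to zero after about 49.7 days.
static volatile uint32_t elapsed_ms = 0;

// Value of elapsed_ms when the watchdog was last serviced. Compared
// against elapsed_ms to decide whether the main loop has stalled.
static uint32_t last_watchdog_service_ms = 0;

// Longest interval, in milliseconds, the main loop may go without
// servicing the watchdog before we declare it hung.
static const uint32_t watchdog_timeout_ms = 250;

// Advance the millisecond clock by one tick.
void timer_tick(void)
{
    elapsed_ms++;
}

// Record that the main loop is alive.
void service_watchdog(void)
{
    last_watchdog_service_ms = elapsed_ms;
}

// Return 1 if the main loop has gone longer than watchdog_timeout_ms
// without servicing the watchdog, else 0. Unsigned subtraction keeps
// the comparison correct when elapsed_ms wraps.
int watchdog_expired(void)
{
    return (elapsed_ms - last_watchdog_service_ms) > watchdog_timeout_ms;
}
```

The name last_watchdog_service_ms is already long, yet it cannot say what the value is compared against or why. The prose does.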
One of the perils of good comments, and one frequently used as an excuse for sloppy work, is that over time the comments no longer reflect the truth of the code. Comment drift is intolerable. Pride in Workmanship means we change the docs as we change the code. The two things happen in parallel. Never defer fixing comments until later, as it just won't happen. Better to edit the descriptions first, and then fix the code.
One side effect of our industry's inglorious 50-year history of comment drift is that people no longer trust comments. Such lack of confidence leads to even sloppier work. It's hard to thwart this descent into commenting chaos. Wise developers edit the header to reflect the update for each patch; better still, they add a note that says "comments updated, too" to build trust in the docs.
If you use code inspections (and please do; they are the cheapest known way to get rid of bugs) review the comments as well as the code. Both are equally important.
Finally, consider changing the way you write a function. I have learned to write all of the comments first, including the header and those buried in the code. Then it's simple, even trivial, to fill in the C or C++. Any idiot can write software following a decent design; inventing the design, reflected in well-written comments, is the really creative part of our jobs.
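To make that concrete, here is a sketch of a small routine built this way. Imagine that every comment below was in place before any C was added, and that the statements under each one were filled in afterward. The routine, its names, and the packet format are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

// checksum_matches: Verify a received packet.
//
// A packet is length bytes long. The final byte is a checksum equal
// to the low eight bits of the sum of all the bytes before it.
// Returns 1 if the packet is intact and 0 otherwise.
int checksum_matches(const uint8_t *packet, size_t length)
{
    // A packet needs at least one data byte plus the checksum byte.
    if (packet == NULL || length < 2) {
        return 0;
    }

    // Add up every byte except the last, keeping only the low
    // eight bits of the running sum.
    uint8_t sum = 0;
    for (size_t i = 0; i < length - 1; i++) {
        sum = (uint8_t)(sum + packet[i]);
    }

    // The packet is intact if our sum equals the transmitted checksum.
    return sum == packet[length - 1];
}
```

Strip out the C and the comments still read as a complete description of the routine, which is the test of whether the design came first.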
Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges. Contact him at firstname.lastname@example.org.