Paul A. Clayton

Paul A. Clayton's contributions
    • [cont.] (The MPU design space seems to be poorly explored. There appear to be two basic design types—range and aligned region (sometimes with sectoring)—and little variation in capacity. For microcontrollers and other highly integrated systems, the opportunities for broader use of a default/backing MPU seem significant.) It should also be noted that an MPU can be treated like a software-filled TLB-without-translations. I.e., a permission violation exception can be handled as a "TLB" miss and software can add the appropriate entry (if any) into the MPU. Timing critical tasks could have the timing critical memory areas preloaded at task switch (possibly with the most critical areas "locked" and invalidated/validated as appropriate on context switches to minimize time overhead). (I am just a computer architecture enthusiast not an embedded systems developer. I look forward to reading what actual developers think about these issues.)
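      A minimal sketch of the refill idea, assuming a hypothetical 16-entry MPU; MpuRegion, mpu_set_region(), lookup_task_region(), and the entry-locking convention are all invented names for illustration, not any particular vendor's interface:

          #include <cstdint>

          // Hypothetical MPU interface -- every name here is invented for illustration.
          struct MpuRegion { std::uint32_t base; std::uint32_t size; std::uint32_t perms; };
          void mpu_set_region(unsigned index, const MpuRegion& r);      // assumed hardware access
          bool lookup_task_region(std::uint32_t addr, MpuRegion* out);  // task's region list ("page table" analogue)

          constexpr unsigned MPU_ENTRIES  = 16;
          constexpr unsigned FIRST_VICTIM = 4;   // entries 0-3 stay locked for timing-critical areas
          static unsigned next_victim = FIRST_VICTIM;

          // Permission-violation handler used like a software TLB-miss handler: if the
          // faulting address is actually permitted for the current task, install a
          // matching entry and retry the access; otherwise report a genuine fault.
          bool handle_permission_fault(std::uint32_t fault_addr)
          {
              MpuRegion r;
              if (!lookup_task_region(fault_addr, &r))
                  return false;                              // real protection violation
              mpu_set_region(next_victim, r);                // refill an unlocked entry, round-robin
              next_victim = (next_victim + 1 < MPU_ENTRIES) ? next_victim + 1 : FIRST_VICTIM;
              return true;                                   // retry the faulting access
          }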

    • I suspect a full MMU is not desirable for a microcontroller (ARM's R [real-time] profile also uses an MPU and not an MMU despite generally having larger cores than the M profile). (MIPS has similar constraints for real-time and microcontroller cores; even with virtualization, a backing MPU is used to avoid variable timing issues.) Address translation is probably not especially useful for microcontrollers (though even without paging out to lower cost memory there are uses for address translation, including simplifying dynamic sizing of memory segments). TLBs (as caches of translations) increase the difficulty of providing tight WCET guarantees. Page tables also use extra memory, which could be an issue for tiny systems. On the other hand, I think MPUs could be improved significantly in capacity and speed of modification (and perhaps providing permission space numbers analogous to an MMU's ASIDs) to allow different tasks to have different permission constraints without introducing less predictable behavior or the power/area overhead of translation. E.g., a 32-bit microcontroller could easily provide an instruction to enable/disable user-level permissions for 32 MPU entries (or even 64 using a pair of registers), as in the sketch below. Even a backing map (which is provided by some MPUs) could be used to reduce MPU area/power overhead while providing flexible permissions by using "sectoring" of the entries (for an MPU the tag overhead is relatively large because the data portion is relatively small, especially if CAMs are used, which is attractive for avoiding placement constraints); a 32-bit microcontroller could mark 32 sectors of a global entry as valid/invalid in a single operation. [cont.]
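      As a rough sketch of how cheap per-task permission switching could be, assuming a hypothetical 32-entry MPU that exposes its user-permission enables as a single 32-bit register (USER_PERM_ENABLE and its address are invented):

          #include <cstdint>

          // Invented register: bit i enables user-level access through MPU entry i.
          volatile std::uint32_t* const USER_PERM_ENABLE =
              reinterpret_cast<volatile std::uint32_t*>(0x40001000u);

          // On a context switch, a single store swaps the whole user-permission view.
          inline void switch_task_permissions(std::uint32_t task_entry_mask)
          {
              *USER_PERM_ENABLE = task_entry_mask;  // e.g., task A: 0x0000003Fu, task B: 0x0000FFC0u
          }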

    • It is somewhat surprising that source code escrow was not mentioned with respect to the danger of product discontinuation. Not providing source code may also imply some negative things about the vendor. E.g., if the vendor is concerned about trade secrets, it implies that the customer is not trusted to uphold the licensing agreement; if the vendor does not want the code exposed because of increased security risks or because the code is ugly (which may be a valid tradeoff with availability—ugly working code now can be better than beautiful working code years from now), it implies the code is not of the highest quality. Such concerns may be valid to some degree, but the implications are not likely to win over potential customers. A vendor might also be concerned about patent or even copyright litigation; even pure cleanroom software development (which adds significant cost) cannot protect against accidental patent violation, and exposing source code presumably increases the vulnerability to patent trolls. Not providing source code also avoids certain support issues. Even if the support contract explicitly states no support or "reasonable effort" support of a modified version, customers may (unjustly) complain about a lack of support. In addition, questions about the rationales behind design decisions can increase support costs (these questions could not be asked if the designs were secret). With respect to temptation, providing source code to the customer does not prevent the customer from restricting modification. In addition, with modern version control systems, forking software is much less dangerous.

    • While rubber ducking does not require another person, a person with substantially less knowledge can be helpful in asking "stupid questions". Not only does such present an opportunity to clarify one's thinking (an opportunity common for teachers) but it can jolt one out of a pattern of thinking. As you noted, an outside expert can provide not only uncomfortable questions (which one knows one should ask oneself but would otherwise avoid answering) but also alien questions (which one would otherwise not even have imagined). I have not had "a mentor or advisor for work or life issues", but at least the Internet provides access to a significant amount of external knowledge. This is not as helpful as a more personal association but is more helpful than even a vast collection of books; one can ask "stupid questions" and try to answer others' "stupid questions".

    • Presumably you meant "segmented" not "paged". Most more recent systems with address translation use paging. (PA-RISC [and Itanium] and PowerPC are a bit unusual in using segments to extend the address space which is then translated at the page level; but bringing up these architectures would increase complexity with little benefit.) While I can understand excluding such from a short introductory article, Single Address Space OSes do not provide separate address spaces to each process, but still provide protection. In addition, even without address translation a memory protection unit (which is simpler than an MMU--though you may be using MMU in a more generic sense that includes MPUs) can provide permission isolation.

    • Thanks for the information. (I wonder if bit-band regions are more C-friendly. It might be possible for a compiler to infer bit operations [e.g., "device |= 0x04", but it would be dangerous to rely on compiler optimization when an I/O device access might have side effects].) Again, thanks for the teaching.
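      For reference, a sketch of the alias arithmetic, assuming a Cortex-M3/M4-style peripheral bit-band region; DEVICE_REG and its address are made up:

          #include <cstdint>

          // Cortex-M3/M4 peripheral bit-banding: each bit in 0x40000000-0x400FFFFF
          // is aliased to a full word at 0x42000000 + (byte_offset * 32) + (bit * 4).
          constexpr std::uint32_t bitband_alias(std::uint32_t byte_addr, unsigned bit)
          {
              return 0x42000000u + ((byte_addr - 0x40000000u) * 32u) + (bit * 4u);
          }

          constexpr std::uint32_t DEVICE_REG = 0x40010004u;  // hypothetical device register

          inline void set_device_bit2()
          {
              // Writing 1 to the alias word sets bit 2 of DEVICE_REG; contrast "device |= 0x04".
              *reinterpret_cast<volatile std::uint32_t*>(bitband_alias(DEVICE_REG, 2)) = 1u;
          }

      Writing 0 to the same alias word clears the bit, and the read-modify-write is handled by the bus logic rather than by an interruptible software sequence.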

    • Aside from things like legacy benefits (validation/recertification, familiarity, etc.) and popularity benefits (fit-for-purpose chips--beyond the core--are available, supply is reliable, etc.), how much sense do 8-bit processors make? I am guessing that the size difference between a minimal 16-bit core and a minimal 8-bit core is not that great in the context of an entire chip. If the 16-bit ISA provides a better programming experience, better code density, or even just a more natural transition path from tiny to medium-capability microcontrollers, then a slightly more area-/energy-expensive _core_ might be preferred if system area, energy, development effort, or other costs are reduced. (For an organization with slightly broader product lines, reducing the ISA count could be useful.) (From the little I understand, less ISA-specific factors are more important for portability, so the transition path factor may be very minor.) It is also not obvious that there is a great difference between an 8-bit ISA that uses register pairs to support 16-bit addressing (and perhaps addition) and a 16-bit ISA that uses register partitions to support 8-bit values. However, I am not an embedded systems developer (but merely someone interested in computer architecture).

    • "as an open source OS without commercial licensing fees, it could be used at no cost in courses as long as it is not used for commercial development" According to the Open Source Definition ( ), point 6 (No Discrimination Against Fields of Endeavor), commercial use cannot be excluded by an open source license. Not all gratis software is Open Source (even when source is provided without extra fees beyond distribution costs). The Free Software Definition ( ) is similar: "The freedom to run the program, for any purpose (freedom 0)".

    • "The inline keyword is also implemented in many modern C compilers as a language extension," C99 includes 'inline', so for C99 it is not a language extension. Non-inlining of single call site static functions has been used to extract uncommonly executed code from the main path. A better solution would be something like gcc's __builtin_expect()--likely hidden behind a macro--(assuming using actual profile data is impractical), but if the compiler does not support such avoiding inlining would be the next best option.

    • I think "specialdevice.bit[3] = 1;" would be more friendly than "specialdevice |= 0x0004;"; but "specialdevice.function_active = true;" would be better (where "function_active" describes the role of the bit in a way that "true/false" would fit as values). In addition to the gray area of overloading based on appearance (as in the stream interface), there are also cases where an operation might be applied to a single member (or even a member selected based on the type of the other operand). While such may abstract the object in a way that simplifies the code, it may make code after future modifications less clear ("Why do I use 'object += more_foo;' to increase foo-ness but ' += more_bar' to increase bar-ness?") or confuse the programmer ("What is 'object'? There is 'object += more_foo' but also 'object += more_bar'. Why do I get a compiler error for 'object += 3'?" or worse: "What is the difference between 'object += more_bar', ' += more_bar', and 'object.later_bar += more_bar'?"). My guess would be that "when in doubt, leave it out" applies in most cases. (I am not a programmer, much less an embedded systems programmer.) If better unicode support was common, appearance-based overloading would be less useful. E.g., something like "⇐" and "⇒" might have been used for the stream interface.

    • Computers in Spaceflight: The NASA Experience is also available for web browsing at (I ran into this from searching for information about Voyager's use of data compression, inspired by this question on the new Space Exploration Stack Exchange site: [This site is currently in "private Beta", which means that for a couple of weeks or so only those initially committed to the site can post.].)

    • Moving to simpler cores will tend to run into the parallelism wall. In addition, more cores will mean more interconnect overhead--not as much overhead as the aggressive execution engines in a great big OoO core, but non-zero overhead. (Some multicore chips do have private L2 caches [sharing in L3]. Also, sharing would mainly add a small arbitration delay. With replicated tags, the arbitration delay in accessing data could be overlapped with tag checking. With preferential placement, a local partition of a shared L2 could typically supply data; less timing critical [e.g., prefetch-friendly] data could be placed in a more remote partition.) It should also be noted that memory bandwidth is a problem. Pin (ball) count has not been scaling as rapidly as transistor count. (Tighter integration can help, but such can also make thermal issues harder to deal with.) While "necessity is the mother of invention", the problems are very difficult (arguably increasingly difficult--earlier, software-defined parallelism was not necessary for a single processor chip to achieve good performance; now, SIMD and multithreading are increasingly important). Even the cost of following the actual Moore's Law (doubling transistor count in a "manufacturable" chip) has been increasing, so economic limits might slow growth in compute performance. In any case, this can be viewed as an opportunity for clever design.

    • Expecting training would also tend to weed out the sub-mediocre employees (either by making them better or by recognizing that they are not willing to learn), which tends to improve productivity and morale. Longer employee retention while excluding those unwilling to learn would tend to increase productivity and morale by increasing cultural coherence and trust. Knowing who to ask, when to ask, and how to ask are important factors in communication (but cannot be covered in a pamphlet or even an organizational wiki). Trust also substantially reduces the need for communication and improves morale (reducing stress, increasing feelings of being valued, and increasing feelings of community).

    • The only way "intrinsically-serial code" can be converted into parallel code is with algorithmic changes--such a drastic rewrite is generally not considered converting. There is much more hope for converting *implicitly* serial code into explicitly parallel code, at least for limited degrees of parallelism. However, for personal computers, it seems doubtful that the benefit of programming for greater parallelism would justify the cost when most programs have "good enough" performance (or are limited by memory or I/O performance). If most code is not aggressively optimized for single-threaded operation, why would the programmers seek to optimize the code for parallel operation? By the way, "Number 4: ARM's BIG.little heterogeneous cores" should, of course, be "Number 4: ARM's big.LITTLE heterogeneous cores"; ARM seems to be emphasizing the smaller core.

    • Several compilers provide extensions supporting '0b' notation for binary literals (e.g., gcc). C++11 also supports user-defined literals using suffixes (so one could define "_B" as such a suffix and have the compiler run--at compile time--a procedure which translates such strings into numbers); a sketch follows below. This stackoverflow question seems to be a good source of information on this topic:
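      A sketch of the suffix approach, assuming a C++11 compiler; the _B operator below is illustrative, not a standard facility:

          #include <cstdint>

          // A raw literal operator receives the token "1010" from the spelling 1010_B
          // and converts it at compile time.
          constexpr std::uint64_t parse_binary(const char* s, std::uint64_t acc = 0)
          {
              return (*s == '\0') ? acc
                   : (*s == '0' || *s == '1')
                       ? parse_binary(s + 1, (acc << 1) | static_cast<std::uint64_t>(*s - '0'))
                       : throw "a _B literal may contain only 0 and 1";
          }

          constexpr std::uint64_t operator"" _B(const char* s) { return parse_binary(s); }

          static_assert(1010_B == 10u, "translated at compile time");
          // With the gcc extension (standardized in C++14), 0b1010 expresses the same value directly.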

    • I think moving from PDF to XML could be good. If the industry could standardize on a DTD (or collection of DTDs--or more information-rich schemas), it might become practical to develop tools that extract and manipulate the information. Even with just a DTD, CSS could be used to provide context-specific presentation of the data in the XML document with almost any web browser. (Collaboration seems unlikely, though.) PDF is a print-oriented format. XML would be particularly appropriate for data-rich, non-narrative documents like datasheets which have significant amounts of commonality in the type of information presented.

    • Just providing the source code (under a license that allows modification but restricts use to those already licensed--while allowing a third party to make modifications) would be enough to help avoid some issues. (Open source would be better for the customer, of course.) Unfortunately, a company is unlikely to provide source code (and required documentation) even after the product is no longer supported because such might reveal trade secrets or reduce demand for new products. Revealing poorly written source code and documentation could also hurt a company's reputation, and preparing such for public release would add costs for a product that is no longer generating revenue.

    • Amen! C++ is intended to be a multi-paradigm programming language; it is almost a superset of C.

    • The fatal defect rate per KLOC would presumably increase with program size (in the common case of intercommunication). If two communicating modules each have a 5% chance of having a bug and a 0.05% chance of an internal catastrophic bug, there is presumably a non-zero chance that a bug that is not internally catastrophic in one module will interact catastrophically with a like bug in the other module. While increasing the number of users tends to exercise more potential paths to failure, increasing the diversity of uses can even more broadly exercise the system. I also wonder how the metric of defect potential would apply for agile development, which might generate more total bugs even though bugs are discovered more quickly (unless the defect count is taken at the first candidate for open release).

    • While I agree that there are tradeoffs among quality, cost, and time to market, I am not certain that your Patriot Missile and Therac examples are good examples. I received the impression that in the Therac case, there was overconfidence in the quality of the code, and a simple, relatively inexpensive failsafe would have avoided the problem. The Patriot Missile system may also have increased the damage from a failure by targeting the ground and (I suspect) should have been more quickly fixed. (I do not remember there being that many Scud launches that were intercepted, though I suspect you are correct that delivering a faulty system early was better than nothing. The psychological advantage was also significant; doing something--even if ineffective--can help morale.) One problem, however, is that tolerance of defects can corrupt the development culture. Being a good manager (considering all these and the many other tradeoffs) is not easy.

    • [continuing] One of the strengths of C++ is that it is multiparadigm. One can write C-style code in C++. (C++ also provides better support for implementing features in libraries where C would have to extend the base language itself--e.g., complex numbers. This significantly facilitates extension of the language. The more broadly useful a tool is, the more diversely it will be tested--finding bugs or misfeatures--and the richer its auxiliary facilities will become.) Even C is not as controllable as assembly, but the writing speed and maintainability advantages alone often justify the use of C. (Portability is also a major factor--not only in allowing a given code base to be used on different ISAs but also in allowing more programmers to be familiar with C.) C++ sacrifices further control (while--like C supporting inline assembly--still supporting coding at a lower level) for similar benefits. I am not a programmer, but the benefits of a multiparadigm language seem obvious.
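      A sketch of the library-versus-language point: a toy Cplx type (a stand-in for what std::complex already provides) gets ordinary arithmetic syntax with no change to the core language, whereas C needed the _Complex language feature:

          // Operator overloading lets a library type behave like a built-in numeric type.
          struct Cplx {
              double re, im;
          };

          constexpr Cplx operator+(Cplx a, Cplx b) { return {a.re + b.re, a.im + b.im}; }
          constexpr Cplx operator*(Cplx a, Cplx b)
          {
              return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
          }

          constexpr Cplx z = Cplx{1.0, 2.0} * Cplx{3.0, -1.0} + Cplx{0.5, 0.0};
          static_assert(z.re == 5.5 && z.im == 5.0, "evaluated with ordinary expression syntax");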

    • "Everything classes and virtual functions give you can be easily implemented in C using function tables etc." Except that such does not communicate information to the compiler that can be used for optimization. Removing the table look-up is a common optimization for C++ compilers that might not be performed by C compilers. C requires additional (explicit) context which effectively increases LOC. Bug count and maintainability tend to be track with the count of LOC. "Namespaces... With well constructed modular code namespaces are not an issue and C copes fine. Again, refer to the Linux kernel." I am not familiar with how the Linux kernel handles name conflicts, but I suspect that the centralization of code handling is an important factor in allowing Linux to "cope". While one could handle external name conflicts by passing the external code through a filter that appends a package-specific identifier to each name, such is certainly a kludge. If the external code used other external code, then the intelligence of the filter would need to be increased to recognize references to the external code and append the appropriate identifiers. [to be continued]