You have a thousand questions about UML, but, as the cliché goes, are afraid to ask. What is UML, exactly? How is it being used and by whom? Are there many ways to use it or just one true way? Does UML apply to embedded and real-time systems, and if so, how? What are the popular ways to use UML and how effective has it been? Why are there so many UML tools, and why do they seem to be so different even though UML is a standard? What should I look for in a tool? Will using UML change my development process? Make it faster? Build better-quality systems? Or is it all hype?
If these and other questions bombard you when you think about UML, this article is for you. It answers these questions and more by describing the provenance of the language; its various usage styles, subsets, and extensions, including real-time profiles; and where it's being used and with what degree of success. A brief self-assessment is included to help you determine whether your team is ready for UML and how it can best benefit your project.
What is UML, really?
The Unified Modeling Language (UML) is a graphical (and textual) way to describe systems that has been standardized by the Object Management Group (OMG). OMG is a not-for-profit consortium of users and vendors dedicated to interoperability of object-oriented systems.1
For a user (usually an embedded systems developer), UML presents itself as a notation for common object-oriented concepts. For example, before UML was standardized some folks represented a class as a box and others as a cloud, each with variations for attributes and other properties of the class. UML standardizes on a three-box compartment with the name of the class in the first compartment, attributes in the second, and so on. This gives us “interoperability” in the sense that we can now have the same understanding of those icons when they appear on a screen or whiteboard.
The definition of UML is captured in a specification that is itself written in UML. Just as you might have a class Circuit with an attribute bandWidth in a model of a telephone system, the UML specification has classes Class and Attribute to capture the concepts in the modeling language. Tools capture user models as instances of those classes (Class, Attribute, and so on), which encourages interoperability of those models among tools.
Where did UML come from?
UML emerged from a surplus of object-oriented methods in the early 1990s. These methods defined ways to analyze and design systems but they all used different notations, which caused considerable confusion in the marketplace. To unify these methods, the “Unified Method” was born as a project within the OMG. Unfortunately (according to the joke I first heard from Martin Fowler), the difference between a terrorist and a methodologist is that you can negotiate with the terrorist. So the scope of UML was cut to define a notation, not a method.
This is important: UML defines a notation, not a method for analyzing and designing systems. UML offers no advice on how to go about building a system, only on how to describe it.
The designers of UML took this decision seriously. Each diagram is well defined but there is no one true way in which the diagrams must fit together. That decision is up to you, the designer.
So what diagrams make up UML, then?
There are thirteen diagram types, listed in Table 1 with their usage. Defining all these diagrams could (and does) take a whole book! For a breezy introduction, read UML Distilled by Martin Fowler.2 If you're partial to cubical books, take a look at Doing Hard Time: Developing Real-Time Systems with UML by Bruce Powel Douglass.3 For another view, read Executable UML: A Foundation for Model-Driven Architecture by Marc Balcer and your most humble author (me).4
Unless you're a tool builder or a masochist, I don't recommend reading the UML specification from the OMG because it's written for clarity of specification, not ease of understanding.
Which diagrams are useful for embedded systems?
All UML diagrams have their uses, even for embedded or real-time systems. The most commonly used diagrams are the class diagram, the state machine, the use case, and the sequence diagram. The timing diagram is new in UML 2.0, so we don't yet know how useful it will turn out to be. Certainly, the state-machine diagram is especially helpful in event-driven systems.
Why is UML important for designing embedded systems?
UML increases your productivity by allowing you to work at a higher level of abstraction. Studies going back to the days of the first programming languages show that you can write the same number of lines of code per day, regardless of the language, so a higher-level language like UML will necessarily make you more productive. Moreover, UML allows you to visualize concurrent behavior, which is difficult in a traditional programming language where everything is expressed linearly.
I have direct knowledge of UML being used in systems as varied as telecommunications switches (several million lines of C++) all the way down to pacemakers and drug-delivery devices, consumer electronics, avionics, automotive–you name it.
Research firm Venture Data Corporation says that 14% of projects in 2004 used UML and this will grow to 25% by 2007.5 Of course, those percentages depend on what you mean by “uses.”
How is UML used?
There are many ways that UML can be said to be used. Most people use UML informally. That is, they sketch out a few diagrams on the back of the proverbial napkin and discuss their abstractions with their peers. From there, coding proceeds directly. Martin Fowler's book, UML Distilled, explains this usage.
Others use UML to specify software structure, creating a near one-to-one correspondence between the UML diagrams and the code. Frequently, a software tool can generate code frames from these models. The developer then adds code directly to the model or in separate files linked to the output generated from the models. Note that if the generated output is changed by hand, the model no longer reflects the code. This disconnect may lead to a desire to “round trip” and generate a model from the code. The round trip only makes sense as long as the model bears a one-to-one relationship to the code. Bruce Powel Douglass describes this UML usage in Doing Hard Time: Developing Real-Time Systems with UML.3
The third use is executable models that have action language as a part of the model. The UML action model is designed to allow the model to be translated into any implementation. A single model may be translated into one that has many tasks or one, many processors or one. In contrast to the second use, there is no necessary correspondence between the structure of the model and the structure of the implementation except that the behavior defined by the model must be preserved. Balcer and I describe this use in Executable UML.4
These three ways of using UML (sketch, blueprint, and executable) enable different kinds of reuse.
What are the types of reuse?
Sketches are useful mainly to help visualize a solution and for communication among people, whether the sketches are drawn on a napkin or with a drawing tool. Although sketches have the advantage of low maintenance costs (because you don't bother to update them), they cannot be reused except from the base of knowledge you and your colleagues have accumulated in your heads.
Blueprints are useful to document a design, especially a large one that will be coded as a separate step. Depending on your process, you may code from the blueprint or write code as you build the model incrementally. (The latter is generally preferred.) However, as knowledge is gained from coding, the design may change. Going back to change the model is generally seen as “extra work” that doesn't contribute to getting a running system, so the model and the code may get out of sync.
Attempts to reuse the models often founder on that inaccuracy, leading to calls to “reverse engineer” the code. Leaving aside the fact that it's problematic to reverse-engineer an artifact that wasn't engineered in the first place, the real engineering is smeared throughout the model. The fact that you decided to use three tasks, for example, is evident, but your choice of communication mechanism is repeated, often slightly differently in each place, wherever communication is required. Lack of uniformity of design is one reason for the failure of such reverse engineering.
These concerns can be addressed by making the blueprint the sole artifact, adding code to the graphical UML elements as required, and generating everything–always–from the model, a kind of high-level visual programming. These models can then be reused if both the application and the implementation structures will be largely the same in the new system.
In contrast, the distinguishing feature of executable models is that they separate application from implementation so they can each be separately reused.
How can you reuse an implementation without reusing the application?
Executable UML models have two parts: the model of the application and a set of rules for implementation.
We can capture what your system does and what data it needs to support that behavior in a UML model of the application, with the proviso that we make no implementation decisions. Now let's decide to use, say, C structures, a state-machine dispatcher, and a function for each set of actions triggered at once. The two sentences above represent two separate ideas. The former captures the behavior of the application; the latter captures the overall structure of the implementation.
You can reuse these latter (implementation) decisions in a completely different application, building a wholly different system that also makes use of C structures, a state-machine dispatcher, and a function for each set of actions triggered at once. After all, there is no mention of the application in those implementation decisions.
You could also make a different set of implementation decisions, for example using C++ classes, a switch statement for each state machine, and a sequence of linear statements for what gets done on each transition, and apply those to any application.
In other words, the application can be reused and redeployed separately from the set of implementation decisions, which can be reapplied to different applications. Figure 1 shows the elements of an executable UML model for a simple microwave oven application. These three diagrams–classes, state machines, and action language–show the entire executable semantics of an application. Other diagrams may be derived or used as thinking tools to produce the executable model.
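As a minimal sketch of the first set of implementation decisions described above (C structures, a state-machine dispatcher, and a function for each set of actions triggered at once), consider the fragment below. The oven's states, events, and attribute names are assumptions made up for illustration; they are not taken from Figure 1, and a real model compiler would generate something more elaborate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical microwave oven: the states, events, and fields below are
   illustrative assumptions, not the article's actual model. */
typedef enum { IDLE, COOKING, DOOR_OPEN, NUM_STATES } OvenState;
typedef enum { START_PRESSED, DOOR_OPENED, DOOR_CLOSED, TIMER_EXPIRED,
               NUM_EVENTS } OvenEvent;

/* Decision 1: a class becomes a C structure. */
typedef struct {
    OvenState state;
    int magnetron_on;
    int light_on;
} Oven;

/* Decision 3: one function for each set of actions triggered at once. */
typedef void (*ActionFn)(Oven *);

static void enter_idle(Oven *o)      { o->magnetron_on = 0; o->light_on = 0; }
static void enter_cooking(Oven *o)   { o->magnetron_on = 1; o->light_on = 1; }
static void enter_door_open(Oven *o) { o->magnetron_on = 0; o->light_on = 1; }

typedef struct {
    int valid;        /* 0 means the event is ignored in this state */
    OvenState next;
    ActionFn action;
} Transition;

/* Entries not listed are zero-initialized, so valid == 0 for them. */
static const Transition transitions[NUM_STATES][NUM_EVENTS] = {
    [IDLE] = {
        [START_PRESSED] = { 1, COOKING,   enter_cooking },
        [DOOR_OPENED]   = { 1, DOOR_OPEN, enter_door_open },
    },
    [COOKING] = {
        [DOOR_OPENED]   = { 1, DOOR_OPEN, enter_door_open },
        [TIMER_EXPIRED] = { 1, IDLE,      enter_idle },
    },
    [DOOR_OPEN] = {
        [DOOR_CLOSED]   = { 1, IDLE,      enter_idle },
    },
};

/* Decision 2: a dispatcher that knows nothing about ovens beyond the table.
   Returns 1 if a transition was taken, 0 if the event was ignored. */
int oven_dispatch(Oven *o, OvenEvent ev)
{
    assert(o != NULL && o->state < NUM_STATES && ev < NUM_EVENTS);
    const Transition *t = &transitions[o->state][ev];
    if (!t->valid) {
        return 0;
    }
    o->state = t->next;
    t->action(o);
    return 1;
}
```

Only the table and the action functions mention the oven. The structure-plus-dispatcher pattern is the part that could be reapplied to a different application, while a different set of decisions (say, switch statements) could be applied to the same oven model.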
How effective is UML?
Frankly, it's hard to judge the effectiveness of UML in a scientific way. Few people have the resources to build a system twice, once using UML and once without. If they do, it's often with small projects using students as guinea pigs. That's unconvincing.
It stands to reason, however, that if we can sketch a design and talk about improving it, it's going to be better than if we just hack out the design. In any case, using UML this way is an option, and you wouldn't diagram out everything, only those pieces that can benefit from sketching. Using UML will then decrease time to market because the earlier you find an error, the cheaper it is to fix.
It's not clear that using UML as a blueprint for software is effective. More accurately, it's not clear that building a blueprint, carefully documenting it, and then coding from that blueprint is effective. In fact, there's a whole movement, the Agile Alliance, that argues against models and documents as being superfluous and costly.6 The most visible faction of this movement is XP, or extreme programming, led by Kent Beck, who wrote a book on the subject in 2000.7 Chief among the arguments made by this movement is that you cannot know a model is correct because you can't test it, nor does it execute.
This argument doesn't apply to executable models, though. Because these models can be interpreted, they can be checked for correctness with the customer and analyzed in a variety of ways for synchronization problems. This type of verification is especially important in systems that must be known to be correct, such as medical instruments, airplane control, and the like. Executable models give you the advantages of both UML and agility.
I hear a lot about UML 2.0. What's that?
UML 2.0 is the next major revision of UML. The first version was UML 1.0, and UML 1.1 was the first update based on reviews of the initial specification. And so on. UML 1.5, the last UML 1.x, added an action model to the specification. UML 2.0 added features to model architectures better, such as the composite structure diagram and the timing diagram. It also added several features from SDL, the Specification and Description Language.
Moreover, UML 2.0 refactored and reorganized the underlying metamodel of UML 1.x with the goal of making it cleaner and smaller, but with minimum violence to the notation of UML 1.x.
A metamodel is a model in which the instances are types from another model. The UML metamodel has classes Class and Attribute. Instances of the class Class might be Circuit or Switch in a model of a telephone system, and bandWidth would be an instance of the class Attribute.
Meta- is a Greek prefix meaning “beyond” or “after,” and it's a relative term. The class Class is meta to the class Switch, and the class Attribute is meta to the attribute bandWidth.
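One way to picture the two levels is to write them down as data. In the sketch below, the C types Class and Attribute stand in for the metaclasses, and the telephone-system model is stored as instances of them. The field names and the find_class helper are assumptions for illustration; the real UML metamodel is far richer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Metamodel level: C stand-ins for the UML metaclasses Attribute and Class.
   The fields are illustrative assumptions. */
typedef struct {
    const char *name;
} Attribute;

typedef struct {
    const char *name;
    const Attribute *attributes;
    size_t attribute_count;
} Class;

/* Model level: the telephone-system model stored as instances of the
   metaclasses above, much as a tool might hold a user's model. */
static const Attribute circuit_attributes[] = { { "bandWidth" } };

static const Class telephone_model[] = {
    { "Circuit", circuit_attributes, 1 },
    { "Switch",  NULL,               0 },
};

/* Hypothetical helper: look up a class by name, or return NULL. */
const Class *find_class(const Class *model, size_t count, const char *name)
{
    assert(name != NULL && (model != NULL || count == 0));
    for (size_t i = 0; i < count; i++) {
        if (strcmp(model[i].name, name) == 0) {
            return &model[i];
        }
    }
    return NULL;
}
```

Here Circuit and Switch are data (instances of Class), and bandWidth is data too (an instance of Attribute), which is what it means for Class to be meta to Switch.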
Why is UML 2.0 not well suited to embedded/real-time applications?
To the contrary, UML 2.0 is well suited to the design of embedded/real-time applications but it's also well suited to business process reengineering, workflow analysis, and IT systems. This harkens back to the idea that though each diagram is well-defined, there's no single way in which the diagrams must fit together. That decision is up to you.
How do real-time developers fit the diagrams together?
There are many possible ways real-time developers can fit UML diagrams together. For example, some people:
• Build whatever seems to be illuminating at the moment
• Build class diagrams, add attributes and operations, then code from that
• Build class diagrams, then sequence diagrams for certain important scenarios, and then code
• Start from sequence diagrams derived from the use cases, abstract classes from the scenarios, and then code
• Build classes, then state machines for each class, then actions for each state, and then translate the models into implementation
There are still more approaches for non–real-time work such as workflow analysis and IT systems. There is no agreement on which approach is best, even under specific circumstances.
Do these processes imply necessary connections among the diagrams?
UML does not prescribe necessary connections among diagrams, nor does it say there must be one state machine for each class. You can build a state machine for anything that takes your fancy. You can also constrain the UML to follow this rule using a profile.
What's a UML profile?
Formally, a profile is a UML model that describes extensions and subsets to UML. Subsets are described using the Object Constraint Language (OCL). For example, to constrain a class to have zero or one state machine, you could write:
Forall Class.StateMachine.allinstances( ) -> size( ) <= 1
Constraints like these can be used to define how the elements of UML fit together; they can also be used to subset the language. For example:
StateMachine.allinstances( ) -> size( ) = 0
requires that there be no state machines in a model.
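To show what a tool might do with such constraints, here is a hedged sketch that checks both rules against a drastically simplified in-memory model. The ModelClass structure and both function names are hypothetical; a real tool would evaluate OCL against its metamodel rather than hand-code each rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a model: each class records how many
   state machines it owns. */
typedef struct {
    const char *name;
    size_t state_machine_count;
} ModelClass;

/* Rule 1: every class has zero or one state machine.
   Returns 1 if the model satisfies the rule, 0 otherwise. */
int at_most_one_state_machine_per_class(const ModelClass *classes, size_t n)
{
    assert(classes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (classes[i].state_machine_count > 1) {
            return 0;
        }
    }
    return 1;
}

/* Rule 2: the model contains no state machines at all (a subset of UML).
   Returns 1 if the model satisfies the rule, 0 otherwise. */
int no_state_machines(const ModelClass *classes, size_t n)
{
    assert(classes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (classes[i].state_machine_count != 0) {
            return 0;
        }
    }
    return 1;
}
```

A model in which Circuit owns one state machine and Switch owns none passes the first rule but fails the second, so it would be legal under the first profile and illegal under the second.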
Extensions are created by defining stereotypes, which are tags that can decorate any model element. For example, we may tag a class “persistent” and use the tag to identify a class whose instances are stored past the lifetime of the runtime of the system.
Informally (and this is ideologically unsound), a profile is any set of extensions to and subsets of UML, whether written down using these mechanisms or not. Formally, a profile is the OCL and stereotype definitions that describe the rules, which in UML 2.0 are captured in a package.
What about the real-time profiles?
There's a document called “UML Profile for Modeling Quality of Service and Fault Tolerant Characteristics and Mechanisms” that defines stereotypes for properties such as performance, dependability, security, integrity, coherence (temporal consistency of data and software elements), throughput, latency, efficiency, demand, reliability, and availability.8 Each of these properties is itself broken down into specific stereotypes. Throughput, for example, defines stereotypes “input-data-throughput,” “communication-throughput,” and “processing-throughput.” The profile also defines extensions for fault-tolerance and risk assessment. My favorite here is “ThreatAgent,” which can be denoted by a stick-figure icon holding a bomb.
You can decorate your model from this catalog of icons and properties to define the real-time requirements of your application problem. Tools may then use this information to construct implementations, though at present this is not directly possible. Consider a class with a “communication-throughput” value over some “observationInterval.” Now what? There are many ways of realizing this requirement; tools are not yet able to decide between them. Nonetheless, having a standard set of concepts defined for characterizing real-time systems is valuable.
Further work is taking place with Modeling and Analysis of Real-Time and Embedded systems (MARTE), which requests a UML profile to add capabilities for analyzing schedulability and performance properties of UML specifications.9
What real-time UML tools are available?
UML, as I've suggested, is a family of languages and UML 2.0 incorporates some concepts from SDL. Each tool set defines one member of the family. Different tools exist because each tool serves a different subset of the market; even so, all tools are pretty much based on the UML standard.
What should I look for in a tool?
What you look for depends on how you want to go about developing systems using UML. If you intend to use it as a sketching language, your requirements will be quite different from those you'll have if you want execution. There's no substitute for doing the research, especially as tools grow and change over time.
Does this mean I have to have an agreed development process to use UML?
You need to have an agreed development process, with or without UML, if you're going to work together as a team. Working as a team doesn't require a “heavyweight” process, however; you could all decide that your process is as simple as “Talk to customer, write code, test it, feedback to customer.” But if each person is doing something completely different, it will be difficult to work together.
But doesn't UML define the process?
No. Suppose two people, Amy and Bob, are working together. If Amy builds models in this order (use cases, sequence diagrams, code) while Bob builds class diagrams, state-machine diagrams, then actions, they will not be able to communicate as effectively as if their processes had some commonality. Bringing UML into a project is sometimes cast as an answer to the question, “How do we build software?” (Answer: “We use UML!”) In fact, it makes the question more pointed.
Will using UML change my development process?
Using UML doesn't have to change your development process, but the introduction of UML is often seen as an opportunity to make some changes. It's critical for your success that you consider what you want your process to be and then make your use of UML fit that process.
Do you have a preferred process for real-time systems?
I prefer to build executable models consisting of class diagrams and state-machine diagrams for those classes, and then write action language for the state machines. All of the processing in the system is housed in concurrently executing state machine instances, which helps capture concurrency in the application so it can be mapped into an implementation.
I didn't see use cases. Why not?
Use cases in UML are remarkably popular, and with good reason: they help focus our attention on the requirements and not on the coding. Especially in real-time systems, with our hard-to-meet time and memory constraints, it's all too easy to fall into designing the system before acquiring an understanding of the problem. This is where use cases can help. Use cases help us build executable and translatable models, but they do not themselves execute.
What do you mean by “translatable”?
An executable model can be interpreted, but that is unlikely to yield a system that meets performance constraints. Accordingly, the model must be translated into an implementation using a set of formalized rules that, for example, take a class and turn it into a C struct, take a state-machine diagram and turn it into a set of switch statements, or take a CreateObjectAction and turn it into a malloc( ), and so on. Each set of rules must be internally self-consistent, and a complete set is called a model compiler.
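The three example rules can be pictured by looking at the kind of C they might produce. The sketch below is hand-written, not the output of any actual model compiler; Circuit and bandWidth are borrowed from the telephone example earlier in the article, while the states, events, and function names are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed states and events for a hypothetical Circuit state machine. */
typedef enum { CIRCUIT_IDLE, CIRCUIT_CONNECTED } CircuitState;
typedef enum { EV_CONNECT, EV_RELEASE } CircuitEvent;

/* Rule: a class becomes a C struct. */
typedef struct {
    CircuitState current_state;
    int bandWidth;
} Circuit;

/* Rule: a CreateObjectAction becomes a malloc().
   Returns NULL if memory cannot be allocated. */
Circuit *Circuit_create(int bandWidth)
{
    Circuit *c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    c->current_state = CIRCUIT_IDLE;
    c->bandWidth = bandWidth;
    return c;
}

/* Rule: a state-machine diagram becomes a set of switch statements,
   here one on the current state and one on the event. */
void Circuit_handle(Circuit *c, CircuitEvent ev)
{
    assert(c != NULL);
    switch (c->current_state) {
    case CIRCUIT_IDLE:
        switch (ev) {
        case EV_CONNECT: c->current_state = CIRCUIT_CONNECTED; break;
        case EV_RELEASE: break;  /* ignored in this state */
        }
        break;
    case CIRCUIT_CONNECTED:
        switch (ev) {
        case EV_RELEASE: c->current_state = CIRCUIT_IDLE; break;
        case EV_CONNECT: break;  /* ignored in this state */
        }
        break;
    }
}
```

A different rule set could map the same model onto a static pool instead of malloc(), which is often the preferred choice in small embedded targets; the model itself would not change.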
What's a model compiler?
A model compiler is simply a set of rules that reads a developer's model, as captured in a metamodel, and turns it into text. That text, of course, will often conform to the syntax of a programming language (in other words, the text is a C program) and can itself be compiled for a processor.
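A toy version of one such rule might look like the following. It walks a class held as metamodel data and emits the text of a C struct into a buffer. The Attribute and Class layouts, the c_type field, and the emit_struct name are all assumptions for illustration; production model compilers are far more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified, assumed metamodel: an attribute carries the C type that the
   rule set has already chosen for it. */
typedef struct {
    const char *name;
    const char *c_type;
} Attribute;

typedef struct {
    const char *name;
    const Attribute *attributes;
    size_t attribute_count;
} Class;

/* One translation rule: turn a class into the text of a C struct.
   Returns the number of characters written (excluding the terminator),
   or -1 if the output buffer is too small. */
int emit_struct(const Class *cls, char *out, size_t out_size)
{
    assert(cls != NULL && out != NULL);
    size_t used = 0;

    int n = snprintf(out, out_size, "struct %s {\n", cls->name);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    used = (size_t)n;

    for (size_t i = 0; i < cls->attribute_count; i++) {
        n = snprintf(out + used, out_size - used, "    %s %s;\n",
                     cls->attributes[i].c_type, cls->attributes[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }

    n = snprintf(out + used, out_size - used, "};\n");
    if (n < 0 || (size_t)n >= out_size - used) {
        return -1;
    }
    used += (size_t)n;
    return (int)used;
}
```

Fed a Circuit class with one int attribute named bandWidth, the rule produces the text of a three-line struct declaration, which an ordinary C compiler can then compile.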
How does a model compiler differ from a programming-language compiler?
Model compilers aren't significantly different from traditional compilers. It's just that the level of abstraction in the UML “programming language” is higher, so much so that the UML program is independent of its platform. You don't even need to make decisions about data structures. A model compiler is, in fact, exactly analogous to a programming-language compiler but at a higher level of abstraction.
Why exactly is UML at a “higher level of abstraction?”
When we write a statement in C that if a == b then …, the C compiler turns this into assembly language that loads registers, maybe subtracts b from a, and jumps if the result is zero. On a different processor, it may do something altogether different. C has “abstracted away” the details of register allocation and even the concept of if, turning it into GOTOs (jump or branch instructions) we would normally eschew.
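The same idea can be shown without leaving C. The two hypothetical functions below compute the same result; the second spells out the jump that the first hides. This only illustrates the control flow, since what a given compiler actually emits depends on the compiler and the processor.

```c
#include <assert.h>

/* Structured form: what the programmer writes. */
int equal_structured(int a, int b)
{
    if (a == b) {
        return 1;
    }
    return 0;
}

/* Jump form: roughly the control flow hidden by the "if", written with
   goto so that it remains legal C. */
int equal_with_jumps(int a, int b)
{
    int result = 0;
    if (a != b) goto done;  /* branch around the "then" part */
    result = 1;
done:
    return result;
}

/* Small self-check that the two forms agree on a few inputs. */
void check_equivalence(void)
{
    assert(equal_structured(3, 3) == equal_with_jumps(3, 3));
    assert(equal_structured(3, 4) == equal_with_jumps(3, 4));
    assert(equal_structured(-1, 1) == equal_with_jumps(-1, 1));
}
```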
When we write in UML that two classes have an association between them, we have abstracted away how the association is made. We don't say whether the implementation uses a linked list, a doubly-linked list, or a table–just as we don't say anything about registers when we write C.
In blueprint-style UML, you abstract away details like this but there is still a nearly one-to-one correlation between the model and the code. In executable UML, we go further and abstract away all data structure decisions, tasking structure decisions, and language decisions and leave these to the model compiler.
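To illustrate what is being abstracted away, the sketch below realizes one association in two different ways. The association itself (each Circuit terminates at one Switch) and every name in the code are assumptions for illustration. The model would state only that the association exists; which realization to use is the kind of decision left to the model compiler.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CIRCUITS 8

/* Hypothetical association: each Circuit terminates at one Switch. */
typedef struct {
    int id;
} Switch;

/* Realization A: a pointer embedded in the Circuit structure. */
typedef struct {
    int id;
    Switch *terminates_at;
} CircuitA;

Switch *circuit_a_switch(const CircuitA *c)
{
    assert(c != NULL);
    return c->terminates_at;
}

/* Realization B: a separate table indexed by circuit id, leaving the
   Circuit structure itself untouched. */
typedef struct {
    Switch *by_circuit_id[MAX_CIRCUITS];
} AssociationTable;

void table_relate(AssociationTable *t, int circuit_id, Switch *s)
{
    assert(t != NULL && circuit_id >= 0 && circuit_id < MAX_CIRCUITS);
    t->by_circuit_id[circuit_id] = s;
}

Switch *table_switch(const AssociationTable *t, int circuit_id)
{
    assert(t != NULL && circuit_id >= 0 && circuit_id < MAX_CIRCUITS);
    return t->by_circuit_id[circuit_id];
}
```

Both realizations answer the same question ("which Switch does this Circuit terminate at?"), so the application model is unaffected by the choice between them.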
How many model compilers are there?
That's like asking, “How many possible designs are there for the same problem?” (which is actually a very interesting, if rather abstract, question). The answer is that there's one model compiler for each possible design; related designs may be housed in the same package using compiler switches. For example, we use a Java model compiler to build our own UML toolset, which is itself modeled. We offer a small-footprint C-based model compiler for real-time and embedded systems.
Are the market leaders ready for model compilers?
Whether the leaders of the market are ready to adopt executable UML is a chicken-and-egg problem. Vendors won't build a solution until they see a market, and customers won't buy tools until they see them available. Even a well-proven tool such as Accelerated Technology's BridgePoint, which has been in the market for 10 years now, can be perceived as risky until further standards are developed.
But we already have a standard, UML! Why do we need another standard?
Executable UML relies on a subset of UML, but each vendor could choose a different subset on which to base its model compilers. That would reduce interoperability. Moreover, the execution semantics of UML are not formally defined.
When can we expect that standard from the OMG?
Initial submissions for the Executable UML Foundation are due in April 2006.10 It generally takes 6 to 12 months after submission for the standard to be officially adopted, although you can, of course, use tools that conform to the to-be-adopted standard.
Should I wait, then?
How visible should my shiny new UML project be?
UML doesn't guarantee success, which, obviously enough, admits of the possibility of failure. In other words, it's a gamble. You have to decide how much you're willing to risk before you see how the project will turn out.
Most engineers instinctively go for the low-risk approach, choosing a low-visibility project or a low-visibility piece of an important project. Of course with low risks, the rewards are lame as well. “Sure it worked for the logging facility, but no way would that stuff work for the real project I'm working on like the motor-control loop.”
By the way, dilute this answer a lot when using UML as a sketching language, and some for blueprint style. The investment, and therefore the risk, is less for sketching than for executable modeling. Consequently, it matters less how visible your project is. Equally, executable models give you the most return on your investment but visibility matters more. The same applies to the questions that follow.
How big should my first project be?
Adopting any new technology on a wide scale, all at once, is rarely a good idea. Pick a project that's manageable in size. It has to be big enough to matter but small enough that you can steer it. Don't underestimate the force required to overcome organizational inertia.
What kind of application is best for a first project?
Don't pick a project that's basically algorithmic. Mathematics is a perfectly good “modeling language” for much of that. UML really helps in laying out the overall structure of an application and is particularly well suited for understanding concurrent behavior. So look for something that's complex enough in the control aspect that your best programmer would struggle to sort it out. Got concurrency? Model it.
What about legacy code?
If you have lots of legacy code you may spend more time trying to interface the newly generated code to it than it's worth. Pick an initial project where the return-on-investment for learning a new technology is high enough to matter. It doesn't make much sense to invest heavily to improve your productivity for a piece that represents 3% of the overall effort of the project.
How well should I know my application? Should I start with something new?
Start by modeling something you know really well. Don't try to learn a new subject matter or push the envelope of one you already know while you're learning to model. After you're comfortable modeling, use your new skills to explore more-complex and unfamiliar areas where you can get a greater return on the investment.
What kind of people should I get?
You'll need smart people, and you'll need several different skill sets along with that intelligence. More important than talent and wit is desire. There's no better way to ensure failure than to populate the project with a few folks who don't believe this new-fangled approach can work for their problem.
So how do I know if I'm ready for UML?
We never get to do anything that's ideal, so we end up making tradeoffs between the desired project and what's available. Only you can make these tradeoffs. In some situations it's better to have a small, nearly invisible success than to risk a spectacular failure. In other cases, it might be just the reverse, where having success in a low-risk environment could just confirm for some that UML is not serious. In the end, indecision is worse than failure. Pick a project, staff it with people, and then make it work.
For some help in making the decision to proceed, take a look at the brief self-assessment below.
This self-assessment is divided into several sections. The first question is, where does it hurt?
Are you having difficulties:
• Capturing requirements?
• Visualizing relationships between requirements?
• Communicating requirements?
• Communicating designs?
• Visualizing interactions?