A guide to domain specific modeling - Part 3: Higher abstraction levels - Embedded.com

Editor’s Note: In Part 3 of a series excerpted from their book Domain-Specific Modeling: Enabling Full Code Generation, the authors discuss how domain-specific modeling raises the level of abstraction beyond programming by specifying the design using concepts and rules from the application’s problem domain.

As discussed in Part 1 and Part 2 in this series, Domain-Specific Modeling mainly aims to do two things. First, raise the level of abstraction beyond programming by specifying the solution in a language that directly uses concepts and rules from a specific problem domain. Second, generate final products in a chosen programming language or other form from these high-level specifications.

Usually the code generation is further supported by framework code that provides the common atomic implementations for the applications within the domain. The more extensive automation of application development is possible because the modeling language, code generator, and framework code need to fit the requirements of only a narrow application domain. In other words, they are domain-specific and are fully under the control of their users.

Higher levels of abstraction
Abstractions are extremely relevant for software development. Throughout the history of software development, raising the level of abstraction has been the cause of the largest leaps in developer productivity. The most recent example was the move from Assembler to Third Generation Languages (3GLs), which happened decades ago.

As we all know, 3GLs such as FORTRAN and C gave developers much more expressive power than Assembler and in a much easier-to-understand format, yet compilers could automatically translate them into Assembler. According to Capers Jones’ Software Productivity Research (SPR, 2006), 3GLs increased developer productivity by an astonishing 450%.

In contrast, the later introduction of object-oriented languages did not raise the abstraction level much further. For example, the same research suggests that Java allows developers to be only 20% more productive than BASIC. Since the figures for C++ and C# do not differ much from Java, the use of newer programming languages can hardly be justified by claims of improved productivity.

If raising the level of abstraction reduces complexity, then we need to ask ourselves how we can raise it further. Figure 7 shows how developers at different times have bridged the abstraction gap between an idea in domain terms and its implementation.

Figure 7: Bridging the abstraction gap of an idea in domain terms and its implementation

The first step in developing any software is always to think of a solution in terms that relate to the problem domain, that is, a solution on a high abstraction level (Step one). An example here would be deciding whether we should first ask for a person’s name or for a payment method while registering for a conference.

Having found a solution, we would then map that to a specification in some language (Step two). With traditional programming, here the developers map domain concepts to coding concepts: “wait for choice” maps to a while loop in code.

With UML or other general-purpose modeling languages, developers map the problem domain solution to the specification with the modeling language: “wait for choice” triggers an action in an activity diagram.

Step three then implements the full solution: giving the right condition and code content for the loop code. However, if general-purpose modeling languages are used, there is an extra mapping from a model to code.
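As a rough illustration of steps two and three in traditional programming, the domain idea “wait for choice” might end up as a loop like the one below. This is only a sketch: the function name, the option list, and the scripted input source are assumptions made for the example, not something taken from the book.

```python
from typing import Callable, Iterable


def wait_for_choice(options: Iterable[str], read_input: Callable[[], str]) -> str:
    """Keep reading input until it matches one of the allowed options.

    Step two: the domain idea "wait for choice" is mapped to a while loop.
    Step three: the loop is given its concrete condition and body.
    """
    allowed = {option.lower() for option in options}
    choice = read_input().strip().lower()
    while choice not in allowed:  # hand-written loop condition
        choice = read_input().strip().lower()
    return choice


# Hypothetical usage: answers are scripted here instead of typed by a user.
answers = iter(["cheque", "", "Invoice"])
selected = wait_for_choice(["credit card", "invoice"], lambda: next(answers))
print(selected)
```

Nothing in the finished loop still says “wait for choice” in domain terms; that intent survives only in the function name and comments, which is the mapping effort the text describes.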

It is most remarkable that developers still have to perform step one without any tool support, especially when we know that mistakes in this phase of development are the most costly ones to solve. Most of us will also argue that finding the right solution on this level is exactly what has been the most complex.

Automation with generators
While making a design before starting implementation makes a lot of sense, most companies want more from the models than just throwaway specification or documentation that often does not reflect what is actually built. UML and other code-level modeling languages often just add an extra stepping stone on the way to the finished product.

Automatically generating code from the UML designs (automating step three) would remove the duplicate work, but this is where UML generally falls short. In practice, it is possible to generate only very little usable code from UML models.

Rather than having extra stepping stones and requiring developers to master the problem domain, UML, and coding, a better situation would allow developers to specify applications in terms they already know and use and then have generators take those specifications and produce the same kind of code that developers used to write by hand.

This would raise the abstraction level significantly, moving away from programming with bits and bytes, attributes, and return values and toward the concepts and rules of the problem domain in which developers are working.

This new “programming” language then essentially merges steps one and two and completely automates step three. That raised abstraction level coupled with automatically-generated code is the goal of Domain-Specific Modeling.

DSM does not expect that all code can be generated from models, but anything that is modeled generates, from the modelers’ perspective, complete finished code. This completeness of the transformation has been the cornerstone of automation and raising abstraction in the past.

In DSM, the generated code is functional, readable, and efficient—ideally looking like code handwritten by the experienced developer who defined the generator. Here DSM differs from earlier CASE and UML tools: the generator is written by a company’s own expert developer who has written several applications in that domain. The code is thus just like the best in-house code at that particular company rather than the one-size-fits-all code produced by a generator supplied by a modeling tool vendor.
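To make the idea of a company-specific generator more concrete, here is a deliberately tiny sketch. It assumes a hypothetical model format (a list of steps with the made-up concepts `ask` and `wait_for_choice`) and a toy generator that emits Python source. Real DSM tools use graphical models and far richer generators, so this only shows the principle.

```python
# Hypothetical model: registration steps expressed in domain terms only.
REGISTRATION_MODEL = [
    {"concept": "ask", "field": "name"},
    {"concept": "wait_for_choice", "field": "payment",
     "options": ["card", "invoice"]},
]


def generate(model: list) -> str:
    """Toy generator: turn the model into complete Python source text."""
    lines = ["def register(read_input):", "    answers = {}"]
    for step in model:
        field = step["field"]
        if step["concept"] == "ask":
            lines.append(f"    answers[{field!r}] = read_input()")
        elif step["concept"] == "wait_for_choice":
            lines.append("    choice = read_input()")
            lines.append(f"    while choice not in {step['options']!r}:")
            lines.append("        choice = read_input()")
            lines.append(f"    answers[{field!r}] = choice")
        else:
            raise ValueError(f"unknown concept: {step['concept']}")
    lines.append("    return answers")
    return "\n".join(lines) + "\n"


generated_source = generate(REGISTRATION_MODEL)
print(generated_source)

# Load the generated code and run it with scripted input.
namespace: dict = {}
exec(generated_source, namespace)
scripted = iter(["Ada", "cash", "invoice"])
result = namespace["register"](lambda: next(scripted))
```

The modeler only states what to ask and which choices are valid. The loop structure, which the in-house expert decided once when writing the generator, never has to be edited by hand.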

The generated code is usually supported by purpose-built framework code as well as by existing platforms, libraries, components, and other legacy code. Their use depends on the generation needs, and later in this book (Part III), we illustrate DSM cases that apply and integrate with existing code differently. Some cases don’t use any additional support other than having a generator.

At this point, we need to emphasize that code generation is not restricted to any particular programming language or paradigm: the generation target can be, for instance, an object-oriented as well as a structural or functional programming language. It can be a traditional programming language, a scripting language, data definitions, or a configuration file.

A DSM solution evolves. Changes to the DSM language and generators are more the norm than an exception. A DSM solution should never be considered ready unless all the applications for that domain are already known. The DSM solution needs to be changed because the domain itself and related requirements change over time.

Usually this leads to changes in the modeling language and related generators. If a change occurs only on the implementation side, like a new version of the programming language to be generated or using a new library, changes to just the code generators can be adequate. This keeps the design models untouched and hides implementation details from developers using DSM.
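The following sketch illustrates an implementation-side change that touches only the generator. The model contents and both generator functions are hypothetical. The point is simply that the same unchanged model can feed either target.

```python
import json

# Hypothetical design model; it contains no implementation details.
MODEL = {"product": "thermostat", "max_temp": 30, "units": "celsius"}


def generate_json_config(model: dict) -> str:
    """Original generator: emits a JSON configuration file."""
    return json.dumps(model, indent=2, sort_keys=True)


def generate_c_header(model: dict) -> str:
    """Replacement generator: emits C preprocessor definitions instead."""
    lines = []
    for key in sorted(model):
        value = model[key]
        # Strings become quoted C literals; numbers are written as-is.
        literal = f'"{value}"' if isinstance(value, str) else str(value)
        lines.append(f"#define {key.upper()} {literal}")
    return "\n".join(lines) + "\n"


model_before = dict(MODEL)  # snapshot to show the model is not modified
json_output = generate_json_config(MODEL)
header_output = generate_c_header(MODEL)
print(header_output)
```

Swapping `generate_json_config` for `generate_c_header` changes the output format entirely, while developers working with the model see no difference.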

A DSM solution also needs to be updated because your understanding of a domain, even if you are an expert in it, will improve while defining languages and generators for it. Even after your language is in use, your understanding of your domain will improve through modeling or from getting feedback from others who model with the language you defined. Partly you will understand the domain better and partly you will see possible improvements for your language.

When to use DSM?
Languages and tools that are made to solve the particular task that we are working with always perform better than general-purpose ones. Therefore DSM solutions should be applied whenever it is possible. DSM is not a solution for every development situation though. We need to know what we are doing before we can automate it.

A DSM solution is therefore implausible when building an application or a feature unlike anything developed earlier. It is something unique that we don’t know about. In such a situation we usually can only make prototypes and mock-up applications and follow the trial-and-error method, hopefully in small, agile, and iterative steps.

In reality, we don’t often face such unique development situations. It is much more likely that after coding some features we start to find similarities in the code and patterns that seem to repeat. In such situations, developers usually agree that it does not make sense to write all code character by character.

For most developers, it would then make sense to focus on just the unique functionality, the differences between the various features and products, rather than wasting time and effort reimplementing similar functionality again and again. Avoiding reinventing the wheel is good advice for a single developer, but even more so if colleagues are implementing almost identical code too.

In code-driven development, patterns can evolve into libraries, reusable components, and services to be used. Building a DSM solution requires a similar mindset as it offers a way to find a balance between writing the code manually and generating it. How the actual decision is made differs between application domains.

Conclusion
Using resources to build a DSM solution implies that development work is conducted over a longer period within the same domain. DSM is therefore a less likely option for companies that work on short-term projects without knowing which kind of application domain the next customer has. Similarly, it is less suitable for generalist consultancy companies and for those having their core competence in a particular programming language rather than a problem domain.

Although the time to implement a DSM solution can be short, from a few weeks to months, the time it takes for that investment to pay off can dampen interest in making it. The longer a company expects to work in the same domain, the more likely it is to be interested in developing a DSM solution.

Some typical cases for DSM are companies having a product line, making similar kinds of products, or building applications on top of a common library or platform. For product lines, a typical case of using domain-specific languages is to focus on specifying just variation: how products are different.

The commonalities are then provided by the underlying framework. For companies making applications on top of a platform, DSM works well as it allows having languages that hide the details of the libraries and APIs by raising the level of abstraction on which the applications are built.
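The split between variation and commonality can be sketched in a few lines. The watch product line, the `WatchFramework` class, and the feature names below are all hypothetical and serve only to show the division of labor.

```python
class WatchFramework:
    """Hypothetical framework code shared by every product in the line."""

    def __init__(self, name: str, features: tuple):
        self.name = name
        self.features = features

    def describe(self) -> str:
        # Commonality: every product shows the time; extras come from the spec.
        return f"{self.name}: " + ", ".join(("time",) + self.features)


# Variation only: the specification states how the products differ.
PRODUCT_SPECS = {
    "Basic": (),
    "Sport": ("stopwatch",),
    "Travel": ("alarm", "world time"),
}

products = [WatchFramework(name, extras) for name, extras in PRODUCT_SPECS.items()]
descriptions = [product.describe() for product in products]
for line in descriptions:
    print(line)
```

Adding a product means adding one entry to the specification, while the shared behavior stays in the framework and is written only once.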

Application developers can model the applications using these high-level concepts and generate working code that takes the best advantage of the platform and its services. DSM is also suitable for situations where domain experts, who often can be nonprogrammers, can make complete specifications using their own terminology and run generators to produce the application code. This capability to support domain experts’ concepts makes DSM applicable for end-user programming too.

Domain-specific modeling fundamentally raises the level of abstraction while at the same time narrowing down the design space, often to a single range of products for a single company.

With a DSM language, the problem is solved only once by visually modeling the solution using only familiar domain concepts. The final products are then automatically generated from these high-level specifications with domain-specific code generators.

With DSM, there is no longer any need to make error-prone mappings from domain concepts to design concepts and on to programming language concepts. In this sense, DSM follows the same recipe that made programming languages successful in the past: offer a higher level of abstraction and make an automated mapping from the higher-level concepts to the lower-level concepts known and used earlier.

Today, DSM provides a way for continuing to raise the description of software to more abstract levels. These higher abstractions are based not on current coding concepts or on general-purpose concepts but on concepts that are specific to each application domain.

In the vast majority of development cases, general-purpose modeling languages like UML cannot enable model-driven development, since the core models are at substantially the same level of abstraction as the programming languages supported.

The benefits of visual modeling are offset by the resources used in keeping all models and code synchronized with only semiautomatic support. In practice, part of the code structure is duplicated in the static models, and the rest of the design—user view, dynamics, behavior, interaction, and so on—and the code are maintained manually.

Domain-specific languages always work better than general-purpose languages. The real question is: does your domain already have such languages available or do you need to define them?

Part 1: Code- vs. model-driven design
Part 2: Modeling examples

Juha-Pekka Tolvanen has been involved in domain-specific languages, code generators and related tools since 1991. He works for MetaCase and has acted as a consultant world-wide for modeling language and code generator development. Juha-Pekka holds a Ph.D. in computer science from the University of Jyväskylä, Finland.

Steven Kelly is chief technical officer of MetaCase and cofounder of the DSM Forum. He is architect and lead developer of MetaEdit+, MetaCase’s DSM tool.

Used with permission from Wiley-IEEE Computer Society Press, Copyright 2014, this article was excerpted from Domain-Specific Modeling: Enabling Full Code Generation, by Steven Kelly and Juha-Pekka Tolvanen.
