Beyond makefiles – Building large-scale C projects (ESC-200)

(Are you tired of waiting for “make” to find the one C file that you changed? Do you want to take advantage of your multi-core system to compile in parallel but are afraid that the build won’t be reliable? Do you perform a “clean build” before checking in because you don’t trust your build system? You’re not alone. Learn from Mike Shal, who is teaching a class (ESC-200) at the Embedded Systems Conference about why “make” fails its basic promise of a working and scalable build system, why the prevailing wisdom found in “Recursive Make Considered Harmful” is wrong, and what you can do about it.)

This article will focus on one simple topic: how make and related build systems inhibit development in large-scale C projects. In this context, “large-scale” is in the 10,000 to 100,000 file range, enough for an entire embedded system. The primary issue with these build systems is that they do not scale well.

This makes the development cycle slower and slower as the project gets larger and larger, resulting in wasted developer time. Unfortunately this problem cannot be solved merely by throwing more hardware at it. We need to perform a thorough investigation into why these build systems do not scale, and what can be done to fix it.

During this investigation we will look at some of the history of build systems, the performance of current build systems, and even peek under the hood to see why they perform the way they do. The inevitable conclusion of this investigation is that make, and any build system based on its principles, must not be used for development.

Here are two examples to see what problems we’ll be looking at in this discussion. First, open up a project that you’re working on, and do the following:

1. Perform a build, to make sure everything is up-to-date.
2. Without changing anything, time how long it takes to perform another build (this is called a null build).

How long does Step #2 take to complete? This is all wasted time. Nothing has changed, so nothing should be done. If your project is small then the build may be fast now, but how fast will it be when the project doubles or quadruples in size?

As a second test, try the following if you are able to in your build environment:

1. Intentionally leave out a dependency that you know will be used (for example, you could leave out a dependency on an automatically-generated header file).
2. Perform a build.

Does the build system warn you or produce an error message that a dependency is missing? Does it silently succeed as if nothing is wrong? Will it work if the build is executed serially, but fail randomly if run in parallel?

Both of these problems have one common result: developer time, YOUR time is wasted. Whether this is the time spent waiting for the build system to decide what to do, or whether it is time spent debugging a problem in the build description files, you are spending time working around the build system rather than working on your project. You should have that time back.

The term “build system” can mean many things in different contexts, from full configuration and installation [1] to continuous integration [2]. For this discussion, the term “build system” will have a very limited scope.

The build system is the piece of software that a developer executes after editing source files in order to bring the project up to date. Examples of build systems in this context include make [3] and SCons [4], among many others.

This article will focus only on a small subsection of the build system problem – we are going to look at it from the perspective of a developer going through the Edit-Compile-Test cycle. The build system is responsible for making the Compile step more efficient, so that the developer’s overall productivity can be increased. The total time it takes for the Compile step to execute can be broken down into two parts:

Time_total = Time_BuildSystem + Time_Compiler / N_processors

An ideal build system and compiler would take no time to execute, resulting in a natural lower bound of Time_total = 0. The developer will be more productive as the build system gets closer to this lower bound.

The Time_Compiler variable is the time it takes for the sub-programs to execute. These sub-programs are spawned by the build system, which for a C project will typically be the C compiler.

This time will also include the time for the linker, archiver, and any scripts or other programs that the build system may spawn. The individual compiler steps are generally independent from one another and so they can be executed in parallel among N processors, subject to the constraints of the DAG.

Minimizing the Time_Compiler variable is a worthwhile goal, but that is the job of compiler writers, and it is not the focus here. This article is only concerned with minimizing the Time_BuildSystem portion of the total build time.

The Time_BuildSystem variable is the time it takes the build system to determine which files need to be re-compiled. For example, in make this would be the time to read in Makefiles and generated dependency files, stat() source and derived files, and walk through the DAG to compare timestamps among dependent files.

The build system will then invoke the compiler multiple times to bring files up to date. Note that the build system itself is I/O bound since it must read dependency files and check file timestamps or signatures, which all involve reads from the disk.

This is why only the Time_Compiler variable is divided by the number of processors. Although one could attempt to solve the Compiler part of the equation by buying more hardware or distributing the build among multiple machines, this will not significantly improve the build system part of the equation. For that we need to look at how the build system processes the DAG and decides what to build.
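To make the equation concrete, here is a worked example with invented numbers: 2 seconds of build-system overhead plus 400 seconds of total compiler work spread across 8 cores. Note that buying more cores shrinks only the second term; the build-system term is paid serially on every single build:

```shell
# Illustrative (made-up) timings plugged into the equation above.
awk 'BEGIN {
    t_buildsystem = 2      # seconds spent deciding what to build
    t_compiler    = 400    # total seconds of compiler work
    n             = 8      # processors
    printf "Time_total = %.1f s\n", t_buildsystem + t_compiler / n
}'
```

With these numbers the total is 52 seconds; doubling the core count to 16 only brings it to 27 seconds, and no amount of hardware gets below the 2-second build-system floor.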

Perhaps the most common build system in use today is still make. Make worked well enough for small projects that it gained quick acceptance. For larger projects, where defining the build description in separate modules is desirable, a common practice was to use make recursively.

The “recursive make” pattern is where each Makefile recursively invokes other make processes for each subdirectory in the project. Since each individual make process doesn’t know what the others are doing, this resulted in a number of problems, as documented in Recursive Make Considered Harmful (RMCH) [5] .
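A minimal sketch of the pattern, assuming a POSIX shell and GNU make; the /tmp path and the 'lib' and 'app' modules are hypothetical:

```shell
# Hypothetical two-module project using the recursive make pattern.
mkdir -p /tmp/recmake-demo/lib /tmp/recmake-demo/app
cd /tmp/recmake-demo
printf 'SUBDIRS = lib app\nall:\n\tfor d in $(SUBDIRS); do $(MAKE) -C $$d all; done\n' > Makefile
printf 'all:\n\t@echo building lib\n' > lib/Makefile
printf 'all:\n\t@echo building app\n' > app/Makefile
make    # each sub-make sees only its own directory's DAG
```

Each sub-make here runs blind: the lib make process has no idea the app process exists, which is exactly the hole that RMCH documents.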

Unfortunately, RMCH did not go far enough. This has led to a troubling pattern in recent years: many new build systems are developed with the idea that a global DAG view is necessary to function properly.

Further, they assume that a global DAG view is sufficient for ensuring the build is correct. Neither is the case. In fact, in this article we will see that:

1. Loading the entire DAG is a waste of time, and unnecessary.
2. A build system needs to check the sub-commands’ inputs and outputs to ensure a correct build.

Later we will look at the performance problems associated with a global DAG, and we shall see an alternative to the global view known as the “partial DAG” view. Finally we will see why sub-commands must be checked.

Global DAG Considered Harmful
As shown in Figure 1 below, Recursive Make Considered Harmful [5] used the following DAG during its analysis of a recursive make setup:

              Figure 1
RMCH then showed (Figure 2, below) how this DAG is viewed from multiple separate make invocations:

                                          Figure 2
The problem is that if both main.c and parse.y are modified, the DAG on the left doesn’t know that parse.h needs to be re-built, and the DAG on the right doesn’t know that main.o depends on parse.h.

This means if the left Makefile is executed before the right Makefile, the build will complete successfully but have a subtle error: the main.o object will be using an old version of parse.h. The solution presented in RMCH is to always load the entire DAG, so dependencies like this won’t be missed.

When a build system uses a “Global DAG”, it means that every time the build system is invoked, it will load the entire dependency graph into memory before deciding what to build. This is the behavior of a non-recursive make setup, as well as for derivative build systems like SCons. In fact, SCons lists this very feature prominently on its homepage [4] :

* Global view of all dependencies – no more multiple build passes or reordering targets to build everything.

Let’s see how well this works in practice. Using a build-system test harness from wonderbuild [6] (originally from Games From Within [7] ), we can generate test projects of any size and build them using different build systems.

In this first test, we’ll use a non-recursive Makefile setup (GNU make v3.81, built-in rules disabled) and SCons (v2.0.1.r5134, using timestamps instead of MD5 sums). The project sizes range from 10 files to 100,000 files, increasing by a factor of ten each time.

Note that the time we are measuring here is not the time it takes to build the project from scratch. A full build is completed first. Then we run the build system again, without modifying a single file, and time how long it takes. Since the input sizes are exponential, the graph is shown in Figure 3 below as a log-log plot in order to make some sense of the data:

                                                  Figure 3
The associated Table 1 below shows how much time is wasted during each Edit-Compile-Test cycle due to the use of a build system that does not scale. At 10 files both systems respond quickly at under 1 second.

                                 Table 1
At around 1000 files, SCons starts to be unusable, taking over 9 seconds to do nothing. Make starts to get slow around 10,000 files with times over 2 seconds, and is clearly unusable at the 100,000 file mark with a 4-minute time.

There is no time for SCons in the 100,000 file case since it used up all 8GB of the test machine’s memory and died. Even though some files were built, SCons was not able to keep track of what it had done, so during the next build it started over from the beginning and died again. In comparison, make uses 960MB of memory in the 100,000 file null build.

The key point here is that during these builds nothing has changed, so nothing should be done. And yet each build system does a significant amount of work and wastes real time doing it. The next section presents an alternative to the Global DAG view.

Partial DAG
Instead of the Global DAG approach from RMCH, a build system can use a Partial DAG as described in Build System Rules and Algorithms [8]. Taking another look at the problem from RMCH (Figure 4, below), we can reverse the direction of the dependency arrows:

                Figure 4
By structuring the dependency data properly (Figure 5, below), the build system is able to start with the list of files that have changed and build only the parts of the DAG that are needed to update the whole project.

                                          Figure 5
Notice how the build system doesn’t even need to consider the parse side of the tree if only the main.c input file has changed. We can also generate the partial DAG if the parse.y file has changed, as shown in Figure 6 below:

                                      Figure 6
In this case, the dependency from parse.h to main.o is present, so the build will not be broken like it is with recursive make. Therefore it is not necessary to use a Global DAG to achieve a correct build!

To get an idea of the scope of a Global DAG view compared to a Partial DAG view, we can take a look at a real project (Figure 7, below).

                                      Figure 7
The project is the game “nethack”, which builds approximately 100 object files. In each case, a single C file is modified. These pictures show what is loaded by a build system with a Global DAG, and what is loaded by a build system with a Partial DAG (Figure 8, below).

    Figure 8
The simplicity of the Partial DAG approach results in very real time savings. We can now re-execute the same scaling test from the wonderbuild [6] test harness, and include a build system (Figure 9, below) that uses a Partial DAG, called tup [9].

                                                 Figure 9
It is clear from Table 2 below that tup scales much better, due to the fact that it doesn’t have to load the full DAG during every invocation. However, it still requires over 2 seconds in the 100,000 file case.

                                            Table 2
This is mostly the time it takes to stat() the entire project tree. If instead a file system monitor is used to provide a list of file changes up front to tup, then the build system has a constant null-build performance of 0.003s.

Imagine editing any single C file in a 100,000 file project and being able to start compiling it in a few milliseconds! At first glance it may seem unfair to use this file system monitor for tup and not for other build systems.

However, one of the results of Build System Rules and Algorithms [8] is that even if make were provided a file change list up front, the dependency information is not structured in a way that lets make take advantage of it. Make would still have to load the entire DAG before doing anything, so it would still suffer from poor scalability.

Sub-Command Verification
Compilers have come a long way in terms of recognizing and reporting errors, as well as warning about constructs that are allowed by the C language but are very likely mistakes (for example, x = y in a while-loop condition is usually a typo for x == y).

In contrast, make has not progressed at all. We will take a look at two ways error-checking should be provided by build systems, but commonly aren’t.

Output File Verification
Consider the following Makefile fragment, which demonstrates the need for a build system to verify sub-command outputs:

foo.o: foo.c
        gcc -c foo.c -o bar.o

This fragment is clearly wrong – it provides a rule to build foo.o, and yet the gcc command writes to bar.o. What happens if we run this in make?

$ make
gcc -c foo.c -o bar.o
$ echo $?
0

It dutifully executes the command and exits with success! And yet the DAG that we are describing in the Makefile does not accurately represent the state of the file system. That is why Paul Smith (GNU Make maintainer) has listed this as rule 2 in his Rules of Makefiles page [10]:

2. Every non-.PHONY rule must update a file with the exact name of its target. Make sure every command script touches the file “$@” – not “../$@”, or “$(notdir $@)”, but exactly $@. That way you and GNU make always agree.

This rule is listed as a “Rule of Makefiles”, but who is enforcing it? Make? No. It is up to you, the developer, to enforce it. This is absurd. This must be flagged as an error by the build system.

In contrast, tup checks the actual files written by the sub-command against the list in the build description. Using an equivalent faulty Tupfile:

: foo.c |> gcc -c foo.c -o bar.o |> foo.o

We can see that an error is reported about the incorrect output file:

$ tup upd
[ tup ] Scanning filesystem...0.003s
[ tup ] No tup.config changes.
[ tup ] No Tupfiles to parse.
[ tup ] No files to delete.
[ tup ] Executing Commands...
[ 0/1 ] gcc -c foo.c -o bar.o
tup error: File 'bar.o' was written to, but is not in tup/db. You probably should specify it as an output for the command 'gcc -c foo.c -o bar.o'

Input File Verification
Another example of how sub-commands should be verified is in regard to input checking. In this Makefile fragment, the foo.h header is generated and later used by the foo.c compilation:

all: foo.h foo.o
gen: gen.c
        gcc gen.c -o gen
foo.h: gen
        ./gen > foo.h
foo.o: foo.c
        gcc -c foo.c -o foo.o

We can execute this Makefile in make with a single execution thread just fine:

$ make
gcc gen.c -o gen
./gen > foo.h
gcc -c foo.c -o foo.o
$ echo $?
0

However if we re-build from scratch using a parallel compilation, an error suddenly appears:

$ make -j2
gcc gen.c -o gen
gcc -c foo.c -o foo.o
foo.c:1:17: error: foo.h: No such file or directory
make: *** [foo.o] Error 1
make: *** Waiting for unfinished jobs....

Why didn’t we get an error in the first case but we did in the second case? The entire project is identical in both; all that differs is the parallel compilation mode of make. The problem is that make doesn’t do any input verification at all.

Fortunately in this case the file didn’t exist yet, so the compiler was able to report it as an error. However, it is also possible that the foo.c compilation could read an old foo.h header, while other parts of the program use a newly generated foo.h header with different values.

This would result in a much more subtle and hard-to-detect bug, since there would be no error messages at all. The only evidence of a problem would be mysterious crashes or erroneous program logic.

The result is wasted developer time trying to track down the source of the errors. Or maybe the developer will end up not trusting the build system with parallel builds, so extra cores on the development machine will go unused.

A build system that actually checks input files, such as tup, can report this error even when the build executes serially:

$ tup upd
[ tup ] Scanning filesystem...0.003s
[ tup ] No tup.config changes.
[ tup ] Parsing Tupfiles...
[ 0/1 ] .
[ 1/1 ]
[ tup ] No files to delete.
[ tup ] Executing Commands...
[ 0/3 ] gcc gen.c -o gen
[ 1/3 ] ./gen > foo.h
[ 2/3 ] gcc -c foo.c -o foo.o
tup error: Missing input dependency - a file was read from, and was not specified as an input link for the command. This is an issue because the file was created from another command, and without the input link the commands may execute out of order. You should add this file as an input, since it is possible this could randomly break in the future.
 -- Command ID: 12
 -- [9] foo.h

When the build system handles the responsibility of sub-command verification, the developer can focus their attention on what really matters and get their work done faster.

Build systems like make waste developer time in a number of ways. Whether it is the inefficient algorithms they use, or the lack of error-checking, the result is the same.

The developer spends time working around the build system, rather than focusing on their actual program. In order to get back on track, developers should look beyond make and choose a build system designed with scalability and error-checking in mind.

Michael Shal is a senior staff engineer at The PTR Group with seven years of experience in embedded software design. Primarily working in C, he has been fascinated with the build process since first reading “Recursive Make Considered Harmful” 10 years ago. His job at The PTR Group requires him to do real embedded development, but he continues to pursue the build system problem in his spare time.

This article is part of the material provided to attendees of the class he is teaching on this topic at the Embedded Systems Conference (ESC-200).

[1] Free Software Foundation, Introducing the GNU Build System.
[2] Martin Fowler, Continuous Integration.
[3] Free Software Foundation, GNU Make.
[4] The SCons Foundation, SCons: A software construction tool.
[5] Miller, P.A. (1998), Recursive Make Considered Harmful. AUUGN Journal of AUUG Inc., 19(1), pp. 14-25.
[6] Benchmarks of various C++ build tools.
[7] Noel Llopis, The Quest for the Perfect Build System (Part 2).
[8] Mike Shal, Build System Rules and Algorithms.
[9] Mike Shal, tup.
[10] Paul Smith, Paul’s Rules of Makefiles.
