What is Wrong with Make?
Evolution is a slow process. Getting rid of old bad habits is never easy. This article is a critique of the Make build tool. I'll list its shortcomings this week and suggest a few more modern alternatives next week.
Make looks fine on the surface
The Make build tool has been with us for about 20 years, and it is somewhat sad to see today's complex new projects considering Make as their only choice. I often get the question, "What is wrong with Make?" So often, in fact, that the obvious answer becomes, "The first wrong thing is people's ignorance of Make's limitations with respect to their requirements." I'll start my critique with a list of mistaken beliefs about this classic tool.
Myth 1: Make-based builds are portable
Several Make clones are Open Source C software, and virtually every build platform has some Make preinstalled. Make-based builds are known to be portable, and they really are, compared to some IDE solutions. The problem is that the Make tool, as originally implemented, has to rely on shell commands and on features of the filesystem. These two are notorious sources of platform incompatibilities.
As a consequence, the fact that every system has a Make is probably true, but not relevant. The statement "Every system has a shell" is also true; that doesn't mean that shell scripts are portable, and nobody claims they are. Another problem is that the original Make lacked some fundamental features, and later clones added the missing features in different, incompatible ways. The obvious example is the if-then-else construct. It is not really possible to build real-life projects without if-then-else, so one has to use workarounds based either on the "include" directive or on conditional macro expansion. In the best case, you get the desired functionality, but you lose the readability of the build description.
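As an illustration, here is a minimal sketch of the two styles, assuming a hypothetical BUILD variable set to either "debug" or "release". The first form reads naturally but only works in GNU Make and clones that copied it; the second, based on conditional macro expansion, survives on most clones but hides the intent:

    # GNU Make dialect: readable, but not portable across clones.
    ifeq ($(BUILD),debug)
    CFLAGS = -g
    else
    CFLAGS = -O2
    endif

    # Workaround via conditional macro expansion: more portable, less readable.
    CFLAGS_debug   = -g
    CFLAGS_release = -O2
    CFLAGS         = $(CFLAGS_$(BUILD))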
Make-based builds are not portable. They rely too much on features of the surrounding system, and Make clones are incompatible with each other.
Myth 2: Make-based builds are scalable
Many of us have had the experience of expanding a given codebase by adding a few source files or by adding a static library. In carefully designed builds, it is not a lot of work to make the new code part of the final product. You just add a short new Makefile in a new directory and rely on a recursive call of Make.
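A typical recursive setup looks something like the following sketch (the directory names are hypothetical; recipe lines must begin with a tab character). Note that the loop body is plain shell code, which is exactly the portability trap described under Myth 1:

    SUBDIRS = lib gui app

    all:
        for dir in $(SUBDIRS); do \
            (cd $$dir && $(MAKE) all) || exit 1; \
        done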
The problem comes when the project gets really big. In big projects, keeping the build fast enough is a challenging task. Changing the binary deployment of the product also becomes complex, much more complex than it needs to be. With a typical recursive Make, the architecture of the software product is nailed down to the directory structure. Adding or removing a single piece is still OK, but other restructuring is much more disruptive. A sad side effect is that, over time, developers come to think about the software not as a set of abstract components and their interfaces, but as a set of source directories, and that mindset builds up even more resistance to change.
The speed issues in large builds are well documented in Peter Miller's seminal article "Recursive Make Considered Harmful". When using recursive Make, there is no simple way to guarantee that every component is visited only once during the build. Indeed, the dependency tree is split in memory between several processes that don't know a lot about each other. This makes it hard to optimize for speed. Another problem is that Make uses just one namespace, and recursion is never easy with only one global namespace. If you want to configure subtargets differently (to pass parameters to Make processes started by Make), you'll get an error-prone build setup.
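The parameter-passing problem is easy to demonstrate. In a sketch like this one, every setting the sub-builds need has to be repeated by hand on each invocation, and a quoting mistake only surfaces deep inside a sub-build:

    all:
        (cd lib && $(MAKE) all CC='$(CC)' CFLAGS='$(CFLAGS) -DIN_LIB')
        (cd app && $(MAKE) all CC='$(CC)' CFLAGS='$(CFLAGS)')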
Recursive Make is not the only way to implement Make. You may wonder: If it is harmful, why is it so widely used? One argument people mention is that recursive Make is easier to set up. This is not a very solid argument. For example, John Graham-Cumming demonstrates in a recent DDJ article a simple non-recursive Make system. There is also a more subtle argument in favor of recursive Make: it allows developers to use their knowledge about the project to easily start only a smaller part of the build. In non-recursive systems, you can start a smaller part, but the system will usually first "collect" all the build descriptions, which is slower or has undesired side effects (or both). It is also possible to design and implement systems that are somewhere in between. Some homegrown Make-based systems use clever checks outside Make to avoid starting new Make processes when there is nothing to be built, then act recursively for the rest of the cases. Anyway, for good reasons or bad, the fact remains that Make-based builds scale painfully.
Myth 3: The Make tool is simple and easy
The Make tool paradigm is elegant and powerful; I have seen nice expert systems implemented with nothing but Makefiles. People like things that are elegant and powerful. Unfortunately, Make implementations are poor and dirty by today's standards. The Makefile syntax is obscure and, consequently, more difficult than you may think. Make wins, hands down, the contest for the messiest namespace of any programming tool I have ever seen.
File targets and command targets (also known as phony targets) share the same namespace, making it next to impossible to build files with certain names. Shell environment variables and Make macros also share a namespace. Make macros cover both the variable concept and the function concept (as in functional programming languages), and they can be defined inside the Makefile or on the command line. All of this leads to complex scope rules and bad surprises for novice and expert users alike.
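The target-namespace collision is easy to trip over. In the sketch below, the "clean" command target silently stops working the moment a file named clean appears in the build directory; newer clones patch around the shared namespace with a special declaration:

    clean:
        rm -f *.o prog

    # If a file named "clean" exists, "make clean" reports the target as up
    # to date and runs nothing. The escape hatch in newer clones:
    .PHONY: clean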
Make-based builds rely heavily on shell environment variables. The best-known consequence is that the build is difficult to reproduce for another user on another build machine. A more subtle issue is that it is difficult to document the build. Most modern systems let you ask what the parameters of the build are and which values are meaningful for them. Make does nothing to help you provide this feature, despite the fact that Make-based builds definitely need it.
Even without several sources for variables, a single namespace is still too messy. One namespace means that target platform-wide variables, build machine-wide variables, project-wide variables, and individual user customizations are right next to each other. Without solid naming conventions, you certainly don't know what you can change without endangering the entire build.
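Lacking real scopes, homegrown builds often fall back on naming conventions like the hypothetical prefixes in this sketch, conventions that are pure discipline and enforced by nothing:

    PLATFORM_CC  = cc               # target-platform-wide
    HOST_TMPDIR  = /var/tmp         # build-machine-wide
    PROJ_CFLAGS  = -Wall            # project-wide
    USER_CFLAGS  =                  # individual user customization
    CFLAGS       = $(PROJ_CFLAGS) $(USER_CFLAGS)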
Another area of misleading simplicity is the fact that many Make macros have one-letter names. Many Make clones improved on that by introducing more understandable aliases. Of course, they did so in incompatible ways, so you cannot profit from the improvement if you want the build description to stay portable across Make clones.
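The automatic variables are the classic example. The one-letter forms in this sketch are the traditional ones; the readable spellings differ per clone (BSD make's ${.TARGET} and ${.ALLSRC} against GNU Make's unrelated $^, for instance), so a Makefile that must stay portable is stuck with the cryptic originals:

    prog: main.o util.o
        $(CC) -o $@ main.o util.o   # $@ = the target, $? = out-of-date prerequisites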
Myth 4: Make is fast
Make and many of its clones are implemented in C or C++, and the implementations are efficient. Indeed, correctly-designed Make-based builds show little overhead while building. But before you enjoy Make's raw speed too much, remember that fast is not safe and safe is not fast. Because of this fundamental tension, you should look suspiciously at anyone claiming amazing speed. Indeed, Make achieves some of its speed by performing only superficial checks, and this kind of speed gain strikes back: you will feel the need for complete builds from scratch more often.
Make also expects you to know how to build just the smaller part of the project that you are currently working on. This strikes back later, when more time is spent integrating the work of several developers.
More Make problems
In addition to the characteristics described above, often considered strong points of Make, there are also a few characteristics that are acknowledged shortcomings of Make. Here is a short list of them.
Reliability issues
Make-based builds are not safe, and nobody claims that they are. The main reason is that Make relies on time stamps, not on content signatures, to detect file changes. On local area networks, several computers, each with its own clock, frequently share the filesystem containing the sources for the build. When those clocks get out of sync (yes, that happens), you may get inaccurate builds, especially when running parallel builds. Moreover, Make takes the approach of not storing any stamp between builds. This is a very bad choice, because it forces Make to use risky heuristics for change detection. This is why Make fails to detect that a file has been replaced by an older one (which happens quite often in virtual filesystems).
Not using content signatures is especially painful when an automatically-generated header file is included in all the sources (like config.h in the GNU build system). Complex solutions involving dummy stamp files have been developed to prevent Make tools from rebuilding the entire project when that central header file is regenerated with the same content as before. An even more insidious safety issue is that Make does not detect changes in the environment variables that are used, or in the binary executables of the tool chain. This is usually compensated for with custom logic in homegrown Make systems, which makes the build more complex and more fragile.
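One variant of the stamp-file trick, sketched here with a hypothetical gen-config script: config.h depends on the stamp rather than on its real input, so its timestamp only moves when the generated content actually changed, and the many files that include it are left alone otherwise:

    config.h: config-stamp
    config-stamp: config.in
        ./gen-config < config.in > config.h.tmp
        cmp -s config.h.tmp config.h || mv -f config.h.tmp config.h
        rm -f config.h.tmp
        touch config-stamp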
Implicit dependencies
One cannot criticize Make's implicit dependency detection, for a good reason: Make doesn't have such a mechanism. To deal with header file inclusion in C sources, several separate tools exist, as well as special options in some compilers. High-level tools wrapping Make use them to provide a more or less portable "automatic dependencies" feature. Despite the efforts and good will of those higher-level tools, Make blocks good solutions for automatic dependencies, for the following two reasons (a sketch of the usual workaround follows the list):
- Make doesn't really support dynamic many-to-one relationships. It does support many-to-one, but not if the "many" part changes from one build to the next. For example, Make will not notice a newly added dependency if that dependency is new in the list but old on disk (older than the target, according to its timestamp). Incidentally, Make also lacks support for dynamic one-to-many relationships, which makes it inappropriate for Java builds (with Java, a single source file can produce a variable number of outputs).
- Make doesn't really support using automatic dependencies and updating those automatic dependencies in one run. This forces you to make multiple Make calls for a complete build. (Did you ever wonder why the ubiquitous sequence "make depend; make; make install" has never been folded into just one Make call?)
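The usual workaround, sketched here in the GNU Make dialect, has the compiler emit a dependency fragment as a side effect of each compilation; the fragments are only folded in on the next run, which is the "not in one run" limitation in action:

    OBJS = main.o util.o

    %.o: %.c
        $(CC) $(CFLAGS) -MMD -c $< -o $@   # gcc also writes main.d next to main.o

    -include $(OBJS:.o=.d)   # silently skip .d files that don't exist yet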
Limited syntax
The lack of portable if-then-else was already mentioned. There are many other idiosyncrasies in the Makefile syntax. Some of them have been fixed in later Make clones (in incompatible ways, as you may expect). Here is a list:
- Space characters are significant in a painful way. For example, spaces at the end of a line (between a macro value and a comment sign following the macro definition) are kept. This has always generated, and still generates, very hard-to-track bugs (see the sketch after this list).
- In the original Make, there was no way to print something at parsing time. Moreover, Make has a bad habit of silently ignoring what it doesn't understand inside the Makefile. The combination of the two is a killer. I have had to stare at Makefile code that wasn't working as expected, only to discover that a whole chunk of it was silently ignored because of a typo somewhere above.
- There are no "and"/"or" operators in logical expressions. This forces you to deeply nest the non-portable if-then-else or those portable but unreadable equivalent constructions.
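Here is the trailing-space trap from the first item, sketched with a hypothetical path. Depending on the clone, the spaces in front of the comment become part of the value, and the command below suddenly has two arguments, one of which is a directory you wanted to keep:

    TMPDIR = /home/user/build    # the spaces before this comment stay in TMPDIR
    clean:
        rm -rf $(TMPDIR)/cache   # runs: rm -rf /home/user/build /cache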
All this is annoying, but it is far less serious than Make's decision to rely on shell syntax when describing the executable part of a rule. Decent build tools gained, over the years, libraries of rules supporting various tool chains. Not so with Make. Make has instead a hard-coded set of rules rarely used in real-life-sized projects, a set that hurts more than it helps: one of those rules may, and sometimes does, trigger by accident when the Makefile author doesn't intend it. Lack of good support for a portable library of rules is, in my opinion, the biggest shortcoming of Make and its direct clones.
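The defensive answer found in many real-world Makefiles is to wipe the hard-coded rule set entirely, so that nothing can trigger by accident:

    .SUFFIXES:   # an empty .SUFFIXES list clears the built-in suffix rules

    # GNU Make users can also pass -r (--no-builtin-rules) on the command line.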
Difficult debugging
The inference process of Make may be elegant, but its trace and debug features date back to the Stone Age. Most clones improved on that. Nevertheless, when the Make tool decides to rebuild something contrary to the user's expectation, most users will decide that understanding the behavior is not worth the time (unless it yields an immediate fatal error when running the result of the build, of course). I have noticed that Makefile authors tend to forget the following:
- How rules are preferred when more than one rule can build the same target.
- How to inhibit and when to inhibit the default built-in rules.
- How the scope rules for macros work, in general and in their own build setup.
While not completely impossible, tracking and debugging Make-based builds is tedious. As a consequence, Makefile authors will continue to spend too much time fixing their mistakes or, under high time pressure, they will simply ignore all behavior that they don't understand.
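The available remedies are blunt instruments. GNU Make, as one of the improved clones, offers roughly this much:

    # Parse-time tracing, added by GNU Make; the original Make had no equivalent:
    $(warning CFLAGS is [$(CFLAGS)])

    # From the command line:
    #   make -n    print the commands without running them
    #   make -p    dump the entire database of rules and macros
    #   make -d    trace every decision of the inference process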
What about generated Makefiles?
You may argue that all this discussion about the syntax of Makefiles is pointless today, and that would be fair. Indeed, many Makefiles are now generated automatically from higher-level build descriptions, and the manual maintenance burden is gone. With these new tools, can you forget the problems of Make and just use it? Yes and no. Yes for the syntax-related problems, yes (to some extent) for the portability problems, and definitely no for all the rest (reliability issues, debugging problematic builds, etc.). See my discussion of other build systems to understand how the shortcomings of Make affect the tools built on top of it.
Conclusion on Make
The Make build tool was, and still is, a nice tool when not stretched beyond its limits. It fits best in projects of several dozen source files, built in homogeneous environments (always the same tool chain, always the same target platform, etc.). But it cannot really keep up with the requirements of today's large projects.
After I tell people what is wrong with Make, the next question is always the same: "If it is so awkward, why is it so widely used?" The answer does not pertain to technology; it pertains to economy. In short, workarounds are cheaper than complete solutions. To displace something, the new thing has to be all that the old thing was, and then some more (some more crucial features, not just some more sugar). And on top of that, it has to be cheaper. Despite the difficulty of being so much more, in my humble opinion, the time of retreat has come for the Make tool. Next week, I'll offer my look at the alternatives.
[The second article is here.]