GNU & You, Building a Better World

H. Chu
Jet Propulsion Laboratory
California Institute of Technology
Pasadena, CA 91109
hyc@hanauma.jpl.nasa.gov
Ph (818) 354-3792

Abstract

The phrase "compute server" is well-known, and most computing clusters include a variety of fast machines designated as such. This paper describes a "compile server," a server used for building programs to be run on other systems in a network. The goal is to provide an efficient base for building large program sets, such as a release of the X Window System. The approach is based on using the GNU C compiler as a cross-compiler for all the target systems, and using GNU Make to give flexibility to the actual build procedure. A sample configuration is described, focused on Release 4 of X Version 11.

1 Introduction

Computer users are always wanting more power, and parallel architectures were developed as a quick means of delivering that power. When looking at MIPS and MFLOPS, it seems so obvious that a ten-processor system is an order of magnitude better than a single-processor system. Unfortunately, with real-world problems it is difficult to realize the full potential of a parallel system. Most software is written with a single-processor mindset, coding for a parallel environment can be difficult, and most programmers have little experience with parallel systems.

On another front, while computer systems slowly increase in power, the problems they are meant to solve grow as well, usually faster than the computers. The software designed to solve the problems grows, and as the software size increases, it too becomes a problem needing a solution. With today's networks of workstations and servers, it is fairly common to want to run a given application across the various machines in the network. This is easily done for small, simple applications but becomes more complicated as the types of machines and the program sizes increase. Maintaining multiple copies of source code for multiple target architectures consumes disk space, and makes updates more tedious. While sharing filesystems across the network with protocols like NFS eases this problem, it does not go far enough.

The purpose of this paper is to describe an environment that utilizes a multi-processor computer system as a base for a centralized program compilation server. This environment provides high speed compilation of large program sets for multiple target architectures. Other benefits include a reduction in the amount of disk space required for a package, and a simple reconfiguration process.

Section 2 examines the problems with compiling large program sets such as the X Window System. The principal problem needing to be solved is simply the lengthy compile time, which is due partly to the configuration process. Section 3 presents methods to restructure the software configuration allowing easier reconfiguration and parallel compilation, using the features of GNU Make. This approach speeds up compilation significantly, but only for the multi-processor machine. Section 4 extends the benefits to other architectures through use of the GNU C compiler, and a summary is provided in Section 5.

2 The Typical Makefile

The intent of this work is to take advantage of a multi-processor machine to speed up program compilation, and then to allow other machines to benefit from the speed up. One would like to use parallel make to speed up the compilation step, and in a networked environment, it is desirable to share a single copy of a program's source code across all the machines using the program, thus minimizing the space required as well as making it easier to manage software updates. The typical approach to configuring and compiling software presents a number of problems preventing this from occurring.

The X software distribution makes an ideal case study because the problems it presents are so extreme. Managing the source tree can be a tremendous task in itself, as just the core X distribution consumes on the order of 100 megabytes of storage. Actually compiling the code can consume several hours of time, and modifying the configuration usually demands a complete restart of the build procedure.

It would be nice if simply using a make that supports parallel operations were all that was necessary to speed things up, but typical makefiles prevent effective parallel use in the following two ways:
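
The first problem is the ubiquitous shell for-loop. When a rule needs to perform an action in a collection of subdirectories, the usual approach is to wrap the action in a single for-loop command; a typical rule looks something like this:

mk-dirs:
	for i in $(SUBDIRS); do mkdir $$i; done

Because the entire loop is a single shell command, make sees only one job: the directories are handled one at a time, and any dependencies among them are hidden, so a parallel make has nothing to work with. The second problem lies in the configuration process itself.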

Modifying the configuration of a large program set is always a daunting prospect. There may be many header files to change, and often a large number of makefiles to edit. The X distributors attack this difficulty by automating the creation of makefiles, using their Imake utility. Imake creates system-dependent makefiles from system-independent Imakefiles, an Imake template file, and system-dependent configuration files. This is obviously a brute-force approach; there are over 130 makefiles in the X core source tree, each containing about 10K of identical data generated from the Imake template. That means the makefiles account for over a megabyte of redundant data. To make matters worse, all of those makefiles need to be recreated if any aspect of the target system changes.

3 Building with GNU Make

The X developers recognized that make and makefiles are inherently nonportable. The Imake solution is inelegant at best, and is not recommended. The approach taken here is to standardize the makefiles by requiring a standard version of make - GNU make. There are many advantages to this approach over the Imake solution, not the least being that some systems' standard make programs still have problems with Imake makefiles.

Writing a proper parallel makefile is not difficult, and with GNU make it is easier still. Most for-loops in makefiles are intended to perform an action for a collection of subdirectories. The previous for-loop example could be rewritten quite simply:

mk-dirs: $(SUBDIRS)
$(SUBDIRS):
	mkdir $@
Any garden-variety make program could handle this example. For most situations though, where the subdirectories already exist and the goal is to perform an action in that directory, some trickery is necessary. The following illustrates an example based loosely on the Xlib sources:
SUBLIB = lib
SUBPRG = clients demos
SUBDIRS = $(SUBLIB) $(SUBPRG)
all:
	@$(MAKE) $(SUBPRG) TARG=all
clean includes install:
	@$(MAKE) $(SUBDIRS) TARG=$@
$(SUBDIRS)::
	@cd $@; echo making $(TARG) in $@...; \
	$(MAKE) $(TARG)
$(SUBPRG):: $(SUBLIB)
.PHONY: $(SUBDIRS)
This example demonstrates a number of points. First, it shows how to write a set of rules to operate in a collection of subdirectories without using a for-loop, giving parallel make a chance to work. The example also shows how to structure the targets so that the dependency hierarchy is preserved. (E.g., the parallel build of "clients" and "demos" in this example will not occur until the "lib" directory is complete.) The crucial feature of GNU Make that lets it all work is the ".PHONY" target. In GNU Make, any dependencies of the .PHONY target are always treated as out-of-date, so their associated commands will always get executed. This feature is what allows the subdirectory rules to execute.

On a large multi-processor system with high I/O bandwidth, parallel make increases the speed of compilation linearly with the number of available processors. This is a big win, particularly for packages like X where the release notes warn that the process can take up to 12 hours. On an Alliant FX/8 (no giant by today's standards) the original X build process takes five hours, but eliminating imake and running in parallel cuts that time to less than an hour, using only four CPUs. With faster systems, like an Alliant FX/2800 with a dozen processors, the build time can be reduced to only a few minutes. While intense number-crunching jobs may be hard to port to parallel architectures, this is clearly a case where multi-processors shine.
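
The degree of parallelism is selected with GNU make's "-j" option. On a four-processor machine, for instance, the whole tree can be built with a command along the lines of:

make -j 4 all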

Another handy feature is the "include" command, which allows a makefile to reference another file. With the include command all of those copies of the 10K Imake template in the X sources can be deleted, immediately recovering over a megabyte of disk space. Also, using the include file means that makefiles only need to be written once, and any configuration changes need only be made in a single make include file. As an additional refinement, two include files may be used, one solely for macro definitions and the other for commonly used rules. Taking the optimizations all the way, the macro include can be surrounded with ifdef's, so the macros will only need to be read once per build, in the highest level makefile.
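
As a sketch of how this arrangement might look (the file names macros.mk and rules.mk, the guard variable MACROS_READ, and the variable TOP are all illustrative, not taken from any particular distribution), the macro file defines and exports a guard variable along with the macros, and every other makefile wraps its include of the macro file in a conditional:

# macros.mk (illustrative name): shared configuration macros.
# Exporting them, plus the guard variable, means a sub-make
# inherits everything through the environment and never needs
# to read this file again.
MACROS_READ = yes
export MACROS_READ
CC     = gcc
CFLAGS = -O
export CC CFLAGS

# At the top of every other makefile in the tree
# (TOP points at the top of the source tree):
ifndef MACROS_READ
include $(TOP)/macros.mk
endif
include $(TOP)/rules.mk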

4 Reaching Out in the World

It would be hard to justify going to all the trouble of rewriting hundreds of makefiles for a program set that is only going to be compiled once. If the setup can be used to support other systems, though, its worth increases greatly. With the GNU C compiler (gcc) it is possible to support a wide variety of target machines on a single platform. While currently it is impractical to build a single image of gcc that supports all of the binary formats, multiple specific versions can easily be invoked through a simple shell-script wrapper. The system-specific header and library files can be accessed via NFS. Adding the GNU assembler and linker makes the cross-compiler system complete. (For those systems that are not supported by the GNU linker, it is simple enough to let the target system perform the link step.)
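
As an illustration of how the pieces can fit together (the architecture name and the directory paths below are purely hypothetical), the macro include file can select the cross tools and the NFS-mounted system headers and libraries from a single ARCH setting:

# Hypothetical layout: one gcc, as, and ld per target architecture,
# with the target's own headers and libraries mounted via NFS.
ARCH    = sun4
TOOLDIR = /usr/local/cross/$(ARCH)/bin
SYSROOT = /net/$(ARCH)/usr

CC      = $(TOOLDIR)/gcc
AS      = $(TOOLDIR)/as
LD      = $(TOOLDIR)/ld
CFLAGS  = -O -I$(SYSROOT)/include
LDFLAGS = -L$(SYSROOT)/lib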

When cross-compiling for many systems it is essential to keep track of what the current target system is. Continuing with the makefile support described in the previous section, a few rules added to the make include file take care of this bookkeeping. A file naming the current architecture is kept in each source directory, and the current architecture is defined in the macro include file. For example:

# The arch file records the architecture this directory was last built for.
OLDARCH := $(shell cat arch)

all:
ifneq ($(ARCH),$(OLDARCH))
	echo $(ARCH) > arch
	@$(MAKE) clean
endif
	@$(MAKE) $(SUBDIRS) TARG=all
This example rule automatically cleans out the old contents of the directory if the current target architecture is different from the previous one. Note that the conditional is evaluated when the makefile is read, so the extra commands become part of the rule only when the recorded architecture differs from the current target. The example also demonstrates more of the power of GNU make, with conditionals and shell command substitution.

5 Summary

Maintaining software on a single compile server in a network is desirable because it is easier than maintaining multiple sets of software on machines scattered around the network. Building the compile server around a multi-processor computer system provides even greater benefits, especially when the build procedure takes maximum advantage of the available processors. Using GNU tools makes it very easy to maintain a software set for multiple different target architectures, allows for simple rules to control parallel builds, and provides improved efficiency as well.

This work was performed at the Jet Propulsion Laboratory, California Institute of Technology under contract with the National Aeronautics and Space Administration.