Automation and Make: Discussion

Parallel Execution

Make can build dependencies in parallel sub-processes via its --jobs flag (abbreviated -j), which specifies the number of sub-processes to use. For example:

$ make --jobs 4 results.txt

If we have independent dependencies then these can be built at the same time. For example, abyss.dat and isles.dat are mutually independent and can both be built at the same time. Likewise for abyss.png and isles.png. If you’ve got a bunch of independent branches in your analysis, this can greatly speed up your build process.
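
The effect can be seen with a small stand-alone sketch. The target names slow-a and slow-b are hypothetical and are not part of the lesson's workflow; sleep stands in for a slow analysis step.

```make
# Hypothetical targets for illustration; sleep stands in for real work.
.PHONY: all slow-a slow-b

# all depends on two targets that do not depend on each other.
all: slow-a slow-b

slow-a:
	sleep 2
	@echo "slow-a done"

slow-b:
	sleep 2
	@echo "slow-b done"
```

With this Makefile, make all should take roughly four seconds, while make --jobs 2 all should take roughly two, because the two independent targets are built at the same time.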

For more information see the GNU Make manual chapter on Parallel Execution.

Different Types of Assignment

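Some Makefiles may contain := instead of =. Your Makefile may behave differently depending upon which you use and how you use it. A variable defined using = is a recursively expanded variable: its value is computed each time the variable is used, so any variables referenced in the value are looked up at that point. A variable defined using := is a simply expanded variable: its value is computed once, when the variable is defined.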
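
A minimal sketch of the difference between the two operators follows. The variable names NAME, LAZY and EAGER and the show target are hypothetical, chosen only for illustration.

```make
# Hypothetical names for illustration only.
NAME = first

# Recursively expanded: $(NAME) is looked up every time LAZY is used.
LAZY = $(NAME).dat

# Simply expanded: $(NAME) is looked up once, on this line.
EAGER := $(NAME).dat

# Redefine NAME after both variables have been declared.
NAME = second

.PHONY: show
show:
	@echo "LAZY is $(LAZY)"
	@echo "EAGER is $(EAGER)"
```

Running make show should report second.dat for LAZY and first.dat for EAGER, since only LAZY picks up the later redefinition of NAME.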

For a detailed explanation, see the GNU Make manual section on The Two Flavors of Variables.

Make and Version Control

Imagine that we manage our Makefiles using a version control system such as Git.

Let’s say we’d like to run the workflow developed in this lesson for three different word counting scripts, in order to compare their speed (e.g. wordcount.py, wordcount2.py, wordcount3.py).

To do this, we could edit config.mk each time, replacing COUNT_SRC=wordcount.py with COUNT_SRC=wordcount2.py or COUNT_SRC=wordcount3.py, but the version control system would detect each edit as a change. This is a minor configuration change rather than a change to the workflow, so we would probably rather not commit it to our repository every time we test a different counting script.

An alternative is to leave config.mk untouched and override the value of COUNT_SRC on the command line instead:

$ make variables COUNT_SRC=wordcount2.py

The configuration file then simply holds the default values for the workflow, and by overriding those defaults on the command line you keep a neater and more meaningful version control history.
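
The precedence at work here can be checked with a small stand-alone sketch. It assumes a single Makefile that sets the default directly instead of including config.mk, and its variables target is a simplified stand-in for the one in the lesson. It relies on make's built-in origin function, which reports where a variable's value came from.

```make
# Default value; in the lesson this assignment lives in config.mk.
COUNT_SRC = wordcount.py

.PHONY: variables
variables:
	@echo "COUNT_SRC: $(COUNT_SRC)"
	@echo "origin: $(origin COUNT_SRC)"
```

Running make variables should report wordcount.py with origin file, while make variables COUNT_SRC=wordcount2.py should report wordcount2.py with origin command line. An assignment on the command line takes precedence over an ordinary assignment in the Makefile, so the file itself never has to change.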

Make Variables and Shell Variables

Makefiles embed shell scripts within them, as the actions that are executed to update a target. More complex actions may well include shell variables. Because make and the shell both use $ to mark a variable, and make expands its own variables before passing an action to the shell, the two kinds of variable are easily confused and can conflict.

Detailed Example of Shell Variable Quoting

Say we had the following Makefile (and the .dat files had already been created):

BOOKS = abyss isles

.PHONY: plots
plots:
	for book in $(BOOKS); do python plotcount.py $book.dat $book.png; done

Then the action passed to the shell for execution would be:

for book in abyss isles; do python plotcount.py ook.dat ook.png; done

Notice that make substituted $(BOOKS), as expected, but it also substituted $book, even though we intended it to be a shell variable. Moreover, because we didn’t use the $(NAME) (or ${NAME}) syntax, make interpreted it as the single-character variable $b (which we haven’t defined, so its value is empty) followed by the text “ook”.

In order to get the desired behavior, we have to write $$book instead of $book:

BOOKS = abyss isles

.PHONY: plots
plots:
	for book in $(BOOKS); do python plotcount.py $$book.dat $$book.png; done

which produces the correct shell command:

for book in abyss isles; do python plotcount.py $book.dat $book.png; done
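
The two spellings can be compared without plotcount.py or any data files. In the sketch below the show target is a hypothetical name, and it is assumed that no variable named b is defined in the Makefile or in the environment.

```make
BOOKS = abyss isles

.PHONY: show
show:
	@# $book is expanded by make (as $b followed by "ook");
	@# $$book reaches the shell as $book and is expanded by the loop.
	@for book in $(BOOKS); do echo "make sees: $book.dat  shell sees: $$book.dat"; done
```

Running make show should print ook.dat for the make-expanded spelling on every iteration, and abyss.dat then isles.dat for the shell-expanded one.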

Make and Reproducible Research

Blog articles, papers, and tutorials on automating commonly occurring research activities using Make:

Return messages and .PHONY target behaviour

The difference between make's "is up to date" and "Nothing to be done for" messages is discussed in episode 2.

A more detailed discussion can be found in issue 98.
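
Both messages can be reproduced with a small sketch. The file name hello.txt and the all target are hypothetical, and the exact wording and quoting of the messages vary between GNU Make versions.

```make
# Hypothetical example: a phony target with no action of its own,
# and a file target with an action.
.PHONY: all
all: hello.txt

hello.txt:
	echo hello > hello.txt
```

The first make all creates hello.txt. Running make all again should then report that there is nothing to be done for all, because all has no action and its dependency is current. Running make hello.txt should instead report that hello.txt is up to date, because that target does have an action but does not need rebuilding.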