Everybody hates build.xml (code reuse in Ant)

February 08, 2013 [Java, Tech, Test Driven, Videos]

If you're starting a new Java project, I'd suggest suggest considering the many alternatives to Ant, including Gant, Gradle, SCons and, of course, Make. This post is about how to bend Ant to work like a programming language, so you can write good code in it. It's seriously worth considering a build tool that actually is a programming language.

If you've chosen Ant, or you're stuck with Ant, read on.

Slides: Everybody hates build.xml slides.

Most projects I've been involved with that use Ant have a hateful build.xml surrounded by fear. The most important reason for this is that the functionality of the build file is not properly tested, so you never know whether you've broken it, meaning you never make "non-essential" changes i.e. changes that make it easier to use or read. A later post and video will cover how to test your build files, but first we must address a pre-requisite:

Can you write good code in Ant, even if you aren't paralysed by fear?

One of the most important aspects of good code is that you only need to express each concept once. Or, to put it another way, you can re-use code.

I want to share with you some of the things I have discovered recently about Ant, and how you should (and should not) re-use code.

But first:

What is Ant?

Ant is 2 languages:

In fact, it's just like a Makefile (ignore this if Makefiles aren't familiar). A Makefile rule consists of a target (the name before the colon) with its dependencies (the names after the colon), which make up a declarative description of the dependencies, and the commands (the things indented by tabs) which are a normal procedural description of what to do to build that target.

# Ignore this if you don't care about Makefiles!
target: dep1 dep2   # Declarative
    action1         # Procedural
    action2

The declarative language

In Ant, the declarative language is a directed graph of targets and dependencies:

<target name="A"/>
<target name="B" depends="A"/>
<target name="C" depends="B"/>
<target name="D" depends="A"/>

This language describes a directed graph of dependencies. I.e. they say what depends on what, or what must be built before you can build something else. Targets and dependencies are completely separate from what lives inside them, which are tasks.

The procedural language

The procedural language is a list of tasks:

<target ...>
    <javac ...>
    <copy ...>
    <zip ...>
    <junit ...>
</target>

When the dependency mechanism has decided a target will be executed, its tasks are executed one by one in order, just like in a programming language. Except that tasks live inside targets, they are completely separate from them. Essentially each target has a little program inside it consisting of tasks, and these tasks are a conventional programming language, nothing special (except for the lack of basic looping and branching constructs).

I'm sorry if the above is glaringly obvious to you, but it only recently became clear to me, and it helped me a lot to think about how to improve my Ant files.

Avoiding repeated code

Imagine you have two similar Ant targets:

<target name="A">
    <javac
        srcdir="a/src" destdir="a/bin"
        classpath="myutil.jar" debug="false"
    />
</target>

<target name="B">
    <javac
        srcdir="b/code" destdir="b/int"
        classpath="myutil.jar" debug="false"
    />
</target>

The classpath and debug information are the same in both targets, and we would like to write this information in one single place. Imagine with me that the code we want to share is too complex for it to be possible to store it as the values of properties in some properties file.

How do we share this code?

The Wrong Way: antcall

Here's the solution we were using in my project until I discovered the right way:

<target name="compile">
    <javac
        srcdir="${srcdir}" destdir="${destdir}"
        classpath="myutil.jar" debug="false"
    />
</target>

<target name="A">
    <antcall target="compile">
        <param name="srcdir" value="a/src"/>
        <param name="destdir" value="a/bin"/>
    </antcall>
</target>

<target name="B">
    <antcall target="compile">
    ...

Here we put the shared code into a target called compile, which makes use of properties to access the varying information (or the parameters, if we think of this as a function). The targets A and B use the <antcall> task to launch the compile target, setting the values of the relevant properties.

This works, so why is it Wrong?

Why not antcall?

antcall launches a whole new Ant process and runs the supplied target within that. This is wrong because it subverts the way Ant is supposed to work. The new process will re-calculate all the dependencies in the project (even if our target doesn't depend on anything) which could be slow. Any dependencies of the compile target will be run even if they've already been run, meaning some of your assumptions about order of running could be incorrect, and the assumption that each target will only run once will be violated. What's more, it subverts the Ant concept that properties are immutable, and remain set once you've set them: in the example above, srcdir and destdir will have different values at different times (because they exist inside different Ant processes).

Basically what we're doing here is breaking all of Ant's paradigms to force it to do what we want. Before Ant 1.8 you could have considered it a necessary evil. Now, it's just Evil.

The Horrific Way: custom tasks

Ant allows you to write your own tasks (not targets) in Java. So our example would look something like this:

Java:

public class MyCompile extends Task {
    public void execute() throws BuildException
    {
        Project p = getProject();

        Javac javac = new Javac();

        javac.setSrcdir(  new Path( p, p.getUserProperty( "srcdir" ) ) );
        javac.setDestdir( new File( p.getUserProperty( "destdir" ) ) );
        javac.setClasspath( new Path( p, "myutil.jar" ) );
        javac.setDebug( false );
        javac.execute();
    }
}

Ant:

<target name="first">
    <javac srcdir="mycompile"/>
    <taskdef name="mycompile" classname="MyCompile"
        classpath="mycompile"/>
</target>

<target name="A" depends="first">
    <mycompile/>
</target>

<target name="B" depends="first">
    <mycompile/>
</target>

Here we write the shared code as a Java task, then call that task from inside targets A and B. The only word to describe this approach is "cumbersome". Not only do we need to ensure our code gets compiled before we try to use it, and add a taskdef to allow Ant to see our new task (meaning every target gets a new dependency on the "first" target), but much worse, our re-used code has to be written in Java, rather than the Ant syntax we're using for everything else. At this point you might start asking yourself why you're using Ant at all - my thoughts start drifting towards writing my own build scripts in Java ... anyway, I'm sure that would be a very bad idea.

The Relatively OK Way: macrodef

So, enough teasing. Here's the Right Way:

<macrodef name="mycompile">
    <attribute name="srcdir"/>
    <attribute name="destdir"/>
    <sequential>
        <javac
            srcdir="@{srcdir}" destdir="@{destdir}"
            classpath="myutil.jar" debug="false"
        />
    </sequential>
</macrodef>

<target name="A">
    <mycompile srcdir="a/src" destdir="a/bin"/>
</target>

<target name="B">
    <mycompile srcdir="b/code" destdir="b/int"/>
</target>

Since Ant 1.8, we have the macrodef task, which allows us to write our own tasks in Ant syntax. In any other language these would be called functions, with arguments which Ant calls attributes. You use these attributes by giving their name wrapped in a @{} rather than the normal ${} for properties. The body of the function lives inside a sequential tag.

This allows us to write re-usable tasks within Ant. But what about re-using parts from the other language - the declarative targets and dependencies?

Avoiding repeated dependencies?

Imagine we have a build file containing targets like this:

<target name="everyoneneedsme"...

<target name="A" depends="everyoneneedsme"...
<target name="B" depends="everyoneneedsme"...
<target name="C" depends="everyoneneedsme"...
<target name="D" depends="everyoneneedsme"...

In Ant, I don't know how to share this. The best I can do is make a single target that is re-used whenever I want the same long list of dependencies, but in a situation like this where everything needs to depend on something, I don't know what to do. (Except, of course, drop to the Nuclear Option of the <script> tag, which we'll see next time.)

I haven't used it in anger, but this kind of thing seems pretty straightforward with Gradle. I believe the following is roughly equivalent to my example above, but I hope someone will correct me if I get it wrong:

task everyoneneedsme

tasks.whenTaskAdded { task ->
    task.dependsOn everyoneneedsme
}

task A
task B
task C
task D

(Disclaimer: I haven't run this.)

So, if you want nice features in your build tool, like code-reuse and testability, you should consider a build tool that is integrated into a grown-up programming language where all this stuff comes for free. But, if you're stuck with Ant, you should not despair: basic good practice is possible if you make the effort.