Java Maven Deterministic Builds of Jar War Ear

How does it happen that compiling the same code twice gives you different files? How can you avoid this and get deterministic builds, that is, builds that depend on what is being built and how, and not on when? Or, to ask it another way: how can you make sure that these binaries were produced from this source code, without any extra manipulation?

That’s a serious question, currently being raised in many projects, such as Debian, FreeBSD, bitcoinj and many more.

What about Java? Java itself does not guarantee that different versions of the compiler produce the same bytecode. But at least if the environment is unchanged, the code produced from *.java files is the same.

So, in theory it’s not hard to get deterministic builds, at least within one environment. In reality it’s harder than that, since classes are usually packed into a jar/war/ear (all of which are really zip files under the hood). These archives store the timestamps of the packed files and, what’s worse, Jar handles Manifest files differently.
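To make the timestamp problem concrete, here is a minimal sketch using only java.util.zip. The class name ZipTimestampDemo and the entry name Hello.txt are made up for illustration. It packs the same content twice, once with identical entry times and once with entry times one minute apart, and compares the resulting bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the detarchive plugin.
final class ZipTimestampDemo {

    /** Builds a one-entry zip in memory, stamping the entry with the given time. */
    static byte[] zipWithTime(long entryTimeMillis) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            ZipEntry entry = new ZipEntry("Hello.txt");
            entry.setTime(entryTimeMillis); // this value ends up in the archive bytes
            zip.putNextEntry(entry);
            zip.write("same content".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        long t1 = 1388534400000L; // 2014-01-01T00:00:00Z
        long t2 = t1 + 60_000L;   // one minute later

        // Same content, same entry time: identical archives.
        System.out.println(Arrays.equals(zipWithTime(t1), zipWithTime(t1))); // true
        // Same content, different entry time: different archives.
        System.out.println(Arrays.equals(zipWithTime(t1), zipWithTime(t2))); // false
    }
}
```

So even when the compiled classes are byte-for-byte identical, an archive built a minute later will usually differ.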

To solve this problem I created a Maven plugin (two, actually), and it just works for me. To get deterministic builds in your project, add the following to the build/plugins section of your pom.xml (you can even add it only to your parent pom.xml):

<plugin>
<groupId>org.javaz</groupId>
<artifactId>detarchive</artifactId>
<version>1.0</version>
<inherited>true</inherited>
<executions>
    <execution>
        <id>1. stamp files</id>
        <configuration>
            <stamp>20140101000000</stamp>
        </configuration>
        <phase>compile</phase>
        <goals>
            <goal>touch-classes</goal>
        </goals>
    </execution>
    <execution>
        <id>2. recreate archive</id>
        <phase>package</phase>
        <goals>
            <goal>repack</goal>
        </goals>
    </execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.7</version>
<inherited>true</inherited>
<executions>
    <execution>
        <id>3. Move deterministic archive into default name</id>
        <phase>package</phase>
        <configuration>
            <target>
                <move todir="${project.build.directory}" includeemptydirs="false">
                    <fileset dir="${project.build.directory}">
                        <include name="*ar"/>
                    </fileset>
                    <mapper type="regexp" from="(.*)-det\.(.*)" to="\1.\2"/>
                </move>
            </target>
        </configuration>
        <goals>
            <goal>run</goal>
        </goals>
    </execution>
</executions>
</plugin>

So, what magic is happening there?

The first execution of the detarchive plugin is the touch-classes goal, which changes the timestamps of all files inside the «classes» directory (configurable via the targetDirectory parameter). Files are stamped using the stamp configuration parameter. If you don’t like the default format, yyyyMMddHHmmss, you can use your own custom format; just don’t forget to specify it via stampFormat.
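Roughly, the stamping step could look like the sketch below. This is not the plugin’s actual source: the class TouchClassesSketch and its methods are hypothetical, and I am assuming the stamp is parsed with a SimpleDateFormat-style pattern in the default time zone. Only the parameter names stamp, stampFormat and targetDirectory come from the plugin configuration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.stream.Stream;

// Hypothetical sketch of a "touch-classes"-like step; not the plugin's code.
final class TouchClassesSketch {

    /** Parses a stamp such as "20140101000000" using a pattern such as "yyyyMMddHHmmss". */
    static long parseStamp(String stamp, String stampFormat) {
        try {
            // Assumption: the stamp is interpreted in the JVM's default time zone.
            return new SimpleDateFormat(stampFormat).parse(stamp).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Stamp does not match format: " + stamp, e);
        }
    }

    /** Sets the last-modified time of the directory and everything below it. */
    static void touchAll(Path targetDirectory, long stampMillis) {
        FileTime time = FileTime.fromMillis(stampMillis);
        try (Stream<Path> paths = Files.walk(targetDirectory)) {
            paths.forEach(path -> {
                try {
                    Files.setLastModifiedTime(path, time);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the configuration shown earlier this would amount to something like touchAll(Paths.get("target/classes"), parseStamp("20140101000000", "yyyyMMddHHmmss")).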

After the classes and resources are stamped, Maven packs them into a jar, which may also contain jar dependencies, pom.xml and something more. After Maven has packed the jar, we need to polish it using the repack goal. This goal is smart enough to run without parameters, but if you want something else, you can specify these parameters in the configuration: outputDirectory, plus stamp and stampFormat (same usage as in the touch-classes goal). If they aren’t specified, the timestamp is taken from the most frequent timestamp amongst the *.class files.
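The "most frequent timestamp" fallback can be sketched like this. Again, this is an illustration under my own assumptions, not the plugin’s code: the class MostFrequentStamp is hypothetical, and how the real goal breaks ties is not specified here (this sketch keeps whichever of the tied values it meets first).

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.ZipEntry;

// Hypothetical sketch: pick the timestamp shared by the most *.class entries.
final class MostFrequentStamp {

    /** Returns the most common entry time among *.class entries, or -1 if there are none. */
    static long mostFrequentClassTime(Collection<? extends ZipEntry> entries) {
        Map<Long, Integer> counts = new HashMap<>();
        for (ZipEntry entry : entries) {
            if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                counts.merge(entry.getTime(), 1, Integer::sum);
            }
        }
        long best = -1L;
        int bestCount = 0;
        for (Map.Entry<Long, Integer> count : counts.entrySet()) {
            if (count.getValue() > bestCount) {
                best = count.getKey();
                bestCount = count.getValue();
            }
        }
        return best;
    }
}
```

For a real archive the entries could come from java.util.zip.ZipFile, for example via Collections.list(zipFile.entries()).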

And some additional parameters:

  • archiveSuffix (by default it’s «-det»)
  • archiveName (which archive to repack; by default it’s found automagically)
  • skipPomProperties (default true, because Maven changes the content of pom.properties on every build, and really, who needs this file anyway)
  • skipFiles (an array of filenames that should not get into the resulting jar; if some nasty file has somehow been packed into your archive, you can list it here to skip it)
  • manifestName (default «META-INF/MANIFEST.MF», in case you need to change the default name to something weird)
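To show the idea behind repacking, here is a simplified sketch that works on in-memory archives. It is not the plugin’s implementation. The class RepackSketch and its methods are hypothetical, and I am assuming three things: skipFiles matches full entry names, the manifest is written first, and the remaining entries are written in name order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch of a deterministic repack; not the plugin's code.
final class RepackSketch {

    /** Rewrites an archive with a fixed entry time, fixed entry order and some files skipped. */
    static byte[] repack(byte[] archive, long stampMillis, Set<String> skipFiles, String manifestName) {
        Map<String, byte[]> contents = new TreeMap<>(); // sorted by entry name
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = in.getNextEntry(); entry != null; entry = in.getNextEntry()) {
                if (!skipFiles.contains(entry.getName())) {
                    contents.put(entry.getName(), readEntry(in));
                }
            }
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            try (ZipOutputStream out = new ZipOutputStream(result)) {
                byte[] manifest = contents.remove(manifestName);
                if (manifest != null) {
                    write(out, manifestName, manifest, stampMillis); // manifest goes first
                }
                for (Map.Entry<String, byte[]> file : contents.entrySet()) {
                    write(out, file.getKey(), file.getValue(), stampMillis);
                }
            }
            return result.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static byte[] readEntry(ZipInputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    private static void write(ZipOutputStream out, String name, byte[] data, long stampMillis)
            throws IOException {
        ZipEntry entry = new ZipEntry(name);
        entry.setTime(stampMillis);
        out.putNextEntry(entry);
        out.write(data);
        out.closeEntry();
    }

    /** Test helper: builds an archive whose entries contain their own names as content. */
    static byte[] sampleArchive(long timeMillis, String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            for (String name : names) {
                write(out, name, name.getBytes(StandardCharsets.UTF_8), timeMillis);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String manifest = "META-INF/MANIFEST.MF";
        String pomProps = "META-INF/maven/g/a/pom.properties"; // made-up path
        Set<String> skip = Collections.singleton(pomProps);
        long stamp = 1388534400000L; // 2014-01-01T00:00:00Z

        // Same files, but different build times and different entry order.
        byte[] first = sampleArchive(1_000_000_000_000L, manifest, "A.class", pomProps);
        byte[] second = sampleArchive(1_500_000_000_000L, "A.class", pomProps, manifest);

        System.out.println(Arrays.equals(first, second)); // false
        System.out.println(Arrays.equals(
                repack(first, stamp, skip, manifest),
                repack(second, stamp, skip, manifest))); // true
    }
}
```

One caveat for this sketch: ZipEntry.setTime stores the time in the JVM’s default time zone, so the output is only reproducible across machines that share a time zone setting.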

After the archive is repacked, let’s move it back to its default name. This is achieved via maven-antrun-plugin and a regexp mapper, which I think is mostly self-explanatory.
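If the regexp is not that self-explanatory after all, the same rename can be tried out in plain Java. This is only an illustration: the class DetSuffixRename is hypothetical, it assumes the default «-det» suffix, and it uses a pattern like the mapper’s with the dot escaped. Unlike the Ant mapper, which only handles files that match, this helper returns non-matching names unchanged.

```java
// Hypothetical helper mirroring the regexp mapper's rename rule.
final class DetSuffixRename {

    /** Strips the "-det" suffix that precedes the file extension, if present. */
    static String defaultName(String fileName) {
        return fileName.replaceFirst("^(.*)-det\\.(.*)$", "$1.$2");
    }

    public static void main(String[] args) {
        System.out.println(defaultName("myapp-1.0-det.jar")); // myapp-1.0.jar
        System.out.println(defaultName("myapp-1.0.war"));     // myapp-1.0.war
    }
}
```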

Enjoy your deterministic Java builds :)