m4_include and aclocal

A little while back, while hacking on the MPICH2 build system, I ran into some surprising behavior that I thought I should mention here. But first, some background info is in order. We have numerous subsystems that are currently configured (in the autoconf sense) by running an autoconf-generated configure script from various points in the top-level configure. Think AC_CONFIG_SUBDIRS, but the subconfigure runs at the place where the macro is used, rather than being deferred to config.status-time. There are some advantages to this scheme, but there are two fairly severe disadvantages:

The subconfigure scripts tend to re-test a whole bunch of information about the build environment (e.g., six repeated checks for “how to run the C preprocessor”). Not only is this wasteful in terms of time, but there is always the unlikely possibility that two configure scripts will come to subtly different conclusions about the build environment, which could cause wacky bugs later in the build.
There’s no easy mechanism to utilize the results of subconfigure tests that should be global. That is, if we add -I/path/to/includes to CPPFLAGS that should be used for all compilation, then this won’t be seen in the top-level configure without some additional work. We currently “pass” these values back via a “localdefs” file that is listed in the subconfigure’s AC_CONFIG_FILES and contains some @FOO@ substitutions that are either AC_SUBSTed or “precious” (which implies AC_SUBST for that var). This stinks because it’s a fairly manual process, and it’s impenetrable to outsiders who are unfamiliar with our build system because the sourcing of “localdefs” is hidden inside a custom autoconf macro.

So I’m taking a stab at converting these configure files to be configure.in fragments that are then m4_included. This will cause those tests to be run in the same shell as the top-level configure, and only the additional tests that the subsystem needs will be run. In the process of that conversion, I ran into some odd error messages from the autotools, which the following small example reproduces.

Quick Quiz: The code in Listing A will successfully autoreconf and configure. The code in Listing B will fail during autoreconf’s invocation of aclocal. Why?

Common code:

## contents of configure.in
AC_INIT([foo],[1.0])
AM_INIT_AUTOMAKE([-Wall -Werror foreign 1.11 subdir-objects])
LT_INIT([])
AC_PROG_CC
var="yes"
AM_CONDITIONAL([COND],[test "$var" = "yes"])
dnl with Listing B, this makes m4 bail, claiming that AM_COND_IF is undefined
m4_foreach([subsys_i],[[subconfigure]],[m4_include(subsys_i[.m4])])
AC_OUTPUT([Makefile])

Listing A

## contents of subconfigure.m4
AC_PROG_GREP

Listing B

## contents of subconfigure.m4
AM_COND_IF([COND],[echo COND is true],[echo COND is false])

Do you see the problem here? I sure didn’t, even after some moderate debugging. Go ahead and think about it for a minute, I’ll wait…

Luckily, the nice folks on the autoconf list helped me understand what was going on. Two things are happening here:

Not all macros are created equal. In my original world view, AC_PROG_GREP and AM_COND_IF are both just AC_DEFUNed m4 macros and should behave more or less identically. However, this assumption is totally false. AC_PROG_GREP is a macro that is built into autoconf, and therefore does not get distributed with your package via aclocal.m4, while AM_COND_IF comes with automake and is managed by aclocal, and must be placed into your package’s aclocal.m4 in order to be properly expanded.
It turns out that aclocal isn’t as smart as I thought it was. When it traces the m4 files looking for macro definitions that it must add to aclocal.m4, it doesn’t perform real m4_includes at that time. Instead, it uses some regex heuristics to determine what file is being included and then traces that file separately. Since the name of the included file is computed via an m4 macro in the example above, aclocal finds no literal file name to match and simply never traces subconfigure.m4. As a result, the definition of AM_COND_IF never makes it into aclocal.m4, which is why Listing B fails while Listing A, which uses only a built-in autoconf macro, works.
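
To see why a pattern-based scan misses the computed include, here is a rough shell stand-in for that kind of heuristic. The sed pattern and the function name literal_includes are illustrative assumptions, not aclocal’s actual regex; the function only recognizes m4_include calls whose argument is a quoted literal file name.

```sh
# Rough stand-in for a regex-based include scan; NOT aclocal's real pattern.
# Prints the file name from every line containing m4_include([literal-name]).
# Reads the named files, or standard input if no files are given.
literal_includes() {
    sed -n 's/.*m4_include(\[\([^]]*\)]).*/\1/p' "$@"
}
```

Fed the line m4_include([subconfigure.m4]), this prints subconfigure.m4. Fed the m4_foreach line from configure.in above, it prints nothing, because the argument of m4_include there begins with the macro name subsys_i rather than a quoted file name.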

The upshot of all of this is that I can’t use m4 code to programmatically include subsystem m4 fragments. Each m4_include must name its file literally, so the include lines must either be hard-coded by hand or be autogenerated by some other non-autotools step, such as a quick shell script run before autoreconf.
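
A minimal sketch of the shell-script option, assuming a hypothetical helper gen_subsys_includes and a hypothetical output file such as subsys_includes.m4 (neither name comes from MPICH2). It writes one m4_include line with a literal file name per subsystem, which is the form a regex scan can recognize.

```sh
#!/bin/sh
# Hypothetical pre-autoreconf helper; all names here are illustrative.
# usage: gen_subsys_includes OUTPUT_FILE SUBSYSTEM...
# Writes a header comment plus one literal m4_include line per subsystem.
gen_subsys_includes() {
    out=$1
    shift
    {
        echo "dnl generated file, do not edit"
        for subsys in "$@"; do
            # Single quotes keep the shell from touching the m4 brackets.
            printf 'm4_include([%s.m4])\n' "$subsys"
        done
    } > "$out"
}
```

One would run something like gen_subsys_includes subsys_includes.m4 subconfigure before autoreconf, and then pull the result in with a hard-coded m4_include([subsys_includes.m4]) in configure.in. This assumes aclocal follows the nested literal includes from there, which is worth confirming against your aclocal version.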