IRC Services Technical Reference Manual

10. Compilation

10-1. Compilation overview
10-2. The configure script
10-3. The compilation process
    10-3-1. Core source files
    10-3-2. Modules
    10-3-3. Language files
    10-3-4. The tools and data directories
10-4. Installation
10-5. Assumptions


10-1. Compilation overview

Services uses the common configure; make; make install method for compilation. The configure script in the top directory tests the system compiler and environment to ensure Services can be compiled and to compensate for differences between systems, such as broken or missing implementations of system library functions; once it has run, the source files are compiled using the make tool. Once compilation is complete, the command make install will install the executable and data files to their final location on the system as specified by parameters to the configure script (or using the defaults from that script).


10-2. The configure script

Before the source code can be compiled, certain features of the compiler and environment need to be checked, and appropriate settings made; this is performed by the configure script present in the top directory. The script performs the following operations:

Note that a deliberate decision was made to not use the GNU autoconf/automake/libtool suite of tools, as they are overly complex for the range of systems Services is expected to be used on. (I have seen far too many programs where running the autoconf-generated configure script takes longer than compiling the program itself.)


10-3. The compilation process

Overall compilation is controlled by the Makefile in the top source directory. When run with no target specified, the default target all is used, which first checks that the configure script has been run (aborting with a notice to the user if not), then proceeds to actual compilation with the myall target. This target compiles the main program (ircservices or ircservices.exe), then generates the language files and creates the convert-db and ircservices-chk tools.

Additional targets available are: install, which installs program and data files to the appropriate directories (see section 10-4); clean, which removes most generated files, such as object and executable files; and spotless, or distclean in the GNU style, which removes all generated files (including config.cache). All of these targets are called recursively for the modules, lang, and tools directories.

The main Makefile also includes two rules, for services.h and language.h, which cause those files to be touched (the file timestamp updated) whenever any sub-header file changes. This is done because it is considered easier than ensuring that every dependency list stays up to date with all sub-header files.

10-3-1. Core source files

Compilation of the main executable starts with compilation of the core source files, stored in the top source directory. The object files to be created are listed in alphabetical order in the $(OBJS) variable. The options used for compilation are -DSTATIC_MODULES if using static modules, followed by the options selected by the configure script (in the $(BASE_CFLAGS) variable), followed by the options in the $(MORE_CFLAGS) variable (defined at the top of the Makefile, intended for users to add or change extra options on the fly).

After compiling all of the core source files, compilation proceeds to the modules, as described below; finally, a version.c file is generated by the version.sh script, containing the program version number and a build number (extracted from the previous contents of version.c) which is incremented by one each build, and this file is compiled and linked with the rest of the main source files, as well as the modules when compiling modules statically, to produce the main executable.
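
The build-number handling described above can be sketched in shell as follows. This is only an illustration, not the contents of version.sh: the function names, the "#define BUILD n" line format, and the "#define VERSION" line are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of a build-number bump; NOT the real version.sh.
# Assumes version.c contains a line of the form "#define BUILD <n>".

next_build() {
    # $1 = path to the previous version.c (may not exist yet)
    prev=0
    if [ -f "$1" ]; then
        prev=$(sed -n 's/^#define BUILD \([0-9][0-9]*\)$/\1/p' "$1" | head -n 1)
    fi
    # An empty result (no matching line) is treated as build 0.
    echo $(( ${prev:-0} + 1 ))
}

write_version_c() {
    # $1 = output file, $2 = version string
    # Read the old build number first, then overwrite the file.
    build=$(next_build "$1")
    printf '#define VERSION "%s"\n#define BUILD %s\n' "$2" "$build" > "$1"
}
```

Calling write_version_c twice on the same file yields build 1 and then build 2, which is the "incremented by one each build" behavior.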

Note that one core header file, langstrs.h, is copied from the language file subdirectory, where it is first generated if necessary; see section 10-3-3 below for details.

10-3-2. Modules

Compilation of modules is handled by modules/Makefile and its auxiliary file, modules/Makerules. The top directory's Makefile calls one of two targets in modules/Makefile to compile all available modules: all-dynamic if dynamic linking is in use, all-static for static linking. Both of these targets perform the same basic function: search for all subdirectories of the modules directory that contain Makefiles, and call the all-dynamic or all-static target in each subdirectory's Makefile.

Since the subdirectory search is performed at compilation time, it is possible to add new modules to Services by simply copying the directory containing the module source code and Makefile into the modules directory. This is in fact the recommended method for installing third-party modules, and the module compilation system was designed with this aim in mind.
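
As a rough shell illustration of the subdirectory search (the real logic lives in modules/Makefile and may be written differently), the following sketch treats any directory directly under the modules directory that contains a Makefile as a module directory. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of the module directory search; not the real Makefile code.

module_dirs() {
    # $1 = modules directory; prints the name of each subdirectory with a Makefile
    for mk in "$1"/*/Makefile; do
        [ -f "$mk" ] || continue      # skip the unexpanded pattern when nothing matches
        basename "$(dirname "$mk")"
    done
}

build_modules() {
    # $1 = modules directory, $2 = all-dynamic or all-static
    for d in $(module_dirs "$1"); do
        make -C "$1/$d" "$2" || return 1
    done
}
```

Because the list is computed each time, a third-party module directory dropped into place is picked up on the next build with no edits to any existing file.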

Of the two main targets in modules/Makefile, the all-dynamic target is the simpler of the two. It loops through all subdirectories, calling the all-dynamic target in each, then updates a dummy .stamp file if the subdirectory's .stamp file was updated. This file is used by the main Makefile to determine whether version.c should be regenerated.

For static modules, the process is slightly more complicated, since a list of all modules and exported symbols must be provided to the core module manager. In addition to calling each subdirectory's all-static target, the .modext-*, .modsyms-*, and .modlist-* files in each subdirectory are concatenated to form module and symbol lists; these are then compiled into an additional object file, modlist.o, and an archive (modules.a) is created containing this file and all module object files. This archive is then linked into the final executable.

The compilation of individual modules is handled by the modules/Makerules file, which is included by each subdirectory's Makefile. This file is designed so that individual module Makefiles need only set up some variables with lists of files to be compiled and include the Makerules file. However, due in part to limitations of the GNU make program, the actual implementation is quite complex, and as a side effect it takes make considerable time to process the module subdirectories even when no files need to be recompiled.

The limitation in make which I found no easy way around during development is that there is (or was; see the relevant note in section 11-1 for a method that works with newer versions of GNU make) no way to specify a rule in which the name of a variable specifying a dependency varies with the target name. For example, suppose a subdirectory contains two modules, module1 and module2. If each module consists of only one source file and has no other dependencies, the rule is simple (assuming dynamic modules, and assuming the presence of a %.o: %.c compilation rule):

    %.so: %.o
            $(CC_SHARED) $^ -o $@

However, suppose that module1 depends on file aux1.o and module2 depends on files aux2.o and aux3.o. If these are specified in $(OBJECTS-module1.so) and $(OBJECTS-module2.so), one might then be tempted to write:

    %.so: %.o $(OBJECTS-%.so)
            $(CC_SHARED) $^ -o $@

Unfortunately, this does not substitute module1 or module2 in the $(OBJECTS-%.so) reference, but uses the value of the variable literally named OBJECTS-%.so. Nor does it work to use $@ instead of %.so in the variable name:

    %.so: OBJECTS = $(OBJECTS-$@)
    %.so: %.o $(OBJECTS)
            $(CC_SHARED) $^ -o $@

Here, $(OBJECTS) is only defined for commands inside the pattern rule, so it cannot be used in the dependency list. Likewise, $@ is only defined for the command list, so specifying $(OBJECTS-$@) in the dependency list also does not work.

The upshot of all this is that in order to have a dependency list that varies with the target, it is necessary to call make recursively. The bulk of the Makerules file is dedicated to handling this recursive calling and setting variables appropriately for each target.

The Makefile for a module directory typically consists only of variable definitions followed by the line "include ../Makerules". Chief among the variables required is the $(MODULES) variable, which lists the modules contained in the directory; each module is specified with a .so extension (static modules are compiled to .a archives, but the filename translation is handled transparently by the Makerules file), and is assumed to be compiled from a source file of the same name with a .c extension. If any additional object files are required by the module, they are specified in a variable named $(OBJECTS-module.so), where module is replaced by the module name. Header file dependencies can be specified through the $(INCLUDES) variable for dependencies common to all object files, and $(INCLUDES-object.o) for a particular object file. If any object requires a special rule for compilation, that rule can be written in the Makefile, but rules are not needed for the ordinary case of compiling a .c source file to a .o object file.

When the subdirectory's all-dynamic or all-static rule is first invoked, Makerules first checks, via pattern rules, that no modules have names ending in _static (which would conflict with the static module compilation procedure, as described below) or beginning with a period (which would conflict with the temporary files created during compilation). Then, for each module, make is called recursively with the $(TARGET) variable set to the target module name (with no filename extension), the $(OBJECTS) variable set to the contents of the particular module's object list ($(OBJECTS-module.so)), and $(REALLY_COMPILE), the recursion level variable, set to 1.

When called with $(REALLY_COMPILE) equal to 1, Makerules instead defines a rule for the target file, $(TARGET).so or $(TARGET).a. For dynamic modules, this simply involves compiling the relevant files and linking them into a shared object; for static modules, however, the objects will eventually be archived into a single modules.a archive, so no .a file for the individual module is actually needed. What the rule does instead is to record each object file's name in the .$(DIRNAME).lst file, which is then used by the all-static rule to link all object files in the directory into a single object $(DIRNAME).o for storing in the modules.a archive. (A dummy $(TARGET).a file is also created so that make can perform its file timestamp checks.)

In addition, since the static module manager requires a list of names and pointers for all variables and functions exported by each module, the static module compilation rule searches through all source files for EXPORT_VAR(), EXPORT_ARRAY(), and EXPORT_FUNC() macro invocations. (Each object file is assumed to correspond to a single source file with the same name and a .c extension, and no header files are checked.) The export information given to these macros, along with declarations for the implicitly exported variables _this_module_ptr and module_version, array module_config[], and functions init_module() and exit_module(), is written to two temporary files: .modext-module.h, containing extern declarations, and .modsyms-module.c, containing the actual symbol entries. A line with the module name and symbol array pointer is also written to .modlist-module.c for later inclusion in the overall module list.
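
The source scan for export macros can be pictured with a small shell sketch. It is not the rule used in Makerules: the function name is hypothetical, and the sketch assumes that each macro invocation sits on a single line and that its argument text contains no closing parenthesis. It simply prints the macro kind followed by the raw argument text, without interpreting the arguments.

```shell
#!/bin/sh
# Hypothetical sketch: list EXPORT_VAR/EXPORT_ARRAY/EXPORT_FUNC invocations.
# Assumes one invocation per line and no ")" inside the argument list.

list_exports() {
    # $@ = C source files to scan (header files are deliberately not passed)
    sed -n -E \
        's/^[[:space:]]*EXPORT_(VAR|ARRAY|FUNC)\(([^)]*)\).*$/\1 \2/p' "$@"
}
```

A real implementation would go on to turn each line of this output into an extern declaration and a symbol table entry, as the text above describes for the .modext-* and .modsyms-* files.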

For each individual object file, Makerules first (at recursion level 1) generates a rule for compiling the object file that calls make recursively, with $(TARGET) now set to the object file's base filename (with the .o extension stripped), $(INCLUDES2) set to the particular object file's dependency list from $(INCLUDES-object.o), -DMODULE and -DMODULE_ID=module-id appended to $(CFLAGS) (module-id is a C-style identifier derived from the directory and module name, used to make common module identifiers unique), and $(REALLY_COMPILE) set to 2. In addition, for the main file of a module (the source file with the same name as the module), -DMODULE_MAIN_FILE is also appended to $(CFLAGS).

In this second recursion level, the actual source file compilation is performed. The object.o rule does not perform the actual compilation, but depends on a .compiled-object.o dummy file whose rule performs the compilation and on a dummy FRC (force) rule; this hack prevents make from outputting "nothing to do" messages for every unchanged object file. The compilation command itself is similar to that used for the core source code, but the command is prefixed by a cd to the top directory, so that the relative path to the source file is saved in the object file's debug information. This allows debuggers to easily find the proper source file, even if multiple module subdirectories have identically-named source files.

For static modules, a slight change is made for the module's main source file: rather than compiling the source file to module.o, the object file is given the filename module_static.o, and the five implicitly exported variables/functions (init_module(), etc.) are renamed via -D options to names containing the module-id so that they do not cause symbol conflicts with other modules at link time.

10-3-3. Language files

The language files, stored in the lang directory, are precompiled into binary format to speed the startup of Services, as described in section 2-8-4. This precompilation is performed by the langcomp program, compiled from langcomp.c. When called with the all target, the Makefile first compiles langcomp, then runs it on each language source file to generate the corresponding precompiled binary file. In the special case of the Japanese language files (ja_*.l), which correspond to the various encodings common on Japanese computer systems, the EUC file (ja_euc.l) is treated as canonical, and when it changes, the jconv.pl script is automatically run to regenerate the ja_sjis.l file before compiling it to binary format. (There was also a JIS-encoded file, ja_jis.l, in the past, but this was dropped because of extra % characters in the text causing printf() functions to break.)

The list of standard language strings is taken from the English language file, en_us.l; a simple grep is used to extract the string names to the index file, and this file is then used to generate langstrs.h, which contains the string names both as preprocessor (#define) constants and as a string array, available if LANGSTR_ARRAY is defined. The core source file language.c uses this array for looking up string names when loading external language files at runtime.
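
The two generation steps can be sketched in shell as below. This is an illustration under stated assumptions rather than the actual lang/Makefile rules: it assumes string names are the lines of en_us.l that begin with an uppercase letter, and the function names and the array name langstrs are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of index and langstrs.h generation; not the real rules.
# Assumes string names are the lines beginning with an uppercase letter.

make_index() {
    # $1 = language source file; prints one string name per line
    LC_ALL=C grep '^[A-Z]' "$1"
}

make_langstrs_h() {
    # stdin = index file contents; stdout = header text
    awk '
        { names[NR] = $1; printf "#define %s %d\n", $1, NR - 1 }
        END {
            print "#ifdef LANGSTR_ARRAY"
            print "static const char * const langstrs[] = {"
            for (i = 1; i <= NR; i++) printf "    \"%s\",\n", names[i]
            print "};"
            print "#endif"
        }'
}
```

A build would then run something like make_index en_us.l > index followed by make_langstrs_h < index > langstrs.h. Numbering the constants by position keeps each #define value equal to the index of the same name in the array.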

10-3-4. The tools and data directories

The tools directory contains two additional programs: convert-db, the database conversion tool discussed in section 9, and ircservices-chk, a simple script designed to run from a periodic execution utility such as cron to ensure that Services is restarted if it should stop for any reason (such as a crash).

In addition to the main source file convert-db.c and the convert-*.c source files that handle particular database types, convert-db makes use of three source files from the main source code: compat.c, containing compatibility functions; modules/database/fileutil.c, containing routines to read and write data in binary database files; and modules/misc/xml-export.c, containing routines to generate an XML file from loaded data. These three files are compiled using special rules, which include the -DCONVERT_DB compiler option to trigger special handling in the source files for the convert-db tool.

The "compilation" of ircservices-chk consists of simply replacing the @PROGRAM@, @BINDEST@ and @DATDEST@ fields in the template file ircservices-chk.in with the actual file/pathnames, writing the output to the file ircservices-chk (more precisely, $(PROGRAM)-chk, where $(PROGRAM) is the value of the -program option given to the configure script), and marking that file executable with chmod.
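
A minimal shell sketch of this template substitution follows. The function name and argument order are hypothetical, and the sketch assumes that the substituted values contain none of the characters "|", "&", or backslash, which sed would treat specially.

```shell
#!/bin/sh
# Hypothetical sketch of the template fill; not the actual tools/Makefile rule.
# Assumes the values contain no "|", "&", or backslash characters.

fill_template() {
    # $1 = template file, $2 = PROGRAM, $3 = BINDEST, $4 = DATDEST, $5 = output file
    sed -e "s|@PROGRAM@|$2|g" \
        -e "s|@BINDEST@|$3|g" \
        -e "s|@DATDEST@|$4|g" "$1" > "$5" &&
    chmod +x "$5"
}
```

For example, fill_template ircservices-chk.in "$PROGRAM" "$BINDEST" "$DATDEST" "$PROGRAM-chk" would produce an executable script named after the configured program.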

Likewise, the two sample configuration files (example-ircservices.conf and example-modules.conf) in the data subdirectory are generated from template files, replacing occurrences of @PROGRAM@ with the actual program name to give appropriate defaults for various file names.


10-4. Installation

The install target, which, like the all target, defers actual work to a second target (myinstall), first creates the target directories, $(BINDEST) and $(DATDEST), if they do not exist; copies the main executable file to the $(BINDEST) directory; and calls the install target in the modules, lang, tools, and data directories. However, when static modules are being used, the module install is skipped (since the module objects are linked directly into the executable, there is nothing to install).

The module installation consists of calling the install target in each module subdirectory; this target, declared in the modules/Makerules file, creates a subdirectory of the same name inside the modules directory under $(DATDEST), then copies all module shared-object files into that directory.

The language installation first creates a language directory under $(DATDEST); the precompiled language files are then copied into that directory.

The tool installation copies the ircservices-chk script to $(BINDEST), and the convert-db program to $(DATDEST). The latter is not copied to the executable file directory to avoid the possibility that the executable file name conflicts with another program installed on the system. (A better solution might be to rename the program to something like ircservices-convert.)

The data file installation copies the example configuration files, example-ircservices.conf and example-modules.conf, to $(DATDEST); example-ircservices.conf is renamed at this time to example-$(PROGRAM).conf. If the helpfiles directory does not exist under $(DATDEST), it is copied from the helpfiles subdirectory of the data directory.

Since the install target depends on the all (or more precisely, the myall) target, it is also possible to perform compilation and installation in one step by simply executing make install.

If the variable INSTALL_PREFIX is set, its value is prepended to all pathnames used for installation; for example, the ircservices executable file is installed to $(INSTALL_PREFIX)$(BINDEST)/ircservices. This allows Services to be installed to an alternate root directory, such as when preparing a chroot'd environment or a distribution image. This variable is not set by any of the Makefiles, but can be set on the make command line. (Note that there is no slash after $(INSTALL_PREFIX); inserting one would have the side effect of prefixing $(BINDEST) and $(DATDEST) with a slash when no prefix was given, which could potentially have undesired side effects.)
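
The path composition, and the reason no slash follows the prefix, can be shown with a short shell sketch. The helper name is hypothetical, and /usr/local/sbin is only an example value for $(BINDEST), not a documented default.

```shell
#!/bin/sh
# Hypothetical helper mirroring $(INSTALL_PREFIX)$(BINDEST)/file composition.

install_path() {
    # $1 = INSTALL_PREFIX (may be empty), $2 = destination directory, $3 = file name
    printf '%s%s/%s\n' "$1" "$2" "$3"
}

install_path ""         /usr/local/sbin ircservices   # /usr/local/sbin/ircservices
install_path /tmp/stage /usr/local/sbin ircservices   # /tmp/stage/usr/local/sbin/ircservices
```

With a slash inserted after the prefix, the first call would instead produce //usr/local/sbin/ircservices, which is the side effect the note above warns about. On the command line, the staged case corresponds to running make install INSTALL_PREFIX=/tmp/stage.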


10-5. Assumptions

Finally, it is worth noting a few assumptions made by the source code. These are all believed to hold on any system Services is likely to be used on, and some are double-checked by the configure script.