IRC Services Technical Reference Manual

8. Other modules

8-1. Encryption modules
    8-1-1. encryption/md5: MD5 hashing
    8-1-2. encryption/unix-crypt: Encryption with the crypt() system function
8-2. HTTP server modules
    8-2-1. Client data structure and related constants
    8-2-2. HTTP server utility routines
    8-2-3. httpd/main: Main server module
    8-2-4. httpd/auth-ip: Authorization by IP address
    8-2-5. httpd/auth-password: Authorization by password
    8-2-6. httpd/top-page: Static page for server root
    8-2-7. httpd/redirect: Redirects to nickname/channel URLs
    8-2-8. httpd/dbaccess: Provides database access via HTTP
    8-2-9. httpd/debug: Debugging module
8-3. Mail-sending modules
    8-3-1. mail/main: Main mail module
    8-3-2. mail/sendmail: Sends mail using the sendmail program
    8-3-3. mail/smtp: Sends mail using SMTP
8-4. Miscellaneous modules
    8-4-1. misc/xml-export: Data export using XML
    8-4-2. misc/xml-import: Data import using XML


8-1. Encryption modules

As discussed in section 2-9-1, Services includes facilities for encrypting passwords. While the Services core provides an interface for encryption, the actual encryption processing is handled by encryption modules, located in the modules/encryption directory. Two encryption modules are included with Services: encryption/md5, using the MD5 hash function to encrypt passwords, and encryption/unix-crypt, using the system library's crypt() function.

Encryption modules generally have three parts:

The three CipherInfo functions mentioned above provide encryption, decryption, and encrypt-and-compare functionality for the particular cipher implemented by the module. They are defined as follows (the actual function names are of course up to the particular module):

int encrypt(const char *src, int len, char *dest, int size)
Encrypts the plaintext stored in src, which is len bytes long, and stores the result in the buffer pointed to by dest (of size size bytes). The source plaintext is not (necessarily) null-terminated, and should be treated as a block of binary data rather than a textual string. Returns:
int decrypt(const char *src, char *dest, int size)
Decrypts the ciphertext stored in src, storing the result in the buffer pointed to by dest (of size size bytes). Returns:
int check_password(const char *plaintext, const char *password)
Compares the null-terminated string plaintext against the encrypted data password. Returns:

The core encryption source file, encrypt.c in the top source directory, contains definitions of these three functions for use when no encryption module is loaded; the functions simply copy the plaintext string into or out of the provided encryption buffer, truncating as necessary. (As a result, only the first PASSMAX bytes of longer passwords are valid; any password beginning with those same bytes will be treated as equivalent, similar to the way old Unix-like systems ignored any characters in passwords after the first 8.)
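
The truncation behavior described above can be illustrated with a minimal sketch. The names plain_store() and plain_check() are hypothetical and are not the functions defined in encrypt.c; the zero-padding of the buffer and the PASSMAX value of 32 (the default mentioned in section 8-1-2) are assumptions made for the example.

```c
#include <string.h>

#define PASSMAX 32  /* assumed; the default mentioned in section 8-1-2 */

/* Hypothetical stand-in for the no-module "encrypt" step: copies at most
 * `size` bytes of plaintext into dest and zero-pads any remainder. */
static void plain_store(const char *src, int len, char *dest, int size)
{
    int n = len < size ? len : size;
    memset(dest, 0, (size_t)size);
    memcpy(dest, src, (size_t)n);
}

/* Hypothetical check: returns 1 if plaintext matches the stored buffer
 * once the same truncation has been applied, 0 otherwise. */
static int plain_check(const char *plaintext, const char *stored)
{
    char buf[PASSMAX];
    plain_store(plaintext, (int)strlen(plaintext), buf, PASSMAX);
    return memcmp(buf, stored, PASSMAX) == 0;
}
```

With this sketch, a 40-character password and any other password sharing its first 32 characters compare as equal, which is the equivalence the paragraph above warns about.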

8-1-1. encryption/md5: MD5 hashing

The encryption/md5 module, defined in md5.c, uses the MD5 message-digest algorithm to encrypt passwords. The bulk of the file consists of a literal copy of the md5c.c implementation published by RSA Data Security, Inc.; the CipherInfo implementation function md5_encrypt() simply calls these functions to obtain a 16-byte hash of its input and returns that hash (as binary data, not a hexadecimal string).

Of the remaining two CipherInfo functions, md5_decrypt() simply returns the special value -2, indicating that MD5 passwords cannot be decrypted; md5_check_password() calls md5_encrypt() on the plaintext string it is passed, comparing the resulting hash against the given password buffer to determine whether the password is correct.

The module includes one configuration option, EnableAnopeWorkaround. This is intended to be used with databases that have been imported from the Epona or Anope programs, some versions of which have a bug (which, to be fair, was inherited from an earlier version of Services) causing MD5-encrypted passwords to be stored incorrectly. The bug is in assuming that the MD5Final() routine returns an ASCII string of hexadecimal characters—in fact, it returns the raw 128-bit hash value—and attempting to convert that value into binary, resulting in 8 bytes of garbled hash data and 8 bytes that are essentially random. The workaround implemented by EnableAnopeWorkaround performs this same procedure when checking passwords if the hash itself does not match; since it only compares the 8 valid bytes of the corrupted hash, there is naturally a greater possibility of a hash collision, which would result in an incorrect password mistakenly being signaled as correct. See also the relevant part of section 5-3-2 of the user's manual.

8-1-2. encryption/unix-crypt: Encryption with the crypt() system function

The encryption/unix-crypt module, defined in unix-crypt.c, makes use of the crypt() function defined in the system libraries to encrypt passwords. Due to this, it may not be a desirable choice where portability of data is concerned, since differing systems may have incompatible implementations of crypt(); on the other hand, it allows Services to take advantage of more secure encryption algorithms as the operating system comes to support them, without having to write new Services modules as well. The impetus for the development of this module was the use of crypt() as one encryption method in the PTlink Services program (coincidentally, it was also this program's use of a "cipher type" field stored with passwords that provided the inspiration for the redesign of encryption functionality in Services 5.0).

The only noteworthy aspect of the encryption/unix-crypt module is the encryption routine, unixcrypt_encrypt(). Since the crypt() function requires a null-terminated password string (the input is not guaranteed to be null-terminated) and a "salt" parameter, these have to be prepared beforehand; the password is copied into a buffer of size PASSMAX and a trailing null attached, and the "salt" string is generated using the random() function. These are then passed to crypt(), and the result copied into the output buffer, assuming it is large enough. (Some modern systems implement crypt() using an MD5 hash, returned as a 32-character hexadecimal string with a distinguishing prefix; for such cases, PASSMAX must be raised from the default of 32, or passwords will not fit.)
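
The preparation steps described above might look roughly like the following sketch. It is not the code of unixcrypt_encrypt(): the helper names are hypothetical, rand() is used in place of random() for portability, the two-character salt drawn from the traditional crypt() alphabet is an assumption, and rejecting over-long input (rather than truncating it) is likewise an assumption. The call to crypt() itself is omitted.

```c
#include <stdlib.h>
#include <string.h>

#define PASSMAX 32  /* default value given in the text above */

/* Copy len bytes of (possibly unterminated) input into buf, which must
 * hold PASSMAX bytes, and append a null byte.  Returns 0 on success or
 * -1 if the input does not fit (assumed behavior). */
static int prepare_password(const char *src, int len, char *buf)
{
    if (len < 0 || len >= PASSMAX)
        return -1;
    memcpy(buf, src, (size_t)len);
    buf[len] = '\0';
    return 0;
}

/* Fill salt (3 bytes) with two random characters from the traditional
 * crypt() salt alphabet, followed by a null byte. */
static void make_salt(char *salt)
{
    static const char set[] =
        "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    salt[0] = set[rand() % 64];
    salt[1] = set[rand() % 64];
    salt[2] = '\0';
}
```

The resulting buffer and salt would then be handed to crypt(), and the returned string copied to the output buffer if it fits.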


8-2. HTTP server modules

Services includes a simple HTTP server that can be used to access Services data from outside IRC. The server is implemented by several modules in the modules/httpd directory: a core server module (section 8-2-3), authorization modules (sections 8-2-4 and 8-2-5), and resource modules (sections 8-2-6 through 8-2-9). All modules make use of a common header file containing data structure and constant definitions, described in section 8-2-1; there are also several utility functions shared by all modules (and compiled into the core server module), discussed in section 8-2-2.

8-2-1. Client data structure and related constants

All modules make use of the header file http.h. This header file contains a definition of the Client structure, used by the modules to store information about a single client, along with various HTTP-server-related constants and declarations of the utility routines listed in section 8-2-2.

The Client structure contains the following fields:

Socket *socket
Contains the Socket structure used for communicating with the client (see section 3).
Timeout *timeout
A timeout (see section 2-7) used to disconnect clients after a certain period of idle time.
char address[22]
The client's IP address and port number, as a string. (22 bytes is exactly long enough to hold a string of the form "123.123.123.123:12345".)
uint32 ip
The client's IP address, in network byte order.
uint16 port
The client's (remote) port number, in network byte order.
int request_count
The number of requests that the client has made over the course of the connection, used to disconnect clients that make more than a certain number of requests.
int in_request
A flag indicating whether a request is currently being processed for the client.
char *request_buf
The buffer used to hold request data received from the client.
int32 request_len
The number of bytes of request data received from the client for this request (i.e., the number of bytes stored in request_buf).
int version_major
The major version of HTTP in use (the "x" in HTTP/x.y).
int version_minor
The minor version of HTTP in use (the "y" in HTTP/x.y).
int method
The request method (one of the METHOD_* constants; see below).
char *url
The URL given by the client. Points into request_buf.
char *data
POST data for the request, or the query string for a GET or HEAD request. Points into request_buf.
int32 data_len
POST data length, in bytes.
char **headers
int32 headers_count
A variable-length array containing the request headers. Each element of the array consists of the header name and its value separated by a null byte; the entries point into request_buf.
char **variables
int32 variables_count
A variable-length array containing any variables found in POST data or a GET or HEAD request. Each element of the array consists of the variable's name and value separated by a null byte, with URL escapes converted to their respective characters.
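
As an illustration of the "name, null byte, value" layout used by the headers and variables arrays, the following sketch looks up entries in such an array. MiniClient and find_header() are hypothetical names invented for the example (the real lookup routine is http_get_header(), described in section 8-2-2), the explicit cursor parameter is a simplification, and the case-insensitive name comparison is an assumption based on HTTP header names being case-insensitive.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical cut-down version of the Client structure. */
typedef struct {
    char **headers;       /* each entry: "Name\0value" */
    int headers_count;
} MiniClient;

/* Case-insensitive comparison of two null-terminated names. */
static int names_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Return the value of the next header called `name` at or after index
 * *cursor, advancing *cursor past it; return NULL when none remain. */
static const char *find_header(const MiniClient *c, const char *name,
                               int *cursor)
{
    for (int i = *cursor; i < c->headers_count; i++) {
        if (names_equal(c->headers[i], name)) {
            *cursor = i + 1;
            /* The value starts just past the name's null byte. */
            return c->headers[i] + strlen(c->headers[i]) + 1;
        }
    }
    *cursor = c->headers_count;
    return NULL;
}
```

Starting with a cursor of zero and calling find_header() repeatedly walks through all headers of the same name in order.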

There are also several constants defined by the header file:

HTTP_LINEMAX (4096)
Defines the maximum length (including the trailing null byte) of a request line that the server will handle. Lines longer than this will cause the request to be aborted with an HTTP error.
HTTP_AUTH_*
Constants used as return values from authorization functions (see section 8-2-3).
HTTP_METHOD_*
Constants used to indicate the request method in the method field of the Client structure.

These are followed by constants for the various HTTP return codes, as defined by the relevant RFC documents. Not all (or even most) of these are used by Services modules, but all are included for completeness. The name of each constant includes a character indicating the type of response (much like the first digit of the numeric code): "I" for Informational, "S" for Successful, and so on.

8-2-2. HTTP server utility routines

The util.c source file contains several common functions used by HTTP server modules, listed below. util.c is linked into the main HTTP server module, httpd/main, so all submodules can make use of them without the necessity of explicitly importing each function.

char *http_get_header(Client *c, const char *header)
Returns the contents of the header header in the given client's currently active request, or NULL if the request did not include such a header. If header is NULL, returns the next instance of the header last searched for; this usage allows the caller to cycle through multiple headers of the same name, much like strtok() iterates through tokens in a string.
char *http_get_variable(Client *c, const char *variable)
Returns the contents of the variable variable in the given client's currently active request, or NULL if the request did not include such a variable. Like http_get_header(), a NULL value for the variable parameter allows iterating through multiple instances of a variable.
char *http_quote_html(const char *str, char *outbuf, int32 outsize)
Applies HTML-style quoting to str, replacing the characters < > & with "&lt;", "&gt;", and "&amp;" respectively. The result is placed in outbuf, and is truncated if necessary to fit within outsize bytes, including the trailing null byte; however, HTML entities inserted by this routine will never be partially truncated (if an entity would cause a buffer overflow, the output string will be terminated at the location where the entity would have been inserted). The routine returns outbuf, except when a parameter is invalid, in which case NULL is returned.
char *http_quote_url(const char *str, char *outbuf, int32 outsize, int slash_question)
Applies URL escaping to str, replacing with their equivalent %nn escapes any characters not in the set:
    A-Z a-z 0-9 - . _
As with http_quote_html(), stores the (possibly truncated, but without partial escapes) result in outbuf, and returns outbuf, or NULL on invalid parameters.
char *http_unquote_url(char *buf)
Converts any URL escapes in the string buf to their corresponding characters, overwriting the buffer. A truncated escape at the end of the string is discarded, as is any malformed escape (a % followed by two characters, one or both of which are not hexadecimal digits). Returns buf. (Note that Unicode escapes of the form %Unnnn are not handled by this routine, and will be interpreted as a malformed escape followed by three ordinary characters.)
void http_send_response(Client *c, int code)
Sends an HTTP response line with the response code code, followed by a Date: header. The header portion of the response is not terminated, so the caller can send additional headers as necessary.
void http_error(Client *c, int code, const char *format, ...)
Sends an error message (response headers and body) to the given client, then closes the client's connection. The HTTP response code for the error message is given by code. format gives an optional printf()-style format string to use for generating the body of the error message; if it is NULL, then default body text is chosen based on the response code.
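
The truncation and malformed-escape rules given above for http_quote_html() and http_unquote_url() can be made concrete with a short sketch. These are not the util.c implementations: the function names carry a _sketch suffix to mark them as hypothetical, and treating a NULL pointer or nonpositive buffer size as the "invalid parameter" case is an assumption.

```c
#include <ctype.h>
#include <string.h>

/* Quote < > & as entities; never emit a partial entity. */
static char *quote_html_sketch(const char *str, char *outbuf, int outsize)
{
    char *out = outbuf;
    if (!str || !outbuf || outsize <= 0)
        return NULL;                      /* assumed "invalid parameter" */
    for (; *str; str++) {
        char one[2] = { *str, '\0' };
        const char *rep = one;
        size_t n;
        if (*str == '<') rep = "&lt;";
        else if (*str == '>') rep = "&gt;";
        else if (*str == '&') rep = "&amp;";
        n = strlen(rep);
        /* Stop here if rep plus the trailing null would not fit. */
        if ((size_t)(out - outbuf) + n + 1 > (size_t)outsize)
            break;
        memcpy(out, rep, n);
        out += n;
    }
    *out = '\0';
    return outbuf;
}

static int hexval(int ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    return tolower((unsigned char)ch) - 'a' + 10;
}

/* Convert %nn escapes in place; drop truncated or malformed escapes. */
static char *unquote_url_sketch(char *buf)
{
    char *in = buf, *out = buf;
    while (*in) {
        if (*in != '%') {
            *out++ = *in++;
            continue;
        }
        if (!in[1] || !in[2])
            break;                        /* truncated escape at the end */
        if (isxdigit((unsigned char)in[1]) && isxdigit((unsigned char)in[2]))
            *out++ = (char)(hexval(in[1]) * 16 + hexval(in[2]));
        in += 3;                          /* malformed: all 3 bytes dropped */
    }
    *out = '\0';
    return buf;
}
```

Under these rules the string "%U1234" becomes "234": the sequence "%U1" is discarded as a malformed escape, and the remaining three characters are copied as ordinary text.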

8-2-3. httpd/main: Main server module

The core of the HTTP server is implemented by the httpd/main module, defined in the source file main.c (along with util.c, mentioned above). This module takes care of establishing a listener socket with which to accept client connections, receiving and parsing requests from clients, and passing those requests off to handlers which generate data to send back to the client. (The core module does not respond to any requests by itself, except for generating errors for requests that cannot be successfully processed.)

Unlike most other modules, which take actions in response to messages received from the IRC network, the HTTP server operates independently, relying on the socket framework (see section 3) to inform it of activity. The module initialization routine, init_module(), opens the port or ports specified by the ListenTo configuration directive, creating listener sockets which call back to the do_accept() function when a connection is received. The initialization routine also creates two callbacks, "auth" and "request", into which submodules can hook to provide authorization or request handling services; these are covered in the discussion of request handling below.

When a connection has been accepted on a socket, the do_accept() routine first ensures that the client address is available (as it may be necessary for authorization purposes), then creates and initializes a Client structure in which to store information about the client. This is done before checking the number of active connections so that, if the client is to be disconnected due to load, an appropriate error response can be sent with http_error() (which requires a valid Client structure). If all goes well, read-line and disconnect callbacks are set on the new socket, along with a timeout (as given by the IdleTimeout configuration directive), and do_accept() returns.

The actual request processing takes place in two stages: first the full request is received from the client (unless the connection is aborted with an error), and then the request is passed to the relevant handlers. These stages are handled by the do_readline() socket callback function and the handle_request() routine.

do_readline() is called for each line of the request received from the client, and parses each line into appropriate parts of the Client structure. The routine tells the first (request) line from subsequent (header) lines by whether or not the url field of the Client structure is set; if the first line has been successfully processed, this field will always have a non-NULL value. Header lines are handled by the subroutine parse_header(), which checks whether the line is a new header or a continuation line of a previous header and processes it accordingly.

Once the blank line signaling the end of headers has been received, do_readline() checks whether the request has a body part (a POST request with a nonzero Content-Length header). If so, the read-line callback on the socket is removed, and do_readdata() is instead added as a read-data callback; do_readdata() reads in the requisite number of body data bytes and calls handle_request(). Otherwise, do_readline() calls handle_request() itself, after first truncating any query portion of the URL of a GET or HEAD request and putting the query data in the Client structure's data field.
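
The truncation of the query portion of a GET or HEAD URL can be pictured with the following sketch. split_query() is a hypothetical helper, not a function in main.c; it only shows the in-place split that leaves the path in one string and the query data in another, both within the same buffer.

```c
#include <string.h>

/* Split url in place at the first '?'.  The path remains in url; the
 * return value points at the query string, or is NULL if there is no
 * query portion. */
static char *split_query(char *url)
{
    char *q = strchr(url, '?');
    if (!q)
        return NULL;
    *q = '\0';        /* terminate the path where the query began */
    return q + 1;     /* query data follows the former '?' */
}
```

In terms of the Client structure, the path would correspond to the url field and the returned pointer to the data field.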

handle_request() first takes any GET query or POST data and splits it up into variables and values, by calling either parse_data() or parse_data_multipart() depending on the request type. After this, it increments the client's request count, sets the in_request flag, and then sets a local variable close which is used to indicate whether the client connection should be closed when the request processing is finished. After this setup is complete, handle_request() calls the two callbacks "auth" and "request" to perform the actual request handling; callback functions for both callbacks take the Client structure and a pointer to the close variable (which may be modified) as parameters.

The "auth" callback is used for request authorization. Each callback function must return one of the HTTP_AUTH_* values defined in http.h. A value of HTTP_AUTH_ALLOW causes the request to be allowed at that point, skipping any subsequent callback functions; likewise, a value of HTTP_AUTH_DENY causes the request to be immediately denied. HTTP_AUTH_UNDECIDED can be used when the callback function has nothing to say about the instant request, and allows the next callback function to handle authorization. If all callback functions return HTTP_AUTH_UNDECIDED (or no callback functions are registered), the request is allowed.
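
The decision rule for the "auth" callback can be summarized in a small sketch. The AUTH_* constants, the AuthFunc type, and the sample callback functions below are hypothetical stand-ins (the real callbacks take the Client structure and a pointer to the close variable, and use the HTTP_AUTH_* constants from http.h); only the order of evaluation is meant to mirror the description above.

```c
#include <string.h>

/* Hypothetical stand-ins for the HTTP_AUTH_* constants. */
enum { AUTH_UNDECIDED, AUTH_ALLOW, AUTH_DENY };

/* Simplified callback type: the sketch passes only the URL. */
typedef int (*AuthFunc)(const char *url);

/* Returns 1 if the request is allowed, 0 if it is denied. */
static int run_auth_chain(AuthFunc *funcs, int count, const char *url)
{
    for (int i = 0; i < count; i++) {
        int r = funcs[i](url);
        if (r == AUTH_ALLOW)
            return 1;         /* allowed; later functions are skipped */
        if (r == AUTH_DENY)
            return 0;         /* denied immediately */
    }
    return 1;                 /* all undecided, or none registered */
}

/* Sample callbacks used to exercise the chain. */
static int auth_undecided(const char *url)
{
    (void)url;
    return AUTH_UNDECIDED;
}

static int auth_deny_private(const char *url)
{
    return strncmp(url, "/private/", 9) == 0 ? AUTH_DENY : AUTH_UNDECIDED;
}

static int auth_allow_all(const char *url)
{
    (void)url;
    return AUTH_ALLOW;
}
```

Note that the position of a callback in the chain matters: an "allow everything" function placed ahead of a denying function would prevent the latter from ever being consulted.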

The "request" callback is used for actual request processing. Each callback function should check the URL to determine whether it is one to be processed by that function or not; if so, then the routine should take appropriate action and return a nonzero value, causing any subsequent callback functions to be skipped. If all callback functions return zero, the core server module will send a "not found" (404) error to the client.

Once the request has been processed, handle_request() either closes the socket or clears out the Client structure, depending on whether the close flag is set (nonzero) or clear (zero). In the latter case, request processing for the connection then starts over with parsing of request lines by do_readline(). As an adjunct to the close flag, if the Client structure's in_request field has a negative value, the connection is closed as well; this is to allow http_error(), which does not receive a pointer to the close flag, to signal that the client should be disconnected.

It should be noted that client sockets are set to blocking mode (see the description of sock_set_blocking() in section 3-2-1), to simplify implementation of request handlers. Depending on the modules and settings used, this can allow a malicious user to cause Services to freeze by requesting a large amount of data from Services (enough to increase the socket buffer to its maximum size) and deliberately not receiving any of that data.

8-2-4. httpd/auth-ip: Authorization by IP address

The httpd/auth-ip module, defined in auth-ip.c, is one of two authentication modules included with the Services HTTP server, and allows requests to be allowed or denied based on the IP address of the client. The module maintains a list of allow/deny rules, each with an associated URL path prefix, IP address, and network mask; when a request is found that matches a rule's prefix/address/mask triplet, the request is either allowed or denied based on the type of rule. (If the request matches more than one rule, only the first in the table—also the first in the file—is applied.)

The callback function for the server core's "auth" callback, do_auth(), is very simple, needing only to iterate through the rule table to find a matching rule for the request. The hard work of converting the list of AllowHost and DenyHost rules into a table that can be easily processed is handled at module configuration time via custom handler functions for the two directives, do_AllowHost() and do_DenyHost(). In fact, these are both stubs which call a common routine, do_AllowDenyHost(), with an extra parameter to indicate the rule type (allow or deny).

Note that this module interprets "allow" rules to mean "allow unless denied by another authorization method", and not "allow regardless of any other circumstances". Thus, if a request matches an "allow" rule, the callback function returns HTTP_AUTH_UNDECIDED rather than HTTP_AUTH_ALLOW.
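
The first-match rule lookup and the "allow means undecided" convention can be sketched as follows. The IPRule structure, the constant names, and check_rules() are hypothetical and do not reflect the actual table layout in auth-ip.c; addresses are written in host byte order for readability, and returning "undecided" when no rule matches is an assumption.

```c
#include <stdint.h>
#include <string.h>

enum { RULE_ALLOW, RULE_DENY };
enum { AUTH_UNDECIDED, AUTH_ALLOW, AUTH_DENY };  /* stand-ins for HTTP_AUTH_* */

/* Hypothetical rule table entry. */
typedef struct {
    const char *prefix;   /* URL path prefix the rule applies to */
    uint32_t ip;          /* address to match */
    uint32_t mask;        /* network mask applied before comparing */
    int type;             /* RULE_ALLOW or RULE_DENY */
} IPRule;

/* Apply the first rule whose prefix and masked address both match. */
static int check_rules(const IPRule *rules, int count,
                       const char *url, uint32_t ip)
{
    for (int i = 0; i < count; i++) {
        const IPRule *r = &rules[i];
        if (strncmp(url, r->prefix, strlen(r->prefix)) != 0)
            continue;
        if ((ip & r->mask) != (r->ip & r->mask))
            continue;
        /* An "allow" rule leaves the decision to other modules. */
        return r->type == RULE_DENY ? AUTH_DENY : AUTH_UNDECIDED;
    }
    return AUTH_UNDECIDED;    /* assumed result when nothing matches */
}
```

A common arrangement under these semantics is an "allow" rule for a trusted network followed by a "deny" rule with a zero mask, which matches every other address.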

8-2-5. httpd/auth-password: Authorization by password

The httpd/auth-password module, defined in auth-password.c, performs authorization based on a username and password provided by a client (using the WWW-Basic HTTP authorization method). If a request is denied, the authorization handler sends an HTTP "401 Unauthorized" message to the client, giving the realm name specified in the rule to provide a user prompt. Other than this, and the comparative simplicity of the configuration directive handler functions, this module is more or less identical to auth-ip.c.

As with the httpd/auth-ip module (and also mentioned in comments in the source code for this module), "allow" rules are treated as "allow subject to other permission checks" rather than "allow unconditionally", and the callback function do_auth() returns HTTP_AUTH_UNDECIDED rather than HTTP_AUTH_ALLOW for such rules.

8-2-6. httpd/top-page: Static page for server root

The httpd/top-page module, defined in top-page.c, is a very simple request handler which (depending on configuration settings) sends either the contents of a local file or an HTTP redirect in response to a request for the server's top page ("/").

8-2-7. httpd/redirect: Redirects to nickname/channel URLs

The httpd/redirect module, defined in redirect.c, allows URLs stored with registered nicknames and channels to be accessed through the HTTP server. Two URL prefixes, one each for nicknames and channels, are defined via configuration directives (NicknamePrefix and ChannelPrefix respectively); when a request is received that matches one of the prefixes, the remainder of the URL is used as a nickname or channel name, and a redirect is sent for the URL associated with the nickname or channel (if not registered or no URL is stored, an error is returned).

Since the "#" character is treated specially by web browsers, channel names are specified without the "#", which is added back internally when accessing the channel's data. For example, if ChannelPrefix is "/channel/", then a URL of "/channel/SomeChannel" will redirect to the URL record for the channel #SomeChannel.
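
The mapping from a request URL to a channel name might be sketched as below. channel_from_url() is a hypothetical helper written for this example, not a function in redirect.c, and rejecting an empty channel name or an over-long result is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* If url begins with prefix and names a channel, write "#<name>" to out
 * and return 1; otherwise return 0. */
static int channel_from_url(const char *url, const char *prefix,
                            char *out, size_t outsize)
{
    size_t plen = strlen(prefix);
    int n;

    if (strncmp(url, prefix, plen) != 0 || url[plen] == '\0')
        return 0;                         /* not ours, or no name given */
    n = snprintf(out, outsize, "#%s", url + plen);
    return n > 0 && (size_t)n < outsize;  /* fail if the name was cut off */
}
```

With a prefix of "/channel/", the URL "/channel/SomeChannel" yields the channel name "#SomeChannel", which can then be used to look up the stored URL.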

Naturally, in order to access nickname and channel data, the module must interface with the NickServ and ChanServ modules. This is done via the "load module" and "unload module" callbacks, which watch for the nickserv/main and chanserv/main modules to be loaded and save pointers to necessary functions. To avoid problems arising from the order in which the module is loaded, the init_module() routine also checks for the presence of these modules, and calls the "load module" callback function do_load_module() manually if they are already loaded.

8-2-8. httpd/dbaccess: Provides database access via HTTP

The httpd/dbaccess module, defined in dbaccess.c, provides access to the data stored in the Services pseudoclient databases. It is easily the most complex of the HTTP server modules, as it must interface with each of the pseudoclient modules to obtain the data it provides to the client, and it must remain up-to-date with any changes to the internal data storage format used by the various modules.

At the top of the file are several definitions used to simplify access to imported functions and variables. As noted in the source code, these are only referenced when the corresponding module has been loaded and the symbols successfully dereferenced, so there is no need to check the pointers for NULL values. (Implementation note: Nonetheless, it would be a good idea to do so anyway, just in case.) These are followed by the PRINT_SELOPT() macro, used to generate HTML for selecting among one of several display options, and the my_strftime() function, which converts a time_t timestamp value to a standard-format string and HTML-quotes the result.

The main request handler routine, do_request(), is located following these initial definitions. The only actual work performed by this routine, however, is checking the URL against the prefix defined for use by the module (in the Prefix variable, set by the same-named configuration directive), and generating a root page under Prefix redirecting to each of the available sets of data, one per pseudoclient (and one for XML export, as noted below). All requests for subpages are delivered to the appropriate subpath handler.

This routine is followed by the subpath handlers themselves, each with a name of the form handle_XXX() indicating the subpath handled by the routine (with a few exceptions, noted below). Each handler takes the Client *c and int *close_ptr parameters from the original request, along with a char *path parameter indicating the remainder of the URL path below the handler's own subpath.

The first of these handlers is the OperServ data handler, handle_operserv(). In addition to the current number of users and operators along with basic data recorded by OperServ (the maximum user count and time), the page includes links to further subhandlers for autokills and exclusions, news items, session exceptions, and S-lines. Each of these has its own handler function; with the exception of news items (handled by handle_operserv_news()), the subhandlers make use of a common routine, handle_operserv_maskdata(), to output the appropriate data. (However, there is no support for an explicit path /operserv/maskdata.)

The handle_operserv_maskdata() routine has two modes of operation, as do many of the lowest-level data handlers. When called with no further subpath (e.g. /operserv/akill/), a list of mask-data records of the appropriate type is sent to the client as a list of links. Selecting one of these will go to a path with that string as the final path element, and will cause the routine to display detailed information about the selected entry, much like using the VIEW subcommand of OperServ's various mask-data commands.

Unlike the other OperServ data sets, there is no detailed information to show about news items. Therefore, the handle_operserv_news() routine simply outputs a list of news items (both logon news and operator news), like the LOGONNEWS LIST and OPERNEWS LIST commands.

The OperServ data handlers are followed by handle_nickserv(), for displaying nickname data. Unlike handle_operserv(), this routine does not call on any subroutines, as there are only two modes of operation: listing registered nicknames (handled at the top of the routine) and displaying detailed information on a specific nickname (handled by the long remainder of the routine). The length of the routine is mainly the result of the need to quote all special characters in nickname data, to prevent malicious users from corrupting the output by setting particular strings in their nickname data.

This is followed by handle_chanserv(), which functions similarly to handle_nickserv() except that it works on channels rather than nicknames. However, to reduce the amount of data sent in response to a single request, the privilege level, channel access, and autokick lists are split off into separate pages, accessed by appending "/levels", "/access", or "/autokick" respectively to the URL. The local variable mode keeps track of what type of data the routine is to display.

Next is handle_statserv(), which predictably displays information from the StatServ pseudoclient's database. As StatServ currently only tracks a minimal amount of data, the implementation is comparatively simple, either listing the servers recorded with StatServ or displaying information for a selected server.

Finally, handle_xml_export() is used to generate an XML data set containing all data registered with Services pseudoclients, using the misc/xml-export module described in section 8-4-1. As browsers may attempt to parse the data rather than displaying or saving it if a content type of text/xml is used, the module instead sends the type text/plain. (The acerbic comment in the source code has to do with a misfeature in at least some versions of the Microsoft Internet Explorer web browser; such versions ignore a Content-Type: text/plain header and attempt to interpret the data using internal heuristics, resulting in users being unable to view the XML data.)

8-2-9. httpd/debug: Debugging module

The httpd/debug module, defined in debug.c, is intended to be used for debugging the HTTP server, and dumps several fields of the Client structure in response to requests to a particular URL (set by the DebugURL configuration directive). While the module does not return any sensitive information to the client, only information about the client itself, it is still bad practice to leave any unnecessary functionality such as this enabled, so this module should not be (and is not intended to be) loaded except when debugging.

The do_request() function in the source code, which does the actual request handling, also includes a number of comments explaining the request-handling process in more detail.


8-3. Mail-sending modules

In order to facilitate features such as mail authentication and memo forwarding, Services includes a set of modules allowing mail to be sent to remote systems. As with the built-in HTTP server described in section 8-2, this functionality operates independently of the primary pseudoclients and IRC network connection (except to the extent that the sending of mail is typically initiated in response to a pseudoclient command).

The mail-sending subsystem is composed of a core module implementing the mail interface, mail/main, and submodules for specific methods of sending mail. All relevant source files are located in the modules/mail directory.

8-3-1. mail/main: Main mail module

The core mail-sending functionality is located in the mail/main module, defined in main.c. The module consists of two interfaces: an external interface, declared in the mail.h header file, for use by other modules to send mail, and an internal interface, declared in the mail-local.h header file, used for communicating with the low-level modules that perform the actual send operation.

The external interface consists of a single function, sendmail(), declared as follows:

void sendmail(const char *to, const char *subject, const char *body, const char *charset, MailCallback completion_callback, void *callback_data)

The first thing to note about this function is that it does not return a value. Mail sending is performed asynchronously (subject to limitations of the particular low-level module in use), so that when the function returns, the requested message has been queued but not necessarily sent. In order to signal the result of a mail-sending operation, sendmail() takes a callback function parameter (completion_callback); this function is called when the sending operation has completed, successfully or otherwise. The function type is defined as MailCallback in mail.h:

typedef void (*MailCallback)(int status, void *data)

where data is the callback_data value passed to sendmail(), and status is one of the following values:

It is important to note that, while sendmail() does not wait for the message to be sent before returning, there is nothing preventing the low-level module from delivering the message immediately if possible, and in cases such as sending to a user on the local system, the callback function may be called even before sendmail() itself returns! For this reason, the caller must ensure that all setup required by the callback function is performed before calling sendmail().
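
The ordering requirement can be illustrated with a minimal sketch. The sendmail() below is only a stub with the documented signature that "delivers" immediately; the AuthRequest structure, the send_auth_mail() and auth_mail_done() routines, the MAIL_STATUS_SENT name, and the numeric status values are all assumptions made for the example, not part of Services.

```c
#include <stddef.h>

/* Stand-in status values.  Only the MAIL_STATUS_ERROR name is taken from
 * the text; MAIL_STATUS_SENT and both numeric values are assumptions. */
#define MAIL_STATUS_SENT   0
#define MAIL_STATUS_ERROR  1

typedef void (*MailCallback)(int status, void *data);

/* Hypothetical per-request state owned by the caller. */
typedef struct {
    int finished;   /* set to 1 by the callback */
    int status;     /* status reported by the callback */
} AuthRequest;

/* Stub with the documented signature.  It completes immediately, so the
 * callback runs before the stub returns: the case the text warns about. */
void sendmail(const char *to, const char *subject, const char *body,
              const char *charset, MailCallback completion_callback,
              void *callback_data)
{
    int ok = (to && *to && subject && body);

    (void)charset;  /* not used by the stub */
    if (completion_callback)
        completion_callback(ok ? MAIL_STATUS_SENT : MAIL_STATUS_ERROR,
                            callback_data);
}

/* Completion callback: records the result in the caller's state. */
static void auth_mail_done(int status, void *data)
{
    AuthRequest *req = data;

    req->status = status;
    req->finished = 1;
}

/* Initialize everything the callback touches, and only then call
 * sendmail().  Resetting req->finished after the call would wipe out the
 * result of an immediate delivery. */
void send_auth_mail(AuthRequest *req, const char *address)
{
    req->finished = 0;
    req->status = MAIL_STATUS_ERROR;
    sendmail(address, "Authentication code", "Your code is 12345.",
             NULL /* placeholder charset */, auth_mail_done, req);
}
```

With this stub, req->finished is already set when send_auth_mail() returns; with a truly asynchronous low-level module it would be set later, so callers should not rely on either behavior.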

sendmail(), in turn, does its work by calling out to functions implemented in a low-level module. The interface consists of two functions which the low-level module must provide, along with a function provided by the core module for signaling the completion of a mail operation:

void (*low_send)(MailMessage *msg)
Provided by the low-level module, this function performs the actual work of starting the send operation, and is called by sendmail() once parameter and other checks have been performed. As with sendmail(), the routine does not return a value, but instead calls send_finished() (see below) to signal the message's status. Typically, this routine will perform any necessary module-specific checks, then start the asynchronous send operation and return without calling send_finished().

The parameter passed to this routine is a structure (see below) describing the message to be sent. On entry, the structure's from, to, subject, and body are guaranteed to be non-NULL. The strings in these fields and the fromname field (which may be NULL) can be changed freely, but the pointer values should be left unmodified.

void (*low_abort)(MailMessage *msg)
Provided by the low-level module, this function takes any actions needed to abort the sending of a message currently in progress; the message to abort is indicated by the msg parameter, which will be the same as passed to a previous call to low_send(). The given message must be aborted, as there is no way for the routine to signal a failure to abort. The routine should not call send_finished(), as the core module will take care of setting the message completion status.
void send_finished(MailMessage *msg, int status)
Provided by the core module, this function is called by low-level modules to signal that a message has been successfully sent or an error has occurred that prevents the message from being sent. The msg parameter is the same one passed to low_send(), and status is one of the status codes listed above (MAIL_STATUS_*).

As can be seen from the above, both low_send and low_abort are declared as function pointers in the core module; low-level modules must set these to point to their own implementations of the functions. Implementation note: It would be better to use a register()/unregister() pair of functions, as with the encryption and database code.
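
As a rough illustration of the function-pointer arrangement, a low-level module might install and remove its routines as sketched below. The MailMessage structure is reduced to the fields named above, and the my_*() routines, the counter, and the "already installed" check are assumptions made for the example.

```c
#include <stddef.h>

/* Reduced stand-in for MailMessage, using only fields named in the text. */
typedef struct {
    const char *from, *fromname, *to, *subject, *body;
} MailMessage;

/* In Services these pointers live in the core module; they are defined
 * here so that the sketch is self-contained. */
void (*low_send)(MailMessage *msg) = NULL;
void (*low_abort)(MailMessage *msg) = NULL;

static int sends_in_progress = 0;   /* sketch-only bookkeeping */

static void my_send(MailMessage *msg)  { (void)msg; sends_in_progress++; }
static void my_abort(MailMessage *msg) { (void)msg; sends_in_progress--; }

/* Hypothetical module initialization: returns 1 on success, 0 if another
 * low-level module has already installed its routines. */
int my_module_init(void)
{
    if (low_send || low_abort)
        return 0;
    low_send = my_send;
    low_abort = my_abort;
    return 1;
}

/* Hypothetical module cleanup: detach from the core module. */
void my_module_cleanup(void)
{
    low_send = NULL;
    low_abort = NULL;
}
```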

The MailMessage structure used as a parameter in the above functions is used to collect the various parameters of a message into a single group for passing to the low-level modules. The pointer itself also serves as a unique ID value for each message in transit. The structure contains the following fields:

The core module itself, defined in main.c, simply serves as a kind of "glue" between external callers and the low-level modules; it consists of the implementations of sendmail() and send_finished(), along with a timeout callback function (send_timeout()) for messages which remain in transit longer than the time specified by the SendTimeout configuration directive. When sendmail() is called, it performs checks on its parameters (calling the callback function with an error code if a problem is found), then sets up a MailMessage structure for the message, activates a timeout if SendTimeout is enabled, and calls low_send() to begin the actual sending process. When the low-level module calls send_finished(), that function likewise calls the completion callback function with the specified status, then unlinks and frees the MailMessage structure for the message. Messages can be aborted if they time out, or if the core module is removed with any messages still in transit.

8-3-2. mail/sendmail: Sends mail using the sendmail program

The mail/sendmail module, defined in sendmail.c, makes use of an external "sendmail" program to send mail. The module was designed primarily as a test module to ensure that the core mail processing code worked correctly, to help isolate problems before development of the more complex SMTP module started; it has been retained to support systems which cannot use SMTP to send mail directly, but such systems are presumed to be rare, and little effort has been put into improving this module. In particular, the module (and thus Services itself) blocks while interacting with the external program, potentially causing Services to lag and even opening up the possibility of denial-of-service attacks on Services (by repeatedly sending messages to addresses which take a long time to process).

The entire logic of the module, outside of the module initialization and cleanup code (which actually comprises about half of the source file), is contained in send_sendmail(), the implementation of the low_send() routine called by the core module's sendmail() function. send_sendmail() opens a pipe to the program specified by the SendmailPath directive, which is assumed to take a "-t" option to read the recipient address from the message headers, as the standard Unix sendmail program does. The message is then written over the pipe, and pclose() is called to wait for the message sending operation to complete. This latter step, which is required to free the pipe resources as well, places Services at the mercy of the external program, as pclose() will not return until the process exits. Implementation note: One improvement would be to make the pipe non-blocking, but as Services has no facilities for monitoring arbitrary file descriptors, this would require a periodic check via a timeout routine to see whether the child process had exited. Finally, the message status is reported based on the exit code of the child process.
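
The open/write/close sequence can be sketched with the standard popen() and pclose() functions. This is not the module's actual code: the pipe_message() name, its return convention, and the header layout are assumptions, and the caller is assumed to pass the complete command line (program path plus "-t").

```c
#define _POSIX_C_SOURCE 200809L   /* for popen() and pclose() */
#include <stdio.h>
#include <sys/wait.h>

/* Pipes a message to an external command.  Returns 0 if the command
 * exited with status 0, -1 otherwise. */
int pipe_message(const char *command, const char *from, const char *to,
                 const char *subject, const char *body)
{
    FILE *pipe = popen(command, "w");
    int status;

    if (!pipe)
        return -1;
    /* Headers, a blank line, then the body. */
    fprintf(pipe, "From: %s\nTo: %s\nSubject: %s\n\n%s\n",
            from, to, subject, body);
    /* pclose() does not return until the child process exits; this is
     * the blocking step discussed above. */
    status = pclose(pipe);
    if (status == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

A production version would also have to consider SIGPIPE, which is raised if the child exits before reading all of its input.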

8-3-3. mail/smtp: Sends mail using SMTP

The mail/smtp module, defined in smtp.c, sends mail via the SMTP protocol. While the module makes some simplifying assumptions, notably that a relay server is available that will accept and distribute mail on behalf of Services, it is more robustly designed than the mail/sendmail module, and is the recommended module for use in Services.

As mentioned above, the mail/smtp module relies on the presence of an external relay server, which can be as simple as an SMTP daemon running on the same machine, that will accept messages from Services via SMTP and relay them to the appropriate destinations. By doing this, the module is freed from the necessity of performing DNS lookups for each message sent, significantly reducing the complexity of the module. However, this also means that invalid addresses cannot be detected, except to the extent that the relay server checks for them during the SMTP connection from Services.

For each message to be sent, the module creates a new connection to the relay server, taking advantage of the socket callbacks described in section 3 to process SMTP communications asynchronously. The socket used for each message, along with the MailMessage structure itself and other per-message data, is stored in a SocketInfo structure; the module maintains a list of these structures, one for each message in transit. The SocketInfo structure contains the following fields:

struct SocketInfo_ *next, *prev
Used to maintain the linked list of structures. (struct SocketInfo_ is the same type as SocketInfo, and is used here only because the structure is defined as part of the typedef.)
Socket *sock
The socket being used to send the message.
MailMessage *msg
The message data structure passed in from the core module.
int msg_status
The message status code to be passed to send_finished().
int relaynum
The index (into the RelayHosts[] array) of the relay server currently in use. If a connection to the first server fails, the code will increment this field and retry the connection until the list of relay hosts is exhausted.
enum {...} state
The current state of the connection:
int replycode
The reply code associated with the line currently being received from the server. A value of zero indicates that the next character received will be the beginning of a new line.
char replychar
The fourth character of the line currently being received (normally either a space or a hyphen, indicating the absence or presence of continuation lines respectively).
int garbage
The number of garbage (non-reply) lines received from the server, used to check for an erroneous connection to a non-SMTP server.

When the low_send() implementation routine, send_smtp(), is called, it first cleans any double quotes out of the "From" name (since that name will later be enclosed in double quotes), then sets up a SocketInfo structure for the message and creates a socket for SMTP communication. On success, the socket's callbacks are set, and try_next_relay() is called to attempt a connection to the first SMTP relay specified in the configuration file. (The msg_status field of SocketInfo is set to MAIL_STATUS_ERROR to provide a fallback value in case an error in the module results in send_finished() being called without an explicit status being set; the "don't depend on this" comment in the source is simply a reminder to ensure that the status is in fact set correctly, rather than relying on that default value, since the default could potentially change.)

try_next_relay(), in turn, increments the relaynum field, then checks whether it has exceeded the number of configured relay servers. If so, sending is terminated with an error code based on the value of errno as returned from the last system call (the routine is assumed to be called immediately after a socket-related system call); otherwise, a connection is initiated to the next relay server, looping back to the top of the function if the conn() call fails.
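
The retry loop can be sketched as follows. The RelayState structure, the connect_fn hook type, the sample hook, and the convention that relaynum starts at -1 are assumptions made for the example; the real routine works on a SocketInfo structure and reports an errno-based error code rather than simply returning -1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical connection hook: returns 0 if the connection attempt was
 * started, -1 on immediate failure.  Stands in for the real socket call. */
typedef int (*connect_fn)(const char *host);

typedef struct {
    int relaynum;   /* index of the relay in use; -1 before the first try */
} RelayState;

/* Advances to the next relay that accepts a connection attempt.  Returns
 * its index, or -1 once the list is exhausted (at which point the caller
 * would report an error status). */
int try_next_relay_sketch(RelayState *st, const char *const *hosts,
                          int host_count, connect_fn try_connect)
{
    for (;;) {
        st->relaynum++;
        if (st->relaynum >= host_count)
            return -1;
        if (try_connect(hosts[st->relaynum]) == 0)
            return st->relaynum;
        /* Immediate failure: loop back and try the next relay. */
    }
}

/* Sample hook for experimentation: hosts whose names start with "down"
 * fail immediately. */
int connect_unless_down(const char *host)
{
    return strncmp(host, "down", 4) == 0 ? -1 : 0;
}
```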

Actual socket processing is handled by the smtp_readline() and smtp_disconnect() functions. The latter, smtp_disconnect(), simply calls send_finished(), passing either the value of msg_status (if the connection was closed locally) or an appropriate error status (if the connection was broken remotely or failed), then frees the SocketInfo structure with free_socketinfo(), which also closes the socket itself. (If the routine is called as the result of a failed connection, however, it calls try_next_relay() instead.)

smtp_readline() is the workhorse of the mail/smtp module, processing data read from the server and sending the SMTP commands necessary to relay the message. The routine first reads a line of data from the socket, ensuring that it ends with a newline and removing that newline. (While the socket subsystem ensures that a full line is available when the read-line callback is called, smtp_readline() is also able to handle partial lines, except in the pathological case of a truncated reply code.) If the text received is at the beginning of a line, the 3-digit reply code and continuation character are parsed and stored in the SocketInfo structure corresponding to the socket. When a complete, non-continued response line has been received, smtp_readline() then either generates an error (for 4xx or 5xx error responses from the SMTP server) or sends the next command or message data to the server, depending on the connection state, and the state is incremented. (After sending the final QUIT command, the socket is closed, causing send_finished() to be called from the socket disconnection callback.)
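
The reply-line handling can be approximated by the sketch below, which parses the 3-digit code and the continuation character from a complete line. The function names are hypothetical, and unlike smtp_readline() the sketch does not deal with partial lines.

```c
#include <ctype.h>
#include <string.h>

/* Parses the start of an SMTP reply line ("250-text" or "250 text").
 * Stores the 3-digit code and the fourth character, and returns 1 on
 * success or 0 if the line does not look like a reply. */
int parse_reply_prefix(const char *line, int *code_ret, char *cont_ret)
{
    size_t len = strlen(line);
    int i;

    if (len < 3)
        return 0;
    for (i = 0; i < 3; i++) {
        if (!isdigit((unsigned char)line[i]))
            return 0;
    }
    *code_ret = (line[0]-'0')*100 + (line[1]-'0')*10 + (line[2]-'0');
    /* A bare "250" with nothing after it is treated like "250 ". */
    *cont_ret = (len > 3) ? line[3] : ' ';
    if (*cont_ret != ' ' && *cont_ret != '-')
        return 0;
    return 1;
}

/* 4xx and 5xx replies are errors. */
int reply_is_error(int code)  { return code >= 400 && code <= 599; }

/* A space means this is the last line; a hyphen means more lines follow. */
int reply_is_final(char cont) { return cont == ' '; }
```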

The module's implementation of the low_abort() function can be found in smtp_abort(). The routine simply looks up the SocketInfo corresponding to the message, then frees it, disconnecting the socket in the process.


8-4. Miscellaneous modules

This section documents the two remaining modules which do not fit neatly into any other category: the misc/xml-export and misc/xml-import modules, used for exporting Services pseudoclient data to an XML file and vice versa. Both of these modules are located in the modules/misc directory.

8-4-1. misc/xml-export: Data export using XML

The misc/xml-export module, defined in xml-export.c along with declarations in xml.h, provides a method through which Services pseudoclient data can be exported into an XML file suitable for use with external programs. It should be noted that this module does not make use of the standard database interface, relying instead on direct calls to the appropriate modules' database access functions and direct access to the corresponding data structures, and thus cannot export data added by third-party modules. This limitation is a result of the module's implementation in version 5.0, before the current database system was developed; one possible solution would be to reimplement this module and misc/xml-import as database modules (see section 11-1).

One thing worth noting about the structure of the module is that, since it is also compiled into the convert-db tool, there are a number of code segments (mainly logging calls) that need to be compiled differently. These are protected by preprocessor conditionals on the CONVERT_DB symbol, defined by tools/Makefile (see section 10-3-4).

Exporting is handled by the xml_export() routine defined near the bottom of the file. This routine takes two parameters: a function pointer of type xml_writefunc_t, specifying the function to be called to output data, and an arbitrary pointer value which is passed unchanged to the function. The xml_writefunc_t type is defined in xml.h as:

typedef int (*xml_writefunc_t)(void *data, const char *fmt, ...)

where data is the pointer parameter passed to xml_export() and fmt is a printf()-style format string. (This prototype was chosen so that fprintf() could be used as a callback function. sprintf() also fits the prototype, but should be avoided due to the likelihood of buffer overflows.)
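
For callers that want the output in memory, a bounded write function avoids the overflow risk mentioned above. The following is a sketch only: the OutBuf structure and outbuf_write() are hypothetical, and the return convention (byte count, or -1 on failure) is modeled on fprintf() rather than taken from xml.h.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef int (*xml_writefunc_t)(void *data, const char *fmt, ...);

/* Hypothetical bounded output buffer, passed as the data pointer. */
typedef struct {
    char text[256];
    size_t used;    /* bytes stored, not counting the terminating null */
} OutBuf;

/* Write function that appends to an OutBuf.  Returns the number of bytes
 * appended, or -1 if the formatted text would not fit. */
int outbuf_write(void *data, const char *fmt, ...)
{
    OutBuf *out = data;
    size_t room = sizeof(out->text) - out->used;
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(out->text + out->used, room, fmt, args);
    va_end(args);
    if (n < 0 || (size_t)n >= room) {
        out->text[out->used] = '\0';   /* discard the partial write */
        return -1;
    }
    out->used += (size_t)n;
    return n;
}
```

The buffer must start out empty (for example, OutBuf out = { "", 0 };) before the first call.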

xml_export() does not actually export any data itself, other than writing the <?xml?> header tag and top-level <ircservices-db> enclosing tags. Rather, it calls helper routines to export each class of data, passing the write function pointer and data pointer along to each routine.

The first of these helper routines is export_constants(). This routine does not export any data per se, but instead writes out the values of various constants used by Services; this allows other programs which read in the data to interpret numerical data such as channel access levels and special values of limits properly, rather than relying on the definitions used in any particular version of Services (or whatever other program may have generated the data).

Following this is export_operserv_data(), the first of the actual data export routines. This routine writes out the maximum user count and timestamp, along with the super-user password if present. The password is written in encrypted format, and is first passed through the xml_quotebuf() function to prevent special characters like <, >, or the null character from causing problems when the data is read in. This latter function, defined near the top of the file, converts all non-ASCII bytes in the passed-in buffer to their equivalent character codes, and converts the three characters < > & to "&lt;", "&gt;", and "&amp;" respectively. The size of the static return buffer, BUFSIZE*6+1, is chosen so that an input buffer of up to BUFSIZE bytes can be encoded with no truncation (the longest possible encoding for a single byte is 6 characters: "&#nnn;").
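
The quoting scheme and the 6-bytes-per-input-byte bound can be illustrated with the sketch below. It is not the xml_quotebuf() source: the quote_bytes() name and the caller-supplied output buffer are assumptions, as is the decision to escape control bytes (below 32) together with bytes above 126.

```c
#include <stdio.h>
#include <string.h>

/* Quotes len bytes from src into dest as a null-terminated string.  The
 * worst case is 6 output characters per input byte ("&#nnn;"), so dest
 * must hold at least len*6+1 bytes. */
void quote_bytes(const unsigned char *src, size_t len, char *dest)
{
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];

        if (c == '<') {
            dest += snprintf(dest, 7, "&lt;");
        } else if (c == '>') {
            dest += snprintf(dest, 7, "&gt;");
        } else if (c == '&') {
            dest += snprintf(dest, 7, "&amp;");
        } else if (c < 32 || c > 126) {
            /* Assumption: control bytes are escaped as well, so that a
             * null byte in the input is not lost. */
            dest += snprintf(dest, 7, "&#%u;", (unsigned)c);
        } else {
            *dest++ = (char)c;
        }
    }
    *dest = '\0';
}
```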

The next routine, export_nick_db(), is the first of the true database export routines, iterating through all nickname groups and then all nicknames to dump the data for each record to the XML output stream. The routine takes advantage of the XML_PUT_* macros defined at the top of the source file to simplify the writing of the various structure fields and substructures. These macros are:

Each macro takes three parameters: indent, a string prefixed to the output line for indenting; structure, the structure (not structure pointer) in which the field to write resides; and field, the name of the field to write. The value written is enclosed in tags named the same as the field name.
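
A macro of this shape can be built with the preprocessor's stringify operator, as in the sketch below. SKETCH_PUT_LONG, SketchRecord, export_record(), and file_write() are hypothetical names used only for illustration; the sketch assumes, as the real macros presumably do, that the write function and data pointer are available under fixed variable names.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef int (*xml_writefunc_t)(void *data, const char *fmt, ...);

/* Hypothetical macro in the style described: the tag name is generated
 * from the field name with the # (stringify) operator.  Variables named
 * writefunc and data must be in scope where the macro is used. */
#define SKETCH_PUT_LONG(indent, structure, field) \
    writefunc(data, "%s<" #field ">%ld</" #field ">\n", \
              indent, (long)(structure).field)

/* Hypothetical record type used only for this sketch. */
typedef struct {
    long id;
    long flags;
} SketchRecord;

/* Writes one record, with each field enclosed in tags named after it. */
void export_record(xml_writefunc_t writefunc, void *data,
                   const SketchRecord *rec)
{
    writefunc(data, "\t<record>\n");
    SKETCH_PUT_LONG("\t\t", *rec, id);
    SKETCH_PUT_LONG("\t\t", *rec, flags);
    writefunc(data, "\t</record>\n");
}

/* Adapter that lets a FILE * be passed as the data pointer without
 * casting fprintf() itself to xml_writefunc_t. */
int file_write(void *data, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vfprintf((FILE *)data, fmt, args);
    va_end(args);
    return n;
}
```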

The subsequent database export routines, namely export_channel_db(), export_news_db(), export_maskdata(), and export_statserv_db(), export the corresponding databases in a similar manner. One point of note is the writing of mode locks in export_channel_db(): since the on and off fields of the ModeLock structure are strings rather than bitmasks in the convert-db tool, as noted in section 7-4-1-1, they are handled differently depending on whether the preprocessor symbol CONVERT_DB is defined.

The misc/xml-export module also includes a callback function for the core's "command line" callback, allowing the pseudoclient databases to be exported without connecting to the network. The callback function, do_command_line(), checks for the -export option; if present, the XML database dump is written to the named file, or to standard output if no filename is given, and the function returns 3 (on success) or 2 (on error) to signal the core code to terminate immediately.

8-4-2. misc/xml-import: Data import using XML

The misc/xml-import module, defined in xml-import.c, performs the opposite function of the misc/xml-export module, reading data from an XML file and adding it to the various pseudoclient databases. As with the misc/xml-export module, this module is heavily intertwined with the pseudoclient modules and is unable to handle data used by third-party modules. Note that the xml.h header file is included by xml-import.c, as it is considered a common XML header file for both import and export, but there are no declarations in xml.h that are actually used in this module.

Since the import of data will typically create new records, the xml-import module requires a way to allocate and initialize a record of each of the various structure types. This is done for nickname and channel records by defining the STANDALONE_NICKSERV and STANDALONE_CHANSERV preprocessor symbols and including modules/nickserv/util.c and modules/chanserv/util.c (see also section 7-3-1-4), and for other record types by allocating with calloc() and freeing with custom free routines. This is admittedly a very kludgey way of doing things, but again is a carryover from previous versions, before the current database system was developed.

When importing data, there is the possibility that data in the imported XML file will conflict with data already stored in Services' databases. In the case of OperServ mask-data (autokill, etc.) records and StatServ server entries, the record in the imported data is always dropped; however, for nicknames and channels, one of several methods of handling collisions can be chosen. The various methods, along with the corresponding configuration options and the flags used to represent them internally, are:

One flag from each set is stored in the file-local variable flags at module initialization or reconfiguration time, based on the configuration file settings.

XML input is assumed to be from a file, whose file pointer is stored in the file-local variable import_file. The local function get_byte() reads in a byte from this file, returning the value of that byte or -1 on error, as well as performing buffering (which is probably redundant with the buffering performed by the stdio functions) and updating byte and line counters for use in error messages. The macro NEXT_BYTE encapsulates this call, assigning the return value of get_byte() to a variable c and returning -1 when end-of-file is reached.
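
A buffered byte reader with counters, and a macro wrapping it, might look like the following sketch. The buffer size, the counter names, and the skip_to_tag() example routine are assumptions; only import_file, get_byte(), NEXT_BYTE, and the variable name c come from the description above.

```c
#include <stdio.h>

static FILE *import_file;            /* file being read */
static unsigned char readbuf[4096];  /* local read buffer */
static size_t readbuf_len, readbuf_pos;
static long bytes_read;              /* hypothetical counter names */
static long lines_read = 1;

/* Returns the next byte (0-255), or -1 at end-of-file or on error. */
static int get_byte(void)
{
    int c;

    if (readbuf_pos >= readbuf_len) {
        readbuf_len = fread(readbuf, 1, sizeof(readbuf), import_file);
        readbuf_pos = 0;
        if (readbuf_len == 0)
            return -1;
    }
    c = readbuf[readbuf_pos++];
    bytes_read++;
    if (c == '\n')
        lines_read++;
    return c;
}

/* Requires an int variable named c in the enclosing function, which must
 * itself return an integer type. */
#define NEXT_BYTE do {      \
    c = get_byte();         \
    if (c < 0)              \
        return -1;          \
} while (0)

/* Example user: returns the number of bytes skipped before the next '<',
 * or -1 if the file ends first. */
static int skip_to_tag(void)
{
    int c, skipped = 0;

    for (;;) {
        NEXT_BYTE;
        if (c == '<')
            return skipped;
        skipped++;
    }
}
```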

The XML data is processed by a simple XML parser, implemented by the parse_tag() routine. This routine calls read_tag() to parse a single tag, then looks up the tag in the tags[] table and calls the associated handler to read and process the tag's contents, and returns a pointer to those contents (whose type can vary depending on the tag). The function has three special return values: CONTINUE for tags that were processed successfully but contain no data, NULL to indicate an error processing a tag, or PARSETAG_END when the closing tag corresponding to the tag given in the caller_tag parameter has been found (or end-of-file is reached). The parser does not handle empty tags (of the "<tag/>" syntax), as they are not used in well-formed Services data dumps; every tag has some sort of data associated with it.

read_tag(), in turn, reads bytes from the file until it locates the beginning of a tag, then parses the tag name and any attribute (only the first attribute is processed). The function itself returns 1 for an opening tag, 0 for a closing tag, or a negative value on error; the tag name, attribute name, attribute value, pre-tag text, and text length are stored in the variables pointed to by the parameters tag_ret, attr_ret, attrval_ret, text_ret, and textlen_ret, respectively. The strings returned point into a dynamically-allocated buffer local to the function, which can be freed by calling it with tag_ret set to NULL.

Each tag handler takes as parameters the tag name, attribute name (NULL if no attribute is present), and attribute value string (also NULL if no attribute is present). Since many tags consist of simple integer or string values, they make use of the common handlers th_text(), th_int32(), th_uint32(), th_time(), and th_strarray(). Of these, th_text() returns a TextInfo structure containing the malloc()'d text buffer, null-terminated, along with the length in bytes of the string (not including the null terminator); th_strarray() returns an ArrayInfo structure containing the malloc()'d, null-terminated string elements and element count; the other handlers return a pointer to the relevant type. The returned variables themselves are stored in static buffers local to each handler.
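
The "pointer to a static buffer" convention can be sketched for the integer case as follows. text_to_int32() is a hypothetical helper that covers only the text conversion step, not the tag parsing done by a real handler such as th_int32().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Converts tag text to a 32-bit integer.  Returns a pointer to a static
 * buffer (overwritten by the next call), or NULL if the text is not a
 * valid in-range number. */
int32_t *text_to_int32(const char *text)
{
    static int32_t result;
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return NULL;
    if (value < INT32_MIN || value > INT32_MAX)
        return NULL;
    result = (int32_t)value;
    return &result;
}
```

Because the result lives in a static buffer, a caller has to copy the value out before the same handler is invoked again.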

For simple tag handlers like the standard handlers mentioned above, handling a tag consists of simply parsing the text between the start and end tags for that tag. This is done by repeatedly calling parse_tag(), passing the handler's tag parameter as caller_tag, until the function returns PARSETAG_END, and converting the inter-tag text from the final parse_tag() call (the code assumes no intervening tags) to the proper format. For the case of th_strarray(), the parse_tag() loop checks for <array-element> tags, converting their contents to an ArrayInfo structure.

The handlers for specific types, like NickInfo and ChannelInfo, are more complex, having to deal with multiple subtags, but follow the same general structure. These handlers return dynamically allocated structures which are added directly into the import data list upon being returned from the tag handler.

The overall import process consists of reading the contents of the <ircservices-db> tag into data structures in memory, then merging those data structures into the appropriate databases. The reading and parsing is handled by the read_data() routine; if it succeeds, the data is then merged into the databases with merge_data(), and the loaded data is freed with free_data(). These routines are called by the top-level xml_import() function.

read_data() takes the place of the tag handler for the <ircservices-db> tag, which is read in manually by xml_import() (by calling read_tag()). Like other tag handlers, it loops calling parse_tag() to read in subtag contents, adding each returned structure into the temporary databases used for storing the data to import. read_data() also takes care of checking for collisions with data already existing in the pseudoclient databases, and taking proper action in such cases. The routine returns nonzero if all data was successfully read in and no collisions caused an abort, else zero.

If read_data() succeeds, merge_data() is then called to store the read-in records in the main Services databases. An extra check is performed here for nicknames and channels, ensuring that no collisions occur unless the collision flags specified overwriting current records; deletion of such colliding records is also performed at this stage (rather than when the data is read in, to avoid the case of a nickname or channel getting deleted and an error then being found later in the imported data). In the case of colliding nickname group IDs, the imported group is renumbered to use a free ID value, and all relevant channel entries (founders, successors, and access list entries) are adjusted accordingly.
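
The renumbering step can be pictured with the simplified sketch below. The Group and Channel types are reduced, hypothetical stand-ins; the sketch assumes that ID 0 means "no group", picks the lowest free ID, and ignores access list entries and collisions among the imported groups themselves.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical record types for the sketch. */
typedef struct { uint32_t id; } Group;
typedef struct { uint32_t founder, successor; } Channel;

/* Returns 1 if any of the given groups uses the ID, else 0. */
static int id_in_use(uint32_t id, const Group *groups, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (groups[i].id == id)
            return 1;
    }
    return 0;
}

/* If the imported group's ID collides with an existing group, assigns it
 * the lowest free ID and updates the imported channels that refer to the
 * old ID.  Returns the ID the group ends up with. */
uint32_t renumber_group(Group *imported, const Group *existing,
                        size_t n_existing, Channel *channels,
                        size_t n_channels)
{
    uint32_t old_id = imported->id, new_id = 1;
    size_t i;

    if (!id_in_use(old_id, existing, n_existing))
        return old_id;              /* no collision */
    while (id_in_use(new_id, existing, n_existing))
        new_id++;
    imported->id = new_id;
    for (i = 0; i < n_channels; i++) {
        if (channels[i].founder == old_id)
            channels[i].founder = new_id;
        if (channels[i].successor == old_id)
            channels[i].successor = new_id;
    }
    return new_id;
}
```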

The top-level xml_import() function is in turn called by the do_command_line() callback function, hooked into the core's "command line" callback. Like the misc/xml-export module, this module checks for a specific command-line option (in this case, "-import"); if found, xml_import() is called with the file given as a parameter to the option (an error is generated if the parameter is missing or the file cannot be opened), and the function's return value (2 or 3) signals Services to exit with an exit code indicating the success or failure of the import.

Formerly, the httpd/dbaccess module (see section 8-2-8) also provided the ability to import XML data via this module, by uploading a file via HTTP. This functionality was removed, however, mainly to avoid the security and stability issues raised by deleting data records (nicknames and channels) already in use on the network.