[IRCServices Coding] Analysis of the 5.1.2 bug

Craig McLure Craig at frostycoolslug.com
Mon Oct 29 10:39:34 PDT 2007


No apology necessary my good friend, everyone screws up, even myself,
On one or two occasions i've written code which in theory was solid,
just to have it come back and bite me in the ass several times a few
weeks on, each time i fixed a part of it, another part came along and
took another bite (I'm missing half a buttock at this point) it's an
unfortunate part of the development cycle. Sure it's an embarrassment
and can induce a really bad day, but no one will think any less of
you, especially if they know the effort and general requirements which
go into making such a large package as services.

I've been on these lists now since 2001 (admittedly, I was a LOT
younger back then (15ish?)), and i've found your general dedication to
the software to be inspirational, and it was tweaking IRCServices that
kickstarted my coding career. IRCServices has always been well
written, and always will be well written. I'm sure someone will pick
up the baton and continue the IRCServices package far into the future,
but i doubt they will have the same dedication and motivation to
improve this as you have had. I know from experience that coding
something for many, many years has some diminishing returns on the
mind of the developer (not that i'm saying you've lost your mind), but
dispite this, you've continued to fully support a package which you
have lost interest in.

You say you're a proud person when it comes to code, well, you damn
well should be. You've created a rock solid piece of software, you've
taken no crap over the years, and you've continued to evolve it to a
point where it's probably the best services package in existence.

Don't be depressed, don't be ashamed, don't be embarrassed, hold your
head up high, for the unfortunate events of the last week can't even
place a scratch on the quality you've provided over the years.

-- 
/**********************************************
 *        Craig "FrostyCoolSlug" McLure
 * ChatSpike    - http://www.chatspike.net
 * InspIRCd     - http://www.inspircd.org
 **********************************************/

On 29/10/2007, Andrew Church <achurch at achurch.org> wrote:
>     Now that a bit of time has passed since the release of Services
> 5.1.2, and (hopefully) everyone has upgraded, I thought I'd post a more
> detailed explanation of exactly how the critical bug that prompted this
> release came about.  (As you are probably aware, various and sundry bugs
> have pushed Services all the way to 5.1.6 since then, but in this text
> I'll focus on the bug that was fixed by 5.1.2.  Just as a reminder, all
> 5.1.x releases through 5.1.5 have bugs allowing users to crash Services,
> so you should immediately upgrade to 5.1.6 if you haven't already.)
>
> The Bug
> =======
>
>      The bug itself is fairly straightforward (as those who read the
> diff file will have already realized): using the ChanServ AKICK VIEW
> command on a channel with at least one autokick caused Services to
> crash, without exception.  By default, the AKICK command requires
> channel access level 100 to use; this is the basis for the "sufficiently
> privileged users" text in the announcement.  In reality, unless channel
> registration was disabled, any user could register a new channel (thus
> gaining founder status) and use the AKICK command on that channel,
> effectively allowing any user to crash Services.  (Ironically, when
> channel registration was disabled, a separate bug--fixed in 5.1.3--had
> the potential to let any user cause a crash.)
>
>      A most embarrassing bug, indeed...
>
> How It Came About
> =================
>
>      One of the changes implemented in Services 5.1 was to unify (from
> an interface standpoint, at least; internally, there's still far too
> much code duplication) the listing functionality of all pseudoclient
> commands.  Among other things, this included removing the entry numbers
> from the ChanServ ACCESS, XOP (SOP/AOP/HOP/VOP/NOP), and AKICK commands,
> mostly to avoid the confusion that can result from multiple users
> deleting entries in the middle of the list.
>
>      Conceptually, this is a simple change, only requiring the removal
> of the corresponding parameter from the code that outputs entries to the
> user.  But theory and practice are different beasts, as they say; in the
> case of AKICK VIEW, the data for each entry is formatted using a
> language-specific string, requiring a change not only in the code itself
> but also in each of the language files.  I actually made this latter
> change, presumably with the intent of going back and updating the code
> once I had all the language files updates out of the way.  However, it
> seems that I forgot to make that final update, leaving the entry number
> present in the the code that formatted autokick entries for the VIEW
> subcommand.  This resulted, eventually, in something like printf("%s",1)
> --which of course crashed when the program tried to dereference the
> entry number as a pointer.
>
>      Ordinarily, GCC can catch inconsistencies between format strings
> and parameters like this; I always compile Services using GCC's -Werror
> option, to ensure I catch as many potential problems as possible.  With
> language strings, however, the format string itself is located in a data
> file separate from the source code.  Since the data isn't accessible to
> the compiler, these checks can't be made, and the error slipped by
> unnoticed until this bug was reported.
>
> Are Any Other Commands At Risk?
> ===============================
>
>      With respect to this particular change, no.  The removal of list
> entry numbers affected only the ChanServ ACCESS LIST, XOP LIST, and
> AKICK LIST/VIEW commands.  Of these, XOP LIST and AKICK LIST use literal
> format strings for their output, which were corrected as part of the
> code change; ACCESS LIST uses the language system, but its code and
> format strings were both updated properly, and it is likewise not
> susceptible to this problem.
>
>      With respect to all commands in general, all I can say is "I hope
> not."  I've done my best, within the constraints of time and an eleven-
> year-old codebase, to program defensively; since 5.1.2, I've also gone
> over all uses of language strings, and added some checks to my release
> process (see below) to help avoid any more slipping in.  But given that
> I let these bugs by, who knows what else might be hiding?
>
> Lessons To Be Learned
> =====================
>
> Lesson 1: Test suites
>
>      Ah, our good old friend, testing.  I actually had started work on a
> test suite for Services at one point, which was the main motivation for
> the OperServ debug commands.  Regrettably, between a lack of time in my
> earlier days of Services development (being not nearly as fluent in
> progamming then as I am now, writing a test suite to interact with a
> server was a formidable task, and I chose to use my time working on
> Services itself instead) and a lack of interest later, I never took the
> test suite beyond simple nickname operations.
>
>      Of course, even a basic test that just ran through all the commands
> would have caught this bug, and to that I can only say:  Mea culpa.
> Lack of interest or otherwise, I failed to follow best practices, and
> that failure came back to bite me.
>
> Lesson 2: Format strings are dangerous
>
>      Services makes use of the standard printf()-style format strings in
> its language files, partly out of convenience and partly because I
> didn't know any better when I designed the language system.  While the
> flexibility of format strings is undeniably useful, their implementation
> leaves much to be desired--especially in a language like C that does not
> pass type information around.  In fact, inconsistencies between format
> strings in different language files has resulted in similar crashes in
> the past (including the less-dangerous one fixed in version 5.1.3).  Of
> course, most of the danger in using format strings comes from the fact
> that strings are second-class citizens in the C data type world: to
> process a string, you access successive bytes through a pointer--and if
> that particular value happens to be something that's not a pointer,
> BOOM.  So this lesson could just as easily be "C strings are dangerous".
>
>      So what can be done about it?  Well, one of the obvious solutions--
> and probably the best idea in the long run--is to rewrite the
> pseudoclients in a higher-level language, such as Perl; Perl's dynamic
> typing ensures that this sort of problem can't occur.  (Then again,
> dynamically typed languages have their own problems, so this isn't a
> perfect solution.)  I'd actually had thoughts about implementing a Perl
> base for pseudoclients in a future version of Services, but I don't have
> nearly enough time or interest to do so at this point.
>
>      Another option would be to change the language file system to use a
> custom formatting system.  Parameters could be named, for example, and
> an array of structures including parameter names and data types along
> with the parameters themselves could be passed to routines like
> notice_lang().  Of course, this has the downside of turning what's
> currently a simple list of parameters into a more complex variable
> declaration; and since the data types would have to be specified
> manually, there's still the danger of mismatched types, leading to
> potential crashes.  Nonetheless, this might be a more feasible approach
> for those who are interested in improving Services' robustness but don't
> want to go as far as rewriting half of it in Perl.
>
>      I've implemented two crude solutions in Services 5.1.3.  One is the
> use of a script, modified from one provided by German translator Jacek
> Margos, to check all language files for format string inconsistencies at
> release time.  The other is a simple compilation check: at release time,
> I run a test compilation where notice_lang() and notice_help() calls
> using fixed message IDs are replaced with notice() calls that use the
> corresponding strings from the English language file (lang/en_us.l).
> The resulting executable is of course useless from a multilingual point
> of view, but the compiler will complain about any cases where format
> strings and parameters don't match.  While this won't catch cases where
> the format string is selected by a variable, it would have found the
> particular bug fixed in version 5.1.2.
>
> Final Thoughts
> ==============
>
>      Needless to say, this bug in particular, as well as the fact that I
> had to make five releases of a supposedly "stable" version of Services
> in the space of ten days to correct other bugs of varying seriousness,
> is quite embarrassing.  I'm of course acutely aware of the trouble it's
> caused everyone using Services, of the repeated interruptions in normal
> network operations that will have resulted from upgrades (if not
> crashes)--and let none say that it's "just IRC"!  But at the same time,
> I take pride in my work as a software developer, and seeing myself make
> such trivial mistakes as these is, well, depressing to say the least.
> I could make excuses that the code base is aging (which is true), or
> that I've lost too much interest in the program to do a proper job
> (which is also true, to be honest--that's the main reason I declared 5.1
> to be the final version of Services).  I could even claim that I've got
> another project twice the sice of Services that's considerably more
> stable (which is true as well, thanks in no small part to the experience
> I've gained from Services itself).  But since none of that changes the
> facts, all I can say in the end is that I screwed up, and I'm sorry.
>
>   --Andrew Church
>     achurch at achurch.org
>     http://achurch.org/
> ------------------------------------------------------------------
> To unsubscribe or change your subscription options, visit:
> http://lists.ircservices.za.net/mailman/listinfo/ircservices-coding
>