Performance

NB: Most of the performance patches here have been merged into the Cyrus 2.2 stream.

Cyrus provides excellent performance for well-behaved IMAP clients. Unfortunately very few IMAP clients are well behaved: the only three which come to mind are PINE, Mulberry and Prayer. In addition, some POP user agents have particularly evil "leave mail on server" modes which give Cyrus a hard time. We have made three fairly modest changes to Cyrus to help it out.

Extended Cyrus Cache

A vanilla Cyrus installation uses cyrus.cache files to store the IMAP ENVELOPE and BODYSTRUCTURE in a convenient wire form representation. It also stores a limited number of other headers which the Cyrus authors considered likely to be in common use:

 In-reply-to:   Priority:      References:
 Resent-from:   Newsgroups:    Followup-to:

The problem is that in practice most IMAP user agents request headers which are not in that list, causing expensive cache misses where large numbers of message files have to be opened. Picking one example at random, PINE asks for the following list of entities when drawing up a mailbox index:

  (UID ENVELOPE BODY.PEEK[HEADER.FIELDS (Newsgroups Content-MD5
   Content-Disposition Content-Language Content-Location resent-to
   resent-date resent-from resent-cc resent-subject List-Help
   List-Unsubscribe List-Subscribe List-Post List-Owner List-Archive
   Followup-To References)] INTERNALDATE RFC822.SIZE FLAGS)

Phew! Our solution is to store much more information in the cyrus.cache files, namely all headers except for things like Received: which are clearly a waste of time. This is a trade-off: we have far fewer cache misses (hopefully none at all in normal use), but the cache files become larger. This increased size would pose a substantial problem were it not for the next modification on this list.
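The effect of the wider cache on the PINE request above can be sketched as follows. This is only an illustration, not the Cyrus code: it assumes that any requested header missing from the cache forces the message file to be opened, and the skip list `EXTENDED_SKIPPED` is hypothetical (the text names only Received: as an example).

```python
# Headers cached by a vanilla Cyrus installation, as listed above.
VANILLA_CACHED = {
    "in-reply-to", "priority", "references",
    "resent-from", "newsgroups", "followup-to",
}

# Hypothetical skip list for the extended cache; the text only names
# Received: as an example of a header not worth caching.
EXTENDED_SKIPPED = {"received"}

# The HEADER.FIELDS list from the PINE request shown above.
PINE_REQUEST = [
    "Newsgroups", "Content-MD5", "Content-Disposition", "Content-Language",
    "Content-Location", "resent-to", "resent-date", "resent-from",
    "resent-cc", "resent-subject", "List-Help", "List-Unsubscribe",
    "List-Subscribe", "List-Post", "List-Owner", "List-Archive",
    "Followup-To", "References",
]


def misses_vanilla(requested):
    """Headers that a cache holding only VANILLA_CACHED cannot supply."""
    return [h for h in requested if h.lower() not in VANILLA_CACHED]


def misses_extended(requested):
    """Headers that a cache holding everything except EXTENDED_SKIPPED cannot supply."""
    return [h for h in requested if h.lower() in EXTENDED_SKIPPED]


if __name__ == "__main__":
    print("vanilla misses: ", misses_vanilla(PINE_REQUEST))
    print("extended misses:", misses_extended(PINE_REQUEST))
```

Under these assumptions the vanilla list satisfies only four of the eighteen requested headers, while the extended cache satisfies all of them.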

Lazy Cache files

A vanilla Cyrus installation rewrites the entire cache file on just about any update apart from a simple append caused by mail delivery. This is something of a problem given the larger caches that we have just created. An additional problem is that some POP user agents can be configured to leave mail on the server for a number of days. This is disastrous if someone receives thousands of messages a day: you end up rewriting a multi-megabyte file each time that a message is expunged.

A solution is obvious: there is no need to remove cache entries immediately on an expunge event. Instead we leave the data in the cache and garbage collect the unreferenced data either overnight or when a certain threshold is reached. The complication is that a vanilla Cyrus does not store the size of individual cache entries in the associated index file: it just stores the offset to each message's entry (and uses the offset to the next message, or the end of the cache file, if it needs to determine the size of a cache entry). Consequently we have had to extend the Cyrus index file to contain an additional field for the size of the cache entry. It is trivial to upgrade a vanilla Cyrus index to include this extra field. Downgrading is slightly more complicated as the cache file has to be rewritten to remove the dead space. In either case, reconstruct can be used as an alternative to create index files of the correct format.
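The idea can be modelled in a few lines. This is a toy in-memory sketch, not the on-disk Cyrus format: the names `IndexRecord`, `LazyCache` and `gc_threshold` are invented for illustration, and the threshold policy (collect once dead space exceeds a fraction of the file) is an assumption, since the text does not say how the threshold is defined.

```python
from dataclasses import dataclass


@dataclass
class IndexRecord:
    uid: int
    cache_offset: int
    cache_size: int  # the extra field: without it, dead space breaks size lookup


class LazyCache:
    """Toy model: expunge only drops the index record, the cache bytes stay."""

    def __init__(self, gc_threshold=0.5):
        self.data = bytearray()           # stands in for the cyrus.cache file
        self.index = []                   # stands in for the index file records
        self.dead_bytes = 0
        self.gc_threshold = gc_threshold  # hypothetical: fraction of dead space

    def append(self, uid, entry):
        self.index.append(IndexRecord(uid, len(self.data), len(entry)))
        self.data += entry

    def fetch(self, uid):
        for rec in self.index:
            if rec.uid == uid:
                end = rec.cache_offset + rec.cache_size
                return bytes(self.data[rec.cache_offset:end])
        raise KeyError(uid)

    def expunge(self, uid):
        for i, rec in enumerate(self.index):
            if rec.uid == uid:
                self.dead_bytes += rec.cache_size
                del self.index[i]         # cheap: the cache file is not rewritten
                break
        else:
            raise KeyError(uid)
        if self.dead_bytes / len(self.data) > self.gc_threshold:
            self.collect()

    def collect(self):
        """Garbage collect: rewrite the cache keeping only referenced entries."""
        live = bytearray()
        for rec in self.index:
            end = rec.cache_offset + rec.cache_size
            entry = self.data[rec.cache_offset:end]
            rec.cache_offset = len(live)
            live += entry
        self.data = live
        self.dead_bytes = 0
```

In this model an expunge costs one index update, and the expensive rewrite happens only in `collect()`, which could equally be called from an overnight job.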

Fast Rename

A vanilla Cyrus installation implements a fairly conservative approach to folder rename operations. The contents of a folder are copied to the target location one message at a time. When this step has completed, the messages in the source folder are removed. This cautious approach is particularly useful when the source and target folders are on different partitions (or conceivably even different back end servers, though I don't believe that the Cyrus Murder implements this yet).

The copy and then delete approach is based on the assumption that folder rename operations are fairly rare events. Most of the time this is true on our installation. However on the first of each month we have around 10,000 PINE users (and an unknown number of Mulberry users) who have monthly folder rotation switched on.

Our solution is to replace this copy and then delete approach with a simple rename system call when it seems appropriate. This isn't quite as safe as the copy and then delete approach, but experimentation suggests that reconstruct can be used to deal with most of the obvious disasters which can take place as a consequence of a machine losing power when the mboxlist and filesystems are out of step with each other. We shall see!
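The choice between the two strategies might look something like the sketch below. It is not the patch itself: the function name `rename_folder` is invented, the test for "appropriate" is assumed to be simply whether the rename system call succeeds on one filesystem, and the mboxlist update that Cyrus must also perform is not modelled at all.

```python
import errno
import os
import shutil


def rename_folder(src, dst):
    """Move a folder directory, preferring a single rename system call.

    Returns "rename" if the fast path was taken, or "copy" if the
    conservative copy and then delete path was needed.
    """
    try:
        os.rename(src, dst)      # fast path: one call, no per-message copying
        return "rename"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise                # a genuine error, not a cross-device rename
    # Source and target are on different filesystems (partitions):
    # fall back to copying everything, then removing the source.
    shutil.copytree(src, dst)
    shutil.rmtree(src)
    return "copy"
```

The fallback keeps the cautious behaviour for the cross-partition case that the vanilla approach handles well.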


David Carter <dpc22@cam.ac.uk>