1.. _imap-developer-guidance-mailbox-format:
2
3Cyrus IMAP Server: Mailbox File Formats
4=======================================
5
6Intro
7-----
8
9This documentation refers to the "version 12" cyrus index format and
10associated mailbox files.
11
12No external tools should make use of this information. The only
13supported method of access to the mail store is through the standard
14interfaces: IMAP, POP, NNTP, LMTP, etc.
15
16A cyrus mailbox is a directory in the filesystem. It contains the
17following files:
18
19-  zero or more message files
20-  the ``cyrus.header`` metadata file
21-  the ``cyrus.index`` metadata file
22-  the ``cyrus.cache`` metadata file
-  zero or one ``cyrus.squat`` search index
24-  zero or more subdirectories
25
With a "split metadata" configuration, the mailbox may be split across
multiple disks, with the metadata files kept in the same relative
directory on the meta disk. See the ``imapd.conf`` option
``metapartition_files`` for more information.
30
31Message Files
32-------------
33
34The message files are named by their UID, followed by a ".", so UID 423
35would be named "``423.``". They are stored in wire-format: lines are
36terminated by CRLF and binary data is not allowed.
37
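As a small illustration of the naming rule and the CRLF requirement, the
sketch below builds the file name for a UID and checks the line endings
of a message. The helper names ``message_filename`` and
``has_crlf_line_endings`` are hypothetical and are not part of Cyrus.

```python
def message_filename(uid: int) -> str:
    """Return the on-disk name for a message: the UID followed by a dot."""
    if uid <= 0:
        raise ValueError("IMAP UIDs are positive integers")
    return f"{uid}."


def has_crlf_line_endings(data: bytes) -> bool:
    """True if every line break in data is a CRLF pair."""
    # Removing every CRLF pair must leave no stray CR or LF behind.
    rest = data.replace(b"\r\n", b"")
    return b"\r" not in rest and b"\n" not in rest
```
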
38``cyrus.header``
39----------------
40
41This file contains mailbox-wide information that does not change often.
42Its format:
43
44::
45
46    <Mailbox Header Magic String>
47    <Quota Root>\t<Mailbox Unique ID String>\n
48    <Space-separated list of user flags>\n
49    <Mailbox ACL>\n
50
51The Mailbox Unique ID String is used for non-owner per-user \\Seen flags
52so they remain with the mailbox during renames, and also by the
53replication subsystem to detect mailbox renames.
54
55The ACL is a copy of the value stored in mailboxes.db, and isn't
56actually used.
57
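The layout above can be read with a few lines of code. This is only a
sketch: the magic string is passed in by the caller because its value is
not spelled out here, the ACL is kept as an opaque string, and the names
``MailboxHeader`` and ``parse_cyrus_header`` are hypothetical.

```python
from typing import List, NamedTuple


class MailboxHeader(NamedTuple):
    quota_root: str
    unique_id: str
    user_flags: List[str]
    acl: str


def parse_cyrus_header(data: bytes, magic: bytes) -> MailboxHeader:
    """Parse the three newline-terminated lines that follow the magic string."""
    if not data.startswith(magic):
        raise ValueError("magic string mismatch")
    lines = data[len(magic):].decode("utf-8").split("\n")
    if len(lines) < 4:
        raise ValueError("truncated header: expected three terminated lines")
    quota_root, tab, unique_id = lines[0].partition("\t")
    if not tab:
        raise ValueError("first line lacks the tab separator")
    # Split on single spaces so that list positions are preserved.
    flags = lines[1].split(" ") if lines[1] else []
    return MailboxHeader(quota_root, unique_id, flags, lines[2])
```
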
58Locking Considerations
59~~~~~~~~~~~~~~~~~~~~~~
60
61The ``cyrus.index`` file must be locked in exclusive mode while making
62changes to the ``cyrus.header`` file to ensure consistency. All changes
63are made by rewriting the entire file and renaming the new version into
64place.
65
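The rewrite-and-rename pattern can be sketched as follows. The function
name ``rewrite_atomically`` is hypothetical, and the sketch assumes the
caller already holds the exclusive ``cyrus.index`` lock, which is not
shown.

```python
import os
import tempfile


def rewrite_atomically(path: str, data: bytes) -> None:
    """Replace path with data by writing a sibling temp file and renaming it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-header-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # rename() is atomic on POSIX when both names are on one filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers therefore see either the complete old file or the complete new
one, never a partially written header.
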
66``cyrus.cache``
67---------------
68
69The ``cyrus.cache`` file is a pure cache of information that's also
70present in the message files. It exists to make ENVELOPE and specific
71header fetches more efficient, as well as to assist with searches and
72sorts.
73
74If a ``cyrus.cache`` file is missing or corrupted, it can be
75re-generated by running a ``reconstruct`` on the mailbox.
76
Each cache record consists of 10 individual fields, each prefixed with a
32-bit length value in network byte order. The offset of each message's
cache record is stored in the ``cyrus.index`` file (documented below).
The records in a ``cyrus.cache`` file are of variable length, depending
on the contents of the associated message.
82
The first 4 bytes of the ``cyrus.cache`` file are a "generation number"
which must match the first 4 bytes of the associated ``cyrus.index``
file. In the past this was used to track consistency between the files,
but with the name locking scheme and per-record CRC check in Cyrus 2.4
and above, it is just a backup consistency check rather than an
essential format feature.
89
90::
91
92    +------------------------------------------------------------------------+
93    |Gen # (32bits)|Size 1 (32bits)|Data 1                                   |
94    +------------------------------------------------------------------------+
95    |           |Size 2 (32bits)|Data 2            |Size 3 (32bits)| Data 3  |
96    +------------------------------------------------------------------------+
97    | .....                                                                  |
98    +------------------------------------------------------------------------+
99
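A minimal sketch of reading this layout follows. The function names are
hypothetical. The ``align`` parameter is an assumption: this document
does not mention padding, so verify against ``imap/mailbox.c`` whether
field data is rounded up to a 4-byte boundary, and pass ``align=1`` to
read fields that follow each other directly.

```python
import struct
from typing import List, Tuple


def read_generation(buf: bytes) -> int:
    """Return the 32-bit generation number stored in the first 4 bytes."""
    return struct.unpack_from(">I", buf, 0)[0]


def read_cache_fields(cache: bytes, offset: int, count: int = 10,
                      align: int = 4) -> Tuple[List[bytes], int]:
    """Read `count` size-prefixed fields starting at `offset`.

    Returns the field payloads and the offset just past the record.
    """
    fields = []
    for _ in range(count):
        (size,) = struct.unpack_from(">I", cache, offset)
        offset += 4
        end = offset + size
        if end > len(cache):
            raise ValueError("field runs past the end of the cache data")
        fields.append(cache[offset:end])
        # Skip any padding so the next size word starts on the boundary.
        offset = end + (-size % align)
    return fields, offset
```

Comparing ``read_generation()`` of both files reproduces the backup
consistency check described above.
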
100While there are occasional changes to the cache format, this information
101is NOT stored in the cyrus.cache file. Instead, there is a
102"cache\_version" field in the cyrus.index record, so multiple different
103versions of cache data may exist in the same cache file.
104
The order of fields per record in the cache file is as follows (keep in
mind that each is preceded by a 4-byte size in network byte order):
107
108Envelope Response
109    Raw IMAP response for a request for the envelope.
110Bodystructure Response
111    Raw IMAP response for a request for the bodystructure.
112Body Response
113    Raw IMAP response for an (old style) request for the body.
114Binary Bodystructure
115    Offsets into the message file to pull out various body parts.
116    Because of the nature of MIME parts, this is somewhat recursive.
117
118    This looks like the following (starting the octet following the
119    cache field size). All of the fields are bit32s.
120
121    ::
122
123          [
124           [Number of message parts+1 for the rfc822 header if present]
125           [
126            [Offset in the message file of the header of this part]
127            [Size (octets) of the header of this part]
128            [Offset in the message file of the content of this part]
129            [Size (octets) of the content of this part]
130            [Encoding Type of this part]
131           ]
132              (repeat for each part as well as once for the headers)
133           [zero *or* number of sub-parts in the case of a multipart.
134            if nonzero, this is a recursion into the top structure]
135              (repeat for each part)
136          ]
137
    Note that if this is not a message/rfc822, then the sizes for part 0
    are -1 (to indicate that it doesn't exist). Sub-parts are not
    possible for a part 0, so they aren't included when finding
    recursive entries.
142
143    The offset and size info for both the mime header and content part
144    are useful in order to do fast indexing on the appropriate parts of
145    the message file when a client does a FETCH request for
146    BODY[HEADER], or BODY[2.MIME].
147
    Note that the top-level RFC822 headers are treated as a separate
    part from their body text ("0" or "HEADER").

    In the case of a multipart/alternative, the content size and offset
    refer to the entire MIME part.
153
154    A very simple message (with a single text/plain part) would
155    therefore look like:
156
157    ::
158
159          [[2][rfc822 header][text/plain body part info][0]]
160
161    A simple multipart/alternative message might look like:
162
163    ::
164
165          [[3][rfc822 header][text/plain message part info]
166              [second message part info][0][0]]
167
168    A message with an attachment that has two subparts:
169
170    ::
171
172          [[3][rfc822 header info][rfc822 first body part info][attachment info][0][
173                [3][NIL header info][sub part 1 info][sub part 2 info][0][0]]]
174
175    A message with an attached message/rfc822 message with the following
176    total structure:
177
178    ::
179
180            message/rfc822
181              0 headers; content-type: multipart/mixed
182              1 text/plain
183              2 message/rfc822
184                0 headers; content-type: multipart/alternative
185                1 text/plain
186                2 text/html
187
188    ::
189
190          [[3][rfc822 header part 0][text/plain part 1][overall attachment info][0][
191               [3][rfc822 header part 2.0][text/plain part 2.1][text/html part 2.2]
192                  [0][0]]]
193
194Cache Header
195    Any cached header fields. The exact set of fields here depends on
196    the cache record version - there is a function in ``imap/mailbox.c``
197    to determine if a named header would be cached based on the version.
198    These are in the same format they would appear in the message file:
199
200    ::
201
202          HeaderName: headerdata\r\n
203
204    Examples include: References, In-Reply-To, etc.
205
206From
207    The from header.
208To
209    The to header.
210Cc
211    The CC header.
212Bcc
213    The BCC header.
214Subject
215    The Subject header.
216
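The recursive layout of the Binary Bodystructure field can be walked
with a small recursive parser. The sketch below follows the bracket
notation used above: a part count, that many five-word part descriptors
(part 0 being the header slot), then one entry per remaining part that
is either a zero word or a nested structure. The class and function
names are hypothetical.

```python
import struct
from typing import List, Optional, Tuple

# (header_offset, header_size, content_offset, content_size, encoding)
PartInfo = Tuple[int, ...]


class Section:
    def __init__(self, parts: List[PartInfo],
                 children: List[Optional["Section"]]):
        self.parts = parts        # parts[0] describes the rfc822 header slot
        self.children = children  # children[i] belongs to parts[i + 1]


def parse_section(words: List[int], pos: int = 0) -> Tuple[Section, int]:
    """Parse one bracketed structure; return it and the next word index."""
    count = words[pos]
    pos += 1
    parts = []
    for _ in range(count):
        if pos + 5 > len(words):
            raise ValueError("truncated part descriptor")
        parts.append(tuple(words[pos:pos + 5]))
        pos += 5
    children: List[Optional[Section]] = []
    for _ in range(1, count):  # part 0 never has sub-parts
        if words[pos] == 0:
            children.append(None)
            pos += 1
        else:
            child, pos = parse_section(words, pos)
            children.append(child)
    return Section(parts, children), pos


def parse_binary_bodystructure(field: bytes) -> Section:
    """Decode the field payload (all big-endian 32-bit words)."""
    if len(field) % 4:
        raise ValueError("field length is not a multiple of 4")
    words = list(struct.unpack(f">{len(field) // 4}I", field))
    section, used = parse_section(words)
    if used != len(words):
        raise ValueError("trailing data after bodystructure")
    return section
```

Because the words are read as unsigned values, the -1 sizes of a missing
part 0 appear as ``0xFFFFFFFF``.
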
217Locking Considerations
218~~~~~~~~~~~~~~~~~~~~~~
219
220The ``cyrus.index`` file must be locked in exclusive mode while making
221changes to the ``cyrus.cache`` file to ensure consistency. All new cache
222records are created by reading the current end-of-file offset, appending
223the new cache record, and storing that start offset into the associated
224cyrus.index record.
225
226``cyrus.index``
227---------------
228
229The cyrus.index file is NOT just a cache - it stores information not
230present in the message file!
231
232The cyrus.index file consists of a fixed width header, followed by fixed
233width records. In the past, it would be rewritten on every expunge, but
234since Cyrus 2.4 the expunged records remain in the cyrus.index file for
235a configurable time to support QRESYNC and more efficient delayed
236expunge.
237
238The cyrus.index file is the "heart" of the mailbox format - containing
239checksums (CRC32) of everything else, and the most frequently updated
240fields. All fields are stored in network byte order and aligned on 4
241byte boundaries. Due to some 64 bit values being stored, the header and
242individual records are aligned on 8 byte boundaries.
243
244The overall format looks sort of like this:
245
246::
247
248    cyrus.index:
249    +----------------+
250    | Mailbox Header |
251    +----------------+
252    | Msg: Num 1     |
253    +----------------+
254    | Msg: Num 2     |
255    +----------------+
256    |     ...        |
257    +----------------+
258
The basic idea is that there is one header, followed by message records
evenly spaced throughout the file. All of the message records are at
well-known offsets, making any part of the file accessible at roughly
equal speed.
263
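Because the header and the records are fixed width, the position of any
record is simple arithmetic on two header fields (Start Offset and
Record Size, described below). The function name ``record_offset`` and
the example sizes are hypothetical.

```python
def record_offset(start_offset: int, record_size: int, recno: int) -> int:
    """Byte offset of the recno-th record (1-based) in cyrus.index."""
    if recno < 1:
        raise ValueError("record numbers start at 1")
    return start_offset + (recno - 1) * record_size
```

With an illustrative 128-byte header and 96-byte records, the third
record would start at ``128 + 2 * 96`` bytes into the file.
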
264Locking Considerations
265~~~~~~~~~~~~~~~~~~~~~~
266
267``cyrus.index`` files can not be repacked (i.e. records can not change
268UID for a particular offset, and the file can't be rewritten or deleted)
269unless there's an exclusive namelock held for the mailbox name. This is
270to avoid race conditions and simplify the use of mailboxes. Whenever a
271mailbox is opened, the caller holds a shared namelock on the mailbox
272name for the duration of the "mailbox object"'s existence.
273
274All reads of a ``cyrus.index`` file must be done with a lock held, and
275all writes must be done with an exclusive lock held. This ensures CRC32
276checksums of individual headers and records are always consistent. There
277are no direct "offset" reads done any more, instead the mailbox API
278provides a way to read an entire cyrus.index header or cyrus.index
279record into a struct, performing consistency checks. Writes are also
280done with a complete record struct.
281
282Detail of ``cyrus.index`` header
283~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
284
285The index header contains the following information, in order:
286
287Generation Number (4 bytes)
288    A number that is basically the "revision number" of the mailbox. It
289    must match between the cache and index files. This is to ensure that
290    if we fail to sync both the cache and index files and a crash
291    happens (so that only one is synced), we do not provide bad data to
292    the user. This is also backed by having individual cache checksums
293    on each record.
294Format (4 bytes)
295    Basically obsolete (indicates netnews or regular).
296Minor Version (4 bytes)
297    Indicates the version number of the index file. This can be used for
298    on-the-fly upgrades of the index and cache files.
299Start Offset (4 bytes)
300    Size of index header.
301Record Size (4 bytes)
302    Size of an index record.
Num Records (4 bytes)
    How many records are in this index, including records for expunged
    messages. See "Exists" below, which has moved relative to
    pre-version 12 files.
Last Appenddate (4 bytes)
    (time\_t) of the last time a message was appended.
309Last UID (4 bytes)
310    Highest UID of all messages in the mailbox (UIDNEXT - 1).
311Quota Mailbox Used (8 bytes)
312    Total amount of storage used by all of the messages in the mailbox.
313    Platforms that don't support 64-bit integers only use the last 4
314    bytes.
315POP3 Last Login (4 bytes)
316    (time\_t) of the last pop3 login to this INBOX, used to enforce the
317    "poptimeout" ``imapd.conf`` option.
UIDvalidity (4 bytes)
    The UID validity of this mailbox. Cyrus currently uses the
    ``time()`` when this mailbox was created.
321Deleted, Answered, and Flagged (4 bytes each)
322    Counts of how many messages have each flag.
323Mailbox Options (4 bytes)
324    Bitmask of mailbox options, consisting of any combination of the
325    following:
326
327    POP3\_NEW\_UIDL
328        Flag signalling that we're using "*uidvalidity*.\ *uid*" instead
329        of just "*uid*" for the output of the POP3 UIDL command.
330    IMAP\_SHAREDSEEN
331        Flag signalling that we're supporting a shared \\Seen flag on
332        the mailbox.
333    IMAP\_DUPDELIVER
334        Flag signalling that we're allowing duplicate delivery of
335        messages to the mailbox, overriding system-wide duplicate
336        suppression.
337    MAILBOX\_NEEDS\_REPACK
338        Flag signalling that the mailbox is due to be repacked. During
339        mailbox\_close() every process will attempt to take an exclusive
340        namelock on the mailbox and repack.
341    MAILBOX\_DELETED
342        Flag signalling that the mailbox is deleted. This can be set
343        with a shared namelock, and indicates to all other users of the
344        mailbox that they need to close it and attempt cleanup. The last
345        process to close the mailbox will perform the final cleanup
346        under an exclusive namelock, giving the other processes a chance
347        to finish their current operation first without files
348        disappearing from under them!
349
350Leaked Cache (4 bytes)
351    Number of leaked records in the cache file.
352Highest ModSeq (8 bytes)
353    Highest Modification Sequence of all the messages in the mailbox
354    (CONDSTORE).
355Deleted ModSeq (8 bytes)
356    Lowest Modification Sequence before which expunged message data may
357    have been purged from the mailbox and forgotten (CONDSTORE/QRESYNC
358    support).
359Exists (4 bytes)
360    See NumRecords above. This is the count of non-expunged records in
361    the mailbox and corresponds to the IMAP status item "EXISTS".
First Expunged (4 bytes)
    Lowest modified time of an expunged message in this mailbox (or zero
    if there are no expunged messages). Used to determine if the
    mailbox needs repacking.
Last Repack Time (4 bytes)
    A timestamp for the last repack, to ensure repacks aren't done too
    close together if expunges were closely spaced.
369Header File CRC (4 bytes)
370    CRC32 value of the bytes in the ``cyrus.header`` file for this
371    mailbox. Must be rewritten whenever the cyrus.header file is changed
372    (see locking considerations above - this is why the cyrus.index must
373    be exclusively locked!)
374Sync CRC (4 bytes)
375    An XOR of the CRC32 of a specially generated value for each of the
376    non-expunged records in this mailbox. This is a cached value which
377    allows the replication subsystem to quickly determine that all
378    non-expunged records in a mailbox are in sync and detect possible
379    "split brain" scenarios with low bandwidth use.
Recent UID (4 bytes)
    The highest UID in the mailbox the last time an IMAP client logged
    in as the mailbox owner (or anybody, if SHAREDSEEN is enabled)
    selected this mailbox. Used to generate the \\Recent flags in IMAP.
Recent Time (4 bytes)
    Used for consistency with the seen\_db code, but probably not
    actually necessary. Oh well.
Header CRC (4 bytes)
    Must always be the LAST field of the header. This is the CRC32 of
    the actual bytes on disk (network order format) for the rest of the
    ``cyrus.index`` header. By keeping it last, it can be easily
    calculated with the following snippet of code:
    ``crc = crc32_map(buf, OFFSET_HEADER_CRC);`` - i.e. CRC32 from the
    start of the buffer to just before this field.
394
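The Header CRC check can be sketched as below. This assumes that
``crc32_map`` computes the standard CRC-32, as ``zlib.crc32`` does. The
function name ``header_crc_ok`` is hypothetical, and the offset of the
CRC field is passed in by the caller rather than hard-coded.

```python
import struct
import zlib


def header_crc_ok(header: bytes, crc_offset: int) -> bool:
    """Compare the stored CRC with one computed over the preceding bytes."""
    (stored,) = struct.unpack_from(">I", header, crc_offset)
    computed = zlib.crc32(header[:crc_offset]) & 0xFFFFFFFF
    return stored == computed
```
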
395There are also spare fields in the index header, to allow for future
396expansion without forcing an upgrade of the file, and to round up to be
397divisible by 8 bytes.
398
399Detail of ``cyrus.index`` records
400~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
401
402These records start immediately following the ``cyrus.index`` header,
403and are all fixed size. They are in-order by uid of the message.
404
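Since the records are fixed size, sorted by UID, and begin with the
4-byte UID field listed below, a record can be located with a binary
search. This is a sketch over an in-memory copy of the file, not the
Cyrus mailbox API; the function name ``find_record_by_uid`` is
hypothetical.

```python
import struct
from typing import Optional


def find_record_by_uid(index: bytes, start_offset: int, record_size: int,
                       num_records: int, uid: int) -> Optional[int]:
    """Return the 1-based record number holding uid, or None if absent."""
    lo, hi = 1, num_records
    while lo <= hi:
        mid = (lo + hi) // 2
        offset = start_offset + (mid - 1) * record_size
        (found,) = struct.unpack_from(">I", index, offset)
        if found == uid:
            return mid
        if found < uid:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```
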
405UID (4 bytes)
406    UID of the message
407INTERNALDATE (4 bytes)
408    INTERNALDATE of the message (where possible, this matches the
409    creation and modification times of the file on disk to help
410    reconstruct in the event of data loss)
411SENTDATE (4 bytes)
412    Contents of the Date: header chomped to day resolution with timezone
413    stripped.
414SIZE (4 bytes)
415    Size of the whole message (in octets)
416HEADER SIZE (4 bytes)
417    Size of the message header (in octets)
418GMTIME (4 bytes)
419    Contents of the Date: header at 1 second resolution and converted to
420    GMT (for sort)
421CACHE\_OFFSET (4 bytes)
422    Offset into the ``cyrus.cache`` file for the beginning of this
423    message's cache entry.
424LAST UPDATED (4 bytes)
425    (time\_t) of the last time this record was changed
426SYSTEM FLAGS (4 bytes)
427    Bitmask showing which system flags are set/unset
USER FLAGS (MAX\_USER\_FLAGS / 32 bytes)
    Bitmask showing which user flags are set/unset. Bits correspond to
    positions in the ``cyrus.header`` flag list, i.e. (1<<0) is the
    first flag name in the list.
432CONTENT\_LINES (4 bytes)
433    Number of text lines contained in the message content (body).
434CACHE\_VERSION (4 bytes)
435    Indicates the version number of the cache record for the message
436    (determines which headers are cached, see list in mailbox.c).
GUID (MESSAGE\_GUID\_SIZE bytes)
    Globally Unique IDentifier of the message (used by the replication
    engine). This is the SHA-1 value of the bytes as stored on disk.
440MODSEQ (8 bytes)
441    Modification Sequence of the message (CONDSTORE).
442CACHE\_CRC (4 bytes)
443    This is the CRC32 of all the bytes of the cache record (all 10
444    fields) as stored on disk. Again, calculated over the exact bytes
445    stored in the ``cyrus.cache`` file.
446RECORD\_CRC (4 bytes)
447    Like the header CRC - this is the CRC32 of all the bytes in on-disk
448    order that exist in this record. Records are always rewritten as the
449    entire record, including the updated CRC, so it's always consistent
450    if you have a lock on the ``cyrus.index`` file, because writers will
451    wait until they get an exclusive lock to make modifications.
452
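The relationship between the USER FLAGS bitmask and the flag list in
``cyrus.header`` can be sketched as follows. The sketch assumes the
bitmask is held as a sequence of 32-bit words and that flag position N
lives in word ``N // 32`` at bit ``N % 32``; that word layout is an
assumption, and the function name ``decode_user_flags`` is hypothetical.

```python
from typing import List, Sequence


def decode_user_flags(words: Sequence[int],
                      flag_names: Sequence[str]) -> List[str]:
    """Return the names of the user flags whose bits are set."""
    selected = []
    for position, name in enumerate(flag_names):
        word, bit = divmod(position, 32)
        if name and word < len(words) and words[word] & (1 << bit):
            selected.append(name)
    return selected
```
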
453Notes
454-----
455
456-  Expunge is super quick now - it's just a flag update!
457-  Append is relatively fast (it only adds to the end of both the cache
458   and index files and modifies the index header)
459-  Message unlinks always happen during the "close" phase - which may be
460   noticed when you select another mailbox, but otherwise are delayed
461   from the actual action. With delayed expunge, the unlinks are pushed
462   off to cyr\_expire which is a background task, and will never be
463   noticed by the user.
464-  Message delivery is something like this:
465
466   #. write/sync message file
467   #. write/sync new ``cyrus.cache`` record
468   #. write/sync new ``cyrus.index`` record
469   #. calculate, write, sync new ``cyrus.index`` header
470   #. acknowledge message delivery
471
472   The message isn't delivered until the new index header is written. In
473   case of a crash before the new index header is written, any previous
474   writes will be overwritten on the next delivery (and will not be
475   noticed by the readers).
476
477   Note that certain power failure situations (power failure in the
478   middle of a disk sector write) could cause a mailbox to need
479   reconstruction (possibly even losing some flag state). These failure
480   modes are not possible in the "Hardware RAID disk model" (which we
481   will describe somewhere else when we get around to it).
482
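The delivery ordering above can be sketched as below. All names are
hypothetical, the record and header bytes are prepared by the caller,
and locking is not shown. The index record is written at an offset
supplied by the caller (derived from the record count in the current
header), which is what lets a record left behind by a crash be
overwritten on the next delivery.

```python
import os
from typing import Callable


def _sync_write(handle, data: bytes) -> None:
    """Write data and force it to stable storage before returning."""
    handle.write(data)
    handle.flush()
    os.fsync(handle.fileno())


def deliver(mailbox_dir: str, uid: int, message: bytes, cache_record: bytes,
            make_index_record: Callable[[int], bytes],
            index_record_offset: int, new_index_header: bytes) -> None:
    def path(name: str) -> str:
        return os.path.join(mailbox_dir, name)

    # 1. write/sync message file
    with open(path(f"{uid}."), "wb") as msg:
        _sync_write(msg, message)
    # 2. write/sync new cyrus.cache record at the current end of file
    with open(path("cyrus.cache"), "r+b") as cache:
        cache.seek(0, os.SEEK_END)
        cache_offset = cache.tell()
        _sync_write(cache, cache_record)
    with open(path("cyrus.index"), "r+b") as index:
        # 3. write/sync new cyrus.index record, which stores the cache offset
        index.seek(index_record_offset)
        _sync_write(index, make_index_record(cache_offset))
        # 4. write/sync the new header; only now is the message visible
        index.seek(0)
        _sync_write(index, new_index_header)
    # 5. the caller acknowledges delivery after this function returns
```
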
483Future considerations
484---------------------
485
486-  Cache all header fields? (or all up to Xk?) This could greatly
487   improve speeds of clients that just ask for everything, but also
488   increases the expense of rewriting the cache file (as well as the
489   size it takes on disk).
-  Reformat cache file to use a
   (size)(size)(size)(size)(data)(data)(data) format. This makes
   accesses anywhere in the cache file equally fast, as opposed to
   having to iterate through all the entries for a given message to get
   to the last one. Note that either way is still O(1) so maybe it
   doesn't matter much.
496-  It would be useful to store a uniqueid -> mailbox name index, so that
497   we could fix arbitron again.
498