.. _cyrus-backups:

=============
Cyrus Backups
=============

.. contents::


Introduction
============

Cyrus Backups are a replication-based backup service for Cyrus IMAP servers.
This is currently an experimental feature.  If you have the resources to try it
out alongside your existing backup solutions, feedback would be appreciated.

This document is intended to be a guide to the configuration and
administration of Cyrus Backups.

This document is a work in progress and at this point is incomplete.

This document assumes that you are familiar with compiling, installing,
configuring and maintaining Cyrus IMAP servers generally, and will only discuss
backup-related portions in detail.

This document assumes a passing familiarity with
:ref:`Cyrus Replication <replication>`.

Limitations
===========

Cyrus Backups are experimental and incomplete.

The following components exist and appear to work:

- backupd, and therefore inbound replication
- autovivification of backup storage for new users, with automatic partition
  selection
- rebuilding of backup indexes from backup data files
- compaction of backup files to remove stale data and combine chunks for
  better compression
- deep verification of backup file/index state
- examination of backup data
- locking tool, for safe non-cyrus operations on backup files
- recovery of data back into a Cyrus IMAP server

The following components don't yet exist in a workable state -- these tasks
must be massaged through manually (with care):

- reconstruction of backups.db from backup files

The following types of information are currently backed up and recoverable:

- mailbox state and annotations
- messages
- mailbox message records, flags, and annotations

The following types of information are currently backed up, but tools to
recover them don't yet exist:

- sieve scripts (but not active script status)
- subscriptions
- seen data

The following types of information are not currently backed up:

- quota information

Architecture
============

Cyrus Backups are designed to run on one or more standalone, dedicated backup
servers, with suitably-sized storage partitions.  These servers generally do
not run an IMAP daemon, nor do they have conventional mailbox storage.

Your Cyrus IMAP servers synchronise mailbox state to the Cyrus Backup server(s)
using the Cyrus replication (aka sync, aka csync) protocol.

Backup data is stored in two files per user: a data file, containing gzipped
chunks of replication commands; and an SQLite database, which indexes the
current state of the backed up data.  User backup files are stored in a hashed
subdirectory of their containing partition.

A twoskip database, backups.db, stores mappings of users to their backup file
locations.

Installation
============

Requirements
------------

- At least one Cyrus IMAP server, serving and storing user data.
- At least one machine which will become the first backup server.

Cyrus Backups server
--------------------

#. Compile cyrus with the ``--enable-backup`` configure option and install it.
#. Set up an :cyrusman:`imapd.conf(5)` file for it with the following options
   (default values shown):

    backup\_db: twoskip
        The twoskip database format is recommended for backups.db
    backup\_db\_path: {configdirectory}/backups.db
        The backups db contains a mapping of user ids to their backup locations
    backup\_staging\_path: {temp\_path}/backup
        Directory to use for staging message files during backup operations.
        The replication protocol will transfer as many as 1024 messages in a
        single sync operation, so, conservatively, this directory needs to
        contain enough storage for 1024 \* your maximum message size \* number
        of running backupd's, plus some wiggle room.
    backup\_retention\_days: 7
        Number of days for which backup data (messages etc) should be kept
        within the backup storage after the corresponding item has been
        deleted/expunged from the Cyrus IMAP server.
    backuppartition-\ *name*: /path/to/this/partition
        You need at least one backuppartition-\ *name* to store backup data.
        These work similarly to regular/archive IMAP partitions, but note that
        there is no relationship between backup partition names and
        regular/archive partition names.  New users will have their backup
        storage provisioned according to the usual partition selection rules.
    backup\_compact\_minsize: 0
        The ideal minimum data chunk size within backup files, in kB.  The
        compact tool will try to combine chunks that are smaller than this
        into neighbouring chunks.  Larger values tend to yield better
        compression ratios, but if the data is corrupted on disk, the entire
        chunk will become unreadable.  Zero turns this behaviour off.
    backup\_compact\_maxsize: 0
        The ideal maximum data chunk size within backup files, in kB.  The
        compact tool will try to split chunks that are larger than this into
        multiple smaller chunks.  Zero turns this behaviour off.
    backup\_compact\_work\_threshold: 1
        The number of chunks within a backup file that must obviously need
        compaction before the compact tool will attempt to compact the file.
        Larger values are expected to reduce compaction I/O load at the expense
        of delayed recovery of storage space.

#. Create a user for authenticating to the backup system, and add it to the
   ``admins`` setting in :cyrusman:`imapd.conf(5)`
#. Add appropriate ``sasl_*`` settings for your authentication method to
   :cyrusman:`imapd.conf(5)`
#. Set up a :cyrusman:`cyrus.conf(5)` file for it::

    START {
        # this is required
        recover cmd="ctl_cyrusdb -r"
    }

    SERVICES {
        # backupd is probably the only service entry your backup server needs
        backupd cmd="backupd" listen="csync" prefork=0
    }

    EVENTS {
        # this is required
        checkpoint cmd="ctl_cyrusdb -c" period=30

        # arrange for compact to run at some interval
        compact cmd="ctl_backups compact -A" at=0400
    }

#. Start up the server, and use :cyrusman:`synctest(1)` to verify that you can
   authenticate to backupd

Cyrus IMAP servers
------------------

Your Cyrus IMAP servers must be running version 3 or later of Cyrus, and must
have been compiled with the ``--enable-replication`` configure option.  They do
*not* need to be recompiled with the ``--enable-backup`` option.

It's recommended to set up a dedicated replication channel for backups, so that
your backup replication can coexist independently of your other replication
configurations.

Add settings to :cyrusman:`imapd.conf(5)` like (example values shown):

*channel*\ \_sync\_host: backup-server.example.com
    The host name of your Cyrus Backup server
*channel*\ \_sync\_port: csync
    The port on which your Cyrus Backup server's backupd process listens
*channel*\ \_sync\_authname: ...
    Credentials for authenticating to the Cyrus Backup server
*channel*\ \_sync\_password: ...
    Credentials for authenticating to the Cyrus Backup server

Using rolling replication
+++++++++++++++++++++++++

You can configure backups to use rolling replication.  Depending on the sync
repeat interval you configure, this can be used to keep your backups very
current -- potentially as current as your other replicas.

To configure rolling replication, add additional settings to
:cyrusman:`imapd.conf(5)` like:

sync\_log: 1
    Enable sync log if it wasn't already.
sync\_log\_channels: *channel*
    Add a new channel "*channel*" to whatever was already here.  Suggest calling
    this "backup"
*channel*\ \_sync\_repeat\_interval: 1
    Minimum time in seconds between rolling replication runs.  Smaller value
    means livelier backups but more network I/O.  Larger value reduces I/O.

Update :cyrusman:`cyrus.conf(5)` to add a :cyrusman:`sync_client(8)` invocation
to the DAEMON section specifying (at least) the ``-r`` and ``-n channel``
options.

See :cyrusman:`imapd.conf(5)` for additional *sync\_* settings that can
be used to affect the replication behaviour.  Many can be prefixed with
a channel to limit their effect to only backups, if necessary.

Using scheduled replication (push)
++++++++++++++++++++++++++++++++++

You can configure backups to occur on a schedule determined by the IMAP
server.

To do this, add :cyrusman:`sync_client(8)` invocations to the EVENTS section
of :cyrusman:`cyrus.conf(5)` (or cron, etc), specifying at least the
``-n channel`` option (to use the channel-specific configuration), plus
whatever other options you need for selecting users to back up.  See the
:cyrusman:`sync_client(8)` manpage for details.

You could also invoke :cyrusman:`sync_client(8)` in a similar way from a
custom script running on the IMAP server.

Using scheduled replication (pull)
++++++++++++++++++++++++++++++++++

You can configure backups to occur on a schedule determined by the
backup server.  For example, you may have a custom script that examines
the existing backups, and provokes fresh backups to occur if they are
determined to be out of date.

To do this, enable XBACKUP on your IMAP server by adding the following
setting to :cyrusman:`imapd.conf(5)`:

xbackup\_enabled: yes
    Enables the XBACKUP command in imapd.

Your custom script can then authenticate to the IMAP server as an admin
user, and invoke the command ``XBACKUP pattern [channel]``.  A replication
of the users or shared mailboxes matching the specified pattern will occur
to the backup server defined by the named channel.  If no channel is
specified, default sync configuration will be used.

For example::

    C: 1 XBACKUP user.* backup
    S: * OK USER anne
    S: * OK USER bethany
    S: * NO USER cassandane (Operation is not supported on mailbox)
    S: * OK USER demi
    S: * OK USER ellie
    S: 1 OK Completed

This replicates all users to the channel *backup*.


Administration
==============

Storage requirements
--------------------

It's not really known yet how to predict the storage requirements for a backup
server.  Experimentation in a dev environment suggests compressed backup files
of around 20-40% of the size of the backed up data, depending on compact
settings, but this is with relatively tiny mailboxes and non-pathological data.

The backup staging spool conservatively needs to be large enough to hold an
entire sync's worth of message files at once: your maximum message
size \* 1024 messages \* the number of backupd processes you're running, plus
some wiggle room.  In practice it's unlikely to hit this limit
unless someone is trying to.  (Most users probably don't have 1024
maximum-sized messages in their account, or don't receive them all at once
anyway.)

Certain invocations of ctl\_backups and cyr\_backup also require staging spool
space, due to the way replication protocol (and thus backup data) parsing
handles messages, so allow for this when sizing the staging spool.

Initial backups
---------------

Once a Cyrus Backup system is configured and running, new users that are
created on the IMAP servers will be backed up seamlessly without administrator
intervention.
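
Before provisioning a backup server, it can help to work through the
conservative staging spool estimate from the storage requirements section.
The following is a minimal sketch of that arithmetic only.  The function name,
the ``headroom`` parameter and its default value are illustrative assumptions
and are not part of Cyrus:

.. code-block:: python

    import math


    def staging_spool_bytes(max_message_bytes, backupd_processes,
                            messages_per_sync=1024, headroom=0.10):
        """Conservative staging spool size, in bytes.

        messages_per_sync is the 1024-message batch size described above.
        headroom is a hypothetical stand-in for "some wiggle room".
        """
        if max_message_bytes < 0 or backupd_processes < 0 or headroom < 0:
            raise ValueError("inputs must be non-negative")
        base = messages_per_sync * max_message_bytes * backupd_processes
        # Round the extra allowance up so the estimate never undershoots.
        return base + math.ceil(base * headroom)


    # Example: 50 MiB maximum message size, 4 backupd processes.
    estimate = staging_spool_bytes(50 * 1024 ** 2, 4)
    print(f"staging spool estimate: {estimate / 1024 ** 3:.1f} GiB")

This is a worst-case figure; as noted above, real usage is expected to be far
lower for most sites.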

The very first backup taken of a pre-existing mailbox will be big -- the entire
mailbox in one hit.  When initially provisioning a Cyrus Backup server for an
existing Cyrus IMAP environment, it's suggested that the
:cyrusman:`sync_client(8)` commands be run carefully, for a small group of
mailboxes at a time, until all/most of your mailboxes have been backed up at
least once.  Also run the :cyrusman:`ctl_backups(8)` ``compact`` command on the
backups, to break up big chunks, if you wish.  Only then should you enable
rolling/scheduled replication.

Restoring from backups
----------------------

The :cyrusman:`restore(8)` tool will restore mailboxes and messages from a
specified backup to a specified destination server.  The destination server must
be running a replication-capable :cyrusman:`imapd(8)` or
:cyrusman:`sync_server(8)`.  The restore tool should be run from the backup
server containing the specified backup.

File locking
------------

All :cyrusman:`backupd(8)`/:cyrusman:`ctl_backups(8)`/:cyrusman:`cyr_backup(8)`
operations first obtain a lock on the relevant backup file.  ctl\_backups and
cyr\_backup will try to do this without blocking (unless told otherwise),
whereas backupd will never block.

Moving backup files to different backup partitions
--------------------------------------------------

There's no tool for this (yet).  To do it manually, stop backupd, copy the files
to the new partition, then use :cyrusman:`cyr_dbtool(8)` to update the user's
backups.db entry to point to the new location.  Run the
:cyrusman:`ctl_backups(8)` ``verify`` command on both the new filename (``-f``
mode) and the user's userid (``-u`` mode) to ensure everything is okay, then
restart backupd.
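
The manual move procedure above can be written down as a checklist of
commands.  The sketch below only builds the command strings and runs nothing,
so each line can be reviewed first.  It makes several assumptions that should
be checked against your installation: that the backups.db key is the userid
and the value is the path of the data file, that the index file path is
supplied by the caller, and that ``cp -p`` is an acceptable way to copy.  The
function name and all paths are hypothetical:

.. code-block:: python

    import posixpath
    import shlex


    def move_backup_plan(userid, data_file, index_file, new_dir, backups_db,
                         db_backend="twoskip"):
        """Return the manual move steps as a list of shell command strings."""
        new_dir = new_dir.rstrip("/")
        new_data = posixpath.join(new_dir, posixpath.basename(data_file))
        return [
            "# stop backupd first (how depends on your installation)",
            # Copy both of the user's backup files to the new partition.
            shlex.join(["cp", "-p", data_file, index_file, new_dir + "/"]),
            # ASSUMPTION: backups.db maps userid -> data file path.
            shlex.join(["cyr_dbtool", backups_db, db_backend,
                        "set", userid, new_data]),
            # Verify by filename, then by userid, as described above.
            shlex.join(["ctl_backups", "verify", "-f", new_data]),
            shlex.join(["ctl_backups", "verify", "-u", userid]),
            "# restart backupd",
        ]


    # Hypothetical paths, for illustration only.
    for step in move_backup_plan("anne",
                                 "/srv/backup1/a/anne_x",
                                 "/srv/backup1/a/anne_x.index",
                                 "/srv/backup2/a",
                                 "/var/lib/cyrus/backups.db"):
        print(step)

Inspecting the existing entry before changing it (for example with the
``show`` action of :cyrusman:`cyr_dbtool(8)`) is a sensible way to confirm the
key and value format first.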

Provoking a backup for a particular user/user group/everyone/etc right now
--------------------------------------------------------------------------

Just run :cyrusman:`sync_client(8)` by hand with appropriate options (as the
cyrus user, of course).  See its man page for ways of specifying items to
replicate.

If the IMAP server with the user's mail has been configured with the
``xbackup_enabled: yes`` option in :cyrusman:`imapd.conf(5)`, then an admin
user can cause a backup to occur by sending the IMAP server an ``XBACKUP``
command.

What about tape backups?
------------------------

As long as backupd, ctl\_backups and cyr\_backup are not currently running (and
assuming no-one's poking around in things otherwise), it's safe to take/restore
a filesystem snapshot of backup partitions.  So to schedule, say, a nightly tape
dump of your Cyrus Backup server, make your cron job shut down Cyrus, make the
copy, then restart Cyrus.

Meanwhile, your Cyrus IMAP servers are still online and available.  Regular
backups will resume once your backupd is running again.

If you can work at a finer granularity than the file system, you don't need to
shut down backupd.  Just use the :cyrusman:`ctl_backups(8)` ``lock`` command to
hold a lock on each backup while you work with its files, and the rest of the
backup system will work around that.

Restoring is more complicated, depending on what you actually need to do:
when you restart backupd after restoring a filesystem snapshot, the next
time your Cyrus IMAP server replicates to it, the restored backups will be
brought up to date.  This is probably not what you wanted, so don't restart
backupd until you've finished working with the restored backups.

Multiple IMAP servers, one backup server
----------------------------------------

This is fine, as long as each user being backed up is only being backed up by
one server (or they are otherwise synchronised).  If IMAP servers have different
ideas about the state of a user's mailboxes, one of those will be in sync with
the backup server and the other will get a lot of replication failures.

Multiple IMAP servers, multiple backup servers
----------------------------------------------

Make sure your :cyrusman:`sync_client(8)` configuration(s) on each IMAP server
knows which users are being backed up to which backup servers, and selects
them appropriately.  See the :cyrusman:`sync_client(8)` man page for options for
specifying users, and run it as an event (rather than rolling).

Or just distribute it at server granularity, such that backup server A serves
IMAP servers A, B and C, and backup server B serves IMAP servers D, E, F, etc.

One IMAP server, multiple backup servers
----------------------------------------

Configure one channel plus one rolling :cyrusman:`sync_client(8)` per backup
server, and your IMAP server can be more or less simultaneously backed up to
multiple backup destinations.

Reducing load
-------------

To reduce load on your client-facing IMAP servers, configure sync log chaining
on their replicas and let those take the load of replicating to the backup
servers.

To reduce network traffic, do the same thing, specifically using replicas that
are already co-located with the backup server.

Other setups
------------

The use of the replication protocol and :cyrusman:`sync_client(8)` allows a lot
of interesting configuration possibilities to shake out.  Have a rummage in the
:cyrusman:`sync_client(8)` man page for inspiration.
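
For the "one IMAP server, multiple backup servers" case above, the
per-channel settings are repetitive enough that generating them can reduce
mistakes.  The sketch below emits :cyrusman:`imapd.conf(5)` lines and
:cyrusman:`cyrus.conf(5)` DAEMON entries for one channel per backup server,
using only the settings and ``sync_client`` options described earlier in this
document.  It assumes that ``sync_log_channels`` takes a space-separated list.
The function name, the channel names, the host names and the DAEMON entry
names (``sync`` followed by the channel name) are hypothetical.
Authentication settings are deliberately left out and must be added by hand:

.. code-block:: python

    def backup_channel_config(servers, existing_channels=()):
        """Build config lines for one rolling backup channel per server.

        servers maps a channel name to its backup server host name.
        Returns (imapd_conf_lines, cyrus_conf_daemon_lines).
        """
        # Keep any channels already in use, then append the new ones.
        channels = list(existing_channels)
        channels += [name for name in servers if name not in channels]

        imapd = ["sync_log: 1",
                 # ASSUMPTION: channel names are separated by spaces.
                 "sync_log_channels: " + " ".join(channels)]
        for name, host in servers.items():
            imapd.append(f"{name}_sync_host: {host}")
            imapd.append(f"{name}_sync_port: csync")

        # One rolling sync_client per channel, for the DAEMON section.
        daemon = [f'sync{name} cmd="sync_client -r -n {name}"'
                  for name in servers]
        return imapd, daemon


    imapd_lines, daemon_lines = backup_channel_config(
        {"backupa": "backup-a.example.com",
         "backupb": "backup-b.example.com"})
    print("\n".join(imapd_lines))
    print("DAEMON {")
    for line in daemon_lines:
        print("    " + line)
    print("}")

Review the generated lines against your existing configuration before
applying them, particularly ``sync_log_channels``, which replaces rather than
extends any value already set.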

Tools
=====

ctl\_backups
------------

This tool is generally for mass operations that require few/fixed arguments
across multiple/all backups.

Supported operations:

compact
    Reduce backups' disk usage by:

    * combining small chunks for better gzip compression -- especially
      important for hot backups, which produce many tiny chunks
    * removing deleted content that has passed its retention period
list
    List known backups.
lock
    Lock a single backup, so you can safely work on it with non-cyrus tools.
reindex
    Regenerate indexes for backups from their data files.  Useful if index
    becomes corrupted by some bug, or invalidated by working on data with
    non-cyrus tools.
stat
    Show statistics about backups -- disk usage, compression ratio, etc.
verify
    Deep verification of backups.  Verifies that:

    * Checksums for each chunk in index match data
    * Mailbox states are in the chunk that the index says they're in
    * Mailbox states match indexed states
    * Messages are in the chunk the index says they're in
    * Message data checksum matches indexed checksums

See the :cyrusman:`ctl_backups(8)` man page for more information.

cyr\_backup
-----------

This tool is generally for operations on a single mailbox that require multiple
additional arguments.

Supported operations:

list [ chunks \| mailboxes \| messages \| all ]
    Line-per-item listing of information stored in a backup.
show [ chunks \| mailboxes \| messages ] items...
    Paragraph-per-item listing of information for specified items.  Chunk items
    are specified by id, mailboxes by mboxname or uniqueid, messages by guid.
dump [ chunk \| message ] item
    Full dump of one item.  chunk dumps the uncompressed content of a chunk
    (i.e. a bunch of sync protocol commands).  message dumps a raw rfc822
    message (useful for manually restoring)

See the :cyrusman:`cyr_backup(8)` man page for more information.

restore
-------

This tool is for restoring mail from backup files.

Required arguments are a destination server (in ip:port or host:port format),
a backup file, and mboxnames, uniqueids or guids specifying the mailboxes or
messages to be restored.

If the target mailbox does not already exist on the destination server, options
are available to preserve the mailbox and message properties as they existed
in the backup.  This is useful for rebuilding a lost server from backups, such
that client state remains consistent.

If the target mailbox already exists on the destination server, restored
messages will be assigned new, unused uids and will appear to the client as new
messages.

See the :cyrusman:`restore(8)` man page for more information.