Notes for backup implementation

Backup index database (one per user):

chunk:
    int id
    timestamp ts
    int offset
    int length
    text file_sha1 -> sha1 of (compressed) data prior to this chunk
    text data_sha1 -> sha1 of (uncompressed) data contained in this chunk

mailbox:
    int id
    int last_chunk_id -> chunk that knows the current state
    char uniqueid -> unique
    char mboxname -> altered by a rename
    char mboxtype
    int last_uid
    int highestmodseq
    int recentuid
    timestamp recenttime
    timestamp last_appenddate
    timestamp pop3_last_login
    timestamp pop3_show_after
    timestamp uidvalidity
    char partition
    char acl
    char options
    int sync_crc
    int sync_crc_annot
    char quotaroot
    int xconvmodseq
    char annotations
    timestamp deleted -> time that it was unmailbox'd, or NULL if still alive

message:
    int id
    char guid
    char partition -> this is used to set the spool directory for the temp file - we might not need it
    int chunk_id
    int offset -> offset within chunk of dlist containing this message
    int size -> size of this message (n.b. not length of dlist)

mailbox_message:
    int mailbox_id
    int message_id
    int last_chunk_id -> chunk that has a RECORD in a MAILBOX for this
    int uid
    int modseq
    timestamp last_updated
    char flags
    timestamp internaldate
    int size
    char annotations
    timestamp expunged -> time that it was expunged, or NULL if still alive

subscription:
    int last_chunk_id -> chunk that knows the current state
    char mboxname -> no linkage to mailbox table, users can be sub'd to nonexistent
    timestamp unsubscribed -> time that it was unsubscribed, or NULL if still alive

seen:
    int last_chunk_id -> chunk that knows the current state
    char uniqueid -> mailbox (not necessarily ours) this applies to
    timestamp lastread
    int lastuid
    timestamp lastchange
    char seenuids -> a uid sequence encoded as a string

sieve:
    int chunk_id
    timestamp last_update
    char filename
    char guid
    int offset -> offset within chunk of the dlist containing this script
    timestamp deleted -> time that it was deleted, or NULL if still alive


sieve scripts and messages are both identified by a GUID
but APPLY SIEVE doesn't take a GUID, it seems to be generated locally?
the GUID in the response to APPLY SIEVE is generated in the process of
reading the script from disk (sync_sieve_list_generate)

can't activate scripts because only bytecode files are activated, but
we neither receive bytecode files over sync protocol nor do we compile
them ourselves.

possibly reduce index size by breaking deleted/expunged values into their
own tables, such that we only store a deleted value for things that are
actually deleted. use left join + is null to find undeleted content


messages
--------

APPLY MESSAGE is a list of messages, not necessarily only one message.
Actually, it's a list of messages for potentially multiple users, but we avoid
this by rejecting GET MESSAGES requests that span multiple users (so that
sync_client retries at USER level, and so we only see APPLY MESSAGE requests
for a single user).

Cheap first implementation is to index the start/end of the entire APPLY
MESSAGE command identically for each message within it, and at restore time
we grab that chunk and loop over it looking for the correct guid.

Ideal implementation would be to index the offset and length of each message
exactly (even excluding the dlist wrapper), but this is rather complicated
by the dlist API.

For now, we just index the offset of the dlist entry for the message,
and we can parse the pure message data back out later from that, when
we need to. Slightly less efficient on reads, but works -> good -> fast. We
need to loop over the entries in the MESSAGE dlist to find the one with the
desired GUID.

The indexed length needs to be the length of the message, not the length of the
dlist wrapper, because we need to know this cheaply to supply RECORDs in
MAILBOX responses.


renames
-------

APPLY RENAME %(OLDMBOXNAME old NEWMBOXNAME new PARTITION p UIDVALIDITY 123)

We identify mboxes by uniqueid, so when we start seeing sync data for the same
uniqueid with a new mboxname we just transparently update it anyway, without
needing to handle the APPLY RENAME. Not sure if this is a problem... Do we
need to record an mbox's previous names somehow?

I think it's possible to use this to rename a USER though, something like:

APPLY RENAME %(OLDMBOXNAME example.com!user.smithj NEWMBOXNAME example.com!user.jsmith ...)
-- in which case, without special handling of the RENAME command itself, there
will be a backup for the old user that ends with the RENAME, and a backup of
the new user that (probably) duplicates everything again (except for stuff
that's been expunged).

And if someone else gets given the original name, like

APPLY RENAME %(OLDMBOXNAME example.com!user.samantha-mithj NEWMBOXNAME example.com!user.smithj ...)

then anything that was expunged from the original user but still available in
backup disappears? Or the two backups get conflated, and samantha can
"restore" the original smithj's old mail?

Uggh.

if there's a mailboxes database pointing to the backup files, then the backup
file names don't need to be based on the userid; they could e.g. be based on
the user's inbox's uniqueid. this would make it easier to deal with user
renames because the backup filename wouldn't need to change. but this depends
on the uniqueid(s) in question being present in most areas of the sync
protocol, otherwise when starting a backup of a brand new user we won't be
able to tell where to store it. a workaround in the meantime could be to make
some kind of backup id from the mailboxes database, and base the filename on
this.

actually, using "some kind of backup id from the mailboxes database" is probably
the best solution. otherwise the lock complexity of renaming a user while making
sure their new backup filename doesn't already exist is frightful.

maybe do something with mkstemp()?

furthermore: what if a mailbox is moved from one user to another? like:

APPLY RENAME %(OLD... example.com!user.foo.something NEW... example.com!user.bar.something ...)

when a different-uid rename IS a rename of a user (and not just a folder
being moved to a different user), what does it look like?
* does it do a single APPLY RENAME for the user, and expect their folders to
  shake out of that?
* does it do an APPLY RENAME for each of their folders?

in the latter case, we need to append each of those RENAMEs to the old backup
so they can take effect correctly, and THEN rename the backup file itself. but
how to tell when the appends are finished?

how can we tell the difference between folder(s) moved to a different user vs
user has been renamed?

there is a setting ('allowusermoves') which, when enabled, allows users to
be renamed via IMAP rename/xfer commands. but the default is that this is
disabled. we could initially require this to be disabled while using backups...

not sure what the workflow looks like for renaming a user if this is not enabled.

not sure what the sync flow looks like in either case.

looking at sync_apply_rename and mboxlist_renamemailbox, it seems like we'll
see an APPLY RENAME for each affected mbox when a recursive rename is occurring.

there doesn't seem to be anything preventing user/a/foo -> user/b/foo in the
general (non-INBOX) case.

renames might be a little easier to handle if the index replicated the mailbox
hierarchy rather than just being a flat structure. though this adds complexity
wrt hiersep handling. something like:

    mailbox:
        mboxname        # just the name of this mbox
        parent_id       # fk to parent mailbox
        full_mboxname   # cached value, parent.full_mboxname + mboxname


locking
-------

just use a normal flock/fcntl lock on the data file and only open the index
if that lock succeeded

backup:  needs to append to foo and update foo.index
reindex: only needs to read foo, but needs a write lock to prevent writes
         while it does so. needs to write to (replace) foo.index
compact: needs to re-write foo and foo.index
restore: needs to read


verifying index
---------------

how to tell whether the .index file is the correct one for the backup data it
ostensibly represents?

one way to do this would be to have backup_index_end() store a checksum of
the corresponding data contents in the index.

when opening a backup, verify this checksum against the data, and refuse to
load the index if it doesn't match.

- sha1sum of (compressed) contents of file prior to each chunk

how to tell whether the chunk data is any good? store a checksum of the chunk
contents along with the rest of the chunk index

- sha1sum of (uncompressed) contents of each chunk


mailboxes database
------------------

bron reckons use twoskip for this
userid -> backup_filename

lib/cyrusdb module implements this, look into that

look at conversations db code to see how to use it

need a tool:
 * given a user, show their backup filename
 * dump/undump
 * rebuild based on files discovered in backup directory

where does this fit into the locking scheme?


reindex
-------

* convert user mailbox name to backup name
* complain if there's no backup data file?
* lock, rename .index to .index.old, init new .index
* foreach file chunk:
  * timestamp is from first line in chunk
  * complain if timestamp has gone backwards?
  * index records from chunk
* unlock
* clean up .index.old

on error:
* discard partial new index
* restore .index.old
* bail out


backupd
-------

cmdloop:
* (periodic cleanup)
* read command, determine backup name
* already holding lock ? bump timestamp : obtain lock
* write data to gzname, flush immediately
* index data

periodic cleanup:
* check timestamp of each held lock
* if stale (define: stale?), release
* FIXME if we've appended more than the chunk size we would compact to, release

sync restart:
* release each held lock

exit:
* release each held lock

need a "backup_index_abort" to complete the backup_index_start/end set.
_start should create a transaction, _end should commit it, and _abort should
roll it back. then, if backupd fails to write to the gzip file for some
reason, the (now invalid) index info we added can be discarded too.

flushing immediately on write results in poor gzip compression, but for
incremental backups that's not a problem. when the compact process hits the
file it will recompress the data more efficiently.


questions
---------
* what does it look like when uidvalidity changes?


restore
-------

restoration is effectively a reverse-direction replication (replicating TO
master), which means we can't necessarily supply things like uid, modseq,
etc without racing against normal message arrivals. so instead we add an
extra command to the protocol to restore a message to a folder but let the
destination determine the tasty bits.

protocol flow looks something like:

c: APPLY RESERVE ...    # as usual
s: * MISSING (foo bar)
s: OK
c: APPLY MESSAGE ...    # as usual
s: OK
c: RESTORE MAILBOX ...  # new sync proto command
s: OK

we introduce a new command, RESTORE MAILBOX, which is similar to the existing
APPLY MAILBOX. it specifies, for a mailbox, the mailbox state plus the message
records relevant to the restore.

the imapd/sync_server receiving the RESTORE command creates the mailbox if
necessary, and then adds the message records to it as new records (i.e.
generating new uid etc).
this will end up generating new events in the backup channel's sync log, and
then the messages will be backed up again with their new uids, etc. additional
wire transfer of message data should be avoided by keeping the same guid.

if the mailbox already exists but its uniqueid does not match the one from the
backup, then what? this probably means the user has deleted the folder and its
contents, then made a new folder with the same name. so it's probably very
common for the mailbox uniqueid to not match like this. so we don't care about
special handling for this case. just add any messages that aren't already there.

if the mailbox doesn't already exist on the destination (e.g. if rebuilding a
server from backups) then it's safe and good to reuse uidvalidity, uniqueid,
uid, modseq etc, such that connecting clients can preserve their state. so the
imapd/sync_server receiving the restore request accepts these fields as
optional, but only preserves them if it's safe to do so.

* restore: sbin program for selecting and restoring messages

restore command needs options:
+ whether or not to trim deletedprefix off mailbox names to be restored
+ whether or not to restore uniqueid, highestmodseq, uid and so on
+ whether or not to limit to/exclude expunged messages
+ whether or not to restore sub-mailboxes
+ sync_client-like options (servername, local_only, partition, ...)
+ user/mailbox/backup file(s) to restore from
+ mailbox to restore to (override location in backup)
+ override acl?

can we heuristically determine whether an argument is an mboxname, uniqueid or guid?
 => libuuid uniqueid is 36 bytes of hyphen (at fixed positions) and hex digits
 => non-libuuid uniqueid is 24 bytes of hex digits
 => mboxname usually contains at least one . somewhere
 => guid is 40 bytes of hex digits

usage:
    restore [options] server [mode] backup [mboxname | uniqueid | guid]...
options:
    -A acl          # apply specified acl to restored mailboxes
    -C alt_config   # alternate config file
    -D              # don't trim deletedprefix before restoring
    -F input-file   # read mailboxes/messages from file rather than argv
    -L              # local mailbox operations only (no mupdate)
    -M mboxname     # restore messages to specified mailbox
    -P partition    # restore mailboxes to specified partition
    -U              # try to preserve uniqueid, uid, modseq, etc
    -X              # don't restore expunged messages
    -a              # try to restore all mailboxes in backup
    -n              # calculate work required but don't perform restoration
    -r              # recurse into submailboxes
    -v              # verbose
    -w seconds      # wait before starting (useful for attaching a debugger)
    -x              # only restore expunged messages (not sure if useful?)
    -z              # require compression (abort if compression unavailable)

mode:
    -f              # specified backup interpreted as filename
    -m              # specified backup interpreted as mboxname
    -u              # specified backup interpreted as userid (default)


compact
-------

# finding messages that are to be kept (either exist as unexpunged somewhere,
# or exist as expunged but more recently than threshold)
# (to get unique rows, add "distinct" and remove mm.expunged from fields)
sqlite> select m.*, mm.expunged from message as m join mailbox_message as mm on m.id = mm.message_id and (mm.expunged is null or mm.expunged > 1437709300);
id|guid|partition|chunk_id|offset|length|expunged
1|1c7cca361502dfed2d918da97e506f1c1e97dfbe|default|1|458|2159|
1|1c7cca361502dfed2d918da97e506f1c1e97dfbe|default|1|458|2159|1446179047
1|1c7cca361502dfed2d918da97e506f1c1e97dfbe|default|1|458|2159|1446179047

# finding chunks that are still needed (due to containing last state
# of mailbox or mailbox_message, or containing a message)
sqlite> select * from chunk where id in (select last_chunk_id from mailbox where deleted is null or deleted > 1437709300 union select last_chunk_id from mailbox_message where expunged is null or expunged > 1437709300 union select chunk_id from message as m join mailbox_message as mm on m.id = mm.message_id and (mm.expunged is null or mm.expunged > 1437709300));
id|timestamp|offset|length|file_sha1|data_sha1
1|1437709276|0|3397|da39a3ee5e6b4b0d3255bfef95601890afd80709|6836d0110252d08a0656c14c2d2d314124755491
3|1437709355|1977|2129|fee183c329c011ead7757f59182116500776eaaf|a5677cfa1f5f7b627763652f4bb9b99f5970748c
4|1437709425|2746|1719|3d9f02135bf964ff0b6a917921b862c3420e48f0|7b64ec321457715ee61fe238f178f5d72adaef64
5|1437709508|3589|2890|0cee599b1573110fee428f8323690cbcb9589661|90d104346ef3cba9e419461dd26045035f4cba02

remember: a single APPLY MESSAGE line can contain many messages!

thoughts:
* need a heuristic for quickly determining whether a backup needs to be compacted
  -> sum(chunks to discard, chunks to combine, chunks to split) > threshold
  -> can we detect chunks that are going to significantly reduce in size as a
     result of discarding individual lines?
* "quick" vs "full" compaction

settings:

* backup retention period
* chunk combination size (byte length or elapsed time)

combining chunks:
* size threshold below which adjacent chunks can be joined
* size threshold above which chunks should be split
* duration threshold below which adjacent chunks can be joined
* duration threshold above which chunks should be split
backup_min_chunk_size: 0 for no minimum
backup_max_chunk_size: 0 for no maximum
backup_min_chunk_duration: 0 for no minimum
backup_max_chunk_duration: 0 for no maximum
priority: size or duration??
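The join side of those thresholds might look something like the sketch below. This is only a shape for the decision, not an implementation: the function and tuple layout are hypothetical, and the four `backup_*_chunk_*` settings are modelled as keyword arguments with 0 meaning "no limit", as in the list above.

```python
# Sketch: should two adjacent chunks be combined during compaction?
# Hypothetical helper; chunk tuples are (length, start_ts, end_ts)
# as they might be read from the chunk index. 0 means "no limit".

def should_join(chunk_a, chunk_b,
                min_size=0, max_size=0,
                min_duration=0, max_duration=0):
    len_a, start_a, end_a = chunk_a
    len_b, start_b, end_b = chunk_b

    combined_size = len_a + len_b
    combined_duration = end_b - start_a

    # only join if at least one chunk is under a configured minimum,
    # otherwise there's nothing to gain
    undersized = bool(
        (min_size and (len_a < min_size or len_b < min_size)) or
        (min_duration and ((end_a - start_a) < min_duration or
                           (end_b - start_b) < min_duration)))
    if not undersized:
        return False

    # joining must not push the result past a configured maximum
    if max_size and combined_size > max_size:
        return False
    if max_duration and combined_duration > max_duration:
        return False
    return True
```

The "priority: size or duration??" question shows up here as the case where a join is allowed by one pair of thresholds but forbidden by the other; the sketch resolves it conservatively by requiring both maximums to hold.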
data we absolutely need to keep:

* the most recent APPLY MAILBOX for each mailbox we're keeping (mailbox state)
* the APPLY MAILBOX containing the most recent RECORD for each message we're keeping (record state)
* the APPLY MESSAGE for each message we're keeping (message data)

data that we should practically keep:

* all APPLY MAILBOXes for a given mailbox from the chunk identified as its last
* all APPLY MAILBOXes containing a RECORD for a given message from the chunk identified as its last
* the APPLY MESSAGE for each message we're keeping

four kinds of compaction (probably at least two simultaneously):

* removing unused chunks
* combining adjacent chunks into a single chunk (for better gz compression)
* removing unused message lines from within a chunk (important after combining)
* removing unused messages from within a message line

"unused messages"
    messages for which all records have been expunged for longer
    than the retention period
"unused chunks"
    chunks which contain only unused messages

algorithm:

* open (and lock) backup and backup.new (or bail out)
* use backup index to identify chunks we still need
* create a chunk in backup.new
* foreach chunk we still need:
  * foreach line in the chunk:
    * next line if we don't need to keep it
    * create new line
    * foreach message in line:
      * if we still need the message, or if we're not doing message granularity:
        * add the message to the new line
    * write and index tmp line to backup.new
  * if the new chunk is big enough, or if we're not combining:
    * end chunk and start a new one
* end the new chunk
* rename backup -> backup.old, backup.new -> backup
* close (and unlock) backup.old and backup


command line locking utility
----------------------------

command line utility to lock a backup (for e.g. safely poking around in the
.index on a live system).
example failure:
$ ctl_backups lock -f /path/to/backup
* Trying to obtain lock on /path/to/backup...
NO some error
<EOF>

example success:
$ ctl_backups lock -f /path/to/backup
* Trying to obtain lock on /path/to/backup...
[potentially a delay here if we need to wait for another process to release the lock]
OK locked
[waits for its stdin to close, then unlocks and exits]

if you need to rummage around in backup.index, run this program in another
shell, do your work, then ^D it when you're finished.

you could also call this from e.g. perl over a bidirectional pipe - wait to
read "OK locked", then you've got your lock. close the pipe to unlock when
you're finished working. if you don't read "OK locked" before the pipe closes
then something went wrong and you didn't get the lock.

specify backups by -f filename, -m mailbox, -u userid
default run mode as above
-s to fork an sqlite of the index (and unlock when it exits)
-x to fork a command of your choosing (and unlock when it exits)


reconstruct
-----------

rebuilding backups.db from on-disk files

scan each backup partition for backup files:
 * skip timestamped files (i.e. backups from compact/reindex)
 * skip .old files (old backups from reindex)
 * .index files => skip???
 * skip unreadable files
 * skip empty files
 * skip directories etc

what's the correct procedure for repopulating a cyrus database?
keep a copy of the previous (presumably broken) one?

trim off mkstemp suffix (if any) to find userid
can we use a recognisable character to delimit the mkstemp suffix?

what if there's multiple backup files for a given userid? precedence?

verify found backups before recording. reindex?

locking? what if something has a filename and does stuff with it while
reconstruct runs?
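The filename-to-userid step above might be sketched like this. The `_XXXXXX` mkstemp suffix and the `_` delimiter are assumptions borrowed from the `<userid>_XXXXXX` naming sketched later under "installation instructions"; the helper name is hypothetical.

```python
# Sketch: recover a userid from a backup filename found during the scan,
# applying the skip rules above.  Assumes the (hypothetical)
# <userid>_XXXXXX naming with "_" delimiting the mkstemp suffix.
import os
import re

def userid_from_backup_name(filename):
    base = os.path.basename(filename)
    if base.endswith('.old') or base.endswith('.index'):
        return None                    # reindex leftovers / index files
    m = re.fullmatch(r'(.+)_[A-Za-z0-9]{6}', base)
    return m.group(1) if m else base   # no recognisable suffix: take as-is
```

The "recognisable character" question is real here: if userids may themselves contain `_`, a pattern like this is ambiguous, which is presumably why the notes ask for a dedicated delimiter.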
backupd always uses the db for opens, so as long as reconstruct keeps the db
locked while it works, the db won't clash. but backupd might have backups
still open from before reconstruct started, which it will write to quite
happily, even though reconstruct might decide that some other file is the
correct one for that user...

a backup server would generally be used only for backups, and sync_client
is quite resilient when the destination isn't there, so it's actually
no problem to just shut down cyrus while reconstruct runs. no outage to
user-facing services, just maybe some sync backlog to catch up on once
cyrus is restarted.


ctl_backups
-----------

sbin tool for mass backup/index/database operations

needs:
 * rebuild backups.db from disk contents
 * list backups/info
 * rename a backup
 * delete a backup
 * verify a backup (check all sha1's, not just most recent)

not sure if these should be included, or separate tools:
 * reindex a backup (or more)
 * compact a backup (or more)
 * lock a backup
 * some sort of rolling compaction?

usage:
    ctl_backups [options] reconstruct                         # reconstruct backups.db from disk files
    ctl_backups [options] list [list_opts] [[mode] backup...] # list backup info for given/all users
    ctl_backups [options] move new_fname [mode] backup        # rename a backup (think about this more)
    ctl_backups [options] delete [mode] backup                # delete a backup
    ctl_backups [options] verify [mode] backup...             # verify specified backups
    ctl_backups [options] reindex [mode] backup...            # reindex specified backups
    ctl_backups [options] compact [mode] backup...            # compact specified backups
    ctl_backups [options] lock [lock_opts] [mode] backup      # lock specified backup

options:
    -C alt_config   # alternate config file
    -F              # force (run command even if not needed)
    -S              # stop on error
    -v              # verbose
    -w              # wait for locks (i.e. don't skip locked backups)

mode:
    -A              # all known backups (not valid for single backup commands)
    -D              # specified backups interpreted as domains (nvfsbc)
    -P              # specified backups interpreted as userid prefixes (nvfsbc)
    -f              # specified backups interpreted as filenames
    -m              # specified backups interpreted as mboxnames
    -u              # specified backups interpreted as userids (default)

lock_opts:
    -c              # exclusively create backup
    -s              # lock backup and open index in sqlite
    -x cmd          # lock backup and execute cmd
    -p              # lock backup and wait for eof on stdin (default)

list_opts:
    -t [hours]      # "stale" (no update in hours) backups only (default: 24)


cyr_backup
----------

sbin tool for inspecting backups

needs:
 * better name?
 * list stuff
 * show stuff
 * dump stuff
 * restore?

* should lock/move/delete (single backup commands) from ctl_backups be moved here?

usage:
    cyr_backup [options] [mode] backup list [all | chunks | mailboxes | messages]...
    cyr_backup [options] [mode] backup show chunks [id...]
    cyr_backup [options] [mode] backup show messages [guid...]
    cyr_backup [options] [mode] backup show mailboxes [mboxname | uniqueid]...
    cyr_backup [options] [mode] backup dump [dump_opts] chunk id
    cyr_backup [options] [mode] backup dump [dump_opts] message guid
    cyr_backup [options] [mode] backup json [chunks | mailboxes | messages]...
options:
    -C alt_config   # alternate config file
    -v              # verbose

mode:
    -f              # backup interpreted as filename
    -m              # backup interpreted as mboxname
    -u              # backup interpreted as userid (default)

commands:
    list: table of contents, one per line
    show: indexed details of listed items, one per paragraph, detail per line
    dump: relevant contents from backup stream
    json: indexed details of listed items in json format

dump options:
    -o filename     # dump to named file instead of stdout


partitions
----------

not enough information in sync protocol to handle partitions easily?

we know what the partition is when we do an APPLY operation (mailbox, message,
etc), but the initial GET operations don't include it. so we need to already
know where the appropriate backup is partitioned in order to find the backup
file in order to look inside it to respond to the GET request

if we have a mailboxes database (indexed by mboxname, uniqueid and userid) then
maybe that would make it feasible? if it's not in the mailboxes database then
we don't have a backup for it yet, so we respond accordingly, and get sent
enough information to create it.

does that mean the backup api needs to take an mbname on open, and it handles
the job of looking it up in the mailboxes database to find the appropriate
thing to open?

can we use sqlite for such a database, or is the load on it going to be too
heavy? locking? we have lots of database formats up our sleeves here, so
even though we use sqlite for the backup index there isn't any particular
reason we're beholden to it for the mailboxes db too

if we have a mailboxes db then we need a reconstruct tool for that, too

what if we support multiple backup partitions, but don't expect these
to necessarily correspond with mailbox partitions. they're just for spreading
disk usage around.
* when creating a backup for a previously-unseen user we'd pick a random
  partition to put them on
* ctl_backups would need a command to move an existing backup to a
  given partition
* ctl_backups would need a command to pre-create a user backup on a
  given partition for initial distribution
* instead of a "backup_data_path" setting, have one-or-more
  "backuppartition-<name>" settings, a la partition- and friends

see imap/partlist.[ch] for partition list management stuff. it's complicated
and doesn't have a test suite, so maybe save this implementation until needed.

but... maybe rename backup_data_path to backuppartition-default in the
meantime, so that when we do add this it's not a complicated reconfig to
update?

partlist_local_select (and the lazy-loaded partlist_local_init) are where the
mailbox partitions come from (see also mboxlist_create_partition); do
something similar for backup partitions


data corruption
---------------

backups.db:
 * can be reconstructed from on-disk files at any time
 * how to detect corruption? does cyrus_db detect/repair on its own?

backup indexes:
 * can be reindexed at any time from backup data
 * how to detect corruption? assume sqlite will notice, complain?

backup data:
 * what's zlib's failure mode? do we lose the entire chunk or just the corrupt bit?
 * verify will notice sha1sum mismatches
 * dlist format will reject some kinds of corruption (but not all)
   * reindex: should skip unparseable dlist lines
 * message data has its own checksums (guid)
   * reindex: should skip messages that don't match their own checksums
 * compact: "full" compact will only keep useful data according to index
 * backupd: will sync anything that's in the user mailbox but not in the backup index

i think this means that if a message or mailbox state becomes corrupted in
the backup data file, and it still exists in the user's real mailbox, you
recover from the corruption by reindexing and then letting the sync process
copy the missing data back in again. and you can tidy up the data file by
running a compact over it.

you detect data corruption in the most recent chunk reactively, as soon as
the backup system needs to open it again (quick verify on open)

you detect data corruption in older chunks reactively by trying to restore
from them. this may be too late: if a message needs restoring, it's because
the user mailbox no longer has it

you detect data corruption preemptively by running the verify tool over it.
recommend scheduling this in EVENTS/cron?

if data corruption occurs in a message that's no longer in the user's
mailbox, that message is lost. it was going to be deleted from the backup
after the retention period anyway (by compact), but if it needs restoring in
the meantime, sorry


installation instructions
-------------------------

(obviously, most of this won't work at this point, because the code doesn't
exist. but this is, approximately, where things are heading.)
on your backup server:
 * compile with the --enable-backup configure option and install
 * imapd.conf:
     backuppartition-default: /var/spool/backup   # FIXME better example
     backup_db: twoskip
     backup_db_path: /var/imap/backups.db
     backup_staging_path: /var/spool/backup
     backup_retention_days: 7
 * cyrus.conf SERVICES:
     backupd cmd="backupd" listen="csync" prefork=0
   (remove other services, most likely)
   (should i create a master/conf/backup.conf example file?)
 * cyrus.conf EVENTS:
     compact cmd="ctl_backups compact -A" at=0400
 * start the server as usual
 * do i want a special port for backupd?

on your imap server:
 * imapd.conf:
     sync_log_channels: backup
     sync_log: 1
     backup_sync_host: backup-server.example.com
     backup_sync_port: csync
     backup_sync_authname: ...
     backup_sync_password: ...
     backup_sync_repeat_interval: ... # seconds, smaller value = livelier backups but more i/o
     backup_sync_shutdown_file: ....
 * cyrus.conf STARTUP:
     backup_sync cmd="sync_client -r -n backup"
 * cyrus.conf SERVICES:
     restored cmd="restored" [...]
 * start/restart master

files and such:
    {configdirectory}/backups.db                        - database mapping userids to backup locations
    {backuppartition-name}/<hash>/<userid>_XXXXXX       - backup data stream for userid
    {backuppartition-name}/<hash>/<userid>_XXXXXX.index - index into userid's backup data stream

do i want rhost in the path?
 * protects from issues if multiple servers are trying to back up their own version of the same user
   (though this is its own problem that the backup system shouldn't have to compensate for)
 * but makes the location of an undifferentiated user unpredictable
 * so probably not, actually


chatting about implementation 20/10
-----------------------------------

09:54 elliefm_
    here's a fun sync question
    APPLY MESSAGE provides a list of messages
    can a single APPLY MESSAGE contain messages for multiple mailboxes and/or users?
    my first hunch is that it doesn't cross users, since the broadest granularity for a single sync run is USER
10:06 kmurchison
    We'd have to check with Bron, but I *think* messages can cross mailboxes for a single user
10:06 brong_
    yes
    APPLY MESSAGE just adds it to the reserve list
10:07 elliefm_
    nah apply message uploads the message, APPLY RESERVE adds it to the reserve list :P
10:07 brong_
    same same
    APPLY RESERVE copies it from a local mailbox
    APPLY MESSAGE uploads it
10:07 elliefm_
    yep
10:07 brong_
    they both wind up in the reserve list
10:07 elliefm_
    ahh i see what you mean, gotcha
10:07 brong_
    until you send a RESTART
    ideally you want it reserved in the same partition, but it will copy the message over if it's not on the same partition
    there's no restriction on which mailbox it came from/went to
    good for user renames, and good for an append to a bunch of mailboxes in different users / shared space all at once
    (which LMTP can do)
10:10 elliefm_
    i can handle the case where a single APPLY MESSAGE contains messages for multiple mailboxes belonging to the same user
    but i'm in trouble if a single APPLY MESSAGE can contain messages belonging to different users
10:14 brong_
    elliefm_: why?
10:14 brong_
you don't have to keep them if they aren't used
10:15 elliefm_
for backups - when i see the apply, i need to know which user's backup to add it to. that's easy enough if it doesn't cross users but gets mega fiddly if it does
i'm poking around in sync client to see if it's likely to be an issue or not
11:00 brong__
elliefm_: I would stage it, and add it to users as it gets refcounted in by an index file
11:07 elliefm_
that's pretty much what we do for ordinary sync and delivery stuff yeah?
11:08 brong__
yep
and it's what the backup thing does
11:09 elliefm_
i'm pretty sure that APPLY RESERVE and APPLY MESSAGE don't give a damn about users, they're just "here's every message you might not have already had since last time we spoke" and it lets the APPLY MAILBOX work out where to attach them later
11:09 brong__
yep
11:09 elliefm_
so yeah, i'll need to do something here
i've been working so far on the idea that a single user's backup consists of 1) an append-only gzip stream of the sync protocol chat that built it, and 2) an index that tracks current state of mailboxes, and offsets within (1) of message data
that gets us good compression (file per user, not file per message), and if the index gets corrupted or lost, it's rebuildable purely from (1), it doesn't need a live copy of the original mailbox
11:12 brong_
yep, that all works
11:12 elliefm_
(so if you lose your imap server, you're not unable to rebuild a broken index on the backup)
11:13 brong_
it's easy enough to require the sync protocol stream to only contain messages per user
though "apply reserve" is messy
because you need to return "yes, I have that message"
11:13 elliefm_
with that implementation i can't (easily) keep user.a's messages from existing in user.b's data stream (though they won't be indexed)
11:14 brong_
I'm not too averse to the idea of just unpacking each message as it comes off the wire into a temporary directory
11:14 elliefm_
(because at the time i'm receiving the sync data i don't know which it needs to go in, so if they come in in the same reserve i'd need to append them to both data streams)
which isn't a huge problem, just… irks me a bit
11:14 brong_
and then reading the indexes as they come in, checking against the state DB to see if we already have them, and streaming them into the gzip if they aren't there yet
what we can do is something like the current format, where files go into a tar
11:16 elliefm_
i guess the fiddly bit there is that there's one more moving part to keep synchronised across failure states
a backup for a single user becomes 1) data stream + 2) any messages that were uploaded but not yet added to a mailbox + 3) index (which doesn't know what to do with (2))
which in the general case is fine, the next sync will update the mailboxes, which will push (2) into (1) and index it nicely, and on we go
but it's just a little bit more mess if there's a failure that you need to recover from between those states -- it's no longer a simple case of "it's in the backup and we know everything about it" or "it doesn't exist", there's a third case of "well we might have the data but don't really know what to do with it"
the other fiddly bit is that the process of appending to the data stream is suddenly in the business of crafting output rather than simply dumping what it gets, which isn't really burdensome, but it is one more little crack for bugs to crawl into
i guess in terms of sync protocol, one thing i could do on my end is identify apply operations that seem to contain multiple users' data, and just return an error on those. the sync client on the other end will promote them until they're eventually user syncs, which i think are always user granularity
11:50 elliefm_
i think for now, first stage implementation will be to stream the reserve/message commands in full to every user backup they might apply to. and optimising that down so that each stream only contains messages belonging to that user can be a future optimisation


todo list
---------

* clean up error handling
* perl tool to anonymise sync proto talk
* verification step to check entire data stream for errors (even chunks that aren't indexed)
* prot_fill_cb: extra argument to pass back an error string to prot_fill
* ctl_backups verify: set level
* backupd: don't block on locked backups, return mailbox locked -- but sync_client doesn't handle this
* test multiple backup partitions
* configure: error if backups requested and we don't have zlib
* valgrind
* finish reconstruct
* compact: split before append?

compact implementation steps:
  1 remove unused chunks, keep everything else as is
  2 join adjacent chunks if small enough, split large chunks
  3 parse/rebuild message lines
  4 discard unused mailbox lines
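
The append-only-stream-plus-index design discussed in the chat above could be modelled like this. It is a toy sketch, not the real on-disk format: the class and field names are invented, the index lives in memory instead of a database, and each chunk is framed as its own gzip member (which is one way to get both random access via recorded offsets and index-free recovery, since concatenated gzip members decompress as a single stream).

```python
import gzip
import hashlib
import os

class BackupStream:
    """Toy model of a per-user backup: an append-only file of
    individually-gzipped chunks plus an index of offsets, loosely
    mirroring the chunk table's offset/length/data_sha1 columns."""

    def __init__(self, path):
        self.path = path
        self.index = []   # one dict per chunk, in append order

    def append_chunk(self, data: bytes):
        """Compress a chunk of sync-protocol chat and append it as a
        fresh gzip member; record where it landed."""
        offset = os.path.getsize(self.path) if os.path.exists(self.path) else 0
        compressed = gzip.compress(data)
        with open(self.path, 'ab') as f:   # append-only: never rewrite
            f.write(compressed)
        self.index.append({
            'offset': offset,
            'length': len(compressed),
            'data_sha1': hashlib.sha1(data).hexdigest(),
        })

    def read_chunk(self, n):
        """Random access via the index: seek, read, decompress one member."""
        entry = self.index[n]
        with open(self.path, 'rb') as f:
            f.seek(entry['offset'])
            data = gzip.decompress(f.read(entry['length']))
        assert hashlib.sha1(data).hexdigest() == entry['data_sha1']
        return data

    def replay(self):
        """Index-free recovery: concatenated gzip members decompress as
        one stream, so a lost index can be rebuilt from the data alone."""
        with gzip.open(self.path, 'rb') as f:
            return f.read()
```

`replay()` is the property elliefm_ wants: even with the index gone, the whole stream is readable from the start, so the index is a cache over the data rather than a second source of truth.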