$DragonFly: src/bin/cpdup/BACKUPS,v 1.4 2007/05/17 08:19:00 swildner Exp $

			    INCREMENTAL BACKUP HOWTO

    This document describes one of several ways to set up a LAN backup and
    an off-site WAN backup system using cpdup's hardlinking capabilities.

    The features described in this document are also encapsulated in scripts
    which can be found in the scripts/ directory.  These scripts can be used
    to automate all backup steps except for the initial preparation of the
    backup and off-site machines' directory topology.  Operation of these
    scripts is described in the last section of this document.


		    PART 1 - PREPARE THE LAN BACKUP BOX

    The easiest way to create a LAN backup box is to NFS mount all your
    backup clients onto the backup box.  It is also possible to use cpdup's
    remote host feature to access your client boxes but that requires root
    access to the client boxes and is not described here.

    Create a directory on the backup machine called /nfs, a subdirectory
    for each remote client, and subdirectories for each partition on each
    client.  Remember that cpdup does not cross mount points so you will
    need a mount for each partition you wish to back up.  For example:

	[ ON LAN BACKUP BOX ]

	mkdir /nfs
	mkdir /nfs/box1
	mkdir /nfs/box1/home
	mkdir /nfs/box1/var

    Before you actually do the NFS mount, create a dummy file for each
    mount point that can be used by scripts to detect when an NFS mount
    has not been done.  Once a filesystem is mounted it hides the dummy
    file, so a visible NOT_MOUNTED file means the mount is missing.
    Scripts can thus avoid a common failure scenario and not accidentally
    cpdup an empty mount point to the backup partition (destroying that
    day's backup in the process).

	touch /nfs/box1/home/NOT_MOUNTED
	touch /nfs/box1/var/NOT_MOUNTED
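
    A script can test for the dummy file before it runs cpdup.  The
    following is a minimal sketch in POSIX sh, not taken from the supplied
    scripts; the function name nfs_is_mounted is hypothetical.

```shell
# Hypothetical helper: succeed only if the directory exists and the
# NOT_MOUNTED dummy file is hidden, i.e. something is mounted over it.
nfs_is_mounted() {
    dir=$1
    [ -d "$dir" ] && [ ! -e "$dir/NOT_MOUNTED" ]
}

# Example use (paths as in this document):
#   nfs_is_mounted /nfs/box1/home || { echo "home not mounted" >&2; exit 1; }
```

    The check only looks at the dummy file, so it works the same way for
    any mount type.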

    Once the directory structure has been set up, do your NFS mounts and
    also add them to your fstab.  Since you will probably wind up with a
    lot of mounts it is a good idea to use 'ro,bg' (read-only, background
    mount) in the fstab entries.

	mount box1:/home /nfs/box1/home
	mount box1:/var /nfs/box1/var

    You should create a huge /backup partition on your backup machine which
    is capable of holding all your mirrors.  Create a subdirectory called
    /backup/mirrors in your huge backup partition.

	mount <huge_disk> /backup
	mkdir /backup/mirrors


			PART 2 - DOING A LEVEL 0 BACKUP

    (If you use the supplied scripts, a level 0 backup can be accomplished
    simply by running the 'do_mirror' script with an argument of 0).

    Create a level 0 backup using a standard cpdup with no special arguments
    other than -i0 -s0 (tell it not to ask questions and turn off the
    file-overwrite-with-directory safety feature).  Name the mirror with
    the date in a string-sortable format.

	set date = `date "+%Y%m%d"`
	mkdir /backup/mirrors/box1.${date}
	cpdup -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home
	cpdup -i0 -s0 /nfs/box1/var /backup/mirrors/box1.${date}/var

    Create a softlink to the most recently completed backup, which is your
    level 0 backup.  Remove the old link first: if the link already exists,
    'ln -sf' follows it and creates the new link inside the directory it
    points to instead of replacing it, and 'mv -f' has the same problem.
    'ln -shf' can replace the link in place but is not portable.  Use a
    relative link target so that readlink returns just the directory name,
    which is what the later steps expect.

	sync
	rm -f /backup/mirrors/box1
	ln -s box1.${date} /backup/mirrors/box1
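
    The link replacement can be wrapped in a small helper.  This is a
    sketch in POSIX sh under stated assumptions: the name
    update_latest_link and its arguments are hypothetical, and it uses a
    relative link target so that readlink prints only the mirror's
    directory name (the form the readlink comparisons in this document
    expect).

```shell
# Hypothetical helper: point <mirrors>/<client> at <mirrors>/<client>.<date>.
# The old link is removed first so that ln never follows it into the
# previous mirror directory.
update_latest_link() {
    mirrors=$1
    client=$2
    date=$3
    # Refuse to create a dangling link.
    [ -d "$mirrors/$client.$date" ] || return 1
    rm -f "$mirrors/$client" &&
    ln -s "$client.$date" "$mirrors/$client"
}

# Example use:
#   update_latest_link /backup/mirrors box1 20060915
```

    If <mirrors>/<client> is a real directory rather than a link, rm -f
    fails and the helper returns an error instead of touching it.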

			PART 3 - DO AN INCREMENTAL BACKUP

    An incremental backup is exactly the same as a level 0 backup EXCEPT
    you use the -H option to specify the location of the most recent
    completed backup.  We simply maintain the handy softlink pointing at
    the most recent completed backup and the cpdup required to do this
    becomes trivial.

    Each day's incremental backup will reproduce the ENTIRE directory topology
    for the client, but cpdup will hardlink files from the most recent backup
    instead of copying them and this is what saves you all the disk space.

	set date = `date "+%Y%m%d"`
	if ( "`readlink /backup/mirrors/box1`" == "box1.${date}" ) then
	    echo "silly boy, an incremental already exists for today"
	    exit 1
	endif
	mkdir /backup/mirrors/box1.${date}
	cpdup -H /backup/mirrors/box1 \
	      -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home
	cpdup -H /backup/mirrors/box1 \
	      -i0 -s0 /nfs/box1/var /backup/mirrors/box1.${date}/var

    Be sure to update your 'most recent backup' softlink, but only do it
    if the cpdup runs for all the partitions of that client have succeeded.
    That way the next incremental backup is always based on a complete
    mirror.

	rm -f /backup/mirrors/box1
	ln -s box1.${date} /backup/mirrors/box1
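
    The rule "move the link only when every partition succeeded" can be
    sketched as follows.  This is an illustration in POSIX sh, not the
    supplied do_mirror script.  The function name incremental_backup and
    its argument order are hypothetical, the cpdup options are the ones
    shown above, and a completed level 0 backup is assumed to exist.

```shell
# Hypothetical helper.
# Usage: incremental_backup <mirrors> <nfs> <client> <date> <partition>...
incremental_backup() {
    mirrors=$1
    nfs=$2
    client=$3
    date=$4
    shift 4
    # Refuse to base today's mirror on itself.
    if [ "$(readlink "$mirrors/$client")" = "$client.$date" ]; then
        echo "an incremental already exists for $date" >&2
        return 1
    fi
    mkdir -p "$mirrors/$client.$date" || return 1
    for part in "$@"; do
        # A visible NOT_MOUNTED file means the NFS mount is missing.
        if [ -e "$nfs/$client/$part/NOT_MOUNTED" ]; then
            echo "$nfs/$client/$part is not mounted" >&2
            return 1
        fi
        cpdup -H "$mirrors/$client" -i0 -s0 \
            "$nfs/$client/$part" "$mirrors/$client.$date/$part" || return 1
    done
    # Reached only if every cpdup run succeeded.
    rm -f "$mirrors/$client" &&
    ln -s "$client.$date" "$mirrors/$client"
}

# Example use:
#   incremental_backup /backup/mirrors /nfs box1 "$(date +%Y%m%d)" home var
```

    Any failure returns before the link is touched, so the link keeps
    pointing at the last complete mirror.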

    Since these backups are mirrors, locating a backup is as simple
    as cd'ing into the appropriate directory.  If your filesystem has a
    hardlink limit and cpdup hits it, cpdup will 'break' the hardlink
    and copy the file instead.  Generally speaking only a few special cases
    will hit the hardlink limit for a filesystem.  For example, the
    CVS/Root file in a checked out cvs repository is often hardlinked, and
    the sheer number of hardlinked 'Root' files multiplied by the number
    of backups can often hit the filesystem hardlink limit.

		    PART 4 - DO AN INCREMENTAL VERIFIED BACKUP

    Since your incremental backups use hardlinks heavily the actual file
    might exist on the physical /backup disk in only one place even though
    it may be present in dozens of daily mirrors.  To guard against
    hardlinking to a copy that has become corrupted, cpdup's -f option can
    be used in conjunction with -H to force cpdup to validate the contents
    of the file, even if all the stat info looks identical.

	cpdup -f -H /backup/mirrors/box1 ...

    You can create completely redundant (non-hardlink-dependent) backups
    by doing the equivalent of your level 0, i.e. not using -H.  However, I
    do NOT recommend that you do this very often (maybe once every 6 months
    at the most), because each mirror created this way will have a distinct
    copy of all the file data and you will quickly run out of space in your
    /backup partition.

		    MAINTENANCE OF THE "/backup" DIRECTORY

    Now, clearly you are going to run out of space in /backup if you keep
    doing this, but you may be surprised at just how many daily incrementals
    you can create before you fill up your /backup partition.

    If /backup becomes full, simply start rm -rf'ing older mirror directories
    until enough space is freed up.  You do not have to remove the oldest
    directory first.  In fact, you might want to keep it around and remove
    a day's backup here, a day's backup there, etc, until you free up enough
    space.  Every mirror is a complete tree, so removing one never damages
    the others; a file's data is freed only when the last mirror that
    hardlinks it is removed.
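
    Because the mirror names carry a string-sortable date, candidates for
    removal can be listed oldest first with a plain glob.  The sketch below
    is POSIX sh and is not the supplied do_cleanup script; the function
    name list_prune_candidates is hypothetical.  It never lists the mirror
    that the 'most recent backup' softlink points at.

```shell
# Hypothetical helper: print the dated mirrors of one client, oldest
# first, skipping the mirror the latest-backup softlink points at.
list_prune_candidates() {
    mirrors=$1
    client=$2
    latest=$(readlink "$mirrors/$client")
    latest=${latest##*/}          # tolerate an absolute link target
    # Pathname expansion is sorted, and YYYYMMDD sorts chronologically.
    for dir in "$mirrors/$client".*; do
        if [ -d "$dir" ] && [ "${dir##*/}" != "$latest" ]; then
            echo "${dir##*/}"
        fi
    done
}

# Example use: review the list, then remove mirrors by hand.
#   list_prune_candidates /backup/mirrors box1
```

    The helper only prints names; deciding which mirrors to rm -rf is left
    to the operator.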

				OFF-SITE BACKUPS

    Making an off-site backup involves similar methodology, but you use
    cpdup's remote host capability to generate the backup.  To avoid
    complications it is usually best to take a mirror already generated on
    your LAN backup box and copy that to the remote box.

    The remote backup box does not use NFS, so setup is trivial.  Just
    create your super-large /backup partition and mkdir /backup/mirrors.
    Your LAN backup box will need root access via ssh to your remote backup
    box.

    You can use the handy softlink to get the latest 'box1.date' mirror
    directory and since the mirror is all in one partition you can just
    cpdup the entire machine in one command.  Use the same dated directory
    name on the remote box, so:

	# latest will wind up something like 'box1.20060915'
	set latest = `readlink /backup/mirrors/box1`
	cpdup -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest

    As with your LAN backup, create a softlink on the remote backup box
    denoting the latest mirror for any given site, but only if the cpdup
    succeeded.

	if ( $status == 0 ) then
	    ssh remote.box -n \
		"rm -f /backup/mirrors/box1; ln -s $latest /backup/mirrors/box1"
	endif

    Incremental backups can be accomplished using the same cpdup command,
    but adding the -H option pointing at the latest backup on the remote
    box.  Note that the -H path is interpreted on the remote box, not on
    the LAN backup box you are running the command from.

	set latest = `readlink /backup/mirrors/box1`
	set remotelatest = `ssh remote.box -n "readlink /backup/mirrors/box1"`
	if ( "$latest" == "$remotelatest" ) then
	    echo "silly boy, you already made a remote incremental backup today"
	    exit 1
	endif
	cpdup -H /backup/mirrors/$remotelatest \
	      -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest
	if ( $status == 0 ) then
	    ssh remote.box -n \
		"rm -f /backup/mirrors/box1; ln -s $latest /backup/mirrors/box1"
	endif
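
    The same sequence can be sketched as one POSIX sh function.  This is
    an illustration only, not the supplied do_remote script.  The name
    offsite_incremental is hypothetical, the mirrors path is assumed to be
    the same on both machines, and both softlinks are assumed to hold a
    relative target such as 'box1.20060915'.

```shell
# Hypothetical helper.
# Usage: offsite_incremental <mirrors> <client> <remote-host>
offsite_incremental() {
    mirrors=$1
    client=$2
    remote=$3
    latest=$(readlink "$mirrors/$client") || return 1
    remotelatest=$(ssh -n "$remote" "readlink $mirrors/$client") || return 1
    if [ "$latest" = "$remotelatest" ]; then
        echo "remote already has $latest" >&2
        return 1
    fi
    # The -H path names a directory on the remote box.
    cpdup -H "$mirrors/$remotelatest" -i0 -s0 \
        "$mirrors/$latest" "$remote:$mirrors/$latest" || return 1
    # Move the remote link only after a successful copy.
    ssh -n "$remote" \
        "rm -f $mirrors/$client; ln -s $latest $mirrors/$client"
}

# Example use:
#   offsite_incremental /backup/mirrors box1 remote.box
```

    A failed cpdup returns early, so the remote link keeps naming the last
    mirror that was copied completely.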

    Cleaning out the remote directory works the same as cleaning out the LAN
    backup directory.


			    RESTORING FROM BACKUPS

    Each backup is a full filesystem mirror, and depending on how much space
    you have you should be able to restore it simply by cd'ing into the
    appropriate backup directory and using 'cpdup <dir> box1:<dir>' (assuming
    root access), or you can export the backup directory via NFS to your
    client boxes and use cpdup locally on the client to extract the backup.
    Using NFS is probably the most efficient solution.


			PUTTING IT ALL TOGETHER - SOME SCRIPTS

    Please refer to the scripts in the scripts/ subdirectory.  These scripts
    are EXAMPLES ONLY.  If you want to use them, put them in your ~root/adm
    directory on your backup box and set up a root crontab.

    First follow the preparation rules in PART 1 above.  The scripts do not
    do this automatically.  Edit the 'params' file that the scripts use
    to set default paths and such.

	** FOLLOW DIRECTIONS IN PART 1 ABOVE TO SET UP THE LAN BACKUP BOX **

    Copy the scripts to ~/adm.  Do NOT install a crontab yet (but an example
    can be found in scripts/crontab).

    Do a manual level 0 LAN BACKUP using the do_mirror script.

	cd ~/adm
	./do_mirror 0

    Once done you can do incremental backups using './do_mirror 1' to do a
    verified incremental, or './do_mirror 2' to do a stat-optimized
    incremental.  You can enable the cron jobs that run do_mirror and
    do_cleanup now.

    --

    Setting up an off-site backup box is trivial.  The off-site backup box
    needs to allow root ssh logins from the LAN backup box (at least for
    now, sorry!).  Set up the off-site backup directory, typically
    /backup/mirrors.  Then do a level 0 backup from your LAN backup box
    to the off-site box using the do_remote script.

	cd ~/adm
	./do_remote 0

    Once done you can do incremental backups using './do_remote 1' to do a
    verified incremental, or './do_remote 2' to do a stat-optimized
    incremental.  You can enable the cron jobs that run do_remote now.

    NOTE!  It is NOT recommended that you use verified-incremental backups
    over a WAN, as all related data must be copied over the wire every single
    day.  Instead, I recommend sticking with stat-optimized backups
    (./do_remote 2).

    You will also need to set up a daily cleaning script on the off-site
    backup box.

    SCRIPT TODOS - the ./do_cleanup script is not very smart.  We really
    should do a tower-of-hanoi removal.