==============================
Device-mapper snapshot support
==============================

Device-mapper allows you, without massive data copying:

-  To create snapshots of any block device, i.e. mountable, saved states of
   the block device which are also writable without interfering with the
   original content;
-  To create device "forks", i.e. multiple different versions of the
   same data stream;
-  To merge a snapshot of a block device back into the snapshot's origin
   device.

In the first two cases, dm copies only the chunks of data that get
changed and uses a separate copy-on-write (COW) block device for
storage.

For snapshot merge, the contents of the COW storage are merged back into
the origin device.


There are three dm targets available:
snapshot, snapshot-origin, and snapshot-merge.

-  snapshot-origin <origin>

Maps the <origin> block device, which will normally have one or more
snapshots based on it.
Reads will be mapped directly to the backing device. For each write, the
original data will be saved in the <COW device> of each snapshot to keep
its visible content unchanged, at least until the <COW device> fills up.


-  snapshot <origin> <COW device> <persistent?> <chunksize>
   [<# feature args> [<arg>]*]

A snapshot of the <origin> block device is created. Changed chunks of
<chunksize> sectors will be stored on the <COW device>.  Writes will
only go to the <COW device>.  Reads will come from the <COW device> or
from <origin> for unchanged data.  <COW device> will often be
smaller than the origin, and if it fills up the snapshot will become
useless and be disabled, returning errors.  It is therefore important to
monitor the amount of free space and expand the <COW device> before it
fills up.
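
A minimal sketch of such monitoring, assuming the snapshot status line ends
with the ``<sectors_allocated>/<total_sectors> <metadata_sectors>`` fields
described in the status section of this document.  The function name
``cow_usage_percent`` is a hypothetical helper chosen for illustration; it is
not part of dmsetup or LVM2:

```shell
#!/bin/sh
# cow_usage_percent STATUS_LINE  (hypothetical helper, not a dmsetup feature)
# Prints the integer percentage of the COW device in use, given one line of
# "dmsetup status" output for a snapshot target.  Returns non-zero when the
# line carries no numeric <allocated>/<total> field.
cow_usage_percent() {
    # Word-split the line: $1=start $2=length $3=target
    # $4=<sectors_allocated>/<total_sectors> $5=<metadata_sectors>
    set -- $1
    case "$4" in
        */*) ;;            # looks like <allocated>/<total>
        *) return 1 ;;     # anything else is not a usage figure
    esac
    allocated=${4%%/*}
    total=${4##*/}
    # Reject empty or non-numeric fields and a zero-sized COW device.
    case "$allocated" in ''|*[!0-9]*) return 1 ;; esac
    case "$total" in ''|*[!0-9]*) return 1 ;; esac
    [ "$total" -gt 0 ] || return 1
    echo $((allocated * 100 / total))
}

# Status line copied from the practical example in this document.
# Prints: 18
cow_usage_percent "0 8388608 snapshot 397896/2097152 1560"
```

On a live system the argument would come from something like
``dmsetup status volumeGroup-snap``.  The value 18 agrees with the 18.97
shown in the ``Snap%`` column of the lvs output in the practical example.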

<persistent?> is P (Persistent) or N (Not persistent - will not survive
a reboot).  O (Overflow) can be added as a persistent store option
to allow userspace to advertise its support for seeing "Overflow" in the
snapshot status.  So the supported store types are "P", "PO" and "N".
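
As an illustration, the following sketch assembles a snapshot table line and
rejects anything other than the three store types listed above.  The helper
name ``snapshot_table`` is hypothetical, and the device numbers in the
example call are borrowed from the LVM2 example later in this document:

```shell
#!/bin/sh
# snapshot_table LENGTH ORIGIN COW STORE CHUNKSIZE  (hypothetical helper)
# Prints a dm table line for the "snapshot" target, starting at sector 0.
# LENGTH and CHUNKSIZE are given in sectors.
snapshot_table() {
    if [ "$#" -ne 5 ]; then
        echo "usage: snapshot_table LENGTH ORIGIN COW STORE CHUNKSIZE" >&2
        return 2
    fi
    case "$4" in
        P|PO|N) ;;         # the supported store types
        *) echo "unsupported store type: $4" >&2; return 1 ;;
    esac
    echo "0 $1 snapshot $2 $3 $4 $5"
}

# Prints: 0 2097152 snapshot 254:11 254:12 P 16
snapshot_table 2097152 254:11 254:12 P 16
```

The printed line has the same shape as the ``volumeGroup-snap`` table shown
in the LVM2 section, and could be passed to ``dmsetup create`` through its
``--table`` option.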

The difference between persistent and transient snapshots is that
transient snapshots need less metadata saved on disk - it can be kept in
memory by the kernel.

When loading or unloading the snapshot target, the corresponding
snapshot-origin or snapshot-merge target must be suspended. A failure to
suspend the origin target could result in data corruption.
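
The ordering requirement can be sketched as a dry run.  The function below
only prints dmsetup commands instead of running them;
``print_snapshot_load_plan`` and the device names are hypothetical, and a
real setup (which LVM2 normally performs for you) involves more steps than
the three shown:

```shell
#!/bin/sh
# print_snapshot_load_plan ORIGIN_NAME SNAP_NAME SNAP_TABLE  (hypothetical)
# ORIGIN_NAME is the dm device carrying the snapshot-origin (or
# snapshot-merge) target.  Nothing is executed; the commands are printed so
# the ordering can be reviewed.
print_snapshot_load_plan() {
    echo "dmsetup suspend $1"               # quiesce the origin first
    echo "dmsetup create $2 --table '$3'"   # load the snapshot target
    echo "dmsetup resume $1"                # let origin I/O continue
}

print_snapshot_load_plan volumeGroup-base volumeGroup-snap \
    "0 2097152 snapshot 254:11 254:12 P 16"
```

The point of the sketch is only that the suspend comes before the snapshot
target is loaded and the resume comes after it.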

Optional features:

   discard_zeroes_cow - a discard issued to the snapshot device that
   maps to entire chunks will zero the corresponding exception(s) in
   the snapshot's exception store.

   discard_passdown_origin - a discard to the snapshot device is passed
   down to the snapshot-origin's underlying device.  This doesn't cause
   copy-out to the snapshot exception store because the snapshot-origin
   target is bypassed.

   The discard_passdown_origin feature depends on the discard_zeroes_cow
   feature being enabled.

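The feature arguments form the tail of the table line: a count followed by
the feature names.  A small sketch that builds this tail and enforces the
dependency described above; ``snapshot_feature_args`` is a hypothetical
helper name:

```shell
#!/bin/sh
# snapshot_feature_args [FEATURE...]  (hypothetical helper)
# Prints the "<# feature args> <arg>..." tail of a snapshot table line.
snapshot_feature_args() {
    zeroes=0
    passdown=0
    for f in "$@"; do
        case "$f" in
            discard_zeroes_cow) zeroes=1 ;;
            discard_passdown_origin) passdown=1 ;;
            *) echo "unknown feature: $f" >&2; return 1 ;;
        esac
    done
    # discard_passdown_origin is only valid together with discard_zeroes_cow.
    if [ "$passdown" -eq 1 ] && [ "$zeroes" -eq 0 ]; then
        echo "discard_passdown_origin requires discard_zeroes_cow" >&2
        return 1
    fi
    if [ "$#" -eq 0 ]; then
        echo "0"
    else
        echo "$# $*"
    fi
}

# Prints: 2 discard_zeroes_cow discard_passdown_origin
snapshot_feature_args discard_zeroes_cow discard_passdown_origin
```

Appended to the snapshot table line from the LVM2 example, this would give
``0 2097152 snapshot 254:11 254:12 P 16 2 discard_zeroes_cow
discard_passdown_origin``.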

-  snapshot-merge <origin> <COW device> <persistent> <chunksize>
   [<# feature args> [<arg>]*]

Takes the same table arguments as the snapshot target, except that it only
works with persistent snapshots.  This target assumes the role of the
"snapshot-origin" target and must not be loaded if the "snapshot-origin"
is still present for <origin>.

Creates a merging snapshot that takes control of the changed chunks
stored in the <COW device> of an existing snapshot, through a handover
procedure, and merges these chunks back into the <origin>.  Once merging
has started (in the background), the <origin> may be opened and the merge
will continue while I/O is flowing to it.  Changes to the <origin> are
deferred until the merging snapshot's corresponding chunk(s) have been
merged.  Once merging has started, the snapshot device associated with
the "snapshot" target will return -EIO when accessed.

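Because the table arguments are shared, a snapshot-merge table line can be
derived from an existing snapshot table line.  The sketch below does this on
plain strings; ``merge_table_from_snapshot`` is a hypothetical helper, and
treating "PO" as persistent is an assumption based on the store type
description above:

```shell
#!/bin/sh
# merge_table_from_snapshot SNAPSHOT_TABLE_LINE  (hypothetical helper)
# Prints the matching snapshot-merge table line, or fails for transient
# snapshots and for lines that are not snapshot tables.
merge_table_from_snapshot() {
    # $1=start $2=length $3=target $4=<origin> $5=<COW device>
    # $6=<persistent> $7=<chunksize>, followed by any feature args
    set -- $1
    if [ "$#" -lt 7 ] || [ "$3" != "snapshot" ]; then
        echo "not a snapshot table line" >&2
        return 1
    fi
    case "$6" in
        P|PO) ;;
        *) echo "snapshot-merge needs a persistent snapshot" >&2; return 1 ;;
    esac
    start=$1
    length=$2
    shift 3                # drop start, length and the old target name
    echo "$start $length snapshot-merge $*"
}

# Prints: 0 2097152 snapshot-merge 254:11 254:12 P 16
merge_table_from_snapshot "0 2097152 snapshot 254:11 254:12 P 16"
```

The output matches the ``volumeGroup-base`` table shown in the LVM2
snapshot-merge section of this document.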

How snapshot is used by LVM2
============================
When you create the first LVM2 snapshot of a volume, four dm devices are used:

1) a device containing the original mapping table of the source volume;
2) a device used as the <COW device>;
3) a "snapshot" device, combining #1 and #2, which is the visible snapshot
   volume;
4) the "original" volume (which uses the device number used by the original
   source volume), whose table is replaced by a "snapshot-origin" mapping
   from device #1.

A fixed naming scheme is used, so with the following commands::

  lvcreate -L 1G -n base volumeGroup
  lvcreate -L 100M --snapshot -n snap volumeGroup/base

we'll have this situation (with the volumes in the order listed above)::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
  volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
  volumeGroup-base: 0 2097152 snapshot-origin 254:11

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
  brw-------  1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
  brw-------  1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base

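The numbers in these tables can be reproduced with a little arithmetic,
assuming the usual 512-byte device-mapper sector.  ``mib_to_sectors`` is a
hypothetical helper; that the COW extent starts immediately after the base
volume on 8:19 is simply what this example's tables show, not a general
rule:

```shell
#!/bin/sh
# mib_to_sectors MIB  (hypothetical helper)
# Converts a size in MiB to 512-byte sectors.
mib_to_sectors() {
    echo $(($1 * 1024 * 1024 / 512))
}

base_sectors=$(mib_to_sectors 1024)   # lvcreate -L 1G
cow_sectors=$(mib_to_sectors 100)     # lvcreate -L 100M
base_start=384                        # offset of the -real mapping on 8:19
cow_start=$((base_start + base_sectors))
chunk_kib=$((16 * 512 / 1024))        # <chunksize> of 16 sectors

echo "base: $base_sectors sectors"    # base: 2097152 sectors
echo "cow: $cow_sectors sectors"      # cow: 204800 sectors
echo "cow offset: $cow_start"         # cow offset: 2097536
echo "chunk: $chunk_kib KiB"          # chunk: 8 KiB
```

These values match the lengths and offsets of the ``-real`` and ``-cow``
linear mappings above.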

How snapshot-merge is used by LVM2
==================================
A merging snapshot assumes the role of the "snapshot-origin" while
merging.  As such, the "snapshot-origin" is replaced with
"snapshot-merge".  The "-real" device is not changed and the "-cow"
device is renamed to <origin name>-cow to aid LVM2's cleanup of the
merging snapshot after it completes.  The "snapshot" that hands over its
COW device to the "snapshot-merge" is deactivated (unless using lvchange
--refresh); if it is left active, it will simply return I/O errors.

A snapshot will merge into its origin with the following command::

  lvconvert --merge volumeGroup/snap

After that, we'll have this situation::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-base-cow: 0 204800 linear 8:19 2097536
  volumeGroup-base: 0 2097152 snapshot-merge 254:11 254:12 P 16

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:16 /dev/mapper/volumeGroup-base-cow
  brw-------  1 root root 254, 10 29 ago 18:16 /dev/mapper/volumeGroup-base


How to determine when merging is complete
=========================================
The snapshot-merge and snapshot status lines end with::

  <sectors_allocated>/<total_sectors> <metadata_sectors>

Both <sectors_allocated> and <total_sectors> include both data and metadata.
During merging, the number of sectors allocated gets smaller and
smaller.  Merging has finished when the number of sectors holding data
is zero, in other words <sectors_allocated> == <metadata_sectors>.

Here is a practical example (using a hybrid of lvm and dmsetup commands)::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g
    snap    volumeGroup swi-a- 1.00g base  18.97

  # dmsetup status volumeGroup-snap
  0 8388608 snapshot 397896/2097152 1560
                                    ^^^^ metadata sectors

  # lvconvert --merge -b volumeGroup/snap
    Merging of volume snap started.

  # lvs volumeGroup/snap
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup Owi-a- 4.00g          17.23

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 281688/2097152 1104

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 180480/2097152 712

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 16/2097152 16

Merging has finished::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g

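The completion rule, <sectors_allocated> == <metadata_sectors>, can be
checked mechanically.  The sketch below works on a status line passed as a
string; ``merge_complete`` and its exit codes are hypothetical choices made
for this illustration:

```shell
#!/bin/sh
# merge_complete STATUS_LINE  (hypothetical helper)
# Exit status: 0 = merge finished, 1 = still merging,
#              2 = not a recognisable snapshot-merge status line.
merge_complete() {
    # $1=start $2=length $3=target
    # $4=<sectors_allocated>/<total_sectors> $5=<metadata_sectors>
    set -- $1
    [ "$3" = "snapshot-merge" ] || return 2
    case "$4" in
        */*) allocated=${4%%/*} ;;
        *) return 2 ;;
    esac
    metadata=$5
    case "$allocated" in ''|*[!0-9]*) return 2 ;; esac
    case "$metadata" in ''|*[!0-9]*) return 2 ;; esac
    # Finished when no allocated sector holds data any more.
    [ "$allocated" -eq "$metadata" ]
}

# Status lines copied from the example above.
for s in "0 8388608 snapshot-merge 281688/2097152 1104" \
         "0 8388608 snapshot-merge 16/2097152 16"; do
    if merge_complete "$s"; then
        echo "finished: $s"
    else
        echo "merging:  $s"
    fi
done
```

On a live system the argument would come from
``dmsetup status volumeGroup-base``.  A caller that polls should handle exit
status 2 separately from 1, so that an unexpected status line does not look
like a merge that is still in progress.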