1.\"  Hey, Emacs, edit this file in -*- nroff-fill -*- mode
2.\"-
3.\" Copyright (c) 1997, 1998
4.\"	Nan Yang Computer Services Limited.  All rights reserved.
5.\"
6.\"  This software is distributed under the so-called ``Berkeley
7.\"  License'':
8.\"
9.\" Redistribution and use in source and binary forms, with or without
10.\" modification, are permitted provided that the following conditions
11.\" are met:
12.\" 1. Redistributions of source code must retain the above copyright
13.\"    notice, this list of conditions and the following disclaimer.
14.\" 2. Redistributions in binary form must reproduce the above copyright
15.\"    notice, this list of conditions and the following disclaimer in the
16.\"    documentation and/or other materials provided with the distribution.
17.\" 3. All advertising materials mentioning features or use of this software
18.\"    must display the following acknowledgement:
19.\"	This product includes software developed by Nan Yang Computer
20.\"      Services Limited.
21.\" 4. Neither the name of the Company nor the names of its contributors
22.\"    may be used to endorse or promote products derived from this software
23.\"    without specific prior written permission.
24.\"
25.\" This software is provided ``as is'', and any express or implied
26.\" warranties, including, but not limited to, the implied warranties of
27.\" merchantability and fitness for a particular purpose are disclaimed.
28.\" In no event shall the company or contributors be liable for any
29.\" direct, indirect, incidental, special, exemplary, or consequential
30.\" damages (including, but not limited to, procurement of substitute
31.\" goods or services; loss of use, data, or profits; or business
32.\" interruption) however caused and on any theory of liability, whether
33.\" in contract, strict liability, or tort (including negligence or
34.\" otherwise) arising in any way out of the use of this software, even if
35.\" advised of the possibility of such damage.
36.\"
37.\" $FreeBSD: src/share/man/man4/vinum.4,v 1.22.2.9 2002/04/22 08:19:35 kuriyama Exp $
38.\"
39.Dd December 12, 2014
40.Dt VINUM 4
41.Os
42.Sh NAME
43.Nm vinum
44.Nd Logical Volume Manager
45.Sh SYNOPSIS
46.Cd "pseudo-device vinum"
47.Sh DESCRIPTION
48.Nm
49is a logical volume manager inspired by, but not derived from, the Veritas
50Volume Manager.
51It provides the following features:
52.Bl -bullet
53.It
54It provides device-independent logical disks, called
55.Em volumes .
56Volumes are
57not restricted to the size of any disk on the system.
58.It
59The volumes consist of one or more
60.Em plexes ,
each of which contains the
62entire address space of a volume.
63This represents an implementation of RAID-1
64(mirroring).
65Multiple plexes can also be used for
66.\" XXX What about sparse plexes?  Do we want them?
67.Bl -bullet
68.It
69Increased read throughput.
70.Nm
71will read data from the least active disk, so if a volume has plexes on multiple
72disks, more data can be read in parallel.
73.Nm
74reads data from only one plex, but it writes data to all plexes.
75.It
76Increased reliability.
77By storing plexes on different disks, data will remain
78available even if one of the plexes becomes unavailable.
79In comparison with a
80RAID-5 plex (see below), using multiple plexes requires more storage space, but
81gives better performance, particularly in the case of a drive failure.
82.It
83Additional plexes can be used for on-line data reorganization.
84By attaching an
85additional plex and subsequently detaching one of the older plexes, data can be
86moved on-line without compromising access.
87.It
88An additional plex can be used to obtain a consistent dump of a file system.
89By
90attaching an additional plex and detaching at a specific time, the detached plex
91becomes an accurate snapshot of the file system at the time of detachment.
92.\" Make sure to flush!
93.El
94.It
95Each plex consists of one or more logical disk slices, called
96.Em subdisks .
A subdisk is defined as a contiguous block of physical disk storage.
98A plex may
99consist of any reasonable number of subdisks (in other words, the real limit is
100not the number, but other factors, such as memory and performance, associated
101with maintaining a large number of subdisks).
102.It
103A number of mappings between subdisks and plexes are available:
104.Bl -bullet
105.It
106.Em "Concatenated plexes"
107consist of one or more subdisks, each of which
108is mapped to a contiguous part of the plex address space.
109.It
110.Em "Striped plexes"
111consist of two or more subdisks of equal size.
The plex address space is mapped in
.Em stripes ,
each of which is an integral fraction of the subdisk size.
117Consecutive plex address space is mapped to stripes in each subdisk in
118turn.
119.if t \{\
120.ig
121.\" FIXME
122.br
123.ne 1.5i
124.PS
125move right 2i
126down
127SD0: box
128SD1: box
129SD2: box
130
131"plex 0" at SD0.n+(0,.2)
132"subdisk 0" rjust at SD0.w-(.2,0)
133"subdisk 1" rjust at SD1.w-(.2,0)
134"subdisk 2" rjust at SD2.w-(.2,0)
135.PE
136..
137.\}
138The subdisks of a striped plex must all be the same size.
139.It
140.Em "RAID-5 plexes"
141require at least three equal-sized subdisks.
142They
143resemble striped plexes, except that in each stripe, one subdisk stores parity
144information.
145This subdisk changes in each stripe: in the first stripe, it is the
146first subdisk, in the second it is the second subdisk, etc.
147In the event of a
148single disk failure,
149.Nm
150will recover the data based on the information stored on the remaining subdisks.
151This mapping is particularly suited to read-intensive access.
152The subdisks of a
153RAID-5 plex must all be the same size.
154.\" Make sure to flush!
155.El
156.It
157.Em Drives
158are the lowest level of the storage hierarchy.
159They represent disk special
160devices.
161.It
162.Nm
163offers automatic startup.
164Unlike
165.Ux
166file systems,
167.Nm
168volumes contain all the configuration information needed to ensure that they are
169started correctly when the subsystem is enabled.
170This is also a significant
171advantage over the Veritas\(tm File System.
This feature concerns only the presence
of the volumes.
174It does not mean that the volumes will be mounted
175automatically, since the standard startup procedures with
176.Pa /etc/fstab
177perform this function.
178.El
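.Pp
As a rough illustration of the striped and RAID-5 mappings described above,
the following
.Xr sh 1
sketch computes which subdisk holds a given plex block.
The function names are hypothetical and are not part of
.Nm ;
offsets and stripe sizes are in blocks, subdisks are numbered from 0, and the
parity rule is only the simple rotation described above, which may differ in
detail from the real implementation.
.Bd -literal -offset indent
#!/bin/sh
# Illustrative only: hypothetical helpers, not part of vinum.

# striped_map OFFSET STRIPESIZE NSUBDISKS
# Prints "subdisk offset-within-subdisk" for a plex block offset.
striped_map() {
        stripe=$(( $1 / $2 ))   # stripe containing the block
        sd=$(( stripe % $3 ))   # stripes go to each subdisk in turn
        row=$(( stripe / $3 ))  # completed rounds over all subdisks
        echo "$sd $(( row * $2 + $1 % $2 ))"
}

# raid5_parity_sd ROW NSUBDISKS
# Prints the subdisk assumed to hold parity in stripe row ROW.
raid5_parity_sd() {
        echo $(( $1 % $2 ))
}

striped_map 0 128 4     # prints "0 0"
striped_map 130 128 4   # prints "1 2"
striped_map 512 128 4   # prints "0 128"
raid5_parity_sd 4 3     # prints "1"
.Ed
.Pp
With a stripe size of 128 blocks and four subdisks, plex blocks 0, 128, 256
and 384 each start a stripe on successive subdisks, and block 512 wraps around
to the first subdisk at offset 128.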
179.Sh KERNEL CONFIGURATION
180.Nm
181is currently supplied as a KLD module, and does not require
182configuration.
183As with other klds, it is absolutely necessary to match the kld
184to the version of the operating system.
185Failure to do so will cause
186.Nm
187to issue an error message and terminate.
188.Pp
189It is possible to configure
190.Nm
191in the kernel, but this is not recommended.
192To do so, add this line to the
193kernel configuration file:
194.Pp
195.D1 Cd "pseudo-device vinum"
196.Ss Debug Options
197The current version of
198.Nm ,
199both the kernel module and the user program
200.Xr vinum 8 ,
201include significant debugging support.
202It is not recommended to remove
203this support at the moment, but if you do you must remove it from both the
204kernel and the user components.
205To do this, edit the files
206.Pa /usr/src/sbin/vinum/Makefile
207and
208.Pa /sys/dev/raid/vinum/Makefile
209and edit the
210.Va CFLAGS
211variable to remove the
212.Li -DVINUMDEBUG
213option.
214If you have
215configured
216.Nm
217into the kernel, either specify the line
218.Pp
219.D1 Cd "options VINUMDEBUG"
220.Pp
221in the kernel configuration file or remove the
222.Li -DVINUMDEBUG
223option from
224.Pa /usr/src/sbin/vinum/Makefile
225as described above.
226.Pp
227If the
228.Va VINUMDEBUG
229variables do not match,
230.Xr vinum 8
231will fail with a message
232explaining the problem and what to do to correct it.
233.Pp
234.Nm
235was previously available in two versions: a freely available version which did
236not contain RAID-5 functionality, and a full version including RAID-5
237functionality, which was available only from Cybernet Systems Inc.
238The present
239version of
240.Nm
241includes the RAID-5 functionality.
242.Sh RUNNING VINUM
243.Nm
244is part of the base
245.Dx
246system.
247It does not require installation.
248To start it, start the
249.Xr vinum 8
250program, which will load the kld if it is not already present.
Before
.Nm
can be used, it must be configured.
254See
255.Xr vinum 8
256for information on how to create a
257.Nm
258configuration.
259.Pp
260Normally, you start a configured version of
261.Nm
262at boot time.
263Set the variable
264.Va start_vinum
265in
266.Pa /etc/rc.conf
267to
268.Dq Li YES
269to start
270.Nm
271at boot time.
272(See
273.Xr rc.conf 5
274for more details.)
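.Pp
For example, the corresponding entry in
.Pa /etc/rc.conf
would look like this:
.Bd -literal -offset indent
start_vinum="YES"
.Ed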
275.Pp
276If
277.Nm
278is loaded as a kld (the recommended way), the
279.Nm Cm stop
280command will unload it
281(see
282.Xr vinum 8 ) .
283You can also do this with the
284.Xr kldunload 8
285command.
286.Pp
287The kld can only be unloaded when idle, in other words when no volumes are
288mounted and no other instances of the
289.Xr vinum 8
290program are active.
291Unloading the kld does not harm the data in the volumes.
292.Ss Configuring and Starting Objects
293Use the
294.Xr vinum 8
295utility to configure and start
296.Nm
297objects.
298.Sh IOCTL CALLS
299.Xr ioctl 2
300calls are intended for the use of the
301.Xr vinum 8
302configuration program only.
303They are described in the header file
304.Pa /sys/dev/raid/vinum/vinumio.h .
305.Ss Disk Labels
306Conventional disk special devices have a
307.Em "disk label"
308in the second sector of the device.
309See
310.Xr disklabel 5
311for more details.
312This disk label describes the layout of the partitions within
313the device.
314.Nm
315does not subdivide volumes, so volumes do not contain a physical disk label.
316For convenience,
317.Nm
318implements the ioctl calls
319.Dv DIOCGDINFO
320(get disk label),
321.Dv DIOCGPART
322(get partition information),
323.Dv DIOCWDINFO
324(write partition information) and
325.Dv DIOCSDINFO
326(set partition information).
327.Dv DIOCGDINFO
328and
329.Dv DIOCGPART
330refer to an internal
331representation of the disk label which is not present on the volume.
332As a
333result, the
334.Fl r
335option of
336.Xr disklabel 8 ,
337which reads the
338.Dq "raw disk" ,
339will fail.
340.Pp
341In general,
342.Xr disklabel 8
343serves no useful purpose on a
344.Nm
345volume.
346.Pp
347.Nm
348ignores the
349.Dv DIOCWDINFO
350and
.Dv DIOCSDINFO
ioctls, since there is nothing to change.
352As a result, any attempt to modify the disk label will be silently ignored.
353.Sh MAKING FILE SYSTEMS
354Since
355.Nm
356volumes do not contain partitions, the names do not need to conform to the
357standard rules for naming disk partitions.
358For a physical disk partition, the
359last letter of the device name specifies the partition identifier (a to p).
360.Nm
361volumes need not conform to this convention, but if they do not,
362.Xr newfs 8
363will complain that it cannot determine the partition.
364To solve this problem,
365use the
366.Fl v
367flag to
368.Xr newfs 8 .
369For example, if you have a volume
370.Pa concat ,
371use the following command to create a
372.Xr UFS 5
373file system on it:
374.Pp
375.Dl "newfs -v /dev/vinum/concat"
376.Sh OBJECT NAMING
377.Nm
378assigns default names to plexes and subdisks, although they may be overridden.
379We do not recommend overriding the default names.
380Experience with the
381Veritas\(tm
382volume manager, which allows arbitrary naming of objects, has shown that this
383flexibility does not bring a significant advantage, and it can cause confusion.
384.Pp
385Names may contain any non-blank character, but it is recommended to restrict
them to letters, digits and the underscore character.
The names of volumes,
plexes and subdisks may be up to 64 characters long, and the names of drives may
be up to 32 characters long.
390When choosing volume and plex names, bear in mind
391that automatically generated plex and subdisk names are longer than the name
392from which they are derived.
393.Bl -bullet
394.It
395When
396.Nm
397creates or deletes objects, it creates a directory
398.Pa /dev/vinum ,
399in which it makes device entries for each volume.
400It also creates the
401subdirectories,
402.Pa /dev/vinum/plex
403and
404.Pa /dev/vinum/sd ,
in which it stores device entries for the plexes and subdisks.
In addition, it
creates two more directories,
407.Pa /dev/vinum/vol
408and
409.Pa /dev/vinum/drive ,
410in which it stores hierarchical information for volumes and drives.
411.It
412In addition,
413.Nm
414creates three super-devices,
415.Pa /dev/vinum/control ,
416.Pa /dev/vinum/Control
417and
418.Pa /dev/vinum/controld .
419.Pa /dev/vinum/control
420is used by
421.Xr vinum 8
422when it has been compiled without the
423.Dv VINUMDEBUG
424option,
425.Pa /dev/vinum/Control
426is used by
427.Xr vinum 8
428when it has been compiled with the
429.Dv VINUMDEBUG
430option, and
431.Pa /dev/vinum/controld
432is used by the
433.Nm
434daemon.
435The two control devices for
436.Xr vinum 8
437are used to synchronize the debug status of kernel and user modules.
438.It
439Unlike
440.Ux
441drives,
442.Nm
443volumes are not subdivided into partitions, and thus do not contain a disk
444label.
445Unfortunately, this confuses a number of utilities, notably
446.Xr newfs 8 ,
447which normally tries to interpret the last letter of a
448.Nm
449volume name as a partition identifier.
450If you use a volume name which does not
451end in the letters
452.Ql a
453to
454.Ql c ,
455you must use the
456.Fl v
457flag to
458.Xr newfs 8
459in order to tell it to ignore this convention.
460.\"
461.It
462Plexes do not need to be assigned explicit names.
463By default, a plex name is
464the name of the volume followed by the letters
465.Pa .p
466and the number of the
467plex.
468For example, the plexes of volume
469.Pa vol3
470are called
471.Pa vol3.p0 , vol3.p1
472and so on.
473These names can be overridden, but it is not recommended.
474.It
475Like plexes, subdisks are assigned names automatically, and explicit naming is
476discouraged.
477A subdisk name is the name of the plex followed by the letters
478.Pa .s
479and a number identifying the subdisk.
480For example, the subdisks of
481plex
482.Pa vol3.p0
483are called
484.Pa vol3.p0.s0 , vol3.p0.s1
485and so on.
486.It
487By contrast,
488.Em drives
489must be named.
490This makes it possible to move a drive to a different location
491and still recognize it automatically.
492Drive names may be up to 32 characters
493long.
494.El
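.Pp
The default naming scheme above can be sketched in
.Xr sh 1
as follows.
The helper names are hypothetical; only the
.Pa .p
and
.Pa .s
suffixes and the 64-character limit are taken from the rules above.
.Bd -literal -offset indent
#!/bin/sh
# Hypothetical helpers; vinum generates these names itself.
plex_name() { echo "$1.p$2"; }  # volume name, plex number
sd_name() { echo "$1.s$2"; }    # plex name, subdisk number

# Succeeds if a name fits the 64-character limit for
# volume, plex and subdisk names.
name_fits() { [ "${#1}" -le 64 ]; }

plex_name vol3 1                        # prints vol3.p1
sd=$(sd_name "$(plex_name vol3 0)" 1)   # vol3.p0.s1
name_fits "$sd" && echo "$sd fits"      # prints vol3.p0.s1 fits
.Ed
.Pp
Since each level appends a suffix, a volume name close to the limit can
produce plex or subdisk names that no longer fit.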
495.Ss Example
496Assume the
497.Nm
498objects described in the section
499.Sx "CONFIGURATION FILE"
500in
501.Xr vinum 8 .
502The directory
503.Pa /dev/vinum
504looks like:
505.Bd -literal -offset indent
506# ls -lR /dev/vinum
507total 5
508crwxr-xr--  1 root  wheel   91,   2 Mar 30 16:08 concat
509crwx------  1 root  wheel   91, 0x40000000 Mar 30 16:08 control
510crwx------  1 root  wheel   91, 0x40000001 Mar 30 16:08 controld
511drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 drive
512drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 plex
513drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 rvol
514drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 sd
515crwxr-xr--  1 root  wheel   91,   3 Mar 30 16:08 strcon
516crwxr-xr--  1 root  wheel   91,   1 Mar 30 16:08 stripe
517crwxr-xr--  1 root  wheel   91,   0 Mar 30 16:08 tinyvol
518drwxrwxrwx  7 root  wheel       512 Mar 30 16:08 vol
519crwxr-xr--  1 root  wheel   91,   4 Mar 30 16:08 vol5
520
521/dev/vinum/drive:
522total 0
523crw-r-----  1 root  operator    4,  15 Oct 21 16:51 drive2
524crw-r-----  1 root  operator    4,  31 Oct 21 16:51 drive4
525
526/dev/vinum/plex:
527total 0
528crwxr-xr--  1 root  wheel   91, 0x10000002 Mar 30 16:08 concat.p0
529crwxr-xr--  1 root  wheel   91, 0x10010002 Mar 30 16:08 concat.p1
530crwxr-xr--  1 root  wheel   91, 0x10000003 Mar 30 16:08 strcon.p0
531crwxr-xr--  1 root  wheel   91, 0x10010003 Mar 30 16:08 strcon.p1
532crwxr-xr--  1 root  wheel   91, 0x10000001 Mar 30 16:08 stripe.p0
533crwxr-xr--  1 root  wheel   91, 0x10000000 Mar 30 16:08 tinyvol.p0
534crwxr-xr--  1 root  wheel   91, 0x10000004 Mar 30 16:08 vol5.p0
535crwxr-xr--  1 root  wheel   91, 0x10010004 Mar 30 16:08 vol5.p1
536
537/dev/vinum/sd:
538total 0
539crwxr-xr--  1 root  wheel   91, 0x20000002 Mar 30 16:08 concat.p0.s0
540crwxr-xr--  1 root  wheel   91, 0x20100002 Mar 30 16:08 concat.p0.s1
541crwxr-xr--  1 root  wheel   91, 0x20010002 Mar 30 16:08 concat.p1.s0
542crwxr-xr--  1 root  wheel   91, 0x20000003 Mar 30 16:08 strcon.p0.s0
543crwxr-xr--  1 root  wheel   91, 0x20100003 Mar 30 16:08 strcon.p0.s1
544crwxr-xr--  1 root  wheel   91, 0x20010003 Mar 30 16:08 strcon.p1.s0
545crwxr-xr--  1 root  wheel   91, 0x20110003 Mar 30 16:08 strcon.p1.s1
546crwxr-xr--  1 root  wheel   91, 0x20000001 Mar 30 16:08 stripe.p0.s0
547crwxr-xr--  1 root  wheel   91, 0x20100001 Mar 30 16:08 stripe.p0.s1
548crwxr-xr--  1 root  wheel   91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
549crwxr-xr--  1 root  wheel   91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
550crwxr-xr--  1 root  wheel   91, 0x20000004 Mar 30 16:08 vol5.p0.s0
551crwxr-xr--  1 root  wheel   91, 0x20100004 Mar 30 16:08 vol5.p0.s1
552crwxr-xr--  1 root  wheel   91, 0x20010004 Mar 30 16:08 vol5.p1.s0
553crwxr-xr--  1 root  wheel   91, 0x20110004 Mar 30 16:08 vol5.p1.s1
554
555/dev/vinum/vol:
556total 5
557crwxr-xr--  1 root  wheel   91,   2 Mar 30 16:08 concat
558drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 concat.plex
559crwxr-xr--  1 root  wheel   91,   3 Mar 30 16:08 strcon
560drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 strcon.plex
561crwxr-xr--  1 root  wheel   91,   1 Mar 30 16:08 stripe
562drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 stripe.plex
563crwxr-xr--  1 root  wheel   91,   0 Mar 30 16:08 tinyvol
564drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 tinyvol.plex
565crwxr-xr--  1 root  wheel   91,   4 Mar 30 16:08 vol5
566drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 vol5.plex
567
568/dev/vinum/vol/concat.plex:
569total 2
570crwxr-xr--  1 root  wheel   91, 0x10000002 Mar 30 16:08 concat.p0
571drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p0.sd
572crwxr-xr--  1 root  wheel   91, 0x10010002 Mar 30 16:08 concat.p1
573drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p1.sd
574
575/dev/vinum/vol/concat.plex/concat.p0.sd:
576total 0
577crwxr-xr--  1 root  wheel   91, 0x20000002 Mar 30 16:08 concat.p0.s0
578crwxr-xr--  1 root  wheel   91, 0x20100002 Mar 30 16:08 concat.p0.s1
579
580/dev/vinum/vol/concat.plex/concat.p1.sd:
581total 0
582crwxr-xr--  1 root  wheel   91, 0x20010002 Mar 30 16:08 concat.p1.s0
583
584/dev/vinum/vol/strcon.plex:
585total 2
586crwxr-xr--  1 root  wheel   91, 0x10000003 Mar 30 16:08 strcon.p0
587drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p0.sd
588crwxr-xr--  1 root  wheel   91, 0x10010003 Mar 30 16:08 strcon.p1
589drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p1.sd
590
591/dev/vinum/vol/strcon.plex/strcon.p0.sd:
592total 0
593crwxr-xr--  1 root  wheel   91, 0x20000003 Mar 30 16:08 strcon.p0.s0
594crwxr-xr--  1 root  wheel   91, 0x20100003 Mar 30 16:08 strcon.p0.s1
595
596/dev/vinum/vol/strcon.plex/strcon.p1.sd:
597total 0
598crwxr-xr--  1 root  wheel   91, 0x20010003 Mar 30 16:08 strcon.p1.s0
599crwxr-xr--  1 root  wheel   91, 0x20110003 Mar 30 16:08 strcon.p1.s1
600
601/dev/vinum/vol/stripe.plex:
602total 1
603crwxr-xr--  1 root  wheel   91, 0x10000001 Mar 30 16:08 stripe.p0
604drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 stripe.p0.sd
605
606/dev/vinum/vol/stripe.plex/stripe.p0.sd:
607total 0
608crwxr-xr--  1 root  wheel   91, 0x20000001 Mar 30 16:08 stripe.p0.s0
609crwxr-xr--  1 root  wheel   91, 0x20100001 Mar 30 16:08 stripe.p0.s1
610
611/dev/vinum/vol/tinyvol.plex:
612total 1
613crwxr-xr--  1 root  wheel   91, 0x10000000 Mar 30 16:08 tinyvol.p0
614drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 tinyvol.p0.sd
615
616/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
617total 0
618crwxr-xr--  1 root  wheel   91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
619crwxr-xr--  1 root  wheel   91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
620
621/dev/vinum/vol/vol5.plex:
622total 2
623crwxr-xr--  1 root  wheel   91, 0x10000004 Mar 30 16:08 vol5.p0
624drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p0.sd
625crwxr-xr--  1 root  wheel   91, 0x10010004 Mar 30 16:08 vol5.p1
626drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p1.sd
627
628/dev/vinum/vol/vol5.plex/vol5.p0.sd:
629total 0
630crwxr-xr--  1 root  wheel   91, 0x20000004 Mar 30 16:08 vol5.p0.s0
631crwxr-xr--  1 root  wheel   91, 0x20100004 Mar 30 16:08 vol5.p0.s1
632
633/dev/vinum/vol/vol5.plex/vol5.p1.sd:
634total 0
635crwxr-xr--  1 root  wheel   91, 0x20010004 Mar 30 16:08 vol5.p1.s0
636crwxr-xr--  1 root  wheel   91, 0x20110004 Mar 30 16:08 vol5.p1.s1
637.Ed
638.Pp
639In the case of unattached plexes and subdisks, the naming is reversed.
640Subdisks
641are named after the disk on which they are located, and plexes are named after
642the subdisk.
643.\" XXX
644.Bf -symbolic
645This mapping is still to be determined.
646.Ef
647.Ss Object States
648Each
649.Nm
650object has a
651.Em state
652associated with it.
653.Nm
654uses this state to determine the handling of the object.
655.Ss Volume States
656Volumes may have the following states:
657.Bl -hang -width 14n
658.It Em down
659The volume is completely inaccessible.
660.It Em up
661The volume is up and at least partially functional.
662Not all plexes may be
663available.
664.El
665.Ss "Plex States"
666Plexes may have the following states:
667.Bl -hang -width 14n
668.It Em referenced
669A plex entry which has been referenced as part of a volume, but which is
670currently not known.
671.It Em faulty
672A plex which has gone completely down because of I/O errors.
673.It Em down
674A plex which has been taken down by the administrator.
675.It Em initializing
676A plex which is being initialized.
677.El
678.Pp
679The remaining states represent plexes which are at least partially up.
680.Bl -hang -width 14n
681.It Em corrupt
682A plex entry which is at least partially up.
683Not all subdisks are available,
684and an inconsistency has occurred.
685If no other plex is uncorrupted, the volume
686is no longer consistent.
687.It Em degraded
688A RAID-5 plex entry which is accessible, but one subdisk is down, requiring
689recovery for many I/O requests.
690.It Em flaky
691A plex which is really up, but which has a reborn subdisk which we do not
692completely trust, and which we do not want to read if we can avoid it.
693.It Em up
694A plex entry which is completely up.
695All subdisks are up.
696.El
697.Ss "Subdisk States"
698Subdisks can have the following states:
699.Bl -hang -width 14n
700.It Em empty
701A subdisk entry which has been created completely.
702All fields are correct, and
the disk has been updated, but the data on the disk is not valid.
704.It Em referenced
705A subdisk entry which has been referenced as part of a plex, but which is
706currently not known.
707.It Em initializing
708A subdisk entry which has been created completely and which is currently being
709initialized.
710.El
711.Pp
712The following states represent invalid data.
713.Bl -hang -width 14n
714.It Em obsolete
715A subdisk entry which has been created completely.
716All fields are correct, the
717config on disk has been updated, and the data was valid, but since then the
718drive has been taken down, and as a result updates have been missed.
719.It Em stale
720A subdisk entry which has been created completely.
721All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has
crashed and updates have been lost.
724.El
725.Pp
726The following states represent valid, inaccessible data.
727.Bl -hang -width 14n
728.It Em crashed
729A subdisk entry which has been created completely.
730All fields are correct, the
731disk has been updated, and the data was valid, but since then the drive has gone
732down.
733No attempt has been made to write to the subdisk since the crash, so the
734data is valid.
735.It Em down
736A subdisk entry which was up, which contained valid data, and which was taken
737down by the administrator.
738The data is valid.
739.It Em reviving
740The subdisk is currently in the process of being revived.
741We can write but not
742read.
743.El
744.Pp
745The following states represent accessible subdisks with valid data.
746.Bl -hang -width 14n
747.It Em reborn
748A subdisk entry which has been created completely.
749All fields are correct, the
750disk has been updated, and the data was valid, but since then the drive has gone
751down and up again.
752No updates were lost, but it is possible that the subdisk
753has been damaged.
754We won't read from this subdisk if we have a choice.
755If this
756is the only subdisk which covers this address space in the plex, we set its
757state to up under these circumstances, so this status implies that there is
758another subdisk to fulfil the request.
759.It Em up
760A subdisk entry which has been created completely.
761All fields are correct, the
762disk has been updated, and the data is valid.
763.El
764.Ss "Drive States"
765Drives can have the following states:
766.Bl -hang -width 14n
767.It Em referenced
768At least one subdisk refers to the drive, but it is not currently accessible to
769the system.
770No device name is known.
771.It Em down
772The drive is not accessible.
773.It Em up
774The drive is up and running.
775.El
776.Sh DEBUGGING PROBLEMS WITH VINUM
777Solving problems with
778.Nm
779can be a difficult affair.
780This section suggests some approaches.
781.Ss Configuration problems
782It is relatively easy (too easy) to run into problems with the
783.Nm
784configuration.
785If you do, the first thing you should do is stop configuration
786updates:
787.Pp
788.Dl "vinum setdaemon 4"
789.Pp
790This will stop updates and any further corruption of the on-disk configuration.
791.Pp
792Next, look at the on-disk configuration with the
793.Nm Cm dumpconfig
794command, for example:
795.if t .ps -3
796.if t .vs -3
797.Bd -literal
798# \fBvinum dumpconfig\fP
799Drive 4:        Device /dev/da3s0h
800                Created on crash.lemis.com at Sat May 20 16:32:44 2000
801                Config last updated Sat May 20 16:32:56 2000
802                Size:        601052160 bytes (573 MB)
803volume obj state up
804volume src state up
805volume raid state down
806volume r state down
807volume foo state up
808plex name obj.p0 state corrupt org concat vol obj
809plex name obj.p1 state corrupt org striped 128b vol obj
810plex name src.p0 state corrupt org striped 128b vol src
811plex name src.p1 state up org concat vol src
812plex name raid.p0 state faulty org disorg vol raid
813plex name r.p0 state faulty org disorg vol r
814plex name foo.p0 state up org concat vol foo
815plex name foo.p1 state faulty org concat vol foo
816sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
817sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
818sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
819sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
820sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
821sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b
822.Ed
823.if t .vs +3
824.if t .ps +3
825.Pp
826The configuration on all disks should be the same.
827If this is not the case,
828please save the output to a file and report the problem.
829There is probably
830little that can be done to recover the on-disk configuration, but if you keep a
831copy of the files used to create the objects, you should be able to re-create
832them.
833The
834.Cm create
835command does not change the subdisk data, so this will not cause data
836corruption.
837You may need to use the
838.Cm resetconfig
839command if you have this kind of trouble.
840.Ss Kernel Panics
841In order to analyse a panic which you suspect comes from
842.Nm
843you will need to build a debug kernel.
844See the online handbook at
845.Pa http://www.dragonflybsd.org/docs/user/list/DebugKernelCrashDumps/
846for more details of how to do this.
847.Pp
848Perform the following steps to analyse a
849.Nm
850problem:
851.Bl -enum
852.It
853Copy the following files to the directory in which you will be
854performing the analysis, typically
855.Pa /var/crash :
856.Pp
857.Bl -bullet -compact
858.It
859.Pa /sys/dev/raid/vinum/.gdbinit.crash ,
860.It
861.Pa /sys/dev/raid/vinum/.gdbinit.kernel ,
862.It
863.Pa /sys/dev/raid/vinum/.gdbinit.serial ,
864.It
865.Pa /sys/dev/raid/vinum/.gdbinit.vinum
866and
867.It
868.Pa /sys/dev/raid/vinum/.gdbinit.vinum.paths
869.El
870.It
871Make sure that you build the
872.Nm
873module with debugging information.
874The standard
875.Pa Makefile
876builds a module with debugging symbols by default.
877If the version of
878.Nm
879in
880.Pa /boot/kernel
881does not contain symbols, you will not get an error message, but the stack trace
882will not show the symbols.
883Check the module before starting
884.Xr kgdb 1 :
885.Bd -literal
886$ file /boot/kernel/vinum.ko
887/boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
888  version 1 (SYSV), dynamically linked, not stripped
889.Ed
890.Pp
891If the output shows that
892.Pa /boot/kernel/vinum.ko
893is stripped, you will have to find a version which is not.
894Usually this will be
895either in
896.Pa /usr/obj/usr/src/sys/SYSTEM_NAME/usr/src/sys/dev/raid/vinum/vinum.ko
897(if you have built
898.Nm
899with a
900.Dq Li "make world" )
901or
902.Pa /sys/dev/raid/vinum/vinum.ko
903(if you have built
904.Nm
905in this directory).
906Modify the file
907.Pa .gdbinit.vinum.paths
908accordingly.
909.It
910Either take a dump or use remote serial
911.Xr gdb 1
912to analyse the problem.
913To analyse a dump, say
914.Pa /var/crash/vmcore.5 ,
915link
916.Pa /var/crash/.gdbinit.crash
917to
918.Pa /var/crash/.gdbinit
919and enter:
920.Bd -literal -offset indent
921cd /var/crash
922kgdb kernel.debug vmcore.5
923.Ed
924.Pp
925This example assumes that you have installed the correct debug kernel at
926.Pa /var/crash/kernel.debug .
927If not, substitute the correct name of the debug kernel.
928.Pp
929To perform remote serial debugging,
930link
931.Pa /var/crash/.gdbinit.serial
932to
933.Pa /var/crash/.gdbinit
934and enter
935.Bd -literal -offset indent
936cd /var/crash
937kgdb kernel.debug
938.Ed
939.Pp
940In this case, the
941.Pa .gdbinit
942file performs the functions necessary to establish connection.
943The remote
944machine must already be in debug mode: enter the kernel debugger and select
945.Ic gdb
946(see
947.Xr ddb 4
for more details).
949The serial
950.Pa .gdbinit
951file expects the serial connection to run at 38400 bits per second; if you run
952at a different speed, edit the file accordingly (look for the
953.Va remotebaud
954specification).
955.Pp
956The following example shows a remote debugging session using the
957.Ic debug
958command of
959.Xr vinum 8 :
960.Bd -literal
961.if t .ps -3
962.if t .vs -3
963GDB 4.16 (i386-unknown-dragonfly), Copyright 1996 Free Software Foundation, Inc.
964Debugger (msg=0xf1093174 "vinum debug") at ../../i386/i386/db_interface.c:318
965318                 in_Debugger = 0;
966#1  0xf108d9bc in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6dedee0 "",
967    flag=0x3, p=0xf68b7940) at
968    /usr/src/sys/dev/raid/vinum/vinumioctl.c:102
969102             Debugger ("vinum debug");
970(kgdb) bt
971#0  Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
972#1  0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
973      flag=0x3, p=0xf688e6c0) at
974      /usr/src/sys/dev/raid/vinum/vinumioctl.c:109
975#2  0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
976#3  0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
977#4  0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
978#5  0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
979      p=0xf688e6c0) at vnode_if.h:395
980#6  0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
981#7  0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
982      tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
983      tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
984      tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
985      tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
986#8  0xf020a1fc in Xint0x80_syscall ()
987#9  0x804832d in ?? ()
988#10 0x80482ad in ?? ()
989#11 0x80480e9 in ?? ()
990.if t .vs
991.if t .ps
992.Ed
993.Pp
994When entering from the debugger, it is important that the source of frame 1
995(listed by the
996.Pa .gdbinit
997file at the top of the example) contains the text
998.Dq Li "Debugger (\*[q]vinum debug\*[q]);" .
999.Pp
1000This is an indication that the address specifications are correct.
1001If you get
1002some other output, your symbols and the kernel module are out of sync, and the
1003trace will be meaningless.
1004.El
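.Pp
The file preparation in the steps above, for the case of analysing a dump,
might look like the following sketch.
It assumes that a symbolic link is an acceptable way to link the
.Pa .gdbinit
file, that the sources are available under
.Pa /sys ,
and that the debug kernel has been installed as
.Pa /var/crash/kernel.debug :
.Bd -literal -offset indent
cd /var/crash
for f in crash kernel serial vinum vinum.paths; do
        cp /sys/dev/raid/vinum/.gdbinit.$f .
done
ln -sf .gdbinit.crash .gdbinit
kgdb kernel.debug vmcore.5
.Ed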
1005.Pp
1006For an initial investigation, the most important information is the output of
1007the
1008.Ic bt
1009(backtrace) command above.
1010.Ss Reporting Problems with Vinum
1011If you find any bugs in
1012.Nm ,
1013please report them to
1014.An Greg Lehey Aq Mt grog@lemis.com .
1015Supply the following
1016information:
1017.Bl -bullet
1018.It
1019The output of the
1020.Nm Cm list
1021command
1022(see
1023.Xr vinum 8 ) .
1024.It
1025Any messages printed in
1026.Pa /var/log/messages .
1027All such messages will be identified by the text
1028.Dq Li vinum
1029at the beginning.
1030.It
1031If you have a panic, a stack trace as described above.
1032.El
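.Pp
The first two items might be collected as follows; the output file names are
arbitrary examples:
.Bd -literal -offset indent
vinum list > /var/tmp/vinum-list.txt
grep vinum /var/log/messages > /var/tmp/vinum-messages.txt
.Ed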
1033.Sh SEE ALSO
1034.Xr disklabel 5 ,
1035.Xr disklabel 8 ,
1036.Xr newfs 8 ,
1037.Xr vinum 8
1038.Sh HISTORY
1039.Nm
1040first appeared in
1041.Fx 3.0 .
1042The RAID-5 component of
1043.Nm
1044was developed by Cybernet Inc.\&
1045.Pq Pa http://www.cybernet.com/ ,
1046for its NetMAX product.
1047.Sh AUTHORS
1048.An Greg Lehey Aq Mt grog@lemis.com .
1049.Sh BUGS
1050.Nm
1051is a new product.
1052Bugs can be expected.
1053The configuration mechanism is not yet
1054fully functional.
1055If you have difficulties, please look at the section
1056.Sx "DEBUGGING PROBLEMS WITH VINUM"
1057before reporting problems.
1058.Pp
1059Kernels with the
1060.Nm
1061pseudo-device appear to work, but are not supported.
1062If you have trouble with
1063this configuration, please first replace the kernel with a
1064.No non- Ns Nm
1065kernel and test with the kld module.
1066.Pp
1067Detection of differences between the version of the kernel and the kld is not
1068yet implemented.
1069.Pp
1070The RAID-5 functionality is new in
1071.Fx 3.3 .
1072Some problems have been
1073reported with
1074.Nm
1075in combination with soft updates, but these are not reproducible on all
1076systems.
1077If you are planning to use
1078.Nm
1079in a production environment, please test carefully.
1080