.\"  Hey, Emacs, edit this file in -*- nroff-fill -*- mode
.\"-
.\" Copyright (c) 1997, 1998
.\"	Nan Yang Computer Services Limited.  All rights reserved.
.\"
.\"  This software is distributed under the so-called ``Berkeley
.\"  License'':
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by Nan Yang Computer
.\"      Services Limited.
.\" 4. Neither the name of the Company nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" This software is provided ``as is'', and any express or implied
.\" warranties, including, but not limited to, the implied warranties of
.\" merchantability and fitness for a particular purpose are disclaimed.
.\" In no event shall the company or contributors be liable for any
.\" direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute
.\" goods or services; loss of use, data, or profits; or business
.\" interruption) however caused and on any theory of liability, whether
.\" in contract, strict liability, or tort (including negligence or
.\" otherwise) arising in any way out of the use of this software, even if
.\" advised of the possibility of such damage.
.\"
.\" $FreeBSD: src/share/man/man4/vinum.4,v 1.22.2.9 2002/04/22 08:19:35 kuriyama Exp $
.\" $DragonFly: src/share/man/man4/vinum.4,v 1.9 2006/03/01 14:00:10 swildner Exp $
.\"
.Dd May 16, 2002
.Dt VINUM 4
.Os
.Sh NAME
.Nm vinum
.Nd Logical Volume Manager
.Sh SYNOPSIS
.Cd "pseudo-device vinum"
.Sh DESCRIPTION
.Nm
is a logical volume manager inspired by, but not derived from, the Veritas
Volume Manager.
It provides the following features:
.Bl -bullet
.It
It provides device-independent logical disks, called
.Em volumes .
Volumes are
not restricted to the size of any disk on the system.
.It
The volumes consist of one or more
.Em plexes ,
each of which contains the
entire address space of a volume.
This represents an implementation of RAID-1
(mirroring).
Multiple plexes can also be used for
.\" XXX What about sparse plexes?  Do we want them?
.Bl -bullet
.It
Increased read throughput.
.Nm
will read data from the least active disk, so if a volume has plexes on multiple
disks, more data can be read in parallel.
.Nm
reads data from only one plex, but it writes data to all plexes.
.It
Increased reliability.
By storing plexes on different disks, data will remain
available even if one of the plexes becomes unavailable.
In comparison with a
RAID-5 plex (see below), using multiple plexes requires more storage space, but
gives better performance, particularly in the case of a drive failure.
.It
Additional plexes can be used for on-line data reorganization.
By attaching an
additional plex and subsequently detaching one of the older plexes, data can be
moved on-line without compromising access.
.It
An additional plex can be used to obtain a consistent dump of a file system.
By
attaching an additional plex and detaching at a specific time, the detached plex
becomes an accurate snapshot of the file system at the time of detachment.
.\" Make sure to flush!
.El
.It
Each plex consists of one or more logical disk slices, called
.Em subdisks .
A subdisk is a contiguous block of physical disk storage.
A plex may
consist of any reasonable number of subdisks (in other words, the real limit is
not the number, but other factors, such as memory and performance, associated
with maintaining a large number of subdisks).
.It
A number of mappings between subdisks and plexes are available:
.Bl -bullet
.It
.Em "Concatenated plexes"
consist of one or more subdisks, each of which
is mapped to a contiguous part of the plex address space.
.It
.Em "Striped plexes"
consist of two or more subdisks of equal size.
The plex
address space is divided into
.Em stripes ,
each an integral fraction of the subdisk
size.
Consecutive stripes of the plex address space are mapped to each subdisk in
turn.
.if t \{\
.ig
.\" FIXME
.br
.ne 1.5i
.PS
move right 2i
down
SD0: box
SD1: box
SD2: box

"plex 0" at SD0.n+(0,.2)
"subdisk 0" rjust at SD0.w-(.2,0)
"subdisk 1" rjust at SD1.w-(.2,0)
"subdisk 2" rjust at SD2.w-(.2,0)
.PE
..
.\}
The subdisks of a striped plex must all be the same size.
.It
.Em "RAID-5 plexes"
require at least three equal-sized subdisks.
They
resemble striped plexes, except that in each stripe, one subdisk stores parity
information.
This subdisk changes in each stripe: in the first stripe, it is the
first subdisk, in the second it is the second subdisk, etc.
In the event of a
single disk failure,
.Nm
will recover the data based on the information stored on the remaining subdisks.
This mapping is particularly suited to read-intensive access.
The subdisks of a
RAID-5 plex must all be the same size.
.\" Make sure to flush!
.El
.It
.Em Drives
are the lowest level of the storage hierarchy.
They represent disk special
devices.
.It
.Nm
offers automatic startup.
Unlike
.Ux
file systems,
.Nm
volumes contain all the configuration information needed to ensure that they are
started correctly when the subsystem is enabled.
This is also a significant
advantage over the Veritas\(tm File System.
This feature concerns only the presence
of the volumes.
It does not mean that the volumes will be mounted
automatically, since the standard startup procedures with
.Pa /etc/fstab
perform this function.
.El
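.Pp
The hierarchy of drives, subdisks, plexes and volumes described above can be
illustrated with a short configuration file of the kind read by the
.Cm create
command of
.Xr vinum 8 .
This is a minimal sketch rather than a recommended layout: the drive names,
device names, volume names and sizes are all hypothetical, and
.Xr vinum 8
remains the authoritative description of the syntax.
.Bd -literal -offset indent
# Three drives, each a disk partition handed to vinum (hypothetical devices).
drive d1 device /dev/da1h
drive d2 device /dev/da2h
drive d3 device /dev/da3h
# A mirrored volume: two concatenated plexes, each holding the whole volume.
volume mirror
  plex org concat
    sd length 512m drive d1
  plex org concat
    sd length 512m drive d2
# A striped volume: one plex of equal-sized subdisks with 512 kB stripes.
volume stripe
  plex org striped 512k
    sd length 256m drive d1
    sd length 256m drive d2
# A RAID-5 volume: one plex with three equal-sized subdisks.
volume raid
  plex org raid5 512k
    sd length 256m drive d1
    sd length 256m drive d2
    sd length 256m drive d3
.Ed
.Pp
Placing the subdisks of each plex on different drives is what gives the
mirrored volume its redundancy and the striped volume its parallelism.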
.Sh KERNEL CONFIGURATION
.Nm
is currently supplied as a KLD module, and does not require
configuration.
As with other klds, it is absolutely necessary to match the kld
to the version of the operating system.
Failure to do so will cause
.Nm
to issue an error message and terminate.
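.Pp
As a sketch of the usual module workflow, the module can be loaded by hand
and its presence confirmed as follows.
Both commands are standard, but the output of
.Xr kldstat 8
varies between releases and is therefore not shown.
.Bd -literal -offset indent
# Load the vinum module explicitly; vinum(8) normally does this itself.
kldload vinum
# Confirm that the module is now resident.
kldstat | grep vinum
.Ed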
.Pp
It is possible to configure
.Nm
in the kernel, but this is not recommended.
To do so, add this line to the
kernel configuration file:
.Pp
.D1 Cd "pseudo-device vinum"
.Ss Debug Options
The current version of
.Nm ,
both the kernel module and the user program
.Xr vinum 8 ,
includes significant debugging support.
Removing this support is not recommended at the moment, but if you do,
you must remove it from both the
kernel and the user components.
To do this, edit the files
.Pa /usr/src/sbin/vinum/Makefile
and
.Pa /sys/dev/raid/vinum/Makefile
and edit the
.Va CFLAGS
variable to remove the
.Li -DVINUMDEBUG
option.
If you have
configured
.Nm
into the kernel, either specify the line
.Pp
.D1 Cd "options VINUMDEBUG"
.Pp
in the kernel configuration file or remove the
.Li -DVINUMDEBUG
option from
.Pa /usr/src/sbin/vinum/Makefile
as described above.
.Pp
If the
.Va VINUMDEBUG
variables do not match,
.Xr vinum 8
will fail with a message
explaining the problem and what to do to correct it.
.Pp
.Nm
was previously available in two versions: a freely available version which did
not contain RAID-5 functionality, and a full version including RAID-5
functionality, which was available only from Cybernet Systems Inc.
The present
version of
.Nm
includes the RAID-5 functionality.
.Sh RUNNING VINUM
.Nm
is part of the base
.Dx
system.
It does not require installation.
To start it, start the
.Xr vinum 8
program, which will load the kld if it is not already present.
.Nm
must be configured before it can be used.
See
.Xr vinum 8
for information on how to create a
.Nm
configuration.
.Pp
Normally, you start a configured version of
.Nm
at boot time.
Set the variable
.Va start_vinum
in
.Pa /etc/rc.conf
to
.Dq Li YES
to start
.Nm
at boot time.
(See
.Xr rc.conf 5
for more details.)
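.Pp
A minimal sketch of the corresponding
.Pa /etc/rc.conf
entry, assuming that no other settings are required on the system in question:
.Bd -literal -offset indent
# Start vinum at boot time.
start_vinum="YES"
.Ed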
.Pp
If
.Nm
is loaded as a kld (the recommended way), the
.Nm vinum Cm stop
command will unload it
(see
.Xr vinum 8 ) .
You can also do this with the
.Xr kldunload 8
command.
.Pp
The kld can only be unloaded when idle, in other words when no volumes are
mounted and no other instances of the
.Xr vinum 8
program are active.
Unloading the kld does not harm the data in the volumes.
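.Pp
For example, assuming a single volume mounted on the hypothetical mount point
.Pa /mnt ,
the module could be unloaded roughly as follows.
Only one of the last two commands is needed.
.Bd -literal -offset indent
# The module can only be unloaded when no vinum volume is mounted.
umount /mnt
# Stop vinum, which unloads the kld ...
vinum stop
# ... or, instead of "vinum stop", unload the module directly.
kldunload vinum
.Ed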
.Ss Configuring and Starting Objects
Use the
.Xr vinum 8
utility to configure and start
.Nm
objects.
.Sh IOCTL CALLS
.Xr ioctl 2
calls are intended for the use of the
.Xr vinum 8
configuration program only.
They are described in the header file
.Pa /sys/dev/raid/vinum/vinumio.h .
.Ss Disk Labels
Conventional disk special devices have a
.Em "disk label"
in the second sector of the device.
See
.Xr disklabel 5
for more details.
This disk label describes the layout of the partitions within
the device.
.Nm
does not subdivide volumes, so volumes do not contain a physical disk label.
For convenience,
.Nm
implements the ioctl calls
.Dv DIOCGDINFO
(get disk label),
.Dv DIOCGPART
(get partition information),
.Dv DIOCWDINFO
(write partition information) and
.Dv DIOCSDINFO
(set partition information).
.Dv DIOCGDINFO
and
.Dv DIOCGPART
refer to an internal
representation of the disk label which is not present on the volume.
As a
result, the
.Fl r
option of
.Xr disklabel 8 ,
which reads the
.Dq "raw disk" ,
will fail.
.Pp
In general,
.Xr disklabel 8
serves no useful purpose on a
.Nm vinum
volume.
If you run it, it will show you
three partitions,
.Ql a ,
.Ql b
and
.Ql c ,
all the same except for the
.Va fstype ,
for example:
.Bd -literal
3 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:     2048        0    4.2BSD     1024  8192     0   # (Cyl.    0 - 0)
  b:     2048        0      swap                        # (Cyl.    0 - 0)
  c:     2048        0    unused        0     0         # (Cyl.    0 - 0)
.Ed
.Pp
.Nm
ignores the
.Dv DIOCWDINFO
and
.Dv DIOCSDINFO
ioctls, since there is nothing to change.
As a result, any attempt to modify the disk label will be silently ignored.
.Sh MAKING FILE SYSTEMS
Since
.Nm
volumes do not contain partitions, the names do not need to conform to the
standard rules for naming disk partitions.
For a physical disk partition, the
last letter of the device name specifies the partition identifier (a to h).
.Nm
volumes need not conform to this convention, but if they do not,
.Xr newfs 8
will complain that it cannot determine the partition.
To solve this problem,
use the
.Fl v
flag to
.Xr newfs 8 .
For example, if you have a volume
.Pa concat ,
use the following command to create a UFS file system on it:
.Pp
.Dl "newfs -v /dev/vinum/concat"
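.Pp
Once the file system exists, the volume is mounted like any other disk device.
The following sketch assumes the volume
.Pa concat
from the example above and a hypothetical mount point
.Pa /data .
The last line is not a command but an illustrative
.Pa /etc/fstab
entry.
.Bd -literal -offset indent
# Create the mount point and mount the new file system by hand.
mkdir -p /data
mount /dev/vinum/concat /data
# Hypothetical /etc/fstab entry to mount the volume at boot time.
/dev/vinum/concat   /data   ufs   rw   2   2
.Ed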
.Sh OBJECT NAMING
.Nm
assigns default names to plexes and subdisks, although they may be overridden.
We do not recommend overriding the default names.
Experience with the
Veritas\(tm
volume manager, which allows arbitrary naming of objects, has shown that this
flexibility does not bring a significant advantage, and it can cause confusion.
.Pp
Names may contain any non-blank character, but it is recommended to restrict
them to letters, digits and the underscore character.
The names of volumes,
plexes and subdisks may be up to 64 characters long, and the names of drives may
be up to 32 characters long.
When choosing volume and plex names, bear in mind
that automatically generated plex and subdisk names are longer than the name
from which they are derived.
.Bl -bullet
.It
When
.Nm
creates or deletes objects, it creates a directory
.Pa /dev/vinum ,
in which it makes device entries for each volume.
It also creates the subdirectories
.Pa /dev/vinum/plex
and
.Pa /dev/vinum/sd ,
in which it stores device entries for the plexes and subdisks.
In addition, it
creates two more directories,
.Pa /dev/vinum/vol
and
.Pa /dev/vinum/drive ,
in which it stores hierarchical information for volumes and drives.
.It
In addition,
.Nm
creates three super-devices,
.Pa /dev/vinum/control ,
.Pa /dev/vinum/Control
and
.Pa /dev/vinum/controld .
.Pa /dev/vinum/control
is used by
.Xr vinum 8
when it has been compiled without the
.Dv VINUMDEBUG
option,
.Pa /dev/vinum/Control
is used by
.Xr vinum 8
when it has been compiled with the
.Dv VINUMDEBUG
option, and
.Pa /dev/vinum/controld
is used by the
.Nm
daemon.
The two control devices for
.Xr vinum 8
are used to synchronize the debug status of kernel and user modules.
.It
Unlike
.Ux
drives,
.Nm
volumes are not subdivided into partitions, and thus do not contain a disk
label.
Unfortunately, this confuses a number of utilities, notably
.Xr newfs 8 ,
which normally tries to interpret the last letter of a
.Nm
volume name as a partition identifier.
If you use a volume name which does not
end in the letters
.Ql a
to
.Ql c ,
you must use the
.Fl v
flag to
.Xr newfs 8
in order to tell it to ignore this convention.
.\"
.It
Plexes do not need to be assigned explicit names.
By default, a plex name is
the name of the volume followed by the letters
.Pa .p
and the number of the
plex.
For example, the plexes of volume
.Pa vol3
are called
.Pa vol3.p0 , vol3.p1
and so on.
These names can be overridden, but it is not recommended.
.It
Like plexes, subdisks are assigned names automatically, and explicit naming is
discouraged.
A subdisk name is the name of the plex followed by the letters
.Pa .s
and a number identifying the subdisk.
For example, the subdisks of
plex
.Pa vol3.p0
are called
.Pa vol3.p0.s0 , vol3.p0.s1
and so on.
.It
By contrast,
.Em drives
must be named.
This makes it possible to move a drive to a different location
and still recognize it automatically.
Drive names may be up to 32 characters
long.
.El
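.Pp
As an illustration of the default naming, the following hypothetical
configuration fragment names only the drives and the volume.
The comments show the plex and subdisk names that the rules above would
produce.
The drive names, device names and sizes are assumptions made for the example.
.Bd -literal -offset indent
drive d1 device /dev/da1h
drive d2 device /dev/da2h
volume vol3
# First plex of vol3: named vol3.p0 by default.
  plex org concat
# Its subdisks: vol3.p0.s0 and vol3.p0.s1.
    sd length 100m drive d1
    sd length 100m drive d2
# Second plex of vol3: named vol3.p1 by default.
  plex org concat
# Its only subdisk: vol3.p1.s0.
    sd length 200m drive d2
.Ed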
.Ss Example
Assume the
.Nm
objects described in the section
.Sx "CONFIGURATION FILE"
in
.Xr vinum 8 .
The directory
.Pa /dev/vinum
looks like:
.Bd -literal -offset indent
# ls -lR /dev/vinum
total 5
crwxr-xr--  1 root  wheel   91,   2 Mar 30 16:08 concat
crwx------  1 root  wheel   91, 0x40000000 Mar 30 16:08 control
crwx------  1 root  wheel   91, 0x40000001 Mar 30 16:08 controld
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 drive
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 plex
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 rvol
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 sd
crwxr-xr--  1 root  wheel   91,   3 Mar 30 16:08 strcon
crwxr-xr--  1 root  wheel   91,   1 Mar 30 16:08 stripe
crwxr-xr--  1 root  wheel   91,   0 Mar 30 16:08 tinyvol
drwxrwxrwx  7 root  wheel       512 Mar 30 16:08 vol
crwxr-xr--  1 root  wheel   91,   4 Mar 30 16:08 vol5

/dev/vinum/drive:
total 0
crw-r-----  1 root  operator    4,  15 Oct 21 16:51 drive2
crw-r-----  1 root  operator    4,  31 Oct 21 16:51 drive4

/dev/vinum/plex:
total 0
crwxr-xr--  1 root  wheel   91, 0x10000002 Mar 30 16:08 concat.p0
crwxr-xr--  1 root  wheel   91, 0x10010002 Mar 30 16:08 concat.p1
crwxr-xr--  1 root  wheel   91, 0x10000003 Mar 30 16:08 strcon.p0
crwxr-xr--  1 root  wheel   91, 0x10010003 Mar 30 16:08 strcon.p1
crwxr-xr--  1 root  wheel   91, 0x10000001 Mar 30 16:08 stripe.p0
crwxr-xr--  1 root  wheel   91, 0x10000000 Mar 30 16:08 tinyvol.p0
crwxr-xr--  1 root  wheel   91, 0x10000004 Mar 30 16:08 vol5.p0
crwxr-xr--  1 root  wheel   91, 0x10010004 Mar 30 16:08 vol5.p1

/dev/vinum/sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100002 Mar 30 16:08 concat.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20010002 Mar 30 16:08 concat.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100003 Mar 30 16:08 strcon.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110003 Mar 30 16:08 strcon.p1.s1
crwxr-xr--  1 root  wheel   91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100001 Mar 30 16:08 stripe.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100004 Mar 30 16:08 vol5.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/vinum/vol:
total 5
crwxr-xr--  1 root  wheel   91,   2 Mar 30 16:08 concat
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 concat.plex
crwxr-xr--  1 root  wheel   91,   3 Mar 30 16:08 strcon
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 strcon.plex
crwxr-xr--  1 root  wheel   91,   1 Mar 30 16:08 stripe
drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 stripe.plex
crwxr-xr--  1 root  wheel   91,   0 Mar 30 16:08 tinyvol
drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 tinyvol.plex
crwxr-xr--  1 root  wheel   91,   4 Mar 30 16:08 vol5
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 vol5.plex

/dev/vinum/vol/concat.plex:
total 2
crwxr-xr--  1 root  wheel   91, 0x10000002 Mar 30 16:08 concat.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p0.sd
crwxr-xr--  1 root  wheel   91, 0x10010002 Mar 30 16:08 concat.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p1.sd

/dev/vinum/vol/concat.plex/concat.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100002 Mar 30 16:08 concat.p0.s1

/dev/vinum/vol/concat.plex/concat.p1.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20010002 Mar 30 16:08 concat.p1.s0

/dev/vinum/vol/strcon.plex:
total 2
crwxr-xr--  1 root  wheel   91, 0x10000003 Mar 30 16:08 strcon.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p0.sd
crwxr-xr--  1 root  wheel   91, 0x10010003 Mar 30 16:08 strcon.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p1.sd

/dev/vinum/vol/strcon.plex/strcon.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100003 Mar 30 16:08 strcon.p0.s1

/dev/vinum/vol/strcon.plex/strcon.p1.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110003 Mar 30 16:08 strcon.p1.s1

/dev/vinum/vol/stripe.plex:
total 1
crwxr-xr--  1 root  wheel   91, 0x10000001 Mar 30 16:08 stripe.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 stripe.p0.sd

/dev/vinum/vol/stripe.plex/stripe.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100001 Mar 30 16:08 stripe.p0.s1

/dev/vinum/vol/tinyvol.plex:
total 1
crwxr-xr--  1 root  wheel   91, 0x10000000 Mar 30 16:08 tinyvol.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 tinyvol.p0.sd

/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1

/dev/vinum/vol/vol5.plex:
total 2
crwxr-xr--  1 root  wheel   91, 0x10000004 Mar 30 16:08 vol5.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p0.sd
crwxr-xr--  1 root  wheel   91, 0x10010004 Mar 30 16:08 vol5.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p1.sd

/dev/vinum/vol/vol5.plex/vol5.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100004 Mar 30 16:08 vol5.p0.s1

/dev/vinum/vol/vol5.plex/vol5.p1.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110004 Mar 30 16:08 vol5.p1.s1
.Ed
.Pp
In the case of unattached plexes and subdisks, the naming is reversed.
Subdisks
are named after the disk on which they are located, and plexes are named after
the subdisk.
.\" XXX
.Bf -symbolic
This mapping is still to be determined.
.Ef
.Ss Object States
Each
.Nm
object has a
.Em state
associated with it.
.Nm
uses this state to determine the handling of the object.
.Ss Volume States
Volumes may have the following states:
.Bl -hang -width 14n
.It Em down
The volume is completely inaccessible.
.It Em up
The volume is up and at least partially functional.
Not all plexes may be
available.
.El
.Ss "Plex States"
Plexes may have the following states:
.Bl -hang -width 14n
.It Em referenced
A plex entry which has been referenced as part of a volume, but which is
currently not known.
.It Em faulty
A plex which has gone completely down because of I/O errors.
.It Em down
A plex which has been taken down by the administrator.
.It Em initializing
A plex which is being initialized.
.El
.Pp
The remaining states represent plexes which are at least partially up.
.Bl -hang -width 14n
.It Em corrupt
A plex entry which is at least partially up.
Not all subdisks are available,
and an inconsistency has occurred.
If no other plex is uncorrupted, the volume
is no longer consistent.
.It Em degraded
A RAID-5 plex entry which is accessible, but one subdisk is down, requiring
recovery for many I/O requests.
.It Em flaky
A plex which is really up, but which has a reborn subdisk which we do not
completely trust, and which we do not want to read if we can avoid it.
.It Em up
A plex entry which is completely up.
All subdisks are up.
.El
.Ss "Subdisk States"
Subdisks can have the following states:
.Bl -hang -width 14n
.It Em empty
A subdisk entry which has been created completely.
All fields are correct, and
the disk has been updated, but the data on the disk is not valid.
.It Em referenced
A subdisk entry which has been referenced as part of a plex, but which is
currently not known.
.It Em initializing
A subdisk entry which has been created completely and which is currently being
initialized.
.El
.Pp
The following states represent invalid data.
.Bl -hang -width 14n
.It Em obsolete
A subdisk entry which has been created completely.
All fields are correct, the
config on disk has been updated, and the data was valid, but since then the
drive has been taken down, and as a result updates have been missed.
.It Em stale
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has
crashed and updates have been lost.
.El
.Pp
The following states represent valid, inaccessible data.
.Bl -hang -width 14n
.It Em crashed
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down.
No attempt has been made to write to the subdisk since the crash, so the
data is valid.
.It Em down
A subdisk entry which was up, which contained valid data, and which was taken
down by the administrator.
The data is valid.
.It Em reviving
The subdisk is currently in the process of being revived.
We can write but not
read.
.El
.Pp
The following states represent accessible subdisks with valid data.
.Bl -hang -width 14n
.It Em reborn
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down and up again.
No updates were lost, but it is possible that the subdisk
has been damaged.
We won't read from this subdisk if we have a choice.
If this
is the only subdisk which covers this address space in the plex, we set its
state to up under these circumstances, so this status implies that there is
another subdisk to fulfil the request.
.It Em up
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data is valid.
.El
.Ss "Drive States"
Drives can have the following states:
.Bl -hang -width 14n
.It Em referenced
At least one subdisk refers to the drive, but it is not currently accessible to
the system.
No device name is known.
.It Em down
The drive is not accessible.
.It Em up
The drive is up and running.
.El
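.Pp
The states of all objects can be inspected with the list commands of
.Xr vinum 8 .
The following is a sketch only: the subdisk name
.Pa vol3.p1.s0
is hypothetical, and the output is not shown because its format depends on the
version of
.Xr vinum 8 .
.Bd -literal -offset indent
# List drives, subdisks, plexes and volumes together with their states.
vinum ld
vinum ls
vinum lp
vinum lv
# Try to bring a subdisk that is not up (for example, stale) back up.
vinum start vol3.p1.s0
.Ed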
.Sh DEBUGGING PROBLEMS WITH VINUM
Solving problems with
.Nm
can be a difficult affair.
This section suggests some approaches.
.Ss Configuration problems
It is relatively easy (too easy) to run into problems with the
.Nm
configuration.
If you do, the first thing you should do is stop configuration
updates:
.Pp
.Dl "vinum setdaemon 4"
.Pp
This will stop updates and any further corruption of the on-disk configuration.
.Pp
Next, look at the on-disk configuration with the
.Nm vinum Cm dumpconfig
command, for example:
.if t .ps -3
.if t .vs -3
.Bd -literal
# \fBvinum dumpconfig\fP
Drive 4:        Device /dev/da3h
                Created on crash.lemis.com at Sat May 20 16:32:44 2000
                Config last updated Sat May 20 16:32:56 2000
                Size:        601052160 bytes (573 MB)
volume obj state up
volume src state up
volume raid state down
volume r state down
volume foo state up
plex name obj.p0 state corrupt org concat vol obj
plex name obj.p1 state corrupt org striped 128b vol obj
plex name src.p0 state corrupt org striped 128b vol src
plex name src.p1 state up org concat vol src
plex name raid.p0 state faulty org disorg vol raid
plex name r.p0 state faulty org disorg vol r
plex name foo.p0 state up org concat vol foo
plex name foo.p1 state faulty org concat vol foo
sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b
.Ed
.if t .vs +3
.if t .ps +3
.Pp
The configuration on all disks should be the same.
If this is not the case,
please save the output to a file and report the problem.
There is probably
little that can be done to recover the on-disk configuration, but if you keep a
copy of the files used to create the objects, you should be able to re-create
them.
The
.Cm create
command does not change the subdisk data, so this will not cause data
corruption.
You may need to use the
.Cm resetconfig
command if you have this kind of trouble.
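.Pp
The first two steps above might be combined as follows.
This is a sketch only, and the output file name is hypothetical.
.Bd -literal -offset indent
# Stop further updates of the on-disk configuration.
vinum setdaemon 4
# Save the on-disk configuration for comparison or a problem report.
vinum dumpconfig > /var/tmp/vinum.dumpconfig
.Ed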
.Ss Kernel Panics
In order to analyse a panic which you suspect comes from
.Nm
you will need to build a debug kernel.
See the online handbook at
.Pa /usr/share/doc/en/books/developers-handbook/kerneldebug.html
(if installed) or
.Pa http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/developers-\%handbook/kerneldebug.html
for more details of how to do this.
.Pp
Perform the following steps to analyse a
.Nm
problem:
.Bl -enum
.It
Copy the following files to the directory in which you will be
performing the analysis, typically
.Pa /var/crash :
.Pp
.Bl -bullet -compact
.It
.Pa /sys/dev/raid/vinum/.gdbinit.crash ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.kernel ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.serial ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum
and
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum.paths
.El
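.Pp
A sketch of this step, assuming that the analysis directory is
.Pa /var/crash
and that the source directory contains no other files matching the pattern:
.Bd -literal -offset indent
# Copy every .gdbinit.* file, which includes the five files listed above.
cp /sys/dev/raid/vinum/.gdbinit.* /var/crash/
.Ed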
.It
Make sure that you build the
.Nm
module with debugging information.
The standard
.Pa Makefile
builds a module with debugging symbols by default.
If the version of
.Nm
in
.Pa /modules
does not contain symbols, you will not get an error message, but the stack trace
will not show the symbols.
Check the module before starting
.Xr gdb 1 :
.Bd -literal
$ file /modules/vinum.ko
/modules/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
  version 1 (FreeBSD), not stripped
.Ed
.Pp
If the output shows that
.Pa /modules/vinum.ko
is stripped, you will have to find a version which is not.
Usually this will be
either in
.Pa /usr/obj/usr/src/sys/SYSTEM_NAME/usr/src/sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
with a
.Dq Li "make world" )
or
.Pa /sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
in this directory).
Modify the file
.Pa .gdbinit.vinum.paths
accordingly.
.It
Either take a dump or use remote serial
.Xr gdb 1
to analyse the problem.
To analyse a dump, say
.Pa /var/crash/vmcore.5 ,
link
.Pa /var/crash/.gdbinit.crash
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
gdb -k kernel.debug vmcore.5
.Ed
.Pp
This example assumes that you have installed the correct debug kernel at
.Pa /var/crash/kernel.debug .
If not, substitute the correct name of the debug kernel.
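.Pp
The link mentioned above can be created, for example, as a symbolic link.
This is a sketch; a hard link or a plain copy would presumably serve equally
well.
.Bd -literal -offset indent
cd /var/crash
# Make gdb pick up the crash-dump initialization file as its .gdbinit.
ln -sf .gdbinit.crash .gdbinit
.Ed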
.Pp
To perform remote serial debugging,
link
.Pa /var/crash/.gdbinit.serial
to
.Pa /var/crash/.gdbinit
and enter
.Bd -literal -offset indent
cd /var/crash
gdb -k kernel.debug
.Ed
.Pp
In this case, the
.Pa .gdbinit
file performs the functions necessary to establish connection.
The remote
machine must already be in debug mode: enter the kernel debugger and select
.Ic gdb
(see
.Xr ddb 4
for more details).
966The serial
967.Pa .gdbinit
968file expects the serial connection to run at 38400 bits per second; if you run
969at a different speed, edit the file accordingly (look for the
970.Va remotebaud
971specification).
.Pp
The following example shows a remote debugging session using the
.Ic debug
command of
.Xr vinum 8 :
.Bd -literal
.if t .ps -3
.if t .vs -3
GDB 4.16 (i386-unknown-dragonfly), Copyright 1996 Free Software Foundation, Inc.
Debugger (msg=0xf1093174 "vinum debug") at ../../i386/i386/db_interface.c:318
318                 in_Debugger = 0;
#1  0xf108d9bc in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6dedee0 "",
    flag=0x3, p=0xf68b7940) at
    /usr/src/sys/dev/raid/vinum/vinumioctl.c:102
102             Debugger ("vinum debug");
(kgdb) bt
#0  Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
#1  0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
      flag=0x3, p=0xf688e6c0) at
      /usr/src/sys/dev/raid/vinum/vinumioctl.c:109
#2  0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
#3  0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
#4  0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
#5  0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
      p=0xf688e6c0) at vnode_if.h:395
#6  0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
#7  0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
      tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
      tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
      tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
      tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
#8  0xf020a1fc in Xint0x80_syscall ()
#9  0x804832d in ?? ()
#10 0x80482ad in ?? ()
#11 0x80480e9 in ?? ()
.if t .vs
.if t .ps
.Ed
.Pp
When entering from the debugger, it is important that the source of frame 1
(listed by the
.Pa .gdbinit
file at the top of the example) contains the text
.Dq Li "Debugger (\*[q]vinum debug\*[q]);" .
This is an indication that the address specifications are correct.
If you get
some other output, your symbols and the kernel module are out of sync, and the
trace will be meaningless.
.El
.Pp
For an initial investigation, the most important information is the output of
the
.Ic bt
(backtrace) command above.
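.Pp
As a rough sketch, the module check and the dump analysis described above
might be combined as follows.
It assumes that the
.Pa .gdbinit
files are already in
.Pa /var/crash ,
that a symbolic link is an acceptable way to link them, and that the module,
kernel and dump names match the examples above; substitute your own names
where they differ.
.Bd -literal -offset indent
# Warn if the module carries no symbols.
file /modules/vinum.ko | grep -q 'not stripped' ||
    echo "vinum.ko is stripped; find an unstripped copy"

# Select the crash-dump initialization file, then start gdb.
cd /var/crash
ln -sf .gdbinit.crash .gdbinit
gdb -k kernel.debug vmcore.5
.Ed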
.Ss Reporting Problems with Vinum
If you find any bugs in
.Nm ,
please report them to
.An Greg Lehey Aq grog@lemis.com .
Supply the following
information:
.Bl -bullet
.It
The output of the
.Nm vinum Cm list
command
(see
.Xr vinum 8 ) .
.It
Any messages printed in
.Pa /var/log/messages .
All such messages will be identified by the text
.Dq Li vinum
at the beginning.
.It
If you have a panic, a stack trace as described above.
.El
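.Pp
As a rough sketch, the first two items could be gathered into a single file
before writing the report.
The output file name
.Pa /tmp/vinum-report.txt
is only an illustration, and the
.Xr grep 1
pattern assumes that every relevant log line contains the string
.Dq Li vinum ;
adjust both as needed.
.Bd -literal -offset indent
# Collect the configuration listing and the vinum log messages.
vinum list > /tmp/vinum-report.txt
grep vinum /var/log/messages >> /tmp/vinum-report.txt
.Ed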
.Sh SEE ALSO
.Xr disklabel 5 ,
.Xr disklabel 8 ,
.Xr newfs 8 ,
.Xr vinum 8
.Sh HISTORY
.Nm
first appeared in
.Fx 3.0 .
The RAID-5 component of
.Nm
was developed by Cybernet Inc.\&
.Pq Pa http://www.cybernet.com/ ,
for its NetMAX product.
.Sh AUTHORS
.An Greg Lehey Aq grog@lemis.com .
.Sh BUGS
.Nm
is a new product.
Bugs can be expected.
The configuration mechanism is not yet
fully functional.
If you have difficulties, please look at the section
.Sx "DEBUGGING PROBLEMS WITH VINUM"
before reporting problems.
.Pp
Kernels with the
.Nm
pseudo-device appear to work, but are not supported.
If you have trouble with
this configuration, please first replace the kernel with a
.No non- Ns Nm
kernel and test with the kld module.
.Pp
Detection of differences between the version of the kernel and the kld is not
yet implemented.
.Pp
The RAID-5 functionality is new in
.Fx 3.3 .
Some problems have been
reported with
.Nm
in combination with soft updates, but these are not reproducible on all
systems.
If you are planning to use
.Nm
in a production environment, please test carefully.