.\" Hey, Emacs, edit this file in -*- nroff-fill -*- mode
.\"-
.\" Copyright (c) 1997, 1998
.\" Nan Yang Computer Services Limited. All rights reserved.
.\"
.\" This software is distributed under the so-called ``Berkeley
.\" License'':
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"      This product includes software developed by Nan Yang Computer
.\"      Services Limited.
.\" 4. Neither the name of the Company nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" This software is provided ``as is'', and any express or implied
.\" warranties, including, but not limited to, the implied warranties of
.\" merchantability and fitness for a particular purpose are disclaimed.
.\" In no event shall the company or contributors be liable for any
.\" direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute
.\" goods or services; loss of use, data, or profits; or business
.\" interruption) however caused and on any theory of liability, whether
.\" in contract, strict liability, or tort (including negligence or
.\" otherwise) arising in any way out of the use of this software, even if
.\" advised of the possibility of such damage.
.\"
.\" $FreeBSD: src/share/man/man4/vinum.4,v 1.22.2.9 2002/04/22 08:19:35 kuriyama Exp $
.\" $DragonFly: src/share/man/man4/vinum.4,v 1.14 2008/02/11 15:59:37 matthias Exp $
.\"
.Dd February 11, 2008
.Dt VINUM 4
.Os
.Sh NAME
.Nm vinum
.Nd Logical Volume Manager
.Sh SYNOPSIS
.Cd "pseudo-device vinum"
.Sh DESCRIPTION
.Nm
is a logical volume manager inspired by, but not derived from, the Veritas
Volume Manager.
It provides the following features:
.Bl -bullet
.It
It provides device-independent logical disks, called
.Em volumes .
Volumes are
not restricted to the size of any disk on the system.
.It
The volumes consist of one or more
.Em plexes ,
each of which contains the
entire address space of a volume.
This represents an implementation of RAID-1
(mirroring).
Multiple plexes can also be used for:
.\" XXX What about sparse plexes? Do we want them?
.Bl -bullet
.It
Increased read throughput.
.Nm
will read data from the least active disk, so if a volume has plexes on multiple
disks, more data can be read in parallel.
.Nm
reads data from only one plex, but it writes data to all plexes.
.It
Increased reliability.
By storing plexes on different disks, data will remain
available even if one of the plexes becomes unavailable.
In comparison with a
RAID-5 plex (see below), using multiple plexes requires more storage space, but
gives better performance, particularly in the case of a drive failure.
.It
Additional plexes can be used for on-line data reorganization.
By attaching an
additional plex and subsequently detaching one of the older plexes, data can be
moved on-line without compromising access.
.It
An additional plex can be used to obtain a consistent dump of a file system.
By
attaching an additional plex and detaching at a specific time, the detached plex
becomes an accurate snapshot of the file system at the time of detachment.
.\" Make sure to flush!
.El
.It
Each plex consists of one or more logical disk slices, called
.Em subdisks .
A subdisk is defined as a contiguous block of physical disk storage.
A plex may
consist of any reasonable number of subdisks (in other words, the real limit is
not the number, but other factors, such as memory and performance, associated
with maintaining a large number of subdisks).
.It
A number of mappings between subdisks and plexes are available:
.Bl -bullet
.It
.Em "Concatenated plexes"
consist of one or more subdisks, each of which
is mapped to a contiguous part of the plex address space.
.It
.Em "Striped plexes"
consist of two or more subdisks of equal size.
The plex
address space is mapped in
.Em stripes ,
integral fractions of the subdisk
size.
Consecutive plex address space is mapped to stripes in each subdisk in
turn.
.if t \{\
.ig
.\" FIXME
.br
.ne 1.5i
.PS
move right 2i
down
SD0: box
SD1: box
SD2: box

"plex 0" at SD0.n+(0,.2)
"subdisk 0" rjust at SD0.w-(.2,0)
"subdisk 1" rjust at SD1.w-(.2,0)
"subdisk 2" rjust at SD2.w-(.2,0)
.PE
..
.\}
The subdisks of a striped plex must all be the same size.
.It
.Em "RAID-5 plexes"
require at least three equal-sized subdisks.
They
resemble striped plexes, except that in each stripe, one subdisk stores parity
information.
This subdisk changes in each stripe: in the first stripe, it is the
first subdisk, in the second it is the second subdisk, etc.
In the event of a
single disk failure,
.Nm
will recover the data based on the information stored on the remaining subdisks.
This mapping is particularly suited to read-intensive access.
The subdisks of a
RAID-5 plex must all be the same size.
.\" Make sure to flush!
.El
.It
.Em Drives
are the lowest level of the storage hierarchy.
They represent disk special
devices.
.It
.Nm
offers automatic startup.
Unlike
.Ux
file systems,
.Nm
volumes contain all the configuration information needed to ensure that they are
started correctly when the subsystem is enabled.
This is also a significant
advantage over the Veritas\(tm File System.
Automatic startup makes the volumes available, but it does not mount them:
the standard startup procedures with
.Pa /etc/fstab
perform this function.
.El
.Sh KERNEL CONFIGURATION
.Nm
is currently supplied as a KLD module, and does not require
configuration.
As with other klds, it is absolutely necessary to match the kld
to the version of the operating system.
Failure to do so will cause
.Nm
to issue an error message and terminate.
.Pp
It is possible to configure
.Nm
in the kernel, but this is not recommended.
To do so, add this line to the
kernel configuration file:
.Pp
.D1 Cd "pseudo-device vinum"
.Ss Debug Options
The current version of
.Nm ,
both the kernel module and the user program
.Xr vinum 8 ,
includes significant debugging support.
It is not recommended to remove
this support at the moment, but if you do, you must remove it from both the
kernel and the user components.
To do this, edit the
.Va CFLAGS
variable in the files
.Pa /usr/src/sbin/vinum/Makefile
and
.Pa /sys/dev/raid/vinum/Makefile
to remove the
.Li -DVINUMDEBUG
option.
If you have
configured
.Nm
into the kernel, either specify the line
.Pp
.D1 Cd "options VINUMDEBUG"
.Pp
in the kernel configuration file or remove the
.Li -DVINUMDEBUG
option from
.Pa /usr/src/sbin/vinum/Makefile
as described above.
.Pp
If the
.Va VINUMDEBUG
settings of the kernel and user components do not match,
.Xr vinum 8
will fail with a message
explaining the problem and what to do to correct it.
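.Pp
As a quick consistency check, the following sketch searches both Makefiles
named above for the option.
It assumes that the source trees are installed in the locations shown and that
.Nm
is built as a kld; if
.Nm
is configured into the kernel, look for
.Cd "options VINUMDEBUG"
in the kernel configuration file instead.
.Bd -literal -offset indent
# Print every line that mentions VINUMDEBUG, prefixed by file name
# and line number.
grep -n VINUMDEBUG /usr/src/sbin/vinum/Makefile /sys/dev/raid/vinum/Makefile
.Ed
.Pp
A match in one file but not in the other suggests that the kernel and user
components would be built with different debug settings.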
.Pp
.Nm
was previously available in two versions: a freely available version which did
not contain RAID-5 functionality, and a full version including RAID-5
functionality, which was available only from Cybernet Systems Inc.
The present
version of
.Nm
includes the RAID-5 functionality.
.Sh RUNNING VINUM
.Nm
is part of the base
.Dx
system.
It does not require installation.
To start it, run the
.Xr vinum 8
program, which will load the kld if it is not already present.
Before you can use
.Nm ,
you must configure it.
See
.Xr vinum 8
for information on how to create a
.Nm
configuration.
.Pp
Normally, a configured version of
.Nm
is started at boot time.
To do this, set the variable
.Va start_vinum
in
.Pa /etc/rc.conf
to
.Dq Li YES .
(See
.Xr rc.conf 5
for more details.)
.Pp
If
.Nm
is loaded as a kld (the recommended way), the
.Nm Cm stop
command will unload it
(see
.Xr vinum 8 ) .
You can also do this with the
.Xr kldunload 8
command.
.Pp
The kld can only be unloaded when idle, in other words when no volumes are
mounted and no other instances of the
.Xr vinum 8
program are active.
Unloading the kld does not harm the data in the volumes.
.Ss Configuring and Starting Objects
Use the
.Xr vinum 8
utility to configure and start
.Nm
objects.
.Sh IOCTL CALLS
.Xr ioctl 2
calls are intended for the use of the
.Xr vinum 8
configuration program only.
They are described in the header file
.Pa /sys/dev/raid/vinum/vinumio.h .
.Ss Disk Labels
Conventional disk special devices have a
.Em "disk label"
in the second sector of the device.
See
.Xr disklabel 5
for more details.
This disk label describes the layout of the partitions within
the device.
.Nm
does not subdivide volumes, so volumes do not contain a physical disk label.
For convenience,
.Nm
implements the ioctl calls
.Dv DIOCGDINFO
(get disk label),
.Dv DIOCGPART
(get partition information),
.Dv DIOCWDINFO
(write partition information) and
.Dv DIOCSDINFO
(set partition information).
.Dv DIOCGDINFO
and
.Dv DIOCGPART
refer to an internal
representation of the disk label which is not present on the volume.
As a
result, the
.Fl r
option of
.Xr disklabel 8 ,
which reads the
.Dq "raw disk" ,
will fail.
.Pp
In general,
.Xr disklabel 8
serves no useful purpose on a
.Nm
volume.
.Pp
.Nm
ignores the
.Dv DIOCWDINFO
and
.Dv DIOCSDINFO
ioctls, since there is nothing to change.
As a result, any attempt to modify the disk label will be silently ignored.
.Sh MAKING FILE SYSTEMS
Since
.Nm
volumes do not contain partitions, the names do not need to conform to the
standard rules for naming disk partitions.
For a physical disk partition, the
last letter of the device name specifies the partition identifier (a to p).
.Nm
volumes need not conform to this convention, but if they do not,
.Xr newfs 8
will complain that it cannot determine the partition.
To solve this problem,
use the
.Fl v
flag to
.Xr newfs 8 .
For example, if you have a volume
.Pa concat ,
use the following command to create a UFS file system on it:
.Pp
.Dl "newfs -v /dev/vinum/concat"
.Sh OBJECT NAMING
.Nm
assigns default names to plexes and subdisks, although they may be overridden.
We do not recommend overriding the default names.
Experience with the
Veritas\(tm
volume manager, which allows arbitrary naming of objects, has shown that this
flexibility does not bring a significant advantage, and it can cause confusion.
.Pp
Names may contain any non-blank character, but it is recommended to restrict
them to letters, digits and the underscore character.
The names of volumes,
plexes and subdisks may be up to 64 characters long, and the names of drives may
be up to 32 characters long.
When choosing volume and plex names, bear in mind
that automatically generated plex and subdisk names are longer than the name
from which they are derived.
.Bl -bullet
.It
When
.Nm
creates or deletes objects, it creates a directory
.Pa /dev/vinum ,
in which it makes device entries for each volume.
It also creates the
subdirectories
.Pa /dev/vinum/plex
and
.Pa /dev/vinum/sd ,
in which it stores device entries for the plexes and subdisks.
In addition, it
creates two more directories,
.Pa /dev/vinum/vol
and
.Pa /dev/vinum/drive ,
in which it stores hierarchical information for volumes and drives.
.It
In addition,
.Nm
creates three super-devices,
.Pa /dev/vinum/control ,
.Pa /dev/vinum/Control
and
.Pa /dev/vinum/controld .
.Pa /dev/vinum/control
is used by
.Xr vinum 8
when it has been compiled without the
.Dv VINUMDEBUG
option,
.Pa /dev/vinum/Control
is used by
.Xr vinum 8
when it has been compiled with the
.Dv VINUMDEBUG
option, and
.Pa /dev/vinum/controld
is used by the
.Nm
daemon.
The two control devices for
.Xr vinum 8
are used to synchronize the debug status of kernel and user modules.
.It
Unlike
.Ux
drives,
.Nm
volumes are not subdivided into partitions, and thus do not contain a disk
label.
Unfortunately, this confuses a number of utilities, notably
.Xr newfs 8 ,
which normally tries to interpret the last letter of a
.Nm
volume name as a partition identifier.
If you use a volume name which does not
end in the letters
.Ql a
to
.Ql c ,
you must use the
.Fl v
flag to
.Xr newfs 8
in order to tell it to ignore this convention.
.\"
.It
Plexes do not need to be assigned explicit names.
By default, a plex name is
the name of the volume followed by the letters
.Pa .p
and the number of the
plex.
For example, the plexes of volume
.Pa vol3
are called
.Pa vol3.p0 , vol3.p1
and so on.
These names can be overridden, but it is not recommended.
.It
Like plexes, subdisks are assigned names automatically, and explicit naming is
discouraged.
A subdisk name is the name of the plex followed by the letters
.Pa .s
and a number identifying the subdisk.
For example, the subdisks of
plex
.Pa vol3.p0
are called
.Pa vol3.p0.s0 , vol3.p0.s1
and so on.
.It
By contrast,
.Em drives
must be named.
This makes it possible to move a drive to a different location
and still recognize it automatically.
Drive names may be up to 32 characters
long.
.El
.Ss Example
Assume the
.Nm
objects described in the section
.Sx "CONFIGURATION FILE"
in
.Xr vinum 8 .
The directory
.Pa /dev/vinum
looks like:
.Bd -literal -offset indent
# ls -lR /dev/vinum
total 5
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 concat
crwx------ 1 root wheel 91, 0x40000000 Mar 30 16:08 control
crwx------ 1 root wheel 91, 0x40000001 Mar 30 16:08 controld
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 drive
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 plex
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 rvol
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 sd
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 strcon
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 stripe
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 tinyvol
drwxrwxrwx 7 root wheel 512 Mar 30 16:08 vol
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 vol5

/dev/vinum/drive:
total 0
crw-r----- 1 root operator 4, 15 Oct 21 16:51 drive2
crw-r----- 1 root operator 4, 31 Oct 21 16:51 drive4

/dev/vinum/plex:
total 0
crwxr-xr-- 1 root wheel 91, 0x10000002 Mar 30 16:08 concat.p0
crwxr-xr-- 1 root wheel 91, 0x10010002 Mar 30 16:08 concat.p1
crwxr-xr-- 1 root wheel 91, 0x10000003 Mar 30 16:08 strcon.p0
crwxr-xr-- 1 root wheel 91, 0x10010003 Mar 30 16:08 strcon.p1
crwxr-xr-- 1 root wheel 91, 0x10000001 Mar 30 16:08 stripe.p0
crwxr-xr-- 1 root wheel 91, 0x10000000 Mar 30 16:08 tinyvol.p0
crwxr-xr-- 1 root wheel 91, 0x10000004 Mar 30 16:08 vol5.p0
crwxr-xr-- 1 root wheel 91, 0x10010004 Mar 30 16:08 vol5.p1

/dev/vinum/sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100002 Mar 30 16:08 concat.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010002 Mar 30 16:08 concat.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100003 Mar 30 16:08 strcon.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110003 Mar 30 16:08 strcon.p1.s1
crwxr-xr-- 1 root wheel 91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100001 Mar 30 16:08 stripe.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100004 Mar 30 16:08 vol5.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/vinum/vol:
total 5
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 concat
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 concat.plex
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 strcon
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 strcon.plex
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 stripe
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 stripe.plex
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 tinyvol
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 tinyvol.plex
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 vol5
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 vol5.plex

/dev/vinum/vol/concat.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000002 Mar 30 16:08 concat.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010002 Mar 30 16:08 concat.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p1.sd

/dev/vinum/vol/concat.plex/concat.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100002 Mar 30 16:08 concat.p0.s1

/dev/vinum/vol/concat.plex/concat.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010002 Mar 30 16:08 concat.p1.s0

/dev/vinum/vol/strcon.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000003 Mar 30 16:08 strcon.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010003 Mar 30 16:08 strcon.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p1.sd

/dev/vinum/vol/strcon.plex/strcon.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100003 Mar 30 16:08 strcon.p0.s1

/dev/vinum/vol/strcon.plex/strcon.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110003 Mar 30 16:08 strcon.p1.s1

/dev/vinum/vol/stripe.plex:
total 1
crwxr-xr-- 1 root wheel 91, 0x10000001 Mar 30 16:08 stripe.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 stripe.p0.sd

/dev/vinum/vol/stripe.plex/stripe.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100001 Mar 30 16:08 stripe.p0.s1

/dev/vinum/vol/tinyvol.plex:
total 1
crwxr-xr-- 1 root wheel 91, 0x10000000 Mar 30 16:08 tinyvol.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 tinyvol.p0.sd

/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1

/dev/vinum/vol/vol5.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000004 Mar 30 16:08 vol5.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010004 Mar 30 16:08 vol5.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p1.sd

/dev/vinum/vol/vol5.plex/vol5.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100004 Mar 30 16:08 vol5.p0.s1

/dev/vinum/vol/vol5.plex/vol5.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110004 Mar 30 16:08 vol5.p1.s1
.Ed
.Pp
In the case of unattached plexes and subdisks, the naming is reversed.
Subdisks
are named after the disk on which they are located, and plexes are named after
the subdisk.
.\" XXX
.Bf -symbolic
This mapping is still to be determined.
.Ef
.Ss Object States
Each
.Nm
object has a
.Em state
associated with it.
.Nm
uses this state to determine the handling of the object.
.Ss Volume States
Volumes may have the following states:
.Bl -hang -width 14n
.It Em down
The volume is completely inaccessible.
.It Em up
The volume is up and at least partially functional.
Not all plexes may be
available.
.El
.Ss "Plex States"
Plexes may have the following states:
.Bl -hang -width 14n
.It Em referenced
A plex entry which has been referenced as part of a volume, but which is
currently not known.
.It Em faulty
A plex which has gone completely down because of I/O errors.
.It Em down
A plex which has been taken down by the administrator.
.It Em initializing
A plex which is being initialized.
.El
.Pp
The remaining states represent plexes which are at least partially up.
.Bl -hang -width 14n
.It Em corrupt
A plex entry which is at least partially up.
Not all subdisks are available,
and an inconsistency has occurred.
If no other plex is uncorrupted, the volume
is no longer consistent.
.It Em degraded
A RAID-5 plex entry which is accessible, but one subdisk is down, requiring
recovery for many I/O requests.
.It Em flaky
A plex which is really up, but which has a reborn subdisk which we do not
completely trust, and which we do not want to read if we can avoid it.
.It Em up
A plex entry which is completely up.
All subdisks are up.
.El
.Ss "Subdisk States"
Subdisks can have the following states:
.Bl -hang -width 14n
.It Em empty
A subdisk entry which has been created completely.
All fields are correct, and
the disk has been updated, but the data on the disk is not valid.
.It Em referenced
A subdisk entry which has been referenced as part of a plex, but which is
currently not known.
.It Em initializing
A subdisk entry which has been created completely and which is currently being
initialized.
.El
.Pp
The following states represent invalid data.
.Bl -hang -width 14n
.It Em obsolete
A subdisk entry which has been created completely.
All fields are correct, the
config on disk has been updated, and the data was valid, but since then the
drive has been taken down, and as a result updates have been missed.
.It Em stale
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has
crashed and updates have been lost.
.El
.Pp
The following states represent valid, inaccessible data.
.Bl -hang -width 14n
.It Em crashed
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down.
No attempt has been made to write to the subdisk since the crash, so the
data is valid.
.It Em down
A subdisk entry which was up, which contained valid data, and which was taken
down by the administrator.
The data is valid.
.It Em reviving
The subdisk is currently in the process of being revived.
We can write but not
read.
.El
.Pp
The following states represent accessible subdisks with valid data.
.Bl -hang -width 14n
.It Em reborn
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down and up again.
No updates were lost, but it is possible that the subdisk
has been damaged.
We won't read from this subdisk if we have a choice.
If this
is the only subdisk which covers this address space in the plex, we set its
state to up under these circumstances, so this status implies that there is
another subdisk to fulfil the request.
.It Em up
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data is valid.
.El
.Ss "Drive States"
Drives can have the following states:
.Bl -hang -width 14n
.It Em referenced
At least one subdisk refers to the drive, but it is not currently accessible to
the system.
No device name is known.
.It Em down
The drive is not accessible.
.It Em up
The drive is up and running.
.El
.Sh DEBUGGING PROBLEMS WITH VINUM
Solving problems with
.Nm
can be a difficult affair.
This section suggests some approaches.
.Ss Configuration problems
It is relatively easy (too easy) to run into problems with the
.Nm
configuration.
If you do, the first thing you should do is stop configuration
updates:
.Pp
.Dl "vinum setdaemon 4"
.Pp
This will stop updates and any further corruption of the on-disk configuration.
.Pp
Next, look at the on-disk configuration with the
.Nm Cm dumpconfig
command, for example:
.if t .ps -3
.if t .vs -3
.Bd -literal
# \fBvinum dumpconfig\fP
Drive 4: Device /dev/da3s0h
        Created on crash.lemis.com at Sat May 20 16:32:44 2000
        Config last updated Sat May 20 16:32:56 2000
        Size: 601052160 bytes (573 MB)
volume obj state up
volume src state up
volume raid state down
volume r state down
volume foo state up
plex name obj.p0 state corrupt org concat vol obj
plex name obj.p1 state corrupt org striped 128b vol obj
plex name src.p0 state corrupt org striped 128b vol src
plex name src.p1 state up org concat vol src
plex name raid.p0 state faulty org disorg vol raid
plex name r.p0 state faulty org disorg vol r
plex name foo.p0 state up org concat vol foo
plex name foo.p1 state faulty org concat vol foo
sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b
.Ed
.if t .vs +3
.if t .ps +3
.Pp
The configuration on all disks should be the same.
If this is not the case,
please save the output to a file and report the problem.
There is probably
little that can be done to recover the on-disk configuration, but if you keep a
copy of the files used to create the objects, you should be able to re-create
them.
The
.Cm create
command does not change the subdisk data, so this will not cause data
corruption.
You may need to use the
.Cm resetconfig
command if you have this kind of trouble.
.Ss Kernel Panics
In order to analyse a panic which you suspect comes from
.Nm
you will need to build a debug kernel.
See the online handbook at
.Pa http://wiki.dragonflybsd.org/index.cgi/DebugKernelCrashDumps
for more details of how to do this.
.Pp
Perform the following steps to analyse a
.Nm
problem:
.Bl -enum
.It
Copy the following files to the directory in which you will be
performing the analysis, typically
.Pa /var/crash :
.Pp
.Bl -bullet -compact
.It
.Pa /sys/dev/raid/vinum/.gdbinit.crash ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.kernel ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.serial ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum
and
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum.paths
.El
.It
Make sure that you build the
.Nm
module with debugging information.
The standard
.Pa Makefile
builds a module with debugging symbols by default.
If the version of
.Nm
in
.Pa /modules
does not contain symbols, you will not get an error message, but the stack trace
will not show the symbols.
Check the module before starting
.Xr kgdb 1 :
.Bd -literal
$ file /modules/vinum.ko
/modules/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
  version 1 (FreeBSD), not stripped
.Ed
.Pp
If the output shows that
.Pa /modules/vinum.ko
is stripped, you will have to find a version which is not.
Usually this will be
either in
.Pa /usr/obj/usr/src/sys/SYSTEM_NAME/usr/src/sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
with a
.Dq Li "make world" )
or
.Pa /sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
in this directory).
Modify the file
.Pa .gdbinit.vinum.paths
accordingly.
.It
Either take a dump or use remote serial
.Xr gdb 1
to analyse the problem.
To analyse a dump, say
.Pa /var/crash/vmcore.5 ,
link
.Pa /var/crash/.gdbinit.crash
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
kgdb kernel.debug vmcore.5
.Ed
.Pp
This example assumes that you have installed the correct debug kernel at
.Pa /var/crash/kernel.debug .
If not, substitute the correct name of the debug kernel.
.Pp
To perform remote serial debugging,
link
.Pa /var/crash/.gdbinit.serial
to
.Pa /var/crash/.gdbinit
and enter
.Bd -literal -offset indent
cd /var/crash
kgdb kernel.debug
.Ed
.Pp
In this case, the
.Pa .gdbinit
file performs the functions necessary to establish connection.
The remote
machine must already be in debug mode: enter the kernel debugger and select
.Ic gdb
(see
.Xr ddb 4
for more details.)
The serial
.Pa .gdbinit
file expects the serial connection to run at 38400 bits per second; if you run
at a different speed, edit the file accordingly (look for the
.Va remotebaud
specification).
.Pp
The following example shows a remote debugging session using the
.Ic debug
command of
.Xr vinum 8 :
.Bd -literal
.if t .ps -3
.if t .vs -3
GDB 4.16 (i386-unknown-dragonfly), Copyright 1996 Free Software Foundation, Inc.
Debugger (msg=0xf1093174 "vinum debug") at ../../i386/i386/db_interface.c:318
318 in_Debugger = 0;
#1 0xf108d9bc in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6dedee0 "",
    flag=0x3, p=0xf68b7940) at
    /usr/src/sys/dev/raid/vinum/vinumioctl.c:102
102 Debugger ("vinum debug");
(kgdb) bt
#0 Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
#1 0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
    flag=0x3, p=0xf688e6c0) at
    /usr/src/sys/dev/raid/vinum/vinumioctl.c:109
#2 0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
#3 0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
#4 0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
#5 0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
    p=0xf688e6c0) at vnode_if.h:395
#6 0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
#7 0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
    tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
    tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
    tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
    tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
#8 0xf020a1fc in Xint0x80_syscall ()
#9 0x804832d in ?? ()
#10 0x80482ad in ?? ()
#11 0x80480e9 in ?? ()
.if t .vs
.if t .ps
.Ed
.Pp
When entering from the debugger, it is important that the source of frame 1
(listed by the
.Pa .gdbinit
file at the top of the example) contains the text
.Dq Li "Debugger (\*[q]vinum debug\*[q]);" .
.Pp
This is an indication that the address specifications are correct.
If you get
some other output, your symbols and the kernel module are out of sync, and the
trace will be meaningless.
.El
.Pp
For an initial investigation, the most important information is the output of
the
.Ic bt
(backtrace) command above.
.Ss Reporting Problems with Vinum
If you find any bugs in
.Nm ,
please report them to
.An Greg Lehey Aq grog@lemis.com .
Supply the following
information:
.Bl -bullet
.It
The output of the
.Nm Cm list
command
(see
.Xr vinum 8 ) .
.It
Any messages printed in
.Pa /var/log/messages .
All such messages will be identified by the text
.Dq Li vinum
at the beginning.
.It
If you have a panic, a stack trace as described above.
.El
.Sh SEE ALSO
.Xr disklabel 5 ,
.Xr disklabel 8 ,
.Xr newfs 8 ,
.Xr vinum 8
.Sh HISTORY
.Nm
first appeared in
.Fx 3.0 .
The RAID-5 component of
.Nm
was developed by Cybernet Inc.\&
.Pq Pa http://www.cybernet.com/ ,
for its NetMAX product.
.Sh AUTHORS
.An Greg Lehey Aq grog@lemis.com .
.Sh BUGS
.Nm
is a new product.
Bugs can be expected.
The configuration mechanism is not yet
fully functional.
If you have difficulties, please look at the section
.Sx "DEBUGGING PROBLEMS WITH VINUM"
before reporting problems.
.Pp
Kernels with the
.Nm
pseudo-device appear to work, but are not supported.
If you have trouble with
this configuration, please first replace the kernel with a
.No non- Ns Nm
kernel and test with the kld module.
.Pp
Detection of differences between the version of the kernel and the kld is not
yet implemented.
.Pp
The RAID-5 functionality is new in
.Fx 3.3 .
Some problems have been
reported with
.Nm
in combination with soft updates, but these are not reproducible on all
systems.
If you are planning to use
.Nm
in a production environment, please test carefully.