1.\" Copyright (c) 2015 The DragonFly Project. All rights reserved. 2.\" 3.\" This code is derived from software contributed to The DragonFly Project 4.\" by Matthew Dillon <dillon@backplane.com> 5.\" 6.\" Redistribution and use in source and binary forms, with or without 7.\" modification, are permitted provided that the following conditions 8.\" are met: 9.\" 10.\" 1. Redistributions of source code must retain the above copyright 11.\" notice, this list of conditions and the following disclaimer. 12.\" 2. Redistributions in binary form must reproduce the above copyright 13.\" notice, this list of conditions and the following disclaimer in 14.\" the documentation and/or other materials provided with the 15.\" distribution. 16.\" 3. Neither the name of The DragonFly Project nor the names of its 17.\" contributors may be used to endorse or promote products derived 18.\" from this software without specific, prior written permission. 19.\" 20.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 21.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 22.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 23.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE 24.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 25.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING, 26.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 27.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED 28.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 29.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT 30.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 31.\" SUCH DAMAGE. 32.\" 33.Dd March 26, 2015 34.Dt HAMMER2 8 35.Os 36.Sh NAME 37.Nm hammer2 38.Nd hammer2 file system utility 39.Sh SYNOPSIS 40.Nm 41.Fl h 42.Nm 43.Op Fl s Ar path 44.Op Fl t Ar type 45.Op Fl u Ar uuid 46.Ar command 47.Op Ar argument ... 48.Sh DESCRIPTION 49The 50.Nm 51utility provides miscellaneous support functions for a 52HAMMER2 file system. 53.Pp 54The options are as follows: 55.Bl -tag -width indent 56.It Fl s Ar path 57Specify the path to a mounted HAMMER2 filesystem. 58At least one PFS on a HAMMER2 filesystem must be mounted for the system 59to act on all PFSs managed by it. 60Every HAMMER2 filesystem typically has a PFS called "LOCAL" for this purpose. 61.It Fl t Ar type 62Specify the type when creating, upgrading, or downgrading a PFS. 63Supported types are MASTER, SLAVE, SOFT_MASTER, SOFT_SLAVE, CACHE, and DUMMY. 64If not specified the pfs-create directive will default to MASTER if no 65uuid is specified, and SLAVE if a uuid is specified. 66.It Fl u Ar uuid 67Specify the cluster uuid when creating a PFS. If not specified, a unique, 68random uuid will be generated. 69Note that every PFS also has a unique pfs_id which is always generated 70and cannot be overridden with an option. 71The { pfs_clid, pfs_fsid } tuple uniquely identifies a component of a cluster. 72.El 73.Pp 74.Nm 75directives are as shown below. 76Note that most directives require you to either be CD'd into a hammer2 77filesystem, specify a path to a mounted hammer2 filesystem via the 78.Fl s 79option, or specify a path after the directive. 80It depends on the directive. 81All hammer2 filesystem have a PFS called "LOCAL" which is typically mounted 82locally on the host in order to be able to issue commands for other PFSs 83on the filesystem. 
.Bl -tag -width indent
.\" ==== connect ====
.It Cm connect Ar target
Add a cluster link entry to the volume header.
The volume header can support up to 255 link entries.
This feature is not currently used.
.\" ==== disconnect ====
.It Cm disconnect Ar target
Delete a cluster link entry from the volume header.
This feature is not currently used.
.\" ==== status ====
.It Cm status Ar path...
Dump a list of all cluster link entries configured in the volume header.
.\" ==== hash ====
.It Cm hash Ar filename...
Compute and print the directory hash for any number of filenames.
.\" ==== pfs-list ====
.It Cm pfs-list Op path...
List all local PFSs available on a mounted HAMMER2 filesystem, their type,
and their current status.
You must mount at least one PFS in order to be able to access the whole list.
.\" ==== pfs-clid ====
.It Cm pfs-clid Ar label
Print the cluster id for a PFS specified by name.
.\" ==== pfs-fsid ====
.It Cm pfs-fsid Ar label
Print the unique filesystem id for a PFS specified by name.
.\" ==== pfs-create ====
.It Cm pfs-create Ar label
Create a local PFS on a mounted HAMMER2 filesystem.
If no uuid is specified the pfs-type defaults to MASTER.
If a uuid is specified via the
.Fl u
option the pfs-type defaults to SLAVE.
Other types can be specified with the
.Fl t
option.
.Pp
If you wish to add a MASTER to an existing cluster, you must first add it as
a SLAVE and then upgrade it to MASTER to properly synchronize it.
.Pp
The DUMMY pfs-type is used to tie network-accessible clusters into the local
machine when no local storage is desired.
This type should be used on minimal H2 partitions or entirely in ram for
netboot-centric systems to provide a tie-in point for the mount command,
or on more complex systems where you need to also access network-centric
clusters.
.Pp
The CACHE or SLAVE pfs-type is typically used when the main store is on
the network but local storage is desired to improve performance.
SLAVE is also used when a backup is desired.
.Pp
Generally speaking, you can mount any PFS element of a cluster in order to
access the cluster via the full cluster protocol.
There are two exceptions.
If you mount a SOFT_SLAVE or a SOFT_MASTER then soft quorum semantics are
employed... the soft slave or soft master's current state will always be used
and the quorum protocol will not be used.
The soft PFS will still be synchronized to masters in the background when
available.
Also, you can use 'mount -o local' to mount ONLY a local HAMMER2 PFS and
not run any network or quorum protocols for the mount.
All such mounts except for a SOFT_MASTER mount will be read-only.
Other than that, you will be mounting the whole cluster when you mount any
PFS within the cluster.
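.Pp
As a sketch of such a local-only mount (the device path and mount point
are placeholders; see
.Xr mount_hammer2 8
for the exact mount syntax):
.Bd -literal -offset indent
mount -t hammer2 -o local /dev/ad0s1d@LOCAL /mnt
.Ed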
.Pp
DUMMY - Create a PFS skeleton intended to be the mount point for a
more complex cluster, probably one that is entirely network based.
No data will be synchronized to this PFS so it is suitable for use
in a network boot image or memory filesystem.
This allows you to create placeholders for mount points on your local
disk, SSD, or memory disk.
.Pp
CACHE - Create a PFS for caching portions of the cluster piecemeal.
This is similar to a SLAVE but does not synchronize the entire contents of
the cluster to the PFS.
Elements found in the CACHE PFS which are validated against the cluster
will be read, presumably providing faster access than having to go to the
cluster.
Only local CACHEs will be updated.
Network-accessible CACHE PFSs might be read but will not be written to.
If you have a large hard-drive-based cluster you can set up localized
SSD CACHE PFSs to improve performance.
.Pp
SLAVE - Create a PFS which maintains synchronization with and provides a
read-only copy of the cluster.
HAMMER2 will prioritize local SLAVEs for data retrieval after validating
their transaction id against the cluster.
The difference between a CACHE and a SLAVE is that the SLAVE is synchronized
to a full copy of the cluster and thus can serve as a backup or be staged
for use as a MASTER later on.
.Pp
SOFT_SLAVE - Create a PFS which maintains synchronization with and provides
a read-only copy of the cluster.
This is one of the special mount cases.
A SOFT_SLAVE will synchronize with the cluster when the cluster is
available, but can still be accessed when the cluster is not available.
.Pp
MASTER - Create a PFS which will hold a master copy of the cluster.
If you create several MASTER PFSs with the same cluster id you are
effectively creating a multi-master cluster and causing a quorum and
cache coherency protocol to be used to validate operations.
The total number of masters is stored in each of the PFSs making up the
cluster.
Filesystem operations will stall for normal mounts if a quorum cannot be
obtained to validate the operation.
MASTER nodes which go offline and return later will synchronize in the
background.
Note that when adding a MASTER to an existing cluster you must add the
new PFS as a SLAVE and then upgrade it to a MASTER.
.Pp
SOFT_MASTER - Create a PFS which maintains synchronization with and provides
a read-write copy of the cluster.
This is one of the special mount cases.
A SOFT_MASTER will synchronize with the cluster when the cluster is
available, but can still be read AND written to even when the cluster
is not available.
Modifications made to a SOFT_MASTER will be automatically flushed to the
cluster when it becomes accessible again, and vice-versa.
Manual intervention may be required if a conflict occurs during
synchronization.
.\" ==== pfs-delete ====
.It Cm pfs-delete Ar label
Delete a local PFS on a mounted HAMMER2 filesystem.
Deleting a PFS of type MASTER requires first downgrading it to a SLAVE (XXX).
.\" ==== snapshot ====
.It Cm snapshot Ar path Op label
Create a snapshot of a directory.
This can only be used on a local PFS, and is only really useful if the PFS
contains a complete copy of what you desire to snapshot; typically that
means a local MASTER, SOFT_MASTER, SLAVE, or SOFT_SLAVE must be present.
Snapshots are created simply by flushing a PFS mount to disk and then copying
the directory inode to the PFS.
The topology is snapshotted without having to be copied or scanned.
Snapshots are effectively separate from the cluster they came from
and can be used as a starting point for a new cluster.
So unless you build a new cluster from the snapshot, it will stay local
to the machine it was made on.
.\" ==== service ====
.It Cm service
Start the
.Nm
service daemon.
This daemon is also automatically started when you run
.Xr mount_hammer2 8 .
The hammer2 service daemon handles incoming TCP connections and maintains
outgoing TCP connections.
It will interconnect available services on the machine (e.g. hammer2 mounts
and xdisks) to the network.
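.Pp
To start the daemon by hand:
.Bd -literal -offset indent
hammer2 service
.Ed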
.\" ==== stat ====
.It Cm stat Op path...
Print the inode statistics, compression, and other meta-data associated
with a list of paths.
.\" ==== leaf ====
.It Cm leaf
XXX
.\" ==== shell ====
.It Cm shell
Start a debug shell to the local hammer2 service daemon via the DMSG protocol.
.\" ==== debugspan ====
.It Cm debugspan
(do not use)
.\" ==== rsainit ====
.It Cm rsainit
Create the
.Pa /etc/hammer2
directory and initialize a public/private keypair in that directory for
use by the network cluster protocols.
.\" ==== show ====
.It Cm show Ar devpath
Dump the radix tree for the HAMMER2 filesystem by scanning a
block device directly.
No mount is required.
.\" ==== freemap ====
.It Cm freemap Ar devpath
Dump the freemap tree for the HAMMER2 filesystem by scanning a
block device directly.
No mount is required.
.\" ==== setcomp ====
.It Cm setcomp Ar mode[:level] Op path...
Set the compression mode as specified for any newly created elements at or
under the path if not overridden by deeper elements
(see the example following this list).
Available modes are none, autozero, lz4, or zlib.
When zlib is used the compression level can be set.
The default is 6, which is the best trade-off between compression ratio
and CPU time.
.Pp
.Xr newfs_hammer2 8
will set the default compression to lz4, which prioritizes speed over
compression ratio.
Also note that HAMMER2 contains a heuristic and will not attempt to
compress every block if it detects a sufficient amount of incompressible
data.
.Pp
Hammer2 compression is only effective when it can reduce the size of a
dataset (typically a 64KB block) by one or more powers of 2.
A 64K block which only compresses to 40K will not yield any storage
improvement.
.\" ==== setcheck ====
.It Cm setcheck Ar check Op path...
Set the check code as specified for any newly created elements at or under
the path if not overridden by deeper elements.
Available codes are default, disabled, crc32, crc64, or sha192.
.\" ==== clrcheck ====
.It Cm clrcheck Op path...
Clear the check code override for the specified paths.
Overrides may still be present in deeper elements.
.\" ==== setcrc32 ====
.It Cm setcrc32 Op path...
Set the check code to the iSCSI 32-bit CRC for any newly created elements
at or under the path if not overridden by deeper elements.
.\" ==== setcrc64 ====
.It Cm setcrc64 Op path...
Set the check code to CRC64 (not yet specified).
.\" ==== setsha192 ====
.It Cm setsha192 Op path...
Set the check code to SHA192 for any newly created elements at or under
the path if not overridden by deeper elements.
.\" ==== bulkfree ====
.It Cm bulkfree Op path...
Run a bulkfree pass on a HAMMER2 mount.
You can specify any PFS for the mount; the bulkfree pass is run on the
entire partition.
.El
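.Pp
As an example of the setcomp directive noted above (the path is a
placeholder), files subsequently created under a directory tree can be
switched to zlib compression at level 9 with:
.Bd -literal -offset indent
hammer2 setcomp zlib:9 /mnt/archive
.Ed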
.Sh SETTING UP /etc/hammer2
The 'rsainit' directive will create the
.Pa /etc/hammer2
directory with appropriate permissions and also generate a public key
pair in this directory for the machine.
These files will be
.Pa rsa.pub
and
.Pa rsa.prv
and needless to say, the private key shouldn't leave the host.
.Pp
The service daemon will also scan the
.Pa /etc/hammer2/autoconn
file, which contains a list of hosts to which it will automatically
maintain connections in order to form your cluster.
The service daemon will automatically reconnect on any failure and will
also monitor the file for changes.
.Pp
When the service daemon receives a connection it expects to find a
public key for that connection in a file in
.Pa /etc/hammer2/remote/
called
.Pa <IPADDR>.pub .
You normally copy the
.Pa rsa.pub
key from the host in question to this file.
The IP address must match exactly or the connection will not be allowed.
.Pp
If you want to use an unencrypted connection you can create empty,
dummy files in the remote directory in the form
.Pa <IPADDR>.none .
We do not recommend using unencrypted connections.
.Sh CLUSTER SERVICES
Currently there are two services which use the cluster network infrastructure,
HAMMER2 mounts and XDISK.
Any HAMMER2 mount will make all PFSs for that filesystem available to the
cluster.
And if the XDISK kernel module is loaded, the hammer2 service daemon will make
your machine's block devices available to the cluster (you must load the
xdisk.ko kernel module before starting the hammer2 service).
They will show up as
.Pa /dev/xa*
and
.Pa /dev/serno/*
devices on the remote machines making up the cluster.
Remote block devices are just what they appear to be... direct access to a
block device on a remote machine.
If the link goes down remote accesses will stall until it comes back up
again, then automatically requeue any pending I/O and resume as if nothing
happened.
However, if the server hosting the physical disks crashes or is rebooted,
any remote opens to its devices will see a permanent I/O failure requiring a
close and open sequence to re-establish.
The latter is necessary because the server's drives might not have committed
the data before the crash, but had already acknowledged the transfer.
.Pp
Data commits work exactly the same as they do for real block devices.
The originator must issue a BUF_CMD_FLUSH.
.Sh ADDING A NEW MASTER TO A CLUSTER
When you
.Xr newfs_hammer2 8
a HAMMER2 filesystem or use the 'pfs-create' directive on one already mounted
to create a new PFS, with no special options, you wind up with a PFS
typed as a MASTER and a unique cluster uuid, but because there is only one
PFS for that cluster (for each PFS you create via pfs-create), it will
act just like a normal filesystem would act and does not require any special
protocols to operate.
.Pp
If you use the 'pfs-create' directive along with the
.Fl u
option to specify a cluster uuid that already exists in the cluster,
you are adding a PFS to an existing cluster and this can trigger a whole
series of events in the background.
When you specify the
.Fl u
option in a 'pfs-create',
.Nm
will by default create a SLAVE PFS.
In fact, this is what must be created first even if you want to add a new
MASTER to your cluster.
.Pp
The most common action a system admin will want to take is to upgrade or
downgrade a PFS.
A new MASTER can be added to the cluster by upgrading an existing SLAVE
to MASTER.
A MASTER can be removed from the cluster by downgrading it to a SLAVE.
Upgrades and downgrades will put nodes in the cluster in a transition state
until the operation is complete.
For downgrades the transition state is fleeting unless one or more other
masters has not acknowledged the change.
For upgrades a background synchronization process must complete before the
transition can be said to be complete, and the node remains (really) a SLAVE
until that transition is complete.
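.Pp
A minimal sketch of the first step, creating the new node as a SLAVE
(the labels are hypothetical and the uuid is a placeholder):
.Bd -literal -offset indent
# print the cluster uuid of a PFS already in the cluster
hammer2 pfs-clid MYPFS
# create the new PFS as a SLAVE sharing that cluster uuid
hammer2 -u <cluster-uuid> pfs-create NEWSLAVE
.Ed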
.Sh USE CASES FOR A SOFT_MASTER
The SOFT_MASTER PFS type is a special type which must be specifically
mounted by a machine.
It is a R/W mount which does not use the quorum protocol and is not
cache coherent with the cluster, but which synchronizes from the cluster
and allows modifying operations which will synchronize to the cluster.
The most common case is to use a SOFT_MASTER PFS in a laptop, allowing you
to work on your laptop when you are on the road and not connected to
your main servers, and for the laptop to synchronize when a connection is
available.
.Sh USE CASES FOR A SOFT_SLAVE
A SOFT_SLAVE PFS type is a special type which must be specifically mounted
by a machine.
It is a RO mount which does not use the quorum protocol and is not
cache coherent with the cluster.
It will receive synchronization from the cluster when network connectivity
is available but will not stall if network connectivity is lost.
.Sh FSYNC FLUSH MODES
TODO.
.Sh RESTORING FROM A SNAPSHOT BACKUP
TODO.
.Sh PERFORMANCE TUNING
Because HAMMER2 implements compression, decompression, and dedup natively,
it always double-buffers file data.
This means that the file data is cached via the device vnode (in
compressed / dedupped form) and the same data is also cached by the file
vnode (in decompressed / non-dedupped form).
.Pp
While HAMMER2 will try to age the logical file buffers on its own, some
additional performance tuning may be necessary for optimal operation
whether swapcache is used or not.
Our recommendation is to reduce the number of vnodes (and thus also the
logical buffer cache behind the vnodes) that the system caches via the
.Va kern.maxvnodes
sysctl.
.Pp
Too large a value will result in excessive double-caching and can cause
unnecessary read disk I/O.
We recommend a number between 25000 and 250000 vnodes, depending on your
use case.
Keep in mind that even though the vnode cache is smaller, this will make
room for a great deal more device-level buffer caching which can encompass
far more data and meta-data than the vnode-level caching.
.Sh ENVIRONMENT
TODO.
.Sh FILES
.Bl -tag -width ".It Pa <fs>/abc/defghi/<name>" -compact
.It Pa /etc/hammer2/
.It Pa /etc/hammer2/rsa.pub
.It Pa /etc/hammer2/rsa.prv
.It Pa /etc/hammer2/autoconn
.It Pa /etc/hammer2/remote/<IP>.pub
.It Pa /etc/hammer2/remote/<IP>.none
.El
.Sh EXIT STATUS
.Ex -std
.Sh SEE ALSO
.Xr mount_hammer2 8 ,
.Xr mount_null 8 ,
.Xr newfs_hammer2 8 ,
.Xr swapcache 8 ,
.Xr sysctl 8
.Sh HISTORY
The
.Nm
utility first appeared in
.Dx 4.1 .
.Sh AUTHORS
.An Matthew Dillon Aq Mt dillon@backplane.com