.\" Copyright (c) 2015 The DragonFly Project.  All rights reserved.
.\"
.\" This code is derived from software contributed to The DragonFly Project
.\" by Matthew Dillon <dillon@backplane.com>
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in
.\"    the documentation and/or other materials provided with the
.\"    distribution.
.\" 3. Neither the name of The DragonFly Project nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific, prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd December 2, 2017
.Dt HAMMER2 8
.Os
.Sh NAME
.Nm hammer2
.Nd hammer2 file system utility
.Sh SYNOPSIS
.Nm
.Fl h
.Nm
.Op Fl s Ar path
.Op Fl t Ar type
.Op Fl u Ar uuid
.Op Fl m Ar mem
.Ar command
.Op Ar argument ...
.Sh DESCRIPTION
The
.Nm
utility provides miscellaneous support functions for a
HAMMER2 file system.
.Pp
The options are as follows:
.Bl -tag -width indent
.It Fl s Ar path
Specify the path to a mounted HAMMER2 filesystem.
At least one PFS on a HAMMER2 filesystem must be mounted for the system
to act on all PFSs managed by it.
Every HAMMER2 filesystem typically has a PFS called "LOCAL" for this purpose.
.It Fl t Ar type
Specify the type when creating, upgrading, or downgrading a PFS.
Supported types are MASTER, SLAVE, SOFT_MASTER, SOFT_SLAVE, CACHE, and DUMMY.
If not specified the pfs-create directive will default to MASTER if no
uuid is specified, and SLAVE if a uuid is specified.
.It Fl u Ar uuid
Specify the cluster uuid when creating a PFS.
If not specified, a unique, random uuid will be generated.
Note that every PFS also has a unique pfs_fsid which is always generated
and cannot be overridden with an option.
The { pfs_clid, pfs_fsid } tuple uniquely identifies a component of a cluster.
.It Fl m Ar mem
Specify how much tracking memory to use for certain directives.
At the moment, this option is only applicable to the
.Cm bulkfree
directive, allowing it to operate in fewer passes when given more memory.
A nominal value for a 4TB drive with a lot of data on it would be around
a gigabyte ('-m 1g'); see the example below.
.El
.Pp
.Nm
directives are as shown below.
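.Pp
For example, the following hypothetical invocations combine the options
above with a directive; the mount point
.Pa /mnt/data
is a placeholder:
.Bd -literal -offset indent
hammer2 -s /mnt/data pfs-list
hammer2 -s /mnt/data -m 1g bulkfree
.Ed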
Note that most directives require you to either be CD'd into a hammer2
filesystem, specify a path to a mounted hammer2 filesystem via the
.Fl s
option, or specify a path after the directive.
It depends on the directive.
All hammer2 filesystems have a PFS called "LOCAL" which is typically mounted
locally on the host in order to be able to issue commands for other PFSs
on the filesystem.
The mount also enables PFS configuration scanning for that filesystem.
.Bl -tag -width indent
.\" ==== cleanup ====
.It Cm cleanup Op path
Perform manual cleanup passes on paths or all mounted partitions.
.\" ==== connect ====
.It Cm connect Ar target
Add a cluster link entry to the volume header.
The volume header can support up to 255 link entries.
This feature is not currently used.
.\" ==== destroy ====
.It Cm destroy Ar path
Destroy the specified directory entry in a hammer2 filesystem.
This bypasses all normal checks and will unconditionally destroy the
directory entry.
The underlying inode is not checked and, if it does exist, its nlinks count
is not decremented.
This directive should only be used to destroy a corrupted directory entry
which no longer has a working inode.
.Pp
Note that this command may desynchronize the system namecache for the
specified entry.
If this happens, you may have to unmount and remount the filesystem.
.\" ==== disconnect ====
.It Cm disconnect Ar target
Delete a cluster link entry from the volume header.
This feature is not currently used.
.\" ==== info ====
.It Cm info Op devpath
Access and print the status and super-root entries for all HAMMER2
partitions found in /dev/serno or the specified device path(s).
The partitions do not have to be mounted.
Note that only mounted partitions will be under active management.
This is accomplished by mounting at least one PFS within the partition.
Typically at least the @LOCAL PFS is mounted.
.\" ==== mountall ====
.It Cm mountall Op devpath
This directive mounts the @LOCAL PFS on all HAMMER2 partitions found
in /dev/serno, or the specified device path(s).
The partitions are mounted as /var/hammer2/LOCAL.<id>.
Mounts are executed in the background and this command will wait a
limited amount of time for the mounts to complete before returning.
.\" ==== status ====
.It Cm status Ar path...
Dump a list of all cluster link entries configured in the volume header.
.\" ==== hash ====
.It Cm hash Ar filename...
Compute and print the directory hash for any number of filenames.
.\" ==== pfs-list ====
.It Cm pfs-list Op path...
List all local PFSs available on a mounted HAMMER2 filesystem, their type,
and their current status.
You must mount at least one PFS in order to be able to access the whole list.
.\" ==== pfs-clid ====
.It Cm pfs-clid Ar label
Print the cluster id for a PFS specified by name.
.\" ==== pfs-fsid ====
.It Cm pfs-fsid Ar label
Print the unique filesystem id for a PFS specified by name.
.\" ==== pfs-create ====
.It Cm pfs-create Ar label
Create a local PFS on a mounted HAMMER2 filesystem.
If no uuid is specified the pfs-type defaults to MASTER.
If a uuid is specified via the
.Fl u
option the pfs-type defaults to SLAVE.
Other types can be specified with the
.Fl t
option, as shown in the example below.
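.Pp
For example, a sketch of creating PFSs of different types; the mount point,
label, and uuid are placeholders:
.Bd -literal -offset indent
# a new MASTER with its own, newly generated cluster uuid
hammer2 -s /mnt/data pfs-create mydata

# a SLAVE that joins the existing cluster identified by <cluster-uuid>
hammer2 -s /mnt/data -u <cluster-uuid> pfs-create mydata
.Ed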
.Pp
If you wish to add a MASTER to an existing cluster, you must first add it as
a SLAVE and then upgrade it to MASTER to properly synchronize it.
.Pp
The DUMMY pfs-type is used to tie network-accessible clusters into the local
machine when no local storage is desired.
This type should be used on minimal H2 partitions or entirely in RAM for
netboot-centric systems to provide a tie-in point for the mount command,
or on more complex systems where you need to also access network-centric
clusters.
.Pp
The CACHE or SLAVE pfs-type is typically used when the main store is on
the network but local storage is desired to improve performance.
SLAVE is also used when a backup is desired.
.Pp
Generally speaking, you can mount any PFS element of a cluster in order to
access the cluster via the full cluster protocol.
There are two exceptions.
If you mount a SOFT_SLAVE or a SOFT_MASTER then soft quorum semantics are
employed... the soft slave or soft master's current state will always be used
and the quorum protocol will not be used.
The soft PFS will still be synchronized to masters in the background when
available.
Also, you can use
.Sq mount -o local
to mount ONLY a local HAMMER2 PFS and
not run any network or quorum protocols for the mount.
All such mounts except for a SOFT_MASTER mount will be read-only.
Other than that, you will be mounting the whole cluster when you mount any
PFS within the cluster.
.Pp
DUMMY - Create a PFS skeleton intended to be the mount point for a
more complex cluster, probably one that is entirely network based.
No data will be synchronized to this PFS so it is suitable for use
in a network boot image or memory filesystem.
This allows you to create placeholders for mount points on your local
disk, SSD, or memory disk.
.Pp
CACHE - Create a PFS for caching portions of the cluster piecemeal.
This is similar to a SLAVE but does not synchronize the entire contents of
the cluster to the PFS.
Elements found in the CACHE PFS which are validated against the cluster
will be read, presumably a faster access than having to go to the cluster.
Only local CACHEs will be updated.
Network-accessible CACHE PFSs might be read but will not be written to.
If you have a large hard-drive-based cluster you can set up localized
SSD CACHE PFSs to improve performance.
.Pp
SLAVE - Create a PFS which maintains synchronization with and provides a
read-only copy of the cluster.
HAMMER2 will prioritize local SLAVEs for data retrieval after validating
their transaction id against the cluster.
The difference between a CACHE and a SLAVE is that the SLAVE is synchronized
to a full copy of the cluster and thus can serve as a backup or be staged
for use as a MASTER later on.
.Pp
SOFT_SLAVE - Create a PFS which maintains synchronization with and provides
a read-only copy of the cluster.
This is one of the special mount cases.
A SOFT_SLAVE will synchronize with
the cluster when the cluster is available, but can still be accessed when
the cluster is not available.
.Pp
MASTER - Create a PFS which will hold a master copy of the cluster.
If you create several MASTER PFSs with the same cluster id you are
effectively creating a multi-master cluster and causing a quorum and
cache coherency protocol to be used to validate operations.
The total number of masters is stored in each PFS making up the cluster.
Filesystem operations will stall for normal mounts if a quorum cannot be
obtained to validate the operation.
MASTER nodes which go offline and return later will synchronize in the
background.
Note that when adding a MASTER to an existing cluster you must add the
new PFS as a SLAVE and then upgrade it to a MASTER.
.Pp
SOFT_MASTER - Create a PFS which maintains synchronization with and provides
a read-write copy of the cluster.
This is one of the special mount cases.
A SOFT_MASTER will synchronize with
the cluster when the cluster is available, but can still be read AND written
to even when the cluster is not available.
Modifications made to a SOFT_MASTER will be automatically flushed to the
cluster when it becomes accessible again, and vice versa.
Manual intervention may be required if a conflict occurs during
synchronization.
.\" ==== pfs-delete ====
.It Cm pfs-delete Ar label
Delete a local PFS on a mounted HAMMER2 filesystem.
Deleting a PFS of type MASTER requires first downgrading it to a SLAVE (XXX).
.\" ==== snapshot ====
.It Cm snapshot Ar path Op label
Create a snapshot of a directory.
This can only be used on a local PFS, and is only really useful if the PFS
contains a complete copy of what you desire to snapshot, so that typically
means a local MASTER, SOFT_MASTER, SLAVE, or SOFT_SLAVE must be present.
Snapshots are created simply by flushing a PFS mount to disk and then copying
the directory inode to the PFS.
The topology is snapshotted without having to be copied or scanned.
Snapshots are effectively separate from the cluster they came from
and can be used as a starting point for a new cluster.
So unless you build a new cluster from the snapshot, it will stay local
to the machine it was made on.
.\" ==== service ====
.It Cm service
Start the
.Nm
service daemon.
This daemon is also automatically started when you run
.Xr mount_hammer2 8 .
The hammer2 service daemon handles incoming TCP connections and maintains
outgoing TCP connections.
It will interconnect available services on the
machine (e.g. hammer2 mounts and xdisks) to the network.
.\" ==== stat ====
.It Cm stat Op path...
Print the inode statistics, compression, and other meta-data associated
with a list of paths.
.\" ==== leaf ====
.It Cm leaf
XXX
.\" ==== shell ====
.It Cm shell
Start a debug shell to the local hammer2 service daemon via the DMSG protocol.
.\" ==== debugspan ====
.It Cm debugspan
(do not use)
.\" ==== rsainit ====
.It Cm rsainit
Create the
.Pa /etc/hammer2
directory and initialize a public/private keypair in that directory for
use by the network cluster protocols.
.\" ==== show ====
.It Cm show Ar devpath
Dump the radix tree for the HAMMER2 filesystem by scanning a
block device directly.
No mount is required.
.\" ==== freemap ====
.It Cm freemap Ar devpath
Dump the freemap tree for the HAMMER2 filesystem by scanning a
block device directly.
No mount is required.
.\" ==== setcomp ====
.It Cm setcomp Ar mode[:level] Op path...
Set the compression mode as specified for any newly created elements at or
under the path if not overridden by deeper elements.
Available modes are none, autozero, lz4, or zlib.
When zlib is used the compression level can be set.
The default will be 6 which is the best trade-off between compression
ratio and speed.
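.Pp
For example, a sketch of setting the compression mode on a directory tree;
the path is a placeholder:
.Bd -literal -offset indent
# compress new files under /mnt/data/archive with zlib level 9
hammer2 setcomp zlib:9 /mnt/data/archive

# use the faster lz4 mode instead
hammer2 setcomp lz4 /mnt/data/archive
.Ed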
.Pp
newfs_hammer2 will set the default compression to lz4 which prioritizes
speed over compression ratio.
Also note that HAMMER2 contains a heuristic and will not attempt to
compress every block if it detects a sufficient amount of uncompressible
data.
.Pp
Hammer2 compression is only effective when it can reduce the size of a
dataset (typically a 64KB block) by one or more powers of 2.
A 64K block which only compresses to 40K will not yield any storage
improvement.
.Pp
Generally speaking you do not want to set the compression mode to
.Sq none ,
as this will cause blocks of all-zeros to be written as all-zero blocks,
instead of holes.
The
.Sq autozero
compression mode detects blocks of all-zeros
and writes them as holes.
However, HAMMER2 will rewrite data in-place if
the compression mode is set to
.Sq none
and the check code is set to
.Sq disabled .
Formal snapshots will still snapshot such files.
However, de-duplication will no longer function on the data blocks.
.\" ==== setcheck ====
.It Cm setcheck Ar check Op path...
Set the check code as specified for any newly created elements at or under
the path if not overridden by deeper elements.
Available codes are default, disabled, crc32, xxhash64, or sha192.
.\" ==== clrcheck ====
.It Cm clrcheck Op path...
Clear the check code override for the specified paths.
Overrides may still be present in deeper elements.
.\" ==== setcrc32 ====
.It Cm setcrc32 Op path...
Set the check code to the iSCSI 32-bit CRC for any newly created elements
at or under the path if not overridden by deeper elements.
.\" ==== setxxhash64 ====
.It Cm setxxhash64 Op path...
Set the check code to XXHASH64, a fast 64-bit hash, for any newly created
elements at or under the path if not overridden by deeper elements.
.\" ==== setsha192 ====
.It Cm setsha192 Op path...
Set the check code to SHA192 for any newly created elements at or under
the path if not overridden by deeper elements.
.\" ==== bulkfree ====
.It Cm bulkfree Op path...
Run a bulkfree pass on a HAMMER2 mount.
You can specify any PFS for the mount, the bulkfree pass is run on the
entire partition.
Note that it takes two passes to actually free space.
By default this directive will use up to 1/16 of physical memory to track
the freemap.
The amount of memory used may be overridden with the
.Op Fl m Ar mem
option.
.El
.Sh SYSCTLS
.Bl -tag -width indent
.It Va vfs.hammer2.dedup_enable (default on)
Enables live de-duplication.
Any recently read data that is on-media
(already synchronized to media) is tested against pending writes for
compatibility.
If a match is found, the write will reference the
existing on-media data instead of writing new data.
.It Va vfs.hammer2.always_compress (default off)
This disables the H2 compression heuristic and forces H2 to always
try to compress data blocks, even if they look uncompressible.
Enabling this option reduces performance but has higher de-duplication
repeatability.
.It Va vfs.hammer2.cluster_data_read (default 4)
.It Va vfs.hammer2.cluster_meta_read (default 1)
Set the amount of read-ahead clustering to perform on data and meta-data
blocks.
.It Va vfs.hammer2.cluster_write (default 4)
Set the amount of write-behind clustering to perform in buffers.
Each buffer represents 64KB.
The default is 4 and higher values typically do not improve performance.
A value of 0 disables clustered writes.
This variable applies to the underlying media device, not to logical
file writes, so it should not interfere with temporary file optimization.
Generally speaking you want this enabled to generate smoothly pipelined
writes to the media.
.It Va vfs.hammer2.bulkfree_tps (default 5000)
Set bulkfree's maximum scan rate.
This is primarily intended to limit
I/O utilization on SSDs and cpu utilization when the meta-data is mostly
cached in memory.
.El
.Sh SETTING UP /etc/hammer2
The
.Sq rsainit
directive will create the
.Pa /etc/hammer2
directory with appropriate permissions and also generate a public key
pair in this directory for the machine.
These files will be
.Pa rsa.pub
and
.Pa rsa.prv
and needless to say, the private key shouldn't leave the host.
.Pp
The service daemon will also scan the
.Pa /etc/hammer2/autoconn
file which contains a list of hosts which it will automatically maintain
connections to in order to form your cluster.
The service daemon will automatically reconnect on any failure and will
also monitor the file for changes.
.Pp
When the service daemon receives a connection it expects to find a
public key for that connection in a file in
.Pa /etc/hammer2/remote/
called
.Pa <IPADDR>.pub .
You normally copy the
.Pa rsa.pub
key from the host in question to this file.
The IP address must match exactly or the connection will not be allowed.
.Pp
If you want to use an unencrypted connection you can create empty,
dummy files in the remote directory in the form
.Pa <IPADDR>.none .
We do not recommend using unencrypted connections.
.Sh CLUSTER SERVICES
Currently there are two services which use the cluster network
infrastructure: HAMMER2 mounts and XDISK.
Any HAMMER2 mount will make all PFSs for that filesystem available to the
cluster.
If the XDISK kernel module is loaded, the hammer2 service daemon will make
your machine's block devices available to the cluster (you must load the
xdisk.ko kernel module before starting the hammer2 service).
They will show up as
.Pa /dev/xa*
and
.Pa /dev/serno/*
devices on the remote machines making up the cluster.
Remote block devices are just what they appear to be... direct access to a
block device on a remote machine.
If the link goes down remote accesses
will stall until it comes back up again, then automatically requeue any
pending I/O and resume as if nothing happened.
However, if the server hosting the physical disks crashes or is rebooted,
any remote opens to its devices will see a permanent I/O failure requiring a
close and open sequence to re-establish.
The latter is necessary because the server's drives might not have committed
the data before the crash, but had already acknowledged the transfer.
.Pp
Data commits work exactly the same as they do for real block devices.
The originator must issue a BUF_CMD_FLUSH.
.Sh ADDING A NEW MASTER TO A CLUSTER
When you
.Xr newfs_hammer2 8
a HAMMER2 filesystem or use the
.Sq pfs-create
directive on one already mounted
to create a new PFS, with no special options, you wind up with a PFS
typed as a MASTER and a unique cluster uuid, but because there is only one
PFS for that cluster (for each PFS you create via pfs-create), it will
act just like a normal filesystem would act and does not require any special
protocols to operate.
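.Pp
For example, a hypothetical way to look up the cluster uuid of such a PFS,
so that it can later be given to
.Sq pfs-create
via the
.Fl u
option on another node; the mount point and label are placeholders:
.Bd -literal -offset indent
hammer2 -s /mnt/data pfs-clid myproject
.Ed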
.Pp
If you use the
.Sq pfs-create
directive along with the
.Fl u
option to specify a cluster uuid that already exists in the cluster,
you are adding a PFS to an existing cluster and this can trigger a whole
series of events in the background.
When you specify the
.Fl u
option in a
.Sq pfs-create ,
.Nm
will by default create a SLAVE PFS.
In fact, this is what must be created first even if you want to add a new
MASTER to your cluster.
.Pp
The most common action a system admin will want to take is to upgrade or
downgrade a PFS.
A new MASTER can be added to the cluster by upgrading an existing SLAVE
to MASTER.
A MASTER can be removed from the cluster by downgrading it to a SLAVE.
Upgrades and downgrades will put nodes in the cluster in a transition state
until the operation is complete.
For downgrades the transition state is fleeting unless one or more other
masters have not acknowledged the change.
For upgrades a background synchronization process must complete before the
transition can be said to be complete, and the node remains (really) a SLAVE
until that transition is complete.
.Sh USE CASES FOR A SOFT_MASTER
The SOFT_MASTER PFS type is a special type which must be specifically
mounted by a machine.
It is a R/W mount which does not use the quorum protocol and is not
cache coherent with the cluster, but which synchronizes from the cluster
and allows modifying operations which will synchronize to the cluster.
The most common case is to use a SOFT_MASTER PFS in a laptop allowing you
to work on your laptop when you are on the road and not connected to
your main servers, and for the laptop to synchronize when a connection is
available.
.Sh USE CASES FOR A SOFT_SLAVE
A SOFT_SLAVE PFS type is a special type which must be specifically mounted
by a machine.
It is a RO mount which does not use the quorum protocol and is not
cache coherent with the cluster.
It will receive synchronization from
the cluster when network connectivity is available but will not stall if
network connectivity is lost.
.Sh FSYNC FLUSH MODES
TODO.
.Sh RESTORING FROM A SNAPSHOT BACKUP
TODO.
.Sh PERFORMANCE TUNING
Because HAMMER2 implements compression, decompression, and dedup natively,
it always double-buffers file data.
This means that the file data is
cached via the device vnode (in compressed / dedupped-form) and the same
data is also cached by the file vnode (in decompressed / non-dedupped form).
.Pp
While HAMMER2 will try to age the logical file buffers on its own, some
additional performance tuning may be necessary for optimal operation
whether swapcache is used or not.
Our recommendation is to reduce the
number of vnodes (and thus also the logical buffer cache behind the
vnodes) that the system caches via the
.Va kern.maxvnodes
sysctl.
.Pp
Too large a value will result in excessive double-caching and can cause
unnecessary read disk I/O.
We recommend a number between 25000 and 250000 vnodes, depending on your
use case.
Keep in mind that even though the vnode cache is smaller, this will make
room for a great deal more device-level buffer caching which can encompass
far more data and meta-data than the vnode-level caching.
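.Pp
For example, a hypothetical setting within the recommended range, which can
also be made persistent via
.Xr sysctl.conf 5 :
.Bd -literal -offset indent
sysctl kern.maxvnodes=100000
.Ed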
.Sh ENVIRONMENT
TODO.
.Sh FILES
.Bl -tag -width ".It Pa <fs>/abc/defghi/<name>" -compact
.It Pa /etc/hammer2/
.It Pa /etc/hammer2/rsa.pub
.It Pa /etc/hammer2/rsa.prv
.It Pa /etc/hammer2/autoconn
.It Pa /etc/hammer2/remote/<IP>.pub
.It Pa /etc/hammer2/remote/<IP>.none
.El
.Sh EXIT STATUS
.Ex -std
.Sh SEE ALSO
.Xr mount_hammer2 8 ,
.Xr mount_null 8 ,
.Xr newfs_hammer2 8 ,
.Xr swapcache 8 ,
.Xr sysctl 8
.Sh HISTORY
The
.Nm
utility first appeared in
.Dx 4.1 .
.Sh AUTHORS
.An Matthew Dillon Aq Mt dillon@backplane.com