/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or https://opensource.org/licenses/CDDL-1.0.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */
/*
 * Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/*
 * Copyright (c) 2012, 2015 by Delphix. All rights reserved.
 * Copyright (c) 2024, Klara Inc.
 */

#ifndef _ZIO_IMPL_H
#define	_ZIO_IMPL_H

#ifdef	__cplusplus
extern "C" {
#endif

/*
 * XXX -- Describe ZFS I/O pipeline here. Fill in as needed.
 *
 * The ZFS I/O pipeline is comprised of various stages which are defined
 * in the zio_stage enum below. The individual stages are used to construct
 * these basic I/O operations: Read, Write, Free, Claim, Flush and Trim.
 *
 * I/O operations: (XXX - provide detail for each of the operations)
 *
 * Read:
 * Write:
 * Free:
 * Claim:
 * Flush:
 * Trim:
 *
 * Although the most common pipelines are used by the basic I/O operations
 * above, there are some helper pipelines (one could consider them
 * sub-pipelines) which are used internally by the ZIO module and are
 * explained below:
 *
 * Interlock Pipeline:
 * The interlock pipeline is the most basic pipeline and is used by all
 * of the I/O operations. The interlock pipeline does not perform any I/O
 * and is used to coordinate the dependencies between I/Os that are being
 * issued (i.e. the parent/child relationship).
 *
 * Vdev child Pipeline:
 * The vdev child pipeline is responsible for performing the physical I/O.
 * It is in this pipeline where the I/Os are queued and possibly cached.
 *
 * In addition to performing I/O, the pipeline is also responsible for
 * data transformations. The transformations performed are based on the
 * specific properties that the user may have selected and modify the
 * behavior of the pipeline.
 * Examples of supported transformations are compression, dedup, and nop
 * writes. Transformations will either modify the data or the pipeline.
 * The list below further describes each of the supported transformations:
 *
 * Compression:
 * ZFS supports five different flavors of compression -- gzip, lzjb, lz4, zle,
 * and zstd. Compression occurs as part of the write pipeline and is
 * performed in the ZIO_STAGE_WRITE_COMPRESS stage.
 *
 * Block cloning:
 * The block cloning functionality introduces the ZIO_STAGE_BRT_FREE stage,
 * which is called during a free pipeline. If the block is referenced in the
 * Block Reference Table (BRT) we will just decrease its reference counter
 * instead of actually freeing the block.
 *
 * Dedup:
 * Dedup reads are handled by the ZIO_STAGE_DDT_READ_START and
 * ZIO_STAGE_DDT_READ_DONE stages. These stages are added to an existing
 * read pipeline if the dedup bit is set on the block pointer.
 * Writing a dedup block is performed by the ZIO_STAGE_DDT_WRITE stage
 * and added to a write pipeline if a user has enabled dedup on that
 * particular dataset.
 *
 * NOP Write:
 * The NOP write feature is performed by the ZIO_STAGE_NOP_WRITE stage
 * and is added to an existing write pipeline if a cryptographically
 * secure checksum (e.g. SHA256) is enabled and compression is turned on.
 * The NOP write stage will compare the checksums of the current data
 * on-disk (level-0 blocks only) and the data that is currently being written.
 * If the checksum values are identical then the pipeline is converted to
 * an interlock pipeline skipping block allocation and bypassing the
 * physical I/O.  The nop write feature can handle writes in either
 * syncing or open context (i.e. zil writes) and as a result is mutually
 * exclusive with dedup.
 *
 * Encryption:
 * Encryption and authentication are handled by the ZIO_STAGE_ENCRYPT stage.
 * This stage determines how the encryption metadata is stored in the bp.
 * Decryption and MAC verification are performed during zio_decrypt() as a
 * transform callback. Encryption is mutually exclusive with nopwrite, because
 * blocks with the same plaintext will be encrypted with different salts and
 * IVs (if dedup is off), and therefore have different ciphertexts. For dedup
 * blocks we deterministically generate the IV and salt by performing an HMAC
 * of the plaintext, which is computationally expensive, but allows us to keep
 * support for encrypted dedup. See the block comment in zio_crypt.c for
 * details.
 */

/*
 * zio pipeline stage definitions
 */
enum zio_stage {
	ZIO_STAGE_OPEN			= 1 << 0,	/* RWFCXT */

	ZIO_STAGE_READ_BP_INIT		= 1 << 1,	/* R----- */
	ZIO_STAGE_WRITE_BP_INIT		= 1 << 2,	/* -W---- */
	ZIO_STAGE_FREE_BP_INIT		= 1 << 3,	/* --F--- */
	ZIO_STAGE_ISSUE_ASYNC		= 1 << 4,	/* -WF--T */
	ZIO_STAGE_WRITE_COMPRESS	= 1 << 5,	/* -W---- */

	ZIO_STAGE_ENCRYPT		= 1 << 6,	/* -W---- */
	ZIO_STAGE_CHECKSUM_GENERATE	= 1 << 7,	/* -W---- */

	ZIO_STAGE_NOP_WRITE		= 1 << 8,	/* -W---- */

	ZIO_STAGE_BRT_FREE		= 1 << 9,	/* --F--- */

	ZIO_STAGE_DDT_READ_START	= 1 << 10,	/* R----- */
	ZIO_STAGE_DDT_READ_DONE		= 1 << 11,	/* R----- */
	ZIO_STAGE_DDT_WRITE		= 1 << 12,	/* -W---- */
	ZIO_STAGE_DDT_FREE		= 1 << 13,	/* --F--- */

	ZIO_STAGE_GANG_ASSEMBLE		= 1 << 14,	/* RWFC-- */
	ZIO_STAGE_GANG_ISSUE		= 1 << 15,	/* RWFC-- */

	ZIO_STAGE_DVA_THROTTLE		= 1 << 16,	/* -W---- */
	ZIO_STAGE_DVA_ALLOCATE		= 1 << 17,	/* -W---- */
	ZIO_STAGE_DVA_FREE		= 1 << 18,	/* --F--- */
	ZIO_STAGE_DVA_CLAIM		= 1 << 19,	/* ---C-- */

	ZIO_STAGE_READY			= 1 << 20,	/* RWFCXT */

	ZIO_STAGE_VDEV_IO_START		= 1 << 21,	/* RW--XT */
	ZIO_STAGE_VDEV_IO_DONE		= 1 << 22,	/* RW--XT */
	ZIO_STAGE_VDEV_IO_ASSESS	= 1 << 23,	/* RW--XT */

	ZIO_STAGE_CHECKSUM_VERIFY	= 1 << 24,	/* R----- */

	ZIO_STAGE_DONE			= 1 << 25	/* RWFCXT */
};

#define	ZIO_ROOT_PIPELINE			\
	ZIO_STAGE_DONE

#define	ZIO_INTERLOCK_STAGES			\
	(ZIO_STAGE_READY |			\
	ZIO_STAGE_DONE)

#define	ZIO_INTERLOCK_PIPELINE			\
	ZIO_INTERLOCK_STAGES

#define	ZIO_VDEV_IO_STAGES			\
	(ZIO_STAGE_VDEV_IO_START |		\
	ZIO_STAGE_VDEV_IO_DONE |		\
	ZIO_STAGE_VDEV_IO_ASSESS)

#define	ZIO_VDEV_CHILD_PIPELINE			\
	(ZIO_VDEV_IO_STAGES |			\
	ZIO_STAGE_DONE)

#define	ZIO_READ_COMMON_STAGES			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_VDEV_IO_STAGES |			\
	ZIO_STAGE_CHECKSUM_VERIFY)

#define	ZIO_READ_PHYS_PIPELINE			\
	ZIO_READ_COMMON_STAGES

#define	ZIO_READ_PIPELINE			\
	(ZIO_READ_COMMON_STAGES |		\
	ZIO_STAGE_READ_BP_INIT)

#define	ZIO_DDT_CHILD_READ_PIPELINE		\
	ZIO_READ_COMMON_STAGES

#define	ZIO_DDT_READ_PIPELINE			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_STAGE_READ_BP_INIT |		\
	ZIO_STAGE_DDT_READ_START |		\
	ZIO_STAGE_DDT_READ_DONE)

#define	ZIO_WRITE_COMMON_STAGES			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_VDEV_IO_STAGES |			\
	ZIO_STAGE_ISSUE_ASYNC |			\
	ZIO_STAGE_CHECKSUM_GENERATE)

#define	ZIO_WRITE_PHYS_PIPELINE			\
	ZIO_WRITE_COMMON_STAGES

#define	ZIO_REWRITE_PIPELINE			\
	(ZIO_WRITE_COMMON_STAGES |		\
	ZIO_STAGE_WRITE_COMPRESS |		\
	ZIO_STAGE_ENCRYPT |			\
	ZIO_STAGE_WRITE_BP_INIT)

#define	ZIO_WRITE_PIPELINE			\
	(ZIO_WRITE_COMMON_STAGES |		\
	ZIO_STAGE_WRITE_BP_INIT |		\
	ZIO_STAGE_WRITE_COMPRESS |		\
	ZIO_STAGE_ENCRYPT |			\
	ZIO_STAGE_DVA_THROTTLE |		\
	ZIO_STAGE_DVA_ALLOCATE)

#define	ZIO_DDT_CHILD_WRITE_PIPELINE		\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_VDEV_IO_STAGES |			\
	ZIO_STAGE_DVA_THROTTLE |		\
	ZIO_STAGE_DVA_ALLOCATE)

#define	ZIO_DDT_WRITE_PIPELINE			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_STAGE_WRITE_BP_INIT |		\
	ZIO_STAGE_ISSUE_ASYNC |			\
	ZIO_STAGE_WRITE_COMPRESS |		\
	ZIO_STAGE_ENCRYPT |			\
	ZIO_STAGE_CHECKSUM_GENERATE |		\
	ZIO_STAGE_DDT_WRITE)

#define	ZIO_GANG_STAGES				\
	(ZIO_STAGE_GANG_ASSEMBLE |		\
	ZIO_STAGE_GANG_ISSUE)

#define	ZIO_FREE_PIPELINE			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_STAGE_FREE_BP_INIT |		\
	ZIO_STAGE_BRT_FREE |			\
	ZIO_STAGE_DVA_FREE)

#define	ZIO_DDT_FREE_PIPELINE			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_STAGE_FREE_BP_INIT |		\
	ZIO_STAGE_ISSUE_ASYNC |			\
	ZIO_STAGE_DDT_FREE)

#define	ZIO_CLAIM_PIPELINE			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_STAGE_DVA_CLAIM)

#define	ZIO_FLUSH_PIPELINE			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_VDEV_IO_STAGES)

#define	ZIO_TRIM_PIPELINE			\
	(ZIO_INTERLOCK_STAGES |			\
	ZIO_STAGE_ISSUE_ASYNC |			\
	ZIO_VDEV_IO_STAGES)

#define	ZIO_BLOCKING_STAGES			\
	(ZIO_STAGE_DVA_ALLOCATE |		\
	ZIO_STAGE_DVA_CLAIM |			\
	ZIO_STAGE_VDEV_IO_START)

extern void zio_inject_init(void);
extern void zio_inject_fini(void);

#ifdef	__cplusplus
}
#endif

#endif	/* _ZIO_IMPL_H */
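
/*
 * Illustrative sketch only, not part of this header.
 *
 * Each stage in enum zio_stage is a single bit and each pipeline macro is
 * an OR of stage bits, so a pipeline can be walked in stage order by
 * shifting a one-bit cursor left and skipping bits that are not set.
 * The standalone fragment below mirrors a handful of the stage values
 * above under hypothetical demo_* / DEMO_* names. The helpers
 * demo_next_stage() and demo_count_stages() are assumptions made for this
 * example; the real executor lives outside this header and may differ.
 */

```c
/* Hypothetical mirror of a few stage bits; values copied from zio_stage. */
enum demo_stage {
	DEMO_STAGE_OPEN			= 1 << 0,
	DEMO_STAGE_READ_BP_INIT		= 1 << 1,
	DEMO_STAGE_READY		= 1 << 20,
	DEMO_STAGE_VDEV_IO_START	= 1 << 21,
	DEMO_STAGE_VDEV_IO_DONE		= 1 << 22,
	DEMO_STAGE_VDEV_IO_ASSESS	= 1 << 23,
	DEMO_STAGE_CHECKSUM_VERIFY	= 1 << 24,
	DEMO_STAGE_DONE			= 1 << 25
};

/* Pipelines composed the same way as the ZIO_*_PIPELINE macros. */
#define	DEMO_INTERLOCK_STAGES			\
	(DEMO_STAGE_READY |			\
	DEMO_STAGE_DONE)

#define	DEMO_VDEV_IO_STAGES			\
	(DEMO_STAGE_VDEV_IO_START |		\
	DEMO_STAGE_VDEV_IO_DONE |		\
	DEMO_STAGE_VDEV_IO_ASSESS)

#define	DEMO_READ_PIPELINE			\
	(DEMO_INTERLOCK_STAGES |		\
	DEMO_VDEV_IO_STAGES |			\
	DEMO_STAGE_CHECKSUM_VERIFY |		\
	DEMO_STAGE_READ_BP_INIT)

/*
 * Return the first stage bit above `stage` that is set in `pipeline`,
 * or 0 once the cursor has reached DEMO_STAGE_DONE.
 */
static unsigned int
demo_next_stage(unsigned int stage, unsigned int pipeline)
{
	while (stage < (unsigned int)DEMO_STAGE_DONE) {
		stage <<= 1;
		if (stage & pipeline)
			return (stage);
	}
	return (0);
}

/* Count the stages a pipeline visits, starting from DEMO_STAGE_OPEN. */
static int
demo_count_stages(unsigned int pipeline)
{
	unsigned int stage = DEMO_STAGE_OPEN;
	int n = 0;

	while ((stage = demo_next_stage(stage, pipeline)) != 0)
		n++;
	return (n);
}
```

/*
 * In this sketch, demo_count_stages(DEMO_READ_PIPELINE) visits
 * READ_BP_INIT, READY, the three VDEV_IO stages, CHECKSUM_VERIFY and DONE,
 * while demo_count_stages(DEMO_INTERLOCK_STAGES) visits only READY and DONE.
 */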