1e7751617SMauro Carvalho Chehab======================================== 2a3e1c56aSRandy Dunlapzram: Compressed RAM-based block devices 3e7751617SMauro Carvalho Chehab======================================== 4e7751617SMauro Carvalho Chehab 5e7751617SMauro Carvalho ChehabIntroduction 6e7751617SMauro Carvalho Chehab============ 7e7751617SMauro Carvalho Chehab 8a3e1c56aSRandy DunlapThe zram module creates RAM-based block devices named /dev/zram<id> 9e7751617SMauro Carvalho Chehab(<id> = 0, 1, ...). Pages written to these disks are compressed and stored 10e7751617SMauro Carvalho Chehabin memory itself. These disks allow very fast I/O and compression provides 11e7751617SMauro Carvalho Chehabgood amounts of memory savings. Some of the use cases include /tmp storage, 12a3e1c56aSRandy Dunlapuse as swap disks, various caches under /var and maybe many more. :) 13e7751617SMauro Carvalho Chehab 14e7751617SMauro Carvalho ChehabStatistics for individual zram devices are exported through sysfs nodes at 15e7751617SMauro Carvalho Chehab/sys/block/zram<id>/ 16e7751617SMauro Carvalho Chehab 17e7751617SMauro Carvalho ChehabUsage 18e7751617SMauro Carvalho Chehab===== 19e7751617SMauro Carvalho Chehab 20e7751617SMauro Carvalho ChehabThere are several ways to configure and manage zram device(-s): 21e7751617SMauro Carvalho Chehab 22e7751617SMauro Carvalho Chehaba) using zram and zram_control sysfs attributes 23e7751617SMauro Carvalho Chehabb) using zramctl utility, provided by util-linux (util-linux@vger.kernel.org). 24e7751617SMauro Carvalho Chehab 25e7751617SMauro Carvalho ChehabIn this document we will describe only 'manual' zram configuration steps, 26e7751617SMauro Carvalho ChehabIOW, zram and zram_control sysfs attributes. 27e7751617SMauro Carvalho Chehab 28e7751617SMauro Carvalho ChehabIn order to get a better idea about zramctl please consult util-linux 29e7751617SMauro Carvalho Chehabdocumentation, zramctl man-page or `zramctl --help`. Please be informed 30e7751617SMauro Carvalho Chehabthat zram maintainers do not develop/maintain util-linux or zramctl, should 31e7751617SMauro Carvalho Chehabyou have any questions please contact util-linux@vger.kernel.org 32e7751617SMauro Carvalho Chehab 33e7751617SMauro Carvalho ChehabFollowing shows a typical sequence of steps for using zram. 34e7751617SMauro Carvalho Chehab 35e7751617SMauro Carvalho ChehabWARNING 36e7751617SMauro Carvalho Chehab======= 37e7751617SMauro Carvalho Chehab 38e7751617SMauro Carvalho ChehabFor the sake of simplicity we skip error checking parts in most of the 39e7751617SMauro Carvalho Chehabexamples below. However, it is your sole responsibility to handle errors. 40e7751617SMauro Carvalho Chehab 41e7751617SMauro Carvalho Chehabzram sysfs attributes always return negative values in case of errors. 42e7751617SMauro Carvalho ChehabThe list of possible return codes: 43e7751617SMauro Carvalho Chehab 44e7751617SMauro Carvalho Chehab======== ============================================================= 45e7751617SMauro Carvalho Chehab-EBUSY an attempt to modify an attribute that cannot be changed once 46a3e1c56aSRandy Dunlap the device has been initialised. Please reset device first. 47e7751617SMauro Carvalho Chehab-ENOMEM zram was not able to allocate enough memory to fulfil your 48a3e1c56aSRandy Dunlap needs. 49e7751617SMauro Carvalho Chehab-EINVAL invalid input has been provided. 50e7751617SMauro Carvalho Chehab======== ============================================================= 51e7751617SMauro Carvalho Chehab 52a3e1c56aSRandy DunlapIf you use 'echo', the returned value is set by the 'echo' utility, 53e7751617SMauro Carvalho Chehaband, in general case, something like:: 54e7751617SMauro Carvalho Chehab 55e7751617SMauro Carvalho Chehab echo 3 > /sys/block/zram0/max_comp_streams 56a3e1c56aSRandy Dunlap if [ $? -ne 0 ]; then 57e7751617SMauro Carvalho Chehab handle_error 58e7751617SMauro Carvalho Chehab fi 59e7751617SMauro Carvalho Chehab 60e7751617SMauro Carvalho Chehabshould suffice. 61e7751617SMauro Carvalho Chehab 62e7751617SMauro Carvalho Chehab1) Load Module 63e7751617SMauro Carvalho Chehab============== 64e7751617SMauro Carvalho Chehab 65e7751617SMauro Carvalho Chehab:: 66e7751617SMauro Carvalho Chehab 67e7751617SMauro Carvalho Chehab modprobe zram num_devices=4 68a3e1c56aSRandy Dunlap 69e7751617SMauro Carvalho ChehabThis creates 4 devices: /dev/zram{0,1,2,3} 70e7751617SMauro Carvalho Chehab 71e7751617SMauro Carvalho Chehabnum_devices parameter is optional and tells zram how many devices should be 72e7751617SMauro Carvalho Chehabpre-created. Default: 1. 73e7751617SMauro Carvalho Chehab 74e7751617SMauro Carvalho Chehab2) Set max number of compression streams 75e7751617SMauro Carvalho Chehab======================================== 76e7751617SMauro Carvalho Chehab 77a3e1c56aSRandy DunlapRegardless of the value passed to this attribute, ZRAM will always 78a3e1c56aSRandy Dunlapallocate multiple compression streams - one per online CPU - thus 79e7751617SMauro Carvalho Chehaballowing several concurrent compression operations. The number of 80e7751617SMauro Carvalho Chehaballocated compression streams goes down when some of the CPUs 81e7751617SMauro Carvalho Chehabbecome offline. There is no single-compression-stream mode anymore, 82a3e1c56aSRandy Dunlapunless you are running a UP system or have only 1 CPU online. 83e7751617SMauro Carvalho Chehab 84e7751617SMauro Carvalho ChehabTo find out how many streams are currently available:: 85e7751617SMauro Carvalho Chehab 86e7751617SMauro Carvalho Chehab cat /sys/block/zram0/max_comp_streams 87e7751617SMauro Carvalho Chehab 88e7751617SMauro Carvalho Chehab3) Select compression algorithm 89e7751617SMauro Carvalho Chehab=============================== 90e7751617SMauro Carvalho Chehab 91e7751617SMauro Carvalho ChehabUsing comp_algorithm device attribute one can see available and 92e7751617SMauro Carvalho Chehabcurrently selected (shown in square brackets) compression algorithms, 93a3e1c56aSRandy Dunlapor change the selected compression algorithm (once the device is initialised 94e7751617SMauro Carvalho Chehabthere is no way to change compression algorithm). 95e7751617SMauro Carvalho Chehab 96e7751617SMauro Carvalho ChehabExamples:: 97e7751617SMauro Carvalho Chehab 98e7751617SMauro Carvalho Chehab #show supported compression algorithms 99e7751617SMauro Carvalho Chehab cat /sys/block/zram0/comp_algorithm 100e7751617SMauro Carvalho Chehab lzo [lz4] 101e7751617SMauro Carvalho Chehab 102e7751617SMauro Carvalho Chehab #select lzo compression algorithm 103e7751617SMauro Carvalho Chehab echo lzo > /sys/block/zram0/comp_algorithm 104e7751617SMauro Carvalho Chehab 105e7751617SMauro Carvalho ChehabFor the time being, the `comp_algorithm` content does not necessarily 106e7751617SMauro Carvalho Chehabshow every compression algorithm supported by the kernel. We keep this 107e7751617SMauro Carvalho Chehablist primarily to simplify device configuration and one can configure 108e7751617SMauro Carvalho Chehaba new device with a compression algorithm that is not listed in 109e7751617SMauro Carvalho Chehab`comp_algorithm`. The thing is that, internally, ZRAM uses Crypto API 110e7751617SMauro Carvalho Chehaband, if some of the algorithms were built as modules, it's impossible 111e7751617SMauro Carvalho Chehabto list all of them using, for instance, /proc/crypto or any other 112e7751617SMauro Carvalho Chehabmethod. This, however, has an advantage of permitting the usage of 113e7751617SMauro Carvalho Chehabcustom crypto compression modules (implementing S/W or H/W compression). 114e7751617SMauro Carvalho Chehab 115e7751617SMauro Carvalho Chehab4) Set Disksize 116e7751617SMauro Carvalho Chehab=============== 117e7751617SMauro Carvalho Chehab 118e7751617SMauro Carvalho ChehabSet disk size by writing the value to sysfs node 'disksize'. 119e7751617SMauro Carvalho ChehabThe value can be either in bytes or you can use mem suffixes. 120e7751617SMauro Carvalho ChehabExamples:: 121e7751617SMauro Carvalho Chehab 122e7751617SMauro Carvalho Chehab # Initialize /dev/zram0 with 50MB disksize 123e7751617SMauro Carvalho Chehab echo $((50*1024*1024)) > /sys/block/zram0/disksize 124e7751617SMauro Carvalho Chehab 125e7751617SMauro Carvalho Chehab # Using mem suffixes 126e7751617SMauro Carvalho Chehab echo 256K > /sys/block/zram0/disksize 127e7751617SMauro Carvalho Chehab echo 512M > /sys/block/zram0/disksize 128e7751617SMauro Carvalho Chehab echo 1G > /sys/block/zram0/disksize 129e7751617SMauro Carvalho Chehab 130e7751617SMauro Carvalho ChehabNote: 131e7751617SMauro Carvalho ChehabThere is little point creating a zram of greater than twice the size of memory 132e7751617SMauro Carvalho Chehabsince we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the 133e7751617SMauro Carvalho Chehabsize of the disk when not in use so a huge zram is wasteful. 134e7751617SMauro Carvalho Chehab 135e7751617SMauro Carvalho Chehab5) Set memory limit: Optional 136e7751617SMauro Carvalho Chehab============================= 137e7751617SMauro Carvalho Chehab 138e7751617SMauro Carvalho ChehabSet memory limit by writing the value to sysfs node 'mem_limit'. 139e7751617SMauro Carvalho ChehabThe value can be either in bytes or you can use mem suffixes. 140e7751617SMauro Carvalho ChehabIn addition, you could change the value in runtime. 141e7751617SMauro Carvalho ChehabExamples:: 142e7751617SMauro Carvalho Chehab 143e7751617SMauro Carvalho Chehab # limit /dev/zram0 with 50MB memory 144e7751617SMauro Carvalho Chehab echo $((50*1024*1024)) > /sys/block/zram0/mem_limit 145e7751617SMauro Carvalho Chehab 146e7751617SMauro Carvalho Chehab # Using mem suffixes 147e7751617SMauro Carvalho Chehab echo 256K > /sys/block/zram0/mem_limit 148e7751617SMauro Carvalho Chehab echo 512M > /sys/block/zram0/mem_limit 149e7751617SMauro Carvalho Chehab echo 1G > /sys/block/zram0/mem_limit 150e7751617SMauro Carvalho Chehab 151e7751617SMauro Carvalho Chehab # To disable memory limit 152e7751617SMauro Carvalho Chehab echo 0 > /sys/block/zram0/mem_limit 153e7751617SMauro Carvalho Chehab 154e7751617SMauro Carvalho Chehab6) Activate 155e7751617SMauro Carvalho Chehab=========== 156e7751617SMauro Carvalho Chehab 157e7751617SMauro Carvalho Chehab:: 158e7751617SMauro Carvalho Chehab 159e7751617SMauro Carvalho Chehab mkswap /dev/zram0 160e7751617SMauro Carvalho Chehab swapon /dev/zram0 161e7751617SMauro Carvalho Chehab 162e7751617SMauro Carvalho Chehab mkfs.ext4 /dev/zram1 163e7751617SMauro Carvalho Chehab mount /dev/zram1 /tmp 164e7751617SMauro Carvalho Chehab 165e7751617SMauro Carvalho Chehab7) Add/remove zram devices 166e7751617SMauro Carvalho Chehab========================== 167e7751617SMauro Carvalho Chehab 168e7751617SMauro Carvalho Chehabzram provides a control interface, which enables dynamic (on-demand) device 169e7751617SMauro Carvalho Chehabaddition and removal. 170e7751617SMauro Carvalho Chehab 171a3e1c56aSRandy DunlapIn order to add a new /dev/zramX device, perform a read operation on the hot_add 172a3e1c56aSRandy Dunlapattribute. This will return either the new device's device id (meaning that you 173a3e1c56aSRandy Dunlapcan use /dev/zram<id>) or an error code. 174e7751617SMauro Carvalho Chehab 175e7751617SMauro Carvalho ChehabExample:: 176e7751617SMauro Carvalho Chehab 177e7751617SMauro Carvalho Chehab cat /sys/class/zram-control/hot_add 178e7751617SMauro Carvalho Chehab 1 179e7751617SMauro Carvalho Chehab 180e7751617SMauro Carvalho ChehabTo remove the existing /dev/zramX device (where X is a device id) 181e7751617SMauro Carvalho Chehabexecute:: 182e7751617SMauro Carvalho Chehab 183e7751617SMauro Carvalho Chehab echo X > /sys/class/zram-control/hot_remove 184e7751617SMauro Carvalho Chehab 185e7751617SMauro Carvalho Chehab8) Stats 186e7751617SMauro Carvalho Chehab======== 187e7751617SMauro Carvalho Chehab 188e7751617SMauro Carvalho ChehabPer-device statistics are exported as various nodes under /sys/block/zram<id>/ 189e7751617SMauro Carvalho Chehab 190a3e1c56aSRandy DunlapA brief description of exported device attributes follows. For more details 191a3e1c56aSRandy Dunlapplease read Documentation/ABI/testing/sysfs-block-zram. 192e7751617SMauro Carvalho Chehab 193e7751617SMauro Carvalho Chehab====================== ====== =============================================== 194e7751617SMauro Carvalho ChehabName access description 195e7751617SMauro Carvalho Chehab====================== ====== =============================================== 196e7751617SMauro Carvalho Chehabdisksize RW show and set the device's disk size 197e7751617SMauro Carvalho Chehabinitstate RO shows the initialization state of the device 198e7751617SMauro Carvalho Chehabreset WO trigger device reset 199e7751617SMauro Carvalho Chehabmem_used_max WO reset the `mem_used_max` counter (see later) 200e7751617SMauro Carvalho Chehabmem_limit WO specifies the maximum amount of memory ZRAM can 201e7751617SMauro Carvalho Chehab use to store the compressed data 202e7751617SMauro Carvalho Chehabwriteback_limit WO specifies the maximum amount of write IO zram 203e7751617SMauro Carvalho Chehab can write out to backing device as 4KB unit 204e7751617SMauro Carvalho Chehabwriteback_limit_enable RW show and set writeback_limit feature 205e7751617SMauro Carvalho Chehabmax_comp_streams RW the number of possible concurrent compress 206e7751617SMauro Carvalho Chehab operations 207e7751617SMauro Carvalho Chehabcomp_algorithm RW show and change the compression algorithm 208e7751617SMauro Carvalho Chehabcompact WO trigger memory compaction 209e7751617SMauro Carvalho Chehabdebug_stat RO this file is used for zram debugging purposes 210e7751617SMauro Carvalho Chehabbacking_dev RW set up backend storage for zram to write out 211e7751617SMauro Carvalho Chehabidle WO mark allocated slot as idle 212e7751617SMauro Carvalho Chehab====================== ====== =============================================== 213e7751617SMauro Carvalho Chehab 214e7751617SMauro Carvalho Chehab 215e7751617SMauro Carvalho ChehabUser space is advised to use the following files to read the device statistics. 216e7751617SMauro Carvalho Chehab 217e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/stat 218e7751617SMauro Carvalho Chehab 219e7751617SMauro Carvalho ChehabRepresents block layer statistics. Read Documentation/block/stat.rst for 220e7751617SMauro Carvalho Chehabdetails. 221e7751617SMauro Carvalho Chehab 222e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/io_stat 223e7751617SMauro Carvalho Chehab 224e7751617SMauro Carvalho ChehabThe stat file represents device's I/O statistics not accounted by block 225e7751617SMauro Carvalho Chehablayer and, thus, not available in zram<id>/stat file. It consists of a 226e7751617SMauro Carvalho Chehabsingle line of text and contains the following stats separated by 227e7751617SMauro Carvalho Chehabwhitespace: 228e7751617SMauro Carvalho Chehab 229e7751617SMauro Carvalho Chehab ============= ============================================================= 230e7751617SMauro Carvalho Chehab failed_reads The number of failed reads 231e7751617SMauro Carvalho Chehab failed_writes The number of failed writes 232e7751617SMauro Carvalho Chehab invalid_io The number of non-page-size-aligned I/O requests 233e7751617SMauro Carvalho Chehab notify_free Depending on device usage scenario it may account 234e7751617SMauro Carvalho Chehab 235e7751617SMauro Carvalho Chehab a) the number of pages freed because of swap slot free 236e7751617SMauro Carvalho Chehab notifications 237e7751617SMauro Carvalho Chehab b) the number of pages freed because of 238e7751617SMauro Carvalho Chehab REQ_OP_DISCARD requests sent by bio. The former ones are 239e7751617SMauro Carvalho Chehab sent to a swap block device when a swap slot is freed, 240e7751617SMauro Carvalho Chehab which implies that this disk is being used as a swap disk. 241e7751617SMauro Carvalho Chehab 242e7751617SMauro Carvalho Chehab The latter ones are sent by filesystem mounted with 243e7751617SMauro Carvalho Chehab discard option, whenever some data blocks are getting 244e7751617SMauro Carvalho Chehab discarded. 245e7751617SMauro Carvalho Chehab ============= ============================================================= 246e7751617SMauro Carvalho Chehab 247e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/mm_stat 248e7751617SMauro Carvalho Chehab 249a3e1c56aSRandy DunlapThe mm_stat file represents the device's mm statistics. It consists of a single 250e7751617SMauro Carvalho Chehabline of text and contains the following stats separated by whitespace: 251e7751617SMauro Carvalho Chehab 252e7751617SMauro Carvalho Chehab ================ ============================================================= 253e7751617SMauro Carvalho Chehab orig_data_size uncompressed size of data stored in this disk. 254e7751617SMauro Carvalho Chehab Unit: bytes 255e7751617SMauro Carvalho Chehab compr_data_size compressed size of data stored in this disk 256e7751617SMauro Carvalho Chehab mem_used_total the amount of memory allocated for this disk. This 257e7751617SMauro Carvalho Chehab includes allocator fragmentation and metadata overhead, 258e7751617SMauro Carvalho Chehab allocated for this disk. So, allocator space efficiency 259e7751617SMauro Carvalho Chehab can be calculated using compr_data_size and this statistic. 260e7751617SMauro Carvalho Chehab Unit: bytes 261e7751617SMauro Carvalho Chehab mem_limit the maximum amount of memory ZRAM can use to store 262e7751617SMauro Carvalho Chehab the compressed data 263a3e1c56aSRandy Dunlap mem_used_max the maximum amount of memory zram has consumed to 264e7751617SMauro Carvalho Chehab store the data 265e7751617SMauro Carvalho Chehab same_pages the number of same element filled pages written to this disk. 266e7751617SMauro Carvalho Chehab No memory is allocated for such pages. 267e7751617SMauro Carvalho Chehab pages_compacted the number of pages freed during compaction 268e7751617SMauro Carvalho Chehab huge_pages the number of incompressible pages 269194e28daSMinchan Kim huge_pages_since the number of incompressible pages since zram set up 270e7751617SMauro Carvalho Chehab ================ ============================================================= 271e7751617SMauro Carvalho Chehab 272e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/bd_stat 273e7751617SMauro Carvalho Chehab 274a3e1c56aSRandy DunlapThe bd_stat file represents a device's backing device statistics. It consists of 275e7751617SMauro Carvalho Chehaba single line of text and contains the following stats separated by whitespace: 276e7751617SMauro Carvalho Chehab 277e7751617SMauro Carvalho Chehab ============== ============================================================= 278e7751617SMauro Carvalho Chehab bd_count size of data written in backing device. 279e7751617SMauro Carvalho Chehab Unit: 4K bytes 280e7751617SMauro Carvalho Chehab bd_reads the number of reads from backing device 281e7751617SMauro Carvalho Chehab Unit: 4K bytes 282e7751617SMauro Carvalho Chehab bd_writes the number of writes to backing device 283e7751617SMauro Carvalho Chehab Unit: 4K bytes 284e7751617SMauro Carvalho Chehab ============== ============================================================= 285e7751617SMauro Carvalho Chehab 286e7751617SMauro Carvalho Chehab9) Deactivate 287e7751617SMauro Carvalho Chehab============= 288e7751617SMauro Carvalho Chehab 289e7751617SMauro Carvalho Chehab:: 290e7751617SMauro Carvalho Chehab 291e7751617SMauro Carvalho Chehab swapoff /dev/zram0 292e7751617SMauro Carvalho Chehab umount /dev/zram1 293e7751617SMauro Carvalho Chehab 294e7751617SMauro Carvalho Chehab10) Reset 295e7751617SMauro Carvalho Chehab========= 296e7751617SMauro Carvalho Chehab 297e7751617SMauro Carvalho Chehab Write any positive value to 'reset' sysfs node:: 298e7751617SMauro Carvalho Chehab 299e7751617SMauro Carvalho Chehab echo 1 > /sys/block/zram0/reset 300e7751617SMauro Carvalho Chehab echo 1 > /sys/block/zram1/reset 301e7751617SMauro Carvalho Chehab 302e7751617SMauro Carvalho Chehab This frees all the memory allocated for the given device and 303e7751617SMauro Carvalho Chehab resets the disksize to zero. You must set the disksize again 304e7751617SMauro Carvalho Chehab before reusing the device. 305e7751617SMauro Carvalho Chehab 306e7751617SMauro Carvalho ChehabOptional Feature 307e7751617SMauro Carvalho Chehab================ 308e7751617SMauro Carvalho Chehab 309e7751617SMauro Carvalho Chehabwriteback 310e7751617SMauro Carvalho Chehab--------- 311e7751617SMauro Carvalho Chehab 312e7751617SMauro Carvalho ChehabWith CONFIG_ZRAM_WRITEBACK, zram can write idle/incompressible page 313e7751617SMauro Carvalho Chehabto backing storage rather than keeping it in memory. 314e7751617SMauro Carvalho ChehabTo use the feature, admin should set up backing device via:: 315e7751617SMauro Carvalho Chehab 316e7751617SMauro Carvalho Chehab echo /dev/sda5 > /sys/block/zramX/backing_dev 317e7751617SMauro Carvalho Chehab 3184fbe7b19SEthan Dyebefore disksize setting. It supports only partitions at this moment. 3194fbe7b19SEthan DyeIf admin wants to use incompressible page writeback, they could do it via:: 320e7751617SMauro Carvalho Chehab 3215871023cSYue Hu echo huge > /sys/block/zramX/writeback 322e7751617SMauro Carvalho Chehab 323e7751617SMauro Carvalho ChehabTo use idle page writeback, first, user need to declare zram pages 324e7751617SMauro Carvalho Chehabas idle:: 325e7751617SMauro Carvalho Chehab 326e7751617SMauro Carvalho Chehab echo all > /sys/block/zramX/idle 327e7751617SMauro Carvalho Chehab 328e7751617SMauro Carvalho ChehabFrom now on, any pages on zram are idle pages. The idle mark 329a3e1c56aSRandy Dunlapwill be removed until someone requests access of the block. 330e7751617SMauro Carvalho ChehabIOW, unless there is access request, those pages are still idle pages. 331a7a03505SSergey SenozhatskyAdditionally, when CONFIG_ZRAM_TRACK_ENTRY_ACTIME is enabled pages can be 332755804d1SBrian Geffonmarked as idle based on how long (in seconds) it's been since they were 333755804d1SBrian Geffonlast accessed:: 334755804d1SBrian Geffon 335755804d1SBrian Geffon echo 86400 > /sys/block/zramX/idle 336755804d1SBrian Geffon 337755804d1SBrian GeffonIn this example all pages which haven't been accessed in more than 86400 338755804d1SBrian Geffonseconds (one day) will be marked idle. 339e7751617SMauro Carvalho Chehab 340e7751617SMauro Carvalho ChehabAdmin can request writeback of those idle pages at right timing via:: 341e7751617SMauro Carvalho Chehab 342e7751617SMauro Carvalho Chehab echo idle > /sys/block/zramX/writeback 343e7751617SMauro Carvalho Chehab 3444fbe7b19SEthan DyeWith the command, zram will writeback idle pages from memory to the storage. 345e7751617SMauro Carvalho Chehab 34630226b69SBrian GeffonAdditionally, if a user choose to writeback only huge and idle pages 34730226b69SBrian Geffonthis can be accomplished with:: 34830226b69SBrian Geffon 34930226b69SBrian Geffon echo huge_idle > /sys/block/zramX/writeback 35030226b69SBrian Geffon 351b46f9ea3SSergey SenozhatskyIf a user chooses to writeback only incompressible pages (pages that none of 352b46f9ea3SSergey Senozhatskyalgorithms can compress) this can be accomplished with:: 353b46f9ea3SSergey Senozhatsky 354b46f9ea3SSergey Senozhatsky echo incompressible > /sys/block/zramX/writeback 355b46f9ea3SSergey Senozhatsky 3564fbe7b19SEthan DyeIf an admin wants to write a specific page in zram device to the backing device, 357b46f9ea3SSergey Senozhatskythey could write a page index into the interface:: 3580d835962SMinchan Kim 3590d835962SMinchan Kim echo "page_index=1251" > /sys/block/zramX/writeback 3600d835962SMinchan Kim 361e7751617SMauro Carvalho ChehabIf there are lots of write IO with flash device, potentially, it has 362e7751617SMauro Carvalho Chehabflash wearout problem so that admin needs to design write limitation 363e7751617SMauro Carvalho Chehabto guarantee storage health for entire product life. 364e7751617SMauro Carvalho Chehab 365e7751617SMauro Carvalho ChehabTo overcome the concern, zram supports "writeback_limit" feature. 366e7751617SMauro Carvalho ChehabThe "writeback_limit_enable"'s default value is 0 so that it doesn't limit 3674fbe7b19SEthan Dyeany writeback. IOW, if admin wants to apply writeback budget, they should 368e7751617SMauro Carvalho Chehabenable writeback_limit_enable via:: 369e7751617SMauro Carvalho Chehab 370e7751617SMauro Carvalho Chehab $ echo 1 > /sys/block/zramX/writeback_limit_enable 371e7751617SMauro Carvalho Chehab 372e7751617SMauro Carvalho ChehabOnce writeback_limit_enable is set, zram doesn't allow any writeback 373a3e1c56aSRandy Dunlapuntil admin sets the budget via /sys/block/zramX/writeback_limit. 374e7751617SMauro Carvalho Chehab 375e7751617SMauro Carvalho Chehab(If admin doesn't enable writeback_limit_enable, writeback_limit's value 376a3e1c56aSRandy Dunlapassigned via /sys/block/zramX/writeback_limit is meaningless.) 377e7751617SMauro Carvalho Chehab 3784fbe7b19SEthan DyeIf admin wants to limit writeback as per-day 400M, they could do it 379e7751617SMauro Carvalho Chehablike below:: 380e7751617SMauro Carvalho Chehab 381e7751617SMauro Carvalho Chehab $ MB_SHIFT=20 382e7751617SMauro Carvalho Chehab $ 4K_SHIFT=12 383e7751617SMauro Carvalho Chehab $ echo $((400<<MB_SHIFT>>4K_SHIFT)) > \ 384e7751617SMauro Carvalho Chehab /sys/block/zram0/writeback_limit. 385e7751617SMauro Carvalho Chehab $ echo 1 > /sys/block/zram0/writeback_limit_enable 386e7751617SMauro Carvalho Chehab 387b2105aa2SAndrew KlychkovIf admins want to allow further write again once the budget is exhausted, 3884fbe7b19SEthan Dyethey could do it like below:: 389e7751617SMauro Carvalho Chehab 390e7751617SMauro Carvalho Chehab $ echo $((400<<MB_SHIFT>>4K_SHIFT)) > \ 391e7751617SMauro Carvalho Chehab /sys/block/zram0/writeback_limit 392e7751617SMauro Carvalho Chehab 3934fbe7b19SEthan DyeIf an admin wants to see the remaining writeback budget since last set:: 394e7751617SMauro Carvalho Chehab 395e7751617SMauro Carvalho Chehab $ cat /sys/block/zramX/writeback_limit 396e7751617SMauro Carvalho Chehab 3974fbe7b19SEthan DyeIf an admin wants to disable writeback limit, they could do:: 398e7751617SMauro Carvalho Chehab 399e7751617SMauro Carvalho Chehab $ echo 0 > /sys/block/zramX/writeback_limit_enable 400e7751617SMauro Carvalho Chehab 401e7751617SMauro Carvalho ChehabThe writeback_limit count will reset whenever you reset zram (e.g., 402e7751617SMauro Carvalho Chehabsystem reboot, echo 1 > /sys/block/zramX/reset) so keeping how many of 403e7751617SMauro Carvalho Chehabwriteback happened until you reset the zram to allocate extra writeback 404e7751617SMauro Carvalho Chehabbudget in next setting is user's job. 405e7751617SMauro Carvalho Chehab 4064fbe7b19SEthan DyeIf admin wants to measure writeback count in a certain period, they could 407e7751617SMauro Carvalho Chehabknow it via /sys/block/zram0/bd_stat's 3rd column. 408e7751617SMauro Carvalho Chehab 409443dd798SSergey Senozhatskyrecompression 410443dd798SSergey Senozhatsky------------- 411443dd798SSergey Senozhatsky 412443dd798SSergey SenozhatskyWith CONFIG_ZRAM_MULTI_COMP, zram can recompress pages using alternative 413443dd798SSergey Senozhatsky(secondary) compression algorithms. The basic idea is that alternative 414443dd798SSergey Senozhatskycompression algorithm can provide better compression ratio at a price of 415443dd798SSergey Senozhatsky(potentially) slower compression/decompression speeds. Alternative compression 416443dd798SSergey Senozhatskyalgorithm can, for example, be more successful compressing huge pages (those 417443dd798SSergey Senozhatskythat default algorithm failed to compress). Another application is idle pages 418443dd798SSergey Senozhatskyrecompression - pages that are cold and sit in the memory can be recompressed 419443dd798SSergey Senozhatskyusing more effective algorithm and, hence, reduce zsmalloc memory usage. 420443dd798SSergey Senozhatsky 421443dd798SSergey SenozhatskyWith CONFIG_ZRAM_MULTI_COMP, zram supports up to 4 compression algorithms: 422443dd798SSergey Senozhatskyone primary and up to 3 secondary ones. Primary zram compressor is explained 423443dd798SSergey Senozhatskyin "3) Select compression algorithm", secondary algorithms are configured 424443dd798SSergey Senozhatskyusing recomp_algorithm device attribute. 425443dd798SSergey Senozhatsky 426443dd798SSergey SenozhatskyExample::: 427443dd798SSergey Senozhatsky 428443dd798SSergey Senozhatsky #show supported recompression algorithms 429443dd798SSergey Senozhatsky cat /sys/block/zramX/recomp_algorithm 430443dd798SSergey Senozhatsky #1: lzo lzo-rle lz4 lz4hc [zstd] 431443dd798SSergey Senozhatsky #2: lzo lzo-rle lz4 [lz4hc] zstd 432443dd798SSergey Senozhatsky 433443dd798SSergey SenozhatskyAlternative compression algorithms are sorted by priority. In the example 434443dd798SSergey Senozhatskyabove, zstd is used as the first alternative algorithm, which has priority 435443dd798SSergey Senozhatskyof 1, while lz4hc is configured as a compression algorithm with priority 2. 436443dd798SSergey SenozhatskyAlternative compression algorithm's priority is provided during algorithms 437443dd798SSergey Senozhatskyconfiguration::: 438443dd798SSergey Senozhatsky 439443dd798SSergey Senozhatsky #select zstd recompression algorithm, priority 1 440443dd798SSergey Senozhatsky echo "algo=zstd priority=1" > /sys/block/zramX/recomp_algorithm 441443dd798SSergey Senozhatsky 442443dd798SSergey Senozhatsky #select deflate recompression algorithm, priority 2 443443dd798SSergey Senozhatsky echo "algo=deflate priority=2" > /sys/block/zramX/recomp_algorithm 444443dd798SSergey Senozhatsky 445443dd798SSergey SenozhatskyAnother device attribute that CONFIG_ZRAM_MULTI_COMP enables is recompress, 446443dd798SSergey Senozhatskywhich controls recompression. 447443dd798SSergey Senozhatsky 448443dd798SSergey SenozhatskyExamples::: 449443dd798SSergey Senozhatsky 450443dd798SSergey Senozhatsky #IDLE pages recompression is activated by `idle` mode 451443dd798SSergey Senozhatsky echo "type=idle" > /sys/block/zramX/recompress 452443dd798SSergey Senozhatsky 453443dd798SSergey Senozhatsky #HUGE pages recompression is activated by `huge` mode 454443dd798SSergey Senozhatsky echo "type=huge" > /sys/block/zram0/recompress 455443dd798SSergey Senozhatsky 456443dd798SSergey Senozhatsky #HUGE_IDLE pages recompression is activated by `huge_idle` mode 457443dd798SSergey Senozhatsky echo "type=huge_idle" > /sys/block/zramX/recompress 458443dd798SSergey Senozhatsky 459443dd798SSergey SenozhatskyThe number of idle pages can be significant, so user-space can pass a size 460443dd798SSergey Senozhatskythreshold (in bytes) to the recompress knob: zram will recompress only pages 461443dd798SSergey Senozhatskyof equal or greater size::: 462443dd798SSergey Senozhatsky 463443dd798SSergey Senozhatsky #recompress all pages larger than 3000 bytes 464443dd798SSergey Senozhatsky echo "threshold=3000" > /sys/block/zramX/recompress 465443dd798SSergey Senozhatsky 466443dd798SSergey Senozhatsky #recompress idle pages larger than 2000 bytes 467443dd798SSergey Senozhatsky echo "type=idle threshold=2000" > /sys/block/zramX/recompress 468443dd798SSergey Senozhatsky 469*34efe1c3SSergey SenozhatskyIt is also possible to limit the number of pages zram re-compression will 470*34efe1c3SSergey Senozhatskyattempt to recompress::: 471*34efe1c3SSergey Senozhatsky 472*34efe1c3SSergey Senozhatsky echo "type=huge_idle max_pages=42" > /sys/block/zramX/recompress 473*34efe1c3SSergey Senozhatsky 474443dd798SSergey SenozhatskyRecompression of idle pages requires memory tracking. 475443dd798SSergey Senozhatsky 476443dd798SSergey SenozhatskyDuring re-compression for every page, that matches re-compression criteria, 477443dd798SSergey SenozhatskyZRAM iterates the list of registered alternative compression algorithms in 478443dd798SSergey Senozhatskyorder of their priorities. ZRAM stops either when re-compression was 479443dd798SSergey Senozhatskysuccessful (re-compressed object is smaller in size than the original one) 480443dd798SSergey Senozhatskyand matches re-compression criteria (e.g. size threshold) or when there are 481443dd798SSergey Senozhatskyno secondary algorithms left to try. If none of the secondary algorithms can 482443dd798SSergey Senozhatskysuccessfully re-compressed the page such a page is marked as incompressible, 483443dd798SSergey Senozhatskyso ZRAM will not attempt to re-compress it in the future. 484443dd798SSergey Senozhatsky 485443dd798SSergey SenozhatskyThis re-compression behaviour, when it iterates through the list of 486443dd798SSergey Senozhatskyregistered compression algorithms, increases our chances of finding the 487443dd798SSergey Senozhatskyalgorithm that successfully compresses a particular page. Sometimes, however, 488443dd798SSergey Senozhatskyit is convenient (and sometimes even necessary) to limit recompression to 489443dd798SSergey Senozhatskyonly one particular algorithm so that it will not try any other algorithms. 490443dd798SSergey SenozhatskyThis can be achieved by providing a algo=NAME parameter::: 491443dd798SSergey Senozhatsky 492443dd798SSergey Senozhatsky #use zstd algorithm only (if registered) 493443dd798SSergey Senozhatsky echo "type=huge algo=zstd" > /sys/block/zramX/recompress 494443dd798SSergey Senozhatsky 495e7751617SMauro Carvalho Chehabmemory tracking 496e7751617SMauro Carvalho Chehab=============== 497e7751617SMauro Carvalho Chehab 498e7751617SMauro Carvalho ChehabWith CONFIG_ZRAM_MEMORY_TRACKING, user can know information of the 499e7751617SMauro Carvalho Chehabzram block. It could be useful to catch cold or incompressible 500e7751617SMauro Carvalho Chehabpages of the process with*pagemap. 501e7751617SMauro Carvalho Chehab 502e7751617SMauro Carvalho ChehabIf you enable the feature, you could see block state via 503e7751617SMauro Carvalho Chehab/sys/kernel/debug/zram/zram0/block_state". The output is as follows:: 504e7751617SMauro Carvalho Chehab 50577db7bb5SSergey Senozhatsky 300 75.033841 .wh... 50677db7bb5SSergey Senozhatsky 301 63.806904 s..... 50777db7bb5SSergey Senozhatsky 302 63.806919 ..hi.. 50877db7bb5SSergey Senozhatsky 303 62.801919 ....r. 50977db7bb5SSergey Senozhatsky 304 146.781902 ..hi.n 510e7751617SMauro Carvalho Chehab 511e7751617SMauro Carvalho ChehabFirst column 512e7751617SMauro Carvalho Chehab zram's block index. 513e7751617SMauro Carvalho ChehabSecond column 514e7751617SMauro Carvalho Chehab access time since the system was booted 515e7751617SMauro Carvalho ChehabThird column 516e7751617SMauro Carvalho Chehab state of the block: 517e7751617SMauro Carvalho Chehab 518e7751617SMauro Carvalho Chehab s: 519e7751617SMauro Carvalho Chehab same page 520e7751617SMauro Carvalho Chehab w: 521e7751617SMauro Carvalho Chehab written page to backing store 522e7751617SMauro Carvalho Chehab h: 523e7751617SMauro Carvalho Chehab huge page 524e7751617SMauro Carvalho Chehab i: 525e7751617SMauro Carvalho Chehab idle page 52660e9b39eSSergey Senozhatsky r: 52760e9b39eSSergey Senozhatsky recompressed page (secondary compression algorithm) 52877db7bb5SSergey Senozhatsky n: 52977db7bb5SSergey Senozhatsky none (including secondary) of algorithms could compress it 530e7751617SMauro Carvalho Chehab 531e7751617SMauro Carvalho ChehabFirst line of above example says 300th block is accessed at 75.033841sec 532e7751617SMauro Carvalho Chehaband the block's state is huge so it is written back to the backing 533e7751617SMauro Carvalho Chehabstorage. It's a debugging feature so anyone shouldn't rely on it to work 534e7751617SMauro Carvalho Chehabproperly. 535e7751617SMauro Carvalho Chehab 536e7751617SMauro Carvalho ChehabNitin Gupta 537e7751617SMauro Carvalho Chehabngupta@vflare.org 538