1e7751617SMauro Carvalho Chehab========================================
2a3e1c56aSRandy Dunlapzram: Compressed RAM-based block devices
3e7751617SMauro Carvalho Chehab========================================
4e7751617SMauro Carvalho Chehab
5e7751617SMauro Carvalho ChehabIntroduction
6e7751617SMauro Carvalho Chehab============
7e7751617SMauro Carvalho Chehab
8a3e1c56aSRandy DunlapThe zram module creates RAM-based block devices named /dev/zram<id>
9e7751617SMauro Carvalho Chehab(<id> = 0, 1, ...). Pages written to these disks are compressed and stored
10e7751617SMauro Carvalho Chehabin memory itself. These disks allow very fast I/O and compression provides
11e7751617SMauro Carvalho Chehabgood amounts of memory savings. Some of the use cases include /tmp storage,
12a3e1c56aSRandy Dunlapuse as swap disks, various caches under /var and maybe many more. :)
13e7751617SMauro Carvalho Chehab
14e7751617SMauro Carvalho ChehabStatistics for individual zram devices are exported through sysfs nodes at
15e7751617SMauro Carvalho Chehab/sys/block/zram<id>/
16e7751617SMauro Carvalho Chehab
17e7751617SMauro Carvalho ChehabUsage
18e7751617SMauro Carvalho Chehab=====
19e7751617SMauro Carvalho Chehab
20e7751617SMauro Carvalho ChehabThere are several ways to configure and manage zram device(-s):
21e7751617SMauro Carvalho Chehab
22e7751617SMauro Carvalho Chehaba) using zram and zram_control sysfs attributes
23e7751617SMauro Carvalho Chehabb) using zramctl utility, provided by util-linux (util-linux@vger.kernel.org).
24e7751617SMauro Carvalho Chehab
25e7751617SMauro Carvalho ChehabIn this document we will describe only 'manual' zram configuration steps,
26e7751617SMauro Carvalho ChehabIOW, zram and zram_control sysfs attributes.
27e7751617SMauro Carvalho Chehab
28e7751617SMauro Carvalho ChehabIn order to get a better idea about zramctl please consult util-linux
29e7751617SMauro Carvalho Chehabdocumentation, zramctl man-page or `zramctl --help`. Please be informed
30e7751617SMauro Carvalho Chehabthat zram maintainers do not develop/maintain util-linux or zramctl, should
31e7751617SMauro Carvalho Chehabyou have any questions please contact util-linux@vger.kernel.org
32e7751617SMauro Carvalho Chehab
33e7751617SMauro Carvalho ChehabFollowing shows a typical sequence of steps for using zram.
34e7751617SMauro Carvalho Chehab
35e7751617SMauro Carvalho ChehabWARNING
36e7751617SMauro Carvalho Chehab=======
37e7751617SMauro Carvalho Chehab
38e7751617SMauro Carvalho ChehabFor the sake of simplicity we skip error checking parts in most of the
39e7751617SMauro Carvalho Chehabexamples below. However, it is your sole responsibility to handle errors.
40e7751617SMauro Carvalho Chehab
41e7751617SMauro Carvalho Chehabzram sysfs attributes always return negative values in case of errors.
42e7751617SMauro Carvalho ChehabThe list of possible return codes:
43e7751617SMauro Carvalho Chehab
44e7751617SMauro Carvalho Chehab========  =============================================================
45e7751617SMauro Carvalho Chehab-EBUSY	  an attempt to modify an attribute that cannot be changed once
46a3e1c56aSRandy Dunlap	  the device has been initialised. Please reset device first.
47e7751617SMauro Carvalho Chehab-ENOMEM	  zram was not able to allocate enough memory to fulfil your
48a3e1c56aSRandy Dunlap	  needs.
49e7751617SMauro Carvalho Chehab-EINVAL	  invalid input has been provided.
50e7751617SMauro Carvalho Chehab========  =============================================================
51e7751617SMauro Carvalho Chehab
52a3e1c56aSRandy DunlapIf you use 'echo', the returned value is set by the 'echo' utility,
53e7751617SMauro Carvalho Chehaband, in general case, something like::
54e7751617SMauro Carvalho Chehab
55e7751617SMauro Carvalho Chehab	echo 3 > /sys/block/zram0/max_comp_streams
56a3e1c56aSRandy Dunlap	if [ $? -ne 0 ]; then
57e7751617SMauro Carvalho Chehab		handle_error
58e7751617SMauro Carvalho Chehab	fi
59e7751617SMauro Carvalho Chehab
60e7751617SMauro Carvalho Chehabshould suffice.
61e7751617SMauro Carvalho Chehab
62e7751617SMauro Carvalho Chehab1) Load Module
63e7751617SMauro Carvalho Chehab==============
64e7751617SMauro Carvalho Chehab
65e7751617SMauro Carvalho Chehab::
66e7751617SMauro Carvalho Chehab
67e7751617SMauro Carvalho Chehab	modprobe zram num_devices=4
68a3e1c56aSRandy Dunlap
69e7751617SMauro Carvalho ChehabThis creates 4 devices: /dev/zram{0,1,2,3}
70e7751617SMauro Carvalho Chehab
71e7751617SMauro Carvalho Chehabnum_devices parameter is optional and tells zram how many devices should be
72e7751617SMauro Carvalho Chehabpre-created. Default: 1.
73e7751617SMauro Carvalho Chehab
74e7751617SMauro Carvalho Chehab2) Set max number of compression streams
75e7751617SMauro Carvalho Chehab========================================
76e7751617SMauro Carvalho Chehab
77a3e1c56aSRandy DunlapRegardless of the value passed to this attribute, ZRAM will always
78a3e1c56aSRandy Dunlapallocate multiple compression streams - one per online CPU - thus
79e7751617SMauro Carvalho Chehaballowing several concurrent compression operations. The number of
80e7751617SMauro Carvalho Chehaballocated compression streams goes down when some of the CPUs
81e7751617SMauro Carvalho Chehabbecome offline. There is no single-compression-stream mode anymore,
82a3e1c56aSRandy Dunlapunless you are running a UP system or have only 1 CPU online.
83e7751617SMauro Carvalho Chehab
84e7751617SMauro Carvalho ChehabTo find out how many streams are currently available::
85e7751617SMauro Carvalho Chehab
86e7751617SMauro Carvalho Chehab	cat /sys/block/zram0/max_comp_streams
87e7751617SMauro Carvalho Chehab
88e7751617SMauro Carvalho Chehab3) Select compression algorithm
89e7751617SMauro Carvalho Chehab===============================
90e7751617SMauro Carvalho Chehab
91e7751617SMauro Carvalho ChehabUsing comp_algorithm device attribute one can see available and
92e7751617SMauro Carvalho Chehabcurrently selected (shown in square brackets) compression algorithms,
93a3e1c56aSRandy Dunlapor change the selected compression algorithm (once the device is initialised
94e7751617SMauro Carvalho Chehabthere is no way to change compression algorithm).
95e7751617SMauro Carvalho Chehab
96e7751617SMauro Carvalho ChehabExamples::
97e7751617SMauro Carvalho Chehab
98e7751617SMauro Carvalho Chehab	#show supported compression algorithms
99e7751617SMauro Carvalho Chehab	cat /sys/block/zram0/comp_algorithm
100e7751617SMauro Carvalho Chehab	lzo [lz4]
101e7751617SMauro Carvalho Chehab
102e7751617SMauro Carvalho Chehab	#select lzo compression algorithm
103e7751617SMauro Carvalho Chehab	echo lzo > /sys/block/zram0/comp_algorithm
104e7751617SMauro Carvalho Chehab
105e7751617SMauro Carvalho ChehabFor the time being, the `comp_algorithm` content does not necessarily
106e7751617SMauro Carvalho Chehabshow every compression algorithm supported by the kernel. We keep this
107e7751617SMauro Carvalho Chehablist primarily to simplify device configuration and one can configure
108e7751617SMauro Carvalho Chehaba new device with a compression algorithm that is not listed in
109e7751617SMauro Carvalho Chehab`comp_algorithm`. The thing is that, internally, ZRAM uses Crypto API
110e7751617SMauro Carvalho Chehaband, if some of the algorithms were built as modules, it's impossible
111e7751617SMauro Carvalho Chehabto list all of them using, for instance, /proc/crypto or any other
112e7751617SMauro Carvalho Chehabmethod. This, however, has an advantage of permitting the usage of
113e7751617SMauro Carvalho Chehabcustom crypto compression modules (implementing S/W or H/W compression).
114e7751617SMauro Carvalho Chehab
115e7751617SMauro Carvalho Chehab4) Set Disksize
116e7751617SMauro Carvalho Chehab===============
117e7751617SMauro Carvalho Chehab
118e7751617SMauro Carvalho ChehabSet disk size by writing the value to sysfs node 'disksize'.
119e7751617SMauro Carvalho ChehabThe value can be either in bytes or you can use mem suffixes.
120e7751617SMauro Carvalho ChehabExamples::
121e7751617SMauro Carvalho Chehab
122e7751617SMauro Carvalho Chehab	# Initialize /dev/zram0 with 50MB disksize
123e7751617SMauro Carvalho Chehab	echo $((50*1024*1024)) > /sys/block/zram0/disksize
124e7751617SMauro Carvalho Chehab
125e7751617SMauro Carvalho Chehab	# Using mem suffixes
126e7751617SMauro Carvalho Chehab	echo 256K > /sys/block/zram0/disksize
127e7751617SMauro Carvalho Chehab	echo 512M > /sys/block/zram0/disksize
128e7751617SMauro Carvalho Chehab	echo 1G > /sys/block/zram0/disksize
129e7751617SMauro Carvalho Chehab
130e7751617SMauro Carvalho ChehabNote:
131e7751617SMauro Carvalho ChehabThere is little point creating a zram of greater than twice the size of memory
132e7751617SMauro Carvalho Chehabsince we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the
133e7751617SMauro Carvalho Chehabsize of the disk when not in use so a huge zram is wasteful.
134e7751617SMauro Carvalho Chehab
135e7751617SMauro Carvalho Chehab5) Set memory limit: Optional
136e7751617SMauro Carvalho Chehab=============================
137e7751617SMauro Carvalho Chehab
138e7751617SMauro Carvalho ChehabSet memory limit by writing the value to sysfs node 'mem_limit'.
139e7751617SMauro Carvalho ChehabThe value can be either in bytes or you can use mem suffixes.
140e7751617SMauro Carvalho ChehabIn addition, you could change the value in runtime.
141e7751617SMauro Carvalho ChehabExamples::
142e7751617SMauro Carvalho Chehab
143e7751617SMauro Carvalho Chehab	# limit /dev/zram0 with 50MB memory
144e7751617SMauro Carvalho Chehab	echo $((50*1024*1024)) > /sys/block/zram0/mem_limit
145e7751617SMauro Carvalho Chehab
146e7751617SMauro Carvalho Chehab	# Using mem suffixes
147e7751617SMauro Carvalho Chehab	echo 256K > /sys/block/zram0/mem_limit
148e7751617SMauro Carvalho Chehab	echo 512M > /sys/block/zram0/mem_limit
149e7751617SMauro Carvalho Chehab	echo 1G > /sys/block/zram0/mem_limit
150e7751617SMauro Carvalho Chehab
151e7751617SMauro Carvalho Chehab	# To disable memory limit
152e7751617SMauro Carvalho Chehab	echo 0 > /sys/block/zram0/mem_limit
153e7751617SMauro Carvalho Chehab
154e7751617SMauro Carvalho Chehab6) Activate
155e7751617SMauro Carvalho Chehab===========
156e7751617SMauro Carvalho Chehab
157e7751617SMauro Carvalho Chehab::
158e7751617SMauro Carvalho Chehab
159e7751617SMauro Carvalho Chehab	mkswap /dev/zram0
160e7751617SMauro Carvalho Chehab	swapon /dev/zram0
161e7751617SMauro Carvalho Chehab
162e7751617SMauro Carvalho Chehab	mkfs.ext4 /dev/zram1
163e7751617SMauro Carvalho Chehab	mount /dev/zram1 /tmp
164e7751617SMauro Carvalho Chehab
165e7751617SMauro Carvalho Chehab7) Add/remove zram devices
166e7751617SMauro Carvalho Chehab==========================
167e7751617SMauro Carvalho Chehab
168e7751617SMauro Carvalho Chehabzram provides a control interface, which enables dynamic (on-demand) device
169e7751617SMauro Carvalho Chehabaddition and removal.
170e7751617SMauro Carvalho Chehab
171a3e1c56aSRandy DunlapIn order to add a new /dev/zramX device, perform a read operation on the hot_add
172a3e1c56aSRandy Dunlapattribute. This will return either the new device's device id (meaning that you
173a3e1c56aSRandy Dunlapcan use /dev/zram<id>) or an error code.
174e7751617SMauro Carvalho Chehab
175e7751617SMauro Carvalho ChehabExample::
176e7751617SMauro Carvalho Chehab
177e7751617SMauro Carvalho Chehab	cat /sys/class/zram-control/hot_add
178e7751617SMauro Carvalho Chehab	1
179e7751617SMauro Carvalho Chehab
180e7751617SMauro Carvalho ChehabTo remove the existing /dev/zramX device (where X is a device id)
181e7751617SMauro Carvalho Chehabexecute::
182e7751617SMauro Carvalho Chehab
183e7751617SMauro Carvalho Chehab	echo X > /sys/class/zram-control/hot_remove
184e7751617SMauro Carvalho Chehab
185e7751617SMauro Carvalho Chehab8) Stats
186e7751617SMauro Carvalho Chehab========
187e7751617SMauro Carvalho Chehab
188e7751617SMauro Carvalho ChehabPer-device statistics are exported as various nodes under /sys/block/zram<id>/
189e7751617SMauro Carvalho Chehab
190a3e1c56aSRandy DunlapA brief description of exported device attributes follows. For more details
191a3e1c56aSRandy Dunlapplease read Documentation/ABI/testing/sysfs-block-zram.
192e7751617SMauro Carvalho Chehab
193e7751617SMauro Carvalho Chehab======================  ======  ===============================================
194e7751617SMauro Carvalho ChehabName            	access            description
195e7751617SMauro Carvalho Chehab======================  ======  ===============================================
196e7751617SMauro Carvalho Chehabdisksize          	RW	show and set the device's disk size
197e7751617SMauro Carvalho Chehabinitstate         	RO	shows the initialization state of the device
198e7751617SMauro Carvalho Chehabreset             	WO	trigger device reset
199e7751617SMauro Carvalho Chehabmem_used_max      	WO	reset the `mem_used_max` counter (see later)
200e7751617SMauro Carvalho Chehabmem_limit         	WO	specifies the maximum amount of memory ZRAM can
201e7751617SMauro Carvalho Chehab				use to store the compressed data
202e7751617SMauro Carvalho Chehabwriteback_limit   	WO	specifies the maximum amount of write IO zram
203e7751617SMauro Carvalho Chehab				can write out to backing device as 4KB unit
204e7751617SMauro Carvalho Chehabwriteback_limit_enable  RW	show and set writeback_limit feature
205e7751617SMauro Carvalho Chehabmax_comp_streams  	RW	the number of possible concurrent compress
206e7751617SMauro Carvalho Chehab				operations
207e7751617SMauro Carvalho Chehabcomp_algorithm    	RW	show and change the compression algorithm
208e7751617SMauro Carvalho Chehabcompact           	WO	trigger memory compaction
209e7751617SMauro Carvalho Chehabdebug_stat        	RO	this file is used for zram debugging purposes
210e7751617SMauro Carvalho Chehabbacking_dev	  	RW	set up backend storage for zram to write out
211e7751617SMauro Carvalho Chehabidle		  	WO	mark allocated slot as idle
212e7751617SMauro Carvalho Chehab======================  ======  ===============================================
213e7751617SMauro Carvalho Chehab
214e7751617SMauro Carvalho Chehab
215e7751617SMauro Carvalho ChehabUser space is advised to use the following files to read the device statistics.
216e7751617SMauro Carvalho Chehab
217e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/stat
218e7751617SMauro Carvalho Chehab
219e7751617SMauro Carvalho ChehabRepresents block layer statistics. Read Documentation/block/stat.rst for
220e7751617SMauro Carvalho Chehabdetails.
221e7751617SMauro Carvalho Chehab
222e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/io_stat
223e7751617SMauro Carvalho Chehab
224e7751617SMauro Carvalho ChehabThe stat file represents device's I/O statistics not accounted by block
225e7751617SMauro Carvalho Chehablayer and, thus, not available in zram<id>/stat file. It consists of a
226e7751617SMauro Carvalho Chehabsingle line of text and contains the following stats separated by
227e7751617SMauro Carvalho Chehabwhitespace:
228e7751617SMauro Carvalho Chehab
229e7751617SMauro Carvalho Chehab =============    =============================================================
230e7751617SMauro Carvalho Chehab failed_reads     The number of failed reads
231e7751617SMauro Carvalho Chehab failed_writes    The number of failed writes
232e7751617SMauro Carvalho Chehab invalid_io       The number of non-page-size-aligned I/O requests
233e7751617SMauro Carvalho Chehab notify_free      Depending on device usage scenario it may account
234e7751617SMauro Carvalho Chehab
235e7751617SMauro Carvalho Chehab                  a) the number of pages freed because of swap slot free
236e7751617SMauro Carvalho Chehab                     notifications
237e7751617SMauro Carvalho Chehab                  b) the number of pages freed because of
238e7751617SMauro Carvalho Chehab                     REQ_OP_DISCARD requests sent by bio. The former ones are
239e7751617SMauro Carvalho Chehab                     sent to a swap block device when a swap slot is freed,
240e7751617SMauro Carvalho Chehab                     which implies that this disk is being used as a swap disk.
241e7751617SMauro Carvalho Chehab
242e7751617SMauro Carvalho Chehab                  The latter ones are sent by filesystem mounted with
243e7751617SMauro Carvalho Chehab                  discard option, whenever some data blocks are getting
244e7751617SMauro Carvalho Chehab                  discarded.
245e7751617SMauro Carvalho Chehab =============    =============================================================
246e7751617SMauro Carvalho Chehab
247e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/mm_stat
248e7751617SMauro Carvalho Chehab
249a3e1c56aSRandy DunlapThe mm_stat file represents the device's mm statistics. It consists of a single
250e7751617SMauro Carvalho Chehabline of text and contains the following stats separated by whitespace:
251e7751617SMauro Carvalho Chehab
252e7751617SMauro Carvalho Chehab ================ =============================================================
253e7751617SMauro Carvalho Chehab orig_data_size   uncompressed size of data stored in this disk.
254e7751617SMauro Carvalho Chehab                  Unit: bytes
255e7751617SMauro Carvalho Chehab compr_data_size  compressed size of data stored in this disk
256e7751617SMauro Carvalho Chehab mem_used_total   the amount of memory allocated for this disk. This
257e7751617SMauro Carvalho Chehab                  includes allocator fragmentation and metadata overhead,
258e7751617SMauro Carvalho Chehab                  allocated for this disk. So, allocator space efficiency
259e7751617SMauro Carvalho Chehab                  can be calculated using compr_data_size and this statistic.
260e7751617SMauro Carvalho Chehab                  Unit: bytes
261e7751617SMauro Carvalho Chehab mem_limit        the maximum amount of memory ZRAM can use to store
262e7751617SMauro Carvalho Chehab                  the compressed data
263a3e1c56aSRandy Dunlap mem_used_max     the maximum amount of memory zram has consumed to
264e7751617SMauro Carvalho Chehab                  store the data
265e7751617SMauro Carvalho Chehab same_pages       the number of same element filled pages written to this disk.
266e7751617SMauro Carvalho Chehab                  No memory is allocated for such pages.
267e7751617SMauro Carvalho Chehab pages_compacted  the number of pages freed during compaction
268e7751617SMauro Carvalho Chehab huge_pages	  the number of incompressible pages
269194e28daSMinchan Kim huge_pages_since the number of incompressible pages since zram set up
270e7751617SMauro Carvalho Chehab ================ =============================================================
271e7751617SMauro Carvalho Chehab
272e7751617SMauro Carvalho ChehabFile /sys/block/zram<id>/bd_stat
273e7751617SMauro Carvalho Chehab
274a3e1c56aSRandy DunlapThe bd_stat file represents a device's backing device statistics. It consists of
275e7751617SMauro Carvalho Chehaba single line of text and contains the following stats separated by whitespace:
276e7751617SMauro Carvalho Chehab
277e7751617SMauro Carvalho Chehab ============== =============================================================
278e7751617SMauro Carvalho Chehab bd_count	size of data written in backing device.
279e7751617SMauro Carvalho Chehab		Unit: 4K bytes
280e7751617SMauro Carvalho Chehab bd_reads	the number of reads from backing device
281e7751617SMauro Carvalho Chehab		Unit: 4K bytes
282e7751617SMauro Carvalho Chehab bd_writes	the number of writes to backing device
283e7751617SMauro Carvalho Chehab		Unit: 4K bytes
284e7751617SMauro Carvalho Chehab ============== =============================================================
285e7751617SMauro Carvalho Chehab
286e7751617SMauro Carvalho Chehab9) Deactivate
287e7751617SMauro Carvalho Chehab=============
288e7751617SMauro Carvalho Chehab
289e7751617SMauro Carvalho Chehab::
290e7751617SMauro Carvalho Chehab
291e7751617SMauro Carvalho Chehab	swapoff /dev/zram0
292e7751617SMauro Carvalho Chehab	umount /dev/zram1
293e7751617SMauro Carvalho Chehab
294e7751617SMauro Carvalho Chehab10) Reset
295e7751617SMauro Carvalho Chehab=========
296e7751617SMauro Carvalho Chehab
297e7751617SMauro Carvalho Chehab	Write any positive value to 'reset' sysfs node::
298e7751617SMauro Carvalho Chehab
299e7751617SMauro Carvalho Chehab		echo 1 > /sys/block/zram0/reset
300e7751617SMauro Carvalho Chehab		echo 1 > /sys/block/zram1/reset
301e7751617SMauro Carvalho Chehab
302e7751617SMauro Carvalho Chehab	This frees all the memory allocated for the given device and
303e7751617SMauro Carvalho Chehab	resets the disksize to zero. You must set the disksize again
304e7751617SMauro Carvalho Chehab	before reusing the device.
305e7751617SMauro Carvalho Chehab
306e7751617SMauro Carvalho ChehabOptional Feature
307e7751617SMauro Carvalho Chehab================
308e7751617SMauro Carvalho Chehab
309e7751617SMauro Carvalho Chehabwriteback
310e7751617SMauro Carvalho Chehab---------
311e7751617SMauro Carvalho Chehab
312e7751617SMauro Carvalho ChehabWith CONFIG_ZRAM_WRITEBACK, zram can write idle/incompressible page
313e7751617SMauro Carvalho Chehabto backing storage rather than keeping it in memory.
314e7751617SMauro Carvalho ChehabTo use the feature, admin should set up backing device via::
315e7751617SMauro Carvalho Chehab
316e7751617SMauro Carvalho Chehab	echo /dev/sda5 > /sys/block/zramX/backing_dev
317e7751617SMauro Carvalho Chehab
3184fbe7b19SEthan Dyebefore disksize setting. It supports only partitions at this moment.
3194fbe7b19SEthan DyeIf admin wants to use incompressible page writeback, they could do it via::
320e7751617SMauro Carvalho Chehab
3215871023cSYue Hu	echo huge > /sys/block/zramX/writeback
322e7751617SMauro Carvalho Chehab
323e7751617SMauro Carvalho ChehabTo use idle page writeback, first, user need to declare zram pages
324e7751617SMauro Carvalho Chehabas idle::
325e7751617SMauro Carvalho Chehab
326e7751617SMauro Carvalho Chehab	echo all > /sys/block/zramX/idle
327e7751617SMauro Carvalho Chehab
328e7751617SMauro Carvalho ChehabFrom now on, any pages on zram are idle pages. The idle mark
329a3e1c56aSRandy Dunlapwill be removed until someone requests access of the block.
330e7751617SMauro Carvalho ChehabIOW, unless there is access request, those pages are still idle pages.
331a7a03505SSergey SenozhatskyAdditionally, when CONFIG_ZRAM_TRACK_ENTRY_ACTIME is enabled pages can be
332755804d1SBrian Geffonmarked as idle based on how long (in seconds) it's been since they were
333755804d1SBrian Geffonlast accessed::
334755804d1SBrian Geffon
335755804d1SBrian Geffon        echo 86400 > /sys/block/zramX/idle
336755804d1SBrian Geffon
337755804d1SBrian GeffonIn this example all pages which haven't been accessed in more than 86400
338755804d1SBrian Geffonseconds (one day) will be marked idle.
339e7751617SMauro Carvalho Chehab
340e7751617SMauro Carvalho ChehabAdmin can request writeback of those idle pages at right timing via::
341e7751617SMauro Carvalho Chehab
342e7751617SMauro Carvalho Chehab	echo idle > /sys/block/zramX/writeback
343e7751617SMauro Carvalho Chehab
3444fbe7b19SEthan DyeWith the command, zram will writeback idle pages from memory to the storage.
345e7751617SMauro Carvalho Chehab
34630226b69SBrian GeffonAdditionally, if a user choose to writeback only huge and idle pages
34730226b69SBrian Geffonthis can be accomplished with::
34830226b69SBrian Geffon
34930226b69SBrian Geffon        echo huge_idle > /sys/block/zramX/writeback
35030226b69SBrian Geffon
351b46f9ea3SSergey SenozhatskyIf a user chooses to writeback only incompressible pages (pages that none of
352b46f9ea3SSergey Senozhatskyalgorithms can compress) this can be accomplished with::
353b46f9ea3SSergey Senozhatsky
354b46f9ea3SSergey Senozhatsky	echo incompressible > /sys/block/zramX/writeback
355b46f9ea3SSergey Senozhatsky
3564fbe7b19SEthan DyeIf an admin wants to write a specific page in zram device to the backing device,
357b46f9ea3SSergey Senozhatskythey could write a page index into the interface::
3580d835962SMinchan Kim
3590d835962SMinchan Kim	echo "page_index=1251" > /sys/block/zramX/writeback
3600d835962SMinchan Kim
361e7751617SMauro Carvalho ChehabIf there are lots of write IO with flash device, potentially, it has
362e7751617SMauro Carvalho Chehabflash wearout problem so that admin needs to design write limitation
363e7751617SMauro Carvalho Chehabto guarantee storage health for entire product life.
364e7751617SMauro Carvalho Chehab
365e7751617SMauro Carvalho ChehabTo overcome the concern, zram supports "writeback_limit" feature.
366e7751617SMauro Carvalho ChehabThe "writeback_limit_enable"'s default value is 0 so that it doesn't limit
3674fbe7b19SEthan Dyeany writeback. IOW, if admin wants to apply writeback budget, they should
368e7751617SMauro Carvalho Chehabenable writeback_limit_enable via::
369e7751617SMauro Carvalho Chehab
370e7751617SMauro Carvalho Chehab	$ echo 1 > /sys/block/zramX/writeback_limit_enable
371e7751617SMauro Carvalho Chehab
372e7751617SMauro Carvalho ChehabOnce writeback_limit_enable is set, zram doesn't allow any writeback
373a3e1c56aSRandy Dunlapuntil admin sets the budget via /sys/block/zramX/writeback_limit.
374e7751617SMauro Carvalho Chehab
375e7751617SMauro Carvalho Chehab(If admin doesn't enable writeback_limit_enable, writeback_limit's value
376a3e1c56aSRandy Dunlapassigned via /sys/block/zramX/writeback_limit is meaningless.)
377e7751617SMauro Carvalho Chehab
3784fbe7b19SEthan DyeIf admin wants to limit writeback as per-day 400M, they could do it
379e7751617SMauro Carvalho Chehablike below::
380e7751617SMauro Carvalho Chehab
381e7751617SMauro Carvalho Chehab	$ MB_SHIFT=20
382e7751617SMauro Carvalho Chehab	$ 4K_SHIFT=12
383e7751617SMauro Carvalho Chehab	$ echo $((400<<MB_SHIFT>>4K_SHIFT)) > \
384e7751617SMauro Carvalho Chehab		/sys/block/zram0/writeback_limit.
385e7751617SMauro Carvalho Chehab	$ echo 1 > /sys/block/zram0/writeback_limit_enable
386e7751617SMauro Carvalho Chehab
387b2105aa2SAndrew KlychkovIf admins want to allow further write again once the budget is exhausted,
3884fbe7b19SEthan Dyethey could do it like below::
389e7751617SMauro Carvalho Chehab
390e7751617SMauro Carvalho Chehab	$ echo $((400<<MB_SHIFT>>4K_SHIFT)) > \
391e7751617SMauro Carvalho Chehab		/sys/block/zram0/writeback_limit
392e7751617SMauro Carvalho Chehab
3934fbe7b19SEthan DyeIf an admin wants to see the remaining writeback budget since last set::
394e7751617SMauro Carvalho Chehab
395e7751617SMauro Carvalho Chehab	$ cat /sys/block/zramX/writeback_limit
396e7751617SMauro Carvalho Chehab
3974fbe7b19SEthan DyeIf an admin wants to disable writeback limit, they could do::
398e7751617SMauro Carvalho Chehab
399e7751617SMauro Carvalho Chehab	$ echo 0 > /sys/block/zramX/writeback_limit_enable
400e7751617SMauro Carvalho Chehab
401e7751617SMauro Carvalho ChehabThe writeback_limit count will reset whenever you reset zram (e.g.,
402e7751617SMauro Carvalho Chehabsystem reboot, echo 1 > /sys/block/zramX/reset) so keeping how many of
403e7751617SMauro Carvalho Chehabwriteback happened until you reset the zram to allocate extra writeback
404e7751617SMauro Carvalho Chehabbudget in next setting is user's job.
405e7751617SMauro Carvalho Chehab
4064fbe7b19SEthan DyeIf admin wants to measure writeback count in a certain period, they could
407e7751617SMauro Carvalho Chehabknow it via /sys/block/zram0/bd_stat's 3rd column.
408e7751617SMauro Carvalho Chehab
409443dd798SSergey Senozhatskyrecompression
410443dd798SSergey Senozhatsky-------------
411443dd798SSergey Senozhatsky
412443dd798SSergey SenozhatskyWith CONFIG_ZRAM_MULTI_COMP, zram can recompress pages using alternative
413443dd798SSergey Senozhatsky(secondary) compression algorithms. The basic idea is that alternative
414443dd798SSergey Senozhatskycompression algorithm can provide better compression ratio at a price of
415443dd798SSergey Senozhatsky(potentially) slower compression/decompression speeds. Alternative compression
416443dd798SSergey Senozhatskyalgorithm can, for example, be more successful compressing huge pages (those
417443dd798SSergey Senozhatskythat default algorithm failed to compress). Another application is idle pages
418443dd798SSergey Senozhatskyrecompression - pages that are cold and sit in the memory can be recompressed
419443dd798SSergey Senozhatskyusing more effective algorithm and, hence, reduce zsmalloc memory usage.
420443dd798SSergey Senozhatsky
421443dd798SSergey SenozhatskyWith CONFIG_ZRAM_MULTI_COMP, zram supports up to 4 compression algorithms:
422443dd798SSergey Senozhatskyone primary and up to 3 secondary ones. Primary zram compressor is explained
423443dd798SSergey Senozhatskyin "3) Select compression algorithm", secondary algorithms are configured
424443dd798SSergey Senozhatskyusing recomp_algorithm device attribute.
425443dd798SSergey Senozhatsky
426443dd798SSergey SenozhatskyExample:::
427443dd798SSergey Senozhatsky
428443dd798SSergey Senozhatsky	#show supported recompression algorithms
429443dd798SSergey Senozhatsky	cat /sys/block/zramX/recomp_algorithm
430443dd798SSergey Senozhatsky	#1: lzo lzo-rle lz4 lz4hc [zstd]
431443dd798SSergey Senozhatsky	#2: lzo lzo-rle lz4 [lz4hc] zstd
432443dd798SSergey Senozhatsky
433443dd798SSergey SenozhatskyAlternative compression algorithms are sorted by priority. In the example
434443dd798SSergey Senozhatskyabove, zstd is used as the first alternative algorithm, which has priority
435443dd798SSergey Senozhatskyof 1, while lz4hc is configured as a compression algorithm with priority 2.
436443dd798SSergey SenozhatskyAlternative compression algorithm's priority is provided during algorithms
437443dd798SSergey Senozhatskyconfiguration:::
438443dd798SSergey Senozhatsky
439443dd798SSergey Senozhatsky	#select zstd recompression algorithm, priority 1
440443dd798SSergey Senozhatsky	echo "algo=zstd priority=1" > /sys/block/zramX/recomp_algorithm
441443dd798SSergey Senozhatsky
442443dd798SSergey Senozhatsky	#select deflate recompression algorithm, priority 2
443443dd798SSergey Senozhatsky	echo "algo=deflate priority=2" > /sys/block/zramX/recomp_algorithm
444443dd798SSergey Senozhatsky
445443dd798SSergey SenozhatskyAnother device attribute that CONFIG_ZRAM_MULTI_COMP enables is recompress,
446443dd798SSergey Senozhatskywhich controls recompression.
447443dd798SSergey Senozhatsky
448443dd798SSergey SenozhatskyExamples:::
449443dd798SSergey Senozhatsky
450443dd798SSergey Senozhatsky	#IDLE pages recompression is activated by `idle` mode
451443dd798SSergey Senozhatsky	echo "type=idle" > /sys/block/zramX/recompress
452443dd798SSergey Senozhatsky
453443dd798SSergey Senozhatsky	#HUGE pages recompression is activated by `huge` mode
454443dd798SSergey Senozhatsky	echo "type=huge" > /sys/block/zram0/recompress
455443dd798SSergey Senozhatsky
456443dd798SSergey Senozhatsky	#HUGE_IDLE pages recompression is activated by `huge_idle` mode
457443dd798SSergey Senozhatsky	echo "type=huge_idle" > /sys/block/zramX/recompress
458443dd798SSergey Senozhatsky
459443dd798SSergey SenozhatskyThe number of idle pages can be significant, so user-space can pass a size
460443dd798SSergey Senozhatskythreshold (in bytes) to the recompress knob: zram will recompress only pages
461443dd798SSergey Senozhatskyof equal or greater size:::
462443dd798SSergey Senozhatsky
463443dd798SSergey Senozhatsky	#recompress all pages larger than 3000 bytes
464443dd798SSergey Senozhatsky	echo "threshold=3000" > /sys/block/zramX/recompress
465443dd798SSergey Senozhatsky
466443dd798SSergey Senozhatsky	#recompress idle pages larger than 2000 bytes
467443dd798SSergey Senozhatsky	echo "type=idle threshold=2000" > /sys/block/zramX/recompress
468443dd798SSergey Senozhatsky
469*34efe1c3SSergey SenozhatskyIt is also possible to limit the number of pages zram re-compression will
470*34efe1c3SSergey Senozhatskyattempt to recompress:::
471*34efe1c3SSergey Senozhatsky
472*34efe1c3SSergey Senozhatsky	echo "type=huge_idle max_pages=42" > /sys/block/zramX/recompress
473*34efe1c3SSergey Senozhatsky
474443dd798SSergey SenozhatskyRecompression of idle pages requires memory tracking.
475443dd798SSergey Senozhatsky
476443dd798SSergey SenozhatskyDuring re-compression for every page, that matches re-compression criteria,
477443dd798SSergey SenozhatskyZRAM iterates the list of registered alternative compression algorithms in
478443dd798SSergey Senozhatskyorder of their priorities. ZRAM stops either when re-compression was
479443dd798SSergey Senozhatskysuccessful (re-compressed object is smaller in size than the original one)
480443dd798SSergey Senozhatskyand matches re-compression criteria (e.g. size threshold) or when there are
481443dd798SSergey Senozhatskyno secondary algorithms left to try. If none of the secondary algorithms can
482443dd798SSergey Senozhatskysuccessfully re-compressed the page such a page is marked as incompressible,
483443dd798SSergey Senozhatskyso ZRAM will not attempt to re-compress it in the future.
484443dd798SSergey Senozhatsky
485443dd798SSergey SenozhatskyThis re-compression behaviour, when it iterates through the list of
486443dd798SSergey Senozhatskyregistered compression algorithms, increases our chances of finding the
487443dd798SSergey Senozhatskyalgorithm that successfully compresses a particular page. Sometimes, however,
488443dd798SSergey Senozhatskyit is convenient (and sometimes even necessary) to limit recompression to
489443dd798SSergey Senozhatskyonly one particular algorithm so that it will not try any other algorithms.
490443dd798SSergey SenozhatskyThis can be achieved by providing a algo=NAME parameter:::
491443dd798SSergey Senozhatsky
492443dd798SSergey Senozhatsky	#use zstd algorithm only (if registered)
493443dd798SSergey Senozhatsky	echo "type=huge algo=zstd" > /sys/block/zramX/recompress
494443dd798SSergey Senozhatsky
495e7751617SMauro Carvalho Chehabmemory tracking
496e7751617SMauro Carvalho Chehab===============
497e7751617SMauro Carvalho Chehab
498e7751617SMauro Carvalho ChehabWith CONFIG_ZRAM_MEMORY_TRACKING, user can know information of the
499e7751617SMauro Carvalho Chehabzram block. It could be useful to catch cold or incompressible
500e7751617SMauro Carvalho Chehabpages of the process with*pagemap.
501e7751617SMauro Carvalho Chehab
502e7751617SMauro Carvalho ChehabIf you enable the feature, you could see block state via
503e7751617SMauro Carvalho Chehab/sys/kernel/debug/zram/zram0/block_state". The output is as follows::
504e7751617SMauro Carvalho Chehab
50577db7bb5SSergey Senozhatsky	  300    75.033841 .wh...
50677db7bb5SSergey Senozhatsky	  301    63.806904 s.....
50777db7bb5SSergey Senozhatsky	  302    63.806919 ..hi..
50877db7bb5SSergey Senozhatsky	  303    62.801919 ....r.
50977db7bb5SSergey Senozhatsky	  304   146.781902 ..hi.n
510e7751617SMauro Carvalho Chehab
511e7751617SMauro Carvalho ChehabFirst column
512e7751617SMauro Carvalho Chehab	zram's block index.
513e7751617SMauro Carvalho ChehabSecond column
514e7751617SMauro Carvalho Chehab	access time since the system was booted
515e7751617SMauro Carvalho ChehabThird column
516e7751617SMauro Carvalho Chehab	state of the block:
517e7751617SMauro Carvalho Chehab
518e7751617SMauro Carvalho Chehab	s:
519e7751617SMauro Carvalho Chehab		same page
520e7751617SMauro Carvalho Chehab	w:
521e7751617SMauro Carvalho Chehab		written page to backing store
522e7751617SMauro Carvalho Chehab	h:
523e7751617SMauro Carvalho Chehab		huge page
524e7751617SMauro Carvalho Chehab	i:
525e7751617SMauro Carvalho Chehab		idle page
52660e9b39eSSergey Senozhatsky	r:
52760e9b39eSSergey Senozhatsky		recompressed page (secondary compression algorithm)
52877db7bb5SSergey Senozhatsky	n:
52977db7bb5SSergey Senozhatsky		none (including secondary) of algorithms could compress it
530e7751617SMauro Carvalho Chehab
531e7751617SMauro Carvalho ChehabFirst line of above example says 300th block is accessed at 75.033841sec
532e7751617SMauro Carvalho Chehaband the block's state is huge so it is written back to the backing
533e7751617SMauro Carvalho Chehabstorage. It's a debugging feature so anyone shouldn't rely on it to work
534e7751617SMauro Carvalho Chehabproperly.
535e7751617SMauro Carvalho Chehab
536e7751617SMauro Carvalho ChehabNitin Gupta
537e7751617SMauro Carvalho Chehabngupta@vflare.org
538