.\"-
.\" Copyright (c) 2001 Dag-Erling Coïdan Smørgrav
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 13, 2018
.Dt ZONE 9
.Os
.Sh NAME
.Nm uma_zcreate ,
.Nm uma_zalloc ,
.Nm uma_zalloc_arg ,
.Nm uma_zalloc_domain ,
.Nm uma_zfree ,
.Nm uma_zfree_arg ,
.Nm uma_zfree_domain ,
.Nm uma_zdestroy ,
.Nm uma_zone_set_max ,
.Nm uma_zone_get_max ,
.Nm uma_zone_get_cur ,
.Nm uma_zone_set_warning ,
.Nm uma_zone_set_maxaction
.Nd zone allocator
.Sh SYNOPSIS
.In sys/param.h
.In sys/queue.h
.In vm/uma.h
.Ft uma_zone_t
.Fo uma_zcreate
.Fa "const char *name" "size_t size"
.Fa "uma_ctor ctor" "uma_dtor dtor" "uma_init uminit" "uma_fini fini"
.Fa "int align" "uint32_t flags"
.Fc
.Ft "void *"
.Fn uma_zalloc "uma_zone_t zone" "int flags"
.Ft "void *"
.Fn uma_zalloc_arg "uma_zone_t zone" "void *arg" "int flags"
.Ft "void *"
.Fn uma_zalloc_domain "uma_zone_t zone" "void *arg" "int domain" "int flags"
.Ft void
.Fn uma_zfree "uma_zone_t zone" "void *item"
.Ft void
.Fn uma_zfree_arg "uma_zone_t zone" "void *item" "void *arg"
.Ft void
.Fn uma_zfree_domain "uma_zone_t zone" "void *item" "void *arg"
.Ft void
.Fn uma_zdestroy "uma_zone_t zone"
.Ft int
.Fn uma_zone_set_max "uma_zone_t zone" "int nitems"
.Ft int
.Fn uma_zone_get_max "uma_zone_t zone"
.Ft int
.Fn uma_zone_get_cur "uma_zone_t zone"
.Ft void
.Fn uma_zone_set_warning "uma_zone_t zone" "const char *warning"
.Ft void
.Fn uma_zone_set_maxaction "uma_zone_t zone" "void (*maxaction)(uma_zone_t)"
.In sys/sysctl.h
.Fn SYSCTL_UMA_MAX parent nbr name access zone descr
.Fn SYSCTL_ADD_UMA_MAX ctx parent nbr name access zone descr
.Fn SYSCTL_UMA_CUR parent nbr name access zone descr
.Fn SYSCTL_ADD_UMA_CUR ctx parent nbr name access zone descr
.Sh DESCRIPTION
The zone allocator provides an efficient interface for managing
dynamically-sized collections of items of identical size.
The zone allocator can work with preallocated zones as well as with
runtime-allocated ones, and is therefore available much earlier in the
boot process than other memory management routines.
The zone allocator provides per-CPU allocation caches with linear
scalability on SMP systems as well as round-robin and first-touch
policies for NUMA systems.
.Pp
A zone is an extensible collection of items of identical size.
The zone allocator keeps track of which items are in use and which
are not, and provides functions for allocating items from the zone and
for releasing them back (which makes them available for later use).
.Pp
After the first allocation of an item,
it will have been cleared to zeroes; however, subsequent allocations
will retain the contents as of the last free.
.Pp
The
.Fn uma_zcreate
function creates a new zone from which items may then be allocated.
The
.Fa name
argument is a text name of the zone for debugging and stats; this memory
should not be freed until the zone has been destroyed.
.Pp
The
.Fa ctor
and
.Fa dtor
arguments are callback functions that are called by
the UMA subsystem at the time of the call to
.Fn uma_zalloc
and
.Fn uma_zfree
respectively.
Their purpose is to provide hooks for any initialization or teardown
that must happen each time an item is allocated or released.
A good use for the
.Fa ctor
and
.Fa dtor
callbacks
might be to adjust a global count of the number of objects allocated.
.Pp
The
.Fa uminit
and
.Fa fini
arguments are used to optimize the allocation of
objects from the zone.
They are called by the UMA subsystem whenever
it needs to allocate or free several items to satisfy requests or memory
pressure.
A good use for the
.Fa uminit
and
.Fa fini
callbacks might be to
initialize and destroy mutexes contained within the object.
This would
allow one to re-use already initialized mutexes when an object is returned
from the UMA subsystem's object cache.
They are not called on each call to
.Fn uma_zalloc
and
.Fn uma_zfree
but rather in a batch mode on several objects.
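.Pp
As a minimal sketch of how the four callbacks might be combined, the
following fragment creates a zone for a hypothetical
.Vt struct foo .
The structure, the counter and every
.Dq foo
name are assumptions made for illustration only.
The callback prototypes are assumed to follow the
.Vt uma_ctor ,
.Vt uma_dtor ,
.Vt uma_init
and
.Vt uma_fini
types declared in
.In vm/uma.h .
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <vm/uma.h>

struct foo {
	struct mtx	f_mtx;		/* stays initialized while cached */
	int		f_value;
};

static uma_zone_t foo_zone;
static volatile u_int foo_count;	/* items currently allocated */

/* Batch hook: runs when an item is first set up by the zone. */
static int
foo_init(void *mem, int size, int flags)
{
	struct foo *f = mem;

	mtx_init(&f->f_mtx, "foo", NULL, MTX_DEF);
	return (0);
}

/* Batch hook: runs before the item's memory leaves the zone. */
static void
foo_fini(void *mem, int size)
{
	struct foo *f = mem;

	mtx_destroy(&f->f_mtx);
}

/* Per-allocation hook: reset per-use state and bump the counter. */
static int
foo_ctor(void *mem, int size, void *arg, int flags)
{
	struct foo *f = mem;

	f->f_value = 0;
	atomic_add_int(&foo_count, 1);
	return (0);
}

/* Per-free hook: drop the counter; the mutex is left intact. */
static void
foo_dtor(void *mem, int size, void *arg)
{
	atomic_subtract_int(&foo_count, 1);
}

static void
foo_zone_setup(void)
{
	foo_zone = uma_zcreate("foo", sizeof(struct foo),
	    foo_ctor, foo_dtor, foo_init, foo_fini, UMA_ALIGN_PTR, 0);
}
.Ed
.Pp
Splitting the work this way keeps the comparatively expensive mutex setup
out of the per-allocation path, which is the optimization the
.Fa uminit
and
.Fa fini
callbacks are meant to enable.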
.Pp
The
.Fa flags
argument of
.Fn uma_zcreate
is a subset of the following flags:
.Bl -tag -width "foo"
.It Dv UMA_ZONE_NOFREE
Slabs of the zone are never returned to the VM.
.It Dv UMA_ZONE_NODUMP
Pages belonging to the zone will not be included in mini-dumps.
.It Dv UMA_ZONE_PCPU
An allocation from the zone has
.Va mp_ncpu
shadow copies that are privately assigned to CPUs.
A CPU can address its private copy using the base allocation address plus
a multiple of the current CPU ID and
.Fn sizeof "struct pcpu" :
.Bd -literal -offset indent
foo_zone = uma_zcreate(..., UMA_ZONE_PCPU);
 ...
foo_base = uma_zalloc(foo_zone, ...);
 ...
critical_enter();
foo_pcpu = (foo_t *)zpcpu_get(foo_base);
/* do something with foo_pcpu */
critical_exit();
.Ed
.It Dv UMA_ZONE_OFFPAGE
By default, book-keeping of items within a slab is done in the slab page itself.
This flag explicitly tells the subsystem that the book-keeping structure
should be allocated separately from a special internal zone.
This flag requires either
.Dv UMA_ZONE_VTOSLAB
or
.Dv UMA_ZONE_HASH ,
since the subsystem requires a mechanism to find the book-keeping structure
for an item being freed.
The subsystem may choose to prefer offpage book-keeping for certain zones
implicitly.
.It Dv UMA_ZONE_ZINIT
The zone will have its
.Ft uma_init
method set to an internal method that initializes a newly allocated slab
to all zeros.
Do not confuse the
.Ft uma_init
method with
.Ft uma_ctor .
A zone with the
.Dv UMA_ZONE_ZINIT
flag does not return zeroed memory on every
.Fn uma_zalloc .
.It Dv UMA_ZONE_HASH
The zone should use an internal hash table to find the slab book-keeping
structure to which an allocation being freed belongs.
.It Dv UMA_ZONE_VTOSLAB
The zone should use a special field of
.Vt vm_page_t
to find the slab book-keeping structure to which an allocation being freed
belongs.
.It Dv UMA_ZONE_MALLOC
The zone is for the
.Xr malloc 9
subsystem.
.It Dv UMA_ZONE_VM
The zone is for the VM subsystem.
.It Dv UMA_ZONE_NUMA
The zone should use a first-touch NUMA policy rather than the round-robin
default.
Callers that do not free memory on the same domain it is allocated
from will cause mixing in per-CPU caches.
See
.Xr numa 9
for more details.
.El
.Pp
To allocate an item from a zone, simply call
.Fn uma_zalloc
with a pointer to that zone
and set the
.Fa flags
argument to selected flags as documented in
.Xr malloc 9 .
It will return a pointer to an item if successful,
or
.Dv NULL
in the rare case where all items in the zone are in use and the
allocator is unable to grow the zone
and
.Dv M_NOWAIT
is specified.
.Pp
Items are released back to the zone from which they were allocated by
calling
.Fn uma_zfree
with a pointer to the zone and a pointer to the item.
If
.Fa item
is
.Dv NULL ,
then
.Fn uma_zfree
does nothing.
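.Pp
The allocate and release cycle described above might look like the
following sketch.
The function name
.Fn foo_use
is hypothetical, and the zone is assumed to have been created elsewhere
with
.Fn uma_zcreate .
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/malloc.h>
#include <vm/uma.h>

static int
foo_use(uma_zone_t zone)
{
	void *item;

	/* M_NOWAIT may return NULL; M_WAITOK would sleep instead. */
	item = uma_zalloc(zone, M_NOWAIT);
	if (item == NULL)
		return (ENOMEM);

	/* ... use the item ... */

	/* Return the item to the zone it came from. */
	uma_zfree(zone, item);
	return (0);
}
.Ed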
.Pp
The variations
.Fn uma_zalloc_arg
and
.Fn uma_zfree_arg
allow callers to
specify an argument for the
.Fa ctor
and
.Fa dtor
functions, respectively.
The
.Fn uma_zalloc_domain
function allows callers to specify a fixed
.Xr numa 9
domain to allocate from.
This uses a guaranteed but slow path in
the allocator which reduces concurrency.
The
.Fn uma_zfree_domain
function should be used to return memory allocated in this fashion.
This function infers the domain from the pointer and does not require it as an
argument.
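.Pp
A hedged sketch of pairing the two domain-aware calls, using the
prototypes from the SYNOPSIS, follows.
The wrapper names are hypothetical, and no
.Fa ctor
or
.Fa dtor
argument is passed.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/malloc.h>
#include <vm/uma.h>

/* Allocate from one specific NUMA domain; may sleep. */
static void *
foo_alloc_on_domain(uma_zone_t zone, int domain)
{
	return (uma_zalloc_domain(zone, NULL, domain, M_WAITOK));
}

/* Release an item obtained from foo_alloc_on_domain(). */
static void
foo_free_on_domain(uma_zone_t zone, void *item)
{
	/* The domain is inferred from the item pointer. */
	uma_zfree_domain(zone, item, NULL);
}
.Ed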
.Pp
Zones that are empty
can be destroyed using
.Fn uma_zdestroy ,
freeing all memory that was allocated for the zone.
All items allocated from the zone with
.Fn uma_zalloc
must have been freed with
.Fn uma_zfree
beforehand.
.Pp
The
.Fn uma_zone_set_max
function limits the number of items
.Pq and therefore memory
that can be allocated to
.Fa zone .
The
.Fa nitems
argument specifies the requested upper limit number of items.
The effective limit is returned to the caller, as it may end up being higher
than requested due to the implementation rounding up to ensure all memory pages
allocated to the zone are utilised to capacity.
The limit applies to the total number of items in the zone, which includes
allocated items, free items and free items in the per-CPU caches.
On systems with more than one CPU it may not be possible to allocate
the specified number of items even when there is no shortage of memory,
because all of the remaining free items may be in the caches of the
other CPUs when the limit is hit.
.Pp
The
.Fn uma_zone_get_max
function returns the effective upper limit number of items for a zone.
.Pp
The
.Fn uma_zone_get_cur
function returns the approximate current occupancy of the zone.
The returned value is approximate because appropriate synchronisation to
determine an exact value is not performed by the implementation.
This ensures low overhead at the expense of potentially stale data being used
in the calculation.
.Pp
The
.Fn uma_zone_set_warning
function sets a warning that will be printed on the system console when the
given zone becomes full and fails to allocate an item.
The warning will be printed no more often than every five minutes.
Warnings can be turned off globally by setting the
.Va vm.zone_warnings
sysctl tunable to
.Va 0 .
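.Pp
The limit and warning interfaces might be used together roughly as follows.
This is a sketch only: the function name
.Fn foo_limit ,
the figure of 1000 items and the warning text are all assumptions chosen
for illustration.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/systm.h>
#include <vm/uma.h>

static void
foo_limit(uma_zone_t zone)
{
	int limit;

	/* Request at most 1000 items; the result may be rounded up. */
	limit = uma_zone_set_max(zone, 1000);
	KASSERT(limit >= 1000,
	    ("foo: effective limit %d below request", limit));

	/* Rate-limited console message once the zone is full. */
	uma_zone_set_warning(zone, "foo zone limit reached");
}
.Ed
.Pp
Because the effective limit can exceed the request, code that sizes other
resources from it should use the returned value, or
.Fn uma_zone_get_max ,
rather than the constant it passed in.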
.Pp
The
.Fn uma_zone_set_maxaction
function sets a function that will be called when the given zone becomes full
and fails to allocate an item.
The function will be called with the zone locked.
Also, the function
that called the allocation function may have held additional locks.
Therefore,
this function should do very little work (similar to a signal handler).
.Pp
The
.Fn SYSCTL_UMA_MAX parent nbr name access zone descr
macro declares a static
.Xr sysctl 9
oid that exports the effective upper limit number of items for a zone.
The
.Fa zone
argument should be a pointer to
.Vt uma_zone_t .
A read of the oid returns the value obtained through
.Fn uma_zone_get_max .
A write to the oid sets a new value via
.Fn uma_zone_set_max .
The
.Fn SYSCTL_ADD_UMA_MAX ctx parent nbr name access zone descr
macro is provided to create this type of oid dynamically.
.Pp
The
.Fn SYSCTL_UMA_CUR parent nbr name access zone descr
macro declares a static read-only
.Xr sysctl 9
oid that exports the approximate current occupancy of the zone.
The
.Fa zone
argument should be a pointer to
.Vt uma_zone_t .
A read of the oid returns the value obtained through
.Fn uma_zone_get_cur .
The
.Fn SYSCTL_ADD_UMA_CUR ctx parent nbr name access zone descr
macro is provided to create this type of oid dynamically.
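.Pp
As an illustration, the static macros might be used as shown below.
The zone variable, the oid names and the choice of the
.Va vm
parent node are hypothetical; the argument order follows the SYNOPSIS.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>
#include <vm/uma.h>

static uma_zone_t foo_zone;	/* created elsewhere with uma_zcreate() */

/* Read/write: reads report the limit, writes change it. */
SYSCTL_UMA_MAX(_vm, OID_AUTO, foo_max, CTLFLAG_RW,
    &foo_zone, "Maximum number of foo items");

/* Read-only: approximate number of items currently in use. */
SYSCTL_UMA_CUR(_vm, OID_AUTO, foo_cur, CTLFLAG_RD,
    &foo_zone, "Approximate number of foo items in use");
.Ed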
.Sh RETURN VALUES
The
.Fn uma_zalloc
function returns a pointer to an item, or
.Dv NULL
if the zone ran out of unused items
and
.Dv M_NOWAIT
was specified.
.Sh IMPLEMENTATION NOTES
The memory that these allocation calls return is not executable.
The
.Fn uma_zalloc
function does not support the
.Dv M_EXEC
flag to allocate executable memory.
Not all platforms enforce a distinction between executable and
non-executable memory.
.Sh SEE ALSO
.Xr malloc 9
.Sh HISTORY
The zone allocator first appeared in
.Fx 3.0 .
It was radically changed in
.Fx 5.0
to function as a slab allocator.
.Sh AUTHORS
.An -nosplit
The zone allocator was written by
.An John S. Dyson .
The zone allocator was rewritten in large parts by
.An Jeff Roberson Aq Mt jeff@FreeBSD.org
to function as a slab allocator.
.Pp
This manual page was written by
.An Dag-Erling Sm\(/orgrav Aq Mt des@FreeBSD.org .
Changes for UMA by
.An Jeroen Ruigrok van der Werven Aq Mt asmodai@FreeBSD.org .