.\"
.\" Copyright 2000 Massachusetts Institute of Technology
.\"
.\" Permission to use, copy, modify, and distribute this software and
.\" its documentation for any purpose and without fee is hereby
.\" granted, provided that both the above copyright notice and this
.\" permission notice appear in all copies, that both the above
.\" copyright notice and this permission notice appear in all
.\" supporting documentation, and that the name of M.I.T. not be used
.\" in advertising or publicity pertaining to distribution of the
.\" software without specific, written prior permission.  M.I.T. makes
.\" no representations about the suitability of this software for any
.\" purpose.  It is provided "as is" without express or implied
.\" warranty.
.\"
.\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''.  M.I.T. DISCLAIMS
.\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE,
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT
.\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
.\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd January 30, 2023
.Dt SHM_OPEN 2
.Os
.Sh NAME
.Nm memfd_create , shm_create_largepage , shm_open , shm_rename , shm_unlink
.Nd "shared memory object operations"
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.In fcntl.h
.Ft int
.Fn memfd_create "const char *name" "unsigned int flags"
.Ft int
.Fo shm_create_largepage
.Fa "const char *path"
.Fa "int flags"
.Fa "int psind"
.Fa "int alloc_policy"
.Fa "mode_t mode"
.Fc
.Ft int
.Fn shm_open "const char *path" "int flags" "mode_t mode"
.Ft int
.Fn shm_rename "const char *path_from" "const char *path_to" "int flags"
.Ft int
.Fn shm_unlink "const char *path"
.Sh DESCRIPTION
The
.Fn shm_open
function opens (or optionally creates) a
POSIX
shared memory object named
.Fa path .
The
.Fa flags
argument contains a subset of the flags used by
.Xr open 2 .
An access mode of either
.Dv O_RDONLY
or
.Dv O_RDWR
must be included in
.Fa flags .
The optional flags
.Dv O_CREAT ,
.Dv O_EXCL ,
and
.Dv O_TRUNC
may also be specified.
.Pp
If
.Dv O_CREAT
is specified,
then a new shared memory object named
.Fa path
will be created if it does not exist.
In this case,
the shared memory object is created with mode
.Fa mode
subject to the process' umask value.
If both the
.Dv O_CREAT
and
.Dv O_EXCL
flags are specified and a shared memory object named
.Fa path
already exists,
then
.Fn shm_open
will fail with
.Er EEXIST .
.Pp
Newly created objects start off with a size of zero.
If an existing shared memory object is opened with
.Dv O_RDWR
and the
.Dv O_TRUNC
flag is specified,
then the shared memory object will be truncated to a size of zero.
The size of the object can be adjusted via
.Xr ftruncate 2
and queried via
.Xr fstat 2 .
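.Pp
The creation and sizing rules above can be sketched roughly as follows.
This is an illustration only: the helper name
.Fn create_and_measure
and the caller-supplied path are assumptions made for the example,
not part of this interface.
.Bd -literal
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a new object at path, size it to len
 * bytes, and return the size reported by fstat(2), or -1 on error.
 * The object is unlinked again before returning.
 */
static off_t
create_and_measure(const char *path, off_t len)
{
	struct stat sb;
	off_t size;
	int fd;

	/* O_EXCL makes creation fail if path is already in use. */
	fd = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return (-1);
	size = -1;
	/* A new object has size zero until ftruncate(2) extends it. */
	if (fstat(fd, &sb) == 0 && sb.st_size == 0 &&
	    ftruncate(fd, len) == 0 && fstat(fd, &sb) == 0)
		size = sb.st_size;
	(void)close(fd);
	(void)shm_unlink(path);
	return (size);
}
.Ed
.Pp
Called with an unused path that begins with a slash and a length of one page,
the helper should return that length.
A second exclusive create of a path that is still linked would instead fail.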
.Pp
The new descriptor is set to close during
.Xr execve 2
system calls;
see
.Xr close 2
and
.Xr fcntl 2 .
.Pp
The constant
.Dv SHM_ANON
may be used for the
.Fa path
argument to
.Fn shm_open .
In this case, an anonymous, unnamed shared memory object is created.
Since the object has no name,
it cannot be removed via a subsequent call to
.Fn shm_unlink ,
or moved with a call to
.Fn shm_rename .
Instead,
the shared memory object will be garbage collected when the last reference to
the shared memory object is removed.
The shared memory object may be shared with other processes by sharing the
file descriptor via
.Xr fork 2
or
.Xr sendmsg 2 .
Attempting to open an anonymous shared memory object with
.Dv O_RDONLY
will fail with
.Er EINVAL .
All other flags are ignored.
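.Pp
Sharing an anonymous object across
.Xr fork 2
might look like the sketch below.
The helper names and the fallback object name are hypothetical.
Where
.Dv SHM_ANON
is not defined, the sketch substitutes a create-then-unlink sequence,
which is a stand-in chosen for this example and not equivalent in every respect.
.Bd -literal
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Open an unnamed object; emulate it where SHM_ANON is missing. */
static int
open_unnamed(void)
{
#ifdef SHM_ANON
	return (shm_open(SHM_ANON, O_RDWR, 0600));
#else
	/* Assumption: this hypothetical name is not in use. */
	const char *name = "/shm_anon_fallback";
	int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);

	if (fd >= 0)
		(void)shm_unlink(name);
	return (fd);
#endif
}

/* Map the first int of the object behind fd; NULL on failure. */
static int *
map_int(int fd)
{
	void *p;

	p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED,
	    fd, 0);
	return (p == MAP_FAILED ? NULL : p);
}

/* Return the value a child stored through the shared object, or -1. */
static int
share_with_child(int value)
{
	int *p;
	pid_t pid;
	int fd, result, status;

	result = -1;
	fd = open_unnamed();
	if (fd < 0)
		return (-1);
	if (ftruncate(fd, sizeof(int)) != 0)
		goto out;
	pid = fork();
	if (pid == 0) {
		/* Child: the descriptor was inherited across fork(2). */
		p = map_int(fd);
		if (p == NULL)
			_exit(1);
		*p = value;
		_exit(0);
	}
	if (pid < 0 || waitpid(pid, &status, 0) != pid)
		goto out;
	p = map_int(fd);
	if (p != NULL) {
		result = *p;
		(void)munmap(p, sizeof(int));
	}
out:
	(void)close(fd);
	return (result);
}
.Ed
.Pp
Both processes map the same object through the inherited descriptor,
so the parent reads back whatever the child stored.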
.Pp
The
.Fn shm_create_largepage
function behaves similarly to
.Fn shm_open ,
except that the
.Dv O_CREAT
flag is implicitly specified, and the returned
.Dq largepage
object is always backed by aligned, physically contiguous chunks of memory.
This ensures that the object can be mapped using so-called
.Dq superpages ,
which can improve application performance in some workloads by reducing the
number of translation lookaside buffer (TLB) entries required to access a
mapping of the object,
and by reducing the number of page faults performed when accessing a mapping.
This happens automatically for all largepage objects.
.Pp
An existing largepage object can be opened using the
.Fn shm_open
function.
Largepage shared memory objects behave slightly differently from non-largepage
objects:
.Bl -bullet -offset indent
.It
Memory for a largepage object is allocated when the object is
extended using the
.Xr ftruncate 2
system call, whereas memory for regular shared memory objects is allocated
lazily and may be paged out to a swap device when not in use.
.It
The size of a mapping of a largepage object must be a multiple of the
underlying large page size.
Most attributes of such a mapping can only be modified at the granularity
of the large page size.
For example, when using
.Xr munmap 2
to unmap a portion of a largepage object mapping, or when using
.Xr mprotect 2
to adjust protections of a mapping of a largepage object, the starting address
must be large page size-aligned, and the length of the operation must be a
multiple of the large page size.
If not, the corresponding system call will fail and set
.Va errno
to
.Er EINVAL .
.El
.Pp
The
.Fa psind
argument to
.Fn shm_create_largepage
specifies the size of large pages used to back the object.
This argument is an index into the page sizes array returned by
.Xr getpagesizes 3 .
In particular, all large pages backing a largepage object must be of the
same size.
For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage
object will consist of either 1024 2MB pages, or 2 1GB pages, depending on
the value specified for the
.Fa psind
argument.
The
.Fa alloc_policy
parameter specifies what happens when an attempt to use
.Xr ftruncate 2
to allocate memory for the object fails.
The following values are accepted:
.Bl -tag -offset indent -width SHM_
.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT
If the (non-blocking) memory allocation fails because there is insufficient free
contiguous memory, the kernel will attempt to defragment physical memory and
try another allocation.
The subsequent allocation may or may not succeed.
If this subsequent allocation also fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT
If the memory allocation fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_HARD
The kernel will attempt defragmentation until the allocation succeeds,
or an unblocked signal is delivered to the thread.
However, it is possible for physical memory to be fragmented such that the
allocation will never succeed.
.El
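.Pp
The page-size arithmetic described for
.Fa psind
and for largepage mappings can be checked with two small helpers.
Both helper names are hypothetical.
In a real program the page size would presumably come from the
.Fa psind
entry of the array filled in by
.Xr getpagesizes 3 ;
here it is simply passed in.
.Bd -literal
#include <stdint.h>

/*
 * Number of large pages of size pagesize needed to back len bytes,
 * or 0 if len is not a whole multiple of pagesize.
 */
static uint64_t
largepage_count(uint64_t len, uint64_t pagesize)
{
	if (pagesize == 0 || len % pagesize != 0)
		return (0);
	return (len / pagesize);
}

/*
 * Nonzero if a range starting at addr and spanning len bytes is
 * aligned to, and a multiple of, the large page size.
 */
static int
largepage_range_ok(uintptr_t addr, uint64_t len, uint64_t pagesize)
{
	return (pagesize != 0 && addr % pagesize == 0 &&
	    len % pagesize == 0);
}
.Ed
.Pp
For a 2GB length, the first helper yields 1024 with 2MB pages and 2 with
1GB pages, matching the figures given above.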
.Pp
The
.Dv FIOSSHMLPGCNF
and
.Dv FIOGSHMLPGCNF
.Xr ioctl 2
commands can be used with a largepage shared memory object to set and get
largepage object parameters, respectively.
Both commands operate on the following structure:
.Bd -literal
struct shm_largepage_conf {
	int psind;
	int alloc_policy;
};

.Ed
The
.Dv FIOGSHMLPGCNF
command populates this structure with the current values of these parameters,
while the
.Dv FIOSSHMLPGCNF
command modifies the largepage object.
Currently only the
.Va alloc_policy
parameter may be modified.
Internally,
.Fn shm_create_largepage
works by creating a regular shared memory object using
.Fn shm_open ,
and then converting it into a largepage object using the
.Dv FIOSSHMLPGCNF
ioctl command.
.Pp
The
.Fn shm_rename
system call atomically removes a shared memory object named
.Fa path_from
and relinks it at
.Fa path_to .
If another object is already linked at
.Fa path_to ,
that object will be unlinked, unless one of the following flags is provided:
.Bl -tag -offset indent -width Er
.It Dv SHM_RENAME_EXCHANGE
Atomically exchange the shared memory objects at
.Fa path_from
and
.Fa path_to .
.It Dv SHM_RENAME_NOREPLACE
Return an error if a shared memory object exists at
.Fa path_to ,
rather than unlinking it.
.El
.Pp
The
.Fn shm_unlink
system call removes a shared memory object named
.Fa path .
.Pp
The
.Fn memfd_create
function creates an anonymous shared memory object, identical to that created
by
.Fn shm_open
when
.Dv SHM_ANON
is specified.
Newly created objects start off with a size of zero.
The size of the new object must be adjusted via
.Xr ftruncate 2 .
.Pp
The
.Fa name
argument must not be
.Dv NULL ,
but it may be an empty string.
The length of the
.Fa name
argument may not exceed
.Dv NAME_MAX
minus six characters for the prefix
.Dq memfd: ,
which will be prepended.
The
.Fa name
argument is intended solely for debugging purposes and will never be used by the
kernel to identify a memfd.
Names are therefore not required to be unique.
.Pp
The following
.Fa flags
may be specified to
.Fn memfd_create :
.Bl -tag -width MFD_ALLOW_SEALING
.It Dv MFD_CLOEXEC
Set
.Dv FD_CLOEXEC
on the resulting file descriptor.
.It Dv MFD_ALLOW_SEALING
Allow adding seals to the resulting file descriptor using the
.Dv F_ADD_SEALS
.Xr fcntl 2
command.
.It Dv MFD_HUGETLB
This flag is currently unsupported.
.El
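.Pp
A memfd that allows sealing might be used as in the following sketch.
The helper name and the debugging name passed to
.Fn memfd_create
are hypothetical, and the seal constants
.Dv F_SEAL_GROW
and
.Dv F_SEAL_SHRINK
are assumed to be provided by
.In fcntl.h
as described in
.Xr fcntl 2 .
.Bd -literal
#define _GNU_SOURCE	/* assumed to be needed only on Linux */
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a memfd of len bytes and seal its
 * size.  Returns the descriptor, or -1 on error.
 */
static int
sealed_memfd(off_t len)
{
	int fd;

	fd = memfd_create("example", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return (-1);
	/* Size the object first; the seals then freeze that size. */
	if (ftruncate(fd, len) != 0 ||
	    fcntl(fd, F_ADD_SEALS, F_SEAL_GROW | F_SEAL_SHRINK) != 0) {
		(void)close(fd);
		return (-1);
	}
	return (fd);
}
.Ed
.Pp
Once both seals are in place, later
.Xr ftruncate 2
calls that would grow or shrink the object are expected to fail.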
.Sh RETURN VALUES
If successful,
.Fn memfd_create ,
.Fn shm_create_largepage ,
and
.Fn shm_open
return a non-negative integer,
and
.Fn shm_rename
and
.Fn shm_unlink
return zero.
All functions return -1 on failure, and set
.Va errno
to indicate the error.
.Sh COMPATIBILITY
The
.Fn shm_create_largepage
and
.Fn shm_rename
functions are
.Fx
extensions, as is support for the
.Dv SHM_ANON
value in
.Fn shm_open .
.Pp
The
.Fa path ,
.Fa path_from ,
and
.Fa path_to
arguments do not necessarily represent a pathname (although they do in
most other implementations).
Two processes opening the same
.Fa path
are guaranteed to access the same shared memory object if and only if
.Fa path
begins with a slash
.Pq Ql \&/
character.
.Pp
Only the
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
and
.Dv O_TRUNC
flags may be used in portable programs.
.Pp
POSIX
specifications state that the result of using
.Xr open 2 ,
.Xr read 2 ,
or
.Xr write 2
on a shared memory object, or on the descriptor returned by
.Fn shm_open ,
is undefined.
However, the
.Fx
kernel implementation explicitly includes support for
.Xr read 2
and
.Xr write 2 .
.Pp
.Fx
also supports zero-copy transmission of data from shared memory
objects with
.Xr sendfile 2 .
.Pp
Neither shared memory objects nor their contents persist across reboots.
.Pp
Writes do not extend shared memory objects, so
.Xr ftruncate 2
must be called before any data can be written.
See
.Sx EXAMPLES .
.Sh EXAMPLES
This example fails without the call to
.Xr ftruncate 2 :
.Bd -literal -compact

        uint8_t buffer[getpagesize()];
        ssize_t len;
        int fd;

        fd = shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
                err(EX_OSERR, "%s: shm_open", __func__);
        if (ftruncate(fd, getpagesize()) < 0)
                err(EX_IOERR, "%s: ftruncate", __func__);
        len = pwrite(fd, buffer, getpagesize(), 0);
        if (len < 0)
                err(EX_IOERR, "%s: pwrite", __func__);
        if (len != getpagesize())
                errx(EX_IOERR, "%s: pwrite length mismatch", __func__);
.Ed
.Sh ERRORS
.Fn memfd_create
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EBADF
The
.Fa name
argument was NULL.
.It Bq Er EINVAL
The
.Fa name
argument was too long.
.Pp
An invalid or unsupported flag was included in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er ENOSYS
In
.Fn memfd_create ,
.Dv MFD_HUGETLB
was specified in
.Fa flags ,
and this system does not support forced hugetlb mappings.
.El
.Pp
.Fn shm_open
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EINVAL
A flag other than
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
or
.Dv O_TRUNC
was included in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er EINVAL
.Dv O_RDONLY
was specified while creating an anonymous shared memory object via
.Dv SHM_ANON .
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er EINVAL
The
.Fa path
does not begin with a slash
.Pq Ql \&/
character.
.It Bq Er ENOENT
.Dv O_CREAT
is not specified and the named shared memory object does not exist.
.It Bq Er EEXIST
.Dv O_CREAT
and
.Dv O_EXCL
are specified and the named shared memory object does exist.
.It Bq Er EACCES
The required permissions (for reading or reading and writing) are denied.
.It Bq Er ECAPMODE
The process is running in capability mode (see
.Xr capsicum 4 )
and attempted to create a named shared memory object.
.El
.Pp
.Fn shm_create_largepage
can fail for the reasons listed above.
It also fails with these error codes for the following conditions:
.Bl -tag -width Er
.It Bq Er ENOTTY
The kernel does not support large pages on the current platform.
.El
.Pp
The following errors are defined for
.Fn shm_rename :
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path_from
or
.Fa path_to
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The shared memory object at
.Fa path_from
does not exist.
.It Bq Er EACCES
The required permissions are denied.
.It Bq Er EEXIST
A shared memory object exists at
.Fa path_to ,
and the
.Dv SHM_RENAME_NOREPLACE
flag was provided.
.El
.Pp
.Fn shm_unlink
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The named shared memory object does not exist.
.It Bq Er EACCES
The required permissions are denied.
.Fn shm_unlink
requires write permission to the shared memory object.
.El
.Sh SEE ALSO
.Xr posixshmcontrol 1 ,
.Xr close 2 ,
.Xr fstat 2 ,
.Xr ftruncate 2 ,
.Xr ioctl 2 ,
.Xr mmap 2 ,
.Xr munmap 2 ,
.Xr sendfile 2
.Sh STANDARDS
The
.Fn memfd_create
function is expected to be compatible with the Linux system call of the same
name.
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions are believed to conform to
.St -p1003.1b-93 .
.Sh HISTORY
The
.Fn memfd_create
function appeared in
.Fx 13.0 .
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions first appeared in
.Fx 4.3 .
The functions were reimplemented as system calls using shared memory objects
directly rather than files in
.Fx 8.0 .
.Pp
.Fn shm_rename
first appeared in
.Fx 13.0
as a
.Fx
extension.
.Sh AUTHORS
.An Garrett A. Wollman Aq Mt wollman@FreeBSD.org
(C library support and this manual page)
.Pp
.An Matthew Dillon Aq Mt dillon@FreeBSD.org
.Pq Dv MAP_NOSYNC
.Pp
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
.Pq Fn shm_rename implementation