.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)mmap.2	8.4 (Berkeley) 5/11/95
.\" $FreeBSD: src/lib/libc/sys/mmap.2,v 1.22.2.12 2002/02/27 03:40:13 dd Exp $
.\" $DragonFly: src/lib/libc/sys/mmap.2,v 1.9 2007/05/17 08:19:00 swildner Exp $
.\"
.Dd December 11, 2006
.Dt MMAP 2
.Os
.Sh NAME
.Nm mmap
.Nd allocate memory, or map files or devices into memory
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.Ft void *
.Fn mmap "void *addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset"
.Sh DESCRIPTION
The
.Fn mmap
function causes the pages starting at
.Fa addr
and continuing for at most
.Fa len
bytes to be mapped from the object described by
.Fa fd ,
starting at byte offset
.Fa offset .
If
.Fa len
is not a multiple of the pagesize, the mapped region may extend past the
specified range.
Any such extension beyond the end of the mapped object will be zero-filled.
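.Pp
As an illustration of the zero-fill rule above, the following sketch maps a
three-byte temporary file and checks that the rest of the page reads as zero.
The helper name
.Fn tail_is_zero_filled
and the
.Pa /tmp
file template are hypothetical, and
.Xr sysconf 3
is assumed to be available for querying the page size.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes a 3-byte temporary file, maps it with a
 * len of 3, and inspects the whole page that backs the mapping.
 * Returns 1 if the bytes past the file end are zero, 0 if not,
 * and -1 on error.
 */
int
tail_is_zero_filled(void)
{
	char path[] = "/tmp/mmap_tail.XXXXXX";
	long pagesize = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int fd, result = 1;
	long i;

	if (pagesize <= 3)
		return (-1);
	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	unlink(path);
	if (write(fd, "abc", 3) != 3) {
		close(fd);
		return (-1);
	}
	p = mmap(NULL, 3, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);
	if (p == MAP_FAILED)
		return (-1);
	/* The file contents come first ... */
	if (p[0] != 'a' || p[1] != 'b' || p[2] != 'c')
		result = 0;
	/* ... and the remainder of the page is zero-filled. */
	for (i = 3; i < pagesize; i++)
		if (p[i] != 0)
			result = 0;
	munmap(p, 3);
	return (result);
}
```
.Ed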
.Pp
If
.Fa addr
is non-zero, it is used as a hint to the system.
(As a convenience to the system, the actual address of the region may differ
from the address supplied.)
If
.Fa addr
is zero, an address will be selected by the system.
The actual starting address of the region is returned.
A successful
.Fn mmap
deletes any previous mapping in the allocated address range.
.Pp
The protections (region accessibility) are specified in the
.Fa prot
argument by
.Em or Ns 'ing
the following values:
.Pp
.Bl -tag -width PROT_WRITE -compact
.It Dv PROT_NONE
Pages may not be accessed.
.It Dv PROT_READ
Pages may be read.
.It Dv PROT_WRITE
Pages may be written.
.It Dv PROT_EXEC
Pages may be executed.
.El
.Pp
The
.Fa flags
parameter specifies the type of the mapped object, mapping options and
whether modifications made to the mapped copy of the page are private
to the process or are to be shared with other references.
Sharing, mapping type and options are specified in the
.Fa flags
argument by
.Em or Ns 'ing
the following values:
.Bl -tag -width MAP_HASSEMAPHORE
.It Dv MAP_ANON
Map anonymous memory not associated with any specific file.
The file descriptor used for creating
.Dv MAP_ANON
must be \-1.
The
.Fa offset
parameter is ignored.
.\".It Dv MAP_FILE
.\"Mapped from a regular file or character-special device memory.
.It Dv MAP_FIXED
Do not permit the system to select a different address than the one
specified.
If the specified address contains other mappings those mappings will
be replaced.
If the specified address cannot otherwise be used,
.Fn mmap
will fail.
If
.Dv MAP_FIXED
is specified,
.Fa addr
must be a multiple of the pagesize.
.It Dv MAP_TRYFIXED
Try to do a fixed mapping but fail if another mapping already exists in
the space instead of overwriting the mapping.
.Pp
When used with
.Dv MAP_STACK
this flag allows one
.Dv MAP_STACK
mapping to be made within another (typically the master user stack),
as long as no pages have been faulted in the area requested.
.It Dv MAP_HASSEMAPHORE
Notify the kernel that the region may contain semaphores and that special
handling may be necessary.
.It Dv MAP_NOCORE
Region is not included in a core file.
.It Dv MAP_NOSYNC
Causes data dirtied via this VM map to be flushed to physical media
only when necessary (usually by the pager) rather than gratuitously.
Typically this prevents the update daemons from flushing pages dirtied
through such maps and thus allows efficient sharing of memory across
unassociated processes using a file-backed shared memory map.
Without this option any VM pages you dirty may be flushed to disk every
so often (usually every 30-60 seconds), which can create performance
problems if you do not need that to occur (such as when you are using
shared file-backed mmap regions for IPC purposes).
Note that VM/filesystem coherency is maintained whether you use
.Dv MAP_NOSYNC
or not.
This option is not portable across
.Ux
platforms (yet), though some may implement the same behavior
by default.
.Pp
.Em WARNING !
Extending a file with
.Xr ftruncate 2 ,
thus creating a big hole, and then filling the hole by modifying a shared
.Fn mmap
can lead to severe file fragmentation.
In order to avoid such fragmentation you should always pre-allocate the
file's backing store by
.Fn write Ns ing
zeros into the newly extended area prior to modifying the area via your
.Fn mmap .
The fragmentation problem is especially sensitive to
.Dv MAP_NOSYNC
pages, because pages may be flushed to disk in a totally random order.
.Pp
The same applies when using
.Dv MAP_NOSYNC
to implement a file-based shared memory store.
It is recommended that you create the backing store by
.Fn write Ns ing
zeros to the backing file rather than
.Fn ftruncate Ns ing
it.
You can test file fragmentation by observing the KB/t (kilobytes per
transfer) results from an
.Dq Li iostat 1
while reading a large file sequentially, e.g. using
.Dq Li dd if=filename of=/dev/null bs=32k .
.Pp
The
.Xr fsync 2
function will flush all dirty data and metadata associated with a file,
including dirty NOSYNC VM data, to physical media.
The
.Xr sync 8
command and
.Xr sync 2
system call generally do not flush dirty NOSYNC VM data.
The
.Xr msync 2
system call is obsolete since
.Bx
implements a coherent filesystem buffer cache.
However, it may be
used to associate dirty VM pages with filesystem buffers and thus cause
them to be flushed to physical media sooner rather than later.
.It Dv MAP_PRIVATE
Modifications are private.
.It Dv MAP_SHARED
Modifications are shared.
.It Dv MAP_STACK
Map the area as a stack.
.Dv MAP_ANON
is implied.
.Fa offset
should be 0,
.Fa fd
must be \-1, and
.Fa prot
should include at least
.Dv PROT_READ
and
.Dv PROT_WRITE .
This option creates
a memory region that grows to at most
.Fa len
bytes in size, starting from the stack top and growing down.
The stack top is the starting address returned by the call, plus
.Fa len
bytes.
The bottom of the stack at maximum growth is the starting
address returned by the call.
.Pp
The entire area is reserved from the point of view of other
.Fn mmap
calls, even if not faulted in yet.
.Pp
.Em WARNING !
We currently allow
.Dv MAP_STACK
mappings to provide a hint that points within an existing
.Dv MAP_STACK
mapping's space, and this will succeed as long as no pages have been
faulted in the area specified, but this behavior is no longer supported
unless you also specify the
.Dv MAP_TRYFIXED
flag.
.Pp
Note that unless
.Dv MAP_FIXED
or
.Dv MAP_TRYFIXED
is used, you cannot count on the returned address matching the hint
you have provided.
.It Dv MAP_VPAGETABLE
Memory accessed via this map is not linearly mapped and will be governed
by a virtual page table.
The base address of the virtual page table may be set using
.Xr mcontrol 2
with
.Dv MADV_SETMAP .
Virtual page tables work with anonymous memory but there
is no way to populate the page table so for all intents and purposes
.Dv MAP_VPAGETABLE
can only be used when mapping file descriptors.
Since the kernel will
update the VPTE_M bit in the virtual page table, the mapping must be R+W
even though actual access to the memory will be properly governed by
the virtual page table.
.Pp
Addressable backing store is limited by the range supported in the virtual
page table entries.
The kernel may implement a page table abstraction capable
of addressing a larger range within the backing store than could otherwise
be mapped into memory.
.El
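.Pp
The pre-allocation advice given under
.Dv MAP_NOSYNC
can be sketched as follows.
The helper name
.Fn map_preallocated
is hypothetical, and the fallback definition of
.Dv MAP_NOSYNC
as 0 is an assumption made only so that the sketch also compiles on systems
that lack the flag.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <string.h>
#include <unistd.h>

#ifndef MAP_NOSYNC
#define MAP_NOSYNC 0	/* assumption: no-op where the flag is missing */
#endif

/*
 * Hypothetical helper: fills the first len bytes of fd with zeros
 * using write(2) instead of ftruncate(2), then maps them shared.
 * Returns MAP_FAILED on any error.
 */
void *
map_preallocated(int fd, size_t len)
{
	char zeros[4096];
	size_t done = 0;

	memset(zeros, 0, sizeof(zeros));
	if (lseek(fd, 0, SEEK_SET) == -1)
		return (MAP_FAILED);
	while (done < len) {
		size_t chunk = len - done;
		ssize_t n;

		if (chunk > sizeof(zeros))
			chunk = sizeof(zeros);
		n = write(fd, zeros, chunk);
		if (n <= 0)
			return (MAP_FAILED);
		done += (size_t)n;
	}
	return (mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_NOSYNC, fd, 0));
}
```
.Ed
.Pp
The descriptor passed in must be open for reading and writing, since the
mapping is shared and writable.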
.Pp
The
.Xr close 2
function does not unmap pages, see
.Xr munmap 2
for further information.
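.Pp
The following sketch illustrates that a mapping outlives its descriptor.
The helper name
.Fn mapping_survives_close
and the
.Pa /tmp
file template are hypothetical.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: maps a small temporary file, closes the
 * descriptor, and reads the data through the mapping afterwards.
 * Returns 0 if the data is still readable, -1 otherwise.
 */
int
mapping_survives_close(void)
{
	char path[] = "/tmp/mmap_close.XXXXXX";
	const char msg[] = "still mapped";
	int fd = mkstemp(path);
	char *p;
	int ok;

	if (fd == -1)
		return (-1);
	unlink(path);
	if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg)) {
		close(fd);
		return (-1);
	}
	p = mmap(NULL, sizeof(msg), PROT_READ, MAP_SHARED, fd, 0);
	close(fd);		/* the pages stay mapped */
	if (p == MAP_FAILED)
		return (-1);
	ok = (memcmp(p, msg, sizeof(msg)) == 0);
	munmap(p, sizeof(msg));	/* only munmap(2) removes the mapping */
	return (ok ? 0 : -1);
}
```
.Ed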
.Pp
The current design does not allow a process to specify the location of
swap space.
In the future we may define an additional mapping type,
.Dv MAP_SWAP ,
in which
the file descriptor argument specifies a file or device to which swapping
should be done.
.Sh RETURN VALUES
Upon successful completion,
.Fn mmap
returns a pointer to the mapped region.
Otherwise, a value of
.Dv MAP_FAILED
is returned and
.Va errno
is set to indicate the error.
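.Pp
A short sketch of checking the result follows.
The helper name
.Fn try_map
is hypothetical.
Note that failure is detected by comparing against
.Dv MAP_FAILED
rather than
.Dv NULL .
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: attempts a private read-only file mapping
 * through fd and returns 0 on success or the errno value on failure.
 */
int
try_map(int fd, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

	if (p == MAP_FAILED)	/* compare with MAP_FAILED, not NULL */
		return (errno);
	munmap(p, len);
	return (0);
}
```
.Ed
.Pp
Calling the helper with an
.Fa fd
of \-1 and without
.Dv MAP_ANON
is expected to yield
.Er EBADF .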
.Sh ERRORS
.Fn mmap
will fail if:
.Bl -tag -width Er
.It Bq Er EACCES
The flag
.Dv PROT_READ
was specified as part of the
.Fa prot
parameter and
.Fa fd
was not open for reading.
The flags
.Dv MAP_SHARED
and
.Dv PROT_WRITE
were specified as part of the
.Fa flags
and
.Fa prot
parameters and
.Fa fd
was not open for writing.
.It Bq Er EBADF
.Fa fd
is not a valid open file descriptor.
.It Bq Er EINVAL
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter was not page aligned, or part of the desired address space
resides out of the valid address space for a user process.
.It Bq Er EINVAL
.Fa len
was negative.
.It Bq Er EINVAL
.Dv MAP_ANON
was specified and the
.Fa fd
parameter was not \-1.
.It Bq Er EINVAL
.Dv MAP_ANON
has not been specified and
.Fa fd
did not reference a regular or character special file.
.It Bq Er EINVAL
.Fa offset
was not page-aligned.
(See
.Sx BUGS
below.)
.It Bq Er ENOMEM
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter was not available.
.Dv MAP_ANON
was specified and insufficient memory was available.
The system has reached the per-process mmap limit specified in the
.Va vm.max_proc_mmap
sysctl.
.El
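.Pp
Because an
.Fa offset
that is not page-aligned is rejected with
.Er EINVAL ,
callers that need data at an arbitrary byte position usually round the
offset down to a page boundary and skip the difference.
The sketch below shows one way to do this; the helper name
.Fn map_at_offset
and its output parameters are hypothetical, and a non-negative
.Fa offset
is assumed.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: maps len bytes of fd starting at an arbitrary
 * non-negative byte offset by rounding the offset down to a page
 * boundary.  On success *base and *maplen receive the values to pass
 * to munmap(2), and the return value points at the requested byte.
 */
void *
map_at_offset(int fd, off_t offset, size_t len, void **base,
    size_t *maplen)
{
	off_t pagesize = (off_t)sysconf(_SC_PAGESIZE);
	off_t delta = offset % pagesize;	/* bytes to skip */
	char *p;

	p = mmap(NULL, len + (size_t)delta, PROT_READ, MAP_PRIVATE,
	    fd, offset - delta);
	if (p == MAP_FAILED)
		return (MAP_FAILED);
	*base = p;
	*maplen = len + (size_t)delta;
	return (p + delta);
}
```
.Ed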
.Sh SEE ALSO
.Xr madvise 2 ,
.Xr mincore 2 ,
.Xr mlock 2 ,
.Xr mprotect 2 ,
.Xr msync 2 ,
.Xr munlock 2 ,
.Xr munmap 2 ,
.Xr getpagesize 3
.Sh BUGS
.Fa len
is limited to 2GB.
Mmapping slightly more than 2GB does not work, but
it is possible to map a window of size (filesize % 2GB) for file sizes
of slightly less than 2GB, 4GB, 6GB and 8GB.
.Pp
The limit is imposed for a variety of reasons.
Most of them have to do with
.Dx
not wanting to use 64-bit offsets in the VM system due to
the extreme performance penalty.
So
.Dx
uses 32-bit page indexes and this gives
.Dx
a maximum of 8TB filesizes.
Bugs in the filesystem code further restrict the limit to
1TB (loss of precision when doing blockno calculations).
.Pp
Another reason for the 2GB limit is that filesystem metadata can
reside at negative offsets.