.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)mmap.2	8.4 (Berkeley) 5/11/95
.\" $FreeBSD: src/lib/libc/sys/mmap.2,v 1.22.2.12 2002/02/27 03:40:13 dd Exp $
.\" $DragonFly: src/lib/libc/sys/mmap.2,v 1.9 2007/05/17 08:19:00 swildner Exp $
.\"
.Dd December 11, 2006
.Dt MMAP 2
.Os
.Sh NAME
.Nm mmap
.Nd allocate memory, or map files or devices into memory
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.Ft void *
.Fn mmap "void *addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset"
.Sh DESCRIPTION
The
.Fn mmap
function causes the pages starting at
.Fa addr
and continuing for at most
.Fa len
bytes to be mapped from the object described by
.Fa fd ,
starting at byte offset
.Fa offset .
If
.Fa len
is not a multiple of the pagesize, the mapped region may extend past the
specified range.
Any such extension beyond the end of the mapped object will be zero-filled.
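.Pp
As a rough sketch of the rounding described above, the length actually
covered by a request can be estimated by rounding
.Fa len
up to a page boundary.
The helper name
.Fn round_up_to_page
is hypothetical and not part of any system interface; the page size is
assumed to be a power of two, as would normally be obtained from
.Xr getpagesize 3 .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round len up to a whole number of pages. */
static size_t
round_up_to_page(size_t len, size_t pagesize)
{
	/* pagesize is assumed to be a non-zero power of two. */
	assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
	return (len + pagesize - 1) & ~(pagesize - 1);
}
```
.Ed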
.Pp
If
.Fa addr
is non-zero, it is used as a hint to the system.
(As a convenience to the system, the actual address of the region may differ
from the address supplied.)
If
.Fa addr
is zero, an address will be selected by the system.
The actual starting address of the region is returned.
A successful
.Fn mmap
deletes any previous mapping in the allocated address range.
.Pp
The protections (region accessibility) are specified in the
.Fa prot
argument by
.Em or Ns 'ing
the following values:
.Pp
.Bl -tag -width PROT_WRITE -compact
.It Dv PROT_NONE
Pages may not be accessed.
.It Dv PROT_READ
Pages may be read.
.It Dv PROT_WRITE
Pages may be written.
.It Dv PROT_EXEC
Pages may be executed.
.El
.Pp
The
.Fa flags
parameter specifies the type of the mapped object, mapping options and
whether modifications made to the mapped copy of the page are private
to the process or are to be shared with other references.
Sharing, mapping type and options are specified in the
.Fa flags
argument by
.Em or Ns 'ing
the following values:
.Bl -tag -width MAP_HASSEMAPHORE
.It Dv MAP_ANON
Map anonymous memory not associated with any specific file.
The file descriptor used for creating
.Dv MAP_ANON
must be \-1.
The
.Fa offset
parameter is ignored.
.\".It Dv MAP_FILE
.\"Mapped from a regular file or character-special device memory.
.It Dv MAP_FIXED
Do not permit the system to select a different address than the one
specified.
If the specified address contains other mappings, those mappings will
be replaced.
If the specified address cannot otherwise be used,
.Fn mmap
will fail.
If
.Dv MAP_FIXED
is specified,
.Fa addr
must be a multiple of the pagesize.
.It Dv MAP_TRYFIXED
Try to do a fixed mapping, but fail if another mapping already exists in
the requested space instead of overwriting that mapping.
.Pp
When used with
.Dv MAP_STACK ,
this flag allows one
.Dv MAP_STACK
mapping to be made within another (typically the master user stack),
as long as no pages have been faulted in the area requested.
.It Dv MAP_HASSEMAPHORE
Notify the kernel that the region may contain semaphores and that special
handling may be necessary.
.It Dv MAP_NOCORE
Region is not included in a core file.
.It Dv MAP_NOSYNC
Causes data dirtied via this VM map to be flushed to physical media
only when necessary (usually by the pager) rather than gratuitously.
Typically this prevents the update daemons from flushing pages dirtied
through such maps and thus allows efficient sharing of memory across
unassociated processes using a file-backed shared memory map.
Without this option any VM pages you dirty may be flushed to disk every so
often (usually every 30-60 seconds), which can create performance problems
if you do not need that to occur (such as when you are using shared
file-backed mmap regions for IPC purposes).
Note that VM/filesystem coherency is maintained whether you use
.Dv MAP_NOSYNC
or not.
This option is not portable across
.Ux
platforms (yet), though some may implement the same behavior
by default.
.Pp
.Em WARNING !
Extending a file with
.Xr ftruncate 2 ,
thus creating a big hole, and then filling the hole by modifying a shared
.Fn mmap
can lead to severe file fragmentation.
In order to avoid such fragmentation you should always pre-allocate the
file's backing store by
.Fn write Ns ing
zeros into the newly extended area prior to modifying the area via your
.Fn mmap .
The fragmentation problem is especially sensitive to
.Dv MAP_NOSYNC
pages, because pages may be flushed to disk in a totally random order.
.Pp
The same applies when using
.Dv MAP_NOSYNC
to implement a file-based shared memory store.
It is recommended that you create the backing store by
.Fn write Ns ing
zeros to the backing file rather than
.Fn ftruncate Ns ing
it.
You can test file fragmentation by observing the KB/t (kilobytes per
transfer) results from an
.Dq Li iostat 1
while reading a large file sequentially, e.g. using
.Dq Li dd if=filename of=/dev/null bs=32k .
.Pp
The
.Xr fsync 2
function will flush all dirty data and metadata associated with a file,
including dirty NOSYNC VM data, to physical media.
The
.Xr sync 8
command and
.Xr sync 2
system call generally do not flush dirty NOSYNC VM data.
The
.Xr msync 2
system call is obsolete since
.Bx
implements a coherent filesystem buffer cache.
However, it may be
used to associate dirty VM pages with filesystem buffers and thus cause
them to be flushed to physical media sooner rather than later.
.It Dv MAP_PRIVATE
Modifications are private.
.It Dv MAP_SHARED
Modifications are shared.
.It Dv MAP_STACK
Map the area as a stack.
.Dv MAP_ANON
is implied.
The
.Fa offset
argument should be 0,
.Fa fd
must be \-1, and
.Fa prot
should include at least
.Dv PROT_READ
and
.Dv PROT_WRITE .
This option creates
a memory region that grows to at most
.Fa len
bytes in size, starting from the stack top and growing down.
The stack top is the starting address returned by the call, plus
.Fa len
bytes.
The bottom of the stack at maximum growth is the starting
address returned by the call.
.Pp
The entire area is reserved from the point of view of other
.Fn mmap
calls, even if not faulted in yet.
.Pp
.Em WARNING !
The system currently allows a
.Dv MAP_STACK
mapping to provide a hint that points within an existing
.Dv MAP_STACK
mapping's space, and the call will succeed as long as no pages have been
faulted in the area specified.
This behavior is no longer supported unless you also specify the
.Dv MAP_TRYFIXED
flag.
.Pp
Note that unless
.Dv MAP_FIXED
or
.Dv MAP_TRYFIXED
is used, you cannot count on the returned address matching the hint
you have provided.
.It Dv MAP_VPAGETABLE
Memory accessed via this map is not linearly mapped and will be governed
by a virtual page table.
The base address of the virtual page table may be set using
.Xr mcontrol 2
with
.Dv MADV_SETMAP .
Virtual page tables work with anonymous memory but there
is no way to populate the page table, so for all intents and purposes
.Dv MAP_VPAGETABLE
can only be used when mapping file descriptors.
Since the kernel will update the
.Dv VPTE_M
bit in the virtual page table, the mapping must be R+W
even though actual access to the memory will be properly governed by
the virtual page table.
.Pp
Addressable backing store is limited by the range supported in the virtual
page table entries.
The kernel may implement a page table abstraction capable
of addressing a larger range within the backing store than could otherwise
be mapped into memory.
.El
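.Pp
The pre-allocation advice given for
.Dv MAP_NOSYNC
above might be followed along these lines.
This is only a sketch under stated assumptions: the helper name
.Fn map_prealloc_shared
and the 4096-byte write chunk are chosen for illustration, and
.Dv MAP_NOSYNC
is requested only where the header defines it.
.Bd -literal -offset indent
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file of len bytes by writing zeros
 * (rather than extending it with ftruncate) and map it shared.
 * Returns MAP_FAILED on any error.
 */
static void *
map_prealloc_shared(const char *path, size_t len)
{
	char zeros[4096];	/* chunk size is an arbitrary choice */
	size_t done, chunk;
	ssize_t n;
	void *p;
	int fd, flags = MAP_SHARED;

	assert(path != NULL && len > 0);
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return MAP_FAILED;
	memset(zeros, 0, sizeof(zeros));
	for (done = 0; done < len; done += (size_t)n) {
		chunk = len - done;
		if (chunk > sizeof(zeros))
			chunk = sizeof(zeros);
		n = write(fd, zeros, chunk);
		if (n <= 0) {
			close(fd);
			return MAP_FAILED;
		}
	}
#ifdef MAP_NOSYNC
	flags |= MAP_NOSYNC;	/* not portable; used only where defined */
#endif
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
	close(fd);
	return p;
}
```
.Ed
.Pp
A caller would compare the result against
.Dv MAP_FAILED
and release the region with
.Xr munmap 2
when finished.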
.Pp
The
.Xr close 2
function does not unmap pages; see
.Xr munmap 2
for further information.
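.Pp
A minimal sketch of that behavior follows: the descriptor is closed as
soon as the mapping exists, and the pages remain usable until
.Xr munmap 2
is called.
The helper name
.Fn map_file_readonly
is hypothetical, and treating an empty file as a failure is an assumption
of this example rather than a rule of the interface.
.Bd -literal -offset indent
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a whole file read-only and store its
 * length in *lenp.  Returns MAP_FAILED on error or for an empty file.
 */
static const void *
map_file_readonly(const char *path, size_t *lenp)
{
	struct stat st;
	void *p;
	int fd;

	assert(path != NULL && lenp != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return MAP_FAILED;
	if (fstat(fd, &st) == -1 || st.st_size == 0) {
		close(fd);
		return MAP_FAILED;
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);	/* the pages stay mapped after this call */
	if (p != MAP_FAILED)
		*lenp = (size_t)st.st_size;
	return p;
}
```
.Ed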
.Pp
The current design does not allow a process to specify the location of
swap space.
In the future we may define an additional mapping type,
.Dv MAP_SWAP ,
in which
the file descriptor argument specifies a file or device to which swapping
should be done.
.Sh RETURN VALUES
Upon successful completion,
.Fn mmap
returns a pointer to the mapped region.
Otherwise, a value of
.Dv MAP_FAILED
is returned and
.Va errno
is set to indicate the error.
.Sh ERRORS
The
.Fn mmap
function will fail if:
.Bl -tag -width Er
.It Bq Er EACCES
The flag
.Dv PROT_READ
was specified as part of the
.Fa prot
parameter and
.Fa fd
was not open for reading.
The flags
.Dv MAP_SHARED
and
.Dv PROT_WRITE
were specified as part of the
.Fa flags
and
.Fa prot
parameters and
.Fa fd
was not open for writing.
.It Bq Er EBADF
The
.Fa fd
argument is not a valid open file descriptor.
.It Bq Er EINVAL
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter was not page aligned, or part of the desired address space
resides out of the valid address space for a user process.
.It Bq Er EINVAL
The
.Fa len
argument was negative.
.It Bq Er EINVAL
.Dv MAP_ANON
was specified and the
.Fa fd
parameter was not \-1.
.It Bq Er EINVAL
.Dv MAP_ANON
has not been specified and
.Fa fd
did not reference a regular or character special file.
.It Bq Er EINVAL
The
.Fa offset
argument was not page-aligned.
(See
.Sx BUGS
below.)
.It Bq Er ENOMEM
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter was not available.
.Dv MAP_ANON
was specified and insufficient memory was available.
The system has reached the per-process mmap limit specified in the
.Va vm.max_proc_mmap
sysctl.
.El
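.Pp
Since
.Fa offset
must be page-aligned, a caller that wants data at an arbitrary file
position can map from the preceding page boundary and skip forward.
The following sketch shows the arithmetic; the helper name
.Fn page_align_offset
is hypothetical, and the page size would normally be obtained from
.Xr getpagesize 3 .
.Bd -literal -offset indent
```c
#include <sys/types.h>
#include <assert.h>

/*
 * Hypothetical helper: split an arbitrary file offset into a
 * page-aligned offset suitable for mmap and the remaining delta.
 */
static off_t
page_align_offset(off_t offset, off_t pagesize, off_t *deltap)
{
	assert(offset >= 0 && pagesize > 0);
	*deltap = offset % pagesize;
	return offset - *deltap;
}
```
.Ed
.Pp
The caller would then map
.Fa len
plus the delta bytes at the aligned offset and add the delta to the
returned address to reach the requested data.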
.Sh SEE ALSO
.Xr madvise 2 ,
.Xr mincore 2 ,
.Xr mlock 2 ,
.Xr mprotect 2 ,
.Xr msync 2 ,
.Xr munlock 2 ,
.Xr munmap 2 ,
.Xr getpagesize 3
.Sh BUGS
The
.Fa len
argument is limited to 2GB.
Mapping slightly more than 2GB doesn't work, but
it is possible to map a window of size (filesize % 2GB) for file sizes
of slightly less than 2GB, 4GB, 6GB and 8GB.
.Pp
The limit is imposed for a variety of reasons.
Most of them have to do with
.Dx
not wanting to use 64-bit offsets in the VM system due to
the extreme performance penalty.
So
.Dx
uses 32-bit page indexes and this gives
.Dx
a maximum of 8TB filesizes.
Bugs in the filesystem code further restrict the limit to
1TB (loss of precision when doing blockno calculations).
.Pp
Another reason for the 2GB limit is that filesystem metadata can
reside at negative offsets.
388