.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)mmap.2	8.4 (Berkeley) 5/11/95
.\" $FreeBSD: src/lib/libc/sys/mmap.2,v 1.22.2.12 2002/02/27 03:40:13 dd Exp $
.\" $DragonFly: src/lib/libc/sys/mmap.2,v 1.9 2007/05/17 08:19:00 swildner Exp $
.\"
.Dd December 11, 2006
.Dt MMAP 2
.Os
.Sh NAME
.Nm mmap
.Nd allocate memory, or map files or devices into memory
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.Ft void *
.Fn mmap "void *addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset"
.Sh DESCRIPTION
The
.Fn mmap
function causes the pages starting at
.Fa addr
and continuing for at most
.Fa len
bytes to be mapped from the object described by
.Fa fd ,
starting at byte offset
.Fa offset .
If
.Fa len
is not a multiple of the pagesize, the mapped region may extend past the
specified range.
Any such extension beyond the end of the mapped object will be zero-filled.
.Pp
If
.Fa addr
is non-zero, it is used as a hint to the system.
(As a convenience to the system, the actual address of the region may differ
from the address supplied.)
If
.Fa addr
is zero, an address will be selected by the system.
The actual starting address of the region is returned.
A successful
.Fn mmap
deletes any previous mapping in the allocated address range.
.Pp
The protections (region accessibility) are specified in the
.Fa prot
argument by
.Em or Ns 'ing
the following values:
.Pp
.Bl -tag -width PROT_WRITE -compact
.It Dv PROT_NONE
Pages may not be accessed.
.It Dv PROT_READ
Pages may be read.
.It Dv PROT_WRITE
Pages may be written.
.It Dv PROT_EXEC
Pages may be executed.
.El
.Pp
The
.Fa flags
parameter specifies the type of the mapped object, mapping options and
whether modifications made to the mapped copy of the page are private
to the process or are to be shared with other references.
Sharing, mapping type and options are specified in the
.Fa flags
argument by
.Em or Ns 'ing
the following values:
.Bl -tag -width MAP_HASSEMAPHORE
.It Dv MAP_ANON
Map anonymous memory not associated with any specific file.
The file descriptor used for creating
.Dv MAP_ANON
must be \-1.
The
.Fa offset
parameter is ignored.
.\".It Dv MAP_FILE
.\"Mapped from a regular file or character-special device memory.
.It Dv MAP_FIXED
Do not permit the system to select a different address than the one
specified.
If the specified address contains other mappings those mappings will
be replaced.
If the specified address cannot otherwise be used,
.Fn mmap
will fail.
If
.Dv MAP_FIXED
is specified,
.Fa addr
must be a multiple of the pagesize.
.It Dv MAP_TRYFIXED
Try to do a fixed mapping but fail if another mapping already exists in
the space instead of overwriting the mapping.
.Pp
When used with
.Dv MAP_STACK
this flag allows one
.Dv MAP_STACK
mapping to be
made within another (typically the master user stack), as long as
no pages have been faulted in the area requested.
.It Dv MAP_HASSEMAPHORE
Notify the kernel that the region may contain semaphores and that special
handling may be necessary.
.It Dv MAP_NOCORE
Region is not included in a core file.
.It Dv MAP_NOSYNC
Causes data dirtied via this VM map to be flushed to physical media
only when necessary (usually by the pager) rather than gratuitously.
Typically this prevents the update daemons from flushing pages dirtied
through such maps and thus allows efficient sharing of memory across
unassociated processes using a file-backed shared memory map.
Without
this option any VM pages you dirty may be flushed to disk every so often
(every 30-60 seconds usually) which can create performance problems if you
do not need that to occur (such as when you are using shared file-backed
mmap regions for IPC purposes).
Note that VM/filesystem coherency is
maintained whether you use
.Dv MAP_NOSYNC
or not.
This option is not portable
across
.Ux
platforms (yet), though some may implement the same behavior
by default.
.Pp
.Em WARNING !
Extending a file with
.Xr ftruncate 2 ,
thus creating a big hole, and then filling the hole by modifying a shared
.Fn mmap
can lead to severe file fragmentation.
In order to avoid such fragmentation you should always pre-allocate the
file's backing store by
.Fn write Ns ing
zeros into the newly extended area prior to modifying the area via your
.Fn mmap .
The fragmentation problem is especially sensitive to
.Dv MAP_NOSYNC
pages, because pages may be flushed to disk in a totally random order.
.Pp
The same applies when using
.Dv MAP_NOSYNC
to implement a file-based shared memory store.
It is recommended that you create the backing store by
.Fn write Ns ing
zeros to the backing file rather than
.Fn ftruncate Ns ing
it.
You can test file fragmentation by observing the KB/t (kilobytes per
transfer) results from an
.Dq Li iostat 1
while reading a large file sequentially, e.g. using
.Dq Li dd if=filename of=/dev/null bs=32k .
.Pp
The
.Xr fsync 2
function will flush all dirty data and metadata associated with a file,
including dirty NOSYNC VM data, to physical media.
The
.Xr sync 8
command and
.Xr sync 2
system call generally do not flush dirty NOSYNC VM data.
The
.Xr msync 2
system call is obsolete since
.Bx
implements a coherent filesystem buffer cache.
However, it may be
used to associate dirty VM pages with filesystem buffers and thus cause
them to be flushed to physical media sooner rather than later.
.It Dv MAP_PRIVATE
Modifications are private.
.It Dv MAP_SHARED
Modifications are shared.
.It Dv MAP_STACK
Map the area as a stack.
.Dv MAP_ANON
is implied.
.Fa offset
should be 0,
.Fa fd
must be \-1, and
.Fa prot
should include at least
.Dv PROT_READ
and
.Dv PROT_WRITE .
This option creates
a memory region that grows to at most
.Fa len
bytes in size, starting from the stack top and growing down.
The
stack top is the starting address returned by the call, plus
.Fa len
bytes.
The bottom of the stack at maximum growth is the starting
address returned by the call.
.Pp
The entire area is reserved from the point of view of other
.Fn mmap
calls, even if not faulted in yet.
.Pp
WARNING.
We currently allow
.Dv MAP_STACK
mappings to provide a hint that points within an existing
.Dv MAP_STACK
mapping's space, and this will succeed as long as no pages have been
faulted in the area specified, but this behavior is no longer supported
unless you also specify the
.Dv MAP_TRYFIXED
flag.
.Pp
Note that unless
.Dv MAP_FIXED
or
.Dv MAP_TRYFIXED
is used, you cannot count on the returned address matching the hint
you have provided.
.It Dv MAP_VPAGETABLE
Memory accessed via this map is not linearly mapped and will be governed
by a virtual page table.
The base address of the virtual page table may
be set using
.Xr mcontrol 2
with
.Dv MADV_SETMAP .
Virtual page tables work with anonymous memory but there
is no way to populate the page table so for all intents and purposes
.Dv MAP_VPAGETABLE
can only be used when mapping file descriptors.
Since the kernel will
update the VPTE_M bit in the virtual page table, the mapping must be R+W
even though actual access to the memory will be properly governed by
the virtual page table.
.Pp
Addressable backing store is limited by the range supported in the virtual
page table entries.
The kernel may implement a page table abstraction capable
of addressing a larger range within the backing store than could otherwise
be mapped into memory.
.El
.Pp
The
.Xr close 2
function does not unmap pages; see
.Xr munmap 2
for further information.
.Pp
The current design does not allow a process to specify the location of
swap space.
In the future we may define an additional mapping type,
.Dv MAP_SWAP ,
in which
the file descriptor argument specifies a file or device to which swapping
should be done.
.Sh RETURN VALUES
Upon successful completion,
.Fn mmap
returns a pointer to the mapped region.
Otherwise, a value of
.Dv MAP_FAILED
is returned and
.Va errno
is set to indicate the error.
.Sh ERRORS
.Fn mmap
will fail if:
.Bl -tag -width Er
.It Bq Er EACCES
The flag
.Dv PROT_READ
was specified as part of the
.Fa prot
parameter and
.Fa fd
was not open for reading.
The flags
.Dv MAP_SHARED
and
.Dv PROT_WRITE
were specified as part of the
.Fa flags
and
.Fa prot
parameters and
.Fa fd
was not open for writing.
.It Bq Er EBADF
.Fa fd
is not a valid open file descriptor.
.It Bq Er EINVAL
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter was not page aligned, or part of the desired address space
resides out of the valid address space for a user process.
.It Bq Er EINVAL
.Fa len
was negative.
.It Bq Er EINVAL
.Dv MAP_ANON
was specified and the
.Fa fd
parameter was not \-1.
.It Bq Er EINVAL
.Dv MAP_ANON
has not been specified and
.Fa fd
did not reference a regular or character special file.
.It Bq Er EINVAL
.Fa offset
was not page-aligned.
(See
.Sx BUGS
below.)
.It Bq Er ENOMEM
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter wasn't available.
.Dv MAP_ANON
was specified and insufficient memory was available.
The system has reached the per-process mmap limit specified in the
.Va vm.max_proc_mmap
sysctl.
.El
.Sh SEE ALSO
.Xr madvise 2 ,
.Xr mincore 2 ,
.Xr mlock 2 ,
.Xr mprotect 2 ,
.Xr msync 2 ,
.Xr munlock 2 ,
.Xr munmap 2 ,
.Xr getpagesize 3
.Sh BUGS
.Fa len
is limited to 2GB.
Mapping slightly more than 2GB doesn't work, but
it is possible to map a window of size (filesize % 2GB) for file sizes
of slightly less than 2GB, 4GB, 6GB and 8GB.
.Pp
The limit is imposed for a variety of reasons.
Most of them have to do
with
.Dx
not wanting to use 64-bit offsets in the VM system due to
the extreme performance penalty.
So
.Dx
uses 32-bit page indexes and
this gives
.Dx
a maximum of 8TB filesizes.
It is actually bugs in
the filesystem code that cause the limit to be further restricted to
1TB (loss of precision when doing blockno calculations).
.Pp
Another reason for the 2GB limit is that filesystem metadata can
reside at negative offsets.