.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)mmap.2	8.4 (Berkeley) 5/11/95
.\" $FreeBSD: src/lib/libc/sys/mmap.2,v 1.22.2.12 2002/02/27 03:40:13 dd Exp $
.\" $DragonFly: src/lib/libc/sys/mmap.2,v 1.9 2007/05/17 08:19:00 swildner Exp $
.\"
.Dd December 11, 2006
.Dt MMAP 2
.Os
.Sh NAME
.Nm mmap
.Nd allocate memory, or map files or devices into memory
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.Ft void *
.Fn mmap "void *addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset"
.Sh DESCRIPTION
The
.Fn mmap
function causes the pages starting at
.Fa addr
and continuing for at most
.Fa len
bytes to be mapped from the object described by
.Fa fd ,
starting at byte offset
.Fa offset .
If
.Fa len
is not a multiple of the pagesize, the mapped region may extend past the
specified range.
Any such extension beyond the end of the mapped object will be zero-filled.
.Pp
If
.Fa addr
is non-zero, it is used as a hint to the system.
(As a convenience to the system, the actual address of the region may differ
from the address supplied.)
If
.Fa addr
is zero, an address will be selected by the system.
The actual starting address of the region is returned.
A successful
.Fn mmap
deletes any previous mapping in the allocated address range.
.Pp
The protections (region accessibility) are specified in the
.Fa prot
argument by
.Em or Ns 'ing
the following values:
.Pp
.Bl -tag -width PROT_WRITE -compact
.It Dv PROT_NONE
Pages may not be accessed.
.It Dv PROT_READ
Pages may be read.
.It Dv PROT_WRITE
Pages may be written.
.It Dv PROT_EXEC
Pages may be executed.
.El
.Pp
The
.Fa flags
parameter specifies the type of the mapped object, mapping options and
whether modifications made to the mapped copy of the page are private
to the process or are to be shared with other references.
Sharing, mapping type and options are specified in the
.Fa flags
argument by
.Em or Ns 'ing
the following values:
.Bl -tag -width MAP_HASSEMAPHORE
.It Dv MAP_ANON
Map anonymous memory not associated with any specific file.
The file descriptor used for creating a
.Dv MAP_ANON
mapping must be \-1.
The
.Fa offset
parameter is ignored.
.\".It Dv MAP_FILE
.\"Mapped from a regular file or character-special device memory.
.It Dv MAP_FIXED
Do not permit the system to select a different address than the one
specified.
If the specified address contains other mappings, those mappings will
be replaced.
If the specified address cannot otherwise be used,
.Fn mmap
will fail.
If
.Dv MAP_FIXED
is specified,
.Fa addr
must be a multiple of the pagesize.
.It Dv MAP_TRYFIXED
Try to do a fixed mapping but fail if another mapping already exists in
the space instead of overwriting the mapping.
.Pp
When used with
.Dv MAP_STACK
this flag allows one
.Dv MAP_STACK
mapping to be
made within another (typically the master user stack), as long as
no pages have been faulted in the area requested.
.It Dv MAP_HASSEMAPHORE
Notify the kernel that the region may contain semaphores and that special
handling may be necessary.
.It Dv MAP_NOCORE
Region is not included in a core file.
.It Dv MAP_NOSYNC
Causes data dirtied via this VM map to be flushed to physical media
only when necessary (usually by the pager) rather than gratuitously.
Typically this prevents the update daemons from flushing pages dirtied
through such maps and thus allows efficient sharing of memory across
unassociated processes using a file-backed shared memory map.
Without this option any VM pages you dirty may be flushed to disk every so
often (usually every 30-60 seconds), which can create performance problems
if you do not need that to occur (such as when you are using shared
file-backed mmap regions for IPC purposes).
Note that VM/filesystem coherency is
maintained whether you use
.Dv MAP_NOSYNC
or not.
This option is not portable
across
.Ux
platforms (yet), though some may implement the same behavior
by default.
.Pp
.Em WARNING !
Extending a file with
.Xr ftruncate 2 ,
thus creating a big hole, and then filling the hole by modifying a shared
.Fn mmap
can lead to severe file fragmentation.
In order to avoid such fragmentation you should always pre-allocate the
file's backing store by
.Fn write Ns ing
zeros into the newly extended area prior to modifying the area via your
.Fn mmap .
The fragmentation problem is especially pronounced for
.Dv MAP_NOSYNC
pages, because pages may be flushed to disk in a totally random order.
.Pp
The same applies when using
.Dv MAP_NOSYNC
to implement a file-based shared memory store.
It is recommended that you create the backing store by
.Fn write Ns ing
zeros to the backing file rather than
.Fn ftruncate Ns ing
it.
You can test file fragmentation by observing the KB/t (kilobytes per
transfer) results from an
.Dq Li iostat 1
while reading a large file sequentially, e.g. using
.Dq Li dd if=filename of=/dev/null bs=32k .
.Pp
The
.Xr fsync 2
function will flush all dirty data and metadata associated with a file,
including dirty NOSYNC VM data, to physical media.
The
.Xr sync 8
command and
.Xr sync 2
system call generally do not flush dirty NOSYNC VM data.
The
.Xr msync 2
system call is obsolete since
.Bx
implements a coherent filesystem buffer cache.
However, it may be
used to associate dirty VM pages with filesystem buffers and thus cause
them to be flushed to physical media sooner rather than later.
.It Dv MAP_PRIVATE
Modifications are private.
.It Dv MAP_SHARED
Modifications are shared.
.It Dv MAP_STACK
Map the area as a stack.
.Dv MAP_ANON
is implied.
The
.Fa offset
argument should be 0,
.Fa fd
must be \-1, and
.Fa prot
should include at least
.Dv PROT_READ
and
.Dv PROT_WRITE .
This option creates
a memory region that grows to at most
.Fa len
bytes in size, starting from the stack top and growing down.
The
stack top is the starting address returned by the call, plus
.Fa len
bytes.
The bottom of the stack at maximum growth is the starting
address returned by the call.
.Pp
The entire area is reserved from the point of view of other
.Fn mmap
calls, even if not faulted in yet.
.Pp
.Em WARNING !
We currently allow
.Dv MAP_STACK
mappings to provide a hint that points within an existing
.Dv MAP_STACK
mapping's space, and this will succeed as long as no pages have been
faulted in the area specified, but this behavior is no longer supported
unless you also specify the
.Dv MAP_TRYFIXED
flag.
.Pp
Note that unless
.Dv MAP_FIXED
or
.Dv MAP_TRYFIXED
is used, you cannot count on the returned address matching the hint
you have provided.
.It Dv MAP_VPAGETABLE
Memory accessed via this map is not linearly mapped and will be governed
by a virtual page table.
The base address of the virtual page table may
be set using
.Xr mcontrol 2
with
.Dv MADV_SETMAP .
Virtual page tables work with anonymous memory, but there
is no way to populate the page table, so for all intents and purposes
.Dv MAP_VPAGETABLE
can only be used when mapping file descriptors.
Since the kernel will
update the VPTE_M bit in the virtual page table, the mapping must be R+W
even though actual access to the memory will be properly governed by
the virtual page table.
.Pp
Addressable backing store is limited by the range supported in the virtual
page table entries.
The kernel may implement a page table abstraction capable
of addressing a larger range within the backing store than could otherwise
be mapped into memory.
.El
.Pp
The
.Xr close 2
function does not unmap pages; see
.Xr munmap 2
for further information.
.Pp
The current design does not allow a process to specify the location of
swap space.
In the future we may define an additional mapping type,
.Dv MAP_SWAP ,
in which
the file descriptor argument specifies a file or device to which swapping
should be done.
.Sh RETURN VALUES
Upon successful completion,
.Fn mmap
returns a pointer to the mapped region.
Otherwise, a value of
.Dv MAP_FAILED
is returned and
.Va errno
is set to indicate the error.
.Sh ERRORS
The
.Fn mmap
function will fail if:
.Bl -tag -width Er
.It Bq Er EACCES
The flag
.Dv PROT_READ
was specified as part of the
.Fa prot
parameter and
.Fa fd
was not open for reading.
The flags
.Dv MAP_SHARED
and
.Dv PROT_WRITE
were specified as part of the
.Fa flags
and
.Fa prot
parameters and
.Fa fd
was not open for writing.
.It Bq Er EBADF
.Fa fd
is not a valid open file descriptor.
.It Bq Er EINVAL
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter was not page aligned, or part of the desired address space
resides out of the valid address space for a user process.
.It Bq Er EINVAL
The
.Fa len
argument was negative.
.It Bq Er EINVAL
.Dv MAP_ANON
was specified and the
.Fa fd
parameter was not \-1.
.It Bq Er EINVAL
.Dv MAP_ANON
has not been specified and
.Fa fd
did not reference a regular or character special file.
.It Bq Er EINVAL
The
.Fa offset
argument was not page-aligned.
(See
.Sx BUGS
below.)
.It Bq Er ENOMEM
.Dv MAP_FIXED
was specified and the address given by the
.Fa addr
parameter was not available.
.Dv MAP_ANON
was specified and insufficient memory was available.
The system has reached the per-process mmap limit specified in the
.Va vm.max_proc_mmap
sysctl.
.El
.Sh SEE ALSO
.Xr madvise 2 ,
.Xr mincore 2 ,
.Xr mlock 2 ,
.Xr mprotect 2 ,
.Xr msync 2 ,
.Xr munlock 2 ,
.Xr munmap 2 ,
.Xr getpagesize 3
.Sh BUGS
.Fa len
is limited to 2GB.
Mmapping slightly more than 2GB doesn't work, but
it is possible to map a window of size (filesize % 2GB) for file sizes
of slightly less than 2GB, 4GB, 6GB and 8GB.
.Pp
The limit is imposed for a variety of reasons.
Most of them have to do
with
.Dx
not wanting to use 64-bit offsets in the VM system due to
the extreme performance penalty.
So
.Dx
uses 32-bit page indexes and
this gives
.Dx
a maximum of 8TB filesizes.
It's actually bugs in
the filesystem code that cause the limit to be further restricted to
1TB (loss of precision when doing blockno calculations).
.Pp
Another reason for the 2GB limit is that filesystem metadata can
reside at negative offsets.