.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)mmap.2	8.4 (Berkeley) 5/11/95
.\" $FreeBSD: src/lib/libc/sys/mmap.2,v 1.22.2.12 2002/02/27 03:40:13 dd Exp $
.\" $DragonFly: src/lib/libc/sys/mmap.2,v 1.9 2007/05/17 08:19:00 swildner Exp $
.\"
.Dd December 11, 2006
.Dt MMAP 2
.Os
.Sh NAME
.Nm mmap
.Nd allocate memory, or map files or devices into memory
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.Ft void *
.Fn mmap "void *addr" "size_t len" "int prot" "int flags" "int fd" "off_t offset"
.Sh DESCRIPTION
The
.Fn mmap
function causes the pages starting at
.Fa addr
and continuing for at most
.Fa len
bytes to be mapped from the object described by
.Fa fd ,
starting at byte offset
.Fa offset .
If
.Fa len
is not a multiple of the pagesize, the mapped region may extend past the
specified range.
Any such extension beyond the end of the mapped object will be zero-filled.
.Pp
If
.Fa addr
is non-zero, it is used as a hint to the system.
(As a convenience to the system, the actual address of the region may differ
from the address supplied.)
If
.Fa addr
is zero, an address will be selected by the system.
The actual starting address of the region is returned.
A successful
.Fn mmap
deletes any previous mapping in the allocated address range.
.Pp
The protections (region accessibility) are specified in the
.Fa prot
argument by
.Em or Ns 'ing
the following values:
.Pp
.Bl -tag -width PROT_WRITE -compact
.It Dv PROT_NONE
Pages may not be accessed.
.It Dv PROT_READ
Pages may be read.
.It Dv PROT_WRITE
Pages may be written.
.It Dv PROT_EXEC
Pages may be executed.
.El
.Pp
The
.Fa flags
parameter specifies the type of the mapped object, mapping options and
whether modifications made to the mapped copy of the page are private
to the process or are to be shared with other references.
Sharing, mapping type and options are specified in the
.Fa flags
argument by
.Em or Ns 'ing
the following values:
.Bl -tag -width MAP_HASSEMAPHORE
.It Dv MAP_ANON
Map anonymous memory not associated with any specific file.
The file descriptor used for creating
.Dv MAP_ANON
must be \-1.
The
.Fa offset
parameter is ignored.
.\".It Dv MAP_FILE
.\"Mapped from a regular file or character-special device memory.
.It Dv MAP_FIXED
Do not permit the system to select a different address than the one
specified.
If the specified address contains other mappings those mappings will
be replaced.
If the specified address cannot otherwise be used,
.Fn mmap
will fail.
If
.Dv MAP_FIXED
is specified,
.Fa addr
must be a multiple of the pagesize.
.It Dv MAP_TRYFIXED
Try to do a fixed mapping but fail if another mapping already exists in
the space instead of overwriting the mapping.
.Pp
When used with
.Dv MAP_STACK
this flag allows one
.Dv MAP_STACK
mapping to be
made within another (typically the master user stack), as long as
no pages have been faulted in the area requested.
.It Dv MAP_HASSEMAPHORE
Notify the kernel that the region may contain semaphores and that special
handling may be necessary.
.It Dv MAP_NOCORE
Region is not included in a core file.
.It Dv MAP_NOSYNC
Causes data dirtied via this VM map to be flushed to physical media
only when necessary (usually by the pager) rather than gratuitously.
Typically this prevents the update daemons from flushing pages dirtied
through such maps and thus allows efficient sharing of memory across
unassociated processes using a file-backed shared memory map.
Without this option any VM pages you dirty may be flushed to disk every so
often (every 30-60 seconds usually), which can create performance problems if
you do not need that to occur (such as when you are using shared file-backed
mmap regions for IPC purposes).
Note that VM/filesystem coherency is maintained whether you use
.Dv MAP_NOSYNC
or not.
This option is not portable across
.Ux
platforms (yet), though some may implement the same behavior
by default.
.Pp
.Em WARNING !
Extending a file with
.Xr ftruncate 2 ,
thus creating a big hole, and then filling the hole by modifying a shared
.Fn mmap
can lead to severe file fragmentation.
In order to avoid such fragmentation you should always pre-allocate the
file's backing store by
.Fn write Ns ing
zeros into the newly extended area prior to modifying the area via your
.Fn mmap .
The fragmentation problem is especially sensitive to
.Dv MAP_NOSYNC
pages, because pages may be flushed to disk in a totally random order.
.Pp
The same applies when using
.Dv MAP_NOSYNC
to implement a file-based shared memory store.
It is recommended that you create the backing store by
.Fn write Ns ing
zeros to the backing file rather than
.Fn ftruncate Ns ing
it.
You can test file fragmentation by observing the KB/t (kilobytes per
transfer) results from an
.Dq Li iostat 1
while reading a large file sequentially, e.g. using
.Dq Li dd if=filename of=/dev/null bs=32k .
.Pp
The
.Xr fsync 2
function will flush all dirty data and metadata associated with a file,
including dirty NOSYNC VM data, to physical media.
The
.Xr sync 8
command and
.Xr sync 2
system call generally do not flush dirty NOSYNC VM data.
The
.Xr msync 2
system call is obsolete since
.Bx
implements a coherent filesystem buffer cache.
However, it may be used to associate dirty VM pages with filesystem buffers
and thus cause them to be flushed to physical media sooner rather than later.
.It Dv MAP_PRIVATE
Modifications are private.
.It Dv MAP_SHARED
Modifications are shared.
.It Dv MAP_STACK
Map the area as a stack.
.Dv MAP_ANON
is implied.
.Fa Offset
should be 0,
.Fa fd
must be -1, and
.Fa prot
should include at least
.Dv PROT_READ
and
.Dv PROT_WRITE .
This option creates
a memory region that grows to at most
.Fa len
bytes in size, starting from the stack top and growing down.
The stack top is the starting address returned by the call, plus
.Fa len
bytes.
The bottom of the stack at maximum growth is the starting
address returned by the call.
.Pp
The entire area is reserved from the point of view of other
.Fn mmap
calls, even if not faulted in yet.
.Pp
WARNING.
We currently allow
.Dv MAP_STACK
mappings to provide a hint that points within an existing
.Dv MAP_STACK
mapping's space, and this will succeed as long as no pages have been
faulted in the area specified, but this behavior is no longer supported
unless you also specify the
.Dv MAP_TRYFIXED
flag.
.Pp
Note that unless
.Dv MAP_FIXED
or
.Dv MAP_TRYFIXED
is used, you cannot count on the returned address matching the hint
you have provided.
.It Dv MAP_VPAGETABLE
Memory accessed via this map is not linearly mapped and will be governed
by a virtual page table.
The base address of the virtual page table may be set using
.Xr mcontrol 2
with
.Dv MADV_SETMAP .
Virtual page tables work with anonymous memory but there
is no way to populate the page table so for all intents and purposes
.Dv MAP_VPAGETABLE
can only be used when mapping file descriptors.
Since the kernel will update the VPTE_M bit in the virtual page table, the
mapping must be R+W even though actual access to the memory will be properly
governed by the virtual page table.
.Pp
Addressable backing store is limited by the range supported in the virtual
page table entries.
The kernel may implement a page table abstraction capable
of addressing a larger range within the backing store than could otherwise
be mapped into memory.
.El
.Pp
The
.Xr close 2
function does not unmap pages; see
.Xr munmap 2
for further information.
.Pp
The current design does not allow a process to specify the location of
swap space.
In the future we may define an additional mapping type,
.Dv MAP_SWAP ,
in which
the file descriptor argument specifies a file or device to which swapping
should be done.
.Sh RETURN VALUES
Upon successful completion,
.Fn mmap
returns a pointer to the mapped region.
Otherwise, a value of
.Dv MAP_FAILED
is returned and
.Va errno
is set to indicate the error.
.Sh ERRORS
.Fn Mmap
will fail if:
.Bl -tag -width Er
.It Bq Er EACCES
The flag
.Dv PROT_READ
was specified as part of the
.Fa prot
parameter and
.Fa fd
was not open for reading.
The flags
.Dv MAP_SHARED
and
.Dv PROT_WRITE
were specified as part of the
.Fa flags
and
.Fa prot
parameters and
.Fa fd
was not open for writing.
.It Bq Er EBADF
.Fa fd
is not a valid open file descriptor.
.It Bq Er EINVAL
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter was not page aligned, or part of the desired address space
resides out of the valid address space for a user process.
.It Bq Er EINVAL
.Fa Len
was negative.
.It Bq Er EINVAL
.Dv MAP_ANON
was specified and the
.Fa fd
parameter was not -1.
.It Bq Er EINVAL
.Dv MAP_ANON
has not been specified and
.Fa fd
did not reference a regular or character special file.
.It Bq Er EINVAL
.Fa Offset
was not page-aligned.
(See
.Sx BUGS
below.)
.It Bq Er ENOMEM
.Dv MAP_FIXED
was specified and the
.Fa addr
parameter wasn't available.
.Dv MAP_ANON
was specified and insufficient memory was available.
The system has reached the per-process mmap limit specified in the
.Va vm.max_proc_mmap
sysctl.
.El
.Sh SEE ALSO
.Xr madvise 2 ,
.Xr mincore 2 ,
.Xr mlock 2 ,
.Xr mprotect 2 ,
.Xr msync 2 ,
.Xr munlock 2 ,
.Xr munmap 2 ,
.Xr getpagesize 3
.Sh BUGS
.Fa len
is limited to 2GB.
Mmapping slightly more than 2GB doesn't work, but
it is possible to map a window of size (filesize % 2GB) for file sizes
of slightly less than 2GB, 4GB, 6GB and 8GB.
.Pp
The limit is imposed for a variety of reasons.
Most of them have to do with
.Dx
not wanting to use 64-bit offsets in the VM system due to
the extreme performance penalty.
So
.Dx
uses 32-bit page indexes and this gives
.Dx
a maximum of 8TB filesizes.
It is actually bugs in
the filesystem code that cause the limit to be further restricted to
1TB (loss of precision when doing blockno calculations).
.Pp
Another reason for the 2GB limit is that filesystem metadata can
reside at negative offsets.
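.Sh EXAMPLES
The following is a minimal sketch, not a definitive implementation, of two
uses described in this manual page: a private anonymous mapping created with
.Dv MAP_ANON ,
and a file-backed
.Dv MAP_SHARED
mapping whose backing store is pre-allocated by
.Fn write Ns ing
zeros rather than by calling
.Xr ftruncate 2 .
The helper names
.Fn anon_roundtrip
and
.Fn shared_file_roundtrip
are hypothetical and exist only for illustration.
The
.Dv _DEFAULT_SOURCE
definition is an assumption that only matters when building against glibc,
and the read-back check assumes a coherent filesystem buffer cache.
.Bd -literal
```c
#define _DEFAULT_SOURCE	/* assumption: glibc needs this for MAP_ANON */
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: map len bytes of private anonymous memory,
 * store and read back one byte, then unmap the region.
 * Returns 0 on success, -1 on failure.
 */
int
anon_roundtrip(size_t len)
{
	char *p;
	int ok;

	if (len == 0)
		return (-1);
	/* addr is zero, so the system selects the address; fd must be -1. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		return (-1);
	/* Fresh anonymous pages read as zero. */
	ok = (p[0] == 0 && p[len - 1] == 0);
	p[0] = 'x';
	ok = ok && (p[0] == 'x');
	if (munmap(p, len) == -1)
		return (-1);
	return (ok ? 0 : -1);
}

/*
 * Hypothetical helper: create path, pre-allocate len bytes by writing
 * zeros, map the file MAP_SHARED, copy msg into the mapping and check
 * that pread() observes the store.  The file is removed afterwards.
 * Returns 0 on success, -1 on failure.
 */
int
shared_file_roundtrip(const char *path, size_t len, const char *msg)
{
	char zeros[4096], back[64];
	size_t done, n, mlen = strlen(msg);
	char *p;
	int fd, rv = -1;

	if (mlen == 0 || mlen > sizeof(back) || mlen > len)
		return (-1);
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return (-1);
	/* Fill the backing store with zeros instead of leaving a hole. */
	memset(zeros, 0, sizeof(zeros));
	for (done = 0; done < len; done += n) {
		n = len - done < sizeof(zeros) ? len - done : sizeof(zeros);
		if (write(fd, zeros, n) != (ssize_t)n)
			goto out;
	}
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out;
	memcpy(p, msg, mlen);
	/* Assumes a coherent buffer cache: the read sees the store. */
	if (pread(fd, back, mlen, 0) == (ssize_t)mlen &&
	    memcmp(back, msg, mlen) == 0)
		rv = 0;
	munmap(p, len);
out:
	close(fd);
	unlink(path);
	return (rv);
}
```
.Ed
.Pp
Both helpers compare the result of
.Fn mmap
against
.Dv MAP_FAILED
rather than against a null pointer, as described in
.Sx RETURN VALUES .
The descriptor is closed only after
.Xr munmap 2
here for clarity; closing it earlier would not unmap the pages.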