psd/05.sysman/1.2.t

         Copyright (c) 1983, 1993, 1994
 The Regents of the University of California. All rights reserved.

 %sccs.include.redist.roff%

 @(#)1.2.t 8.5 (Berkeley) 05/21/94

.Sh 2 "Memory management
.Sh 3 "Text, data, and stack

Each process begins execution with three logical areas of memory
called text, data, and stack.
The text area is read-only and shared, while the data and stack
areas are private to the process. Both the data and stack areas may
be extended and contracted on program request. The call:

.Fd brk 1 "set data section size
brk(addr);
caddr_t addr;

sets the end of the data segment to the specified address.
More conveniently, the end can be extended by incr bytes,
and the base of the new area returned with the call:

.Fd sbrk 1 "change data section size
addr = sbrk(incr);
result caddr_t addr; int incr;

Application programs usually use the library routines
.Fn malloc
and
.Fn free ,
which provide a more convenient interface to
.Fn brk
and
.Fn sbrk .

There is no call for extending the stack,
as it is automatically extended as needed.
.Sh 3 "Mapping pages

The system supports sharing of data between processes
by allowing pages to be mapped into memory. These mapped
pages may be shared with other processes or private
to the process.
Protection and sharing options are defined in <sys/mman.h> as:


Protections are chosen from these bits, or-ed together
PROT_READ /* pages can be read */
PROT_WRITE /* pages can be written */
PROT_EXEC /* pages can be executed */


Flags contain sharing type and options.
Sharing options (choose one):
MAP_SHARED /* share changes */
MAP_PRIVATE /* changes are private */


Other flags \(dg
MAP_ANON /* allocated from virtual memory; fd ignored */
MAP_FIXED /* map addr must be exactly as requested */
MAP_NORESERVE /* don't reserve needed swap area */
MAP_INHERIT /* region is retained after exec */
MAP_HASSEMAPHORE /* region may contain semaphores */
MAP_RENAME /* Sun: rename private pages to file */


.FS
\(dg Currently only MAP_ANON and MAP_FIXED are implemented.
.FE
The CPU-dependent size of a page is returned by the
.Fn sysctl
interface described in section
.Xr 1.7.1 .
For convenience and backward compatibility, the
.Fn getpagesize
library routine is provided:

.Fd getpagesize 0 "get system page size
pagesize = getpagesize();
result int pagesize;
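
A program that works in page-sized units commonly rounds a byte count
up to a whole number of pages.
The sketch below shows one way to do so with getpagesize;
the helper name round_to_page is hypothetical.

```c
#include <stddef.h>
#include <unistd.h>

/* Round len up to the next multiple of the system page size. */
static size_t
round_to_page(size_t len)
{
	size_t pagesize = (size_t)getpagesize();

	return (len + pagesize - 1) / pagesize * pagesize;
}
```

The division and multiplication avoid assuming that the page size
is a power of two.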


The call:

.Fd mmap 6 "map files or devices into memory
maddr = mmap(addr, len, prot, flags, fd, pos);
result caddr_t maddr; caddr_t addr; size_t len; int prot, flags, fd; off_t pos;

causes the pages starting at addr and continuing
for at most len bytes to be mapped from the object represented by
descriptor fd, starting at byte offset pos.
If addr is NULL, the system picks an unused address for the region.
The starting address of the region is returned;
for the convenience of the system,
it may differ from that supplied
unless the MAP_FIXED flag is given,
in which case the exact address will be used or the call will fail.
The addr and pos parameters
must be multiples of the page size;
len will be rounded by the system as necessary.
A successful
.Fn mmap
will delete any previous mapping
in the allocated address range.
The parameter prot specifies the accessibility
of the mapped pages.
The parameter flags specifies
the type of object to be mapped,
mapping options, and
whether modifications made to
this mapped copy of the page
are to be kept private, or are to be shared with
other references.
Possible types include MAP_SHARED or MAP_PRIVATE, which
map a regular file or character-special device memory,
and MAP_ANON, which maps memory not associated with any specific file.
The file descriptor supplied when creating MAP_ANON regions is ignored
and should be -1.
The MAP_INHERIT flag allows a region to be inherited after an
.Fn execve .
The MAP_HASSEMAPHORE flag allows special handling for
regions that may contain semaphores.
The MAP_NORESERVE flag allows processes to allocate regions whose
virtual address space, if fully allocated,
would exceed the available memory plus swap resources.
Processes using such regions may get a SIGSEGV signal if they page fault and resources
are not available to service their request;
typically they would free up some resources via
.Fn munmap
so that when they return from the signal the page
fault could be completed successfully.

A facility is provided to synchronize a mapped region with the file
it maps; the call:

.Fd msync 2 "synchronize a mapped region
msync(addr, len);
caddr_t addr; size_t len;

causes any modified pages in the specified region to be synchronized
with their source and other mappings.
If necessary, it writes any modified pages back to the filesystem, and updates
the file modification time.
If len is 0, all modified pages within the region containing addr
will be flushed;
this usage is provisional, and may be withdrawn.
If len is non-zero, only the pages containing addr and len
succeeding locations will be examined.
Any required synchronization of memory caches
will also take place at this time.
Filesystem operations on a file that is mapped for shared modifications
are currently unpredictable except after an
.Fn msync .
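
The following sketch updates a file through a shared mapping and then
synchronizes it.
It is written for current POSIX systems, where msync takes a third
flags argument (MS_SYNC here) rather than the two-argument form shown
above;
the helper name update_file is hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Overwrite the first n bytes of an existing file through a shared
 * mapping, then flush the change with msync().  Returns 0 on success,
 * -1 on failure.  update_file is a hypothetical helper; the file must
 * already be at least n bytes long.
 */
static int
update_file(const char *path, const char *data, size_t n)
{
	int fd, rv = -1;
	char *p;

	if ((fd = open(path, O_RDWR)) < 0)
		return -1;
	p = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);			/* the mapping outlives the descriptor */
	if (p == MAP_FAILED)
		return -1;
	memcpy(p, data, n);		/* modify the mapped pages */
	if (msync(p, n, MS_SYNC) == 0)	/* write modified pages back */
		rv = 0;
	munmap(p, n);
	return rv;
}
```

After update_file returns 0, an ordinary read of the file sees the
new contents.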

A mapping can be removed by the call:

.Fd munmap 2 "remove a mapping
munmap(addr, len);
caddr_t addr; size_t len;

This call deletes the mappings for the specified address range,
and causes further references to addresses within the range
to generate invalid memory references.
.Sh 3 "Page protection control

A process can control the protection of pages using the call:

.Fd mprotect 3 "control the protection of pages
mprotect(addr, len, prot);
caddr_t addr; size_t len; int prot;

This call changes the specified pages to have protection prot\|.
Not all implementations will guarantee protection on a page basis;
the granularity of protection changes may be as large as an entire region.
.Sh 3 "Giving and getting advice

A process that has knowledge of its memory behavior may
use the
.Fn madvise \(dg
call:
.FS
\(dg The entry point for this system call is defined,
but is not implemented,
so currently always returns with the error ``Operation not supported.''
.FE

.Fd madvise 3 "give advice about use of memory
madvise(addr, len, behav);
caddr_t addr; size_t len; int behav;

Behav describes expected behavior, as given
in <sys/mman.h>:


MADV_NORMAL /* no further special treatment */
MADV_RANDOM /* expect random page references */
MADV_SEQUENTIAL /* expect sequential references */
MADV_WILLNEED /* will need these pages */
MADV_DONTNEED /* don't need these pages */
MADV_SPACEAVAIL /* ensure that resources are reserved */


The
.Fn mincore \(dg
function allows a process to obtain information
about whether pages are memory resident:

.Fd mincore 3 "get advice about use of memory
mincore(addr, len, vec);
caddr_t addr; int len; result char *vec;

Here the current memory residency of the pages is returned
in the character array vec, with a value of 1 meaning
that the page is in-memory.
.Fn Mincore
provides only transient information about page residency.
Real-time processes that need guaranteed residence over time
can use the call:

.Fd mlock 2 "lock physical pages in memory
mlock(addr, len);
caddr_t addr; size_t len;

This call locks the pages for the specified address range into memory
(paging them in if necessary)
ensuring that further references to addresses within the range
will never generate page faults.
The amount of memory that may be locked is controlled by a resource limit;
see section
.Xr 1.6.3 .
When the memory is no longer critical it can be unlocked using:

.Fd munlock 2 "unlock physical pages in memory
munlock(addr, len);
caddr_t addr; size_t len;

After the
.Fn munlock
call, the pages in the specified address range are still accessible
but may be paged out if memory is short and they are not accessed.
.Sh 3 "Synchronization primitives
Primitives are provided for synchronization using semaphores
in shared memory.\(dd
.FS
\(dd All are currently unimplemented; no entry points exist.
.FE
These primitives are expected to be superseded by the semaphore
interface being specified by the POSIX Pthread standard.
They are provided as an efficient interim solution.
Application programmers are encouraged to use the Pthread interface
when it becomes available.

Semaphores must lie within a MAP_SHARED region with at least modes
PROT_READ and PROT_WRITE.
The MAP_HASSEMAPHORE flag must have been specified when the region was created.
To acquire a lock a process calls:

.Fd mset 2 "acquire and set a semaphore
value = mset(sem, wait);
result int value; semaphore *sem; int wait;

.Fn Mset
indivisibly tests and sets the semaphore sem.
If the previous value is zero, the process has acquired the lock and
.Fn mset
returns true immediately.
Otherwise, if the wait flag is zero,
failure is returned.
If wait is true and the previous value is non-zero,
.Fn mset
relinquishes the processor until notified that it should retry.

To release a lock a process calls:

.Fd mclear 2 "release a semaphore and awaken waiting processes
mclear(sem);
semaphore *sem;

.Fn Mclear
indivisibly tests and clears the semaphore sem.
If the ``WANT'' flag is zero in the previous value,
.Fn mclear
returns immediately.
If the ``WANT'' flag is non-zero in the previous value,
.Fn mclear
arranges for waiting processes to retry before returning.
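
Since these primitives are unimplemented, the following sketch only
emulates the acquire and release behavior described above;
it is not the system interface.
It substitutes C11 atomic operations for the indivisible test-and-set,
polls with sched_yield instead of sleeping in the kernel,
and has no ``WANT'' flag.
The names sketch_sem_t, sketch_mset, and sketch_mclear are hypothetical.

```c
#include <sched.h>
#include <stdatomic.h>

typedef atomic_int sketch_sem_t;	/* hypothetical stand-in for semaphore */

/*
 * Indivisibly test and set *sem.  Returns 1 if the previous value was
 * zero (lock acquired).  Otherwise returns 0 when wait is zero, or
 * yields the processor and retries when wait is non-zero.
 */
static int
sketch_mset(sketch_sem_t *sem, int wait)
{
	while (atomic_exchange(sem, 1) != 0) {
		if (!wait)
			return 0;
		sched_yield();		/* polling stand-in for sleeping */
	}
	return 1;
}

/* Indivisibly clear *sem so that a polling waiter can acquire it. */
static void
sketch_mclear(sketch_sem_t *sem)
{
	atomic_store(sem, 0);
}
```

To coordinate separate processes, the sketch_sem_t would be placed in
a MAP_SHARED region created before the processes diverge,
assuming atomic_int is lock-free on the machine in question.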

Two routines provide services analogous to the kernel
.Fn sleep
and
.Fn wakeup
functions interpreted in the domain of shared memory.
A process may relinquish the processor by calling
.Fn msleep
with a set semaphore:

.Fd msleep 1 "wait for a semaphore
msleep(sem);
semaphore *sem;

If the semaphore is still set when it is checked by the kernel,
the process will be put in a sleeping state
until some other process issues an
.Fn mwakeup
for the same semaphore within the region using the call:

.Fd mwakeup 1 "awaken process(es) sleeping on a semaphore
mwakeup(sem);
semaphore *sem;

An
.Fn mwakeup
may awaken all sleepers on the semaphore,
or may awaken only the next sleeper on a queue.