xref: /netbsd/share/doc/papers/bus_dma/2.me (revision bf9ec67e)
$NetBSD: 2.me,v 1.1 1998/07/15 00:34:54 thorpej Exp $

Copyright (c) 1998 Jason R. Thorpe.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this software
must display the following acknowledgements:
This product includes software developed for the NetBSD Project
by Jason R. Thorpe.
4. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.

.sh 1 "Design considerations" .pp Hiding host and bus details is actually very straightforward. Handling WYSIWYG and direct-mapped DMA mechanisms is trivial. Handling scatter-gather-mapped DMA is also very easy, with the help of state kept in machine-dependent code layers. The presence and semantics of caches are also easy to handle with a set of four "synchronization" operations, and once caches are handled, DMA bouncing is conceptually trivial if viewed as a non-DMA-coherent cache. Unfortunately, while these operations are quite easy to do individually, traditional kernels do not provide a sufficiently abstract interface to the operations. This means that device drivers in these traditional kernels must handle each case explicitly. .pp In addition to the interface to these operations, a comprehensive DMA framework must also consider data buffer structures and DMA-safe memory handling. .sh 2 "Data buffer structures" .pp The BSD kernel has essentially three different structures used to represent data buffers. The first is a simple linear buffer in virtual space, for example the data areas used to implement the file system buffer cache, and miscellaneous buffers allocated by the general purpose kernel memory allocator. The second is the mbuf chain. Mbufs are typically used by code which implements inter-process communication and networking. Their structure, small buffers chained together, reduces memory fragmentation and allows packet headers to be prepended easily. The third is the uio structure. This structure describes software scatter-gather to the kernel address space or to the address space of a specific process. It is most commonly used by the read(2) and write(2) system calls. While it would be possible for the device driver to treat the two more complex buffer structures as sets of multiple simple linear buffers, this is undesirable in terms of source code maintenance; the code to handle these data buffer structures can be complex, especially in terms of error handling. .pp In addition to the obvious need to DMA to and from memory mapped into kernel address space, it is common in modern operating systems to implement an optimized I/O interface for user processes which provides a method for devices to DMA directly to or from memory regions mapped into a process's address space. While this facility is partially provided for character device I/O by double-mapping the user buffer into kernel address space, the interface is not sufficiently general, and consumes kernel resources. This is somewhat related to the uio structure, in that the uio is capable of addressing buffers in a process's address space. However it may be desirable to use an alternate data format, such as a linear buffer, in some applications. In order to implement this, the DMA mapping framework must have access to processes' virtual memory structures. .pp It may also be desirable to DMA to or from buffers not mapped into any address space. The obvious example is frame grabbers. These devices, which capture video images, often require large, physically contiguous memory regions to store the captured image data. On some architectures, mapping of virtual address space is expensive. An application may wish to give a large buffer to the device, allow the device to continuously update the buffer, and then only map small regions of the buffer at any given time. Since the entire buffer need not be mapped into virtual address space, the DMA framework should provide an interface for using raw, unmapped buffers in DMA transfers. .sh 2 "DMA-safe memory handling" .pp A comprehensive DMA framework must also provide several memory handling facilities. The most obvious of these is a method of allocating (and freeing) DMA-safe memory. The term "DMA-safe" is a way of describing a set of attributes the memory will have. First, DMA-safe memory must be addressable within the constraints of the bus. It must also be allocated in such a way as to not exceed the number of physical segments\** specified by the caller. .(f \**This is somewhat misleading. The actual constraint is on the number of DMA segments the memory may map to. However, this usually corresponds directly to the number of physical memory segments which make up the allocated memory. .)f .pp In order for the kernel to access the DMA-safe memory, a method must exist to map this memory into kernel virtual address space. This is a fairly straightforward operation, with one exception. On some platforms which do not have cache-coherent DMA, cache flushes are very expensive. However, it is sometimes possible to mark virtual mappings of memory as cache-inhibited, or access physical memory though a cache-inhibited direct-mapped address segment. In order to accommodate these situations, a hint may be provided to the memory mapping function which specifies that the user of this memory wishes to avoid expensive data cache flushes. .pp To facilitate optimized I/O to process address spaces, it is necessary to provide processes a way of mapping a DMA-safe memory area. The most convenient way to do this is via a device driver's mmap() entry point. Thus, a DMA mapping framework must have a way to communicate with the VM system's device pager\**. .(f \**The device pager provides support for memory mapping devices into a process's address space. .)f .pp All of these requirements must be considered in the design of a complete DMA framework. When possible, the framework may merge semantically similar operations or concepts, but it must address all of these issues. The next section describes the interface provided by such a framework.