.\" $NetBSD: aio.3,v 1.5 2010/05/19 06:35:20 jruoho Exp $ $
.\"
.\" Copyright (c) 2010 Jukka Ruohonen <jruohonen@iki.fi>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY Softweyr LLC AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL Softweyr LLC OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd May 19, 2010
.Dt AIO 3
.Os
.Sh NAME
.Nm aio
.Nd asynchronous I/O (REALTIME)
.Sh LIBRARY
.Lb librt
.Sh SYNOPSIS
.In aio.h
.Sh DESCRIPTION
The
.St -p1003.1-2001
standard defines an interface for asynchronous input and output.
Although in
.Nx
this is provided as part of the
.Lb librt ,
the implementation largely resides in the kernel.
.Ss Rationale
The rationale can be roughly summarized with the following points.
.Bl -enum -offset 2n
.It
To increase performance by providing a mechanism to carry out
.Tn I/O
without blocking.
Theoretically, if
.Tn I/O
never blocked,
neither at the software nor at the hardware level,
the overhead of
.Tn I/O
would become zero, and processes would no longer be
.Tn I/O
bound.
.It
To segregate the different
.Tn I/O
operations into logically distinct procedures.
Unlike the standard
.Xr stdio 3 ,
the
.Nm
interface separates queuing and submitting
.Tn I/O
operations to the kernel from
receiving notifications of operation completion from the kernel.
.It
To provide a uniform and standardized framework for asynchronous
.Tn I/O .
For instance,
.Nm
avoids the need for (and the overhead of) extra worker threads
sometimes used to perform asynchronous
.Tn I/O .
.El
.Ss Asynchronous I/O Control Block
The Asynchronous I/O Control Block is the basic operational unit behind
.Nm .
This is required since an arbitrary number of operations can be started
at once, and because each operation can be either input or output.
This block is represented by the
.Em aiocb
structure, which is defined in the
.In aio.h
header.
The following fields are available for user applications:
.Bd -literal -offset indent
off_t            aio_offset;
void            *aio_buf;
size_t           aio_nbytes;
int              aio_fildes;
int              aio_lio_opcode;
int              aio_reqprio;
struct sigevent  aio_sigevent;
.Ed
.Pp
The fields are:
.Bl -enum -offset indent
.It
The
.Va aio_offset
member specifies the implicit file offset at which the
.Tn I/O
operations are performed.
This cannot be expected to be the actual read/write offset of the
file descriptor.
.It
The
.Va aio_buf
member is a pointer to the buffer from which data is written or
to which the read operation stores data.
.It
The
.Va aio_nbytes
member specifies the length of
.Va aio_buf
in bytes.
.It
The
.Va aio_fildes
member specifies the file descriptor on which the operation is performed.
.It
The
.Va aio_lio_opcode
member is used by the
.Fn lio_listio
function to initiate a list of
.Tn I/O
requests with a single call.
.It
The
.Va aio_reqprio
member can be used to lower the scheduling priority of an
.Nm
operation.
This is only available if
.Dv _POSIX_PRIORITIZED_IO
and
.Dv _POSIX_PRIORITY_SCHEDULING
are defined, and the associated file descriptor supports it.
.It
The
.Va aio_sigevent
member is used to specify how the calling process is notified once an
.Nm
operation completes.
.El
.Pp
The members
.Va aio_buf ,
.Va aio_fildes ,
and
.Va aio_nbytes
are conceptually similar to the parameters
.Sq buf ,
.Sq fildes ,
and
.Sq nbytes
used in the standard
.Xr read 2
and
.Xr write 2
functions.
For example, the caller can read
.Va aio_nbytes
bytes from a file associated with the file descriptor
.Va aio_fildes
into the buffer
.Va aio_buf .
All appropriate fields should be initialized by the caller before
.Fn aio_read
or
.Fn aio_write
is called.
.Ss File Offsets
Asynchronous
.Tn I/O
operations are not strictly sequential;
operations are carried out in arbitrary order and more than one
operation for one file descriptor can be started.
The requested read or write operation starts
from the absolute position specified by
.Va aio_offset ,
as if
.Xr lseek 2
had been called with
.Dv SEEK_SET
immediately prior to the operation.
The
.Tn POSIX
standard does not specify the value of the file offset after an
.Nm
operation has been successfully completed.
Depending on the implementation,
the actual file offset may or may not be updated.
.Ss Errors and Completion
Asynchronous
.Tn I/O
operations are said to be complete when:
.Bl -bullet -offset 2n
.It
An error is detected.
.It
The
.Tn I/O
transfer is performed successfully.
.It
The operation is canceled.
.El
.Pp
If an error condition is detected that prevents
an operation from being started, the request is not enqueued.
In this case the read and write functions,
.Fn aio_read
and
.Fn aio_write ,
return immediately, setting the global
.Va errno
to indicate the cause of the error.
.Pp
After an operation has been successfully enqueued,
.Fn aio_error
and
.Fn aio_return
must be used to determine the status of the operation and to determine
any error conditions.
This includes the conditions reported by the standard
.Xr read 2 ,
.Xr write 2 ,
and
.Xr fsync 2 .
The request remains enqueued and consumes process and
system resources until
.Fn aio_return
is called.
.Ss Waiting for Completion
The
.Nm
interface supports both polling and notification models.
Polling can be implemented by repeatedly calling the
.Fn aio_error
function to test the status of an operation, which remains
.Er EINPROGRESS
until the operation completes.
Once the operation has completed,
.Fn aio_return
is used to retrieve the return status and to free the
.Va aiocb
structure for re-use.
.Pp
The notification model is implemented by using the
.Va aio_sigevent
member of the Asynchronous I/O Control Block.
The operational model and the structure used are described in
.Xr sigevent 3 .
.Pp
The
.Fn aio_suspend
function can be used to wait for the completion of one or more operations.
It is possible to set a timeout so that the process can continue
execution and take recovery actions if the
.Nm
operations do not complete as expected.
.Ss Cancellation and Synchronization
The
.Fn aio_cancel
function can be used to request cancellation of an asynchronous
.Tn I/O
operation.
Note, however, that not all operations can be canceled.
The same
.Va aiocb
used to start the operation may be used as a handle for identification.
It is also possible to request cancellation of all operations pending
for a file.
.Pp
Comparable to
.Xr fsync 2 ,
the
.Fn aio_fsync
function can be used to synchronize the contents of
permanent storage when multiple asynchronous
.Tn I/O
operations are outstanding for the file or device.
The synchronization operation includes only those requests that have
already been successfully enqueued.
.Sh FUNCTIONS
The following functions comprise the
.Tn API
of the
.Nm
interface:
.Bl -column -offset indent "aio_suspend " "XXX"
.It Sy Function Ta Sy Description
.It Xr aio_cancel 3 Ta cancel an outstanding asynchronous I/O operation
.It Xr aio_error 3 Ta retrieve error status of asynchronous I/O operation
.It Xr aio_fsync 3 Ta asynchronous data synchronization of file
.It Xr aio_read 3 Ta asynchronous read from a file
.It Xr aio_return 3 Ta get return status of asynchronous I/O operation
.It Xr aio_suspend 3 Ta suspend until operations complete or a timeout expires
.It Xr aio_write 3 Ta asynchronous write to a file
.It Xr lio_listio 3 Ta list directed I/O
.El
.Sh COMPATIBILITY
Unfortunately, implementations of
.Tn POSIX
asynchronous
.Tn I/O
vary slightly.
Some implementations provide a slightly different
.Tn API
with possible extensions.
For instance, the
.Fx
implementation uses a function
.Sq Fn aio_waitcomplete
to wait for the next completion of an
.Nm aio
request.
.Sh STANDARDS
The
.Nm
interface is expected to conform to the
.St -p1003.1-2001
standard.
.Sh HISTORY
The
.Nm
interface first appeared in
.Nx 5.0 .
.Sh CAVEATS
A few limitations should be noted:
.Bl -bullet
.It
Undefined behavior results if simultaneous asynchronous operations
use the same Asynchronous I/O Control Block.
.It
When an asynchronous read operation is outstanding,
undefined behavior may follow if the contents of
.Va aiocb
are altered, or if memory associated with the structure, or the
.Va aio_buf
buffer, is deallocated.
.El
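.Sh EXAMPLES
The following is a minimal sketch of the enqueue, wait, and collect
cycle described in this page, not a definitive implementation.
The helper name
.Fn aio_read_at
and its parameters are hypothetical and are not part of the
.Nm
interface.
It assumes that the file is a regular file readable by the caller,
and it selects
.Dv SIGEV_NONE
so that no notification is delivered.
.Bd -literal -offset indent
#include <sys/types.h>

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of aio): read up to len bytes at the
 * absolute offset off and wait for the request to complete.
 * Returns the number of bytes read, or -1 with errno set.
 */
ssize_t
aio_read_at(const char *path, off_t off, void *buf, size_t len)
{
	struct aiocb cb;
	const struct aiocb *list[1];
	ssize_t n;
	int fd, error;

	if ((fd = open(path, O_RDONLY)) == -1)
		return -1;

	/* Initialize every field before the request is enqueued. */
	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_offset = off;
	cb.aio_buf = buf;
	cb.aio_nbytes = len;
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;

	if (aio_read(&cb) == -1) {
		/* The request was not enqueued; errno tells why. */
		error = errno;
		(void)close(fd);
		errno = error;
		return -1;
	}

	/* Sleep in aio_suspend() while the request is in progress. */
	list[0] = &cb;
	while ((error = aio_error(&cb)) == EINPROGRESS)
		(void)aio_suspend(list, 1, NULL);

	/* aio_return() must be called once to release the request. */
	n = aio_return(&cb);
	(void)close(fd);
	if (error != 0) {
		errno = error;
		return -1;
	}
	return n;
}
.Ed
.Pp
A caller might use the sketch as follows:
.Pp
.Dl n = aio_read_at(path, 0, buf, sizeof(buf));
.Pp
Waiting immediately after
.Fn aio_read
keeps the example short but gives up the benefit of asynchronous
.Tn I/O ;
a real application would normally perform other work between
enqueuing the request and collecting its result.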