.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\" $DragonFly: src/lib/libc/sys/kqueue.2,v 1.3 2006/05/26 19:39:37 swildner Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.
If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined
system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It EV_ADD
Adds the event to the kqueue.
Re-adding an existing event
will modify the parameters of the original event, and not result
in a duplicate entry.
Adding an event automatically enables it,
unless overridden by the EV_DISABLE flag.
.It EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It EV_DELETE
Removes the event from the kqueue.
Events which are attached to
file descriptors are automatically deleted on the last close of
the descriptor.
.It EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue,
it is deleted.
.It EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically
set this flag internally.
.It EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets EV_EOF in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set EV_EOF in
.Va flags .
This may be cleared by passing in EV_CLEAR, at which point the
filter will resume waiting for data to become available before
returning.
.El
.It EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes
and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set EV_EOF when the reader disconnects, and for
the fifo case, this may be cleared by use of EV_CLEAR.
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the EVFILT_READ case.
.It EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to SIGEV_KEVENT.
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on
architectures with 64-bit pointers, and should be considered deprecated.
.It EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occur on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It NOTE_EXTEND
The file referenced by the descriptor was extended.
.It NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It NOTE_LINK
The link count on the file changed.
.It NOTE_RENAME
The file referenced by the descriptor was renamed.
.It NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying filesystem was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It NOTE_EXIT
The process has exited.
.It NOTE_FORK
The process has called
.Fn fork .
.It NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with NOTE_TRACK set in the
.Va fflags
field, while the child process will return with NOTE_CHILD set in
.Va fflags
and the parent PID in
.Va data .
.It NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record
all attempts to deliver a signal to a process, even if the signal has
been marked as SIG_IGN.
Event notification happens after normal
signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.It EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless EV_ONESHOT is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but
a UFS file system.