.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.Aq Pa sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.  If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.  To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.  The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.  The pre-defined
system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It EV_ADD
Adds the event to the kqueue.  Re-adding an existing event
will modify the parameters of the original event, and not result
in a duplicate entry.  Adding an event automatically enables it,
unless overridden by the EV_DISABLE flag.
.It EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It EV_DISABLE
Disable the event so
.Fn kevent
will not return it.  The filter itself is not disabled.
.It EV_DELETE
Removes the event from the kqueue.  Events which are attached to
file descriptors are automatically deleted on the last close of
the descriptor.
.It EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.  After the user retrieves the event from the kqueue,
it is deleted.
.It EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.  Note that some filters may automatically
set this flag internally.
.It EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has shut down, then the filter
also sets EV_EOF in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set EV_EOF in
.Va flags .
This may be cleared by passing in EV_CLEAR, at which point the
filter will resume waiting for data to become available before
returning.
.El
.It EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.  For sockets, pipes
and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set EV_EOF when the reader disconnects, and for
the fifo case, this may be cleared by use of EV_CLEAR.
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the EVFILT_READ case.
.It EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to SIGEV_KEVENT.
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.  However, this approach will not work on
architectures with 64-bit pointers, and should be considered deprecated.
.It EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It NOTE_EXTEND
The file referenced by the descriptor was extended.
.It NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It NOTE_LINK
The link count on the file changed.
.It NOTE_RENAME
The file referenced by the descriptor was renamed.
.It NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It NOTE_EXIT
The process has exited.
.It NOTE_FORK
The process has called
.Fn fork .
.It NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It NOTE_TRACK
Follow a process across
.Fn fork
calls.  The parent process will return with NOTE_TRACK set in the
.Va fflags
field, while the child process will return with NOTE_CHILD set in
.Va fflags
and the parent PID in
.Va data .
.It NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.  The filter will record
all attempts to deliver a signal to a process, even if the signal has
been marked as SIG_IGN.  Event notification happens after normal
signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.It EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless EV_ONESHOT is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but
a UFS file system.
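.Sh EXAMPLES
The register-then-wait pattern described in
.Sx DESCRIPTION
can be sketched as follows.
This is a minimal illustration under stated assumptions, not a definitive
implementation:
.Fn wait_readable
is a hypothetical helper name, and the
.Dv HAVE_KQUEUE
guard with its
.Er ENOSYS
fallback is an assumption added only so that the fragment also builds on
systems without
.Aq Pa sys/event.h .
The helper registers one
.Dv EVFILT_READ
kevent for a descriptor and waits for it with a bounded timeout.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Assumption: these platform macros identify systems that ship kqueue. */
#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/event.h>
#define HAVE_KQUEUE 1
#endif

/*
 * Hypothetical helper, not part of the kqueue interface.
 * Waits up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if readable, 0 if the time limit expired, -1 on error
 * with errno set.
 */
int
wait_readable(int fd, int timeout_ms)
{
#ifdef HAVE_KQUEUE
	struct kevent change, event;
	struct timespec ts;
	int kq, n, saved_errno;

	kq = kqueue();
	if (kq == -1)
		return (-1);

	/* Describe the change: add an enabled read filter on fd. */
	EV_SET(&change, fd, EVFILT_READ, EV_ADD | EV_ENABLE, 0, 0, NULL);

	ts.tv_sec = timeout_ms / 1000;
	ts.tv_nsec = (long)(timeout_ms % 1000) * 1000000L;

	/* Apply the change and collect at most one pending event. */
	n = kevent(kq, &change, 1, &event, 1, &ts);

	/* A failed change is reported as an event with EV_ERROR set. */
	if (n > 0 && (event.flags & EV_ERROR) != 0) {
		errno = (int)event.data;
		n = -1;
	}

	/* Close the queue without disturbing the errno we report. */
	saved_errno = errno;
	(void)close(kq);
	errno = saved_errno;
	return (n);
#else
	/* Fallback stub (assumption): no kqueue on this system. */
	(void)fd;
	(void)timeout_ms;
	errno = ENOSYS;
	return (-1);
#endif
}
```

.Pp
A caller might invoke
.Fn wait_readable
on the read end of a pipe before calling
.Xr read 2 ,
treating a return of 0 as an expired time limit.
Creating and closing a kqueue on every call keeps the sketch short;
a long-running program would more plausibly create one queue and reuse it
across calls to
.Fn kevent .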