.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\" $DragonFly: src/lib/libc/sys/kqueue.2,v 1.7 2008/05/02 02:05:04 swildner Exp $
.\"
.Dd December 3, 2008
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.
If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width ".Dv EV_ONESHOT"
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original
event, and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically
deleted on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate a filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width ".Dv EVFILT_SIGNAL"
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the fifo case, this may be cleared
by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_EXCEPT
Takes a descriptor as the identifier, and returns whenever one of the
specified exceptional conditions has occurred on the descriptor.
Conditions are specified in
.Va fflags .
Currently, a filter can monitor the reception of out-of-band data with
.Dv NOTE_OOB .
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_KEVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit
pointers, and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_RENAME"
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_TRACKERR"
.It Dv NOTE_EXIT
The process has exited.
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_TRACK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but a
.Xr UFS 5
or a
.Xr HAMMER 5
file system.
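.Sh EXAMPLES
The following is a minimal sketch, not a definitive implementation, of the
register-and-wait pattern described in
.Sx DESCRIPTION :
it creates a queue with
.Fn kqueue ,
arms a one-shot
.Dv EVFILT_TIMER
event with
.Fn EV_SET ,
and blocks in
.Fn kevent
until the timer fires.
The helper name
.Fn wait_for_timer ,
the timer identifier 1, and the fallback stub for platforms without
.In sys/event.h
are assumptions made for illustration and are not part of the interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#if defined(__FreeBSD__) || defined(__DragonFly__) || defined(__NetBSD__) || \
    defined(__OpenBSD__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the kqueue API): arm a one-shot timer
 * of `ms' milliseconds and block until it fires.  Returns the expiration
 * count reported in the data field, or -1 on error.
 */
int
wait_for_timer(int ms)
{
	struct kevent change, event;
	int kq, n, result = -1;

	assert(ms >= 0);	/* a negative period makes no sense here */

	if ((kq = kqueue()) == -1) {
		perror("kqueue");
		return (-1);
	}

	/* ident 1 is an arbitrary timer identifier chosen for this sketch. */
	EV_SET(&change, 1, EVFILT_TIMER, EV_ADD | EV_ONESHOT, 0, ms, NULL);

	/* Apply the change, then wait for one event; NULL timeout blocks. */
	n = kevent(kq, &change, 1, &event, 1, NULL);
	if (n == -1)
		perror("kevent");
	else if (n > 0 && (event.flags & EV_ERROR))
		/* A failed changelist entry reports the error in data. */
		fprintf(stderr, "kevent: %s\n", strerror((int)event.data));
	else if (n > 0)
		result = (int)event.data;

	close(kq);
	return (result);
}
#else
/* Assumed fallback for platforms without kqueue, so the sketch compiles. */
int
wait_for_timer(int ms)
{
	assert(ms >= 0);
	(void)ms;
	fprintf(stderr, "kqueue is not available on this platform\n");
	return (-1);
}
#endif
```

Under these assumptions, a call such as wait_for_timer(500) would be expected
to block for roughly half a second and then return the expiration count taken
from
.Va data ;
on a platform without kqueue the stub simply returns -1.
Passing a pointer to a zero-valued
.Va timespec
instead of NULL as the last argument would turn the wait into a poll.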