.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\" $DragonFly: src/lib/libc/sys/kqueue.2,v 1.6 2007/09/07 08:14:57 swildner Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; a kqueue may
hold at most one kevent for a given pair.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.
If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width ".Dv EV_ONESHOT"
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original
event, and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically
deleted on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width ".Dv EVFILT_SIGNAL"
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the fifo case, this may be cleared
by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_KEVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit
pointers, and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_RENAME"
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_TRACKERR"
.It Dv NOTE_EXIT
The process has exited.
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_TRACK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but
a UFS file system.
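.Sh EXAMPLES
The three
.Fa timeout
cases given in
.Sx DESCRIPTION
(wait indefinitely, poll, wait for a bounded interval) can be sketched
as a small helper.
.Fn kevent_timeout
is a hypothetical name used only for illustration and is not provided by
the system; treating a negative millisecond count as
.Dq wait indefinitely
is likewise an assumption of this sketch, not documented behavior.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not a system interface): translate a millisecond
 * count into the pointer passed as the timeout argument of kevent().
 *   ms < 0   returns NULL, so kevent() waits indefinitely
 *   ms == 0  returns a zero-valued timespec, so kevent() polls
 *   ms > 0   returns the interval as seconds plus nanoseconds
 * The caller supplies the storage for the timespec in ts.
 */
static const struct timespec *
kevent_timeout(long ms, struct timespec *ts)
{
	if (ms < 0)
		return (NULL);
	ts->tv_sec = ms / 1000;
	ts->tv_nsec = (ms % 1000) * 1000000L;
	return (ts);
}
```

The returned pointer can be passed directly as the last argument of
.Fn kevent .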