.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\" $DragonFly: src/lib/libc/sys/kqueue.2,v 1.7 2008/05/02 02:05:04 swildner Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.
If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width ".Dv EV_ONESHOT"
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original
event, and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically
deleted on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width ".Dv EVFILT_SIGNAL"
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has shut down, then the filter also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the fifo case, this may be cleared
by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_KEVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit
pointers, and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_RENAME"
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_TRACKERR"
.It Dv NOTE_EXIT
The process has exited.
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_TRACK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but
a UFS file system.
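.Sh EXAMPLES
The listing below is a minimal sketch of the registration and retrieval
steps from
.Sx DESCRIPTION
and the error reporting from
.Sx RETURN VALUES ,
not a definitive implementation.
It registers a single
.Dv EVFILT_READ
event with
.Fn EV_SET
and waits for it with a bounded
.Fa timeout .
The helper names
.Fn ms_to_timespec
and
.Fn wait_readable
are hypothetical.
The platform guard is an assumption made so that the listing still compiles
on systems that lack
.In sys/event.h .

```c
#include <errno.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: convert a non-negative millisecond count into the
 * struct timespec that kevent() takes as its timeout argument.
 * ms_to_timespec(0) yields the zero-valued timespec used to poll.
 */
struct timespec
ms_to_timespec(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	return (ts);
}

/* Assumption: only these systems are treated as providing kqueue. */
#if defined(__FreeBSD__) || defined(__DragonFly__) || \
    defined(__NetBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if the event fired, 0 if the time limit expired, and -1 on
 * error with errno set.
 */
int
wait_readable(int fd, long timeout_ms)
{
	struct kevent change, event;
	struct timespec ts = ms_to_timespec(timeout_ms);
	int kq, n;

	if ((kq = kqueue()) == -1)
		return (-1);

	/* EV_ADD registers the (ident, filter) pair and enables it. */
	EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);

	/* Apply the change, then wait for at most one event. */
	n = kevent(kq, &change, 1, &event, 1, &ts);
	close(kq);

	if (n == -1)
		return (-1);
	if (n == 1 && (event.flags & EV_ERROR)) {
		/* The change was rejected; data carries the error. */
		errno = (int)event.data;
		return (-1);
	}
	return (n);
}
#endif
```

Under these assumptions, a call such as
.Fn wait_readable STDIN_FILENO 1500
would block for at most 1.5 seconds, and passing 0 for the second argument
would poll without blocking.
Creating and closing a kqueue on every call keeps the sketch short;
a long-lived program would normally keep one kqueue open and reuse it.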