.\" $OpenBSD: kqueue.2,v 1.42 2020/06/22 13:42:06 jmc Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: June 22 2020 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Vt struct kevent .
Calling
.Xr close 2
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
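.Pp
As a minimal sketch of the registration and retrieval cycle described above,
the following function watches the read end of a pipe with the
.Dv EVFILT_READ
filter.
The helper name
.Fn pipe_bytes_pending
and the test payload are illustrative assumptions, not part of the interface:
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <unistd.h>

/*
 * Hypothetical helper, not part of the kqueue API: watch the read
 * end of a pipe and return the byte count reported by the filter,
 * or -1 on failure.
 */
int
pipe_bytes_pending(void)
{
	struct kevent kev;
	int fds[2], kq, n = -1;

	if (pipe(fds) == -1)
		return -1;
	if ((kq = kqueue()) == -1)
		goto done;

	/* Data written before registration is a preexisting condition. */
	if (write(fds[1], "hello", 5) != 5)
		goto done;

	/* Register the (ident, filter) pair (fds[0], EVFILT_READ). */
	EV_SET(&kev, fds[0], EVFILT_READ, EV_ADD, 0, 0, NULL);
	if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
		goto done;

	/* Retrieve one pending kevent; a NULL timeout waits indefinitely. */
	if (kevent(kq, NULL, 0, &kev, 1, NULL) == 1)
		n = (int)kev.data;
done:
	if (kq != -1)
		close(kq);
	close(fds[0]);
	close(fds[1]);
	return n;
}
.Ed
.Pp
Because the data is written before the kevent is registered, the filter
should detect the preexisting condition at registration time, so the second
.Fn kevent
call is expected to return without blocking.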
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Vt kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of
.Vt kevent
structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is not
.Dv NULL ,
it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Vt struct timespec .
If
.Fa timeout
is
.Dv NULL ,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should not be
.Dv NULL ,
but should point to a zero-valued
.Vt struct timespec .
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
.Vt kevent
structure.
.Pp
The
.Vt kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	int64_t	   data;	/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Vt struct kevent
are:
.Bl -tag -width XXXfilter
.It Fa ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It Fa filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It Fa flags
Actions to perform on the event.
.It Fa fflags
Filter-specific flags.
.It Fa data
Filter-specific data value.
.It Fa udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Fa flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DISPATCH
Disable the event source immediately after delivery of an event.
See
.Dv EV_DISABLE
above.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_RECEIPT
Causes
.Fn kevent
to return with
.Dv EV_ERROR
set without draining any pending events after updating events in the kqueue.
When a filter is successfully added, the
.Fa data
field will be zero.
This flag is useful for making bulk changes to a kqueue.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate a filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Fa fflags
and
.Fa data
fields in the
.Vt kevent
structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Xr listen 2
return when there is an incoming connection pending.
.Fa data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Fa fflags ,
and specifying the new low water mark in
.Fa data .
On return,
.Fa data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has shut down, then the filter
also sets
.Dv EV_EOF
in
.Fa flags ,
and returns the socket error (if any) in
.Fa fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Fa data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Fa fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Fa fflags
on return.
.It "FIFOs, Pipes"
Returns when there is data to read;
.Fa data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Fa flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Fa data
contains the number of bytes available.
.El
.It Dv EVFILT_EXCEPT
Takes a descriptor as the identifier, and returns whenever one of the
specified exceptional conditions has occurred on the descriptor.
Conditions are specified in
.Fa fflags .
Currently, a filter can monitor the reception of out-of-band data with
.Dv NOTE_OOB .
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Fa data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Xr unlink 2
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Fa fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Fa data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Xr fork 2 .
.It Dv NOTE_EXEC
The process has executed a new program via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Xr fork 2
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Fa fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Fa fflags
and the parent PID in
.Fa data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Xr signal 3
and
.Xr sigaction 2
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Fa data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Fa ident .
When adding a timer,
.Fa data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Fa data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_DEVICE
Takes a descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occur on the
descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_CHANGE
.It Dv NOTE_CHANGE
A device change event has occurred, e.g. an HDMI cable has been plugged into a port.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Fa flags
and the system error in
.Fa data .
Otherwise, -1 will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Vt kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1
and have been available since
.Ox 2.9 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
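.Sh EXAMPLES
The following sketch registers a periodic
.Dv EVFILT_TIMER
with
.Dv EV_RECEIPT ,
checks the receipt returned with
.Dv EV_ERROR
set, and then counts expirations.
The helper name
.Fn count_timer_ticks ,
its parameters, and the timer identifier 1 are illustrative assumptions,
not part of the interface:
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <unistd.h>

/*
 * Hypothetical helper: wait until a periodic timer with the given
 * period in milliseconds has expired at least nticks times.
 * Returns the number of expirations seen, or -1 on failure.
 */
int
count_timer_ticks(int period_ms, int nticks)
{
	struct kevent kev;
	int kq, seen = 0;

	if ((kq = kqueue()) == -1)
		return -1;

	/* The same structure serves as changelist and eventlist. */
	EV_SET(&kev, 1, EVFILT_TIMER, EV_ADD | EV_RECEIPT, 0,
	    period_ms, NULL);
	if (kevent(kq, &kev, 1, &kev, 1, NULL) != 1 ||
	    !(kev.flags & EV_ERROR) || kev.data != 0) {
		/* A receipt with nonzero data carries the system error. */
		close(kq);
		return -1;
	}

	while (seen < nticks) {
		if (kevent(kq, NULL, 0, &kev, 1, NULL) != 1) {
			close(kq);
			return -1;
		}
		/* data counts expirations since the last kevent() call. */
		seen += (int)kev.data;
	}

	close(kq);
	return seen;
}
.Ed
.Pp
Because the filter aggregates expirations into a single kevent,
the returned count may exceed
.Fa nticks
if the caller is slow to retrieve events.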