.\" $OpenBSD: kqueue.2,v 1.44 2021/04/22 15:30:12 visa Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: April 22 2021 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Vt struct kevent .
Calling
.Xr close 2
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Vt kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of
.Vt kevent
structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is not
.Dv NULL ,
it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Vt struct timespec .
If
.Fa timeout
is
.Dv NULL ,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should not be
.Dv NULL ,
but should instead point to a zero-valued
.Vt struct timespec .
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
.Vt kevent
structure.
.Pp
The
.Vt kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t	ident;	/* identifier for this event */
	short		filter;	/* filter for event */
	u_short		flags;	/* action flags for kqueue */
	u_int		fflags;	/* filter flag value */
	int64_t		data;	/* filter data value */
	void		*udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Vt struct kevent
are:
.Bl -tag -width XXXfilter
.It Fa ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It Fa filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It Fa flags
Actions to perform on the event.
.It Fa fflags
Filter-specific flags.
.It Fa data
Filter-specific data value.
.It Fa udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Fa flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DISPATCH
Disable the event source immediately after delivery of an event.
See
.Dv EV_DISABLE
above.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_RECEIPT
Causes
.Fn kevent
to return with
.Dv EV_ERROR
set without draining any pending events after updating events in the kqueue.
When a filter is successfully added, the
.Fa data
field will be zero.
This flag is useful for making bulk changes to a kqueue.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate a filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Fa fflags
and
.Fa data
fields in the
.Vt kevent
structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Xr listen 2
return when there is an incoming connection pending.
.Fa data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Fa fflags ,
and specifying the new low water mark in
.Fa data .
On return,
.Fa data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Fa flags ,
and returns the socket error (if any) in
.Fa fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Fa data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Fa fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Fa fflags
on return.
.It "FIFOs, Pipes"
Returns when there is data to read;
.Fa data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Fa flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Fa data
contains the number of bytes available.
.El
.It Dv EVFILT_EXCEPT
Takes a descriptor as the identifier, and returns whenever one of the
specified exceptional conditions has occurred on the descriptor.
Conditions are specified in
.Fa fflags .
Currently, a filter can monitor the reception of out-of-band data
on a socket or pseudo terminal with
.Dv NOTE_OOB .
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Fa data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Xr unlink 2
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Fa fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Fa data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Xr fork 2 .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Xr fork 2
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Fa fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Fa fflags
and the parent PID in
.Fa data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Xr signal 3
and
.Xr sigaction 2
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Fa data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Fa ident .
When adding a timer,
.Fa data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Fa data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.Pp
If an existing timer is re-added, the existing timer and related pending events
will be cancelled.
The timer will be re-started using the timeout period
.Fa data .
.It Dv EVFILT_DEVICE
Takes a descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occur on the
descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_CHANGE
.It Dv NOTE_CHANGE
A device change event has occurred, e.g. an HDMI cable has been plugged into a port.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Fa flags
and the system error in
.Fa data .
Otherwise, -1 will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Vt kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1
and have been available since
.Ox 2.9 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .
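.Sh EXAMPLES
The three
.Fa timeout
cases described in
.Sx DESCRIPTION
(wait indefinitely, poll, and wait for a bounded interval)
can be sketched with a small helper.
This is a minimal illustration and not part of the kqueue interface:
.Fn make_timeout
is a hypothetical name, and treating a negative millisecond count as
.Dq wait indefinitely
is an assumption made only for this sketch.
.Bd -literal
```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper, not part of the kqueue API: turn a timeout in
 * milliseconds into the pointer kevent() takes as its last argument.
 *
 *   ms < 0   returns NULL, so kevent() waits indefinitely
 *   ms == 0  returns a zero-valued timespec, so kevent() polls
 *   ms > 0   returns a timespec of ms milliseconds
 *
 * The "negative means wait forever" convention is an assumption of
 * this sketch.  The caller supplies the storage in *ts.
 */
static const struct timespec *
make_timeout(long ms, struct timespec *ts)
{
	if (ms < 0)
		return NULL;
	ts->tv_sec = ms / 1000;
	ts->tv_nsec = (ms % 1000) * 1000000L;
	return ts;
}
```
.Ed
.Pp
With such a helper, a call like
.Fn kevent kq NULL 0 &ev 1 "make_timeout(250, &ts)"
would wait at most 250 milliseconds for a single event; the names
.Va kq ,
.Va ev ,
and
.Va ts
are likewise placeholders.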