.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\" $DragonFly: src/lib/libc/sys/kqueue.2,v 1.7 2008/05/02 02:05:04 swildner Exp $
.\"
.Dd December 3, 2008
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; a given pair
may appear only once per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and to return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.
If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width ".Dv EV_ONESHOT"
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original
event, and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically
deleted on the last close of the descriptor.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate a filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width ".Dv EVFILT_SIGNAL"
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Va flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.El
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the fifo case, this may be cleared
by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.It Dv EVFILT_EXCEPT
Takes a descriptor as the identifier, and returns whenever one of the
specified exceptional conditions has occurred on the descriptor.
Conditions are specified in
.Va fflags .
Currently, a filter can monitor the reception of out-of-band data with
.Dv NOTE_OOB .
.It Dv EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to
.Dv SIGEV_KEVENT .
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.
However, this approach will not work on architectures with 64-bit
pointers, and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_RENAME"
.It Dv NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width ".Dv NOTE_TRACKERR"
.It Dv NOTE_EXIT
The process has exited.
.It Dv NOTE_FORK
The process has called
.Fn fork .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Fn fork
calls.
The parent process will return with
.Dv NOTE_TRACK
set in the
.Va fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Va fflags
and the parent PID in
.Va data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless
.Dv EV_ONESHOT
is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but a
.Xr UFS 5
or a
.Xr HAMMER 5
file system.