.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\" $DragonFly: src/lib/libc/sys/kqueue.2,v 1.3 2006/05/26 19:39:37 swildner Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; a given pair
may appear at most once per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.  If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.  To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.  The same array may be used for the
.Fa changelist
and
.Fa eventlist .
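.Pp
The polling form described above can be sketched as follows.
This is an illustration only, not a definitive usage pattern: the variable
names are arbitrary, and because nothing has been registered on the queue
the call should return at once without reporting any event.
.Bd -literal
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	struct kevent ev;
	struct timespec zero = { 0, 0 };	/* zero-valued: poll */
	int kq, n;

	if ((kq = kqueue()) == -1)
		err(1, "kqueue");

	/* No changes to apply; room for one returned event. */
	n = kevent(kq, NULL, 0, &ev, 1, &zero);
	if (n == -1)
		err(1, "kevent");
	printf("%d event(s) pending\en", n);

	close(kq);
	return (0);
}
.Ed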
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.  The pre-defined
system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
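.Pp
As a sketch of how these fields are typically filled in, the fragment below
initializes one structure with
.Fn EV_SET
and a second one by hand; both are meant to describe the same registration.
The function name, the choice of standard input as the descriptor, and the
.Va udata
string are placeholders invented for the example.
.Bd -literal
#include <sys/types.h>
#include <sys/event.h>
#include <string.h>
#include <unistd.h>

static char tag[] = "stdin";	/* hypothetical per-event context */

void
fill_example(struct kevent *a, struct kevent *b)
{
	/* Using the macro. */
	EV_SET(a, STDIN_FILENO, EVFILT_READ, EV_ADD, 0, 0, tag);

	/* Equivalent field-by-field initialization. */
	memset(b, 0, sizeof(*b));
	b->ident = STDIN_FILENO;	/* a file descriptor here */
	b->filter = EVFILT_READ;
	b->flags = EV_ADD;
	b->fflags = 0;
	b->data = 0;
	b->udata = tag;			/* handed back unchanged */
}
.Ed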
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It EV_ADD
Adds the event to the kqueue.  Re-adding an existing event
will modify the parameters of the original event, and not result
in a duplicate entry.  Adding an event automatically enables it,
unless overridden by the EV_DISABLE flag.
.It EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It EV_DISABLE
Disable the event so
.Fn kevent
will not return it.  The filter itself is not disabled.
.It EV_DELETE
Removes the event from the kqueue.  Events which are attached to
file descriptors are automatically deleted on the last close of
the descriptor.
.It EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.  After the user retrieves the event from the kqueue,
it is deleted.
.It EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.  Note that some filters may automatically
set this flag internally.
.It EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It EV_ERROR
See
.Sx RETURN VALUES
below.
.El
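.Pp
The fragment below sketches one way several of these flags might be used
in sequence.
It assumes that
.Va kq
was returned by
.Fn kqueue
and that
.Va fd
is an open descriptor supplied by the caller; the function name is
hypothetical.
Passing a zero
.Fa nevents
makes each call apply its change and return without waiting for events.
.Bd -literal
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <stddef.h>

int
flags_example(int kq, int fd)
{
	struct kevent kev;

	/* Register the event but keep it disabled for now. */
	EV_SET(&kev, fd, EVFILT_READ, EV_ADD | EV_DISABLE, 0, 0, NULL);
	if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
		return (-1);

	/* Allow kevent() to return it from now on. */
	EV_SET(&kev, fd, EVFILT_READ, EV_ENABLE, 0, 0, NULL);
	if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
		return (-1);

	/* Remove it again without closing the descriptor. */
	EV_SET(&kev, fd, EVFILT_READ, EV_DELETE, 0, 0, NULL);
	return (kevent(kq, &kev, 1, NULL, 0, NULL));
}
.Ed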
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets EV_EOF in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set EV_EOF in
.Va flags .
This may be cleared by passing in EV_CLEAR, at which point the
filter will resume waiting for data to become available before
returning.
.El
.It EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.  For sockets, pipes
and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set EV_EOF when the reader disconnects, and for
the fifo case, this may be cleared by use of EV_CLEAR.
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling are
identical to the EVFILT_READ case.
.It EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to SIGEV_KEVENT.
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.  However, this approach will not work on
architectures with 64-bit pointers, and should be considered deprecated.
.It EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It NOTE_EXTEND
The file referenced by the descriptor was extended.
.It NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It NOTE_LINK
The link count on the file changed.
.It NOTE_RENAME
The file referenced by the descriptor was renamed.
.It NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It NOTE_EXIT
The process has exited.
.It NOTE_FORK
The process has called
.Fn fork .
.It NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It NOTE_TRACK
Follow a process across
.Fn fork
calls.  The parent process will return with NOTE_TRACK set in the
.Va fflags
field, while the child process will return with NOTE_CHILD set in
.Va fflags
and the parent PID in
.Va data .
.It NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.  The filter will record
all attempts to deliver a signal to a process, even if the signal has
been marked as SIG_IGN.  Event notification happens after normal
signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.It EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless EV_ONESHOT is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.El
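.Pp
As an illustration of one of the filters above, the following sketch arms a
periodic EVFILT_TIMER and waits for it a few times.
The timer identifier of 1, the 500 millisecond period and the loop count
are arbitrary choices made for the example.
.Bd -literal
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	struct kevent kev;
	int i, kq;

	if ((kq = kqueue()) == -1)
		err(1, "kqueue");

	/* ident 1 names the timer; data is the period in ms. */
	EV_SET(&kev, 1, EVFILT_TIMER, EV_ADD, 0, 500, NULL);
	if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
		err(1, "kevent register");

	for (i = 0; i < 3; i++) {
		/* NULL timeout: block until the timer fires. */
		if (kevent(kq, NULL, 0, &kev, 1, NULL) == -1)
			err(1, "kevent wait");
		printf("timer %lu expired %ld time(s)\en",
		    (unsigned long)kev.ident, (long)kev.data);
	}

	close(kq);
	return (0);
}
.Ed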
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno is set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
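.Pp
The return conventions above might be handled along the following lines.
This is a sketch only: the function name and the array size are
hypothetical, and the caller is assumed to supply a valid kqueue
descriptor and changelist.
.Bd -literal
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <stdio.h>
#include <string.h>

#define NEV	16	/* arbitrary eventlist size */

int
apply_and_wait(int kq, const struct kevent *chg, int nchg,
    const struct timespec *ts)
{
	struct kevent ev[NEV];
	int i, n;

	n = kevent(kq, chg, nchg, ev, NEV, ts);
	if (n == -1) {
		perror("kevent");	/* errno describes the failure */
		return (-1);
	}
	if (n == 0)
		return (0);		/* time limit expired */

	for (i = 0; i < n; i++) {
		if (ev[i].flags & EV_ERROR) {
			/* data carries the system error for this change */
			fprintf(stderr, "ident %lu: %s\en",
			    (unsigned long)ev[i].ident,
			    strerror((int)ev[i].data));
			continue;
		}
		/* Otherwise ev[i] describes a triggered event. */
	}
	return (n);
}
.Ed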
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but
a UFS file system.
476