.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.1.2.16 2002/07/02 21:05:08 mp Exp $
.\"
.Dd April 14, 2000
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kevent
.Nd kernel event notification mechanism
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed filters.
A kevent is identified by the (ident, filter) pair; there may be only
one kevent with a given pair per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single struct kevent.
Calling
.Fn close
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
However, if
.Xr rfork 2
is called without the
.Dv RFFDG
flag, then the descriptor table is shared,
which will allow sharing of the kqueue between two processes.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Va kevent
structures, as defined in
.Aq Pa sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of kevent structures.
.Fa nevents
determines the size of
.Fa eventlist .
If
.Fa timeout
is a non-NULL pointer, it specifies a maximum interval to wait
for an event, which will be interpreted as a struct timespec.  If
.Fa timeout
is a NULL pointer,
.Fn kevent
waits indefinitely.  To effect a poll, the
.Fa timeout
argument should be non-NULL, pointing to a zero-valued
.Va timespec
structure.  The same array may be used for the
.Fa changelist
and
.Fa eventlist .
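.Pp
As an illustration of the timeout conventions above, the following sketch
converts a millisecond count into the
.Va timespec
form that
.Fn kevent
interprets.
The helper name
.Fn ms_to_timespec
is hypothetical and is not provided by any system header.
An argument of zero yields the zero-valued structure used to effect a poll.
.Bd -literal
```c
#include <assert.h>
#include <time.h>

/*
 * Convert a non-negative millisecond count into a struct timespec.
 * ms_to_timespec is a hypothetical helper, not a system interface.
 */
struct timespec
ms_to_timespec(long ms)
{
	struct timespec ts;

	assert(ms >= 0);
	ts.tv_sec = ms / 1000;			/* whole seconds */
	ts.tv_nsec = (ms % 1000) * 1000000L;	/* remainder in nanoseconds */
	return (ts);
}
```
.Ed
.Pp
A caller would store the result in a local variable and pass its address as
.Fa timeout ;
passing a NULL pointer instead requests an indefinite wait.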
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
kevent structure.
.Pp
The
.Va kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t ident;	/* identifier for this event */
	short	  filter;	/* filter for event */
	u_short	  flags;	/* action flags for kqueue */
	u_int	  fflags;	/* filter flag value */
	intptr_t  data;		/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Fa struct kevent
are:
.Bl -tag -width XXXfilter
.It ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It filter
Identifies the kernel filter used to process this event.  The pre-defined
system filters are described below.
.It flags
Actions to perform on the event.
.It fflags
Filter-specific flags.
.It data
Filter-specific data value.
.It udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Va flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It EV_ADD
Adds the event to the kqueue.  Re-adding an existing event
will modify the parameters of the original event, and not result
in a duplicate entry.  Adding an event automatically enables it,
unless overridden by the EV_DISABLE flag.
.It EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It EV_DISABLE
Disable the event so
.Fn kevent
will not return it.  The filter itself is not disabled.
.It EV_DELETE
Removes the event from the kqueue.  Events which are attached to
file descriptors are automatically deleted on the last close of
the descriptor.
.It EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.  After the user retrieves the event from the kqueue,
it is deleted.
.It EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.  Note that some filters may automatically
set this flag internally.
.It EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Va fflags
and
.Va data
fields in the kevent structure.
.Bl -tag -width EVFILT_SIGNAL
.It EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Pp
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Fn listen
return when there is an incoming connection pending.
.Va data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
NOTE_LOWAT
flag in
.Va fflags ,
and specifying the new low water mark in
.Va data .
On return,
.Va data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has shut down, then the filter
also sets EV_EOF in
.Va flags ,
and returns the socket error (if any) in
.Va fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Va data
contains the offset from current position to end of file,
and may be negative.
.It "Fifos, Pipes"
Returns when there is data to read;
.Va data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set EV_EOF in
.Va flags .
This may be cleared by passing in EV_CLEAR, at which point the
filter will resume waiting for data to become available before
returning.
.El
.It EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.  For sockets, pipes
and fifos,
.Va data
will contain the amount of space remaining in the write buffer.
The filter will set EV_EOF when the reader disconnects, and for
the fifo case, this may be cleared by use of EV_CLEAR.
Note that this filter is not supported for vnodes.
.Pp
For sockets, the low water mark and socket error handling is
identical to the EVFILT_READ case.
.It EVFILT_AIO
The sigevent portion of the AIO request is filled in, with
.Va sigev_notify_kqueue
containing the descriptor of the kqueue that the event should
be attached to,
.Va sigev_value
containing the udata value, and
.Va sigev_notify
set to SIGEV_KEVENT.
When the aio_* function is called, the event will be registered
with the specified kqueue, and the
.Va ident
argument set to the
.Fa struct aiocb
returned by the aio_* function.
The filter returns under the same conditions as aio_error.
.Pp
Alternatively, a kevent structure may be initialized, with
.Va ident
containing the descriptor of the kqueue, and the
address of the kevent structure placed in the
.Va aio_lio_opcode
field of the AIO request.  However, this approach will not work on
architectures with 64-bit pointers, and should be considered deprecated.
.It EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Va fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It NOTE_DELETE
.Fn unlink
was called on the file referenced by the descriptor.
.It NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It NOTE_EXTEND
The file referenced by the descriptor was extended.
.It NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It NOTE_LINK
The link count on the file changed.
.It NOTE_RENAME
The file referenced by the descriptor was renamed.
.It NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Va fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It NOTE_EXIT
The process has exited.
.It NOTE_FORK
The process has called
.Fn fork .
.It NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It NOTE_TRACK
Follow a process across
.Fn fork
calls.  The parent process will return with NOTE_TRACK set in the
.Va fflags
field, while the child process will return with NOTE_CHILD set in
.Va fflags
and the parent PID in
.Va data .
.It NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Va fflags
contains the events which triggered the filter.
.It EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Fn signal
and
.Fn sigaction
facilities, and has a lower precedence.  The filter will record
all attempts to deliver a signal to a process, even if the signal has
been marked as SIG_IGN.  Event notification happens after normal
signal delivery processing.
.Va data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.It EVFILT_TIMER
Establishes an arbitrary timer identified by
.Va ident .
When adding a timer,
.Va data
specifies the timeout period in milliseconds.
The timer will be periodic unless EV_ONESHOT is specified.
On return,
.Va data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets the EV_CLEAR flag internally.
.El
.Sh RETURN VALUES
.Fn kqueue
creates a new kernel event queue and returns a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and errno set.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
function fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Va kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr aio_error 2 ,
.Xr aio_read 2 ,
.Xr aio_return 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq jlemon@FreeBSD.org .
.Sh BUGS
It is currently not possible to watch a
.Xr vnode 9
that resides on anything but
a UFS file system.