.\" -*- nroff -*-
.\"
.\"	$NetBSD: bpf.4,v 1.17 2002/02/13 08:17:31 ross Exp $
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd June 28, 1994
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter raw network interface
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter 16"
.Sh DESCRIPTION
The Berkeley Packet Filter
provides a raw interface to data link layers in a protocol
independent fashion.
All packets on the network, even those destined for other hosts,
are accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf0 ,
.Pa /dev/bpf1 ,
etc.
After opening the device, the file descriptor must be bound to a
specific network interface with the
.Dv BIOCSETIF
ioctl.
A given interface can be shared by multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
The total number of open
files is limited to the value given in the kernel configuration; the
example given in the SYNOPSIS above sets the limit to 16.
.Pp
A separate device file is required for each minor device.
If a file is in use, the open will fail and
.Va errno
will be set to EBUSY.
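.Pp
The open-and-retry behavior described above can be sketched in C as follows.
This is a minimal illustration, not part of the driver interface: the helper
names bpf_unit_path and open_first_free_bpf and the max_units parameter are
hypothetical, and the loop assumes that any error other than EBUSY means
further probing is pointless.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: format the device path for unit `unit`. */
static int
bpf_unit_path(int unit, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/dev/bpf%d", unit);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Hypothetical helper: open the first unit that is not busy.
 * Returns a descriptor, or -1 if no unit could be opened.
 */
static int
open_first_free_bpf(int max_units)
{
	char path[32];
	int unit, fd;

	for (unit = 0; unit < max_units; unit++) {
		if (bpf_unit_path(unit, path, sizeof(path)) != 0)
			return -1;
		fd = open(path, O_RDWR);
		if (fd >= 0)
			return fd;
		if (errno != EBUSY)
			break;	/* e.g. ENOENT or EACCES: stop probing */
	}
	return -1;
}
```

A caller would typically pass the limit from the kernel configuration
(16 in the SYNOPSIS above) as max_units and then bind the returned
descriptor with BIOCSETIF.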
.Pp
Associated with each open instance of a
.Nm
file is a user-settable packet filter.
Whenever a packet is received by an interface,
all file descriptors listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets
that have matched the filter.
To improve performance, the buffer passed to read must be
the same size as the buffers used internally by
.Nm "" .
This size is returned by the
.Dv BIOCGBLEN
ioctl (see below), and under
BSD, can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily
truncated.
.Pp
The packet filter will support any link level protocol that has fixed length
headers.  Currently, only Ethernet, SLIP and PPP drivers have been
modified to interact with
.Nm "" .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
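.Pp
As a hedged illustration of the byte order rule above, the helper below
reads a 16-bit field from captured packet data.
The name read_be16 is hypothetical; ntohs is one of the byteorder(3)
routines, and memcpy is used so that the access does not depend on the
alignment of the field.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 16-bit network-order field at byte
 * offset `off` and convert it to host order.
 * Returns 0 on success, -1 if the field lies outside the buffer.
 */
static int
read_be16(const unsigned char *pkt, size_t len, size_t off, uint16_t *out)
{
	uint16_t raw;

	if (len < 2 || off > len - 2)
		return -1;
	memcpy(&raw, pkt + off, sizeof(raw));	/* alignment-safe copy */
	*out = ntohs(raw);
	return 0;
}
```

For an Ethernet frame, offset 12 is the type field, the same offset that
the filters in EXAMPLES load first.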
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.  The writes are unbuffered, meaning only one
packet can be processed per write.
Currently, only writes to Ethernets and SLIP links are supported.
.Sh IOCTLS
The
.Xr ioctl 2
command codes below are defined in \*[Lt]net/bpf.h\*[Gt].  All commands require
these includes:
.Bd -literal -offset indent
.Fd #include \*[Lt]sys/types.h\*[Gt]
.Fd #include \*[Lt]sys/time.h\*[Gt]
.Fd #include \*[Lt]sys/ioctl.h\*[Gt]
.Fd #include \*[Lt]net/bpf.h\*[Gt]
.Ed
.Pp
Additionally, BIOCGETIF and BIOCSETIF require
.Pa \*[Lt]net/if.h\*[Gt] .
.Pp
The (third) argument to the
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width indent -offset indent
.It Dv "BIOCGBLEN (u_int)"
Returns the required buffer length for reads on
.Nm
files.
.It Dv "BIOCSBLEN (u_int)"
Sets the buffer length for reads on
.Nm
files.  The buffer must be set before the file is attached to an interface
with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest
allowable size will be set and returned in the argument.
A read call will result in EIO if it is passed a buffer that is not this size.
.It Dv BIOCGDLT (u_int)
Returns the type of the data link layer underlying the attached interface.
EINVAL is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in \*[Lt]net/bpf.h\*[Gt].
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface,
a listener that opened its interface non-promiscuously may receive
packets promiscuously.  This problem can be remedied with an
appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF (struct ifreq)
Returns the name of the hardware interface that the file is listening on.
The name is returned in the ifr_name field of
.Fa ifr .
All other fields are undefined.
.It Dv BIOCSETIF (struct ifreq)
Sets the hardware interface associated with the file.  This
command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
field of the
.Fa ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
Set or get the read timeout parameter.
The
.Fa timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS (struct bpf_stat)
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
and
.It Va bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet traffic).
.El
.It Dv BIOCIMMEDIATE (u_int)
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.  Otherwise, a read will block until either the kernel buffer
becomes full or a timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.It Dv BIOCSETF (struct bpf_program)
Sets the filter program used by the kernel to discard uninteresting
packets.  An array of instructions and its length is passed in using
the following structure:
.Bd -literal -offset indent
struct bpf_program {
	int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
field while its length in units of
.Sq struct bpf_insn
is given by the
.Va bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCVERSION (struct bpf_version)
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.  Before installing a filter, applications must check
that the current version is compatible with the running kernel.  Version
numbers are compatible if the major numbers match and the application minor
is less than or equal to the kernel minor.  The kernel version number is
returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from \*[Lt]net/bpf.h\*[Gt].
An incompatible filter
may result in undefined behavior (most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCSRSIG BIOCGRSIG (u_int signal)
Set or get the receive signal.  This signal will be sent to the process or process group
specified by FIOSETOWN.  It defaults to SIGIO.
.El
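.Pp
The compatibility rule given for BIOCVERSION above can be written as a
small predicate.
The sketch below uses a local stand-in structure (version_pair, a
hypothetical name, as is filter_version_ok) rather than struct bpf_version,
so that it compiles without \*[Lt]net/bpf.h\*[Gt].

```c
#include <assert.h>

/* Local stand-in for the two fields of struct bpf_version. */
struct version_pair {
	unsigned short v_major;
	unsigned short v_minor;
};

/*
 * Returns 1 when a filter built against version `app` may be installed
 * on a kernel reporting version `kern`: the major numbers must match and
 * the application minor must not exceed the kernel minor.
 */
static int
filter_version_ok(struct version_pair app, struct version_pair kern)
{
	return app.v_major == kern.v_major && app.v_minor <= kern.v_minor;
}
```

An application would fill the first argument from BPF_MAJOR_VERSION and
BPF_MINOR_VERSION and the second from the values returned by BIOCVERSION.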
.Sh STANDARD IOCTLS
.Nm
now supports several standard
.Xr ioctl 2 's
which allow the user to do async and/or non-blocking I/O to an open
.Nm
file descriptor.
.Bl -tag -width indent -offset indent
.It Dv FIONREAD (int)
Returns the number of bytes that are immediately available for reading.
.It Dv SIOCGIFADDR (struct ifreq)
Returns the address associated with the interface.
.It Dv FIONBIO (int)
Set or clear non-blocking I/O.  If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
.Va errno
will be set to EAGAIN.
If arg is zero, non-blocking I/O is disabled.  Note:  setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC (int)
Enable or disable async I/O.  When enabled (arg is non-zero), the process or
process group specified by FIOSETOWN will start receiving SIGIO's when packets
arrive.
Note that you must do an FIOSETOWN in order for this to take effect, as
the system will not default this for you.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN FIOGETOWN (int)
Set or get the process or process group (if negative) that should receive SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
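.Pp
The FIONBIO behavior described above can be exercised on any descriptor.
The following sketch assumes nothing about bpf itself; set_nonblocking and
try_read are hypothetical helper names, and on a bpf descriptor the buffer
passed to read must still have the size reported by BIOCGBLEN.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Set (on != 0) or clear (on == 0) non-blocking I/O on `fd` via FIONBIO. */
static int
set_nonblocking(int fd, int on)
{
	return ioctl(fd, FIONBIO, &on);
}

/*
 * Hypothetical helper: attempt one read.
 * Returns 1 and stores the byte count in *nread on success,
 * 0 if no data is available yet (EAGAIN), and -1 on any other error.
 */
static int
try_read(int fd, void *buf, size_t len, ssize_t *nread)
{
	ssize_t n = read(fd, buf, len);

	if (n >= 0) {
		*nread = n;
		return 1;
	}
	return (errno == EAGAIN) ? 0 : -1;
}
```

A caller that gets 0 back from try_read can wait for the descriptor with
select(2) and then retry.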
.Sh BPF HEADER
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct timeval bh_tstamp;
	u_long bh_caplen;
	u_long bh_datalen;
	u_short bh_hdrlen;
};
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
The time at which the packet was processed by the packet filter.
.It Va bh_caplen
The length of the captured portion of the packet.  This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
and the
.Em network layer
header will be word aligned.  Suitable precautions
must be taken when accessing the link layer protocol fields on alignment
restricted machines.  (This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.  This requires that an application
has some knowledge of how to get from packet to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.Pa \*[Lt]net/bpf.h\*[Gt]
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is BPF_ALIGNMENT bytes wide).
.Pp
For example, if
.Sq Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
.Dl p = (struct bpf_hdr *)((char *)p + BPF_WORDALIGN(p-\*[Gt]bh_hdrlen + p-\*[Gt]bh_caplen))
.Pp
For the alignment mechanisms to work properly, the
buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
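.Pp
The packet-to-packet stepping described above can be sketched as follows.
So that the fragment compiles without \*[Lt]net/bpf.h\*[Gt], it uses local
stand-ins (demo_hdr, DEMO_ALIGNMENT, DEMO_WORDALIGN) that mirror
struct bpf_hdr and BPF_WORDALIGN as described in this section; these names,
and the demo_append helper that builds synthetic records for testing, are
assumptions of the sketch.
The stand-in header length is simply the aligned structure size, whereas the
kernel also accounts for the link level header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/* Local stand-ins mirroring the header layout and macro described above. */
struct demo_hdr {
	struct timeval	dh_tstamp;
	unsigned long	dh_caplen;
	unsigned long	dh_datalen;
	unsigned short	dh_hdrlen;
};
#define DEMO_ALIGNMENT	sizeof(long)
#define DEMO_WORDALIGN(x) \
	(((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

/*
 * Test helper: write one record (header, then `caplen` payload bytes) at
 * offset `off` and return the word-aligned offset of the next record.
 */
static size_t
demo_append(unsigned char *buf, size_t off, const void *data, size_t caplen)
{
	struct demo_hdr h;

	memset(&h, 0, sizeof(h));
	h.dh_caplen = caplen;
	h.dh_datalen = caplen;
	h.dh_hdrlen = (unsigned short)DEMO_WORDALIGN(sizeof(h));
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + h.dh_hdrlen, data, caplen);
	return off + DEMO_WORDALIGN(h.dh_hdrlen + caplen);
}

/*
 * Walk `len` bytes of records.  Returns the number of packets seen and
 * stores the sum of their captured lengths in *total_caplen.
 */
static int
demo_walk(const unsigned char *buf, size_t len, size_t *total_caplen)
{
	size_t off = 0;
	int count = 0;

	*total_caplen = 0;
	while (off + sizeof(struct demo_hdr) <= len) {
		struct demo_hdr h;

		memcpy(&h, buf + off, sizeof(h));	/* alignment-safe copy */
		if (h.dh_hdrlen < sizeof(h) ||
		    off + h.dh_hdrlen + h.dh_caplen > len)
			break;		/* malformed or truncated record */
		*total_caplen += h.dh_caplen;
		count++;
		off += DEMO_WORDALIGN(h.dh_hdrlen + h.dh_caplen);
	}
	return count;
}
```

With the real definitions, the same loop would walk the bytes returned by a
single read(2), using bh_hdrlen and bh_caplen from each header.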
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
and implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	u_short	code;
	u_char 	jt;
	u_char 	jf;
	long k;
};
.Ed
.Pp
The
.Va k
field is used in different ways by different instructions,
and the
.Va jt
and
.Va jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.  Various other mode and
operator bits are or'd into the class to give the actual instructions.
The classes and modes are defined in \*[Lt]net/bpf.h\*[Gt].
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
addressed in word units.  The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
and
.Va jf
are the corresponding fields in the
instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width indent -offset indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.  The type of the
source operand is specified by an
.Dq addressing mode
and can be a constant
.No ( Ns Sy BPF_IMM Ns ) ,
packet data at a fixed offset
.No ( Ns Sy BPF_ABS Ns ) ,
packet data at a variable offset
.No ( Ns Sy BPF_IND Ns ) ,
the packet length
.No ( Ns Sy BPF_LEN Ns ) ,
or a word in the scratch memory store
.No ( Ns Sy BPF_MEM Ns ) .
For
.Sy BPF_IND
and
.Sy BPF_ABS ,
the data size must be specified as a word
.No ( Ns Sy BPF_W Ns ) ,
halfword
.No ( Ns Sy BPF_H Ns ) ,
or byte
.No ( Ns Sy BPF_B Ns ) .
The semantics of all the recognized BPF_LD instructions follow.
.Bl -column "BPF_LD+BPF_W+BPF_ABS" "A \*[Lt]- P[k:4]" -width indent -offset indent
.It Sy BPF_LD+BPF_W+BPF_ABS Ta A \*[Lt]- P[k:4]
.It Li Sy BPF_LD+BPF_H+BPF_ABS Ta A \*[Lt]- P[k:2]
.It Li Sy BPF_LD+BPF_B+BPF_ABS Ta A \*[Lt]- P[k:1]
.It Li Sy BPF_LD+BPF_W+BPF_IND Ta A \*[Lt]- P[X+k:4]
.It Li Sy BPF_LD+BPF_H+BPF_IND Ta A \*[Lt]- P[X+k:2]
.It Li Sy BPF_LD+BPF_B+BPF_IND Ta A \*[Lt]- P[X+k:1]
.It Li Sy BPF_LD+BPF_W+BPF_LEN Ta A \*[Lt]- len
.It Li Sy BPF_LD+BPF_IMM Ta A \*[Lt]- k
.It Li Sy BPF_LD+BPF_MEM Ta A \*[Lt]- M[k]
.El
.It Sy BPF_LDX
These instructions load a value into the index register.  Note that
the addressing modes are more restricted than those of the accumulator loads,
but they include
.Sy BPF_MSH ,
a hack for efficiently loading the IP header length.
.Bl -column "BPF_LDX+BPF_W+BPF_IMM" "X \*[Lt]- k" -width indent -offset indent
.It Sy BPF_LDX+BPF_W+BPF_IMM Ta X \*[Lt]- k
.It Li Sy BPF_LDX+BPF_W+BPF_MEM Ta X \*[Lt]- M[k]
.It Li Sy BPF_LDX+BPF_W+BPF_LEN Ta X \*[Lt]- len
.It Li Sy BPF_LDX+BPF_B+BPF_MSH Ta X \*[Lt]- 4*(P[k:1]\*[Am]0xf)
.El
.It Sy BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility
for the destination.
.Bl -column "BPF_ST" "M[k] \*[Lt]- A" -width indent -offset indent
.It Sy BPF_ST Ta M[k] \*[Lt]- A
.El
.It Sy BPF_STX
This instruction stores the index register in the scratch memory store.
.Bl -column "BPF_STX" "M[k] \*[Lt]- X" -width indent -offset indent
.It Sy BPF_STX Ta M[k] \*[Lt]- X
.El
.It Sy BPF_ALU
The alu instructions perform operations between the accumulator and
index register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.No ( Ns Sy BPF_K
or
.Sy BPF_X Ns ) .
.Bl -column "BPF_ALU+BPF_ADD+BPF_K" "A \*[Lt]- A + k" -width indent -offset indent
.It Sy BPF_ALU+BPF_ADD+BPF_K Ta A \*[Lt]- A + k
.It Li Sy BPF_ALU+BPF_SUB+BPF_K Ta A \*[Lt]- A - k
.It Li Sy BPF_ALU+BPF_MUL+BPF_K Ta A \*[Lt]- A * k
.It Li Sy BPF_ALU+BPF_DIV+BPF_K Ta A \*[Lt]- A / k
.It Li Sy BPF_ALU+BPF_AND+BPF_K Ta A \*[Lt]- A \*[Am] k
.It Li Sy BPF_ALU+BPF_OR+BPF_K Ta A \*[Lt]- A | k
.It Li Sy BPF_ALU+BPF_LSH+BPF_K Ta A \*[Lt]- A \*[Lt]\*[Lt] k
.It Li Sy BPF_ALU+BPF_RSH+BPF_K Ta A \*[Lt]- A \*[Gt]\*[Gt] k
.It Li Sy BPF_ALU+BPF_ADD+BPF_X Ta A \*[Lt]- A + X
.It Li Sy BPF_ALU+BPF_SUB+BPF_X Ta A \*[Lt]- A - X
.It Li Sy BPF_ALU+BPF_MUL+BPF_X Ta A \*[Lt]- A * X
.It Li Sy BPF_ALU+BPF_DIV+BPF_X Ta A \*[Lt]- A / X
.It Li Sy BPF_ALU+BPF_AND+BPF_X Ta A \*[Lt]- A \*[Am] X
.It Li Sy BPF_ALU+BPF_OR+BPF_X Ta A \*[Lt]- A | X
.It Li Sy BPF_ALU+BPF_LSH+BPF_X Ta A \*[Lt]- A \*[Lt]\*[Lt] X
.It Li Sy BPF_ALU+BPF_RSH+BPF_X Ta A \*[Lt]- A \*[Gt]\*[Gt] X
.It Li Sy BPF_ALU+BPF_NEG Ta A \*[Lt]- -A
.El
.It Sy BPF_JMP
The jump instructions alter flow of control.  Conditional jumps
compare the accumulator against a constant
.No ( Ns Sy BPF_K Ns )
or the index register
.No ( Ns Sy BPF_X Ns ) .
If the result is true (or non-zero),
the true branch is taken, otherwise the false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.No ( Ns Sy BPF_JA Ns )
opcode uses the 32 bit
.Va k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Bl -column "BPF_JMP+BPF_JGE+BPF_K" "pc += (A \*[Ge] k) ? jt : jf" -width indent -offset indent
.It Sy BPF_JMP+BPF_JA Ta pc += k
.It Li Sy BPF_JMP+BPF_JGT+BPF_K Ta "pc += (A \*[Gt] k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JGE+BPF_K Ta "pc += (A \*[Ge] k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JEQ+BPF_K Ta "pc += (A == k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JSET+BPF_K Ta "pc += (A \*[Am] k) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JGT+BPF_X Ta "pc += (A \*[Gt] X) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JGE+BPF_X Ta "pc += (A \*[Ge] X) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JEQ+BPF_X Ta "pc += (A == X) ? jt : jf"
.It Li Sy BPF_JMP+BPF_JSET+BPF_X Ta "pc += (A \*[Am] X) ? jt : jf"
.El
.It Sy BPF_RET
The return instructions terminate the filter program and specify the amount
of packet to accept (i.e., they return the truncation amount).  A return
value of zero indicates that the packet should be ignored.
The return value is either a constant
.No ( Ns Sy BPF_K Ns )
or the accumulator
.No ( Ns Sy BPF_A Ns ) .
.Bl -column "BPF_RET+BPF_A" "accept A bytes" -width indent -offset indent
.It Sy BPF_RET+BPF_A Ta accept A bytes
.It Li Sy BPF_RET+BPF_K Ta accept k bytes
.El
.It Sy BPF_MISC
The miscellaneous category was created for anything that doesn't
fit into the above classes, and for any new instructions that might need to
be added.  Currently, these are the register transfer instructions
that copy the index register to the accumulator or vice versa.
.Bl -column "BPF_MISC+BPF_TAX" "X \*[Lt]- A" -width indent -offset indent
.It Sy BPF_MISC+BPF_TAX Ta X \*[Lt]- A
.It Li Sy BPF_MISC+BPF_TXA Ta A \*[Lt]- X
.El
.El
.Pp
The BPF interface provides the following macros to facilitate
array initializers:
.Bd -literal -offset indent
.Sy BPF_STMT Ns (opcode, operand)
.Sy BPF_JUMP Ns (opcode, operand, true_offset, false_offset)
.Ed
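.Pp
To make the instruction semantics above concrete, here is a small
user-space interpreter for a subset of the filter machine.
It is a sketch under stated assumptions, not the kernel's implementation:
demo_insn, the D_ opcode names (a flat enumeration invented here, not the
real class, size, and mode bit encoding), D_STMT, D_JUMP, and demo_filter
are all hypothetical, and a load beyond the end of the packet is assumed to
reject the packet.
Jump offsets are taken relative to the instruction following the jump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; real programs use the definitions in <net/bpf.h>. */
struct demo_insn {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	uint32_t k;
};
enum {	/* flat opcodes for this sketch only */
	D_LD_H_ABS, D_LD_B_ABS, D_LD_H_IND, D_LDX_MSH,
	D_JEQ_K, D_JSET_K, D_RET_K
};
#define D_STMT(code, k)		{ (code), 0, 0, (k) }
#define D_JUMP(code, k, jt, jf)	{ (code), (jt), (jf), (k) }

/* Load n (1 or 2) bytes at `off`, big-endian; -1 if out of range. */
static int
demo_load(const uint8_t *p, size_t len, size_t off, int n, uint32_t *out)
{
	if (off > len || len - off < (size_t)n)
		return -1;
	*out = (n == 1) ? p[off] : ((uint32_t)p[off] << 8) | p[off + 1];
	return 0;
}

/* Run `prog` over the packet; return the number of bytes to accept. */
static uint32_t
demo_filter(const struct demo_insn *prog, size_t ninsn,
    const uint8_t *p, size_t len)
{
	uint32_t A = 0, X = 0;
	size_t pc = 0;

	while (pc < ninsn) {
		const struct demo_insn *in = &prog[pc++];

		switch (in->code) {
		case D_LD_H_ABS:	/* A <- P[k:2] */
			if (demo_load(p, len, in->k, 2, &A) != 0)
				return 0;
			break;
		case D_LD_B_ABS:	/* A <- P[k:1] */
			if (demo_load(p, len, in->k, 1, &A) != 0)
				return 0;
			break;
		case D_LD_H_IND:	/* A <- P[X+k:2] */
			if (demo_load(p, len, (size_t)X + in->k, 2, &A) != 0)
				return 0;
			break;
		case D_LDX_MSH:		/* X <- 4*(P[k:1]&0xf) */
			if (demo_load(p, len, in->k, 1, &X) != 0)
				return 0;
			X = 4 * (X & 0xf);
			break;
		case D_JEQ_K:		/* pc += (A == k) ? jt : jf */
			pc += (A == in->k) ? in->jt : in->jf;
			break;
		case D_JSET_K:		/* pc += (A & k) ? jt : jf */
			pc += (A & in->k) ? in->jt : in->jf;
			break;
		case D_RET_K:		/* accept k bytes */
			return in->k;
		default:
			return 0;	/* unknown opcode: reject */
		}
	}
	return 0;			/* ran off the end: reject */
}

/* The Reverse ARP filter from EXAMPLES with its constants written out. */
static const struct demo_insn demo_rarp[] = {
	D_STMT(D_LD_H_ABS, 12),
	D_JUMP(D_JEQ_K, 0x8035, 0, 3),	/* ETHERTYPE_REVARP */
	D_STMT(D_LD_H_ABS, 20),
	D_JUMP(D_JEQ_K, 3, 0, 1),	/* REVARP_REQUEST */
	D_STMT(D_RET_K, 42),		/* 14-byte header + 28-byte ARP body */
	D_STMT(D_RET_K, 0),
};
```

Calling demo_filter(demo_rarp, 6, frame, len) on a 42-byte frame whose type
field is 0x8035 and whose ARP operation is 3 returns 42; any other frame
yields 0.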
.Sh FILES
/dev/bpf0, /dev/bpf1, ...
.Sh EXAMPLES
The following filter is taken from the Reverse ARP Daemon.  It accepts
only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between hosts 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
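.Pp
The 32-bit constants in the filter above are the two host addresses written
as big-endian words.
A hypothetical helper, ipv4_k, shows the arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack a dotted-quad address a.b.c.d into the 32-bit
 * value that a word load yields for those four packet bytes.
 */
static uint32_t
ipv4_k(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d;
}
```

Here ipv4_k(128, 3, 112, 15) evaluates to 0x8003700f and
ipv4_k(128, 3, 112, 35) to 0x80037023, matching the constants in the filter.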
.Pp
Finally, this filter returns only TCP finger packets.  We must parse
the IP header to reach the TCP header.  The
.Sy BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure
that we have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr tcpdump 8
.Rs
.%T "The BSD Packet Filter: A New Architecture for User-level Packet Capture"
.%A S. McCanne
.%A V. Jacobson
.%J Proceedings of the 1993 Winter USENIX
.%C Technical Conference, San Diego, CA
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and
Rick Rashid at Carnegie-Mellon University.  Jeffrey Mogul, at
Stanford, ported the code to BSD and continued its development from
1983 on.  Since then, it has evolved into the Ultrix Packet Filter
at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in
Summer 1990.  The design was in collaboration with Van Jacobson,
also of Lawrence Berkeley Laboratory.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
mode on the same hardware interface.  This could be fixed in the kernel
with additional processing overhead.  However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must utilize a filter to reject foreign packets.
.Pp
Data link protocols with variable length headers are not currently supported.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail with EINVAL.  You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
and the
.Dq read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
654.Xr select 2 .
655