.\"	$OpenBSD: bpf.4,v 1.27 2005/11/03 20:00:18 reyk Exp $
.\"     $NetBSD: bpf.4,v 1.7 1995/09/27 18:31:50 thorpej Exp $
.\"
.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that: (1) source code distributions
.\" retain the above copyright notice and this paragraph in its entirety, (2)
.\" distributions including binary code include the above copyright notice and
.\" this paragraph in its entirety in the documentation or other materials
.\" provided with the distribution, and (3) all advertising materials mentioning
.\" features or use of this software display the following acknowledgement:
.\" ``This product includes software developed by the University of California,
.\" Lawrence Berkeley Laboratory and its contributors.'' Neither the name of
.\" the University nor the names of its contributors may be used to endorse
.\" or promote products derived from this software without specific prior
.\" written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
.\" WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.Dd May 23, 1991
.Dt BPF 4
.Os
.Sh NAME
.Nm bpf
.Nd Berkeley Packet Filter
.Sh SYNOPSIS
.Cd "pseudo-device bpfilter"
.Sh DESCRIPTION
The Berkeley Packet Filter provides a raw interface to data link layers in
a protocol-independent fashion.
All packets on the network, even those destined for other hosts, are
accessible through this mechanism.
.Pp
The packet filter appears as a character special device,
.Pa /dev/bpf0 ,
.Pa /dev/bpf1 ,
etc.
After opening the device, the file descriptor must be bound to a specific
network interface with the
.Dv BIOCSETIF
.Xr ioctl 2 .
A given interface can be shared between multiple listeners, and the filter
underlying each descriptor will see an identical packet stream.
.Pp
A separate device file is required for each minor device.
If a file is in use, the open will fail and
.Va errno
will be set to
.Er EBUSY .
The number of files that can be open at once can be increased by creating
additional device nodes with the
.Xr MAKEDEV 8
script.
.Pp
Associated with each open instance of a
.Nm
file is a user-settable
packet filter.
Whenever a packet is received by an interface, all file descriptors
listening on that interface apply their filter.
Each descriptor that accepts the packet receives its own copy.
.Pp
Reads from these files return the next group of packets that have matched
the filter.
To improve performance, the buffer passed to read must be the same size as
the buffers used internally by
.Nm bpf .
This size is returned by the
.Dv BIOCGBLEN
.Xr ioctl 2
and can be set with
.Dv BIOCSBLEN .
Note that an individual packet larger than this size is necessarily truncated.
.Pp
The packet filter will support any link-level protocol that has fixed-length
headers.
Currently, only Ethernet, SLIP, and PPP drivers have been modified to
interact with
.Nm bpf .
.Pp
Since packet data is in network byte order, applications should use the
.Xr byteorder 3
macros to extract multi-byte values.
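.Pp
As a minimal sketch of this advice, the following hypothetical helper reads
the Ethernet type field from a captured frame.
The function name, and the assumption that the buffer starts at a complete
14-byte Ethernet header, are illustrative only and are not part of
.Nm .
.Bd -literal -offset indent
```c
#include <arpa/inet.h>	/* ntohs() */
#include <stdint.h>
#include <string.h>	/* memcpy() */

/*
 * Hypothetical helper: return the Ethernet type field in host byte order.
 * "frame" is assumed to point at a complete 14-byte Ethernet header.
 */
static uint16_t
ether_type_host(const unsigned char *frame)
{
	uint16_t type;

	/* The type field is bytes 12-13; copy it to avoid unaligned access. */
	memcpy(&type, frame + 12, sizeof(type));
	return ntohs(type);
}
```
.Ed
.Pp
Copying the field before converting it keeps the sketch independent of
how the capture buffer happens to be aligned.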
.Pp
A packet can be sent out on the network by writing to a
.Nm
file descriptor.
Each descriptor can also have a user-settable filter
for controlling the writes.
Only packets matching the filter are sent out of the interface.
The writes are unbuffered, meaning only one packet can be processed per write.
.Pp
Once a descriptor is configured, further changes to the configuration
can be prevented using the
.Dv BIOCLOCK
.Xr ioctl 2 .
.Sh IOCTL INTERFACE
The
.Xr ioctl 2
command codes below are defined in
.Aq Pa net/bpf.h .
All commands require these includes:
.Bd -unfilled -offset indent
.Cd #include <sys/types.h>
.Cd #include <sys/time.h>
.Cd #include <sys/ioctl.h>
.Cd #include <net/bpf.h>
.Ed
.Pp
Additionally,
.Dv BIOCGETIF
and
.Dv BIOCSETIF
require
.Aq Pa sys/socket.h
and
.Aq Pa net/if.h .
.Pp
The (third) argument to the
.Xr ioctl 2
call should be a pointer to the type indicated.
.Pp
.Bl -tag -width Ds -compact
.It Dv BIOCGBLEN Fa "u_int *"
Returns the required buffer length for reads on
.Nm
files.
.Pp
.It Dv BIOCSBLEN Fa "u_int *"
Sets the buffer length for reads on
.Nm
files.
The buffer must be set before the file is attached to an interface with
.Dv BIOCSETIF .
If the requested buffer size cannot be accommodated, the closest allowable
size will be set and returned in the argument.
A read call will result in
.Er EIO
if it is passed a buffer that is not this size.
.Pp
.It Dv BIOCGDLT Fa "u_int *"
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
are defined in
.Aq Pa net/bpf.h .
.Pp
.It Dv BIOCGDLTLIST Fa "struct bpf_dltlist *"
Returns an array of the available types of the data link layer
underlying the attached interface:
.Bd -literal -offset indent
struct bpf_dltlist {
	u_int bfl_len;
	u_int *bfl_list;
};
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field, while the capacity of that array, counted in
.Vt u_int
elements, is supplied in the
.Va bfl_len
field.
.Er ENOMEM
is returned if there is not enough buffer space and
.Er EFAULT
is returned if a bad address is encountered.
The
.Va bfl_len
field is modified on return to indicate the actual number of
.Vt u_int
elements in the array returned.
If
.Va bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is set to indicate the required length of the array in
.Vt u_int
elements.
.Pp
.It Dv BIOCSDLT Fa "u_int *"
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
type is not available for the interface.
.Pp
.It Dv BIOCPROMISC
Forces the interface into promiscuous mode.
All packets, not just those destined for the local host, are processed.
Since more than one file can be listening on a given interface, a listener
that opened its interface non-promiscuously may receive packets promiscuously.
This problem can be remedied with an appropriate filter.
.Pp
The interface remains in promiscuous mode until all files listening
promiscuously are closed.
.Pp
.It Dv BIOCFLUSH
Flushes the buffer of incoming packets and resets the statistics that are
returned by
.Dv BIOCGSTATS .
.Pp
.It Dv BIOCLOCK
This ioctl is designed to prevent the security issues associated
with an open
.Nm
descriptor in unprivileged programs.
Even with dropped privileges, an open
.Nm
descriptor can be abused by a rogue program to listen on any interface
on the system, send packets on these interfaces if the descriptor was
opened read-write, and send signals to arbitrary processes using the
signaling mechanism of
.Nm bpf .
By allowing only
.Dq known safe
ioctls, the
.Dv BIOCLOCK
ioctl prevents this abuse.
The allowable ioctls are
.Dv BIOCGBLEN ,
.Dv BIOCFLUSH ,
.Dv BIOCGDLT ,
.Dv BIOCGETIF ,
.Dv BIOCGRTIMEOUT ,
.Dv BIOCSRTIMEOUT ,
.Dv BIOCIMMEDIATE ,
.Dv BIOCGSTATS ,
.Dv BIOCVERSION ,
.Dv BIOCGRSIG ,
.Dv BIOCGHDRCMPLT ,
.Dv TIOCGPGRP ,
and
.Dv FIONREAD .
Use of any other ioctl is denied with error
.Er EPERM .
Once a descriptor is locked, it is not possible to unlock it.
A process with root privileges is not affected by the lock.
.Pp
A privileged program can open a
.Nm
device, drop privileges, set the interface, filters and modes on the
descriptor, and lock it.
Once the descriptor is locked, the system is safe
from further abuse through the descriptor.
Locking a descriptor does not prevent writes.
If the application does not need to send packets through
.Nm bpf ,
it can open the device read-only to prevent writing.
If sending packets is necessary, a write-filter can be set before locking the
descriptor to prevent arbitrary packets from being sent out.
.Pp
.It Dv BIOCGETIF Fa "struct ifreq *"
Returns the name of the hardware interface that the file is listening on.
The name is returned in the
.Fa ifr_name
field of the
.Li struct ifreq .
All other fields are undefined.
.Pp
.It Dv BIOCSETIF Fa "struct ifreq *"
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Fa ifr_name
field of the
.Li struct ifreq .
Additionally, it performs the actions of
.Dv BIOCFLUSH .
.Pp
.It Dv BIOCSRTIMEOUT Fa "struct timeval *"
.It Dv BIOCGRTIMEOUT Fa "struct timeval *"
Set or get the read timeout parameter.
The
.Ar timeval
specifies the length of time to wait before timing out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.Pp
.It Dv BIOCGSTATS Fa "struct bpf_stat *"
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
	u_int bs_recv;
	u_int bs_drop;
};
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv
.It Fa bs_recv
Number of packets received by the descriptor since it was opened or reset
(including any buffered since the last read call).
.It Fa bs_drop
Number of packets which were accepted by the filter but dropped by the kernel
because of buffer overflows (i.e., the application's reads aren't keeping up
with the packet traffic).
.El
.Pp
.It Dv BIOCIMMEDIATE Fa "u_int *"
Enable or disable
.Dq immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet reception.
Otherwise, a read will block until either the kernel buffer becomes full or a
timeout occurs.
This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.Pp
.It Dv BIOCSETF Fa "struct bpf_program *"
Sets the filter program used by the kernel to discard uninteresting packets.
An array of instructions and its length are passed in using the following
structure:
.Bd -literal -offset indent
struct bpf_program {
	int bf_len;
	struct bpf_insn *bf_insns;
};
.Ed
.Pp
The filter program is pointed to by the
.Fa bf_insns
field, while its length in units of
.Li struct bpf_insn
is given by the
.Fa bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sx FILTER MACHINE
for an explanation of the filter language.
.Pp
.It Dv BIOCSETWF Fa "struct bpf_program *"
Sets the filter program used by the kernel to filter the packets
written to the descriptor before the packets are sent out on the
network.
See
.Dv BIOCSETF
for a description of the filter program.
This ioctl also acts as
.Dv BIOCFLUSH .
.Pp
Note that the filter operates on the packet data written to the descriptor.
If the
.Dq header complete
flag is not set, the kernel sets the link-layer source address
of the packet after filtering.
.Pp
.It Dv BIOCVERSION Fa "struct bpf_version *"
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check that the current version
is compatible with the running kernel.
Version numbers are compatible if the major numbers match and the application
minor is less than or equal to the kernel minor.
The kernel version number is returned in the following structure:
.Bd -literal -offset indent
struct bpf_version {
	u_short bv_major;
	u_short bv_minor;
};
.Ed
.Pp
The current version numbers are given by
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION
from
.Aq Pa net/bpf.h .
An incompatible filter may result in undefined behavior (most likely, an
error returned by
.Xr ioctl 2
or haphazard packet matching).
.Pp
.It Dv BIOCSRSIG Fa "u_int *"
.It Dv BIOCGRSIG Fa "u_int *"
Set or get the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.Pp
.It Dv BIOCSHDRCMPLT Fa "u_int *"
.It Dv BIOCGHDRCMPLT Fa "u_int *"
Set or get the status of the
.Dq header complete
flag.
Set to zero if the link-level source address should be filled in
automatically by the interface output routine.
Set to one if the link-level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.Pp
.It Dv BIOCGFILDROP Fa "u_int *"
.It Dv BIOCSFILDROP Fa "u_int *"
Get or set the status of the
.Dq filter drop
flag.
If non-zero, packets matching any filters will be reported to the
associated interface so that they can be dropped.
.El
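.Pp
The compatibility rule given for
.Dv BIOCVERSION
can be sketched as a small predicate.
The structure and function names below are local stand-ins chosen for
illustration; a real program would fill the kernel side with the
.Dv BIOCVERSION
ioctl and the application side with
.Dv BPF_MAJOR_VERSION
and
.Dv BPF_MINOR_VERSION .
.Bd -literal -offset indent
```c
#include <stdbool.h>

/* Local stand-in for struct bpf_version from <net/bpf.h>. */
struct demo_bpf_version {
	unsigned short bv_major;
	unsigned short bv_minor;
};

/*
 * Hypothetical helper: "app" is the version the program was compiled
 * against, "kern" the version reported by the kernel.  Compatible when
 * the majors match and the application minor does not exceed the
 * kernel minor.
 */
static bool
demo_version_ok(struct demo_bpf_version app, struct demo_bpf_version kern)
{
	return app.bv_major == kern.bv_major &&
	    app.bv_minor <= kern.bv_minor;
}
```
.Ed
.Pp
A program following this sketch would refuse to install its filter when
the predicate is false.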
.Ss Standard ioctls
.Nm
now supports several standard ioctls which allow the user to do asynchronous
and/or non-blocking I/O to an open
.Nm
file descriptor.
.Pp
.Bl -tag -width Ds -compact
.It Dv FIONREAD Fa "int *"
Returns the number of bytes that are immediately available for reading.
.Pp
.It Dv SIOCGIFADDR Fa "struct ifreq *"
Returns the address associated with the interface.
.Pp
.It Dv FIONBIO Fa "int *"
Set or clear non-blocking I/O.
If the argument is non-zero, enable non-blocking I/O.
If the argument is zero, disable non-blocking I/O.
If non-blocking I/O is enabled, the return value of a read while no data
is available will be 0.
The non-blocking read behavior is different from performing non-blocking
reads on other file descriptors, which will return \-1 and set
.Va errno
to
.Er EAGAIN
if no data is available.
Note: setting this overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.Pp
.It Dv FIOASYNC Fa "int *"
Enable or disable asynchronous I/O.
When enabled (argument is non-zero), the process or process group specified
by
.Dv FIOSETOWN
will start receiving
.Dv SIGIO
signals when packets arrive.
Note that you must perform an
.Dv FIOSETOWN
command in order for this to take effect, as the system will not do it by
default.
The signal may be changed via
.Dv BIOCSRSIG .
.Pp
.It Dv FIOSETOWN Fa "int *"
.It Dv FIOGETOWN Fa "int *"
Set or get the process or process group (if negative) that should receive
.Dv SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
(see above).
.El
.Ss BPF header
The following structure is prepended to each packet returned by
.Xr read 2 :
.Bd -literal -offset indent
struct bpf_hdr {
	struct bpf_timeval bh_tstamp;
	u_int32_t	bh_caplen;
	u_int32_t	bh_datalen;
	u_int16_t	bh_hdrlen;
};
.Ed
.Pp
The fields, stored in host order, are as follows:
.Bl -tag -width Ds
.It Fa bh_tstamp
Time at which the packet was processed by the packet filter.
.It Fa bh_caplen
Length of the captured portion of the packet.
This is the minimum of the truncation amount specified by the filter and the
length of the packet.
.It Fa bh_datalen
Length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Fa bh_hdrlen
Length of the BPF header, which may not be equal to
.Li sizeof(struct bpf_hdr) .
.El
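.Pp
The relationship between
.Fa bh_caplen
and
.Fa bh_datalen
described above can be written out as a one-line sketch.
The function name is hypothetical; the kernel computes this value itself.
.Bd -literal -offset indent
```c
#include <stdint.h>

/*
 * Hypothetical helper: the captured length is the smaller of the
 * truncation amount returned by the filter and the packet length.
 */
static uint32_t
demo_caplen(uint32_t truncation, uint32_t datalen)
{
	return truncation < datalen ? truncation : datalen;
}
```
.Ed
.Pp
With a filter that returns (u_int)\-1, the captured length equals the
packet length, so the whole packet is kept.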
.Pp
The
.Fa bh_hdrlen
field exists to account for padding between the header and the link-level
protocol.
The purpose here is to guarantee proper alignment of the packet data
structures, which is required on alignment-sensitive architectures and
improves performance on many other architectures.
The packet filter ensures that the
.Fa bpf_hdr
and the network layer header will be word aligned.
Suitable precautions must be taken when accessing the link layer protocol
fields on alignment-restricted machines.
(This isn't a problem on an Ethernet, since the type field is a
.Li short
falling on an even offset, and the addresses are probably accessed in a
bytewise fashion.)
.Pp
Additionally, individual packets are padded so that each starts on a
word boundary.
This requires that an application has some knowledge of how to get from packet
to packet.
The macro
.Dv BPF_WORDALIGN
is defined in
.Aq Pa net/bpf.h
to facilitate this process.
It rounds up its argument to the nearest word aligned value (where a word is
.Dv BPF_ALIGNMENT
bytes wide).
For example, if
.Va p
points to the start of a packet, this expression will advance it to the
next packet:
.Pp
.Dl p = (struct bpf_hdr *)((char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen));
.Pp
For the alignment mechanisms to work properly, the buffer passed to
.Xr read 2
must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
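.Pp
The packet-to-packet stepping described above can be sketched as a loop
over the bytes returned by one read.
Everything prefixed with demo_ or DEMO_ is a local stand-in so that the
sketch compiles without
.Aq Pa net/bpf.h ;
in particular, the 4-byte word size and the layout of the timestamp are
assumptions, and a real program should use
.Li struct bpf_hdr
and
.Dv BPF_WORDALIGN .
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins; real programs take these from <net/bpf.h>. */
#define DEMO_ALIGNMENT	sizeof(uint32_t)	/* assumed word size */
#define DEMO_WORDALIGN(x) (((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

struct demo_bpf_hdr {
	uint32_t	tv_sec;		/* stand-in for bh_tstamp */
	uint32_t	tv_usec;
	uint32_t	bh_caplen;
	uint32_t	bh_datalen;
	uint16_t	bh_hdrlen;
};

/* Bytes occupied by the header fields, without trailing struct padding. */
#define DEMO_HDR_MIN (offsetof(struct demo_bpf_hdr, bh_hdrlen) + sizeof(uint16_t))

/*
 * Count the packets in the first "len" bytes of "buf".  Each header is
 * copied out, so the sketch does not rely on the alignment of "buf".
 */
static size_t
demo_count_packets(const unsigned char *buf, size_t len)
{
	struct demo_bpf_hdr h = {0};
	size_t off = 0, n = 0;

	while (off + DEMO_HDR_MIN <= len) {
		memcpy(&h, buf + off, DEMO_HDR_MIN);
		/* Stop on a header that does not fit in what is left. */
		if (h.bh_hdrlen < DEMO_HDR_MIN ||
		    (size_t)h.bh_hdrlen + h.bh_caplen > len - off)
			break;
		n++;
		/* Header plus captured data, rounded up to a word. */
		off += DEMO_WORDALIGN((size_t)h.bh_hdrlen + h.bh_caplen);
	}
	return n;
}
```
.Ed
.Pp
A capture loop would hand each packet to its own processing at the point
where the sketch increments the counter.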
.Ss Filter machine
A filter program is an array of instructions with all branches forwardly
directed, terminated by a
.Dq return
instruction.
Each instruction performs some action on the pseudo-machine state, which
consists of an accumulator, index register, scratch memory store, and
implicit program counter.
.Pp
The following structure defines the instruction format:
.Bd -literal -offset indent
struct bpf_insn {
	u_int16_t	code;
	u_char		jt;
	u_char		jf;
	u_int32_t	k;
};
.Ed
.Pp
The
.Fa k
field is used in different ways by different instructions, and the
.Fa jt
and
.Fa jf
fields are used as offsets by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions:
.Dv BPF_LD ,
.Dv BPF_LDX ,
.Dv BPF_ST ,
.Dv BPF_STX ,
.Dv BPF_ALU ,
.Dv BPF_JMP ,
.Dv BPF_RET ,
and
.Dv BPF_MISC .
Various other mode and operator bits are logically OR'd into the class to
give the actual instructions.
The classes and modes are defined in
.Aq Pa net/bpf.h .
Below are the semantics for each defined
.Nm
instruction.
We use the convention that A is the accumulator, X is the index register,
P[] is the packet data, and M[] is the scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
in the packet, interpreted as a word (n=4), unsigned halfword (n=2), or
unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only addressed
in word units.
The memory store is indexed from 0 to
.Dv BPF_MEMWORDS Ns \-1 .
.Fa k ,
.Fa jt ,
and
.Fa jf
are the corresponding fields in the instruction definition.
.Dq len
refers to the length of the packet.
.Bl -tag -width Ds
.It Dv BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
.Dq addressing mode
and can be a constant
.Pf ( Dv BPF_IMM ) ,
packet data at a fixed offset
.Pf ( Dv BPF_ABS ) ,
packet data at a variable offset
.Pf ( Dv BPF_IND ) ,
the packet length
.Pf ( Dv BPF_LEN ) ,
or a word in the scratch memory store
.Pf ( Dv BPF_MEM ) .
For
.Dv BPF_IND
and
.Dv BPF_ABS ,
the data size must be specified as a word
.Pf ( Dv BPF_W ) ,
halfword
.Pf ( Dv BPF_H ) ,
or byte
.Pf ( Dv BPF_B ) .
The semantics of all recognized
.Dv BPF_LD
instructions follow.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_ABS
.Xc
.Sm on
A <- P[k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:4]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_H No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:2]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_B No +
.Dv BPF_IND
.Xc
.Sm on
A <- P[X+k:1]
.Sm off
.It Xo Dv BPF_LD No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
A <- len
.Sm off
.It Dv BPF_LD No + Dv BPF_IMM
.Sm on
A <- k
.Sm off
.It Dv BPF_LD No + Dv BPF_MEM
.Sm on
A <- M[k]
.El
.It Dv BPF_LDX
These instructions load a value into the index register.
Note that the addressing modes are more restricted than those of the
accumulator loads, but they include
.Dv BPF_MSH ,
a hack for efficiently loading the IP header length.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_IMM
.Xc
.Sm on
X <- k
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_MEM
.Xc
.Sm on
X <- M[k]
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_W No +
.Dv BPF_LEN
.Xc
.Sm on
X <- len
.Sm off
.It Xo Dv BPF_LDX No + Dv BPF_B No +
.Dv BPF_MSH
.Xc
.Sm on
X <- 4*(P[k:1]&0xf)
.El
.It Dv BPF_ST
This instruction stores the accumulator into the scratch memory.
We do not need an addressing mode since there is only one possibility for
the destination.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_ST
M[k] <- A
.El
.It Dv BPF_STX
This instruction stores the index register in the scratch memory store.
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_STX
M[k] <- X
.El
.It Dv BPF_ALU
The ALU instructions perform operations between the accumulator and index
register or constant, and store the result back in the accumulator.
For binary operations, a source mode is required
.Pf ( Dv BPF_K
or
.Dv BPF_X ) .
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_ADD No +
.Dv BPF_K
.Xc
.Sm on
A <- A + k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_SUB No +
.Dv BPF_K
.Xc
.Sm on
A <- A - k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_MUL No +
.Dv BPF_K
.Xc
.Sm on
A <- A * k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_DIV No +
.Dv BPF_K
.Xc
.Sm on
A <- A / k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_AND No +
.Dv BPF_K
.Xc
.Sm on
A <- A & k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_OR No +
.Dv BPF_K
.Xc
.Sm on
A <- A | k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_LSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A << k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_RSH No +
.Dv BPF_K
.Xc
.Sm on
A <- A >> k
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_ADD No +
.Dv BPF_X
.Xc
.Sm on
A <- A + X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_SUB No +
.Dv BPF_X
.Xc
.Sm on
A <- A - X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_MUL No +
.Dv BPF_X
.Xc
.Sm on
A <- A * X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_DIV No +
.Dv BPF_X
.Xc
.Sm on
A <- A / X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_AND No +
.Dv BPF_X
.Xc
.Sm on
A <- A & X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_OR No +
.Dv BPF_X
.Xc
.Sm on
A <- A | X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_LSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A << X
.Sm off
.It Xo Dv BPF_ALU No + Dv BPF_RSH No +
.Dv BPF_X
.Xc
.Sm on
A <- A >> X
.Sm off
.It Dv BPF_ALU No + Dv BPF_NEG
.Sm on
A <- -A
.El
.It Dv BPF_JMP
The jump instructions alter flow of control.
Conditional jumps compare the accumulator against a constant
.Pf ( Dv BPF_K )
or the index register
.Pf ( Dv BPF_X ) .
If the result is true (or non-zero), the true branch is taken, otherwise the
false branch is taken.
Jump offsets are encoded in 8 bits so the longest jump is 256 instructions.
However, the jump always
.Pf ( Dv BPF_JA )
opcode uses the 32-bit
.Fa k
field as the offset, allowing arbitrarily distant destinations.
All conditionals use unsigned comparison conventions.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_JMP No + Dv BPF_JA
.Sm on
pc += k
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGT No +
.Dv BPF_K
.Xc
.Sm on
pc += (A > k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGE No +
.Dv BPF_K
.Xc
.Sm on
pc += (A >= k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JEQ No +
.Dv BPF_K
.Xc
.Sm on
pc += (A == k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JSET No +
.Dv BPF_K
.Xc
.Sm on
pc += (A & k) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGT No +
.Dv BPF_X
.Xc
.Sm on
pc += (A > X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JGE No +
.Dv BPF_X
.Xc
.Sm on
pc += (A >= X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JEQ No +
.Dv BPF_X
.Xc
.Sm on
pc += (A == X) ? jt : jf
.Sm off
.It Xo Dv BPF_JMP No + Dv BPF_JSET No +
.Dv BPF_X
.Xc
.Sm on
pc += (A & X) ? jt : jf
.El
.It Dv BPF_RET
The return instructions terminate the filter program and specify the
amount of packet to accept (i.e., they return the truncation amount)
or, for the write filter, the maximum acceptable size for the packet
(i.e., the packet is dropped if it is larger than the returned
amount).
A return value of zero indicates that the packet should be ignored/dropped.
The return value is either a constant
.Pf ( Dv BPF_K )
or the accumulator
.Pf ( Dv BPF_A ) .
.Pp
.Bl -tag -width 32n -compact
.It Dv BPF_RET No + Dv BPF_A
Accept A bytes.
.It Dv BPF_RET No + Dv BPF_K
Accept k bytes.
.El
.It Dv BPF_MISC
The miscellaneous category was created for anything that doesn't fit into
the above classes, and for any new instructions that might need to be added.
Currently, these are the register transfer instructions that copy the index
register to the accumulator or vice versa.
.Pp
.Bl -tag -width 32n -compact
.Sm off
.It Dv BPF_MISC No + Dv BPF_TAX
.Sm on
X <- A
.Sm off
.It Dv BPF_MISC No + Dv BPF_TXA
.Sm on
A <- X
.El
.El
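.Pp
The
.Dv BPF_MSH
addressing mode listed under
.Dv BPF_LDX
is easier to follow when written as ordinary C.
In an IPv4 header the low four bits of the first byte hold the header
length in 32-bit words, so multiplying by four yields the length in bytes.
The function name below is hypothetical and the caller is assumed to have
checked that offset
.Fa k
lies inside the packet.
.Bd -literal -offset indent
```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring X <- 4*(P[k:1]&0xf).
 * "pkt" is the packet data and "k" the offset of the first IP header byte.
 */
static uint32_t
demo_msh(const unsigned char *pkt, uint32_t k)
{
	return 4 * (uint32_t)(pkt[k] & 0xf);
}
```
.Ed
.Pp
On Ethernet the IP header starts at offset 14, which is why filters that
need to find the transport header use 14 as the operand of this
instruction.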
.Pp
The
.Nm
interface provides the following macros to facilitate array initializers:
.Bd -filled -offset indent
.Dv BPF_STMT ( Ns Ar opcode ,
.Ar operand )
.Pp
.Dv BPF_JUMP ( Ns Ar opcode ,
.Ar operand ,
.Ar true_offset ,
.Ar false_offset )
.Ed
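.Pp
To make the instruction semantics and the initializer macros concrete,
here is a sketch of a user-space interpreter for a small subset of the
filter machine.
It is not the kernel implementation: all DEMO_ and demo_ names are local
stand-ins that mimic the
.Aq Pa net/bpf.h
definitions, only four opcodes are handled, the numeric opcode values are
assumptions that should be checked against the header, and the program
passed in is assumed to end in a return instruction.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the <net/bpf.h> definitions used below. */
#define DEMO_LD		0x00
#define DEMO_JMP	0x05
#define DEMO_RET	0x06
#define DEMO_H		0x08
#define DEMO_B		0x10
#define DEMO_ABS	0x20
#define DEMO_JEQ	0x10
#define DEMO_K		0x00

struct demo_insn {
	uint16_t	code;
	uint8_t		jt;
	uint8_t		jf;
	uint32_t	k;
};

/* Initializer macros in the spirit of BPF_STMT and BPF_JUMP. */
#define DEMO_STMT(code, k)		{ (uint16_t)(code), 0, 0, (k) }
#define DEMO_JUMP(code, k, jt, jf)	{ (uint16_t)(code), (jt), (jf), (k) }

/* Accept whole packets whose Ethernet type is 0x0800 (IP). */
static const struct demo_insn demo_ip_only[] = {
	DEMO_STMT(DEMO_LD + DEMO_H + DEMO_ABS, 12),
	DEMO_JUMP(DEMO_JMP + DEMO_JEQ + DEMO_K, 0x0800, 0, 1),
	DEMO_STMT(DEMO_RET + DEMO_K, 0xffffffffu),
	DEMO_STMT(DEMO_RET + DEMO_K, 0),
};

/*
 * Run "pc" over the "len" bytes at "p".  Returns the number of bytes
 * to accept, or 0 to reject.  Loads beyond the packet reject it.
 */
static uint32_t
demo_filter(const struct demo_insn *pc, const unsigned char *p, size_t len)
{
	uint32_t A = 0;

	for (;; pc++) {
		switch (pc->code) {
		case DEMO_LD + DEMO_H + DEMO_ABS:	/* A <- P[k:2] */
			if ((size_t)pc->k + 2 > len)
				return 0;
			A = (uint32_t)p[pc->k] << 8 | p[pc->k + 1];
			break;
		case DEMO_LD + DEMO_B + DEMO_ABS:	/* A <- P[k:1] */
			if (pc->k >= len)
				return 0;
			A = p[pc->k];
			break;
		case DEMO_JMP + DEMO_JEQ + DEMO_K:
			/* Offsets are relative to the next instruction. */
			pc += (A == pc->k) ? pc->jt : pc->jf;
			break;
		case DEMO_RET + DEMO_K:			/* accept k bytes */
			return pc->k;
		default:
			return 0;	/* opcode outside this sketch */
		}
	}
}
```
.Ed
.Pp
Calling demo_filter(demo_ip_only, frame, framelen) on an Ethernet frame
yields a non-zero value only when bytes 12 and 13 of the frame are 0x08
and 0x00.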
.Sh FILES
.Bl -tag -width /dev/bpf[0-9] -compact
.It Pa /dev/bpf[0-9]
.Nm
devices
.El
.Sh EXAMPLES
The following filter is taken from the Reverse ARP daemon.
It accepts only Reverse ARP requests.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
	    sizeof(struct ether_header)),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Pp
This filter accepts only IP packets between host 128.3.112.15 and
128.3.112.35.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
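.Pp
The constants 0x8003700f and 0x80037023 in the filter above are the two
host addresses as loaded by a word-sized
.Dv BPF_LD ,
with the first octet in the most significant byte.
Offsets 26 and 30 are the IP source and destination address fields on
Ethernet.
The helper below is a hypothetical illustration of how such a constant
can be derived; it is not provided by
.Nm .
.Bd -literal -offset indent
```c
#include <stdint.h>

/*
 * Hypothetical helper: pack the dotted quad a.b.c.d into the 32-bit
 * value that a word load of the address field is compared against.
 */
static uint32_t
demo_quad(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return (uint32_t)a << 24 | (uint32_t)b << 16 | (uint32_t)c << 8 | d;
}
```
.Ed
.Pp
Writing demo_quad(128, 3, 112, 15) in place of the literal would produce
the same comparison value as the first constant in the filter.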
.Pp
Finally, this filter returns only TCP finger packets.
We must parse the IP header to reach the TCP header.
The
.Dv BPF_JSET
instruction checks that the IP fragment offset is 0 so we are sure that we
have a TCP header.
.Bd -literal -offset indent
struct bpf_insn insns[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
	BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
	BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
	BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};
.Ed
.Sh SEE ALSO
.Xr ioctl 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr signal 3 ,
.Xr MAKEDEV 8 ,
.Xr tcpdump 8
.Rs
.%A McCanne, S.
.%A Jacobson, V.
.%T "An efficient, extensible, and portable network monitor"
.Re
.Sh HISTORY
The Enet packet filter was created in 1980 by Mike Accetta and Rick Rashid
at Carnegie-Mellon University.
Jeffrey Mogul, at Stanford, ported the code to BSD and continued its
development from 1983 on.
Since then, it has evolved into the Ultrix Packet Filter at DEC, a STREAMS
NIT module under SunOS 4.1, and BPF.
.Sh AUTHORS
Steve McCanne of Lawrence Berkeley Laboratory implemented BPF in the summer
of 1990.
Much of the design is due to Van Jacobson.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
.Dv BIOCGBLEN
ioctl).
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this mode on
the same hardware interface.
This could be fixed in the kernel with additional processing overhead.
However, we favor the model where all files must assume that the interface
is promiscuous, and if so desired, must utilize a filter to reject foreign
packets.
.Pp
Data link protocols with variable-length headers are not currently supported.
1056