.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\" $FreeBSD$
.\"
.Dd May 14, 2006
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols, or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
and
.Dv IP_TTL
may be used to set the type-of-service and time-to-live
fields in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_LOWDELAY;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
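.Pp
The fragments above omit error handling.
The following is a minimal sketch, under the assumption that a
.Dv SOCK_DGRAM
socket is acceptable for the experiment, of setting
.Dv IP_TTL
and reading it back with
.Xr getsockopt 2 .
The function name
.Fn ttl_roundtrip
is hypothetical and is not part of any system interface.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Hypothetical helper: set IP_TTL on a fresh UDP socket, read it back,
 * and return the value the kernel reports, or -1 on any failure.
 */
int
ttl_roundtrip(int ttl)
{
	int s, got = -1;
	socklen_t len = sizeof(got);

	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
		return (-1);
	/* Both calls return -1 and set errno when they fail. */
	if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1 ||
	    getsockopt(s, IPPROTO_IP, IP_TTL, &got, &len) == -1)
		got = -1;
	close(s);
	return (got);
}
```

Reading the option back is a cheap way to confirm that the kernel
accepted the requested value.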
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is only really useful when set to 255, preventing packets
from outside the directly connected networks reaching local listeners
on sockets.
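.Pp
As an illustration only, the option might be applied to a listening
socket as sketched below.
The helper name
.Fn require_min_ttl
is hypothetical, and the fallback branch for systems that do not define
.Dv IP_MINTTL
is an assumption made to keep the sketch portable.
For the check to be useful with a value of 255, the peer must also send
with a TTL of 255.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

/*
 * Hypothetical helper: require that packets arriving on socket s carry
 * a TTL of at least minttl.  Returns 0 on success and -1 with errno set
 * on failure (ENOPROTOOPT where IP_MINTTL is not defined).
 */
int
require_min_ttl(int s, int minttl)
{
#ifdef IP_MINTTL
	return (setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl,
	    sizeof(minttl)));
#else
	(void)s;
	(void)minttl;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```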
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Xr ip 4
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
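.Pp
A control buffer returned by
.Xr recvmsg 2
may hold several messages, so callers normally walk it with the
.Dv CMSG_FIRSTHDR
and
.Dv CMSG_NXTHDR
macros.
The following is a minimal sketch of such a walk.
The helper name
.Fn find_ip_cmsg
and its parameters are hypothetical; the message type is passed in as an
argument so that the same helper can serve any of the
.Tn IP-level
ancillary data types described in this manual page.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: search the control buffer filled in by recvmsg(2)
 * for an IPPROTO_IP message of the given type and copy outlen bytes of
 * its payload to out.  Returns 0 if found, -1 otherwise.
 */
int
find_ip_cmsg(struct msghdr *msg, int type, void *out, size_t outlen)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm != NULL;
	    cm = CMSG_NXTHDR(msg, cm)) {
		/* Skip messages that are too short to hold the payload. */
		if (cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == type &&
		    cm->cmsg_len >= CMSG_LEN(outlen)) {
			memcpy(out, CMSG_DATA(cm), outlen);
			return (0);
		}
	}
	return (-1);
}
```

After a successful
.Xr recvmsg 2
on a socket with
.Dv IP_RECVDSTADDR
enabled, a call such as
.Li "find_ip_cmsg(&msg, IP_RECVDSTADDR, &dst, sizeof(dst))"
would be expected to fill in a
.Vt "struct in_addr" .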
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket that is not bound to a specific
.Tn IP
address can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Va msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
int onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TTL .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Li IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
.Pp
The range of privileged ports which may only be opened by
root-owned processes may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
If more than
.Va net.inet.ip.portrange.randomcps
ports have been allocated in the last second, the system reverts to
sequential port allocation.
Random allocation resumes only once the port allocation rate
drops below
.Va net.inet.ip.portrange.randomcps
for at least
.Va net.inet.ip.portrange.randomtime
seconds.
The default values for
.Va net.inet.ip.portrange.randomcps
and
.Va net.inet.ip.portrange.randomtime
are 10 port allocations per second and 45 seconds, respectively.
.Ss "Multicast Options"
.Pp
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
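.Pp
The fragment above leaves
.Va ttl
uninitialized and does not check the result.
A slightly fuller sketch follows; the helper name
.Fn set_mcast_ttl
is hypothetical, and reading the value back with
.Xr getsockopt 2
is an addition made here for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: set the multicast TTL on socket s and read it
 * back.  Returns the TTL now in effect (0 to 255), or -1 on failure.
 */
int
set_mcast_ttl(int s, unsigned char ttl)
{
	unsigned char got;
	socklen_t len = sizeof(got);

	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl,
	    sizeof(ttl)) == -1)
		return (-1);
	if (getsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &got, &len) == -1)
		return (-1);
	return (got);
}
```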
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, each multicast transmission is
sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where "addr" is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
This option
improves performance for applications that may have no more than one
instance on a single host (such as a router daemon), by eliminating
the overhead of receiving their own transmissions.
It should generally not
be used by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
is the following structure:
.Bd -literal
struct ip_mreq {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_interface
should be set to
.Dv INADDR_ANY
to choose the default multicast interface,
or the
.Tn IP
address of a particular multicast-capable interface if
the host is multihomed.
Since
.Fx 4.4 ,
if the
.Va imr_interface
member is within the network range
.Li 0.0.0.0/8 ,
it is treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
.Pp
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\"-----------------------
.Ss "Raw IP Sockets"
.Pp
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Incoming packets are received with
.Tn IP
header and options intact.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = offset;
.Ed
.Pp
The
.Va ip_len
and
.Va ip_off
fields
.Em must
be provided in host byte order.
All other fields must be provided in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
.Sh ERRORS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but
no destination address is specified, and the socket has not been
connected;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists;
.It Bq Er EACCES
when an attempt is made to create
a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width Er
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed;
an option field was shorter than the minimum value
or longer than the option buffer provided.
.El
.Pp
The following errors may occur when attempting to send
.Tn IP
datagrams via a
.Dq raw socket
with the
.Dv IP_HDRINCL
option set:
.Bl -tag -width Er
.It Bq Er EINVAL
The user-supplied
.Va ip_len
field was not equal to the length of the datagram written to the socket.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr byteorder 3 ,
.Xr icmp 4 ,
.Xr inet 4 ,
.Xr intro 4
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .
660.Bx 4.2 .
661