.\"
.\" Copyright (C) 2022 Alexander Chernikov <melifaro@FreeBSD.org>.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd November 1, 2022
.Dt RTNETLINK 4
.Os
.Sh NAME
.Nm RTNetlink
.Nd Network configuration-specific Netlink family
.Sh SYNOPSIS
.In netlink/netlink.h
.In netlink/netlink_route.h
.Ft int
.Fn socket AF_NETLINK SOCK_RAW NETLINK_ROUTE
.Sh DESCRIPTION
The
.Dv NETLINK_ROUTE
family aims to be the primary configuration mechanism for all
network-related tasks.
Currently it supports configuring interfaces, interface addresses, routes,
nexthops and ARP/NDP neighbors.
.Sh ROUTES
All route configuration messages share the common header:
.Bd -literal
struct rtmsg {
	unsigned char	rtm_family;	/* Address family */
	unsigned char	rtm_dst_len;	/* Prefix length */
	unsigned char	rtm_src_len;	/* Deprecated, set to 0 */
	unsigned char	rtm_tos;	/* Type of service (not used) */
	unsigned char	rtm_table;	/* Deprecated, set to 0 */
	unsigned char	rtm_protocol;	/* Routing protocol id (RTPROT_) */
	unsigned char	rtm_scope;	/* Route distance (RT_SCOPE_) */
	unsigned char	rtm_type;	/* Route type (RTN_) */
	unsigned	rtm_flags;	/* Route flags (not supported) */
};
.Ed
.Pp
The
.Va rtm_family
specifies the route family to be operated on.
Currently,
.Dv AF_INET6
and
.Dv AF_INET
are the only supported families.
The route prefix length is stored in
.Va rtm_dst_len .
The caller should set the originator identity (one of the
.Dv RTPROT_
values) in
.Va rtm_protocol .
This is useful for users and for the application itself, allowing for easy
identification of self-originated routes.
The route scope has to be set via the
.Va rtm_scope
field.
The supported values are:
.Bd -literal -offset indent -compact
RT_SCOPE_UNIVERSE	Global scope
RT_SCOPE_LINK		Link scope
.Ed
.Pp
The route type must be set in
.Va rtm_type .
The defined values are:
.Bd -literal -offset indent -compact
RTN_UNICAST	Unicast route
RTN_MULTICAST	Multicast route
RTN_BLACKHOLE	Drops traffic towards destination
RTN_PROHIBIT	Drops traffic and sends reject
.Ed
.Pp
The following messages are supported:
.Ss RTM_NEWROUTE
Adds a new route.
All NL flags are supported.
Extending a multipath route requires the
.Dv NLM_F_APPEND
flag.
.Ss RTM_DELROUTE
Tries to delete a route.
The route is specified using a combination of the
.Dv RTA_DST
TLV and
.Va rtm_dst_len .
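.Pp
The route key described above can be sketched as a header followed by a single
attribute.
This is an illustration only: the
ex_ and EX_ names are hypothetical local mirrors, the attribute header layout
and the numeric value of
.Dv RTA_DST
are assumptions, and a real program would take all of them from
.In netlink/netlink_route.h
and would prepend the Netlink message header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of struct rtmsg as shown in this page. */
struct ex_rtmsg {
	unsigned char	rtm_family;
	unsigned char	rtm_dst_len;
	unsigned char	rtm_src_len;
	unsigned char	rtm_tos;
	unsigned char	rtm_table;
	unsigned char	rtm_protocol;
	unsigned char	rtm_scope;
	unsigned char	rtm_type;
	unsigned	rtm_flags;
};
static_assert(sizeof(struct ex_rtmsg) == 12, "unexpected rtmsg layout");

/* Assumed TLV header: the length covers the header and the payload. */
struct ex_rtattr {
	uint16_t	rta_len;
	uint16_t	rta_type;
};

#define EX_ALIGN(len)	(((len) + 3) & ~(size_t)3)	/* 4-byte alignment */
#define EX_RTA_DST	1	/* assumed numeric value of RTA_DST */

/*
 * Write the rtmsg header and one RTA_DST attribute into buf.
 * Returns the number of bytes used, or 0 if buf is too small.
 */
size_t
ex_encode_route_key(void *buf, size_t buflen, unsigned char family,
    const void *addr, size_t addrlen, unsigned char prefixlen)
{
	struct ex_rtmsg rtm;
	struct ex_rtattr rta;
	size_t need = sizeof(rtm) + EX_ALIGN(sizeof(rta) + addrlen);
	unsigned char *p = buf;

	if (buflen < need)
		return (0);
	memset(buf, 0, need);		/* also zeroes the padding bytes */

	memset(&rtm, 0, sizeof(rtm));
	rtm.rtm_family = family;
	rtm.rtm_dst_len = prefixlen;
	memcpy(p, &rtm, sizeof(rtm));
	p += sizeof(rtm);

	rta.rta_len = (uint16_t)(sizeof(rta) + addrlen);
	rta.rta_type = EX_RTA_DST;
	memcpy(p, &rta, sizeof(rta));
	memcpy(p + sizeof(rta), addr, addrlen);
	return (need);
}
```

For an IPv4 prefix the result is 20 bytes: the 12-byte header plus an 8-byte
attribute.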
.Ss RTM_GETROUTE
Fetches a single route or all routes in the current VNET, depending on the
.Dv NLM_F_DUMP
flag.
Each route is reported as a
.Dv RTM_NEWROUTE
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
rtm_family	required family or AF_UNSPEC
RTA_TABLE	fib number or RT_TABLE_UNSPEC to return all fibs
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv RTA_DST
(binary) IPv4/IPv6 address, depending on the
.Va rtm_family .
.It Dv RTA_OIF
(uint32_t) transmit interface index.
.It Dv RTA_GATEWAY
(binary) IPv4/IPv6 gateway address, depending on the
.Va rtm_family .
.It Dv RTA_METRICS
(nested) Container attribute, listing route properties.
The only supported sub-attribute is
.Dv RTAX_MTU ,
which stores the path MTU as uint32_t.
.It Dv RTA_MULTIPATH
This attribute contains multipath route nexthops with their weights.
These nexthops are represented as a sequence of
.Va rtnexthop
structures, each followed by
.Dv RTA_GATEWAY
or
.Dv RTA_VIA
attributes.
.Bd -literal
struct rtnexthop {
	unsigned short		rtnh_len;
	unsigned char		rtnh_flags;
	unsigned char		rtnh_hops;	/* nexthop weight */
	int			rtnh_ifindex;
};
.Ed
.Pp
The
.Va rtnh_len
field specifies the total nexthop info length, including both
.Va struct rtnexthop
and the following TLVs.
The
.Va rtnh_hops
field stores the relative nexthop weight, used for load balancing between group
members.
The
.Va rtnh_ifindex
field contains the index of the transmit interface.
.Pp
The following TLVs can follow the structure:
.Bd -literal -offset indent -compact
RTA_GATEWAY	IPv4/IPv6 nexthop address of the gateway
RTA_VIA		IPv6 nexthop address for IPv4 route
RTA_KNH_ID	Kernel-specific index of the nexthop
.Ed
.It Dv RTA_KNH_ID
(uint32_t) (FreeBSD-specific) Auto-allocated kernel index of the nexthop.
.It Dv RTA_RTFLAGS
(uint32_t) (FreeBSD-specific) rtsock route flags.
.It Dv RTA_TABLE
(uint32_t) Fib number of the route.
The default route table is
.Dv RT_TABLE_MAIN .
To explicitly specify "all tables", set the value to
.Dv RT_TABLE_UNSPEC .
.It Dv RTA_EXPIRES
(uint32_t) seconds until path expiration.
.It Dv RTA_NH_ID
(uint32_t) userland nexthop or nexthop group index.
.El
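.Pp
As a rough sketch of how the
.Dv RTA_MULTIPATH
payload described above might be walked, the following function steps from one
.Va rtnexthop
to the next using
.Va rtnh_len .
The ex_ names are hypothetical local mirrors, and the 4-byte alignment of each
entry is an assumption carried over from the general Netlink attribute rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical local mirror of struct rtnexthop as shown in this page. */
struct ex_rtnexthop {
	unsigned short	rtnh_len;	/* header plus trailing TLVs */
	unsigned char	rtnh_flags;
	unsigned char	rtnh_hops;	/* nexthop weight */
	int		rtnh_ifindex;
};
static_assert(sizeof(struct ex_rtnexthop) == 8, "unexpected rtnexthop layout");

#define EX_ALIGN(len)	(((len) + 3) & ~(size_t)3)	/* assumed alignment */

/*
 * Count the nexthops in an RTA_MULTIPATH payload and sum their weights.
 * Returns the nexthop count, or -1 if the payload is malformed.
 */
int
ex_walk_multipath(const void *payload, size_t len, unsigned *total_weight)
{
	const unsigned char *p = payload;
	int count = 0;

	*total_weight = 0;
	while (len >= sizeof(struct ex_rtnexthop)) {
		struct ex_rtnexthop nh;
		size_t step;

		memcpy(&nh, p, sizeof(nh));	/* avoid unaligned access */
		if (nh.rtnh_len < sizeof(nh) || nh.rtnh_len > len)
			return (-1);
		*total_weight += nh.rtnh_hops;
		count++;

		/* Skip the header together with its nested TLVs. */
		step = EX_ALIGN(nh.rtnh_len);
		if (step > len)
			step = len;
		p += step;
		len -= step;
	}
	/* Leftover bytes mean a truncated entry. */
	return (len == 0 ? count : -1);
}
```

A caller interested in the gateways would parse the bytes between the end of
each header and
.Va rtnh_len
as ordinary TLVs.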
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_IPV4_ROUTE	Notifies on IPv4 route arrival/removal/change
RTNLGRP_IPV6_ROUTE	Notifies on IPv6 route arrival/removal/change
.Ed
.Sh NEXTHOPS
All nexthop/nexthop group configuration messages share the common header:
.Bd -literal
struct nhmsg {
	unsigned char	nh_family;	/* transport family */
	unsigned char	nh_scope;	/* ignored on RX, filled by kernel */
	unsigned char	nh_protocol;	/* Routing protocol that installed nh */
	unsigned char	resvd;
	unsigned int	nh_flags;	/* RTNH_F_* flags from route.h */
};
.Ed
.Pp
The
.Va nh_family
specifies the gateway address family.
It can be different from the route address family for IPv4 routes with IPv6
nexthops.
The
.Va nh_protocol
field is similar to the
.Va rtm_protocol
field and designates the identity of the originating application.
.Pp
The following messages are supported:
.Ss RTM_NEWNEXTHOP
Creates a new nexthop or nexthop group.
.Ss RTM_DELNEXTHOP
Deletes a nexthop or nexthop group.
The required object is specified by the
.Dv RTA_NH_ID
attribute.
.Ss RTM_GETNEXTHOP
Fetches a single nexthop or all nexthops/nexthop groups, depending on the
.Dv NLM_F_DUMP
flag.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
RTA_NH_ID	nexthop or nexthop group id
NHA_GROUPS	match only nexthop groups
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv RTA_NH_ID
(uint32_t) Nexthop index used to identify a particular nexthop or nexthop group.
Should be provided by userland at nexthop creation time.
.It Dv NHA_GROUP
This attribute designates the nexthop group and contains all of its nexthops
and their relative weights.
The attribute consists of a list of
.Va nexthop_grp
structures:
.Bd -literal
struct nexthop_grp {
	uint32_t	id;		/* nexthop userland index */
	uint8_t		weight;		/* weight of this nexthop */
	uint8_t		resvd1;
	uint16_t	resvd2;
};
.Ed
.It Dv NHA_GROUP_TYPE
(uint16_t) Nexthop group type, set to one of the following types:
.Bd -literal -offset indent -compact
NEXTHOP_GRP_TYPE_MPATH	default multipath group
.Ed
.It Dv NHA_BLACKHOLE
(flag) Marks the nexthop as blackhole.
.It Dv NHA_OIF
(uint32_t) Transmit interface index of the nexthop.
.It Dv NHA_GATEWAY
(binary) IPv4/IPv6 gateway address.
.It Dv NHA_GROUPS
(flag) Matches nexthop groups during dump.
.El
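.Pp
The
.Dv NHA_GROUP
payload described above is a plain array of
.Va nexthop_grp
entries, which might be filled as in the following sketch.
The ex_ names are hypothetical local mirrors of the structure shown in this
page, and zeroing the reserved fields is an assumption rather than a documented
requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of struct nexthop_grp as shown in this page. */
struct ex_nexthop_grp {
	uint32_t	id;		/* nexthop userland index */
	uint8_t		weight;		/* weight of this nexthop */
	uint8_t		resvd1;
	uint16_t	resvd2;
};
static_assert(sizeof(struct ex_nexthop_grp) == 8,
    "unexpected nexthop_grp layout");

/*
 * Fill grp[] with n members taken from ids[] and weights[].
 * Returns the payload size in bytes, or 0 if n is 0 or exceeds max.
 */
size_t
ex_fill_nhgrp(struct ex_nexthop_grp *grp, size_t max, const uint32_t *ids,
    const uint8_t *weights, size_t n)
{
	size_t i;

	if (n == 0 || n > max)
		return (0);
	memset(grp, 0, n * sizeof(*grp));	/* clears the reserved fields */
	for (i = 0; i < n; i++) {
		grp[i].id = ids[i];
		grp[i].weight = weights[i];
	}
	return (n * sizeof(*grp));
}
```

The returned size is what a caller would use as the attribute payload length
when appending
.Dv NHA_GROUP
to the message.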
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_NEXTHOP		Notifies on nexthop/groups arrival/removal/change
.Ed
.Sh INTERFACES
All interface configuration messages share the common header:
.Bd -literal
struct ifinfomsg {
	unsigned char	ifi_family;	/* not used, set to 0 */
	unsigned char	__ifi_pad;
	unsigned short	ifi_type;	/* ARPHRD_* */
	int		ifi_index;	/* Interface index */
	unsigned	ifi_flags;	/* IFF_* flags */
	unsigned	ifi_change;	/* IFF_* change mask */
};
.Ed
.Ss RTM_NEWLINK
Creates a new interface.
The only mandatory TLV is
.Dv IFLA_IFNAME .
The following attributes are returned inside the nested
.Dv NLMSGERR_ATTR_COOKIE :
.Pp
.Bd -literal -offset indent -compact
IFLA_NEW_IFINDEX	(uint32) created interface index
IFLA_IFNAME		(string) created interface name
.Ed
.Ss RTM_DELLINK
Deletes the interface specified by
.Dv IFLA_IFNAME .
.Ss RTM_GETLINK
Fetches a single interface or all interfaces in the current VNET, depending on the
.Dv NLM_F_DUMP
flag.
Each interface is reported as a
.Dv RTM_NEWLINK
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
ifi_index	interface index
IFLA_IFNAME	interface name
IFLA_ALT_IFNAME	interface name
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv IFLA_ADDRESS
(binary) Link-level interface address (MAC).
.It Dv IFLA_BROADCAST
(binary) (readonly) Link-level broadcast address.
.It Dv IFLA_IFNAME
(string) New interface name.
.It Dv IFLA_IFALIAS
(string) Interface description.
.It Dv IFLA_LINK
(uint32_t) (readonly) Interface index.
.It Dv IFLA_MASTER
(uint32_t) Parent interface index.
.It Dv IFLA_LINKINFO
(nested) Interface type-specific attributes:
.Bd -literal -offset indent -compact
IFLA_INFO_KIND		(string) interface type ("vlan")
IFLA_INFO_DATA		(nested) custom attributes
.Ed
The following types and attributes are supported:
.Bl -tag -width indent
.It Dv vlan
.Bd -literal -offset indent -compact
IFLA_VLAN_ID		(uint16_t) 802.1Q vlan id
IFLA_VLAN_PROTOCOL	(uint16_t) Protocol: ETHERTYPE_VLAN or ETHERTYPE_QINQ
.Ed
.El
.It Dv IFLA_OPERSTATE
(uint8_t) Interface operational state per RFC 2863.
Can be one of the following:
.Bd -literal -offset indent -compact
IF_OPER_UNKNOWN		status cannot be determined
IF_OPER_NOTPRESENT	some (hardware) component not present
IF_OPER_DOWN		down
IF_OPER_LOWERLAYERDOWN	some lower-level interface is down
IF_OPER_TESTING		in some test mode
IF_OPER_DORMANT		"up" but waiting for some condition (802.1X)
IF_OPER_UP		ready to pass packets
.Ed
.It Dv IFLA_STATS64
(readonly) Consists of the following structure of 64-bit counters:
.Bd -literal
struct rtnl_link_stats64 {
	uint64_t rx_packets;	/* total RX packets (IFCOUNTER_IPACKETS) */
	uint64_t tx_packets;	/* total TX packets (IFCOUNTER_OPACKETS) */
	uint64_t rx_bytes;	/* total RX bytes (IFCOUNTER_IBYTES) */
	uint64_t tx_bytes;	/* total TX bytes (IFCOUNTER_OBYTES) */
	uint64_t rx_errors;	/* RX errors (IFCOUNTER_IERRORS) */
	uint64_t tx_errors;	/* TX errors (IFCOUNTER_OERRORS) */
	uint64_t rx_dropped;	/* RX drop (no space in ring/no bufs) (IFCOUNTER_IQDROPS) */
	uint64_t tx_dropped;	/* TX drop (IFCOUNTER_OQDROPS) */
	uint64_t multicast;	/* RX multicast packets (IFCOUNTER_IMCASTS) */
	uint64_t collisions;	/* not supported */
	uint64_t rx_length_errors;	/* not supported */
	uint64_t rx_over_errors;	/* not supported */
	uint64_t rx_crc_errors;		/* not supported */
	uint64_t rx_frame_errors;	/* not supported */
	uint64_t rx_fifo_errors;	/* not supported */
	uint64_t rx_missed_errors;	/* not supported */
	uint64_t tx_aborted_errors;	/* not supported */
	uint64_t tx_carrier_errors;	/* not supported */
	uint64_t tx_fifo_errors;	/* not supported */
	uint64_t tx_heartbeat_errors;	/* not supported */
	uint64_t tx_window_errors;	/* not supported */
	uint64_t rx_compressed;		/* not supported */
	uint64_t tx_compressed;		/* not supported */
	uint64_t rx_nohandler;	/* dropped due to no proto handler (IFCOUNTER_NOPROTO) */
};
.Ed
.El
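.Pp
A program that prints the
.Dv IFLA_OPERSTATE
value might map it to a name as sketched below.
The numeric values are not stated in this page: the sketch assumes the states
are numbered 0 through 6 in the order listed above, and the function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Map an IFLA_OPERSTATE value to a printable name.
 * Assumption: IF_OPER_UNKNOWN is 0 and IF_OPER_UP is 6, with the other
 * states in between in the order this page lists them.
 */
const char *
ex_operstate_name(uint8_t state)
{
	static const char *const names[] = {
		"unknown", "notpresent", "down", "lowerlayerdown",
		"testing", "dormant", "up",
	};
	static_assert(sizeof(names) / sizeof(names[0]) == 7,
	    "one name per documented state");

	if (state >= sizeof(names) / sizeof(names[0]))
		return ("invalid");
	return (names[state]);
}
```

Real code should index by the
.Dv IF_OPER_*
constants from the system headers rather than by literal numbers.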
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_LINK		Notifies on interface arrival/removal/change
.Ed
.Sh INTERFACE ADDRESSES
All interface address configuration messages share the common header:
.Bd -literal
struct ifaddrmsg {
	uint8_t		ifa_family;	/* Address family */
	uint8_t		ifa_prefixlen;	/* Prefix length */
	uint8_t		ifa_flags;	/* Address-specific flags */
	uint8_t		ifa_scope;	/* Address scope */
	uint32_t	ifa_index;	/* Link ifindex */
};
.Ed
.Pp
The
.Va ifa_family
specifies the address family of the interface address.
The
.Va ifa_prefixlen
specifies the prefix length if applicable for the address family.
The
.Va ifa_index
specifies the interface index of the target interface.
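.Pp
Since the header carries a prefix length rather than a netmask, a consumer
that needs an IPv4 mask has to derive it.
The following is a small sketch of that arithmetic; the function name is
hypothetical and the result is in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an IPv4 ifa_prefixlen (0..32) to a netmask in host byte order.
 * A zero prefix is handled separately because shifting a 32-bit value
 * by 32 bits is undefined in C.
 */
uint32_t
ex_prefixlen_to_mask(uint8_t prefixlen)
{
	assert(prefixlen <= 32);
	if (prefixlen == 0)
		return (0);
	return (UINT32_MAX << (32 - prefixlen));
}
```

For example, a prefix length of 24 yields 0xffffff00, which
.Xr htonl 3
converts to the usual 255.255.255.0 in network byte order.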
.Ss RTM_NEWADDR
Not supported.
.Ss RTM_DELADDR
Not supported.
.Ss RTM_GETADDR
Fetches interface addresses in the current VNET matching conditions.
Each address is reported as a
.Dv RTM_NEWADDR
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
ifa_family	required family or AF_UNSPEC
ifa_index	matching interface index or 0
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv IFA_ADDRESS
(binary) masked interface address or destination address for p2p interfaces.
.It Dv IFA_LOCAL
(binary) local interface address.
Set for IPv4 and p2p addresses.
.It Dv IFA_LABEL
(string) interface name.
.It Dv IFA_BROADCAST
(binary) broadcast interface address.
.El
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_IPV4_IFADDR	Notifies on IPv4 ifaddr arrival/removal/change
RTNLGRP_IPV6_IFADDR	Notifies on IPv6 ifaddr arrival/removal/change
.Ed
.Sh NEIGHBORS
All neighbor configuration messages share the common header:
.Bd -literal
struct ndmsg {
	uint8_t		ndm_family;
	uint8_t		ndm_pad1;
	uint16_t	ndm_pad2;
	int32_t		ndm_ifindex;
	uint16_t	ndm_state;
	uint8_t		ndm_flags;
	uint8_t		ndm_type;
};
.Ed
.Pp
The
.Va ndm_family
field specifies the address family (IPv4 or IPv6) of the neighbor.
The
.Va ndm_ifindex
specifies the interface to operate on.
The
.Va ndm_state
represents the entry state according to the neighbor model.
The state can be one of the following:
.Bd -literal -offset indent -compact
NUD_INCOMPLETE		No lladdr, address resolution in progress
NUD_REACHABLE		reachable & recently resolved
NUD_STALE		has lladdr but it's stale
NUD_DELAY		has lladdr, is stale, probes delayed
NUD_PROBE		has lladdr, is stale, probes sent
NUD_FAILED		unused
.Ed
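.Pp
The states above might be decoded for display as in the following sketch.
The bit values are not given in this page and are assumed here; the EX_ and
ex_ names are hypothetical, and real code should use the
.Dv NUD_*
constants from the system headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit values for the documented states. */
#define EX_NUD_INCOMPLETE	0x01
#define EX_NUD_REACHABLE	0x02
#define EX_NUD_STALE		0x04
#define EX_NUD_DELAY		0x08
#define EX_NUD_PROBE		0x10
#define EX_NUD_FAILED		0x20

/* Return a printable name for the first documented state bit that is set. */
const char *
ex_nud_state_name(uint16_t state)
{
	static const struct {
		uint16_t	bit;
		const char	*name;
	} names[] = {
		{ EX_NUD_INCOMPLETE,	"incomplete" },
		{ EX_NUD_REACHABLE,	"reachable" },
		{ EX_NUD_STALE,		"stale" },
		{ EX_NUD_DELAY,		"delay" },
		{ EX_NUD_PROBE,		"probe" },
		{ EX_NUD_FAILED,	"failed" },
	};
	static_assert(sizeof(names) / sizeof(names[0]) == 6,
	    "one row per documented state");
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (state & names[i].bit)
			return (names[i].name);
	}
	return ("unknown");
}
```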
.Pp
The
.Va ndm_flags
field stores the options specific to this entry.
Available flags:
.Bd -literal -offset indent -compact
NTF_SELF		local station (LLE_IFADDR)
NTF_PROXY		proxy entry (LLE_PUB)
NTF_STICKY		permanent entry (LLE_STATIC)
NTF_ROUTER		dst indicated itself as a router
.Ed
.Ss RTM_NEWNEIGH
Creates a new neighbor entry.
The mandatory options are
.Dv NDA_DST ,
.Dv NDA_LLADDR
and
.Dv NDA_IFINDEX .
.Ss RTM_DELNEIGH
Deletes the neighbor entry.
The entry is specified by the combination of
.Dv NDA_DST
and
.Dv NDA_IFINDEX .
.Ss RTM_GETNEIGH
Fetches a single neighbor or all neighbors in the current VNET, depending on the
.Dv NLM_F_DUMP
flag.
Each entry is reported as a
.Dv RTM_NEWNEIGH
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
ndm_family	required family or AF_UNSPEC
ndm_ifindex	target ifindex
NDA_IFINDEX	target ifindex
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv NDA_DST
(binary) neighbor IPv4/IPv6 address.
.It Dv NDA_LLADDR
(binary) neighbor link-level address.
.It Dv NDA_IFINDEX
(uint32_t) interface index.
.It Dv NDA_FLAGS_EXT
(uint32_t) extended version of
.Va ndm_flags .
.El
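.Pp
When reading a
.Dv RTM_NEWNEIGH
reply, the TLVs listed above follow the
.Va ndmsg
header and have to be located by type.
The sketch below shows one way to search such an attribute stream.
The attribute header layout, the 4-byte alignment and the numeric attribute
values are assumptions, and all ex_ and EX_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed TLV header: the length covers the header and the payload. */
struct ex_nlattr {
	uint16_t	nla_len;
	uint16_t	nla_type;
};
static_assert(sizeof(struct ex_nlattr) == 4, "unexpected attribute header");

#define EX_ALIGN(len)	(((len) + 3) & ~(size_t)3)	/* assumed alignment */
#define EX_NDA_DST	1	/* assumed numeric value of NDA_DST */
#define EX_NDA_LLADDR	2	/* assumed numeric value of NDA_LLADDR */

/*
 * Find the first attribute of the given type in attrs[0..len).
 * Returns a pointer to its payload and stores the payload length in
 * *plen, or returns NULL if the type is absent or the stream is malformed.
 */
const void *
ex_find_attr(const void *attrs, size_t len, uint16_t type, size_t *plen)
{
	const unsigned char *p = attrs;

	while (len >= sizeof(struct ex_nlattr)) {
		struct ex_nlattr hdr;
		size_t step;

		memcpy(&hdr, p, sizeof(hdr));
		if (hdr.nla_len < sizeof(hdr) || hdr.nla_len > len)
			return (NULL);
		if (hdr.nla_type == type) {
			*plen = hdr.nla_len - sizeof(hdr);
			return (p + sizeof(hdr));
		}
		/* Advance past the payload and its padding. */
		step = EX_ALIGN(hdr.nla_len);
		if (step > len)
			break;
		p += step;
		len -= step;
	}
	return (NULL);
}
```

The same walk applies to the TLVs of the other message types in this page,
since they share the attribute encoding.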
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_NEIGH	Notifies on ARP/NDP neighbor arrival/removal/change
.Ed
.Sh SEE ALSO
.Xr netlink 4 ,
.Xr route 4
.Sh HISTORY
The
.Dv NETLINK_ROUTE
protocol family appeared in
.Fx 14.0 .
.Sh AUTHORS
Netlink was implemented by
.An -nosplit
.An Alexander Chernikov Aq Mt melifaro@FreeBSD.org .
It was derived from the Google Summer of Code 2021 project by
.An Ng Peng Nam Sean .
542