1.. SPDX-License-Identifier: GPL-2.0+
2
3=================================================================
4Linux Base Driver for the Intel(R) Ethernet Controller 800 Series
5=================================================================
6
7Intel ice Linux driver.
8Copyright(c) 2018-2021 Intel Corporation.
9
10Contents
11========
12
13- Overview
14- Identifying Your Adapter
15- Important Notes
16- Additional Features & Configurations
17- Performance Optimization
18
19
20The associated Virtual Function (VF) driver for this driver is iavf.
21
22Driver information can be obtained using ethtool and lspci.
23
24For questions related to hardware requirements, refer to the documentation
25supplied with your Intel adapter. All hardware requirements listed apply to use
26with Linux.
27
28This driver supports XDP (Express Data Path) and AF_XDP zero-copy. Note that
29XDP is blocked for frame sizes larger than 3KB.
30
31
32Identifying Your Adapter
33========================
34For information on how to identify your adapter, and for the latest Intel
35network drivers, refer to the Intel Support website:
36https://www.intel.com/support
37
38
39Important Notes
40===============
41
42Packet drops may occur under receive stress
43-------------------------------------------
44Devices based on the Intel(R) Ethernet Controller 800 Series are designed to
45tolerate a limited amount of system latency during PCIe and DMA transactions.
46If these transactions take longer than the tolerated latency, it can impact the
47length of time the packets are buffered in the device and associated memory,
48which may result in dropped packets. These packets drops typically do not have
49a noticeable impact on throughput and performance under standard workloads.
50
51If these packet drops appear to affect your workload, the following may improve
52the situation:
53
541) Make sure that your system's physical memory is in a high-performance
55   configuration, as recommended by the platform vendor. A common
56   recommendation is for all channels to be populated with a single DIMM
57   module.
582) In your system's BIOS/UEFI settings, select the "Performance" profile.
593) Your distribution may provide tools like "tuned," which can help tweak
60   kernel settings to achieve better standard settings for different workloads.
61
62
63Configuring SR-IOV for improved network security
64------------------------------------------------
65In a virtualized environment, on Intel(R) Ethernet Network Adapters that
66support SR-IOV, the virtual function (VF) may be subject to malicious behavior.
67Software-generated layer two frames, like IEEE 802.3x (link flow control), IEEE
68802.1Qbb (priority based flow-control), and others of this type, are not
69expected and can throttle traffic between the host and the virtual switch,
70reducing performance. To resolve this issue, and to ensure isolation from
71unintended traffic streams, configure all SR-IOV enabled ports for VLAN tagging
72from the administrative interface on the PF. This configuration allows
73unexpected, and potentially malicious, frames to be dropped.
74
75See "Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports" later in this
76README for configuration instructions.
77
78
79Do not unload port driver if VF with active VM is bound to it
80-------------------------------------------------------------
81Do not unload a port's driver if a Virtual Function (VF) with an active Virtual
82Machine (VM) is bound to it. Doing so will cause the port to appear to hang.
83Once the VM shuts down, or otherwise releases the VF, the command will
84complete.
85
86
87Additional Features and Configurations
88======================================
89
90ethtool
91-------
92The driver utilizes the ethtool interface for driver configuration and
93diagnostics, as well as displaying statistical information. The latest ethtool
94version is required for this functionality. Download it at:
95https://kernel.org/pub/software/network/ethtool/
96
97NOTE: The rx_bytes value of ethtool does not match the rx_bytes value of
98Netdev, due to the 4-byte CRC being stripped by the device. The difference
99between the two rx_bytes values will be 4 x the number of Rx packets. For
100example, if Rx packets are 10 and Netdev (software statistics) displays
101rx_bytes as "X", then ethtool (hardware statistics) will display rx_bytes as
102"X+40" (4 bytes CRC x 10 packets).
103
104
105Viewing Link Messages
106---------------------
107Link messages will not be displayed to the console if the distribution is
108restricting system messages. In order to see network driver link messages on
109your console, set dmesg to eight by entering the following::
110
111  # dmesg -n 8
112
113NOTE: This setting is not saved across reboots.
114
115
116Dynamic Device Personalization
117------------------------------
118Dynamic Device Personalization (DDP) allows you to change the packet processing
119pipeline of a device by applying a profile package to the device at runtime.
120Profiles can be used to, for example, add support for new protocols, change
121existing protocols, or change default settings. DDP profiles can also be rolled
122back without rebooting the system.
123
124The DDP package loads during device initialization. The driver looks for
125``intel/ice/ddp/ice.pkg`` in your firmware root (typically ``/lib/firmware/``
126or ``/lib/firmware/updates/``) and checks that it contains a valid DDP package
127file.
128
129NOTE: Your distribution should likely have provided the latest DDP file, but if
130ice.pkg is missing, you can find it in the linux-firmware repository or from
131intel.com.
132
133If the driver is unable to load the DDP package, the device will enter Safe
134Mode. Safe Mode disables advanced and performance features and supports only
135basic traffic and minimal functionality, such as updating the NVM or
136downloading a new driver or DDP package. Safe Mode only applies to the affected
137physical function and does not impact any other PFs. See the "Intel(R) Ethernet
138Adapters and Devices User Guide" for more details on DDP and Safe Mode.
139
140NOTES:
141
142- If you encounter issues with the DDP package file, you may need to download
143  an updated driver or DDP package file. See the log messages for more
144  information.
145
146- The ice.pkg file is a symbolic link to the default DDP package file.
147
148- You cannot update the DDP package if any PF drivers are already loaded. To
149  overwrite a package, unload all PFs and then reload the driver with the new
150  package.
151
152- Only the first loaded PF per device can download a package for that device.
153
154You can install specific DDP package files for different physical devices in
155the same system. To install a specific DDP package file:
156
1571. Download the DDP package file you want for your device.
158
1592. Rename the file ice-xxxxxxxxxxxxxxxx.pkg, where 'xxxxxxxxxxxxxxxx' is the
160   unique 64-bit PCI Express device serial number (in hex) of the device you
161   want the package downloaded on. The filename must include the complete
162   serial number (including leading zeros) and be all lowercase. For example,
163   if the 64-bit serial number is b887a3ffffca0568, then the file name would be
164   ice-b887a3ffffca0568.pkg.
165
166   To find the serial number from the PCI bus address, you can use the
167   following command::
168
169     # lspci -vv -s af:00.0 | grep -i Serial
170     Capabilities: [150 v1] Device Serial Number b8-87-a3-ff-ff-ca-05-68
171
172   You can use the following command to format the serial number without the
173   dashes::
174
175     # lspci -vv -s af:00.0 | grep -i Serial | awk '{print $7}' | sed s/-//g
176     b887a3ffffca0568
177
1783. Copy the renamed DDP package file to
179   ``/lib/firmware/updates/intel/ice/ddp/``. If the directory does not yet
180   exist, create it before copying the file.
181
1824. Unload all of the PFs on the device.
183
1845. Reload the driver with the new package.
185
186NOTE: The presence of a device-specific DDP package file overrides the loading
187of the default DDP package file (ice.pkg).
188
189
190Intel(R) Ethernet Flow Director
191-------------------------------
192The Intel Ethernet Flow Director performs the following tasks:
193
194- Directs receive packets according to their flows to different queues
195- Enables tight control on routing a flow in the platform
196- Matches flows and CPU cores for flow affinity
197
198NOTE: This driver supports the following flow types:
199
200- IPv4
201- TCPv4
202- UDPv4
203- SCTPv4
204- IPv6
205- TCPv6
206- UDPv6
207- SCTPv6
208
209Each flow type supports valid combinations of IP addresses (source or
210destination) and UDP/TCP/SCTP ports (source and destination). You can supply
211only a source IP address, a source IP address and a destination port, or any
212combination of one or more of these four parameters.
213
214NOTE: This driver allows you to filter traffic based on a user-defined flexible
215two-byte pattern and offset by using the ethtool user-def and mask fields. Only
216L3 and L4 flow types are supported for user-defined flexible filters. For a
217given flow type, you must clear all Intel Ethernet Flow Director filters before
218changing the input set (for that flow type).
219
220
221Flow Director Filters
222---------------------
223Flow Director filters are used to direct traffic that matches specified
224characteristics. They are enabled through ethtool's ntuple interface. To enable
225or disable the Intel Ethernet Flow Director and these filters::
226
227  # ethtool -K <ethX> ntuple <off|on>
228
229NOTE: When you disable ntuple filters, all the user programmed filters are
230flushed from the driver cache and hardware. All needed filters must be re-added
231when ntuple is re-enabled.
232
233To display all of the active filters::
234
235  # ethtool -u <ethX>
236
237To add a new filter::
238
239  # ethtool -U <ethX> flow-type <type> src-ip <ip> [m <ip_mask>] dst-ip <ip>
240  [m <ip_mask>] src-port <port> [m <port_mask>] dst-port <port> [m <port_mask>]
241  action <queue>
242
243  Where:
244    <ethX> - the Ethernet device to program
245    <type> - can be ip4, tcp4, udp4, sctp4, ip6, tcp6, udp6, sctp6
246    <ip> - the IP address to match on
247    <ip_mask> - the IPv4 address to mask on
248              NOTE: These filters use inverted masks.
249    <port> - the port number to match on
250    <port_mask> - the 16-bit integer for masking
251              NOTE: These filters use inverted masks.
252    <queue> - the queue to direct traffic toward (-1 discards the
253              matched traffic)
254
255To delete a filter::
256
257  # ethtool -U <ethX> delete <N>
258
259  Where <N> is the filter ID displayed when printing all the active filters,
260  and may also have been specified using "loc <N>" when adding the filter.
261
262EXAMPLES:
263
264To add a filter that directs packet to queue 2::
265
266  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
267  192.168.10.2 src-port 2000 dst-port 2001 action 2 [loc 1]
268
269To set a filter using only the source and destination IP address::
270
271  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
272  192.168.10.2 action 2 [loc 1]
273
274To set a filter based on a user-defined pattern and offset::
275
276  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
277  192.168.10.2 user-def 0x4FFFF action 2 [loc 1]
278
279  where the value of the user-def field contains the offset (4 bytes) and
280  the pattern (0xffff).
281
282To match TCP traffic sent from 192.168.0.1, port 5300, directed to 192.168.0.5,
283port 80, and then send it to queue 7::
284
285  # ethtool -U enp130s0 flow-type tcp4 src-ip 192.168.0.1 dst-ip 192.168.0.5
286  src-port 5300 dst-port 80 action 7
287
288To add a TCPv4 filter with a partial mask for a source IP subnet::
289
290  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.0.0 m 0.255.255.255 dst-ip
291  192.168.5.12 src-port 12600 dst-port 31 action 12
292
293NOTES:
294
295For each flow-type, the programmed filters must all have the same matching
296input set. For example, issuing the following two commands is acceptable::
297
298  # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
299  # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.5 src-port 55 action 10
300
301Issuing the next two commands, however, is not acceptable, since the first
302specifies src-ip and the second specifies dst-ip::
303
304  # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
305  # ethtool -U enp130s0 flow-type ip4 dst-ip 192.168.0.5 src-port 55 action 10
306
307The second command will fail with an error. You may program multiple filters
308with the same fields, using different values, but, on one device, you may not
309program two tcp4 filters with different matching fields.
310
311The ice driver does not support matching on a subportion of a field, thus
312partial mask fields are not supported.
313
314
315Flex Byte Flow Director Filters
316-------------------------------
317The driver also supports matching user-defined data within the packet payload.
318This flexible data is specified using the "user-def" field of the ethtool
319command in the following way:
320
321.. table::
322
323    ============================== ============================
324    ``31    28    24    20    16`` ``15    12    8    4    0``
325    ``offset into packet payload`` ``2 bytes of flexible data``
326    ============================== ============================
327
328For example,
329
330::
331
332  ... user-def 0x4FFFF ...
333
334tells the filter to look 4 bytes into the payload and match that value against
3350xFFFF. The offset is based on the beginning of the payload, and not the
336beginning of the packet. Thus
337
338::
339
340  flow-type tcp4 ... user-def 0x8BEAF ...
341
342would match TCP/IPv4 packets which have the value 0xBEAF 8 bytes into the
343TCP/IPv4 payload.
344
345Note that ICMP headers are parsed as 4 bytes of header and 4 bytes of payload.
346Thus to match the first byte of the payload, you must actually add 4 bytes to
347the offset. Also note that ip4 filters match both ICMP frames as well as raw
348(unknown) ip4 frames, where the payload will be the L3 payload of the IP4
349frame.
350
351The maximum offset is 64. The hardware will only read up to 64 bytes of data
352from the payload. The offset must be even because the flexible data is 2 bytes
353long and must be aligned to byte 0 of the packet payload.
354
355The user-defined flexible offset is also considered part of the input set and
356cannot be programmed separately for multiple filters of the same type. However,
357the flexible data is not part of the input set and multiple filters may use the
358same offset but match against different data.
359
360
361RSS Hash Flow
362-------------
363Allows you to set the hash bytes per flow type and any combination of one or
364more options for Receive Side Scaling (RSS) hash byte configuration.
365
366::
367
368  # ethtool -N <ethX> rx-flow-hash <type> <option>
369
370  Where <type> is:
371    tcp4  signifying TCP over IPv4
372    udp4  signifying UDP over IPv4
373    tcp6  signifying TCP over IPv6
374    udp6  signifying UDP over IPv6
375  And <option> is one or more of:
376    s     Hash on the IP source address of the Rx packet.
377    d     Hash on the IP destination address of the Rx packet.
378    f     Hash on bytes 0 and 1 of the Layer 4 header of the Rx packet.
379    n     Hash on bytes 2 and 3 of the Layer 4 header of the Rx packet.
380
381
382Accelerated Receive Flow Steering (aRFS)
383----------------------------------------
384Devices based on the Intel(R) Ethernet Controller 800 Series support
385Accelerated Receive Flow Steering (aRFS) on the PF. aRFS is a load-balancing
386mechanism that allows you to direct packets to the same CPU where an
387application is running or consuming the packets in that flow.
388
389NOTES:
390
391- aRFS requires that ntuple filtering is enabled via ethtool.
392- aRFS support is limited to the following packet types:
393
394    - TCP over IPv4 and IPv6
395    - UDP over IPv4 and IPv6
396    - Nonfragmented packets
397
398- aRFS only supports Flow Director filters, which consist of the
399  source/destination IP addresses and source/destination ports.
400- aRFS and ethtool's ntuple interface both use the device's Flow Director. aRFS
401  and ntuple features can coexist, but you may encounter unexpected results if
402  there's a conflict between aRFS and ntuple requests. See "Intel(R) Ethernet
403  Flow Director" for additional information.
404
405To set up aRFS:
406
4071. Enable the Intel Ethernet Flow Director and ntuple filters using ethtool.
408
409::
410
411   # ethtool -K <ethX> ntuple on
412
4132. Set up the number of entries in the global flow table. For example:
414
415::
416
417   # NUM_RPS_ENTRIES=16384
418   # echo $NUM_RPS_ENTRIES > /proc/sys/net/core/rps_sock_flow_entries
419
4203. Set up the number of entries in the per-queue flow table. For example:
421
422::
423
424   # NUM_RX_QUEUES=64
425   # for file in /sys/class/net/$IFACE/queues/rx-*/rps_flow_cnt; do
426   # echo $(($NUM_RPS_ENTRIES/$NUM_RX_QUEUES)) > $file;
427   # done
428
4294. Disable the IRQ balance daemon (this is only a temporary stop of the service
430   until the next reboot).
431
432::
433
434   # systemctl stop irqbalance
435
4365. Configure the interrupt affinity.
437
438   See ``/Documentation/core-api/irq/irq-affinity.rst``
439
440
441To disable aRFS using ethtool::
442
443  # ethtool -K <ethX> ntuple off
444
445NOTE: This command will disable ntuple filters and clear any aRFS filters in
446software and hardware.
447
448Example Use Case:
449
4501. Set the server application on the desired CPU (e.g., CPU 4).
451
452::
453
454   # taskset -c 4 netserver
455
4562. Use netperf to route traffic from the client to CPU 4 on the server with
457   aRFS configured. This example uses TCP over IPv4.
458
459::
460
461   # netperf -H <Host IPv4 Address> -t TCP_STREAM
462
463
464Enabling Virtual Functions (VFs)
465--------------------------------
466Use sysfs to enable virtual functions (VF).
467
468For example, you can create 4 VFs as follows::
469
470  # echo 4 > /sys/class/net/<ethX>/device/sriov_numvfs
471
472To disable VFs, write 0 to the same file::
473
474  # echo 0 > /sys/class/net/<ethX>/device/sriov_numvfs
475
476The maximum number of VFs for the ice driver is 256 total (all ports). To check
477how many VFs each PF supports, use the following command::
478
479  # cat /sys/class/net/<ethX>/device/sriov_totalvfs
480
481Note: You cannot use SR-IOV when link aggregation (LAG)/bonding is active, and
482vice versa. To enforce this, the driver checks for this mutual exclusion.
483
484
485Displaying VF Statistics on the PF
486----------------------------------
487Use the following command to display the statistics for the PF and its VFs::
488
489  # ip -s link show dev <ethX>
490
491NOTE: The output of this command can be very large due to the maximum number of
492possible VFs.
493
494The PF driver will display a subset of the statistics for the PF and for all
495VFs that are configured. The PF will always print a statistics block for each
496of the possible VFs, and it will show zero for all unconfigured VFs.
497
498
499Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports
500--------------------------------------------------------
501To configure VLAN tagging for the ports on an SR-IOV enabled adapter, use the
502following command. The VLAN configuration should be done before the VF driver
503is loaded or the VM is booted. The VF is not aware of the VLAN tag being
504inserted on transmit and removed on received frames (sometimes called "port
505VLAN" mode).
506
507::
508
509  # ip link set dev <ethX> vf <id> vlan <vlan id>
510
511For example, the following will configure PF eth0 and the first VF on VLAN 10::
512
513  # ip link set dev eth0 vf 0 vlan 10
514
515
516Enabling a VF link if the port is disconnected
517----------------------------------------------
518If the physical function (PF) link is down, you can force link up (from the
519host PF) on any virtual functions (VF) bound to the PF.
520
521For example, to force link up on VF 0 bound to PF eth0::
522
523  # ip link set eth0 vf 0 state enable
524
525Note: If the command does not work, it may not be supported by your system.
526
527
528Setting the MAC Address for a VF
529--------------------------------
530To change the MAC address for the specified VF::
531
532  # ip link set <ethX> vf 0 mac <address>
533
534For example::
535
536  # ip link set <ethX> vf 0 mac 00:01:02:03:04:05
537
538This setting lasts until the PF is reloaded.
539
540NOTE: Assigning a MAC address for a VF from the host will disable any
541subsequent requests to change the MAC address from within the VM. This is a
542security feature. The VM is not aware of this restriction, so if this is
543attempted in the VM, it will trigger MDD events.
544
545
546Trusted VFs and VF Promiscuous Mode
547-----------------------------------
548This feature allows you to designate a particular VF as trusted and allows that
549trusted VF to request selective promiscuous mode on the Physical Function (PF).
550
551To set a VF as trusted or untrusted, enter the following command in the
552Hypervisor::
553
554  # ip link set dev <ethX> vf 1 trust [on|off]
555
556NOTE: It's important to set the VF to trusted before setting promiscuous mode.
557If the VM is not trusted, the PF will ignore promiscuous mode requests from the
558VF. If the VM becomes trusted after the VF driver is loaded, you must make a
559new request to set the VF to promiscuous.
560
561Once the VF is designated as trusted, use the following commands in the VM to
562set the VF to promiscuous mode.
563
564For promiscuous all::
565
566  # ip link set <ethX> promisc on
567  Where <ethX> is a VF interface in the VM
568
569For promiscuous Multicast::
570
571  # ip link set <ethX> allmulticast on
572  Where <ethX> is a VF interface in the VM
573
574NOTE: By default, the ethtool private flag vf-true-promisc-support is set to
575"off," meaning that promiscuous mode for the VF will be limited. To set the
576promiscuous mode for the VF to true promiscuous and allow the VF to see all
577ingress traffic, use the following command::
578
579  # ethtool --set-priv-flags <ethX> vf-true-promisc-support on
580
581The vf-true-promisc-support private flag does not enable promiscuous mode;
582rather, it designates which type of promiscuous mode (limited or true) you will
583get when you enable promiscuous mode using the ip link commands above. Note
584that this is a global setting that affects the entire device. However, the
585vf-true-promisc-support private flag is only exposed to the first PF of the
586device. The PF remains in limited promiscuous mode regardless of the
587vf-true-promisc-support setting.
588
589Next, add a VLAN interface on the VF interface. For example::
590
591  # ip link add link eth2 name eth2.100 type vlan id 100
592
593Note that the order in which you set the VF to promiscuous mode and add the
594VLAN interface does not matter (you can do either first). The result in this
595example is that the VF will get all traffic that is tagged with VLAN 100.
596
597
598Malicious Driver Detection (MDD) for VFs
599----------------------------------------
600Some Intel Ethernet devices use Malicious Driver Detection (MDD) to detect
601malicious traffic from the VF and disable Tx/Rx queues or drop the offending
602packet until a VF driver reset occurs. You can view MDD messages in the PF's
603system log using the dmesg command.
604
605- If the PF driver logs MDD events from the VF, confirm that the correct VF
606  driver is installed.
607- To restore functionality, you can manually reload the VF or VM or enable
608  automatic VF resets.
609- When automatic VF resets are enabled, the PF driver will immediately reset
610  the VF and reenable queues when it detects MDD events on the receive path.
611- If automatic VF resets are disabled, the PF will not automatically reset the
612  VF when it detects MDD events.
613
614To enable or disable automatic VF resets, use the following command::
615
616  # ethtool --set-priv-flags <ethX> mdd-auto-reset-vf on|off
617
618
619MAC and VLAN Anti-Spoofing Feature for VFs
620------------------------------------------
621When a malicious driver on a Virtual Function (VF) interface attempts to send a
622spoofed packet, it is dropped by the hardware and not transmitted.
623
624NOTE: This feature can be disabled for a specific VF::
625
626  # ip link set <ethX> vf <vf id> spoofchk {off|on}
627
628
629Jumbo Frames
630------------
631Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU)
632to a value larger than the default value of 1500.
633
634Use the ifconfig command to increase the MTU size. For example, enter the
635following where <ethX> is the interface number::
636
637  # ifconfig <ethX> mtu 9000 up
638
639Alternatively, you can use the ip command as follows::
640
641  # ip link set mtu 9000 dev <ethX>
642  # ip link set up dev <ethX>
643
644This setting is not saved across reboots.
645
646
647NOTE: The maximum MTU setting for jumbo frames is 9702. This corresponds to the
648maximum jumbo frame size of 9728 bytes.
649
650NOTE: This driver will attempt to use multiple page sized buffers to receive
651each jumbo packet. This should help to avoid buffer starvation issues when
652allocating receive packets.
653
654NOTE: Packet loss may have a greater impact on throughput when you use jumbo
655frames. If you observe a drop in performance after enabling jumbo frames,
656enabling flow control may mitigate the issue.
657
658
659Speed and Duplex Configuration
660------------------------------
661In addressing speed and duplex configuration issues, you need to distinguish
662between copper-based adapters and fiber-based adapters.
663
664In the default mode, an Intel(R) Ethernet Network Adapter using copper
665connections will attempt to auto-negotiate with its link partner to determine
666the best setting. If the adapter cannot establish link with the link partner
667using auto-negotiation, you may need to manually configure the adapter and link
668partner to identical settings to establish link and pass packets. This should
669only be needed when attempting to link with an older switch that does not
670support auto-negotiation or one that has been forced to a specific speed or
671duplex mode. Your link partner must match the setting you choose. 1 Gbps speeds
672and higher cannot be forced. Use the autonegotiation advertising setting to
673manually set devices for 1 Gbps and higher.
674
675Speed, duplex, and autonegotiation advertising are configured through the
676ethtool utility. For the latest version, download and install ethtool from the
677following website:
678
679   https://kernel.org/pub/software/network/ethtool/
680
681To see the speed configurations your device supports, run the following::
682
683  # ethtool <ethX>
684
685Caution: Only experienced network administrators should force speed and duplex
686or change autonegotiation advertising manually. The settings at the switch must
687always match the adapter settings. Adapter performance may suffer or your
688adapter may not operate if you configure the adapter differently from your
689switch.
690
691
692Data Center Bridging (DCB)
693--------------------------
694NOTE: The kernel assumes that TC0 is available, and will disable Priority Flow
695Control (PFC) on the device if TC0 is not available. To fix this, ensure TC0 is
696enabled when setting up DCB on your switch.
697
698DCB is a configuration Quality of Service implementation in hardware. It uses
699the VLAN priority tag (802.1p) to filter traffic. That means that there are 8
700different priorities that traffic can be filtered into. It also enables
701priority flow control (802.1Qbb) which can limit or eliminate the number of
702dropped packets during network stress. Bandwidth can be allocated to each of
703these priorities, which is enforced at the hardware level (802.1Qaz).
704
705DCB is normally configured on the network using the DCBX protocol (802.1Qaz), a
706specialization of LLDP (802.1AB). The ice driver supports the following
707mutually exclusive variants of DCBX support:
708
7091) Firmware-based LLDP Agent
7102) Software-based LLDP Agent
711
712In firmware-based mode, firmware intercepts all LLDP traffic and handles DCBX
713negotiation transparently for the user. In this mode, the adapter operates in
714"willing" DCBX mode, receiving DCB settings from the link partner (typically a
715switch). The local user can only query the negotiated DCB configuration. For
716information on configuring DCBX parameters on a switch, please consult the
717switch manufacturer's documentation.
718
719In software-based mode, LLDP traffic is forwarded to the network stack and user
720space, where a software agent can handle it. In this mode, the adapter can
721operate in either "willing" or "nonwilling" DCBX mode and DCB configuration can
722be both queried and set locally. This mode requires the FW-based LLDP Agent to
723be disabled.
724
725NOTE:
726
727- You can enable and disable the firmware-based LLDP Agent using an ethtool
728  private flag. Refer to the "FW-LLDP (Firmware Link Layer Discovery Protocol)"
729  section in this README for more information.
730- In software-based DCBX mode, you can configure DCB parameters using software
731  LLDP/DCBX agents that interface with the Linux kernel's DCB Netlink API. We
732  recommend using OpenLLDP as the DCBX agent when running in software mode. For
733  more information, see the OpenLLDP man pages and
734  https://github.com/intel/openlldp.
735- The driver implements the DCB netlink interface layer to allow the user space
736  to communicate with the driver and query DCB configuration for the port.
737- iSCSI with DCB is not supported.
738
739
740FW-LLDP (Firmware Link Layer Discovery Protocol)
741------------------------------------------------
742Use ethtool to change FW-LLDP settings. The FW-LLDP setting is per port and
743persists across boots.
744
745To enable LLDP::
746
747  # ethtool --set-priv-flags <ethX> fw-lldp-agent on
748
749To disable LLDP::
750
751  # ethtool --set-priv-flags <ethX> fw-lldp-agent off
752
753To check the current LLDP setting::
754
755  # ethtool --show-priv-flags <ethX>
756
757NOTE: You must enable the UEFI HII "LLDP Agent" attribute for this setting to
758take effect. If "LLDP AGENT" is set to disabled, you cannot enable it from the
759OS.
760
761
762Flow Control
763------------
764Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable
765receiving and transmitting pause frames for ice. When transmit is enabled,
766pause frames are generated when the receive packet buffer crosses a predefined
767threshold. When receive is enabled, the transmit unit will halt for the time
768delay specified when a pause frame is received.
769
770NOTE: You must have a flow control capable link partner.
771
772Flow Control is disabled by default.
773
774Use ethtool to change the flow control settings.
775
776To enable or disable Rx or Tx Flow Control::
777
778  # ethtool -A <ethX> rx <on|off> tx <on|off>
779
780Note: This command only enables or disables Flow Control if auto-negotiation is
781disabled. If auto-negotiation is enabled, this command changes the parameters
782used for auto-negotiation with the link partner.
783
784Note: Flow Control auto-negotiation is part of link auto-negotiation. Depending
785on your device, you may not be able to change the auto-negotiation setting.
786
787NOTE:
788
789- The ice driver requires flow control on both the port and link partner. If
790  flow control is disabled on one of the sides, the port may appear to hang on
791  heavy traffic.
792- You may encounter issues with link-level flow control (LFC) after disabling
793  DCB. The LFC status may show as enabled but traffic is not paused. To resolve
794  this issue, disable and reenable LFC using ethtool::
795
796   # ethtool -A <ethX> rx off tx off
797   # ethtool -A <ethX> rx on tx on
798
799
800NAPI
801----
802
803This driver supports NAPI (Rx polling mode).
804
805See :ref:`Documentation/networking/napi.rst <napi>` for more information.
806
807MACVLAN
808-------
809This driver supports MACVLAN. Kernel support for MACVLAN can be tested by
810checking if the MACVLAN driver is loaded. You can run 'lsmod | grep macvlan' to
811see if the MACVLAN driver is loaded or run 'modprobe macvlan' to try to load
812the MACVLAN driver.
813
814NOTE:
815
816- In passthru mode, you can only set up one MACVLAN device. It will inherit the
817  MAC address of the underlying PF (Physical Function) device.
818
819
820IEEE 802.1ad (QinQ) Support
821---------------------------
822The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN
823IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as
824"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks
825allow L2 tunneling and the ability to segregate traffic within a particular
826VLAN ID, among other uses.
827
828NOTES:
829
830- Receive checksum offloads and VLAN acceleration are not supported for 802.1ad
831  (QinQ) packets.
832
833- 0x88A8 traffic will not be received unless VLAN stripping is disabled with
834  the following command::
835
836    # ethtool -K <ethX> rxvlan off
837
838- 0x88A8/0x8100 double VLANs cannot be used with 0x8100 or 0x8100/0x8100 VLANS
839  configured on the same port. 0x88a8/0x8100 traffic will not be received if
840  0x8100 VLANs are configured.
841
842- The VF can only transmit 0x88A8/0x8100 (i.e., 802.1ad/802.1Q) traffic if:
843
844    1) The VF is not assigned a port VLAN.
845    2) spoofchk is disabled from the PF. If you enable spoofchk, the VF will
846       not transmit 0x88A8/0x8100 traffic.
847
848- The VF may not receive all network traffic based on the Inner VLAN header
849  when VF true promiscuous mode (vf-true-promisc-support) and double VLANs are
850  enabled in SR-IOV mode.
851
852The following are examples of how to configure 802.1ad (QinQ)::
853
854  # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
855  # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371
856
857  Where "24" and "371" are example VLAN IDs.
858
859
860Tunnel/Overlay Stateless Offloads
861---------------------------------
862Supported tunnels and overlays include VXLAN, GENEVE, and others depending on
863hardware and software configuration. Stateless offloads are enabled by default.
864
865To view the current state of all offloads::
866
867  # ethtool -k <ethX>
868
869
870UDP Segmentation Offload
871------------------------
872Allows the adapter to offload transmit segmentation of UDP packets with
873payloads up to 64K into valid Ethernet frames. Because the adapter hardware is
874able to complete data segmentation much faster than operating system software,
875this feature may improve transmission performance.
876In addition, the adapter may use fewer CPU resources.
877
878NOTE:
879
880- The application sending UDP packets must support UDP segmentation offload.
881
882To enable/disable UDP Segmentation Offload, issue the following command::
883
884  # ethtool -K <ethX> tx-udp-segmentation [off|on]
885
886
887GNSS module
888-----------
889Requires kernel compiled with CONFIG_GNSS=y or CONFIG_GNSS=m.
890Allows user to read messages from the GNSS hardware module and write supported
891commands. If the module is physically present, a GNSS device is spawned:
892``/dev/gnss<id>``.
893The protocol of write command is dependent on the GNSS hardware module as the
894driver writes raw bytes by the GNSS object to the receiver through i2c. Please
895refer to the hardware GNSS module documentation for configuration details.
896
897
898Performance Optimization
899========================
900Driver defaults are meant to fit a wide variety of workloads, but if further
901optimization is required, we recommend experimenting with the following
902settings.
903
904
905Rx Descriptor Ring Size
906-----------------------
907To reduce the number of Rx packet discards, increase the number of Rx
908descriptors for each Rx ring using ethtool.
909
910  Check if the interface is dropping Rx packets due to buffers being full
911  (rx_dropped.nic can mean that there is no PCIe bandwidth)::
912
913    # ethtool -S <ethX> | grep "rx_dropped"
914
915  If the previous command shows drops on queues, it may help to increase
916  the number of descriptors using 'ethtool -G'::
917
918    # ethtool -G <ethX> rx <N>
919    Where <N> is the desired number of ring entries/descriptors
920
921  This can provide temporary buffering for issues that create latency while
922  the CPUs process descriptors.
923
924
925Interrupt Rate Limiting
926-----------------------
927This driver supports an adaptive interrupt throttle rate (ITR) mechanism that
928is tuned for general workloads. The user can customize the interrupt rate
929control for specific workloads, via ethtool, adjusting the number of
930microseconds between interrupts.
931
932To set the interrupt rate manually, you must disable adaptive mode::
933
934  # ethtool -C <ethX> adaptive-rx off adaptive-tx off
935
936For lower CPU utilization:
937
938  Disable adaptive ITR and lower Rx and Tx interrupts. The examples below
939  affect every queue of the specified interface.
940
941  Setting rx-usecs and tx-usecs to 80 will limit interrupts to about
942  12,500 interrupts per second per queue::
943
944    # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 80 tx-usecs 80
945
946For reduced latency:
947
948  Disable adaptive ITR and ITR by setting rx-usecs and tx-usecs to 0
949  using ethtool::
950
951    # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 0 tx-usecs 0
952
953Per-queue interrupt rate settings:
954
955  The following examples are for queues 1 and 3, but you can adjust other
956  queues.
957
958  To disable Rx adaptive ITR and set static Rx ITR to 10 microseconds or
959  about 100,000 interrupts/second, for queues 1 and 3::
960
961    # ethtool --per-queue <ethX> queue_mask 0xa --coalesce adaptive-rx off
962    rx-usecs 10
963
964  To show the current coalesce settings for queues 1 and 3::
965
966    # ethtool --per-queue <ethX> queue_mask 0xa --show-coalesce
967
968Bounding interrupt rates using rx-usecs-high:
969
970  :Valid Range: 0-236 (0=no limit)
971
972   The range of 0-236 microseconds provides an effective range of 4,237 to
973   250,000 interrupts per second. The value of rx-usecs-high can be set
974   independently of rx-usecs and tx-usecs in the same ethtool command, and is
975   also independent of the adaptive interrupt moderation algorithm. The
976   underlying hardware supports granularity in 4-microsecond intervals, so
977   adjacent values may result in the same interrupt rate.
978
979  The following command would disable adaptive interrupt moderation, and allow
980  a maximum of 5 microseconds before indicating a receive or transmit was
981  complete. However, instead of resulting in as many as 200,000 interrupts per
982  second, it limits total interrupts per second to 50,000 via the rx-usecs-high
983  parameter.
984
985  ::
986
987    # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs-high 20
988    rx-usecs 5 tx-usecs 5
989
990
991Virtualized Environments
992------------------------
993In addition to the other suggestions in this section, the following may be
994helpful to optimize performance in VMs.
995
996  Using the appropriate mechanism (vcpupin) in the VM, pin the CPUs to
997  individual LCPUs, making sure to use a set of CPUs included in the
998  device's local_cpulist: ``/sys/class/net/<ethX>/device/local_cpulist``.
999
1000  Configure as many Rx/Tx queues in the VM as available. (See the iavf driver
1001  documentation for the number of queues supported.) For example::
1002
1003    # ethtool -L <virt_interface> rx <max> tx <max>
1004
1005
1006Support
1007=======
1008For general information, go to the Intel support website at:
1009https://www.intel.com/support/
1010
1011If an issue is identified with the released source code on a supported kernel
1012with a supported adapter, email the specific information related to the issue
1013to intel-wired-lan@lists.osuosl.org.
1014
1015
1016Trademarks
1017==========
1018Intel is a trademark or registered trademark of Intel Corporation or its
1019subsidiaries in the United States and/or other countries.
1020
1021* Other names and brands may be claimed as the property of others.
1022