#
fb6e30a7 |
| 13-Dec-2023 |
Ahmed Zaki <ahmed.zaki@intel.com> |
net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops
The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters as direct function arguments. This will force us to chan
net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops
The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters as direct function arguments. This will force us to change the API (and all drivers' functions) every time some new parameters are added.
This is part 1/2 of the fix, as suggested in [1]:
- First simplify the code by always providing a pointer to all params (indir, key and func); the fact that some of them may be NULL seems like a weird historic thing or a premature optimization. It will simplify the drivers if all pointers are always present.
- Then make the functions take a dev pointer, and a pointer to a single struct wrapping all arguments. The set_* should also take an extack.
Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1] Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
16a2c763 |
| 18-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: include MAC Merge / FP registers in register dump
These have been useful in debugging various problems related to frame preemption, so make them available through ethtool --register-dump
net: enetc: include MAC Merge / FP registers in register dump
These have been useful in debugging various problems related to frame preemption, so make them available through ethtool --register-dump for later too.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
82714539 |
| 18-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: only commit preemptible TCs to hardware when MM TX is active
This was left as TODO in commit 01e23b2b3bad ("net: enetc: add support for preemptible traffic classes") since it's relativel
net: enetc: only commit preemptible TCs to hardware when MM TX is active
This was left as TODO in commit 01e23b2b3bad ("net: enetc: add support for preemptible traffic classes") since it's relatively complicated.
Where this makes a difference is with a configuration as follows:
ethtool --set-mm eno0 pmac-enabled on tx-enabled on verify-enabled on
Preemptible packets should only be sent when the MAC Merge TX direction becomes active (i.o.w. when the verification process succeeds, aka when the link partner confirms it can process preemptible traffic). But the tc qdisc with the preemptible traffic classes is offloaded completely asynchronously w.r.t. the MM becoming active.
The ENETC manual does suggest that this should be handled in the driver: "On startup, software should wait for the verification process to complete (MMCSR[VSTS]=011) before initiating traffic".
Adding the necessary logic allows future selftests to uphold the claim that an inactive or disabled MAC Merge layer should never send data packets through the pMAC.
This change moves enetc_set_ptcfpr() from enetc.c to enetc_ethtool.c, where its only caller is now - enetc_mm_commit_preemptible_tcs().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
153b5b1d |
| 18-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: report mm tx-active based on tx-enabled and verify-status
The MMCSR register contains 2 fields with overlapping meaning:
- LPA (Local preemption active): This read-only status bit indic
net: enetc: report mm tx-active based on tx-enabled and verify-status
The MMCSR register contains 2 fields with overlapping meaning:
- LPA (Local preemption active): This read-only status bit indicates whether preemption is active for this port. This bit will be set if preemption is both enabled and has completed the verification process. - TXSTS (Merge status): This read-only status field provides the state of the MAC Merge sublayer transmit status as defined in IEEE Std 802.3-2018 Clause 99. 00 Transmit preemption is inactive 01 Transmit preemption is active 10 Reserved 11 Reserved
However none of these 2 fields offer reliable reporting to software.
When connecting ENETC to a link partner which is not capable of Frame Preemption, the expectation is that ENETC's verification should fail (VSTS=4) and its MM TX direction should be inactive (LPA=0, TXSTS=00) even though the MM TX is enabled (ME=1). But surprise, the LPA bit of MMCSR stays set even if VSTS=4 and ME=1.
OTOH, the TXSTS field has the opposite problem. I cannot get its value to change from 0, even when connecting to a link partner capable of frame preemption, which does respond to its verification frames (ME=1 and VSTS=3, "SUCCEEDED").
The only option with such buggy hardware seems to be to reimplement the formula for calculating tx-active in software, which is for tx-enabled to be true, and for the verify-status to be either SUCCEEDED, or DISABLED.
Without reliable tx-active reporting, we have no good indication when to commit the preemptible traffic classes to hardware, which makes it possible (but not desirable) to send preemptible traffic to a link partner incapable of receiving it. However, currently we do not have the logic to wait for TX to be active yet, so the impact is limited.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
59be75db |
| 18-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: fix MAC Merge layer remaining enabled until a link down event
Current enetc_set_mm() is designed to set the priv->active_offloads bit ENETC_F_QBU for enetc_mm_link_state_update() to act
net: enetc: fix MAC Merge layer remaining enabled until a link down event
Current enetc_set_mm() is designed to set the priv->active_offloads bit ENETC_F_QBU for enetc_mm_link_state_update() to act on, but if the link is already up, it modifies the ENETC_MMCSR_ME ("Merge Enable") bit directly.
The problem is that it only *sets* ENETC_MMCSR_ME if the link is up, it doesn't *clear* it if needed. So subsequent enetc_get_mm() calls still see tx-enabled as true, up until a link down event, which is when enetc_mm_link_state_update() will get called.
This is not a functional issue as far as I can assess. It has only come up because I'd like to uphold a simple API rule in core ethtool code: the pMAC cannot be disabled if TX is going to be enabled. Currently, the fact that TX remains enabled for longer than expected (after the enetc_set_mm() call that disables it) is going to violate that rule, which is how it was caught.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
5b7be2d4 |
| 11-Apr-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: workaround for unresponsive pMAC after receiving express traffic
I have observed an issue where the RX direction of the LS1028A ENETC pMAC seems unresponsive. The minimal procedure to re
net: enetc: workaround for unresponsive pMAC after receiving express traffic
I have observed an issue where the RX direction of the LS1028A ENETC pMAC seems unresponsive. The minimal procedure to reproduce the issue is:
1. Connect ENETC port 0 with a loopback RJ45 cable to one of the Felix switch ports (0).
2. Bring the ports up (MAC Merge layer is not enabled on either end).
3. Send a large quantity of unidirectional (express) traffic from Felix to ENETC. I tried altering frame size and frame count, and it doesn't appear to be specific to either of them, but rather, to the quantity of octets received. Lowering the frame count, the minimum quantity of packets to reproduce relatively consistently seems to be around 37000 frames at 1514 octets (w/o FCS) each.
4. Using ethtool --set-mm, enable the pMAC in the Felix and in the ENETC ports, in both RX and TX directions, and with verification on both ends.
5. Wait for verification to complete on both sides.
6. Configure a traffic class as preemptible on both ends.
7. Send some packets again.
The issue is at step 5, where the verification process of ENETC ends (meaning that Felix responds with an SMD-R and ENETC sees the response), but the verification process of Felix never ends (it remains VERIFYING).
If step 3 is skipped or if ENETC receives less traffic than approximately that threshold, the test runs all the way through (verification succeeds on both ends, preemptible traffic passes fine).
If, between step 4 and 5, the step below is also introduced:
4.1. Disable and re-enable PM0_COMMAND_CONFIG bit RX_EN
then again, the sequence of steps runs all the way through, and verification succeeds, even if there was the previous RX traffic injected into ENETC.
Traffic sent *by* the ENETC port prior to enabling the MAC Merge layer does not seem to influence the verification result, only received traffic does.
The LS1028A manual does not mention any relationship between PM0_COMMAND_CONFIG and MMCSR, and the hardware people don't seem to know for now either.
The bit that is toggled to work around the issue is also toggled by enetc_mac_enable(), called from phylink's mac_link_down() and mac_link_up() methods - which is how the workaround was found: verification would work after a link down/up.
Fixes: c7b9e8086902 ("net: enetc: add support for MAC Merge layer") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20230411192645.1896048-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
show more ...
|
#
c79493c3 |
| 21-Mar-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: fix aggregate RMON counters not showing the ranges
When running "ethtool -S eno0 --groups rmon" without an explicit "--src emac|pmac" argument, the kernel will not report rx-rmon-etherSt
net: enetc: fix aggregate RMON counters not showing the ranges
When running "ethtool -S eno0 --groups rmon" without an explicit "--src emac|pmac" argument, the kernel will not report rx-rmon-etherStatsPkts64to64Octets, rx-rmon-etherStatsPkts65to127Octets, etc. This is because on ETHTOOL_MAC_STATS_SRC_AGGREGATE, we do not populate the "ranges" argument.
ocelot_port_get_rmon_stats() does things differently and things work there. I had forgotten to make sure that the code is structured the same way in both drivers, so do that now.
Fixes: cf52bd238b75 ("net: enetc: add support for MAC Merge statistics counters") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230321232831.1200905-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
cf52bd23 |
| 06-Feb-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: add support for MAC Merge statistics counters
Add PF driver support for the following:
- Viewing the standardized MAC Merge layer counters.
- Viewing the standardized Ethernet MAC and
net: enetc: add support for MAC Merge statistics counters
Add PF driver support for the following:
- Viewing the standardized MAC Merge layer counters.
- Viewing the standardized Ethernet MAC and RMON counters associated with the pMAC.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230206094531.444988-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
c7b9e808 |
| 06-Feb-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: add support for MAC Merge layer
Add PF driver support for viewing and changing the MAC Merge sublayer parameters, and seeing the verification state machine's current state. The verificat
net: enetc: add support for MAC Merge layer
Add PF driver support for viewing and changing the MAC Merge sublayer parameters, and seeing the verification state machine's current state. The verification handshake with the link partner is driven by hardware.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230206094531.444988-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
e3972399 |
| 19-Jan-2023 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: build common object files into a separate module
The build system is complaining about the following:
enetc.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_cbdr.o is added
net: enetc: build common object files into a separate module
The build system is complaining about the following:
enetc.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_cbdr.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_ethtool.o is added to multiple modules: fsl-enetc fsl-enetc-vf
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
59cc773a |
| 04-Jan-2023 |
Lorenzo Bianconi <lorenzo@kernel.org> |
net: ethernet: enetc: get rid of xdp_redirect_sg counter
Remove xdp_redirect_sg counter and the related ethtool entry since it is no longer used.
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com
net: ethernet: enetc: get rid of xdp_redirect_sg counter
Remove xdp_redirect_sg counter and the related ethtool entry since it is no longer used.
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
38b922c9 |
| 09-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: expose some standardized ethtool counters
Structure the code in such a way that it can be reused later for the pMAC statistics, by just changing the "mac" argument to 1.
Usage: ethtool
net: enetc: expose some standardized ethtool counters
Structure the code in such a way that it can be reused later for the pMAC statistics, by just changing the "mac" argument to 1.
Usage: ethtool --include-statistics --show-pause eno2 ethtool -S eno0 --groups eth-mac ethtool -S eno0 --groups eth-ctrl ethtool -S eno0 --groups rmon
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
e2bd065c |
| 09-Sep-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: parameterize port MAC stats to also cover the pMAC
The ENETC has counters for the eMAC and for the pMAC exactly 0x1000 apart from each other. The driver only contains definitions for PM0
net: enetc: parameterize port MAC stats to also cover the pMAC
The ENETC has counters for the eMAC and for the pMAC exactly 0x1000 apart from each other. The driver only contains definitions for PM0, the eMAC.
Rather than duplicating everything for PM1, modify the register definitions such that they take the MAC as argument.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
f029c781 |
| 30-Aug-2022 |
Wolfram Sang <wsa+renesas@sang-engineering.com> |
net: ethernet: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Gen
net: ethernet: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4|5} Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
285e8ded |
| 10-May-2022 |
Po Liu <Po.Liu@nxp.com> |
net: enetc: count the tc-taprio window drops
The enetc scheduler for IEEE 802.1Qbv has 2 options (depending on PTGCR[TG_DROP_DISABLE]) when we attempt to send an oversized packet which will never fi
net: enetc: count the tc-taprio window drops
The enetc scheduler for IEEE 802.1Qbv has 2 options (depending on PTGCR[TG_DROP_DISABLE]) when we attempt to send an oversized packet which will never fit in its allotted time slot for its traffic class: either block the entire port due to head-of-line blocking, or drop the packet and set a bit in the writeback format of the transmit buffer descriptor, allowing other packets to be sent.
We obviously choose the second option in the driver, but we do not detect the drop condition, so from the perspective of the network stack, the packet is sent and no error counter is incremented.
This change checks the writeback of the TX BD when tc-taprio is enabled, and increments a specific ethtool statistics counter and a generic "tx_dropped" counter in ndo_get_stats64.
Signed-off-by: Po Liu <Po.Liu@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
feb13dcb |
| 24-Mar-2022 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: report software timestamping via SO_TIMESTAMPING
Let user space properly determine that the enetc driver provides software timestamps.
Fixes: 4caefbce06d1 ("enetc: add software timestam
net: enetc: report software timestamping via SO_TIMESTAMPING
Let user space properly determine that the enetc driver provides software timestamps.
Fixes: 4caefbce06d1 ("enetc: add software timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20220324161210.4122281-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
74624944 |
| 18-Nov-2021 |
Hao Chen <chenhao288@hisilicon.com> |
ethtool: extend ringparam setting/getting API with rx_buf_len
Add two new parameters kernel_ringparam and extack for .get_ringparam and .set_ringparam to extend more ring params through netlink.
Si
ethtool: extend ringparam setting/getting API with rx_buf_len
Add two new parameters kernel_ringparam and extack for .get_ringparam and .set_ringparam to extend more ring params through netlink.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
fb8dc5fc |
| 20-Oct-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: fix ethtool counter name for PM0_TERR
There are two counters named "MAC tx frames", one of them is actually incorrect. The correct name for that counter should be "MAC tx error frames",
net: enetc: fix ethtool counter name for PM0_TERR
There are two counters named "MAC tx frames", one of them is actually incorrect. The correct name for that counter should be "MAC tx error frames", which is symmetric to the existing "MAC rx error frames".
Fixes: 16eb4c85c964 ("enetc: Add ethtool statistics") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: <Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20211020165206.1069889-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
f3ccfda1 |
| 20-Aug-2021 |
Yufeng Mo <moyufeng@huawei.com> |
ethtool: extend coalesce setting uAPI with CQE mode
In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, th
ethtool: extend coalesce setting uAPI with CQE mode
In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
a8648887 |
| 16-Apr-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: add support for flow control
In the ENETC receive path, a frame received by the MAC is first stored in a 256KB 'FIFO' memory, then transferred to DRAM when enqueuing it to the RX ring. T
net: enetc: add support for flow control
In the ENETC receive path, a frame received by the MAC is first stored in a 256KB 'FIFO' memory, then transferred to DRAM when enqueuing it to the RX ring. The FIFO is a shared resource for all ENETC ports, but every port keeps track of its own memory utilization, on RX and on TX.
There is a setting for RX rings through which they can either operate in 'lossy' mode (where the lack of a free buffer causes an immediate discard of the frame) or in 'lossless' mode (where the lack of a free buffer in the ring makes the frame stay longer in the FIFO).
In turn, when the memory utilization of the FIFO exceeds a certain margin, the MAC can be configured to emit PAUSE frames.
There is enough FIFO memory to buffer up to 3 MTU-sized frames per RX port while not jeopardizing the other use cases (jumbo frames), and also not consume bytes from the port TX allocations. Also, 3 MTU-sized frames worth of memory is enough to ensure zero loss for 64 byte packets at 1G line rate.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7294380c |
| 12-Apr-2021 |
Yangbo Lu <yangbo.lu@nxp.com> |
enetc: support PTP Sync packet one-step timestamping
This patch is to add support for PTP Sync packet one-step timestamping. Since ENETC single-step register has to be configured dynamically per pac
enetc: support PTP Sync packet one-step timestamping
This patch is to add support for PTP Sync packet one-step timestamping. Since ENETC single-step register has to be configured dynamically per packet for correctionField offeset and UDP checksum update, current one-step timestamping packet has to be sent only when the last one completes transmitting on hardware. So, on the TX, this patch handles one-step timestamping packet as below:
- Trasmit packet immediately if no other one in transfer, or queue to skb queue if there is already one in transfer. The test_and_set_bit_lock() is used here to lock and check state. - Start a work when complete transfer on hardware, to release the bit lock and to send one skb in skb queue if has.
And the configuration for one-step timestamping on ENETC before transmitting is,
- Set one-step timestamping flag in extension BD. - Write 30 bits current timestamp in tstamp field of extension BD. - Update PTP Sync packet originTimestamp field with current timestamp. - Configure single-step register for correctionField offeset and UDP checksum update.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
9d2b68cc |
| 31-Mar-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: add support for XDP_REDIRECT
The driver implementation of the XDP_REDIRECT action reuses parts from XDP_TX, most notably the enetc_xdp_tx function which transmits an array of TX software
net: enetc: add support for XDP_REDIRECT
The driver implementation of the XDP_REDIRECT action reuses parts from XDP_TX, most notably the enetc_xdp_tx function which transmits an array of TX software BDs. Only this time, the buffers don't have DMA mappings, we need to create them.
When a BPF program reaches the XDP_REDIRECT verdict for a frame, we can employ the same buffer reuse strategy as for the normal processing path and for XDP_PASS: we can flip to the other page half and seed that to the RX ring.
Note that scatter/gather support is there, but disabled due to lack of multi-buffer support in XDP (which is added by this series): https://patchwork.kernel.org/project/netdevbpf/cover/cover.1616179034.git.lorenzo@kernel.org/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7ed2bc80 |
| 31-Mar-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: add support for XDP_TX
For reflecting packets back into the interface they came from, we create an array of TX software BDs derived from the RX software BDs. Therefore, we need to extend
net: enetc: add support for XDP_TX
For reflecting packets back into the interface they came from, we create an array of TX software BDs derived from the RX software BDs. Therefore, we need to extend the TX software BD structure to contain most of the stuff that's already present in the RX software BD structure, for reasons that will become evident in a moment.
For a frame with the XDP_TX verdict, we don't reuse any buffer right away as we do for XDP_DROP (the same page half) or XDP_PASS (the other page half, same as the skb code path).
Because the buffer transfers ownership from the RX ring to the TX ring, reusing any page half right away is very dangerous. So what we can do is we can recycle the same page half as soon as TX is complete.
The code path is: enetc_poll -> enetc_clean_rx_ring_xdp -> enetc_xdp_tx -> enetc_refill_rx_ring (time passes, another MSI interrupt is raised) enetc_poll -> enetc_clean_tx_ring -> enetc_recycle_xdp_tx_buff
But that creates a problem, because there is a potentially large time window between enetc_xdp_tx and enetc_recycle_xdp_tx_buff, period in which we'll have less and less RX buffers.
Basically, when the ship starts sinking, the knee-jerk reaction is to let enetc_refill_rx_ring do what it does for the standard skb code path (refill every 16 consumed buffers), but that turns out to be very inefficient. The problem is that we have no rx_swbd->page at our disposal from the enetc_reuse_page path, so enetc_refill_rx_ring would have to call enetc_new_page for every buffer that we refill (if we choose to refill at this early stage). Very inefficient, it only makes the problem worse, because page allocation is an expensive process, and CPU time is exactly what we're lacking.
Additionally, there is an even bigger problem: if we let enetc_refill_rx_ring top up the ring's buffers again from the RX path, remember that the buffers sent to transmission haven't disappeared anywhere. They will be eventually sent, and processed in enetc_clean_tx_ring, and an attempt will be made to recycle them. But surprise, the RX ring is already full of new buffers, because we were premature in deciding that we should refill. So not only we took the expensive decision of allocating new pages, but now we must throw away perfectly good and reusable buffers.
So what we do is we implement an elastic refill mechanism, which keeps track of the number of in-flight XDP_TX buffer descriptors. We top up the RX ring only up to the total ring capacity minus the number of BDs that are in flight (because we know that those BDs will return to us eventually).
The enetc driver manages 1 RX ring per CPU, and the default TX ring management is the same. So we do XDP_TX towards the TX ring of the same index, because it is affined to the same CPU. This will probably not produce great results when we have a tc-taprio/tc-mqprio qdisc on the interface, because in that case, the number of TX rings might be greater, but I didn't add any checks for that yet (mostly because I didn't know what checks to add).
It should also be noted that we need to change the DMA mapping direction for RX buffers, since they may now be reflected into the TX ring of the same device. We choose to use DMA_BIDIRECTIONAL instead of unmapping and remapping as DMA_TO_DEVICE, because performance is better this way.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
d1b15102 |
| 31-Mar-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: enetc: add support for XDP_DROP and XDP_PASS
For the RX ring, enetc uses an allocation scheme based on pages split into two buffers, which is already very efficient in terms of preventing reall
net: enetc: add support for XDP_DROP and XDP_PASS
For the RX ring, enetc uses an allocation scheme based on pages split into two buffers, which is already very efficient in terms of preventing reallocations / maximizing reuse, so I see no reason why I would change that.
+--------+--------+--------+--------+--------+--------+--------+ | | | | | | | | | half B | half B | half B | half B | half B | half B | half B | | | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ | | | | | | | | | half A | half A | half A | half A | half A | half A | half A | RX ring | | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ ^ ^ | | next_to_clean next_to_alloc next_to_use
+--------+--------+--------+--------+--------+ | | | | | | | half B | half B | half B | half B | half B | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ | | | | | | | | | half B | half B | half A | half A | half A | half A | half A | RX ring | | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ | | | ^ ^ | half A | half A | | | | | | next_to_clean next_to_use +--------+--------+ ^ | next_to_alloc
then when enetc_refill_rx_ring is called, whose purpose is to advance next_to_use, it sees that it can take buffers up to next_to_alloc, and it says "oh, hey, rx_swbd->page isn't NULL, I don't need to allocate one!".
The only problem is that for default PAGE_SIZE values of 4096, buffer sizes are 2048 bytes. While this is enough for normal skb allocations at an MTU of 1500 bytes, for XDP it isn't, because the XDP headroom is 256 bytes, and including skb_shared_info and alignment, we end up being able to make use of only 1472 bytes, which is insufficient for the default MTU.
To solve that problem, we implement scatter/gather processing in the driver, because we would really like to keep the existing allocation scheme. A packet of 1500 bytes is received in a buffer of 1472 bytes and another one of 28 bytes.
Because the headroom required by XDP is different (and much larger) than the one required by the network stack, whenever a BPF program is added or deleted on the port, we drain the existing RX buffers and seed new ones with the required headroom. We also keep the required headroom in rx_ring->buffer_offset.
The simplest way to implement XDP_PASS, where an skb must be created, is to create an xdp_buff based on the next_to_clean RX BDs, but not clear those BDs from the RX ring yet, just keep the original index at which the BDs for this frame started. Then, if the verdict is XDP_PASS, instead of converting the xdb_buff to an skb, we replay a call to enetc_build_skb (just as in the normal enetc_clean_rx_ring case), starting from the original BD index.
We would also like to be minimally invasive to the regular RX data path, and not check whether there is a BPF program attached to the ring on every packet. So we create a separate RX ring processing function for XDP.
Because we only install/remove the BPF program while the interface is down, we forgo the rcu_read_lock() in enetc_clean_rx_ring, since there shouldn't be any circumstance in which we are processing packets and there is a potentially freed BPF program attached to the RX ring.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
eb96b686 |
| 04-Dec-2020 |
Claudiu Manoil <claudiu.manoil@nxp.com> |
enetc: Fix reporting of h/w packet counters
Noticed some inconsistencies in packet statistics reporting. This patch adds the missing Tx packet counter registers to ethtool reporting and fixes the in
enetc: Fix reporting of h/w packet counters
Noticed some inconsistencies in packet statistics reporting. This patch adds the missing Tx packet counter registers to ethtool reporting and fixes the information strings for a few of them.
Fixes: 16eb4c85c964 ("enetc: Add ethtool statistics") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20201204171505.21389-1-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|