#
848789d5 |
| 05-Sep-2024 |
Ido Schimmel <idosch@nvidia.com> |
ipv4: icmp: Unmask upper DSCP bits in icmp_reply()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in the future it could perform the FIB lookup according to the full DSCP valu
ipv4: icmp: Unmask upper DSCP bits in icmp_reply()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in the future it could perform the FIB lookup according to the full DSCP value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
4805646c |
| 29-Aug-2024 |
Ido Schimmel <idosch@nvidia.com> |
ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup()
The function is called to resolve a route for an ICMP message that is sent in response to a situation. Based on the type of the generated IC
ipv4: icmp: Unmask upper DSCP bits in icmp_route_lookup()
The function is called to resolve a route for an ICMP message that is sent in response to a situation. Based on the type of the generated ICMP message, the function is either passed the DS field of the packet that generated the ICMP message or a DS field that is derived from it.
Unmask the upper DSCP bits before resolving and output route via ip_route_output_key_hash() so that in the future the lookup could be performed according to the full DSCP value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
f17bf505 |
| 29-Aug-2024 |
Eric Dumazet <edumazet@google.com> |
icmp: icmp_msgs_per_sec and icmp_msgs_burst sysctls become per netns
Previous patch made ICMP rate limits per netns, it makes sense to allow each netns to change the associated sysctl.
Signed-off-b
icmp: icmp_msgs_per_sec and icmp_msgs_burst sysctls become per netns
Previous patch made ICMP rate limits per netns, it makes sense to allow each netns to change the associated sysctl.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240829144641.3880376-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
b056b4cd |
| 29-Aug-2024 |
Eric Dumazet <edumazet@google.com> |
icmp: move icmp_global.credit and icmp_global.stamp to per netns storage
Host wide ICMP ratelimiter should be per netns, to provide better isolation.
Following patch in this series makes the sysctl
icmp: move icmp_global.credit and icmp_global.stamp to per netns storage
Host wide ICMP ratelimiter should be per netns, to provide better isolation.
Following patch in this series makes the sysctl per netns.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240829144641.3880376-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
8c2bd38b |
| 29-Aug-2024 |
Eric Dumazet <edumazet@google.com> |
icmp: change the order of rate limits
ICMP messages are ratelimited :
After the blamed commits, the two rate limiters are applied in this order:
1) host wide ratelimit (icmp_global_allow())
2) Pe
icmp: change the order of rate limits
ICMP messages are ratelimited :
After the blamed commits, the two rate limiters are applied in this order:
1) host wide ratelimit (icmp_global_allow())
2) Per destination ratelimit (inetpeer based)
In order to avoid side-channels attacks, we need to apply the per destination check first.
This patch makes the following change :
1) icmp_global_allow() checks if the host wide limit is reached. But credits are not yet consumed. This is deferred to 3)
2) The per destination limit is checked/updated. This might add a new node in inetpeer tree.
3) icmp_global_consume() consumes tokens if prior operations succeeded.
This means that host wide ratelimit is still effective in keeping inetpeer tree small even under DDOS.
As a bonus, I removed icmp_global.lock as the fast path can use a lock-free operation.
Fixes: c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited") Fixes: 4cdf507d5452 ("icmp: add a global rate limitation") Reported-by: Keyu Man <keyu.man@email.ucr.edu> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20240829144641.3880376-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
1c6f50b3 |
| 21-Aug-2024 |
Ido Schimmel <idosch@nvidia.com> |
ipv4: icmp: Pass full DS field to ip_route_input()
Align the ICMP code to other callers of ip_route_input() and pass the full DS field. In the future this will allow us to perform a route lookup acc
ipv4: icmp: Pass full DS field to ip_route_input()
Align the ICMP code to other callers of ip_route_input() and pass the full DS field. In the future this will allow us to perform a route lookup according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20240821125251.1571445-11-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
db3efdcf |
| 07-May-2024 |
Peilin He <he.peilin@zte.com.cn> |
net/ipv4: add tracepoint for icmp_send
Introduce a tracepoint for icmp_send, which can help users to get more detail information conveniently when icmp abnormal events happen.
1. Giving an usecase
net/ipv4: add tracepoint for icmp_send
Introduce a tracepoint for icmp_send, which can help users to get more detail information conveniently when icmp abnormal events happen.
1. Giving an usecase example: ============================= When an application experiences packet loss due to an unreachable UDP destination port, the kernel will send an exception message through the icmp_send function. By adding a trace point for icmp_send, developers or system administrators can obtain detailed information about the UDP packet loss, including the type, code, source address, destination address, source port, and destination port. This facilitates the trouble-shooting of UDP packet loss issues especially for those network-service applications.
2. Operation Instructions: ========================== Switch to the tracing directory. cd /sys/kernel/tracing Filter for destination port unreachable. echo "type==3 && code==3" > events/icmp/icmp_send/filter Enable trace event. echo 1 > events/icmp/icmp_send/enable
3. Result View: ================ udp_client_erro-11370 [002] ...s.12 124.728002: icmp_send: icmp_send: type=3, code=3. From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23 skbaddr=00000000589b167a
Signed-off-by: Peilin He <he.peilin@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Reviewed-by: Yunkai Zhang <zhang.yunkai@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: Liu Chun <liu.chun2@zte.com.cn> Cc: Xuexin Jiang <jiang.xuexin@zte.com.cn> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
05d6d492 |
| 29-Apr-2024 |
Eric Dumazet <edumazet@google.com> |
inet: introduce dst_rtable() helper
I added dst_rt6_info() in commit e8dfd42c17fa ("ipv6: introduce dst_rt6_info() helper")
This patch does a similar change for IPv4.
Instead of (struct rtable *)d
inet: introduce dst_rtable() helper
I added dst_rt6_info() in commit e8dfd42c17fa ("ipv6: introduce dst_rt6_info() helper")
This patch does a similar change for IPv4.
Instead of (struct rtable *)dst casts, we can use :
#define dst_rtable(_ptr) \ container_of_const(_ptr, struct rtable, dst)
Patch is smaller than IPv6 one, because IPv4 has skb_rtable() helper.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20240429133009.1227754-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
c58e88d4 |
| 20-Apr-2024 |
Eric Dumazet <edumazet@google.com> |
icmp: prevent possible NULL dereferences from icmp_build_probe()
First problem is a double call to __in_dev_get_rcu(), because the second one could return NULL.
if (__in_dev_get_rcu(dev) && __in_de
icmp: prevent possible NULL dereferences from icmp_build_probe()
First problem is a double call to __in_dev_get_rcu(), because the second one could return NULL.
if (__in_dev_get_rcu(dev) && __in_dev_get_rcu(dev)->ifa_list)
Second problem is a read from dev->ip6_ptr with no NULL check:
if (!list_empty(&rcu_dereference(dev->ip6_ptr)->addr_list))
Use the correct RCU API to fix these.
v2: add missing include <net/addrconf.h>
Fixes: d329ea5bd884 ("icmp: add response to RFC 8335 PROBE messages") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Andreas Roeseler <andreas.a.roeseler@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
2b1dc628 |
| 04-Oct-2023 |
Florian Westphal <fw@strlen.de> |
xfrm: pass struct net to xfrm_decode_session wrappers
Preparation patch, extra arg is not used. No functional changes intended.
This is needed to replace the xfrm session decode functions with the
xfrm: pass struct net to xfrm_decode_session wrappers
Preparation patch, extra arg is not used. No functional changes intended.
This is needed to replace the xfrm session decode functions with the flow dissector.
skb_flow_dissect() cannot be used as-is, because it attempts to deduce the 'struct net' to use for bpf program fetch from skb->sk or skb->dev, but xfrm code path can see skbs that have neither sk or dev filled in.
So either flow dissector needs to try harder, e.g. by also trying skb->dst->dev, or we have to pass the struct net explicitly.
Passing the struct net doesn't look too bad to me, most places already have it available or can derive it from the output device.
Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/netdev/202309271628.27fd2187-oliver.sang@intel.com/ Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
show more ...
|
#
7d63b671 |
| 30-Mar-2023 |
Eric Dumazet <edumazet@google.com> |
icmp: guard against too small mtu
syzbot was able to trigger a panic [1] in icmp_glue_bits(), or more exactly in skb_copy_and_csum_bits()
There is no repro yet, but I think the issue is that syzbot
icmp: guard against too small mtu
syzbot was able to trigger a panic [1] in icmp_glue_bits(), or more exactly in skb_copy_and_csum_bits()
There is no repro yet, but I think the issue is that syzbot manages to lower device mtu to a small value, fooling __icmp_send()
__icmp_send() must make sure there is enough room for the packet to include at least the headers.
We might in the future refactor skb_copy_and_csum_bits() and its callers to no longer crash when something bad happens.
[1] kernel BUG at net/core/skbuff.c:3343 ! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 15766 Comm: syz-executor.0 Not tainted 6.3.0-rc4-syzkaller-00039-gffe78bbd5121 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 RIP: 0010:skb_copy_and_csum_bits+0x798/0x860 net/core/skbuff.c:3343 Code: f0 c1 c8 08 41 89 c6 e9 73 ff ff ff e8 61 48 d4 f9 e9 41 fd ff ff 48 8b 7c 24 48 e8 52 48 d4 f9 e9 c3 fc ff ff e8 c8 27 84 f9 <0f> 0b 48 89 44 24 28 e8 3c 48 d4 f9 48 8b 44 24 28 e9 9d fb ff ff RSP: 0018:ffffc90000007620 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000001e8 RCX: 0000000000000100 RDX: ffff8880276f6280 RSI: ffffffff87fdd138 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 R10: 00000000000001e8 R11: 0000000000000001 R12: 000000000000003c R13: 0000000000000000 R14: ffff888028244868 R15: 0000000000000b0e FS: 00007fbc81f1c700(0000) GS:ffff88802ca00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2df43000 CR3: 00000000744db000 CR4: 0000000000150ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> icmp_glue_bits+0x7b/0x210 net/ipv4/icmp.c:353 __ip_append_data+0x1d1b/0x39f0 net/ipv4/ip_output.c:1161 ip_append_data net/ipv4/ip_output.c:1343 [inline] ip_append_data+0x115/0x1a0 net/ipv4/ip_output.c:1322 icmp_push_reply+0xa8/0x440 net/ipv4/icmp.c:370 __icmp_send+0xb80/0x1430 net/ipv4/icmp.c:765 ipv4_send_dest_unreach net/ipv4/route.c:1239 [inline] ipv4_link_failure+0x5a9/0x9e0 net/ipv4/route.c:1246 dst_link_failure include/net/dst.h:423 [inline] arp_error_report+0xcb/0x1c0 net/ipv4/arp.c:296 neigh_invalidate+0x20d/0x560 net/core/neighbour.c:1079 neigh_timer_handler+0xc77/0xff0 net/core/neighbour.c:1166 call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700 expire_timers+0x29b/0x4b0 kernel/time/timer.c:1751 __run_timers kernel/time/timer.c:2022 [inline]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+d373d60fddbdc915e666@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230330174502.1915328-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
d0941130 |
| 25-Jan-2023 |
Jamie Bainbridge <jamie.bainbridge@gmail.com> |
icmp: Add counters for rate limits
There are multiple ICMP rate limiting mechanisms:
* Global limits: net.ipv4.icmp_msgs_burst/icmp_msgs_per_sec * v4 per-host limits: net.ipv4.icmp_ratelimit/ratema
icmp: Add counters for rate limits
There are multiple ICMP rate limiting mechanisms:
* Global limits: net.ipv4.icmp_msgs_burst/icmp_msgs_per_sec * v4 per-host limits: net.ipv4.icmp_ratelimit/ratemask * v6 per-host limits: net.ipv6.icmp_ratelimit/ratemask
However, when ICMP output is limited, there is no way to tell which limit has been hit or even if the limits are responsible for the lack of ICMP output.
Add counters for each of the cases above. As we are within local_bh_disable(), use the __INC stats variant.
Example output:
# nstat -sz "*RateLimit*" IcmpOutRateLimitGlobal 134 0.0 IcmpOutRateLimitHost 770 0.0 Icmp6OutRateLimitHost 84 0.0
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Suggested-by: Abhishek Rawal <rawal.abhishek92@gmail.com> Link: https://lore.kernel.org/r/273b32241e6b7fdc5c609e6f5ebc68caf3994342.1674605770.git.jamie.bainbridge@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
show more ...
|
#
8032bf12 |
| 10-Oct-2022 |
Jason A. Donenfeld <Jason@zx2c4.com> |
treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:
@@ expression E; @@ - prandom_u32_max + get_random_u32_below (E)
Reviewed-
treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:
@@ expression E; @@ - prandom_u32_max + get_random_u32_below (E)
Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
show more ...
|
#
0968d2a4 |
| 13-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
ip: Fix data-races around sysctl_ip_no_pmtu_disc.
While reading sysctl_ip_no_pmtu_disc, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-
ip: Fix data-races around sysctl_ip_no_pmtu_disc.
While reading sysctl_ip_no_pmtu_disc, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
1ebcb25a |
| 12-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix a data-race around sysctl_icmp_ratemask.
While reading sysctl_icmp_ratemask, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.
icmp: Fix a data-race around sysctl_icmp_ratemask.
While reading sysctl_icmp_ratemask, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
2a4eb714 |
| 12-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix a data-race around sysctl_icmp_ratelimit.
While reading sysctl_icmp_ratelimit, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-
icmp: Fix a data-race around sysctl_icmp_ratelimit.
While reading sysctl_icmp_ratelimit, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
d2efabce |
| 12-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix a data-race around sysctl_icmp_errors_use_inbound_ifaddr.
While reading sysctl_icmp_errors_use_inbound_ifaddr, it can be changed concurrently. Thus, we need to add READ_ONCE() to its read
icmp: Fix a data-race around sysctl_icmp_errors_use_inbound_ifaddr.
While reading sysctl_icmp_errors_use_inbound_ifaddr, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1c2fb7f93cb2 ("[IPV4]: Sysctl configurable icmp error source address.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
b04f9b7e |
| 12-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix a data-race around sysctl_icmp_ignore_bogus_error_responses.
While reading sysctl_icmp_ignore_bogus_error_responses, it can be changed concurrently. Thus, we need to add READ_ONCE() to it
icmp: Fix a data-race around sysctl_icmp_ignore_bogus_error_responses.
While reading sysctl_icmp_ignore_bogus_error_responses, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
66484bb9 |
| 12-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix a data-race around sysctl_icmp_echo_ignore_broadcasts.
While reading sysctl_icmp_echo_ignore_broadcasts, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
F
icmp: Fix a data-race around sysctl_icmp_echo_ignore_broadcasts.
While reading sysctl_icmp_echo_ignore_broadcasts, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
4a2f7083 |
| 12-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix data-races around sysctl_icmp_echo_enable_probe.
While reading sysctl_icmp_echo_enable_probe, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: d329
icmp: Fix data-races around sysctl_icmp_echo_enable_probe.
While reading sysctl_icmp_echo_enable_probe, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: d329ea5bd884 ("icmp: add response to RFC 8335 PROBE messages") Fixes: 1fd07f33c3ea ("ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
bb7bb35a |
| 12-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix a data-race around sysctl_icmp_echo_ignore_all.
While reading sysctl_icmp_echo_ignore_all, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c
icmp: Fix a data-race around sysctl_icmp_echo_ignore_all.
While reading sysctl_icmp_echo_ignore_all, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
48d7ee32 |
| 06-Jul-2022 |
Kuniyuki Iwashima <kuniyu@amazon.com> |
icmp: Fix data-races around sysctl.
While reading icmp sysctl variables, they can be changed concurrently. So, we need to add READ_ONCE() to avoid data-races.
Fixes: 4cdf507d5452 ("icmp: add a glob
icmp: Fix data-races around sysctl.
While reading icmp sysctl variables, they can be changed concurrently. So, we need to add READ_ONCE() to avoid data-races.
Fixes: 4cdf507d5452 ("icmp: add a global rate limitation") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
2e47eece |
| 29-Apr-2022 |
Yu Zhe <yuzhe@nfschina.com> |
ipv4: remove unnecessary type castings
remove unnecessary void* type castings.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
b384c95a |
| 07-Apr-2022 |
Menglong Dong <imagedong@tencent.com> |
net: icmp: add skb drop reasons to icmp protocol
Replace kfree_skb() used in icmp_rcv() and icmpv6_rcv() with kfree_skb_reason().
In order to get the reasons of the skb drops after icmp message han
net: icmp: add skb drop reasons to icmp protocol
Replace kfree_skb() used in icmp_rcv() and icmpv6_rcv() with kfree_skb_reason().
In order to get the reasons of the skb drops after icmp message handle, we change the return type of 'handler()' in 'struct icmp_control' from 'bool' to 'enum skb_drop_reason'. This may change its original intention, as 'false' means failure, but 'SKB_NOT_DROPPED_YET' means success now. Therefore, all 'handler' and the call of them need to be handled. Following 'handler' functions are involved:
icmp_unreach() icmp_redirect() icmp_echo() icmp_timestamp() icmp_discard()
And following new drop reasons are added:
SKB_DROP_REASON_ICMP_CSUM SKB_DROP_REASON_INVALID_PROTO
The reason 'INVALID_PROTO' is introduced for the case that the packet doesn't follow rfc 1122 and is dropped. This is not a common case, and I believe we can locate the problem from the data in the packet. For now, this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types.
Maybe there should be a document file for these reasons. For example, list all the case that causes the 'UNHANDLED_PROTO' and 'INVALID_PROTO' drop reason. Therefore, users can locate their problems according to the document.
Reviewed-by: Hao Peng <flyingpeng@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
a15c89c7 |
| 24-Jan-2022 |
Eric Dumazet <edumazet@google.com> |
ipv4: do not use per netns icmp sockets
Back in linux-2.6.25 (commit 4a6ad7a141cb "[NETNS]: Make icmp_sk per namespace."), we added private per-cpu/per-netns ipv4 icmp sockets.
This adds memory and
ipv4: do not use per netns icmp sockets
Back in linux-2.6.25 (commit 4a6ad7a141cb "[NETNS]: Make icmp_sk per namespace."), we added private per-cpu/per-netns ipv4 icmp sockets.
This adds memory and cpu costs, which do not seem needed. Now typical servers have 256 or more cores, this adds considerable tax to netns users.
icmp sockets are used from BH context, are not receiving packets, and do not store any persistent state but the 'struct net' pointer.
icmp_xmit_lock() already makes sure to lock the chosen per-cpu socket.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|