#
1fcd7945 |
| 14-Oct-2021 |
Xin Long <lucien.xin@gmail.com> |
icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe
In icmp_build_probe(), the icmp_ext_echo_iio parsing should be done step by step and skb_header_pointer() return value should always be checke
icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe
In icmp_build_probe(), the icmp_ext_echo_iio parsing should be done step by step and skb_header_pointer() return value should always be checked, this patch fixes 3 places in there:
- On case ICMP_EXT_ECHO_CTYPE_NAME, it should only copy ident.name from skb by skb_header_pointer(), its len is ident_len. Besides, the return value of skb_header_pointer() should always be checked.
- On case ICMP_EXT_ECHO_CTYPE_INDEX, move ident_len check ahead of skb_header_pointer(), and also do the return value check for skb_header_pointer().
- On case ICMP_EXT_ECHO_CTYPE_ADDR, before accessing iio->ident.addr. ctype3_hdr.addrlen, skb_header_pointer() should be called first, then check its return value and ident_len. On subcases ICMP_AFI_IP and ICMP_AFI_IP6, also do check for ident. addr.ctype3_hdr.addrlen and skb_header_pointer()'s return value. On subcase ICMP_AFI_IP, the len for skb_header_pointer() should be "sizeof(iio->extobj_hdr) + sizeof(iio->ident.addr.ctype3_hdr) + sizeof(struct in_addr)" or "ident_len".
v1->v2: - To make it more clear, call skb_header_pointer() once only for iio->indent's parsing as Jakub Suggested. v2->v3: - The extobj_hdr.length check against sizeof(_iio) should be done before calling skb_header_pointer(), as Eric noticed.
Fixes: d329ea5bd884 ("icmp: add response to RFC 8335 PROBE messages") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/31628dd76657ea62f5cf78bb55da6b35240831f1.1634205050.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
1160dfa1 |
| 05-Aug-2021 |
Yajun Deng <yajun.deng@linux.dev> |
net: Remove redundant if statements
The 'if (dev)' statement already move into dev_{put , hold}, so remove redundant if statements.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: D
net: Remove redundant if statements
The 'if (dev)' statement already move into dev_{put , hold}, so remove redundant if statements.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
1fd07f33 |
| 26-Jun-2021 |
Andreas Roeseler <andreas.a.roeseler@gmail.com> |
ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages
This patch builds off of commit 2b246b2569cd2ac6ff700d0dce56b8bae29b1842 and adds functionality to respond to ICMPV6 PROBE requests.
Add
ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages
This patch builds off of commit 2b246b2569cd2ac6ff700d0dce56b8bae29b1842 and adds functionality to respond to ICMPV6 PROBE requests.
Add icmp_build_probe function to construct PROBE requests for both ICMPV4 and ICMPV6.
Modify icmpv6_rcv to detect ICMPV6 PROBE messages and call the icmpv6_echo_reply handler.
Modify icmpv6_echo_reply to build a PROBE response message based on the queried interface.
This patch has been tested using a branch of the iputils git repo which can be found here: https://github.com/Juniper-Clinic-2020/iputils/tree/probe-request
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
32182747 |
| 18-Jun-2021 |
Toke Høiland-Jørgensen <toke@redhat.com> |
icmp: don't send out ICMP messages with a source address of 0.0.0.0
When constructing ICMP response messages, the kernel will try to pick a suitable source address for the outgoing packet. However,
icmp: don't send out ICMP messages with a source address of 0.0.0.0
When constructing ICMP response messages, the kernel will try to pick a suitable source address for the outgoing packet. However, if no IPv4 addresses are configured on the system at all, this will fail and we end up producing an ICMP message with a source address of 0.0.0.0. This can happen on a box routing IPv4 traffic via v6 nexthops, for instance.
Since 0.0.0.0 is not generally routable on the internet, there's a good chance that such ICMP messages will never make it back to the sender of the original packet that the ICMP message was sent in response to. This, in turn, can create connectivity and PMTUd problems for senders. Fortunately, RFC7600 reserves a dummy address to be used as a source for ICMP messages (192.0.0.8/32), so let's teach the kernel to substitute that address as a last resort if the regular source address selection procedure fails.
Below is a quick example reproducing this issue with network namespaces:
ip netns add ns0 ip l add type veth peer netns ns0 ip l set dev veth0 up ip a add 10.0.0.1/24 dev veth0 ip a add fc00:dead:cafe:42::1/64 dev veth0 ip r add 10.1.0.0/24 via inet6 fc00:dead:cafe:42::2 ip -n ns0 l set dev veth0 up ip -n ns0 a add fc00:dead:cafe:42::2/64 dev veth0 ip -n ns0 r add 10.0.0.0/24 via inet6 fc00:dead:cafe:42::1 ip netns exec ns0 sysctl -w net.ipv4.icmp_ratelimit=0 ip netns exec ns0 sysctl -w net.ipv4.ip_forward=1 tcpdump -tpni veth0 -c 2 icmp & ping -w 1 10.1.0.1 > /dev/null tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 29, seq 1, length 64 IP 0.0.0.0 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92 2 packets captured 2 packets received by filter 0 packets dropped by kernel
With this patch the above capture changes to: IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 31127, seq 1, length 64 IP 192.0.0.8 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Juliusz Chroboczek <jch@irif.fr> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
e32ea44c |
| 03-Jun-2021 |
Andreas Roeseler <andreas.a.roeseler@gmail.com> |
icmp: fix lib conflict with trinity
Including <linux/in.h> and <netinet/in.h> in the dependencies breaks compilation of trinity due to multiple definitions. <linux/in.h> is only used in <linux/icmp.
icmp: fix lib conflict with trinity
Including <linux/in.h> and <netinet/in.h> in the dependencies breaks compilation of trinity due to multiple definitions. <linux/in.h> is only used in <linux/icmp.h> to provide the definition of the struct in_addr, but this can be substituted out by using the datatype __be32.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
e542d29c |
| 27-Apr-2021 |
Andreas Roeseler <andreas.a.roeseler@gmail.com> |
icmp: standardize naming of RFC 8335 PROBE constants
The current definitions of constants for PROBE, currently defined only in the net-next kernel branch, are inconsistent, with some beginning with
icmp: standardize naming of RFC 8335 PROBE constants
The current definitions of constants for PROBE, currently defined only in the net-next kernel branch, are inconsistent, with some beginning with ICMP and others with simply EXT. This patch attempts to standardize the naming conventions of the constants for PROBE before their release into a stable Kernel, and to update the relevant definitions in net/ipv4/icmp.c.
Similarly, the definitions for the code field (previously ICMP_EXT_MAL_QUERY, etc) use the same prefixes as the type field. This patch adds _CODE_ to the prefix to clarify the distinction of these constants.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Acked-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210427153635.2591-1-andreas.a.roeseler@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
31433202 |
| 12-Apr-2021 |
Andreas Roeseler <andreas.a.roeseler@gmail.com> |
icmp: ICMPV6: pass RFC 8335 reply messages to ping_rcv
The current icmp_rcv function drops all unknown ICMP types, including ICMP_EXT_ECHOREPLY (type 43). In order to parse Extended Echo Reply messa
icmp: ICMPV6: pass RFC 8335 reply messages to ping_rcv
The current icmp_rcv function drops all unknown ICMP types, including ICMP_EXT_ECHOREPLY (type 43). In order to parse Extended Echo Reply messages, we have to pass these packets to the ping_rcv function, which does not do any other filtering and passes the packet to the designated socket.
Pass incoming RFC 8335 ICMP Extended Echo Reply packets to the ping_rcv handler instead of discarding the packet.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
d329ea5b |
| 30-Mar-2021 |
Andreas Roeseler <andreas.a.roeseler@gmail.com> |
icmp: add response to RFC 8335 PROBE messages
Modify the icmp_rcv function to check PROBE messages and call icmp_echo if a PROBE request is detected.
Modify the existing icmp_echo function to respo
icmp: add response to RFC 8335 PROBE messages
Modify the icmp_rcv function to check PROBE messages and call icmp_echo if a PROBE request is detected.
Modify the existing icmp_echo function to respond ot both ping and PROBE requests.
This was tested using a custom modification to the iputils package and wireshark. It supports IPV4 probing by name, ifindex, and probing by both IPV4 and IPV6 addresses. It currently does not support responding to probes off the proxy node (see RFC 8335 Section 2).
The modification to the iputils package is still in development and can be found here: https://github.com/Juniper-Clinic-2020/iputils.git. It supports full sending functionality of PROBE requests, but currently does not parse the response messages, which is why Wireshark is required to verify the sent and recieved PROBE messages. The modification adds the ``-e'' flag to the command which allows the user to specify the interface identifier to query the probed host. An example usage would be <./ping -4 -e 1 [destination]> to send a PROBE request of ifindex 1 to the destination node.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
ee576c47 |
| 23-Feb-2021 |
Jason A. Donenfeld <Jason@zx2c4.com> |
net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
The icmp{,v6}_send functions make all sorts of use of skb->cb, casting it with IPCB or IP6CB, assuming the skb to have come directl
net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
The icmp{,v6}_send functions make all sorts of use of skb->cb, casting it with IPCB or IP6CB, assuming the skb to have come directly from the inet layer. But when the packet comes from the ndo layer, especially when forwarded, there's no telling what might be in skb->cb at that point. As a result, the icmp sending code risks reading bogus memory contents, which can result in nasty stack overflows such as this one reported by a user:
panic+0x108/0x2ea __stack_chk_fail+0x14/0x20 __icmp_send+0x5bd/0x5c0 icmp_ndo_send+0x148/0x160
In icmp_send, skb->cb is cast with IPCB and an ip_options struct is read from it. The optlen parameter there is of particular note, as it can induce writes beyond bounds. There are quite a few ways that can happen in __ip_options_echo. For example:
// sptr/skb are attacker-controlled skb bytes sptr = skb_network_header(skb); // dptr/dopt points to stack memory allocated by __icmp_send dptr = dopt->__data; // sopt is the corrupt skb->cb in question if (sopt->rr) { optlen = sptr[sopt->rr+1]; // corrupt skb->cb + skb->data soffset = sptr[sopt->rr+2]; // corrupt skb->cb + skb->data // this now writes potentially attacker-controlled data, over // flowing the stack: memcpy(dptr, sptr+sopt->rr, optlen); }
In the icmpv6_send case, the story is similar, but not as dire, as only IP6CB(skb)->iif and IP6CB(skb)->dsthao are used. The dsthao case is worse than the iif case, but it is passed to ipv6_find_tlv, which does a bit of bounds checking on the value.
This is easy to simulate by doing a `memset(skb->cb, 0x41, sizeof(skb->cb));` before calling icmp{,v6}_ndo_send, and it's only by good fortune and the rarity of icmp sending from that context that we've avoided reports like this until now. For example, in KASAN:
BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xa0e/0x12b0 Write of size 38 at addr ffff888006f1f80e by task ping/89 CPU: 2 PID: 89 Comm: ping Not tainted 5.10.0-rc7-debug+ #5 Call Trace: dump_stack+0x9a/0xcc print_address_description.constprop.0+0x1a/0x160 __kasan_report.cold+0x20/0x38 kasan_report+0x32/0x40 check_memory_region+0x145/0x1a0 memcpy+0x39/0x60 __ip_options_echo+0xa0e/0x12b0 __icmp_send+0x744/0x1700
Actually, out of the 4 drivers that do this, only gtp zeroed the cb for the v4 case, while the rest did not. So this commit actually removes the gtp-specific zeroing, while putting the code where it belongs in the shared infrastructure of icmp{,v6}_ndo_send.
This commit fixes the issue by passing an empty IPCB or IP6CB along to the functions that actually do the work. For the icmp_send, this was already trivial, thanks to __icmp_send providing the plumbing function. For icmpv6_send, this required a tiny bit of refactoring to make it behave like the v4 case, after which it was straight forward.
Fixes: a2b78e9b2cac ("sunvnet: generate ICMP PTMUD messages for smaller port MTUs") Reported-by: SinYu <liuxyon@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/netdev/CAF=yD-LOF116aHub6RMe8vB8ZpnrrnoTdqhobEx+bvoA8AsP0w@mail.gmail.com/T/ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20210223131858.72082-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
3df98d79 |
| 28-Sep-2020 |
Paul Moore <paul@paul-moore.com> |
lsm,selinux: pass flowi_common instead of flowi to the LSM hooks
As pointed out by Herbert in a recent related patch, the LSM hooks do not have the necessary address family information to use the fl
lsm,selinux: pass flowi_common instead of flowi to the LSM hooks
As pointed out by Herbert in a recent related patch, the LSM hooks do not have the necessary address family information to use the flowi struct safely. As none of the LSMs currently use any of the protocol specific flowi information, replace the flowi pointers with pointers to the address family independent flowi_common struct.
Reported-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
show more ...
|
#
b38e7819 |
| 15-Oct-2020 |
Eric Dumazet <edumazet@google.com> |
icmp: randomize the global rate limiter
Keyu Man reported that the ICMP rate limiter could be used by attackers to get useful signal. Details will be provided in an upcoming academic publication.
O
icmp: randomize the global rate limiter
Keyu Man reported that the ICMP rate limiter could be used by attackers to get useful signal. Details will be provided in an upcoming academic publication.
Our solution is to add some noise, so that the attackers no longer can get help from the predictable token bucket limiter.
Fixes: 4cdf507d5452 ("icmp: add a global rate limitation") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Keyu Man <kman001@ucr.edu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
e1e84eb5 |
| 12-Oct-2020 |
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> |
ipv4/icmp: l3mdev: Perform icmp error route lookup on source device routing table (v2)
As per RFC792, ICMP errors should be sent to the source host.
However, in configurations with Virtual Routing
ipv4/icmp: l3mdev: Perform icmp error route lookup on source device routing table (v2)
As per RFC792, ICMP errors should be sent to the source host.
However, in configurations with Virtual Routing and Forwarding tables, looking up which routing table to use is currently done by using the destination net_device.
commit 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to determine L3 domain") changes the interface passed to l3mdev_master_ifindex() and inet_addr_type_dev_table() from skb_in->dev to skb_dst(skb_in)->dev. This effectively uses the destination device rather than the source device for choosing which routing table should be used to lookup where to send the ICMP error.
Therefore, if the source and destination interfaces are within separate VRFs, or one in the global routing table and the other in a VRF, looking up the source host in the destination interface's routing table will fail if the destination interface's routing table contains no route to the source host.
One observable effect of this issue is that traceroute does not work in the following cases:
- Route leaking between global routing table and VRF - Route leaking between VRFs
Preferably use the source device routing table when sending ICMP error messages. If no source device is set, fall-back on the destination device routing table. Else, use the main routing table (index 0).
[ It has been pointed out that a similar issue may exist with ICMP errors triggered when forwarding between network namespaces. It would be worthwhile to investigate, but is outside of the scope of this investigation. ]
[ It has also been pointed out that a similar issue exists with unreachable / fragmentation needed messages, which can be triggered by changing the MTU of eth1 in r1 to 1400 and running:
ip netns exec h1 ping -s 1450 -Mdo -c1 172.16.2.2
Some investigation points to raw_icmp_error() and raw_err() as being involved in this last scenario. The focus of this patch is TTL expired ICMP messages, which go through icmp_route_lookup. Investigation of failure modes related to raw_icmp_error() is beyond this investigation's scope. ]
Fixes: 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to determine L3 domain") Link: https://tools.ietf.org/html/rfc792 Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
5af68891 |
| 29-Aug-2020 |
Miaohe Lin <linmiaohe@huawei.com> |
net: clean up codestyle
This is a pure codestyle cleanup patch. No functional change intended.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
373c15c2 |
| 24-Aug-2020 |
Miaohe Lin <linmiaohe@huawei.com> |
net: Use helper macro RT_TOS() in __icmp_send()
Use helper macro RT_TOS() to get tos in __icmp_send().
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemlo
net: Use helper macro RT_TOS() in __icmp_send()
Use helper macro RT_TOS() to get tos in __icmp_send().
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
cc44c17b |
| 11-Jul-2020 |
Al Viro <viro@zeniv.linux.org.uk> |
csum_partial_copy_nocheck(): drop the last argument
It's always 0. Note that we theoretically could use ~0U as well - result will be the same modulo 0xffff, _if_ the damn thing did the right thing
csum_partial_copy_nocheck(): drop the last argument
It's always 0. Note that we theoretically could use ~0U as well - result will be the same modulo 0xffff, _if_ the damn thing did the right thing for any value of initial sum; later we'll make use of that when convenient.
However, unlike csum_and_copy_..._user(), there are instances that did not work for arbitrary initial sums; c6x is one such.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
3ea7ca80 |
| 11-Jul-2020 |
Al Viro <viro@zeniv.linux.org.uk> |
icmp_push_reply(): reorder adding the checksum up
do csum_partial_copy_nocheck() on the first fragment, then add the rest to it. Equivalent transformation.
That was the only caller of csum_partial
icmp_push_reply(): reorder adding the checksum up
do csum_partial_copy_nocheck() on the first fragment, then add the rest to it. Equivalent transformation.
That was the only caller of csum_partial_copy_nocheck() that might pass it non-zero as the last argument.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
8d5930df |
| 11-Jul-2020 |
Al Viro <viro@zeniv.linux.org.uk> |
skb_copy_and_csum_bits(): don't bother with the last argument
it's always 0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
01370434 |
| 24-Jul-2020 |
Willem de Bruijn <willemb@google.com> |
icmp6: support rfc 4884
Extend the rfc 4884 read interface introduced for ipv4 in commit eba75c587e81 ("icmp: support rfc 4884") to ipv6.
Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884.
Changes v
icmp6: support rfc 4884
Extend the rfc 4884 read interface introduced for ipv4 in commit eba75c587e81 ("icmp: support rfc 4884") to ipv6.
Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884.
Changes v1->v2: - make ipv6_icmp_error_rfc4884 static (file scope)
Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
178c49d9 |
| 24-Jul-2020 |
Willem de Bruijn <willemb@google.com> |
icmp: prepare rfc 4884 for ipv6
The RFC 4884 spec is largely the same between IPv4 and IPv6. Factor out the IPv4 specific parts in preparation for IPv6 support:
- icmp types supported
- icmp heade
icmp: prepare rfc 4884 for ipv6
The RFC 4884 spec is largely the same between IPv4 and IPv6. Factor out the IPv4 specific parts in preparation for IPv6 support:
- icmp types supported
- icmp header size, and thus offset to original datagram start
- datagram length field offset in icmp(6)hdr.
- datagram length field word size: 4B for IPv4, 8B for IPv6.
Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
c4e9e09f |
| 24-Jul-2020 |
Willem de Bruijn <willemb@google.com> |
icmp: revise rfc4884 tests
1) Only accept packets with original datagram len field >= header len.
The extension header must start after the original datagram headers. The embedded datagram len fiel
icmp: revise rfc4884 tests
1) Only accept packets with original datagram len field >= header len.
The extension header must start after the original datagram headers. The embedded datagram len field is compared against the 128B minimum stipulated by RFC 4884. It is unlikely that headers extend beyond this. But as we know the exact header length, check explicitly.
2) Remove the check that datagram length must be <= 576B.
This is a send constraint. There is no value in testing this on rx. Within private networks it may be known safe to send larger packets. Process these packets.
This test was also too lax. It compared original datagram length rather than entire icmp packet length. The stand-alone fix would be:
- if (hlen + skb->len > 576) + if (-skb_network_offset(skb) + skb->len > 576)
Fixes: eba75c587e81 ("icmp: support rfc 4884") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
eba75c58 |
| 10-Jul-2020 |
Willem de Bruijn <willemb@google.com> |
icmp: support rfc 4884
Add setsockopt SOL_IP/IP_RECVERR_4884 to return the offset to an extension struct if present.
ICMP messages may include an extension structure after the original datagram. RF
icmp: support rfc 4884
Add setsockopt SOL_IP/IP_RECVERR_4884 to return the offset to an extension struct if present.
ICMP messages may include an extension structure after the original datagram. RFC 4884 standardized this behavior. It stores the offset in words to the extension header in u8 icmphdr.un.reserved[1].
The field is valid only for ICMP types destination unreachable, time exceeded and parameter problem, if length is at least 128 bytes and entire packet does not exceed 576 bytes.
Return the offset to the start of the extension struct when reading an ICMP error from the error queue, if it matches the above constraints.
Do not return the raw u8 field. Return the offset from the start of the user buffer, in bytes. The kernel does not return the network and transport headers, so subtract those.
Also validate the headers. Return the offset regardless of validation, as an invalid extension must still not be misinterpreted as part of the original datagram. Note that !invalid does not imply valid. If the extension version does not match, no validation can take place, for instance.
For backward compatibility, make this optional, set by setsockopt SOL_IP/IP_RECVERR_RFC4884. For API example and feature test, see github.com/wdebruij/kerneltools/blob/master/tests/recv_icmp_v2.c
For forward compatibility, reserve only setsockopt value 1, leaving other bits for additional icmp extensions.
Changes v1->v2: - convert word offset to byte offset from start of user buffer - return in ee_data as u8 may be insufficient - define extension struct and object header structs - return len only if constraints met - if returning len, also validate
Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
0da7536f |
| 01-Jul-2020 |
Willem de Bruijn <willemb@google.com> |
ip: Fix SO_MARK in RST, ACK and ICMP packets
When no full socket is available, skbs are sent over a per-netns control socket. Its sk_mark is temporarily adjusted to match that of the real (request o
ip: Fix SO_MARK in RST, ACK and ICMP packets
When no full socket is available, skbs are sent over a per-netns control socket. Its sk_mark is temporarily adjusted to match that of the real (request or timewait) socket or to reflect an incoming skb, so that the outgoing skb inherits this in __ip_make_skb.
Introduction of the socket cookie mark field broke this. Now the skb is set through the cookie and cork:
<caller> # init sockc.mark from sk_mark or cmsg ip_append_data ip_setup_cork # convert sockc.mark to cork mark ip_push_pending_frames ip_finish_skb __ip_make_skb # set skb->mark to cork mark
But I missed these special control sockets. Update all callers of __ip(6)_make_skb that were originally missed.
For IPv6, the same two icmp(v6) paths are affected. The third case is not, as commit 92e55f412cff ("tcp: don't annotate mark on control socket from tcp_v6_send_response()") replaced the ctl_sk->sk_mark with passing the mark field directly as a function argument. That commit predates the commit that introduced the bug.
Fixes: c6af0c227a22 ("ip: support SO_MARK cmsg") Signed-off-by: Willem de Bruijn <willemb@google.com> Reported-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
1cec2cac |
| 27-Apr-2020 |
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> |
docs: networking: convert ip-sysctl.txt to ReST
- add SPDX header; - adjust titles and chapters, adding proper markups; - mark code blocks and literals as such; - mark lists as such; - mark tables a
docs: networking: convert ip-sysctl.txt to ReST
- add SPDX header; - adjust titles and chapters, adding proper markups; - mark code blocks and literals as such; - mark lists as such; - mark tables as such; - use footnote markup; - adjust identation, whitespaces and blank lines; - add to networking/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
a8eceea8 |
| 12-Mar-2020 |
Joe Perches <joe@perches.com> |
inet: Use fallthrough;
Convert the various uses of fallthrough comments to fallthrough;
Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.jo
inet: Use fallthrough;
Convert the various uses of fallthrough comments to fallthrough;
Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/
And by hand:
net/ipv6/ip6_fib.c has a fallthrough comment outside of an #ifdef block that causes gcc to emit a warning if converted in-place.
So move the new fallthrough; inside the containing #ifdef/#endif too.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
0b41713b |
| 11-Feb-2020 |
Jason A. Donenfeld <Jason@zx2c4.com> |
icmp: introduce helper for nat'd source address in network device context
This introduces a helper function to be called only by network drivers that wraps calls to icmp[v6]_send in a conntrack tran
icmp: introduce helper for nat'd source address in network device context
This introduces a helper function to be called only by network drivers that wraps calls to icmp[v6]_send in a conntrack transformation, in case NAT has been used. We don't want to pollute the non-driver path, though, so we introduce this as a helper to be called by places that actually make use of this, as suggested by Florian.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|