History log of /linux/net/ipv4/icmp.c (Results 26 – 50 of 252)
Revision Date Author Comments
# 1fcd7945 14-Oct-2021 Xin Long <lucien.xin@gmail.com>

icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe

In icmp_build_probe(), the icmp_ext_echo_iio parsing should be done
step by step and skb_header_pointer() return value should always be
checke

icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe

In icmp_build_probe(), the icmp_ext_echo_iio parsing should be done
step by step and skb_header_pointer() return value should always be
checked, this patch fixes 3 places in there:

- On case ICMP_EXT_ECHO_CTYPE_NAME, it should only copy ident.name
from skb by skb_header_pointer(), its len is ident_len. Besides,
the return value of skb_header_pointer() should always be checked.

- On case ICMP_EXT_ECHO_CTYPE_INDEX, move ident_len check ahead of
skb_header_pointer(), and also do the return value check for
skb_header_pointer().

- On case ICMP_EXT_ECHO_CTYPE_ADDR, before accessing iio->ident.addr.
ctype3_hdr.addrlen, skb_header_pointer() should be called first,
then check its return value and ident_len.
On subcases ICMP_AFI_IP and ICMP_AFI_IP6, also do check for ident.
addr.ctype3_hdr.addrlen and skb_header_pointer()'s return value.
On subcase ICMP_AFI_IP, the len for skb_header_pointer() should be
"sizeof(iio->extobj_hdr) + sizeof(iio->ident.addr.ctype3_hdr) +
sizeof(struct in_addr)" or "ident_len".

v1->v2:
- To make it more clear, call skb_header_pointer() once only for
iio->indent's parsing as Jakub Suggested.
v2->v3:
- The extobj_hdr.length check against sizeof(_iio) should be done
before calling skb_header_pointer(), as Eric noticed.

Fixes: d329ea5bd884 ("icmp: add response to RFC 8335 PROBE messages")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/31628dd76657ea62f5cf78bb55da6b35240831f1.1634205050.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 1160dfa1 05-Aug-2021 Yajun Deng <yajun.deng@linux.dev>

net: Remove redundant if statements

The 'if (dev)' statement already move into dev_{put , hold}, so remove
redundant if statements.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: D

net: Remove redundant if statements

The 'if (dev)' statement already move into dev_{put , hold}, so remove
redundant if statements.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 1fd07f33 26-Jun-2021 Andreas Roeseler <andreas.a.roeseler@gmail.com>

ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages

This patch builds off of commit 2b246b2569cd2ac6ff700d0dce56b8bae29b1842
and adds functionality to respond to ICMPV6 PROBE requests.

Add

ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages

This patch builds off of commit 2b246b2569cd2ac6ff700d0dce56b8bae29b1842
and adds functionality to respond to ICMPV6 PROBE requests.

Add icmp_build_probe function to construct PROBE requests for both
ICMPV4 and ICMPV6.

Modify icmpv6_rcv to detect ICMPV6 PROBE messages and call the
icmpv6_echo_reply handler.

Modify icmpv6_echo_reply to build a PROBE response message based on the
queried interface.

This patch has been tested using a branch of the iputils git repo which can
be found here: https://github.com/Juniper-Clinic-2020/iputils/tree/probe-request

Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 32182747 18-Jun-2021 Toke Høiland-Jørgensen <toke@redhat.com>

icmp: don't send out ICMP messages with a source address of 0.0.0.0

When constructing ICMP response messages, the kernel will try to pick a
suitable source address for the outgoing packet. However,

icmp: don't send out ICMP messages with a source address of 0.0.0.0

When constructing ICMP response messages, the kernel will try to pick a
suitable source address for the outgoing packet. However, if no IPv4
addresses are configured on the system at all, this will fail and we end up
producing an ICMP message with a source address of 0.0.0.0. This can happen
on a box routing IPv4 traffic via v6 nexthops, for instance.

Since 0.0.0.0 is not generally routable on the internet, there's a good
chance that such ICMP messages will never make it back to the sender of the
original packet that the ICMP message was sent in response to. This, in
turn, can create connectivity and PMTUd problems for senders. Fortunately,
RFC7600 reserves a dummy address to be used as a source for ICMP
messages (192.0.0.8/32), so let's teach the kernel to substitute that
address as a last resort if the regular source address selection procedure
fails.

Below is a quick example reproducing this issue with network namespaces:

ip netns add ns0
ip l add type veth peer netns ns0
ip l set dev veth0 up
ip a add 10.0.0.1/24 dev veth0
ip a add fc00:dead:cafe:42::1/64 dev veth0
ip r add 10.1.0.0/24 via inet6 fc00:dead:cafe:42::2
ip -n ns0 l set dev veth0 up
ip -n ns0 a add fc00:dead:cafe:42::2/64 dev veth0
ip -n ns0 r add 10.0.0.0/24 via inet6 fc00:dead:cafe:42::1
ip netns exec ns0 sysctl -w net.ipv4.icmp_ratelimit=0
ip netns exec ns0 sysctl -w net.ipv4.ip_forward=1
tcpdump -tpni veth0 -c 2 icmp &
ping -w 1 10.1.0.1 > /dev/null
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 29, seq 1, length 64
IP 0.0.0.0 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
2 packets captured
2 packets received by filter
0 packets dropped by kernel

With this patch the above capture changes to:
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 31127, seq 1, length 64
IP 192.0.0.8 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Juliusz Chroboczek <jch@irif.fr>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# e32ea44c 03-Jun-2021 Andreas Roeseler <andreas.a.roeseler@gmail.com>

icmp: fix lib conflict with trinity

Including <linux/in.h> and <netinet/in.h> in the dependencies breaks
compilation of trinity due to multiple definitions. <linux/in.h> is only
used in <linux/icmp.

icmp: fix lib conflict with trinity

Including <linux/in.h> and <netinet/in.h> in the dependencies breaks
compilation of trinity due to multiple definitions. <linux/in.h> is only
used in <linux/icmp.h> to provide the definition of the struct in_addr,
but this can be substituted out by using the datatype __be32.

Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# e542d29c 27-Apr-2021 Andreas Roeseler <andreas.a.roeseler@gmail.com>

icmp: standardize naming of RFC 8335 PROBE constants

The current definitions of constants for PROBE, currently defined only
in the net-next kernel branch, are inconsistent, with
some beginning with

icmp: standardize naming of RFC 8335 PROBE constants

The current definitions of constants for PROBE, currently defined only
in the net-next kernel branch, are inconsistent, with
some beginning with ICMP and others with simply EXT. This patch
attempts to standardize the naming conventions of the constants for
PROBE before their release into a stable Kernel, and to update the
relevant definitions in net/ipv4/icmp.c.

Similarly, the definitions for the code field (previously
ICMP_EXT_MAL_QUERY, etc) use the same prefixes as the type field. This
patch adds _CODE_ to the prefix to clarify the distinction of these
constants.

Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210427153635.2591-1-andreas.a.roeseler@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 31433202 12-Apr-2021 Andreas Roeseler <andreas.a.roeseler@gmail.com>

icmp: ICMPV6: pass RFC 8335 reply messages to ping_rcv

The current icmp_rcv function drops all unknown ICMP types, including
ICMP_EXT_ECHOREPLY (type 43). In order to parse Extended Echo Reply messa

icmp: ICMPV6: pass RFC 8335 reply messages to ping_rcv

The current icmp_rcv function drops all unknown ICMP types, including
ICMP_EXT_ECHOREPLY (type 43). In order to parse Extended Echo Reply messages, we have
to pass these packets to the ping_rcv function, which does not do any
other filtering and passes the packet to the designated socket.

Pass incoming RFC 8335 ICMP Extended Echo Reply packets to the ping_rcv
handler instead of discarding the packet.

Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# d329ea5b 30-Mar-2021 Andreas Roeseler <andreas.a.roeseler@gmail.com>

icmp: add response to RFC 8335 PROBE messages

Modify the icmp_rcv function to check PROBE messages and call icmp_echo
if a PROBE request is detected.

Modify the existing icmp_echo function to respo

icmp: add response to RFC 8335 PROBE messages

Modify the icmp_rcv function to check PROBE messages and call icmp_echo
if a PROBE request is detected.

Modify the existing icmp_echo function to respond ot both ping and PROBE
requests.

This was tested using a custom modification to the iputils package and
wireshark. It supports IPV4 probing by name, ifindex, and probing by
both IPV4 and IPV6 addresses. It currently does not support responding
to probes off the proxy node (see RFC 8335 Section 2).

The modification to the iputils package is still in development and can
be found here: https://github.com/Juniper-Clinic-2020/iputils.git. It
supports full sending functionality of PROBE requests, but currently
does not parse the response messages, which is why Wireshark is required
to verify the sent and recieved PROBE messages. The modification adds
the ``-e'' flag to the command which allows the user to specify the
interface identifier to query the probed host. An example usage would be
<./ping -4 -e 1 [destination]> to send a PROBE request of ifindex 1 to the
destination node.

Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# ee576c47 23-Feb-2021 Jason A. Donenfeld <Jason@zx2c4.com>

net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending

The icmp{,v6}_send functions make all sorts of use of skb->cb, casting
it with IPCB or IP6CB, assuming the skb to have come directl

net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending

The icmp{,v6}_send functions make all sorts of use of skb->cb, casting
it with IPCB or IP6CB, assuming the skb to have come directly from the
inet layer. But when the packet comes from the ndo layer, especially
when forwarded, there's no telling what might be in skb->cb at that
point. As a result, the icmp sending code risks reading bogus memory
contents, which can result in nasty stack overflows such as this one
reported by a user:

panic+0x108/0x2ea
__stack_chk_fail+0x14/0x20
__icmp_send+0x5bd/0x5c0
icmp_ndo_send+0x148/0x160

In icmp_send, skb->cb is cast with IPCB and an ip_options struct is read
from it. The optlen parameter there is of particular note, as it can
induce writes beyond bounds. There are quite a few ways that can happen
in __ip_options_echo. For example:

// sptr/skb are attacker-controlled skb bytes
sptr = skb_network_header(skb);
// dptr/dopt points to stack memory allocated by __icmp_send
dptr = dopt->__data;
// sopt is the corrupt skb->cb in question
if (sopt->rr) {
optlen = sptr[sopt->rr+1]; // corrupt skb->cb + skb->data
soffset = sptr[sopt->rr+2]; // corrupt skb->cb + skb->data
// this now writes potentially attacker-controlled data, over
// flowing the stack:
memcpy(dptr, sptr+sopt->rr, optlen);
}

In the icmpv6_send case, the story is similar, but not as dire, as only
IP6CB(skb)->iif and IP6CB(skb)->dsthao are used. The dsthao case is
worse than the iif case, but it is passed to ipv6_find_tlv, which does
a bit of bounds checking on the value.

This is easy to simulate by doing a `memset(skb->cb, 0x41,
sizeof(skb->cb));` before calling icmp{,v6}_ndo_send, and it's only by
good fortune and the rarity of icmp sending from that context that we've
avoided reports like this until now. For example, in KASAN:

BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xa0e/0x12b0
Write of size 38 at addr ffff888006f1f80e by task ping/89
CPU: 2 PID: 89 Comm: ping Not tainted 5.10.0-rc7-debug+ #5
Call Trace:
dump_stack+0x9a/0xcc
print_address_description.constprop.0+0x1a/0x160
__kasan_report.cold+0x20/0x38
kasan_report+0x32/0x40
check_memory_region+0x145/0x1a0
memcpy+0x39/0x60
__ip_options_echo+0xa0e/0x12b0
__icmp_send+0x744/0x1700

Actually, out of the 4 drivers that do this, only gtp zeroed the cb for
the v4 case, while the rest did not. So this commit actually removes the
gtp-specific zeroing, while putting the code where it belongs in the
shared infrastructure of icmp{,v6}_ndo_send.

This commit fixes the issue by passing an empty IPCB or IP6CB along to
the functions that actually do the work. For the icmp_send, this was
already trivial, thanks to __icmp_send providing the plumbing function.
For icmpv6_send, this required a tiny bit of refactoring to make it
behave like the v4 case, after which it was straight forward.

Fixes: a2b78e9b2cac ("sunvnet: generate ICMP PTMUD messages for smaller port MTUs")
Reported-by: SinYu <liuxyon@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/netdev/CAF=yD-LOF116aHub6RMe8vB8ZpnrrnoTdqhobEx+bvoA8AsP0w@mail.gmail.com/T/
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20210223131858.72082-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 3df98d79 28-Sep-2020 Paul Moore <paul@paul-moore.com>

lsm,selinux: pass flowi_common instead of flowi to the LSM hooks

As pointed out by Herbert in a recent related patch, the LSM hooks do
not have the necessary address family information to use the fl

lsm,selinux: pass flowi_common instead of flowi to the LSM hooks

As pointed out by Herbert in a recent related patch, the LSM hooks do
not have the necessary address family information to use the flowi
struct safely. As none of the LSMs currently use any of the protocol
specific flowi information, replace the flowi pointers with pointers
to the address family independent flowi_common struct.

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

show more ...


# b38e7819 15-Oct-2020 Eric Dumazet <edumazet@google.com>

icmp: randomize the global rate limiter

Keyu Man reported that the ICMP rate limiter could be used
by attackers to get useful signal. Details will be provided
in an upcoming academic publication.

O

icmp: randomize the global rate limiter

Keyu Man reported that the ICMP rate limiter could be used
by attackers to get useful signal. Details will be provided
in an upcoming academic publication.

Our solution is to add some noise, so that the attackers
no longer can get help from the predictable token bucket limiter.

Fixes: 4cdf507d5452 ("icmp: add a global rate limitation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Keyu Man <kman001@ucr.edu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# e1e84eb5 12-Oct-2020 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

ipv4/icmp: l3mdev: Perform icmp error route lookup on source device routing table (v2)

As per RFC792, ICMP errors should be sent to the source host.

However, in configurations with Virtual Routing

ipv4/icmp: l3mdev: Perform icmp error route lookup on source device routing table (v2)

As per RFC792, ICMP errors should be sent to the source host.

However, in configurations with Virtual Routing and Forwarding tables,
looking up which routing table to use is currently done by using the
destination net_device.

commit 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to
determine L3 domain") changes the interface passed to
l3mdev_master_ifindex() and inet_addr_type_dev_table() from skb_in->dev
to skb_dst(skb_in)->dev. This effectively uses the destination device
rather than the source device for choosing which routing table should be
used to lookup where to send the ICMP error.

Therefore, if the source and destination interfaces are within separate
VRFs, or one in the global routing table and the other in a VRF, looking
up the source host in the destination interface's routing table will
fail if the destination interface's routing table contains no route to
the source host.

One observable effect of this issue is that traceroute does not work in
the following cases:

- Route leaking between global routing table and VRF
- Route leaking between VRFs

Preferably use the source device routing table when sending ICMP error
messages. If no source device is set, fall-back on the destination
device routing table. Else, use the main routing table (index 0).

[ It has been pointed out that a similar issue may exist with ICMP
errors triggered when forwarding between network namespaces. It would
be worthwhile to investigate, but is outside of the scope of this
investigation. ]

[ It has also been pointed out that a similar issue exists with
unreachable / fragmentation needed messages, which can be triggered by
changing the MTU of eth1 in r1 to 1400 and running:

ip netns exec h1 ping -s 1450 -Mdo -c1 172.16.2.2

Some investigation points to raw_icmp_error() and raw_err() as being
involved in this last scenario. The focus of this patch is TTL expired
ICMP messages, which go through icmp_route_lookup.
Investigation of failure modes related to raw_icmp_error() is beyond
this investigation's scope. ]

Fixes: 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to determine L3 domain")
Link: https://tools.ietf.org/html/rfc792
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 5af68891 29-Aug-2020 Miaohe Lin <linmiaohe@huawei.com>

net: clean up codestyle

This is a pure codestyle cleanup patch. No functional change intended.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 373c15c2 24-Aug-2020 Miaohe Lin <linmiaohe@huawei.com>

net: Use helper macro RT_TOS() in __icmp_send()

Use helper macro RT_TOS() to get tos in __icmp_send().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemlo

net: Use helper macro RT_TOS() in __icmp_send()

Use helper macro RT_TOS() to get tos in __icmp_send().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# cc44c17b 11-Jul-2020 Al Viro <viro@zeniv.linux.org.uk>

csum_partial_copy_nocheck(): drop the last argument

It's always 0. Note that we theoretically could use ~0U as well -
result will be the same modulo 0xffff, _if_ the damn thing did the
right thing

csum_partial_copy_nocheck(): drop the last argument

It's always 0. Note that we theoretically could use ~0U as well -
result will be the same modulo 0xffff, _if_ the damn thing did the
right thing for any value of initial sum; later we'll make use of
that when convenient.

However, unlike csum_and_copy_..._user(), there are instances that
did not work for arbitrary initial sums; c6x is one such.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

show more ...


# 3ea7ca80 11-Jul-2020 Al Viro <viro@zeniv.linux.org.uk>

icmp_push_reply(): reorder adding the checksum up

do csum_partial_copy_nocheck() on the first fragment, then
add the rest to it. Equivalent transformation.

That was the only caller of csum_partial

icmp_push_reply(): reorder adding the checksum up

do csum_partial_copy_nocheck() on the first fragment, then
add the rest to it. Equivalent transformation.

That was the only caller of csum_partial_copy_nocheck() that
might pass it non-zero as the last argument.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

show more ...


# 8d5930df 11-Jul-2020 Al Viro <viro@zeniv.linux.org.uk>

skb_copy_and_csum_bits(): don't bother with the last argument

it's always 0

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 01370434 24-Jul-2020 Willem de Bruijn <willemb@google.com>

icmp6: support rfc 4884

Extend the rfc 4884 read interface introduced for ipv4 in
commit eba75c587e81 ("icmp: support rfc 4884") to ipv6.

Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884.

Changes v

icmp6: support rfc 4884

Extend the rfc 4884 read interface introduced for ipv4 in
commit eba75c587e81 ("icmp: support rfc 4884") to ipv6.

Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884.

Changes v1->v2:
- make ipv6_icmp_error_rfc4884 static (file scope)

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 178c49d9 24-Jul-2020 Willem de Bruijn <willemb@google.com>

icmp: prepare rfc 4884 for ipv6

The RFC 4884 spec is largely the same between IPv4 and IPv6.
Factor out the IPv4 specific parts in preparation for IPv6 support:

- icmp types supported

- icmp heade

icmp: prepare rfc 4884 for ipv6

The RFC 4884 spec is largely the same between IPv4 and IPv6.
Factor out the IPv4 specific parts in preparation for IPv6 support:

- icmp types supported

- icmp header size, and thus offset to original datagram start

- datagram length field offset in icmp(6)hdr.

- datagram length field word size: 4B for IPv4, 8B for IPv6.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# c4e9e09f 24-Jul-2020 Willem de Bruijn <willemb@google.com>

icmp: revise rfc4884 tests

1) Only accept packets with original datagram len field >= header len.

The extension header must start after the original datagram headers.
The embedded datagram len fiel

icmp: revise rfc4884 tests

1) Only accept packets with original datagram len field >= header len.

The extension header must start after the original datagram headers.
The embedded datagram len field is compared against the 128B minimum
stipulated by RFC 4884. It is unlikely that headers extend beyond
this. But as we know the exact header length, check explicitly.

2) Remove the check that datagram length must be <= 576B.

This is a send constraint. There is no value in testing this on rx.
Within private networks it may be known safe to send larger packets.
Process these packets.

This test was also too lax. It compared original datagram length
rather than entire icmp packet length. The stand-alone fix would be:

- if (hlen + skb->len > 576)
+ if (-skb_network_offset(skb) + skb->len > 576)

Fixes: eba75c587e81 ("icmp: support rfc 4884")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# eba75c58 10-Jul-2020 Willem de Bruijn <willemb@google.com>

icmp: support rfc 4884

Add setsockopt SOL_IP/IP_RECVERR_4884 to return the offset to an
extension struct if present.

ICMP messages may include an extension structure after the original
datagram. RF

icmp: support rfc 4884

Add setsockopt SOL_IP/IP_RECVERR_4884 to return the offset to an
extension struct if present.

ICMP messages may include an extension structure after the original
datagram. RFC 4884 standardized this behavior. It stores the offset
in words to the extension header in u8 icmphdr.un.reserved[1].

The field is valid only for ICMP types destination unreachable, time
exceeded and parameter problem, if length is at least 128 bytes and
entire packet does not exceed 576 bytes.

Return the offset to the start of the extension struct when reading an
ICMP error from the error queue, if it matches the above constraints.

Do not return the raw u8 field. Return the offset from the start of
the user buffer, in bytes. The kernel does not return the network and
transport headers, so subtract those.

Also validate the headers. Return the offset regardless of validation,
as an invalid extension must still not be misinterpreted as part of
the original datagram. Note that !invalid does not imply valid. If
the extension version does not match, no validation can take place,
for instance.

For backward compatibility, make this optional, set by setsockopt
SOL_IP/IP_RECVERR_RFC4884. For API example and feature test, see
github.com/wdebruij/kerneltools/blob/master/tests/recv_icmp_v2.c

For forward compatibility, reserve only setsockopt value 1, leaving
other bits for additional icmp extensions.

Changes
v1->v2:
- convert word offset to byte offset from start of user buffer
- return in ee_data as u8 may be insufficient
- define extension struct and object header structs
- return len only if constraints met
- if returning len, also validate

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 0da7536f 01-Jul-2020 Willem de Bruijn <willemb@google.com>

ip: Fix SO_MARK in RST, ACK and ICMP packets

When no full socket is available, skbs are sent over a per-netns
control socket. Its sk_mark is temporarily adjusted to match that
of the real (request o

ip: Fix SO_MARK in RST, ACK and ICMP packets

When no full socket is available, skbs are sent over a per-netns
control socket. Its sk_mark is temporarily adjusted to match that
of the real (request or timewait) socket or to reflect an incoming
skb, so that the outgoing skb inherits this in __ip_make_skb.

Introduction of the socket cookie mark field broke this. Now the
skb is set through the cookie and cork:

<caller> # init sockc.mark from sk_mark or cmsg
ip_append_data
ip_setup_cork # convert sockc.mark to cork mark
ip_push_pending_frames
ip_finish_skb
__ip_make_skb # set skb->mark to cork mark

But I missed these special control sockets. Update all callers of
__ip(6)_make_skb that were originally missed.

For IPv6, the same two icmp(v6) paths are affected. The third
case is not, as commit 92e55f412cff ("tcp: don't annotate
mark on control socket from tcp_v6_send_response()") replaced
the ctl_sk->sk_mark with passing the mark field directly as a
function argument. That commit predates the commit that
introduced the bug.

Fixes: c6af0c227a22 ("ip: support SO_MARK cmsg")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reported-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 1cec2cac 27-Apr-2020 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

docs: networking: convert ip-sysctl.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark code blocks and literals as such;
- mark lists as such;
- mark tables a

docs: networking: convert ip-sysctl.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark code blocks and literals as such;
- mark lists as such;
- mark tables as such;
- use footnote markup;
- adjust identation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# a8eceea8 12-Mar-2020 Joe Perches <joe@perches.com>

inet: Use fallthrough;

Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.jo

inet: Use fallthrough;

Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/

And by hand:

net/ipv6/ip6_fib.c has a fallthrough comment outside of an #ifdef block
that causes gcc to emit a warning if converted in-place.

So move the new fallthrough; inside the containing #ifdef/#endif too.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 0b41713b 11-Feb-2020 Jason A. Donenfeld <Jason@zx2c4.com>

icmp: introduce helper for nat'd source address in network device context

This introduces a helper function to be called only by network drivers
that wraps calls to icmp[v6]_send in a conntrack tran

icmp: introduce helper for nat'd source address in network device context

This introduces a helper function to be called only by network drivers
that wraps calls to icmp[v6]_send in a conntrack transformation, in case
NAT has been used. We don't want to pollute the non-driver path, though,
so we introduce this as a helper to be called by places that actually
make use of this, as suggested by Florian.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


1234567891011