History log of /linux/net/sunrpc/xprtsock.c (Results 1 – 25 of 508)
Revision Date Author Comments
# ca5d1fce 01-May-2024 Joel Granados <j.granados@samsung.com>

net: sunrpc: Remove the now superfluous sentinel elements from ctl_table array

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (

net: sunrpc: Remove the now superfluous sentinel elements from ctl_table array

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

* Remove sentinel element from ctl_table structs.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 8e088a20 02-Apr-2024 Olga Kornievskaia <kolga@netapp.com>

SUNRPC: add a missing rpc_stat for TCP TLS

Commit 1548036ef120 ("nfs: make the rpc_stat per net namespace") added
functionality to specify rpc_stats function but missed adding it to the
TCP TLS func

SUNRPC: add a missing rpc_stat for TCP TLS

Commit 1548036ef120 ("nfs: make the rpc_stat per net namespace") added
functionality to specify rpc_stats function but missed adding it to the
TCP TLS functionality. As the result, mounting with xprtsec=tls lead to
the following kernel oops.

[ 128.984192] Unable to handle kernel NULL pointer dereference at
virtual address 000000000000001c
[ 128.985058] Mem abort info:
[ 128.985372] ESR = 0x0000000096000004
[ 128.985709] EC = 0x25: DABT (current EL), IL = 32 bits
[ 128.986176] SET = 0, FnV = 0
[ 128.986521] EA = 0, S1PTW = 0
[ 128.986804] FSC = 0x04: level 0 translation fault
[ 128.987229] Data abort info:
[ 128.987597] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 128.988169] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 128.988811] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 128.989302] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000106c84000
[ 128.990048] [000000000000001c] pgd=0000000000000000, p4d=0000000000000000
[ 128.990736] Internal error: Oops: 0000000096000004 [#1] SMP
[ 128.991168] Modules linked in: nfs_layout_nfsv41_files
rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace netfs
uinput dm_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill
ip_set nf_tables nfnetlink qrtr vsock_loopback
vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
sunrpc vfat fat uvcvideo videobuf2_vmalloc videobuf2_memops uvc
videobuf2_v4l2 videodev videobuf2_common mc vmw_vmci xfs libcrc32c
e1000e crct10dif_ce ghash_ce sha2_ce vmwgfx nvme sha256_arm64
nvme_core sr_mod cdrom sha1_ce drm_ttm_helper ttm drm_kms_helper drm
sg fuse
[ 128.996466] CPU: 0 PID: 179 Comm: kworker/u4:26 Kdump: loaded Not
tainted 6.8.0-rc6+ #12
[ 128.997226] Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS
VMW201.00V.21805430.BA64.2305221830 05/22/2023
[ 128.998084] Workqueue: xprtiod xs_tcp_tls_setup_socket [sunrpc]
[ 128.998701] pstate: 81400005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 128.999384] pc : call_start+0x74/0x138 [sunrpc]
[ 128.999809] lr : __rpc_execute+0xb8/0x3e0 [sunrpc]
[ 129.000244] sp : ffff8000832b3a00
[ 129.000508] x29: ffff8000832b3a00 x28: ffff800081ac79c0 x27: ffff800081ac7000
[ 129.001111] x26: 0000000004248060 x25: 0000000000000000 x24: ffff800081596008
[ 129.001757] x23: ffff80007b087240 x22: ffff00009a509d30 x21: 0000000000000000
[ 129.002345] x20: ffff000090075600 x19: ffff00009a509d00 x18: ffffffffffffffff
[ 129.002912] x17: 733d4d4554535953 x16: 42555300312d746e x15: ffff8000832b3a88
[ 129.003464] x14: ffffffffffffffff x13: ffff8000832b3a7d x12: 0000000000000008
[ 129.004021] x11: 0101010101010101 x10: ffff8000150cb560 x9 : ffff80007b087c00
[ 129.004577] x8 : ffff00009a509de0 x7 : 0000000000000000 x6 : 00000000be8c4ee3
[ 129.005026] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff000094d56680
[ 129.005425] x2 : ffff80007b0637f8 x1 : ffff000090075600 x0 : ffff00009a509d00
[ 129.005824] Call trace:
[ 129.005967] call_start+0x74/0x138 [sunrpc]
[ 129.006233] __rpc_execute+0xb8/0x3e0 [sunrpc]
[ 129.006506] rpc_execute+0x160/0x1d8 [sunrpc]
[ 129.006778] rpc_run_task+0x148/0x1f8 [sunrpc]
[ 129.007204] tls_probe+0x80/0xd0 [sunrpc]
[ 129.007460] rpc_ping+0x28/0x80 [sunrpc]
[ 129.007715] rpc_create_xprt+0x134/0x1a0 [sunrpc]
[ 129.007999] rpc_create+0x128/0x2a0 [sunrpc]
[ 129.008264] xs_tcp_tls_setup_socket+0xdc/0x508 [sunrpc]
[ 129.008583] process_one_work+0x174/0x3c8
[ 129.008813] worker_thread+0x2c8/0x3e0
[ 129.009033] kthread+0x100/0x110
[ 129.009225] ret_from_fork+0x10/0x20
[ 129.009432] Code: f0ffffc2 911fe042 aa1403e1 aa1303e0 (b9401c83)

Fixes: 1548036ef120 ("nfs: make the rpc_stat per net namespace")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# 3f0ba614 26-Jan-2024 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove stale comments

bc_close() and bc_destroy now do something, so the comments are
no longer correct. Commit 6221f1d9b63f ("SUNRPC: Fix backchannel
RPC soft lockups") should have removed

SUNRPC: Remove stale comments

bc_close() and bc_destroy now do something, so the comments are
no longer correct. Commit 6221f1d9b63f ("SUNRPC: Fix backchannel
RPC soft lockups") should have removed these.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

show more ...


# 3f7edeac 12-Dec-2023 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Add a transport callback to handle dequeuing of an RPC request

Add a transport level callback to allow it to handle the consequences of
dequeuing the request that was in the process of being

SUNRPC: Add a transport callback to handle dequeuing of an RPC request

Add a transport level callback to allow it to handle the consequences of
dequeuing the request that was in the process of being transmitted.
For something like a TCP connection, we may need to disconnect if the
request was partially transmitted.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# 31d90deb 27-Sep-2023 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Don't retry using the same source port if connection failed

If the TCP connection attempt fails without ever establishing a
connection, then assume the problem may be the server is rejecting

SUNRPC: Don't retry using the same source port if connection failed

If the TCP connection attempt fails without ever establishing a
connection, then assume the problem may be the server is rejecting us
due to port reuse.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# f663507e 17-Sep-2023 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Force close the socket when a hard error is reported

Fix up xs_wake_error() to close the socket when a hard error is being
reported. Usually, that means an ECONNRESET was received on a conne

SUNRPC: Force close the socket when a hard error is reported

Fix up xs_wake_error() to close the socket when a hard error is being
reported. Usually, that means an ECONNRESET was received on a connection
attempt.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# 26e8bfa3 26-Sep-2023 Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC/TLS: Lock the lower_xprt during the tls handshake

Otherwise we run the risk of having the lower_xprt freed from underneath
us, causing an oops that looks like this:

[ 224.150698] BUG: kerne

SUNRPC/TLS: Lock the lower_xprt during the tls handshake

Otherwise we run the risk of having the lower_xprt freed from underneath
us, causing an oops that looks like this:

[ 224.150698] BUG: kernel NULL pointer dereference, address: 0000000000000018
[ 224.150951] #PF: supervisor read access in kernel mode
[ 224.151117] #PF: error_code(0x0000) - not-present page
[ 224.151278] PGD 0 P4D 0
[ 224.151361] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 224.151499] CPU: 2 PID: 99 Comm: kworker/u10:6 Not tainted 6.6.0-rc3-g6465e260f487 #41264 a00b0960990fb7bc6d6a330ee03588b67f08a47b
[ 224.151977] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
[ 224.152216] Workqueue: xprtiod xs_tcp_tls_setup_socket [sunrpc]
[ 224.152434] RIP: 0010:xs_tcp_tls_setup_socket+0x3cc/0x7e0 [sunrpc]
[ 224.152643] Code: 00 00 48 8b 7c 24 08 e9 f3 01 00 00 48 83 7b c0 00 0f 85 d2 01 00 00 49 8d 84 24 f8 05 00 00 48 89 44 24 10 48 8b 00 48 89 c5 <4c> 8b 68 18 66 41 83 3f 0a 75 71 45 31 ff 4c 89 ef 31 f6 e8 5c 76
[ 224.153246] RSP: 0018:ffffb00ec060fd18 EFLAGS: 00010246
[ 224.153427] RAX: 0000000000000000 RBX: ffff8c06c2e53e40 RCX: 0000000000000001
[ 224.153652] RDX: ffff8c073bca2408 RSI: 0000000000000282 RDI: ffff8c06c259ee00
[ 224.153868] RBP: 0000000000000000 R08: ffffffff9da55aa0 R09: 0000000000000001
[ 224.154084] R10: 00000034306c30f1 R11: 0000000000000002 R12: ffff8c06c2e51800
[ 224.154300] R13: ffff8c06c355d400 R14: 0000000004208160 R15: ffff8c06c2e53820
[ 224.154521] FS: 0000000000000000(0000) GS:ffff8c073bd00000(0000) knlGS:0000000000000000
[ 224.154763] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 224.154940] CR2: 0000000000000018 CR3: 0000000062c1e000 CR4: 0000000000750ee0
[ 224.155157] PKRU: 55555554
[ 224.155244] Call Trace:
[ 224.155325] <TASK>
[ 224.155395] ? __die_body+0x68/0xb0
[ 224.155507] ? page_fault_oops+0x34c/0x3a0
[ 224.155635] ? _raw_spin_unlock_irqrestore+0xe/0x40
[ 224.155793] ? exc_page_fault+0x7a/0x1b0
[ 224.155916] ? asm_exc_page_fault+0x26/0x30
[ 224.156047] ? xs_tcp_tls_setup_socket+0x3cc/0x7e0 [sunrpc ae3a15912ae37fd51dafbdbc2dbd069117f8f5c8]
[ 224.156367] ? xs_tcp_tls_setup_socket+0x2fe/0x7e0 [sunrpc ae3a15912ae37fd51dafbdbc2dbd069117f8f5c8]
[ 224.156697] ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc ae3a15912ae37fd51dafbdbc2dbd069117f8f5c8]
[ 224.157013] process_scheduled_works+0x24e/0x450
[ 224.157158] worker_thread+0x21c/0x2d0
[ 224.157275] ? __pfx_worker_thread+0x10/0x10
[ 224.157409] kthread+0xe8/0x110
[ 224.157510] ? __pfx_kthread+0x10/0x10
[ 224.157628] ret_from_fork+0x37/0x50
[ 224.157741] ? __pfx_kthread+0x10/0x10
[ 224.157859] ret_from_fork_asm+0x1b/0x30
[ 224.157983] </TASK>

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

show more ...


# d2ee4138 19-Aug-2023 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Allow specification of TCP client connect timeout at setup

When we create a TCP transport, the connect timeout parameters are
currently fixed to be 90s. This is problematic in the pNFS flexf

SUNRPC: Allow specification of TCP client connect timeout at setup

When we create a TCP transport, the connect timeout parameters are
currently fixed to be 90s. This is problematic in the pNFS flexfiles
case, where we may have multiple mirrors, and we would like to fail over
quickly to the next mirror if a data server is down.

This patch adds the ability to specify the connection parameters at RPC
client creation time.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

show more ...


# 3e6ff89d 19-Aug-2023 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Refactor and simplify connect timeout

Instead of requiring the requests to redrive the connection several
times, just let the TCP connect code manage it now that we've adjusted
the TCP_SYNCN

SUNRPC: Refactor and simplify connect timeout

Instead of requiring the requests to redrive the connection several
times, just let the TCP connect code manage it now that we've adjusted
the TCP_SYNCNT value.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

show more ...


# 3a107f07 19-Aug-2023 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Set the TCP_SYNCNT to match the socket timeout

Set the TCP SYN count so that we abort the connection attempt at around
the expected timeout value.

Signed-off-by: Trond Myklebust <trond.mykl

SUNRPC: Set the TCP_SYNCNT to match the socket timeout

Set the TCP SYN count so that we abort the connection attempt at around
the expected timeout value.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

show more ...


# 39067dda 27-Jul-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Use new helpers to handle TLS Alerts

Use the helpers to parse the level and description fields in
incoming alerts. "Warning" alerts are discarded, and "fatal"
alerts mean the session is no l

SUNRPC: Use new helpers to handle TLS Alerts

Use the helpers to parse the level and description fields in
incoming alerts. "Warning" alerts are discarded, and "fatal"
alerts mean the session is no longer valid.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047944747.5241.1974889594004407123.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 5dd5ad68 27-Jul-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Send TLS Closure alerts before closing a TCP socket

Before closing a TCP connection, the TLS protocol wants peers to
send session close Alert notifications. Add those in both the RPC
client

SUNRPC: Send TLS Closure alerts before closing a TCP socket

Before closing a TCP connection, the TLS protocol wants peers to
send session close Alert notifications. Add those in both the RPC
client and server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047939404.5241.14392506226409865832.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 6a7eccef 27-Jul-2023 Chuck Lever <chuck.lever@oracle.com>

net/tls: Move TLS protocol elements to a separate header

Kernel TLS consumers will need definitions of various parts of the
TLS protocol, but often do not need the function declarations and
other in

net/tls: Move TLS protocol elements to a separate header

Kernel TLS consumers will need definitions of various parts of the
TLS protocol, but often do not need the function declarations and
other infrastructure provided in <net/tls.h>.

Break out existing standardized protocol elements into a separate
header, and make room for a few more elements in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047931374.5241.7713175865185969309.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# 75eb6af7 07-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Add a TCP-with-TLS RPC transport class

Use the new TLS handshake API to enable the SunRPC client code
to request a TLS handshake. This implements support for RFC 9289,
only on TCP sockets.

SUNRPC: Add a TCP-with-TLS RPC transport class

Use the new TLS handshake API to enable the SunRPC client code
to request a TLS handshake. This implements support for RFC 9289,
only on TCP sockets.

Upper layers such as NFS use RPC-with-TLS to protect in-transit
traffic.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# dea034b9 07-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Capture CMSG metadata on client-side receive

kTLS sockets use CMSG to report decryption errors and the need
for session re-keying.

For RPC-with-TLS, an "application data" message contains a

SUNRPC: Capture CMSG metadata on client-side receive

kTLS sockets use CMSG to report decryption errors and the need
for session re-keying.

For RPC-with-TLS, an "application data" message contains a ULP
payload, and that is passed along to the RPC client. An "alert"
message triggers connection reset. Everything else is discarded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# 0d3ca07f 07-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Ignore data_ready callbacks during TLS handshakes

The RPC header parser doesn't recognize TLS handshake traffic, so it
will close the connection prematurely with an error. To avoid that,
shu

SUNRPC: Ignore data_ready callbacks during TLS handshakes

The RPC header parser doesn't recognize TLS handshake traffic, so it
will close the connection prematurely with an error. To avoid that,
shunt the transport's data_ready callback when there is a TLS
handshake in progress.

The XPRT_SOCK_IGNORE_RECV flag will be toggled by code added in a
subsequent patch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# 4388ce05 10-May-2023 NeilBrown <neilb@suse.de>

SUNRPC: support abstract unix socket addresses

An "abtract" address for an AF_UNIX socket start with a nul and can
contain any bytes for the given length, but traditionally doesn't
contain other nul

SUNRPC: support abstract unix socket addresses

An "abtract" address for an AF_UNIX socket start with a nul and can
contain any bytes for the given length, but traditionally doesn't
contain other nuls. When reported, the leading nul is replaced by '@'.

sunrpc currently rejects connections to these addresses and reports them
as an empty string. To provide support for future use of these
addresses, allow them for outgoing connections and report them more
usefully.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

show more ...


# c946cb69 11-Mar-2023 Luis Chamberlain <mcgrof@kernel.org>

sunrpc: simplify one-level sysctl registration for xs_tunables_table

There is no need to declare an extra tables to just create directory,
this can be easily be done with a prefix path with register

sunrpc: simplify one-level sysctl registration for xs_tunables_table

There is no need to declare an extra tables to just create directory,
this can be easily be done with a prefix path with register_sysctl().

Simplify this registration.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

show more ...


# 943d045a 20-Mar-2023 Siddharth Kawar <Siddharth.Kawar@microsoft.com>

SUNRPC: fix shutdown of NFS TCP client socket

NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
reconnecting using the same local TCP port. Unique NFS operations are
identified

SUNRPC: fix shutdown of NFS TCP client socket

NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
reconnecting using the same local TCP port. Unique NFS operations are
identified by the per-TCP connection set of XIDs. This prevents file
corruption when non-idempotent NFS operations are retried.

Currently, NFS client TCP connections are using different local TCP ports
when reconnecting to NFS servers.

After an NFS server initiates shutdown of the TCP connection, the NFS
client's TCP socket is set to NULL after the socket state has reached
TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
the same local port but fails with EADDRNOTAVAIL (99). This forces the
socket to use a different local TCP port to reconnect to the remote NFS
server.

State Transition and Events:
TCP_CLOSE_WAIT(8)
TCP_LAST_ACK(9)
connect(fail EADDRNOTAVAIL(99))
TCP_CLOSE(7)
bind on new port
connect success

dmesg excerpts showing reconnect switching from TCP local port of 926 to
763 after commit 7c81e6a9d75b:
[13354.947854] NFS call mkdir testW
...
[13405.654781] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.654813] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
[13405.654826] RPC: xs_data_ready...
[13405.654892] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.654895] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[13405.654899] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.654900] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[13405.654950] RPC: xs_connect scheduled xprt 00000000037d0f03
[13405.654975] RPC: xs_bind 0.0.0.0:926: ok (0)
[13405.654980] RPC: worker connecting xprt 00000000037d0f03 via tcp
to 10.101.6.228 (port 2049)
[13405.654991] RPC: 00000000037d0f03 connect status 99 connected 0
sock state 7
[13405.655001] RPC: xs_tcp_state_change client 00000000037d0f03...
[13405.655002] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[13405.655024] RPC: xs_connect scheduled xprt 00000000037d0f03
[13405.655038] RPC: xs_bind 0.0.0.0:763: ok (0)
[13405.655041] RPC: worker connecting xprt 00000000037d0f03 via tcp
to 10.101.6.228 (port 2049)
[13405.655065] RPC: 00000000037d0f03 connect status 115 connected 0
sock state 2

State Transition and Events with patch applied:
TCP_CLOSE_WAIT(8)
TCP_LAST_ACK(9)
TCP_CLOSE(7)
connect(reuse of port succeeds)

dmesg excerpts showing reconnect on same TCP local port of 936 with patch
applied:
[ 257.139935] NFS: mkdir(0:59/560857152), testQ
[ 257.139937] NFS call mkdir testQ
...
[ 307.822702] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
[ 307.822714] RPC: xs_data_ready...
[ 307.822817] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.822821] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.822825] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.822826] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.823606] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.823609] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.823629] RPC: xs_tcp_state_change client 00000000ce702f14...
[ 307.823632] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 307.823676] RPC: xs_connect scheduled xprt 00000000ce702f14
[ 307.823704] RPC: xs_bind 0.0.0.0:936: ok (0)
[ 307.823709] RPC: worker connecting xprt 00000000ce702f14 via tcp
to 10.101.1.30 (port 2049)
[ 307.823748] RPC: 00000000ce702f14 connect status 115 connected 0
sock state 2
...
[ 314.916193] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
[ 314.916251] RPC: xs_connect scheduled xprt 00000000ce702f14
[ 314.916282] RPC: xs_bind 0.0.0.0:936: ok (0)
[ 314.916292] RPC: worker connecting xprt 00000000ce702f14 via tcp
to 10.101.1.30 (port 2049)
[ 314.916342] RPC: 00000000ce702f14 connect status 115 connected 0
sock state 2

Fixes: 7c81e6a9d75b ("SUNRPC: Tweak TCP socket shutdown in the RPC client")
Signed-off-by: Siddharth Rajendra Kawar <sikawar@microsoft.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

show more ...


# 40e0b090 20-Jan-2023 Peilin Ye <peilin.ye@bytedance.com>

net/sock: Introduce trace_sk_data_ready()

As suggested by Cong, introduce a tracepoint for all ->sk_data_ready()
callback implementations. For example:

<...>
iperf-609 [002] ..... 70.660425: s

net/sock: Introduce trace_sk_data_ready()

As suggested by Cong, introduce a tracepoint for all ->sk_data_ready()
callback implementations. For example:

<...>
iperf-609 [002] ..... 70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable
iperf-609 [002] ..... 70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable
<...>

Suggested-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 98123866 16-Dec-2022 Benjamin Coddington <bcodding@redhat.com>

Treewide: Stop corrupting socket's task_frag

Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
GFP_NOIO flag on sk_allocation which the networking system uses to decide
when

Treewide: Stop corrupting socket's task_frag

Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
GFP_NOIO flag on sk_allocation which the networking system uses to decide
when it is safe to use current->task_frag. The results of this are
unexpected corruption in task_frag when SUNRPC is involved in memory
reclaim.

The corruption can be seen in crashes, but the root cause is often
difficult to ascertain as a crashing machine's stack trace will have no
evidence of being near NFS or SUNRPC code. I believe this problem to
be much more pervasive than reports to the community may indicate.

Fix this by having kernel users of sockets that may corrupt task_frag due
to reclaim set sk_use_task_frag = false. Preemptively correcting this
situation for users that still set sk_allocation allows them to convert to
memalloc_nofs_save/restore without the same unexpected corruptions that are
sure to follow, unlikely to show up in testing, and difficult to bisect.

CC: Philipp Reisner <philipp.reisner@linbit.com>
CC: Lars Ellenberg <lars.ellenberg@linbit.com>
CC: "Christoph Böhmwalder" <christoph.boehmwalder@linbit.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Josef Bacik <josef@toxicpanda.com>
CC: Keith Busch <kbusch@kernel.org>
CC: Christoph Hellwig <hch@lst.de>
CC: Sagi Grimberg <sagi@grimberg.me>
CC: Lee Duncan <lduncan@suse.com>
CC: Chris Leech <cleech@redhat.com>
CC: Mike Christie <michael.christie@oracle.com>
CC: "James E.J. Bottomley" <jejb@linux.ibm.com>
CC: "Martin K. Petersen" <martin.petersen@oracle.com>
CC: Valentina Manea <valentina.manea.m@gmail.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: David Howells <dhowells@redhat.com>
CC: Marc Dionne <marc.dionne@auristor.com>
CC: Steve French <sfrench@samba.org>
CC: Christine Caulfield <ccaulfie@redhat.com>
CC: David Teigland <teigland@redhat.com>
CC: Mark Fasheh <mark@fasheh.com>
CC: Joel Becker <jlbec@evilplan.org>
CC: Joseph Qi <joseph.qi@linux.alibaba.com>
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: Dominique Martinet <asmadeus@codewreck.org>
CC: Ilya Dryomov <idryomov@gmail.com>
CC: Xiubo Li <xiubli@redhat.com>
CC: Chuck Lever <chuck.lever@oracle.com>
CC: Jeff Layton <jlayton@kernel.org>
CC: Trond Myklebust <trond.myklebust@hammerspace.com>
CC: Anna Schumaker <anna@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# de4eda9d 16-Sep-2022 Al Viro <viro@zeniv.linux.org.uk>

use less confusing names for iov_iter direction initializers

READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with wri

use less confusing names for iov_iter direction initializers

READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

show more ...


# 8032bf12 10-Oct-2022 Jason A. Donenfeld <Jason@zx2c4.com>

treewide: use get_random_u32_below() instead of deprecated function

This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
(E)

Reviewed-

treewide: use get_random_u32_below() instead of deprecated function

This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
(E)

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

show more ...


# 81895a65 05-Oct-2022 Jason A. Donenfeld <Jason@zx2c4.com>

treewide: use prandom_u32_max() when possible, part 1

Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
t

treewide: use prandom_u32_max() when possible, part 1

Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

- RAND = get_random_u32();
... when != RAND
- RAND %= (E);
+ RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

((T)get_random_u32()@p & (LITERAL))

// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
value = int(literal, 16)
elif literal[0] in '123456789':
value = int(literal, 10)
if value is None:
print("I don't know how to handle %s" % (literal))
cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
print("Skipping 0x%x for cleanup elsewhere" % (value))
cocci.include_match(False)
elif value & (value + 1) != 0:
print("Skipping 0x%x because it's not a power of two minus one" % (value))
cocci.include_match(False)
elif literal.startswith('0x'):
coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

- (FUNC()@p & (LITERAL))
+ prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

{
- T VAR;
- VAR = (E);
- return VAR;
+ return E;
}

@drop_var@
type T;
identifier VAR;
@@

{
- T VAR;
... when != VAR
}

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

show more ...


# 39494194 05-Oct-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Fix races with rpc_killall_tasks()

Ensure that we immediately call rpc_exit_task() after waking up, and
that the tk_rpc_status cannot get clobbered by some other function.

Signed-off-by: Tr

SUNRPC: Fix races with rpc_killall_tasks()

Ensure that we immediately call rpc_exit_task() after waking up, and
that the tk_rpc_status cannot get clobbered by some other function.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

show more ...


12345678910>>...21