History log of /freebsd/sys/sys/sockbuf.h (Results 1 – 25 of 390)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: vendor/llvm-project/llvmorg-18.1.5-0-g617a15a9eac9, vendor/NetBSD/bmake/20240430, vendor/libcbor/0.11.0, vendor/llvm-project/llvmorg-18.1.4-0-ge6c3289804a6, vendor/device-tree/6.8, vendor/device-tree/6.7
# 5716d902 09-Apr-2024 Gleb Smirnoff <glebius@FreeBSD.org>

Revert "unix: new implementation of unix/stream & unix/seqpacket"

The regressions in aio(4) and kernel RPC aren't a 5 minute problem.

This reverts commit d80a97def9a1db6f07f5d2e68f7ad62b27918947.
T

Revert "unix: new implementation of unix/stream & unix/seqpacket"

The regressions in aio(4) and kernel RPC aren't a 5 minute problem.

This reverts commit d80a97def9a1db6f07f5d2e68f7ad62b27918947.
This reverts commit d1cbb17a873c787a527316bbb27551e97d5ad30c.
This reverts commit fb8a8333b481cc4256d0b3f0b5b4feaa4594e01f.

show more ...


# d80a97de 08-Apr-2024 Gleb Smirnoff <glebius@FreeBSD.org>

unix: new implementation of unix/stream & unix/seqpacket

Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX
SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension
of

unix: new implementation of unix/stream & unix/seqpacket

Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX
SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension
of SOCK_STREAM. The change meets three goals: get rid of unix(4) specific
stuff in the generic socket code, provide a faster and robust unix/stream
sockets and bring unix/seqpacket much closer to specification. Highlights
follow:

- The send buffer now is truly bypassed. Previously it was always empty,
but the send(2) still needed to acquire its lock and do a variety of
tricks to be woken up in the right time while sleeping on it. Now the
only two things we care about in the send buffer is the I/O sx(9) lock
that serializes operations and value of so_snd.sb_hiwat, which we can read
without obtaining a lock. The sleep of a send(2) happens on the mutex of
the receive buffer of the peer. A bulk send/recv of data with large
socket buffers will make both syscalls just bounce between owning the
receive buffer lock and copyin(9)/copyout(9), no other locks would be
involved.

- The implementation uses new mchain structure to manipulate mbuf chains.
Note that this required converting to mchain two functions that are shared
with unix/dgram: unp_internalize() and unp_addsockcred() as well as adding
a new shared one uipc_process_kernel_mbuf(). This induces some non-
functional changes in the unix/dgram code as well. There is a space for
improvement here, as right now it is a mix of mchain and manually managed
mbuf chains.

- unix/seqpacket previously marked as PR_ADDR & PR_ATOMIC and thus treated
as a datagram socket by the generic socket code, now becomes a true stream
socket with record markers.

- unix/stream loses the sendfile(2) support. This can be brought back,
but requires some work. Let's first see if there is any interest in this
feature, except purely academical.

Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44151

show more ...


Revision tags: vendor/llvm-project/llvmorg-18.1.3-0-gc13b7485b879, vendor/device-tree/6.5, vendor/openssh/9.7p1, vendor/unbound/1.19.3, vendor/NetBSD/bmake/20240309, vendor/sqlite3/sqlite-3450100, vendor/llvm-project/llvmorg-18.1.1-0-gdba2a75e9c7e, vendor/got/diff/2023-09-15, release/13.3.0, vendor/libucl/20240206, vendor/xz/5.6.0, vendor/llvm-project/llvmorg-18.1.0-rc3-0-g6c90f8dd5463, vendor/llvm-project/llvmorg-18.1.0-rc2-53-gc7b0a6ecd442, vendor/arm-optimized-routines/v24.01, vendor/zlib/1.3.1, vendor/expat/2.6.0, vendor/unbound/1.19.1, vendor/tzcode/tzcode2024a, vendor/llvm-project/llvmorg-18.1.0-rc2-0-gc6c86965d967, vendor/tzdata/tzdata2024a, vendor/sendmail/8.18.1, vendor/acpica/20230628, vendor/acpica/20230331, vendor/llvm-project/llvmorg-18-init-18361-g22683463740e, vendor/libcxxrt/2024-01-25-fd484be8d1e94a1fcf6bc5c67e5c07b65ada19b6, vendor/llvm-project/llvmorg-18-init-18359-g93248729cfae, vendor/sqlite3/sqlite-3450000, vendor/NetBSD/bmake/20240108, vendor/llvm-project/llvmorg-18-init-16864-g3b3ee1f53424, vendor/llvm-project/llvmorg-18-init-16595-g7c00a5be5cde, vendor/llvm-project/llvmorg-18-init-16003-gfc5f51cf5af4
# 660bd40a 02-Jan-2024 Gleb Smirnoff <glebius@FreeBSD.org>

netlink: use domain specific send buffer

Instead of using generic socket code, create Netlink specific socket
buffer. It is a simple TAILQ of writes that came from userland. This
saves us one memo

netlink: use domain specific send buffer

Instead of using generic socket code, create Netlink specific socket
buffer. It is a simple TAILQ of writes that came from userland. This
saves us one memory allocation that could fail and one memory copy.

Reviewed by: melifaro
Differential Revision: https://reviews.freebsd.org/D42522

show more ...


Revision tags: vendor/bc/6.7.4, vendor/ena-com/2.7.0, vendor/llvm-project/llvmorg-18-init-15692-g007ed0dccd6a, vendor/tzdata/tzdata2023d, vendor/openssh/9.6p1, vendor/llvm-project/llvmorg-18-init-15088-gd14ee76181fb, vendor/llvm-project/llvmorg-18-init-14265-ga17671084db1, vendor/llvm-project/llvmorg-17.0.6-0-g6009708b4367, vendor/xz/5.4.5
# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl s

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix

show more ...


Revision tags: vendor/llvm-project/llvmorg-17.0.5-0-g98bfdac5ce82, vendor/unbound/1.19.0, vendor/sqlite3/sqlite-3440000, release/14.0.0, vendor/bc/6.7.2, vendor/llvm-project/llvmorg-17.0.3-0-g888437e1b600, vendor/bsddialog/1.0, vendor/llvm-project/llvmorg-17.0.2-0-gb2417f51dbbd, vendor/openssh/9.5p1, vendor/llvm-project/llvmorg-17.0.1-25-g098e653a5bed, vendor/nvi/2.2.1, vendor/openssl/3.0.11, vendor/sqlite3/sqlite-3430100, vendor/unbound/1.18.0, vendor/NetBSD/bmake/20230909, vendor/openssl/1.1.1w, vendor/llvm-project/llvmorg-17.0.0-rc4-10-g0176e8729ea4, vendor/file/5.45, vendor/llvm-project/llvmorg-17.0.0-rc3-79-ga612cb0b81d8, vendor/krb5/1.21.2, vendor/unifdef/2.12, vendor/unifdef/2.11, 2023.08.19-b34f66deb02e188104, vendor/zlib/1.3
# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


Revision tags: vendor/less/v643, vendor/NetBSD/libc-vis/20230813, vendor/openssh/9.4p1, vendor/device-tree/6.4, vendor/device-tree/6.3, vendor/device-tree/6.2, vendor/device-tree/6.1, vendor/krb5/1.21.1, vendor/xz/5.4.4, vendor/openssl/3.0.10, vendor/openssl/1.1.1v, vendor/llvm-project/llvmorg-17-init-19311-gbc849e525f80, vendor/llvm-project/llvmorg-17-init-19304-gd0b54bb50e51, vendor/openssh/9.3p2, vendor/lua/5.4.6, vendor/NetBSD/bmake/20230622, vendor/openpam/XIMENIA, vendor/heimdal/7.8.0-2023-06-10-f62e2f278, vendor/openssl/3.0.9, vendor/llvm-project/llvmorg-16.0.6-0-g7cbf1a259152, vendor/ntp/4.2.8p17, vendor/llvm-project/llvmorg-16.0.5-0-g185b81e034ba, vendor/spleen/2.0.0, vendor/ntp/4.2.8p16, vendor/openssl/1.1.1u, vendor/sqlite3/sqlite-3420000, vendor/bc/6.6.0, vendor/llvm-project/llvmorg-16.0.4-0-gae42196bc493, vendor/NetBSD/bmake/20230510, vendor/xz/5.4.3, vendor/tcpdump/4.99.4, vendor/llvm-project/llvmorg-16.0.3-0-gda3cd333bea5, vendor/ldns/1.8.3, vendor/spleen/1.9.3, vendor/libpcap/1.10.4, vendor/spleen/1.6.0, vendor/less/v632, vendor/bc/6.5.0, vendor/libfido2/1.13.0, vendor/libfido2/1.12.0, vendor/libfido2/1.11.0, vendor/libfido2/1.10.0, vendor/libfido2/1.9.0, vendor/NetBSD/bmake/20230414, vendor/llvm-project/llvmorg-16.0.2-0-g18ddebe1a1a9, vendor/libcbor/0.10.2, vendor/tzcode/tzcode2023c, vendor/tzcode/tzcode2023b, vendor/tzcode/tzcode2023a, vendor/sqlite3/sqlite-3410200, vendor/llvm-project/llvmorg-16.0.1-0-gcd89023f7979, release/13.2.0, vendor/llvm-project/llvmorg-16.0.0-45-g42d1b276f779, vendor/llvm-project/llvmorg-16.0.0-0-g08d094a0e457, vendor/tzdata/tzdata2023c, vendor/libpcap/1.10.3, vendor/opencsd/v1.4.0, vendor/arm-optimized-routines/v23.01, vendor/tzdata/tzdata2023b, vendor/tzdata/tzdata2023a, vendor/xz/5.4.2, vendor/openssh/9.3p1, vendor/openssl/3.0.8, vendor/bc/6.4.0, vendor/sqlite3/sqlite-3410000, vendor/bc/6.3.1, vendor/bearssl/20230220, vendor/zlib/1.2.13, vendor/llvm-project/llvmorg-16.0.0-rc2-10-g073506d8c15c, vendor/llvm-project/llvmorg-16-init-18548-gb0daacf58f41, vendor/NetBSD/bmake/20230208
# c0e4090e 08-Feb-2023 Andrew Gallatin <gallatin@FreeBSD.org>

ktls: Accurately track if ifnet ktls is enabled

This allows us to avoid spurious calls to ktls_disable_ifnet()

When we implemented ifnet kTLSe, we set a flag in the tx socket
buffer (SB_TLS_IFNET)

ktls: Accurately track if ifnet ktls is enabled

This allows us to avoid spurious calls to ktls_disable_ifnet()

When we implemented ifnet kTLSe, we set a flag in the tx socket
buffer (SB_TLS_IFNET) to indicate ifnet kTLS. This flag meant that
now, or in the past, ifnet ktls was active on a socket. Later,
I added code to switch ifnet ktls sessions to software in the case
of lossy TCP connections that have a high retransmit rate.
Because TCP was using SB_TLS_IFNET to know if it needed to do math
to calculate the retransmit ratio and potentially call into
ktls_disable_ifnet(), it was doing unneeded work long after
a session was moved to software.

This patch carefully tracks whether or not ifnet ktls is still enabled
on a TCP connection. Because the inp is now embedded in the tcpcb, and
because TCP is the most frequent accessor of this state, it made sense to
move this from the socket buffer flags to the tcpcb. Because we now need
reliable access to the tcbcb, we take a ref on the inp when creating a tx
ktls session.

While here, I noticed that rack/bbr were incorrectly implementing
tfb_hwtls_change(), and applying the change to all pending sends,
when it should apply only to future sends.

This change reduces spurious calls to ktls_disable_ifnet() by 95% or so
in a Netflix CDN environment.

Reviewed by: markj, rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D38380

show more ...


Revision tags: vendor/byacc/20230201, vendor/openssl/1.1.1t, vendor/NetBSD/libedit/2023-01-06, vendor/openssh/9.2p1, vendor/tcsh/6.24.07, vendor/bc/6.2.2, vendor/bc/6.2.1, vendor/bc/6.2.0, vendor/bc/6.1.0, vendor/bc/6.0.4, vendor/NetBSD/bmake/20230126, vendor/Juniper/libxo/1.6.0, vendor/zstd/1.5.2, vendor/xz/5.4.1, vendor/sendmail/8.17.1, vendor/llvm-project/llvmorg-15.0.7-0-g8dfdcc7b7bf6, vendor/heimdal/7.8.0, vendor/sqlite3/sqlite-3400100, vendor/xz/5.4.0, vendor/tzcode/tzcode2022g, vendor/tzcode/tzcode2022f, vendor/tzcode/tzcode2022e, vendor/tzcode/tzcode2022d, vendor/xz/5.2.9, vendor/llvm-project/llvmorg-15.0.6-0-g088f33605d8a, vendor/tzdata/tzdata2022g, release/12.4.0, vendor/sqlite3/sqlite-3400000, vendor/expat/2.5.0, vendor/xz/5.2.8, vendor/device-tree/6.0, vendor/device-tree/5.19, vendor/openssl/1.1.1s, vendor/wireguard-tools/v1.0.20210914, vendor/tzdata/tzdata2022f, vendor/acpica/20221020, vendor/unbound/1.17.0, vendor/llvm-project/llvmorg-15.0.2-10-gf3c5289e7846, vendor/llvm-project/llvmorg-15.0.2-0-g4bd3f3759259, vendor/llvm-project/llvmorg-15.0.1-0-gb73d2c8c720a, vendor/tzdata/tzdata2022e, vendor/openssh/9.1p1, vendor/unbound/1.16.3
# 7b660faa 27-Sep-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

sockbufs: add sbreserve_locked_limit() with custom maxsockbuf limit.

Protocols such as netlink may need a large socket receive buffer,
measured in tens of megabytes. This change allows netlink to

sockbufs: add sbreserve_locked_limit() with custom maxsockbuf limit.

Protocols such as netlink may need a large socket receive buffer,
measured in tens of megabytes. This change allows netlink to
set larger socket buffers (given the privs are in place), without
requiring user to manuall bump maxsockbuf.

Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D36747

show more ...


# f6696856 27-Sep-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

protocols: make socket buffers ioctl handler changeable

Allow to set custom per-protocol handlers for the socket buffers
ioctls by introducing pr_setsbopt callback with the default value
set to th

protocols: make socket buffers ioctl handler changeable

Allow to set custom per-protocol handlers for the socket buffers
ioctls by introducing pr_setsbopt callback with the default value
set to the currently-used sbsetopt().

Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D36746

show more ...


Revision tags: vendor/bsddialog/0.4, vendor/tzdata/tzdata2022d, vendor/file/5.43, vendor/expat/2.4.9, vendor/sqlite3/sqlite-3390300, vendor/llvm-project/llvmorg-15.0.0-9-g1c73596d3454, vendor/llvm-project/llvmorg-15.0.0-0-g4ba6a9c9f65b, vendor/less/v608, vendor/bsddialog/0.3, vendor/lua/5.4.4, vendor/lua/5.4.3, vendor/sqlite3/sqlite-3390200, vendor/bc/6.0.2, verndor/bc/6.0.2, vendor/dhcpcd/9.4.1, vendor/tzcode/tzcode2022c, vendor/tzcode/unsplit, vendor/tzdata/tzdata2022c, vendor/llvm-project/llvmorg-15.0.0-rc2-40-gfbd2950d8d0d, vendor/tzdata/tzdata2022b, vendor/arm-optimized-routines/20220210-89ca9c3, vendor/device-tree/5.18, vendor/device-tree/5.17, vendor/device-tree/5.16, vendor/device-tree/5.15, vendor/device-tree/5.14, vendor/unbound/1.16.2, vendor/llvm-project/llvmorg-15-init-17826-g1f8ae9d7e7e4, vendor/llvm-project/llvmorg-15-init-17827-gd77882e66779, vendor/NetBSD/bmake/20220726, vendor/NetBSD/bmake/20220724, vendor/llvm-project/llvmorg-15-init-17485-ga3e38b4a206b, vendor/llvm-project/llvmorg-15-init-16436-g18a6ab5b8d1f, vendor/unbound/1.16.1, vendor/sqlite3/sqlite-3390000, vendor/openssl/1.1.1q, vendor/file/5.42, vendor/llvm-project/llvmorg-15-init-15358-g53dc0f107877
# 458f475d 24-Jun-2022 Gleb Smirnoff <glebius@FreeBSD.org>

unix/dgram: smart socket buffers for one-to-many sockets

A one-to-many unix/dgram socket is a socket that has been bound
with bind(2) and can get multiple connections. A typical example
is /var/run

unix/dgram: smart socket buffers for one-to-many sockets

A one-to-many unix/dgram socket is a socket that has been bound
with bind(2) and can get multiple connections. A typical example
is /var/run/log bound by syslogd(8) and receiving multiple
connections from libc syslog(3) API. Until now all of these
connections shared the same receive socket buffer of the bound
socket. This made the socket vulnerable to overflow attack.
See 240d5a9b1ce for a historical attempt to workaround the problem.

This commit creates a per-connection socket buffer for every single
connected socket and eliminates the problem. The new behavior will
optimize seldom writers over frequent writers. See added test case
scenarios and code comments for more detailed description of the
new behavior.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35303

show more ...


# a7444f80 24-Jun-2022 Gleb Smirnoff <glebius@FreeBSD.org>

unix/dgram: use minimal possible socket buffer for PF_UNIX/SOCK_DGRAM

This change fully splits away PF_UNIX/SOCK_DGRAM from other socket
buffer implementations, without any behavior changes.

Generi

unix/dgram: use minimal possible socket buffer for PF_UNIX/SOCK_DGRAM

This change fully splits away PF_UNIX/SOCK_DGRAM from other socket
buffer implementations, without any behavior changes.

Generic socket implementation is reduced down to one STAILQ and very
little code.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35300

show more ...


# a4fc4142 24-Jun-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: enable protocol specific socket buffers

Split struct sockbuf into common shared fields and protocol specific
union, where protocols are free to implement whatever buffer they
want. Such pr

sockets: enable protocol specific socket buffers

Split struct sockbuf into common shared fields and protocol specific
union, where protocols are free to implement whatever buffer they
want. Such protocols should mark themselves with PR_SOCKBUF and are
expected to initialize their buffers in their pr_attach and tear
them down in pr_detach.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35299

show more ...


Revision tags: vendor/openssl/1.1.1p, vendor/bc/5.3.3, vendor/bc/5.3.2, vendor/llvm-project/llvmorg-14.0.5-0-gc12386ae247c, vendor/bc/5.3.1, vendor/bc/5.3.0, vendor/unbound/1.16.0, vendor/llvm-project/llvmorg-14.0.4-0-g29f1039a7285, vendor/sqlite3/sqlite-3380500, release/13.1.0, upstream/13.1.0, vendor/bc/5.2.5, vendor/openssl/1.1.1o, vendor/llvm-project/llvmorg-14.0.2-0-g0e27d08cdeb3, vendor/llvm-project/llvmorg-14.0.3-0-g1f9140064dfb
# fe8c78f0 23-Apr-2022 Hans Petter Selasky <hselasky@FreeBSD.org>

ktls: Add full support for TLS RX offloading via network interface.

Basic TLS RX offloading uses the "csum_flags" field in the mbuf packet
header to figure out if an incoming mbuf has been fully off

ktls: Add full support for TLS RX offloading via network interface.

Basic TLS RX offloading uses the "csum_flags" field in the mbuf packet
header to figure out if an incoming mbuf has been fully offloaded or
not. This information follows the packet stream via the LRO engine, IP
stack and finally to the TCP stack. The TCP stack preserves the mbuf
packet header also when re-assembling packets after packet loss. When
the mbuf goes into the socket buffer the packet header is demoted and
the offload information is transferred to "m_flags" . Later on a
worker thread will analyze the mbuf flags and decide if the mbufs
making up a TLS record indicate a fully-, partially- or not decrypted
TLS record. Based on these three cases the worker thread will either
pass the packet on as-is or recrypt the decrypted bits, if any, or
decrypt the packet as usual.

During packet loss the kernel TLS code will call back into the network
driver using the send tag, informing about the TCP starting sequence
number of every TLS record that is not fully decrypted by the network
interface. The network interface then stores this information in a
compressed table and starts asking the hardware if it has found a
valid TLS header in the TCP data payload. If the hardware has found a
valid TLS header and the referred TLS header is at a valid TCP
sequence number according to the TCP sequence numbers provided by the
kernel TLS code, the network driver then informs the hardware that it
can resume decryption.

Care has been taken to not merge encrypted and decrypted mbuf chains,
in the LRO engine and when appending mbufs to the socket buffer.

The mbuf's leaf network interface pointer is used to figure out from
which network interface the offloading rule should be allocated. Also
this pointer is used to track route changes.

Currently mbuf send tags are used in both transmit and receive
direction, due to convenience, but may get a new name in the future to
better reflect their usage.

Reviewed by: jhb@ and gallatin@
Differential revision: https://reviews.freebsd.org/D32356
Sponsored by: NVIDIA Networking

show more ...


# d59bc188 27-May-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockbuf: remove unused mbuf counter and cluster counter

With M_EXTPG mbufs these two counters already do not represent the
reality. As we are moving towards protocol independent socket buffers,
whi

sockbuf: remove unused mbuf counter and cluster counter

With M_EXTPG mbufs these two counters already do not represent the
reality. As we are moving towards protocol independent socket buffers,
which may not even use mbufs at all, the counters become less and less
relevant. The only userland seeing them was 'netstat -x'.

PR: 264181 (exp-run)
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35334

show more ...


# 6890b588 17-May-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockbuf: improve sbcreatecontrol()

o Constify memory pointer. Make length unsigned.
o Make it never fail with M_WAITOK and assert that length is sane.


# b46667c6 17-May-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockbuf: merge two versions of sbcreatecontrol() into one

No functional change.


# 43283184 12-May-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: use socket buffer mutexes in struct socket directly

Since c67f3b8b78e the sockbuf mutexes belong to the containing socket,
and socket buffers just point to it. In 74a68313b50 macros that a

sockets: use socket buffer mutexes in struct socket directly

Since c67f3b8b78e the sockbuf mutexes belong to the containing socket,
and socket buffers just point to it. In 74a68313b50 macros that access
this mutex directly were added. Go over the core socket code and
eliminate code that reaches the mutex by dereferencing the sockbuf
compatibility pointer.

This change requires a KPI change, as some functions were given the
sockbuf pointer only without any hint if it is a receive or send buffer.

This change doesn't cover the whole kernel, many protocols still use
compatibility pointers internally. However, it allows operation of a
protocol that doesn't use them.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35152

show more ...


# 7db54446 09-May-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockbufs: make sbrelease_internal() private


# a982ce04 09-May-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: remove the socket-on-stack hack from sorflush()

The hack can be tracked down to 4.4BSD, where copy was performed
under splimp() and then after splx() dom_dispose was called.
Stevens has a c

sockets: remove the socket-on-stack hack from sorflush()

The hack can be tracked down to 4.4BSD, where copy was performed
under splimp() and then after splx() dom_dispose was called.
Stevens has a chapter on this function, but he doesn't answer why
this trick is necessary. Why can't we call into dom_dispose under
splimp()? Anyway, with multithreaded kernel the hack seems to be
necessary to avoid LORs between socket buffer lock and different
filesystem locks, especially network file systems.

The new socket buffers KPI sbcut() from 1d2df300e9b allow us to get
rid of the hack.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35125

show more ...


Revision tags: vendor/NetBSD/bmake/20220418, vendor/bearssl/20220418, vendor/bc/5.2.4, vendor/NetBSD/libedit/2022-04-11, vendor/openssh/9.0p1, vendor/NetBSD/bmake/20220330, vendor/acpica/20220331, vendor/zlib/1.2.12, vendor/llvm-project/llvmorg-14.0.0-2-g3f43d803382d, vendor/heimdal/7.7.0, vendor/expat/2.4.7, vendor/llvm-project/llvmorg-14.0.0-rc4-2-gadd3ab7f4c8a, vendor/tzdata/tzdata2022a, vendor/openssl/1.1.1n, vendor/bsddialog/0.2, vendor/libcxxrt/2022-03-09-fd484be8d1e94a1fcf6bc5c67e5c07b65ada19b6, vendor/bc/5.2.3, vendor/llvm-project/llvmorg-14.0.0-rc2-12-g09546e1b5103, vendor/expat/2.4.6, vendor/openssh/8.9p1, vendor/llvm-project/llvmorg-13.0.1-0-g75e33f71c2da, vendor/llvm-project/llvmorg-14.0.0-rc1-74-g4dc3cb8e3255, vendor/unbound/1.15.0, vendor/NetBSD/bmake/20220208, vendor/bc/5.2.2, vendor/NetBSD/bmake/20220204, vendor/llvm-project/llvmorg-14-init-18315-g190be5457c90, vendor/llvm-project/llvmorg-14-init-18294-gdb01b123d012, vendor/terminus/terminus-font-4.49.1, vendor/bsddialog/0.1, vendor/llvm-project/llvmorg-14-init-17616-g024a1fab5c35, vendor/dma/2022-01-27, vendor/ena-com/2.5.0, vendor/wpa/2.10, vendor/expat/2.4.3, vendor/sqlite3/sqlite-3370200, vendor/wpa/gb26f5c0fe, vendor/sqlite3/sqlite-3370100, vendor/file/5.41, vendor/llvm-project/llvmorg-14-init-13186-g0c553cc1af2e, vendor/bsddialog/0.0.2, vendor/NetBSD/bmake/20211212, vendor/openssl/1.1.1m, vendor/unbound/1.14.0, vendor/bsddialog/0.0.1, vendor/unbound/1.14.0rc1, vendor/llvm-project/llvmorg-14-init-11187-g222442ec2d71, release/12.3.0, upstream/12.3.0, vendor/wpa/g14ab4a816, vendor/bc/5.2.1, vendor/bc/5.2.0, vendor/bsddialog/2021-11-24, vendor/llvm-project/llvmorg-14-init-10223-g401b76fdf2b3, vendor/llvm-project/llvmorg-14-init-10186-gff7f2cfa959b, vendor/mandoc/1.14.6, vendor/openssh/8.8p1, vendor/ck/2021029, vendor/tzdata/tzdata2021e, vendor/tzdata/tzdata2021d, vendor/bc/5.1.1, vendor/bc/5.1.0, vendor/tzdata/tzdata2021c, vendor/libfido2/1.8.0, vendor/libcbor/0.8.0, vendor/acpica/20210930, vendor/llvm-project/llvmorg-13.0.0-0-gd7b669b3a303, vendor/llvm-project/llvmorg-13.0.0-rc4-0-gd7b669b3a303, vendor/tzdata/tzdata2021b, vendor/dma/2021-07-10, vendor/NetBSD/libedit/2021-09-10, vendor/bc/5.0.2, vendor/llvm-project/llvmorg-13.0.0-rc3-8-g08642a395f23
# ade1daa5 17-Sep-2021 Mark Johnston <markj@FreeBSD.org>

socket: Synchronize soshutdown() with listen(2) and AIO

To handle shutdown(SHUT_RD) we flush the receive buffer of the socket.
This may involve searching for control messages of type SCM_RIGHTS,
sin

socket: Synchronize soshutdown() with listen(2) and AIO

To handle shutdown(SHUT_RD) we flush the receive buffer of the socket.
This may involve searching for control messages of type SCM_RIGHTS,
since we need to close the file references. Closing arbitrary files
with socket buffer locks held is undesirable, mainly due to lock
ordering issues, so we instead make a copy of the socket buffer and
operate on that without any locks. Fields in the original buffer are
cleared.

This behaviour clobbered the AIO job queue associated with a receive
buffer. It could also cause us to leak a KTLS session reference.
Reorder socket buffer fields to address this.

An alternate solution would be to remove the hack in sorflush(), but
this is not quite feasible (yet). In particular, though sorflush()
flags the sockbuf with SBS_CANTRCVMORE, it is possible for more data to
be queued - the flag just prevents userspace from reading more data. I
suspect we should fix this; SBS_CANTRCVMORE represents a terminal state
and protocols can likely just drop any data destined for such a buffer.
Many of them already do, but in some cases the check is racy, and some
KPI churn will be needed to fix everything. This approach is more
straightforward for now.

Reported by: syzbot+104d8ee3430361cb2795@syzkaller.appspotmail.com
Reported by: syzbot+5bd2e7d05f84a59d0d1b@syzkaller.appspotmail.com
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31976

show more ...


# 74a68313 10-Sep-2021 Mark Johnston <markj@FreeBSD.org>

socket: Add macros to lock socket buffers using socket references

Since commit c67f3b8b78e50c6df7c057d6cf108e4d6b4312d0 the sockbuf
mutexes belong to the containing socket. Sockbufs contain a point

socket: Add macros to lock socket buffers using socket references

Since commit c67f3b8b78e50c6df7c057d6cf108e4d6b4312d0 the sockbuf
mutexes belong to the containing socket. Sockbufs contain a pointer to
a mutex, which by default is initialized to the corresponding mutexes in
the socket. The SOCKBUF_LOCK() etc. macros operate on this pointer.
However, the pointer is clobbered by listen(2) so it's not safe to use
them unless one is sure that the socket is not a listening socket.

This change introduces a new set of macros which lock socket buffers
through the socket. This is a bit cheaper since it removes the pointer
indirection, and allows one to safely lock socket buffers and then check
for a listening socket.

For MFC, these macros should be reimplemented in terms of the existing
socket buffer layout.

Reviewed by: tuexen, gallatin, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31900

show more ...


Revision tags: vendor/llvm-project/llvmorg-13.0.0-rc2-43-gf56129fe78d5
# c67f3b8b 07-Sep-2021 Mark Johnston <markj@FreeBSD.org>

socket: Move sockbuf mutexes into the owning socket

This is necessary to provide proper interlocking with listen(2), which
destroys the socket buffers. Otherwise, code must lock the socket
itself a

socket: Move sockbuf mutexes into the owning socket

This is necessary to provide proper interlocking with listen(2), which
destroys the socket buffers. Otherwise, code must lock the socket
itself and check SOLISTENING(so), but most I/O paths do not otherwise
need to acquire the socket lock, so the extra overhead needed to check a
rare error case is undesirable.

listen(2) calls are relatively rare. Thus, the strategy is to have it
acquire all socket buffer locks when transitioning to a listening
socket. To do this safely, these locks must be stable, and not
destroyed during listen(2) as they are today. So, move them out of the
sockbuf and into the owning socket. For the sockbuf mutexes, keep a
pointer to the mutex in the sockbuf itself, for now. This can be
removed by replacing SOCKBUF_LOCK() etc. with macros which operate on
the socket itself, as was done for the sockbuf I/O locks.

Reviewed by: tuexen, gallatin
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31658

show more ...


# f94acf52 07-Sep-2021 Mark Johnston <markj@FreeBSD.org>

socket: Rename sb(un)lock() and interlock with listen(2)

In preparation for moving sockbuf locks into the containing socket,
provide alternative macros for the sockbuf I/O locks:
SOCK_IO_SEND_(UN)LO

socket: Rename sb(un)lock() and interlock with listen(2)

In preparation for moving sockbuf locks into the containing socket,
provide alternative macros for the sockbuf I/O locks:
SOCK_IO_SEND_(UN)LOCK() and SOCK_IO_RECV_(UN)LOCK(). These operate on a
socket rather than a socket buffer. Note that these locks are used only
to prevent concurrent readers and writters from interleaving I/O.

When locking for I/O, return an error if the socket is a listening
socket. Currently the check is racy since the sockbuf sx locks are
destroyed during the transition to a listening socket, but that will no
longer be true after some follow-up changes.

Modify a few places to check for errors from
sblock()/SOCK_IO_(SEND|RECV)_LOCK() where they were not before. In
particular, add checks to sendfile() and sorflush().

Reviewed by: tuexen, gallatin
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31657

show more ...


# 97cf43eb 07-Sep-2021 Mark Johnston <markj@FreeBSD.org>

socket: Reorder socket and sockbuf fields to eliminate some padding

This is in preparation for moving sockbuf locks into the owning socket,
in order to provide proper interlocking for listen(2). In

socket: Reorder socket and sockbuf fields to eliminate some padding

This is in preparation for moving sockbuf locks into the owning socket,
in order to provide proper interlocking for listen(2). In particular,
listening sockets do not use the socket buffers and repurpose that space
in struct socket for their own purposes. Moving the locks out of the
socket buffers and into the socket proper makes it possible to safely
lock socket buffers and test for a listening socket before deciding how
to proceed.

Reordering these fields saves some space and helps ensure that UMA will
provide the same space efficiency for sockets as before. No functional
change intended.

Reviewed by: tuexen, gallatin
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31656

show more ...


Revision tags: vendor/openssl/1.1.1l, vendor/openssh/8.7p1, vendor/llvm-project/llvmorg-13.0.0-rc1-97-g23ba3732246a, vendor/llvm-project/llvmorg-13.0.0-rc1-0-gd6974c010878, vendor/unbound/1.13.2, vendor/one-true-awk/0592de4a, vendor/acpica/20210730, vendor/llvm-project/llvmorg-13-init-16854-g6b2e4c5a58d7, vendor/llvm-project/llvmorg-12.0.1-0-gfed41342a82f, vendor/llvm-project/llvmorg-12.0.1-rc2-0-ge7dac564cd0e, vendor/llvm-project/llvmorg-13-init-16847-g88e66fa60ae5, vendor/less/v590, llvmorg-12.0.1-0-gfed41342a82f, vendor/one-true-awk/1e4bc42c53a1, vendor/device-tree/5.13, vendor/device-tree/5.12, vendor/NetBSD/bmake/20210621, vendor/ena-com/2.4.0, vendor/NetBSD/vis/20210621, llvmorg-12.0.1-rc2-0-ge7dac564cd0e, vendor/acpica/20210604, vendor/nvi/2.2.0-3bbdfe4, vendor/tcsh/6.22.04, vendor/bc/4.0.2, vendor/sqlite3/sqlite-3350500, vendor/less/v581.2, vendor/bc/4.0.1, vendor/openssh/8.6p1, vendor/openssh/8.5p1, vendor/llvm-project/llvmorg-12.0.0-0-gd28af7c654d8, vendor/less/v581, vendor/google/capsicum-test/ea66424d921bb393539b298c108a46edee5c3051, release/13.0.0, upstream/13.0.0, vendor/bc/4.0.0, vendor/acpica/20210331, vendor/NetBSD/libedit/2021-03-28, vendor/openssl/1.1.1k, vendor/device-tree/5.11, vendor/NetBSD/libedit/2020-07-10, vendor/libucl/20210314, vendor/bc/3.3.4, vendor/wpa/g9d9b42306541, vendor/tcsh/6.22.03-ceccc7f, bc/3.3.3, vendor/google/capsicum-test/20210302, vendor/dialog/1.3-20210117, vendor/ncurses/6.2-20210220, vendor/arm-optimized-routines/v21.02, vendor/libcxxrt/2021-02-18-8049924686b8414d8e652cbd2a52c763b48e8456, vendor/bc/bc-3.3.0, vendor/llvm-project/llvmorg-12.0.0-rc1-109-gd5d089bf08c9, vendor/llvm-project/llvmorg-12-init-17869-g8e464dd76bef, vendor/openssl/1.1.1j, vendor/google/capsicum-test/7707222b46abe52d18fd4fbb76115ffdb3e6f74b, vendor/openssh/8.4p1, vendor/openssh/8.3p1, vendor/openssh/8.2p1, vendor/openssh/8.1p1, vendor/openzfs/20210210, vendor/subversion/subversion-1.14.1, vendor/NetBSD/bmake/20210206, vendor/unbound/1.13.1, vendor/bc/3.2.6, vendor/atf/20210128, vendor/sqlite3/sqlite-3340100, vendor/tzdata/tzdata2021a, vendor/device-tree/5.10, vendor/device-tree/5.9, vendor/NetBSD/bmake/20210110, vendor/openzfs/20210107, vendor/acpica/20210105, vendor/acpica/20201217, vendor/llvm-project/llvmorg-11.0.1-0-g43ff75f2c3fe, vendor/llvm-project/llvmorg-11.0.1-rc2-0-g43ff75f2c3f, vendor/pnglite/20130820, vendor/terminus/terminus-font-4.48, vendor/tzdata/tzdata2020f, vendor/libarchive/3.5.1, vendor/bc/3.2.4, vendor/lua/5.4.2, vendor/zstd/1.4.8, vendor/tzdata/tzdata2020e, vendor/unbound/1.13.0, vendor/openssl/1.1.1i, vendor/bc/3.2.3, vendor/libarchive/3.5.0, vendor/bc/3.2.0, vendor/NetBSD/bmake/20201117, vendor/ena-com/2.3.0, vendor/ena-com/2.2.1, vendor/acpica/20201113, vendor/NetBSD/bmake/20201101, vendor/unbound/1.12.0, vendor/less/v563, release/12.2.0, upstream/12.2.0, vendor/tzdata/tzdata2020d, vendor/tzdata/tzdata2020c, vendor/openzfs/2.0.0-rc3-gfc5966, vendor/lua/5.3.6, vendor/llvm-project/llvmorg-11.0.0-0-g176249bd673, vendor/acpica/20200925, vendor/tzdata/tzdata2020b, vendor/openzfs/2.0-rc3-gfc5966, vendor/llvm-project/llvmorg-11.0.0-rc5-0-g60a25202a7d, vendor/bc/3.1.6, vendor/nvi/2.2.0-05ed8b9, vendor/openssl/1.1.1h, vendor/openzfs/2.0-rc2-g4ce06f, vendor/llvm-project/llvmorg-11.0.0-rc2-91-g6e042866c30, vendor/lib9p/9d5aee77bcc1bf0e79b0a3bfefff5fdf2146283c, vendor/nvi/2.2.0, vendor/NetBSD/bmake/20200902, vendor/openzfs/2.0-rc1-gfd20a8, vendor/openzfs/2.0-rc1-ga00c61, vendor/openzfs/2.0-rc0-g184df27, vendor/llvm-project/llvmorg-11.0.0-rc2-0-g414f32a9e86, vendor/unbound/1.11.0, vendor/sqlite3/sqlite-3330000, vendor/llvm-project/llvmorg-11.0.0-rc1-47-gff47911ddfc, vendor/bc/3.1.5, vendor/device-tree/5.8, vendor/bc/3.1.4, vendor/llvm-project/llvmorg-11.0.0-rc1-25-g903c872b169, vendor/pcg-c/20190718-83252d9, vendor/llvm-project/llvmorg-11-init-20933-g3c1fca803bc, vendor/llvm-project/llvmorg-11-init-20887-g2e10b7a39b9
# 3c0e5685 23-Jul-2020 John Baldwin <jhb@FreeBSD.org>

Add support for KTLS RX via software decryption.

Allow TLS records to be decrypted in the kernel after being received
by a NIC. At a high level this is somewhat similar to software KTLS
for the tra

Add support for KTLS RX via software decryption.

Allow TLS records to be decrypted in the kernel after being received
by a NIC. At a high level this is somewhat similar to software KTLS
for the transmit path except in reverse. Protocols enqueue mbufs
containing encrypted TLS records (or portions of records) into the
tail of a socket buffer and the KTLS layer decrypts those records
before returning them to userland applications. However, there is an
important difference:

- In the transmit case, the socket buffer is always a single "record"
holding a chain of mbufs. Not-yet-encrypted mbufs are marked not
ready (M_NOTREADY) and released to protocols for transmit by marking
mbufs ready once their data is encrypted.

- In the receive case, incoming (encrypted) data appended to the
socket buffer is still a single stream of data from the protocol,
but decrypted TLS records are stored as separate records in the
socket buffer and read individually via recvmsg().

Initially I tried to make this work by marking incoming mbufs as
M_NOTREADY, but there didn't seemed to be a non-gross way to deal with
picking a portion of the mbuf chain and turning it into a new record
in the socket buffer after decrypting the TLS record it contained
(along with prepending a control message). Also, such mbufs would
also need to be "pinned" in some way while they are being decrypted
such that a concurrent sbcut() wouldn't free them out from under the
thread performing decryption.

As such, I settled on the following solution:

- Socket buffers now contain an additional chain of mbufs (sb_mtls,
sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by
the protocol layer. These mbufs are still marked M_NOTREADY, but
soreceive*() generally don't know about them (except that they will
block waiting for data to be decrypted for a blocking read).

- Each time a new mbuf is appended to this TLS mbuf chain, the socket
buffer peeks at the TLS record header at the head of the chain to
determine the encrypted record's length. If enough data is queued
for the TLS record, the socket is placed on a per-CPU TLS workqueue
(reusing the existing KTLS workqueues and worker threads).

- The worker thread loops over the TLS mbuf chain decrypting records
until it runs out of data. Each record is detached from the TLS
mbuf chain while it is being decrypted to keep the mbufs "pinned".
However, a new sb_dtlscc field tracks the character count of the
detached record and sbcut()/sbdrop() is updated to account for the
detached record. After the record is decrypted, the worker thread
first checks to see if sbcut() dropped the record. If so, it is
freed (can happen when a socket is closed with pending data).
Otherwise, the header and trailer are stripped from the original
mbufs, a control message is created holding the decrypted TLS
header, and the decrypted TLS record is appended to the "normal"
socket buffer chain.

(Side note: the SBCHECK() infrastucture was very useful as I was
able to add assertions there about the TLS chain that caught several
bugs during development.)

Tested by: rmacklem (various versions)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24628

show more ...


Revision tags: vendor/acpica/20200717, vendor/sendmail/8.16.1, vendor/NetBSD/bmake/20200710, vendor/bc/3.1.3, vendor/NetBSD/bmake/20200704, vendor/sqlite3/sqlite-3320300, vendor/bc/3.1.1, vendor/NetBSD/bmake/20200629, vendor/llvm-project/llvmorg-10.0.1-0-gef32c611aa2, vendor/llvm-project/llvmorg-10.0.1-rc2-0-g77d76b71d7d, vendor/bc/3.0.2, vendor/llvm-project/llvmorg-10.0.0-129-gd24d5c8e308, vendor/ntp/4.2.8p15, vendor/byacc/20200330, vendor/llvm-project/llvmorg-10.0.0-97-g6f71678ecd2, vendor/flex/2.6.4, vendor/file/5.39, vendor/blocklist/20200615, vendor/opencsd/v0.14.2, vendor/sqlite3/sqlite-3320200, release/11.4.0, upstream/11.4.0, vendor/sqlite3/sqlite-3320000, vendor/NetBSD/bmake/20200606, vendor/device-tree/5.7, vendor/edk2/ca407c7246bf405da6d9b1b9d93e5e7f17b4b1f9, vendor/subversion/subversion-1.14.0, vendor/apr/apr-1.7.0, vendor/acpica/20200528, vendor/ena-com/2.2.0, vendor/zstd/1.4.5, vendor/llvm-project/llvmorg-10.0.1-rc1-0-gf79cd71e145, vendor/unbound/1.10.1, vendor/NetBSD/bmake/20200517, vendor/libarchive/3.4.3, vendor/acpica/20200430, vendor/lib9p/7ddb1164407da19b9b1afb83df83ae65a71a9a66, vendor/tzdata/tzdata2020a, vendor/openssl/1.1.1g, vendor/sqlite3/sqlite-3310100, vendor/device-tree/5.6
# 25f4ddfb 10-Apr-2020 Mark Johnston <markj@FreeBSD.org>

sbappendcontrol() needs to avoid clearing M_NOTREADY on data mbufs.

If LOCAL_CREDS is set on a unix socket and sendfile() is called,
sendfile will call uipc_send(PRUS_NOTREADY), prepending a control

sbappendcontrol() needs to avoid clearing M_NOTREADY on data mbufs.

If LOCAL_CREDS is set on a unix socket and sendfile() is called,
sendfile will call uipc_send(PRUS_NOTREADY), prepending a control
message to the M_NOTREADY mbufs. uipc_send() then calls
sbappendcontrol() instead of sbappend(), and sbappendcontrol() would
erroneously clear M_NOTREADY.

Pass send flags to sbappendcontrol(), like we do for sbappend(), to
preserve M_READY when necessary.

Reported by: syzkaller
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24333

show more ...


12345678910>>...16