Revision tags: vendor/llvm-project/llvmorg-18.1.5-0-g617a15a9eac9, vendor/NetBSD/bmake/20240430, vendor/libcbor/0.11.0, vendor/llvm-project/llvmorg-18.1.4-0-ge6c3289804a6, vendor/device-tree/6.8, vendor/device-tree/6.7 |
|
#
5716d902 |
| 09-Apr-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Revert "unix: new implementation of unix/stream & unix/seqpacket"
The regressions in aio(4) and kernel RPC aren't a 5 minute problem.
This reverts commit d80a97def9a1db6f07f5d2e68f7ad62b27918947. T
Revert "unix: new implementation of unix/stream & unix/seqpacket"
The regressions in aio(4) and kernel RPC aren't a 5 minute problem.
This reverts commit d80a97def9a1db6f07f5d2e68f7ad62b27918947. This reverts commit d1cbb17a873c787a527316bbb27551e97d5ad30c. This reverts commit fb8a8333b481cc4256d0b3f0b5b4feaa4594e01f.
show more ...
|
#
d80a97de |
| 08-Apr-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
unix: new implementation of unix/stream & unix/seqpacket
Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension of
unix: new implementation of unix/stream & unix/seqpacket
Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension of SOCK_STREAM. The change meets three goals: get rid of unix(4) specific stuff in the generic socket code, provide a faster and robust unix/stream sockets and bring unix/seqpacket much closer to specification. Highlights follow:
- The send buffer now is truly bypassed. Previously it was always empty, but the send(2) still needed to acquire its lock and do a variety of tricks to be woken up in the right time while sleeping on it. Now the only two things we care about in the send buffer is the I/O sx(9) lock that serializes operations and value of so_snd.sb_hiwat, which we can read without obtaining a lock. The sleep of a send(2) happens on the mutex of the receive buffer of the peer. A bulk send/recv of data with large socket buffers will make both syscalls just bounce between owning the receive buffer lock and copyin(9)/copyout(9), no other locks would be involved.
- The implementation uses new mchain structure to manipulate mbuf chains. Note that this required converting to mchain two functions that are shared with unix/dgram: unp_internalize() and unp_addsockcred() as well as adding a new shared one uipc_process_kernel_mbuf(). This induces some non- functional changes in the unix/dgram code as well. There is a space for improvement here, as right now it is a mix of mchain and manually managed mbuf chains.
- unix/seqpacket previously marked as PR_ADDR & PR_ATOMIC and thus treated as a datagram socket by the generic socket code, now becomes a true stream socket with record markers.
- unix/stream loses the sendfile(2) support. This can be brought back, but requires some work. Let's first see if there is any interest in this feature, except purely academical.
Reviewed by: markj, tuexen Differential Revision: https://reviews.freebsd.org/D44151
show more ...
|
Revision tags: vendor/llvm-project/llvmorg-18.1.3-0-gc13b7485b879, vendor/device-tree/6.5, vendor/openssh/9.7p1, vendor/unbound/1.19.3, vendor/NetBSD/bmake/20240309, vendor/sqlite3/sqlite-3450100, vendor/llvm-project/llvmorg-18.1.1-0-gdba2a75e9c7e, vendor/got/diff/2023-09-15, release/13.3.0, vendor/libucl/20240206, vendor/xz/5.6.0, vendor/llvm-project/llvmorg-18.1.0-rc3-0-g6c90f8dd5463, vendor/llvm-project/llvmorg-18.1.0-rc2-53-gc7b0a6ecd442, vendor/arm-optimized-routines/v24.01, vendor/zlib/1.3.1, vendor/expat/2.6.0, vendor/unbound/1.19.1, vendor/tzcode/tzcode2024a, vendor/llvm-project/llvmorg-18.1.0-rc2-0-gc6c86965d967, vendor/tzdata/tzdata2024a, vendor/sendmail/8.18.1, vendor/acpica/20230628, vendor/acpica/20230331, vendor/llvm-project/llvmorg-18-init-18361-g22683463740e, vendor/libcxxrt/2024-01-25-fd484be8d1e94a1fcf6bc5c67e5c07b65ada19b6, vendor/llvm-project/llvmorg-18-init-18359-g93248729cfae, vendor/sqlite3/sqlite-3450000, vendor/NetBSD/bmake/20240108, vendor/llvm-project/llvmorg-18-init-16864-g3b3ee1f53424, vendor/llvm-project/llvmorg-18-init-16595-g7c00a5be5cde, vendor/llvm-project/llvmorg-18-init-16003-gfc5f51cf5af4 |
|
#
660bd40a |
| 02-Jan-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netlink: use domain specific send buffer
Instead of using generic socket code, create Netlink specific socket buffer. It is a simple TAILQ of writes that came from userland. This saves us one memo
netlink: use domain specific send buffer
Instead of using generic socket code, create Netlink specific socket buffer. It is a simple TAILQ of writes that came from userland. This saves us one memory allocation that could fail and one memory copy.
Reviewed by: melifaro Differential Revision: https://reviews.freebsd.org/D42522
show more ...
|
Revision tags: vendor/bc/6.7.4, vendor/ena-com/2.7.0, vendor/llvm-project/llvmorg-18-init-15692-g007ed0dccd6a, vendor/tzdata/tzdata2023d, vendor/openssh/9.6p1, vendor/llvm-project/llvmorg-18-init-15088-gd14ee76181fb, vendor/llvm-project/llvmorg-18-init-14265-ga17671084db1, vendor/llvm-project/llvmorg-17.0.6-0-g6009708b4367, vendor/xz/5.4.5 |
|
#
29363fb4 |
| 23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl s
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script.
Sponsored by: Netflix
show more ...
|
Revision tags: vendor/llvm-project/llvmorg-17.0.5-0-g98bfdac5ce82, vendor/unbound/1.19.0, vendor/sqlite3/sqlite-3440000, release/14.0.0, vendor/bc/6.7.2, vendor/llvm-project/llvmorg-17.0.3-0-g888437e1b600, vendor/bsddialog/1.0, vendor/llvm-project/llvmorg-17.0.2-0-gb2417f51dbbd, vendor/openssh/9.5p1, vendor/llvm-project/llvmorg-17.0.1-25-g098e653a5bed, vendor/nvi/2.2.1, vendor/openssl/3.0.11, vendor/sqlite3/sqlite-3430100, vendor/unbound/1.18.0, vendor/NetBSD/bmake/20230909, vendor/openssl/1.1.1w, vendor/llvm-project/llvmorg-17.0.0-rc4-10-g0176e8729ea4, vendor/file/5.45, vendor/llvm-project/llvmorg-17.0.0-rc3-79-ga612cb0b81d8, vendor/krb5/1.21.2, vendor/unifdef/2.12, vendor/unifdef/2.11, 2023.08.19-b34f66deb02e188104, vendor/zlib/1.3 |
|
#
95ee2897 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
Revision tags: vendor/less/v643, vendor/NetBSD/libc-vis/20230813, vendor/openssh/9.4p1, vendor/device-tree/6.4, vendor/device-tree/6.3, vendor/device-tree/6.2, vendor/device-tree/6.1, vendor/krb5/1.21.1, vendor/xz/5.4.4, vendor/openssl/3.0.10, vendor/openssl/1.1.1v, vendor/llvm-project/llvmorg-17-init-19311-gbc849e525f80, vendor/llvm-project/llvmorg-17-init-19304-gd0b54bb50e51, vendor/openssh/9.3p2, vendor/lua/5.4.6, vendor/NetBSD/bmake/20230622, vendor/openpam/XIMENIA, vendor/heimdal/7.8.0-2023-06-10-f62e2f278, vendor/openssl/3.0.9, vendor/llvm-project/llvmorg-16.0.6-0-g7cbf1a259152, vendor/ntp/4.2.8p17, vendor/llvm-project/llvmorg-16.0.5-0-g185b81e034ba, vendor/spleen/2.0.0, vendor/ntp/4.2.8p16, vendor/openssl/1.1.1u, vendor/sqlite3/sqlite-3420000, vendor/bc/6.6.0, vendor/llvm-project/llvmorg-16.0.4-0-gae42196bc493, vendor/NetBSD/bmake/20230510, vendor/xz/5.4.3, vendor/tcpdump/4.99.4, vendor/llvm-project/llvmorg-16.0.3-0-gda3cd333bea5, vendor/ldns/1.8.3, vendor/spleen/1.9.3, vendor/libpcap/1.10.4, vendor/spleen/1.6.0, vendor/less/v632, vendor/bc/6.5.0, vendor/libfido2/1.13.0, vendor/libfido2/1.12.0, vendor/libfido2/1.11.0, vendor/libfido2/1.10.0, vendor/libfido2/1.9.0, vendor/NetBSD/bmake/20230414, vendor/llvm-project/llvmorg-16.0.2-0-g18ddebe1a1a9, vendor/libcbor/0.10.2, vendor/tzcode/tzcode2023c, vendor/tzcode/tzcode2023b, vendor/tzcode/tzcode2023a, vendor/sqlite3/sqlite-3410200, vendor/llvm-project/llvmorg-16.0.1-0-gcd89023f7979, release/13.2.0, vendor/llvm-project/llvmorg-16.0.0-45-g42d1b276f779, vendor/llvm-project/llvmorg-16.0.0-0-g08d094a0e457, vendor/tzdata/tzdata2023c, vendor/libpcap/1.10.3, vendor/opencsd/v1.4.0, vendor/arm-optimized-routines/v23.01, vendor/tzdata/tzdata2023b, vendor/tzdata/tzdata2023a, vendor/xz/5.4.2, vendor/openssh/9.3p1, vendor/openssl/3.0.8, vendor/bc/6.4.0, vendor/sqlite3/sqlite-3410000, vendor/bc/6.3.1, vendor/bearssl/20230220, vendor/zlib/1.2.13, vendor/llvm-project/llvmorg-16.0.0-rc2-10-g073506d8c15c, vendor/llvm-project/llvmorg-16-init-18548-gb0daacf58f41, vendor/NetBSD/bmake/20230208 |
|
#
c0e4090e |
| 08-Feb-2023 |
Andrew Gallatin <gallatin@FreeBSD.org> |
ktls: Accurately track if ifnet ktls is enabled
This allows us to avoid spurious calls to ktls_disable_ifnet()
When we implemented ifnet kTLSe, we set a flag in the tx socket buffer (SB_TLS_IFNET)
ktls: Accurately track if ifnet ktls is enabled
This allows us to avoid spurious calls to ktls_disable_ifnet()
When we implemented ifnet kTLSe, we set a flag in the tx socket buffer (SB_TLS_IFNET) to indicate ifnet kTLS. This flag meant that now, or in the past, ifnet ktls was active on a socket. Later, I added code to switch ifnet ktls sessions to software in the case of lossy TCP connections that have a high retransmit rate. Because TCP was using SB_TLS_IFNET to know if it needed to do math to calculate the retransmit ratio and potentially call into ktls_disable_ifnet(), it was doing unneeded work long after a session was moved to software.
This patch carefully tracks whether or not ifnet ktls is still enabled on a TCP connection. Because the inp is now embedded in the tcpcb, and because TCP is the most frequent accessor of this state, it made sense to move this from the socket buffer flags to the tcpcb. Because we now need reliable access to the tcbcb, we take a ref on the inp when creating a tx ktls session.
While here, I noticed that rack/bbr were incorrectly implementing tfb_hwtls_change(), and applying the change to all pending sends, when it should apply only to future sends.
This change reduces spurious calls to ktls_disable_ifnet() by 95% or so in a Netflix CDN environment.
Reviewed by: markj, rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D38380
show more ...
|
Revision tags: vendor/byacc/20230201, vendor/openssl/1.1.1t, vendor/NetBSD/libedit/2023-01-06, vendor/openssh/9.2p1, vendor/tcsh/6.24.07, vendor/bc/6.2.2, vendor/bc/6.2.1, vendor/bc/6.2.0, vendor/bc/6.1.0, vendor/bc/6.0.4, vendor/NetBSD/bmake/20230126, vendor/Juniper/libxo/1.6.0, vendor/zstd/1.5.2, vendor/xz/5.4.1, vendor/sendmail/8.17.1, vendor/llvm-project/llvmorg-15.0.7-0-g8dfdcc7b7bf6, vendor/heimdal/7.8.0, vendor/sqlite3/sqlite-3400100, vendor/xz/5.4.0, vendor/tzcode/tzcode2022g, vendor/tzcode/tzcode2022f, vendor/tzcode/tzcode2022e, vendor/tzcode/tzcode2022d, vendor/xz/5.2.9, vendor/llvm-project/llvmorg-15.0.6-0-g088f33605d8a, vendor/tzdata/tzdata2022g, release/12.4.0, vendor/sqlite3/sqlite-3400000, vendor/expat/2.5.0, vendor/xz/5.2.8, vendor/device-tree/6.0, vendor/device-tree/5.19, vendor/openssl/1.1.1s, vendor/wireguard-tools/v1.0.20210914, vendor/tzdata/tzdata2022f, vendor/acpica/20221020, vendor/unbound/1.17.0, vendor/llvm-project/llvmorg-15.0.2-10-gf3c5289e7846, vendor/llvm-project/llvmorg-15.0.2-0-g4bd3f3759259, vendor/llvm-project/llvmorg-15.0.1-0-gb73d2c8c720a, vendor/tzdata/tzdata2022e, vendor/openssh/9.1p1, vendor/unbound/1.16.3 |
|
#
7b660faa |
| 27-Sep-2022 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
sockbufs: add sbreserve_locked_limit() with custom maxsockbuf limit.
Protocols such as netlink may need a large socket receive buffer, measured in tens of megabytes. This change allows netlink to
sockbufs: add sbreserve_locked_limit() with custom maxsockbuf limit.
Protocols such as netlink may need a large socket receive buffer, measured in tens of megabytes. This change allows netlink to set larger socket buffers (given the privs are in place), without requiring user to manuall bump maxsockbuf.
Reviewed by: glebius Differential Revision: https://reviews.freebsd.org/D36747
show more ...
|
#
f6696856 |
| 27-Sep-2022 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
protocols: make socket buffers ioctl handler changeable
Allow to set custom per-protocol handlers for the socket buffers ioctls by introducing pr_setsbopt callback with the default value set to th
protocols: make socket buffers ioctl handler changeable
Allow to set custom per-protocol handlers for the socket buffers ioctls by introducing pr_setsbopt callback with the default value set to the currently-used sbsetopt().
Reviewed by: glebius Differential Revision: https://reviews.freebsd.org/D36746
show more ...
|
Revision tags: vendor/bsddialog/0.4, vendor/tzdata/tzdata2022d, vendor/file/5.43, vendor/expat/2.4.9, vendor/sqlite3/sqlite-3390300, vendor/llvm-project/llvmorg-15.0.0-9-g1c73596d3454, vendor/llvm-project/llvmorg-15.0.0-0-g4ba6a9c9f65b, vendor/less/v608, vendor/bsddialog/0.3, vendor/lua/5.4.4, vendor/lua/5.4.3, vendor/sqlite3/sqlite-3390200, vendor/bc/6.0.2, verndor/bc/6.0.2, vendor/dhcpcd/9.4.1, vendor/tzcode/tzcode2022c, vendor/tzcode/unsplit, vendor/tzdata/tzdata2022c, vendor/llvm-project/llvmorg-15.0.0-rc2-40-gfbd2950d8d0d, vendor/tzdata/tzdata2022b, vendor/arm-optimized-routines/20220210-89ca9c3, vendor/device-tree/5.18, vendor/device-tree/5.17, vendor/device-tree/5.16, vendor/device-tree/5.15, vendor/device-tree/5.14, vendor/unbound/1.16.2, vendor/llvm-project/llvmorg-15-init-17826-g1f8ae9d7e7e4, vendor/llvm-project/llvmorg-15-init-17827-gd77882e66779, vendor/NetBSD/bmake/20220726, vendor/NetBSD/bmake/20220724, vendor/llvm-project/llvmorg-15-init-17485-ga3e38b4a206b, vendor/llvm-project/llvmorg-15-init-16436-g18a6ab5b8d1f, vendor/unbound/1.16.1, vendor/sqlite3/sqlite-3390000, vendor/openssl/1.1.1q, vendor/file/5.42, vendor/llvm-project/llvmorg-15-init-15358-g53dc0f107877 |
|
#
458f475d |
| 24-Jun-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
unix/dgram: smart socket buffers for one-to-many sockets
A one-to-many unix/dgram socket is a socket that has been bound with bind(2) and can get multiple connections. A typical example is /var/run
unix/dgram: smart socket buffers for one-to-many sockets
A one-to-many unix/dgram socket is a socket that has been bound with bind(2) and can get multiple connections. A typical example is /var/run/log bound by syslogd(8) and receiving multiple connections from libc syslog(3) API. Until now all of these connections shared the same receive socket buffer of the bound socket. This made the socket vulnerable to overflow attack. See 240d5a9b1ce for a historical attempt to workaround the problem.
This commit creates a per-connection socket buffer for every single connected socket and eliminates the problem. The new behavior will optimize seldom writers over frequent writers. See added test case scenarios and code comments for more detailed description of the new behavior.
Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35303
show more ...
|
#
a7444f80 |
| 24-Jun-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
unix/dgram: use minimal possible socket buffer for PF_UNIX/SOCK_DGRAM
This change fully splits away PF_UNIX/SOCK_DGRAM from other socket buffer implementations, without any behavior changes.
Generi
unix/dgram: use minimal possible socket buffer for PF_UNIX/SOCK_DGRAM
This change fully splits away PF_UNIX/SOCK_DGRAM from other socket buffer implementations, without any behavior changes.
Generic socket implementation is reduced down to one STAILQ and very little code.
Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35300
show more ...
|
#
a4fc4142 |
| 24-Jun-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: enable protocol specific socket buffers
Split struct sockbuf into common shared fields and protocol specific union, where protocols are free to implement whatever buffer they want. Such pr
sockets: enable protocol specific socket buffers
Split struct sockbuf into common shared fields and protocol specific union, where protocols are free to implement whatever buffer they want. Such protocols should mark themselves with PR_SOCKBUF and are expected to initialize their buffers in their pr_attach and tear them down in pr_detach.
Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35299
show more ...
|
Revision tags: vendor/openssl/1.1.1p, vendor/bc/5.3.3, vendor/bc/5.3.2, vendor/llvm-project/llvmorg-14.0.5-0-gc12386ae247c, vendor/bc/5.3.1, vendor/bc/5.3.0, vendor/unbound/1.16.0, vendor/llvm-project/llvmorg-14.0.4-0-g29f1039a7285, vendor/sqlite3/sqlite-3380500, release/13.1.0, upstream/13.1.0, vendor/bc/5.2.5, vendor/openssl/1.1.1o, vendor/llvm-project/llvmorg-14.0.2-0-g0e27d08cdeb3, vendor/llvm-project/llvmorg-14.0.3-0-g1f9140064dfb |
|
#
fe8c78f0 |
| 23-Apr-2022 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ktls: Add full support for TLS RX offloading via network interface.
Basic TLS RX offloading uses the "csum_flags" field in the mbuf packet header to figure out if an incoming mbuf has been fully off
ktls: Add full support for TLS RX offloading via network interface.
Basic TLS RX offloading uses the "csum_flags" field in the mbuf packet header to figure out if an incoming mbuf has been fully offloaded or not. This information follows the packet stream via the LRO engine, IP stack and finally to the TCP stack. The TCP stack preserves the mbuf packet header also when re-assembling packets after packet loss. When the mbuf goes into the socket buffer the packet header is demoted and the offload information is transferred to "m_flags" . Later on a worker thread will analyze the mbuf flags and decide if the mbufs making up a TLS record indicate a fully-, partially- or not decrypted TLS record. Based on these three cases the worker thread will either pass the packet on as-is or recrypt the decrypted bits, if any, or decrypt the packet as usual.
During packet loss the kernel TLS code will call back into the network driver using the send tag, informing about the TCP starting sequence number of every TLS record that is not fully decrypted by the network interface. The network interface then stores this information in a compressed table and starts asking the hardware if it has found a valid TLS header in the TCP data payload. If the hardware has found a valid TLS header and the referred TLS header is at a valid TCP sequence number according to the TCP sequence numbers provided by the kernel TLS code, the network driver then informs the hardware that it can resume decryption.
Care has been taken to not merge encrypted and decrypted mbuf chains, in the LRO engine and when appending mbufs to the socket buffer.
The mbuf's leaf network interface pointer is used to figure out from which network interface the offloading rule should be allocated. Also this pointer is used to track route changes.
Currently mbuf send tags are used in both transmit and receive direction, due to convenience, but may get a new name in the future to better reflect their usage.
Reviewed by: jhb@ and gallatin@ Differential revision: https://reviews.freebsd.org/D32356 Sponsored by: NVIDIA Networking
show more ...
|
#
d59bc188 |
| 27-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockbuf: remove unused mbuf counter and cluster counter
With M_EXTPG mbufs these two counters already do not represent the reality. As we are moving towards protocol independent socket buffers, whi
sockbuf: remove unused mbuf counter and cluster counter
With M_EXTPG mbufs these two counters already do not represent the reality. As we are moving towards protocol independent socket buffers, which may not even use mbufs at all, the counters become less and less relevant. The only userland seeing them was 'netstat -x'.
PR: 264181 (exp-run) Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35334
show more ...
|
#
6890b588 |
| 17-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockbuf: improve sbcreatecontrol()
o Constify memory pointer. Make length unsigned. o Make it never fail with M_WAITOK and assert that length is sane.
|
#
b46667c6 |
| 17-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockbuf: merge two versions of sbcreatecontrol() into one
No functional change.
|
#
43283184 |
| 12-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: use socket buffer mutexes in struct socket directly
Since c67f3b8b78e the sockbuf mutexes belong to the containing socket, and socket buffers just point to it. In 74a68313b50 macros that a
sockets: use socket buffer mutexes in struct socket directly
Since c67f3b8b78e the sockbuf mutexes belong to the containing socket, and socket buffers just point to it. In 74a68313b50 macros that access this mutex directly were added. Go over the core socket code and eliminate code that reaches the mutex by dereferencing the sockbuf compatibility pointer.
This change requires a KPI change, as some functions were given the sockbuf pointer only without any hint if it is a receive or send buffer.
This change doesn't cover the whole kernel, many protocols still use compatibility pointers internally. However, it allows operation of a protocol that doesn't use them.
Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35152
show more ...
|
#
7db54446 |
| 09-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockbufs: make sbrelease_internal() private
|
#
a982ce04 |
| 09-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: remove the socket-on-stack hack from sorflush()
The hack can be tracked down to 4.4BSD, where copy was performed under splimp() and then after splx() dom_dispose was called. Stevens has a c
sockets: remove the socket-on-stack hack from sorflush()
The hack can be tracked down to 4.4BSD, where copy was performed under splimp() and then after splx() dom_dispose was called. Stevens has a chapter on this function, but he doesn't answer why this trick is necessary. Why can't we call into dom_dispose under splimp()? Anyway, with multithreaded kernel the hack seems to be necessary to avoid LORs between socket buffer lock and different filesystem locks, especially network file systems.
The new socket buffers KPI sbcut() from 1d2df300e9b allow us to get rid of the hack.
Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35125
show more ...
|
Revision tags: vendor/NetBSD/bmake/20220418, vendor/bearssl/20220418, vendor/bc/5.2.4, vendor/NetBSD/libedit/2022-04-11, vendor/openssh/9.0p1, vendor/NetBSD/bmake/20220330, vendor/acpica/20220331, vendor/zlib/1.2.12, vendor/llvm-project/llvmorg-14.0.0-2-g3f43d803382d, vendor/heimdal/7.7.0, vendor/expat/2.4.7, vendor/llvm-project/llvmorg-14.0.0-rc4-2-gadd3ab7f4c8a, vendor/tzdata/tzdata2022a, vendor/openssl/1.1.1n, vendor/bsddialog/0.2, vendor/libcxxrt/2022-03-09-fd484be8d1e94a1fcf6bc5c67e5c07b65ada19b6, vendor/bc/5.2.3, vendor/llvm-project/llvmorg-14.0.0-rc2-12-g09546e1b5103, vendor/expat/2.4.6, vendor/openssh/8.9p1, vendor/llvm-project/llvmorg-13.0.1-0-g75e33f71c2da, vendor/llvm-project/llvmorg-14.0.0-rc1-74-g4dc3cb8e3255, vendor/unbound/1.15.0, vendor/NetBSD/bmake/20220208, vendor/bc/5.2.2, vendor/NetBSD/bmake/20220204, vendor/llvm-project/llvmorg-14-init-18315-g190be5457c90, vendor/llvm-project/llvmorg-14-init-18294-gdb01b123d012, vendor/terminus/terminus-font-4.49.1, vendor/bsddialog/0.1, vendor/llvm-project/llvmorg-14-init-17616-g024a1fab5c35, vendor/dma/2022-01-27, vendor/ena-com/2.5.0, vendor/wpa/2.10, vendor/expat/2.4.3, vendor/sqlite3/sqlite-3370200, vendor/wpa/gb26f5c0fe, vendor/sqlite3/sqlite-3370100, vendor/file/5.41, vendor/llvm-project/llvmorg-14-init-13186-g0c553cc1af2e, vendor/bsddialog/0.0.2, vendor/NetBSD/bmake/20211212, vendor/openssl/1.1.1m, vendor/unbound/1.14.0, vendor/bsddialog/0.0.1, vendor/unbound/1.14.0rc1, vendor/llvm-project/llvmorg-14-init-11187-g222442ec2d71, release/12.3.0, upstream/12.3.0, vendor/wpa/g14ab4a816, vendor/bc/5.2.1, vendor/bc/5.2.0, vendor/bsddialog/2021-11-24, vendor/llvm-project/llvmorg-14-init-10223-g401b76fdf2b3, vendor/llvm-project/llvmorg-14-init-10186-gff7f2cfa959b, vendor/mandoc/1.14.6, vendor/openssh/8.8p1, vendor/ck/2021029, vendor/tzdata/tzdata2021e, vendor/tzdata/tzdata2021d, vendor/bc/5.1.1, vendor/bc/5.1.0, vendor/tzdata/tzdata2021c, vendor/libfido2/1.8.0, vendor/libcbor/0.8.0, vendor/acpica/20210930, vendor/llvm-project/llvmorg-13.0.0-0-gd7b669b3a303, vendor/llvm-project/llvmorg-13.0.0-rc4-0-gd7b669b3a303, vendor/tzdata/tzdata2021b, vendor/dma/2021-07-10, vendor/NetBSD/libedit/2021-09-10, vendor/bc/5.0.2, vendor/llvm-project/llvmorg-13.0.0-rc3-8-g08642a395f23 |
|
#
ade1daa5 |
| 17-Sep-2021 |
Mark Johnston <markj@FreeBSD.org> |
socket: Synchronize soshutdown() with listen(2) and AIO
To handle shutdown(SHUT_RD) we flush the receive buffer of the socket. This may involve searching for control messages of type SCM_RIGHTS, sin
socket: Synchronize soshutdown() with listen(2) and AIO
To handle shutdown(SHUT_RD) we flush the receive buffer of the socket. This may involve searching for control messages of type SCM_RIGHTS, since we need to close the file references. Closing arbitrary files with socket buffer locks held is undesirable, mainly due to lock ordering issues, so we instead make a copy of the socket buffer and operate on that without any locks. Fields in the original buffer are cleared.
This behaviour clobbered the AIO job queue associated with a receive buffer. It could also cause us to leak a KTLS session reference. Reorder socket buffer fields to address this.
An alternate solution would be to remove the hack in sorflush(), but this is not quite feasible (yet). In particular, though sorflush() flags the sockbuf with SBS_CANTRCVMORE, it is possible for more data to be queued - the flag just prevents userspace from reading more data. I suspect we should fix this; SBS_CANTRCVMORE represents a terminal state and protocols can likely just drop any data destined for such a buffer. Many of them already do, but in some cases the check is racy, and some KPI churn will be needed to fix everything. This approach is more straightforward for now.
Reported by: syzbot+104d8ee3430361cb2795@syzkaller.appspotmail.com Reported by: syzbot+5bd2e7d05f84a59d0d1b@syzkaller.appspotmail.com Reviewed by: jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31976
show more ...
|
#
74a68313 |
| 10-Sep-2021 |
Mark Johnston <markj@FreeBSD.org> |
socket: Add macros to lock socket buffers using socket references
Since commit c67f3b8b78e50c6df7c057d6cf108e4d6b4312d0 the sockbuf mutexes belong to the containing socket. Sockbufs contain a point
socket: Add macros to lock socket buffers using socket references
Since commit c67f3b8b78e50c6df7c057d6cf108e4d6b4312d0 the sockbuf mutexes belong to the containing socket. Sockbufs contain a pointer to a mutex, which by default is initialized to the corresponding mutexes in the socket. The SOCKBUF_LOCK() etc. macros operate on this pointer. However, the pointer is clobbered by listen(2) so it's not safe to use them unless one is sure that the socket is not a listening socket.
This change introduces a new set of macros which lock socket buffers through the socket. This is a bit cheaper since it removes the pointer indirection, and allows one to safely lock socket buffers and then check for a listening socket.
For MFC, these macros should be reimplemented in terms of the existing socket buffer layout.
Reviewed by: tuexen, gallatin, jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31900
show more ...
|
Revision tags: vendor/llvm-project/llvmorg-13.0.0-rc2-43-gf56129fe78d5 |
|
#
c67f3b8b |
| 07-Sep-2021 |
Mark Johnston <markj@FreeBSD.org> |
socket: Move sockbuf mutexes into the owning socket
This is necessary to provide proper interlocking with listen(2), which destroys the socket buffers. Otherwise, code must lock the socket itself a
socket: Move sockbuf mutexes into the owning socket
This is necessary to provide proper interlocking with listen(2), which destroys the socket buffers. Otherwise, code must lock the socket itself and check SOLISTENING(so), but most I/O paths do not otherwise need to acquire the socket lock, so the extra overhead needed to check a rare error case is undesirable.
listen(2) calls are relatively rare. Thus, the strategy is to have it acquire all socket buffer locks when transitioning to a listening socket. To do this safely, these locks must be stable, and not destroyed during listen(2) as they are today. So, move them out of the sockbuf and into the owning socket. For the sockbuf mutexes, keep a pointer to the mutex in the sockbuf itself, for now. This can be removed by replacing SOCKBUF_LOCK() etc. with macros which operate on the socket itself, as was done for the sockbuf I/O locks.
Reviewed by: tuexen, gallatin MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31658
show more ...
|
#
f94acf52 |
| 07-Sep-2021 |
Mark Johnston <markj@FreeBSD.org> |
socket: Rename sb(un)lock() and interlock with listen(2)
In preparation for moving sockbuf locks into the containing socket, provide alternative macros for the sockbuf I/O locks: SOCK_IO_SEND_(UN)LO
socket: Rename sb(un)lock() and interlock with listen(2)
In preparation for moving sockbuf locks into the containing socket, provide alternative macros for the sockbuf I/O locks: SOCK_IO_SEND_(UN)LOCK() and SOCK_IO_RECV_(UN)LOCK(). These operate on a socket rather than a socket buffer. Note that these locks are used only to prevent concurrent readers and writters from interleaving I/O.
When locking for I/O, return an error if the socket is a listening socket. Currently the check is racy since the sockbuf sx locks are destroyed during the transition to a listening socket, but that will no longer be true after some follow-up changes.
Modify a few places to check for errors from sblock()/SOCK_IO_(SEND|RECV)_LOCK() where they were not before. In particular, add checks to sendfile() and sorflush().
Reviewed by: tuexen, gallatin MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31657
show more ...
|
#
97cf43eb |
| 07-Sep-2021 |
Mark Johnston <markj@FreeBSD.org> |
socket: Reorder socket and sockbuf fields to eliminate some padding
This is in preparation for moving sockbuf locks into the owning socket, in order to provide proper interlocking for listen(2). In
socket: Reorder socket and sockbuf fields to eliminate some padding
This is in preparation for moving sockbuf locks into the owning socket, in order to provide proper interlocking for listen(2). In particular, listening sockets do not use the socket buffers and repurpose that space in struct socket for their own purposes. Moving the locks out of the socket buffers and into the socket proper makes it possible to safely lock socket buffers and test for a listening socket before deciding how to proceed.
Reordering these fields saves some space and helps ensure that UMA will provide the same space efficiency for sockets as before. No functional change intended.
Reviewed by: tuexen, gallatin Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31656
show more ...
|
Revision tags: vendor/openssl/1.1.1l, vendor/openssh/8.7p1, vendor/llvm-project/llvmorg-13.0.0-rc1-97-g23ba3732246a, vendor/llvm-project/llvmorg-13.0.0-rc1-0-gd6974c010878, vendor/unbound/1.13.2, vendor/one-true-awk/0592de4a, vendor/acpica/20210730, vendor/llvm-project/llvmorg-13-init-16854-g6b2e4c5a58d7, vendor/llvm-project/llvmorg-12.0.1-0-gfed41342a82f, vendor/llvm-project/llvmorg-12.0.1-rc2-0-ge7dac564cd0e, vendor/llvm-project/llvmorg-13-init-16847-g88e66fa60ae5, vendor/less/v590, llvmorg-12.0.1-0-gfed41342a82f, vendor/one-true-awk/1e4bc42c53a1, vendor/device-tree/5.13, vendor/device-tree/5.12, vendor/NetBSD/bmake/20210621, vendor/ena-com/2.4.0, vendor/NetBSD/vis/20210621, llvmorg-12.0.1-rc2-0-ge7dac564cd0e, vendor/acpica/20210604, vendor/nvi/2.2.0-3bbdfe4, vendor/tcsh/6.22.04, vendor/bc/4.0.2, vendor/sqlite3/sqlite-3350500, vendor/less/v581.2, vendor/bc/4.0.1, vendor/openssh/8.6p1, vendor/openssh/8.5p1, vendor/llvm-project/llvmorg-12.0.0-0-gd28af7c654d8, vendor/less/v581, vendor/google/capsicum-test/ea66424d921bb393539b298c108a46edee5c3051, release/13.0.0, upstream/13.0.0, vendor/bc/4.0.0, vendor/acpica/20210331, vendor/NetBSD/libedit/2021-03-28, vendor/openssl/1.1.1k, vendor/device-tree/5.11, vendor/NetBSD/libedit/2020-07-10, vendor/libucl/20210314, vendor/bc/3.3.4, vendor/wpa/g9d9b42306541, vendor/tcsh/6.22.03-ceccc7f, bc/3.3.3, vendor/google/capsicum-test/20210302, vendor/dialog/1.3-20210117, vendor/ncurses/6.2-20210220, vendor/arm-optimized-routines/v21.02, vendor/libcxxrt/2021-02-18-8049924686b8414d8e652cbd2a52c763b48e8456, vendor/bc/bc-3.3.0, vendor/llvm-project/llvmorg-12.0.0-rc1-109-gd5d089bf08c9, vendor/llvm-project/llvmorg-12-init-17869-g8e464dd76bef, vendor/openssl/1.1.1j, vendor/google/capsicum-test/7707222b46abe52d18fd4fbb76115ffdb3e6f74b, vendor/openssh/8.4p1, vendor/openssh/8.3p1, vendor/openssh/8.2p1, vendor/openssh/8.1p1, vendor/openzfs/20210210, vendor/subversion/subversion-1.14.1, vendor/NetBSD/bmake/20210206, vendor/unbound/1.13.1, vendor/bc/3.2.6, vendor/atf/20210128, vendor/sqlite3/sqlite-3340100, vendor/tzdata/tzdata2021a, vendor/device-tree/5.10, vendor/device-tree/5.9, vendor/NetBSD/bmake/20210110, vendor/openzfs/20210107, vendor/acpica/20210105, vendor/acpica/20201217, vendor/llvm-project/llvmorg-11.0.1-0-g43ff75f2c3fe, vendor/llvm-project/llvmorg-11.0.1-rc2-0-g43ff75f2c3f, vendor/pnglite/20130820, vendor/terminus/terminus-font-4.48, vendor/tzdata/tzdata2020f, vendor/libarchive/3.5.1, vendor/bc/3.2.4, vendor/lua/5.4.2, vendor/zstd/1.4.8, vendor/tzdata/tzdata2020e, vendor/unbound/1.13.0, vendor/openssl/1.1.1i, vendor/bc/3.2.3, vendor/libarchive/3.5.0, vendor/bc/3.2.0, vendor/NetBSD/bmake/20201117, vendor/ena-com/2.3.0, vendor/ena-com/2.2.1, vendor/acpica/20201113, vendor/NetBSD/bmake/20201101, vendor/unbound/1.12.0, vendor/less/v563, release/12.2.0, upstream/12.2.0, vendor/tzdata/tzdata2020d, vendor/tzdata/tzdata2020c, vendor/openzfs/2.0.0-rc3-gfc5966, vendor/lua/5.3.6, vendor/llvm-project/llvmorg-11.0.0-0-g176249bd673, vendor/acpica/20200925, vendor/tzdata/tzdata2020b, vendor/openzfs/2.0-rc3-gfc5966, vendor/llvm-project/llvmorg-11.0.0-rc5-0-g60a25202a7d, vendor/bc/3.1.6, vendor/nvi/2.2.0-05ed8b9, vendor/openssl/1.1.1h, vendor/openzfs/2.0-rc2-g4ce06f, vendor/llvm-project/llvmorg-11.0.0-rc2-91-g6e042866c30, vendor/lib9p/9d5aee77bcc1bf0e79b0a3bfefff5fdf2146283c, vendor/nvi/2.2.0, vendor/NetBSD/bmake/20200902, vendor/openzfs/2.0-rc1-gfd20a8, vendor/openzfs/2.0-rc1-ga00c61, vendor/openzfs/2.0-rc0-g184df27, vendor/llvm-project/llvmorg-11.0.0-rc2-0-g414f32a9e86, vendor/unbound/1.11.0, vendor/sqlite3/sqlite-3330000, vendor/llvm-project/llvmorg-11.0.0-rc1-47-gff47911ddfc, vendor/bc/3.1.5, vendor/device-tree/5.8, vendor/bc/3.1.4, vendor/llvm-project/llvmorg-11.0.0-rc1-25-g903c872b169, vendor/pcg-c/20190718-83252d9, vendor/llvm-project/llvmorg-11-init-20933-g3c1fca803bc, vendor/llvm-project/llvmorg-11-init-20887-g2e10b7a39b9 |
|
#
3c0e5685 |
| 23-Jul-2020 |
John Baldwin <jhb@FreeBSD.org> |
Add support for KTLS RX via software decryption.
Allow TLS records to be decrypted in the kernel after being received by a NIC. At a high level this is somewhat similar to software KTLS for the tra
Add support for KTLS RX via software decryption.
Allow TLS records to be decrypted in the kernel after being received by a NIC. At a high level this is somewhat similar to software KTLS for the transmit path except in reverse. Protocols enqueue mbufs containing encrypted TLS records (or portions of records) into the tail of a socket buffer and the KTLS layer decrypts those records before returning them to userland applications. However, there is an important difference:
- In the transmit case, the socket buffer is always a single "record" holding a chain of mbufs. Not-yet-encrypted mbufs are marked not ready (M_NOTREADY) and released to protocols for transmit by marking mbufs ready once their data is encrypted.
- In the receive case, incoming (encrypted) data appended to the socket buffer is still a single stream of data from the protocol, but decrypted TLS records are stored as separate records in the socket buffer and read individually via recvmsg().
Initially I tried to make this work by marking incoming mbufs as M_NOTREADY, but there didn't seemed to be a non-gross way to deal with picking a portion of the mbuf chain and turning it into a new record in the socket buffer after decrypting the TLS record it contained (along with prepending a control message). Also, such mbufs would also need to be "pinned" in some way while they are being decrypted such that a concurrent sbcut() wouldn't free them out from under the thread performing decryption.
As such, I settled on the following solution:
- Socket buffers now contain an additional chain of mbufs (sb_mtls, sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by the protocol layer. These mbufs are still marked M_NOTREADY, but soreceive*() generally don't know about them (except that they will block waiting for data to be decrypted for a blocking read).
- Each time a new mbuf is appended to this TLS mbuf chain, the socket buffer peeks at the TLS record header at the head of the chain to determine the encrypted record's length. If enough data is queued for the TLS record, the socket is placed on a per-CPU TLS workqueue (reusing the existing KTLS workqueues and worker threads).
- The worker thread loops over the TLS mbuf chain decrypting records until it runs out of data. Each record is detached from the TLS mbuf chain while it is being decrypted to keep the mbufs "pinned". However, a new sb_dtlscc field tracks the character count of the detached record and sbcut()/sbdrop() is updated to account for the detached record. After the record is decrypted, the worker thread first checks to see if sbcut() dropped the record. If so, it is freed (can happen when a socket is closed with pending data). Otherwise, the header and trailer are stripped from the original mbufs, a control message is created holding the decrypted TLS header, and the decrypted TLS record is appended to the "normal" socket buffer chain.
(Side note: the SBCHECK() infrastucture was very useful as I was able to add assertions there about the TLS chain that caught several bugs during development.)
Tested by: rmacklem (various versions) Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24628
show more ...
|
Revision tags: vendor/acpica/20200717, vendor/sendmail/8.16.1, vendor/NetBSD/bmake/20200710, vendor/bc/3.1.3, vendor/NetBSD/bmake/20200704, vendor/sqlite3/sqlite-3320300, vendor/bc/3.1.1, vendor/NetBSD/bmake/20200629, vendor/llvm-project/llvmorg-10.0.1-0-gef32c611aa2, vendor/llvm-project/llvmorg-10.0.1-rc2-0-g77d76b71d7d, vendor/bc/3.0.2, vendor/llvm-project/llvmorg-10.0.0-129-gd24d5c8e308, vendor/ntp/4.2.8p15, vendor/byacc/20200330, vendor/llvm-project/llvmorg-10.0.0-97-g6f71678ecd2, vendor/flex/2.6.4, vendor/file/5.39, vendor/blocklist/20200615, vendor/opencsd/v0.14.2, vendor/sqlite3/sqlite-3320200, release/11.4.0, upstream/11.4.0, vendor/sqlite3/sqlite-3320000, vendor/NetBSD/bmake/20200606, vendor/device-tree/5.7, vendor/edk2/ca407c7246bf405da6d9b1b9d93e5e7f17b4b1f9, vendor/subversion/subversion-1.14.0, vendor/apr/apr-1.7.0, vendor/acpica/20200528, vendor/ena-com/2.2.0, vendor/zstd/1.4.5, vendor/llvm-project/llvmorg-10.0.1-rc1-0-gf79cd71e145, vendor/unbound/1.10.1, vendor/NetBSD/bmake/20200517, vendor/libarchive/3.4.3, vendor/acpica/20200430, vendor/lib9p/7ddb1164407da19b9b1afb83df83ae65a71a9a66, vendor/tzdata/tzdata2020a, vendor/openssl/1.1.1g, vendor/sqlite3/sqlite-3310100, vendor/device-tree/5.6 |
|
#
25f4ddfb |
| 10-Apr-2020 |
Mark Johnston <markj@FreeBSD.org> |
sbappendcontrol() needs to avoid clearing M_NOTREADY on data mbufs.
If LOCAL_CREDS is set on a unix socket and sendfile() is called, sendfile will call uipc_send(PRUS_NOTREADY), prepending a control
sbappendcontrol() needs to avoid clearing M_NOTREADY on data mbufs.
If LOCAL_CREDS is set on a unix socket and sendfile() is called, sendfile will call uipc_send(PRUS_NOTREADY), prepending a control message to the M_NOTREADY mbufs. uipc_send() then calls sbappendcontrol() instead of sbappend(), and sbappendcontrol() would erroneously clear M_NOTREADY.
Pass send flags to sbappendcontrol(), like we do for sbappend(), to preserve M_READY when necessary.
Reported by: syzkaller MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24333
show more ...
|