#
86c9325d |
| 06-Jun-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: simplify stack switching protocol
Before this patch, a stack (tfb) accepts a tcpcb (tp), if the tp->t_state is TCPS_CLOSED or tfb->tfb_tcp_handoff_ok is not NULL and tfb->tfb_tcp_handoff_ok(tp) returns 0. After this patch, the only check is tfb->tfb_tcp_handoff_ok(tp) returns 0. tfb->tfb_tcp_handoff_ok must always be provided. For existing TCP stacks (FreeBSD, RACK and BBR) there is no functional change. However, the logic is simpler.
Reviewed by: lstewart, peter_lei_ieee_.org, rrs MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D45253
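The before/after acceptance rule described above can be sketched as two predicates. This is a minimal illustration, not the kernel code: the `ex_` types, the state constants, and the `ex_handoff_ok` callback are hypothetical stand-ins, and only the `tfb_tcp_handoff_ok` member name and the "returns 0 means accept" convention come from the commit message.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields used here. */
#define EX_TCPS_CLOSED      0
#define EX_TCPS_ESTABLISHED 4

struct ex_tcpcb {
	int t_state;
};

struct ex_tcp_function_block {
	int (*tfb_tcp_handoff_ok)(struct ex_tcpcb *);
};

/* Old rule: CLOSED is always accepted; otherwise ask the optional callback. */
int
ex_accepts_before(const struct ex_tcp_function_block *tfb, struct ex_tcpcb *tp)
{
	return (tp->t_state == EX_TCPS_CLOSED ||
	    (tfb->tfb_tcp_handoff_ok != NULL &&
	    tfb->tfb_tcp_handoff_ok(tp) == 0));
}

/* New rule: the callback is mandatory and is the only check. */
int
ex_accepts_after(const struct ex_tcp_function_block *tfb, struct ex_tcpcb *tp)
{
	return (tfb->tfb_tcp_handoff_ok(tp) == 0);
}

/* Hypothetical callback: accept CLOSED and ESTABLISHED, reject the rest. */
int
ex_handoff_ok(struct ex_tcpcb *tp)
{
	if (tp->t_state == EX_TCPS_CLOSED ||
	    tp->t_state == EX_TCPS_ESTABLISHED)
		return (0);
	return (1);
}
```

As long as a stack's callback returns 0 for a closed connection, both predicates agree, which matches the "no functional change" statement for the existing stacks.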
|
#
ea916b64 |
| 18-May-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Remove TCP_SAD optional code now that the sack filter performs this function.
With the commit of D44903 we no longer need the SAD option. Instead all stacks that use the sack filter inherit its protection against sack-attack.
Reviewed by: tuexen@ Differential Revision: https://reviews.freebsd.org/D45216
|
#
59884aea |
| 04-May-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: clean up macro usage in tcp_fixed_maxseg()
Replace local PAD macro with PADTCPOLEN macro. No functional change.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D45076
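For readers unfamiliar with what such a padding macro does, here is a minimal sketch. It assumes the macro rounds a TCP option length up to the next multiple of 4 bytes; `EX_PAD4` and `ex_padded_option_len` are illustrative names, and the definition is not copied from the tree.

```c
/* Illustrative stand-in: round an option length up to a multiple of 4. */
#define EX_PAD4(len) ((((len) + 3u) / 4u) * 4u)

/* Length an option of optlen bytes occupies once padded. */
unsigned int
ex_padded_option_len(unsigned int optlen)
{
	return (EX_PAD4(optlen));
}
```

For example, a 10-byte option occupies 12 bytes after padding, while a 4-byte option needs no padding.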
|
#
fce03f85 |
| 05-May-2024 |
Randall Stewart <rrs@FreeBSD.org> |
TCP can be subject to SACK attacks; let's fix this issue.
There is a type of attack that a TCP peer can launch on a connection. This certainly affects RACK and BBR, and probably even the default stack if it uses lists in SACK processing. The idea of the attack is that the attacker is driving you to look at hundreds of SACK blocks that only update 1 byte. So for example if you have 1 - 10,000 bytes outstanding the attacker sends in something like:
ACK 0 SACK(1-512) SACK(1024 - 1536), SACK(2048-2536), SACK(4096 - 4608), SACK(8192-8704)
This first SACK looks fine but then the attacker sends
ACK 0 SACK(1-512) SACK(1025 - 1537), SACK(2049-2537), SACK(4097 - 4609), SACK(8193-8705)
ACK 0 SACK(1-512) SACK(1027 - 1539), SACK(2051-2539), SACK(4099 - 4611), SACK(8195-8707)
...
These blocks are making you hunt across your linked list and split things up so that you have an entry for every other byte. As your list grows you spend more and more CPU running through the lists. The idea here is the attacker chooses entries as far apart as possible that make you run through the list. This example is small but in theory if the window is open to say 1Meg you could end up with hundreds of thousands of linked-list entries.
To combat this we introduce three things.
1. When the peer requests a very small MSS, we stop processing SACKs from them. This prevents a malicious peer from just using a small MSS to do the same thing.
2. Any time we get a SACK block, we use the sack-filter to remove SACKs that are smaller than the smallest v4 MSS (minus 40 for max TCP options) unless the block ties up to snd_max (since that is legal). All other SACKs should in theory be at least an MSS. Against such an attacker, this means we basically start skipping all but MSS-sized SACKed blocks.
3. The sack filter used to throw away data when its bounds were exceeded; now we instead increase its size to 15 and then throw away SACKs if the filter gets overrun. This prevents a malicious attacker from overrunning the sack filter so that we start to process things anyway.
The default stack will need to start using the sack-filter, which we have talked about in past conference calls, to take full advantage of the protections offered by it (and reduce CPU consumption when processing SACKs).
Once this set of changes is in, RACK can drop its SAD detection completely.
Reviewed by: tuexen@, rscheff@ Differential Revision: <https://reviews.freebsd.org/D44903>
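The small-block rule from the commit message above (drop SACK blocks shorter than a minimum length unless they end at snd_max) can be sketched as follows. This is an illustration under assumptions, not the sack-filter code: `ex_sack_block`, `ex_drop_small_sacks`, and the `min_len` parameter are hypothetical, and the real threshold is whatever the kernel derives from the smallest v4 MSS minus 40.

```c
#include <stddef.h>
#include <stdint.h>

/* One SACK block: bytes [start, end) in sequence space. */
struct ex_sack_block {
	uint32_t start;
	uint32_t end;
};

/*
 * Compact blk[] in place, keeping only blocks that are at least min_len
 * bytes long or that end exactly at snd_max. Returns the number kept.
 */
size_t
ex_drop_small_sacks(struct ex_sack_block *blk, size_t n, uint32_t snd_max,
    uint32_t min_len)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		/* Unsigned subtraction stays correct across sequence wrap. */
		uint32_t len = blk[i].end - blk[i].start;

		if (len >= min_len || blk[i].end == snd_max)
			blk[kept++] = blk[i];
	}
	return (kept);
}
```

With a threshold of 496 bytes and snd_max of 10000, a 512-byte block and a 10-byte block ending at 10000 are kept, while a 1-byte block in the middle of the window is dropped.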
|
#
6b454da6 |
| 03-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: address a warning
t_state is an unsigned variable, so no need for testing that it is non-negative.
Reported by: Coverity Scan CID: 1390885 Reviewed by: glebius MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44619
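The kind of check this warning is about can be sketched as below. The names `ex_state_in_range` and `EX_NSTATES` are hypothetical and the number of states is an assumed value; the point is only that a lower-bound test on an unsigned value is always true, so the upper bound is the only comparison that does any work.

```c
#include <stdbool.h>

#define EX_NSTATES 11u	/* assumed number of TCP states, for illustration */

/*
 * Range check on an unsigned state value. A form such as
 * "t_state >= 0 && t_state < EX_NSTATES" has a first comparison that can
 * never be false, which is what static analyzers flag.
 */
bool
ex_state_in_range(unsigned int t_state)
{
	return (t_state < EX_NSTATES);
}
```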
|
#
e0bd1801 |
| 03-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: fix conversion of rttvar
A wrong variable and wrong scaling factors were used.
Reported by: Coverity Scan CID: 1508689 Reviewed by: rscheff MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44612
|
#
1a8d1764 |
| 29-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
inpcb: fully retire inp_ppcb pointer
Before a protocol specific control block started to embed inpcb in self (see 0aa120d52f3c, e68b3792440c, 483fe96511ec) this pointer used to point at it.
Retain kf_sock_inpcb field in the struct kinfo_file in <sys/user.h>. The exp-run detected a minimal use of the field in ports:
* sysutils/lsof - patched upstream
* net-mgmt/netdata - patch accepted upstream
* emulators/qemu-user-static - upstream master branch seems not using the field anymore
We can keep the field around for some time, but eventually it may be reused for something else.
PR: 277659 (exp-run) Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D44491
|
#
e34ea019 |
| 18-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: clear all TCP timers in tcp_timer_stop() when in callout
When a TCP callout decides to disable self, e.g. tcp_timer_2msl() calling tcp_close(), we must also clear all other possible timers. Otherwise, upon return, the callout would be scheduled again in tcp_timer_enter().
Revert 57e27ff07aff, which was a temporary partial revert of otherwise correct 62d47d73b7eb, that exposed the problem being fixed now. Add an extra assertion in tcp_timer_enter() to check we aren't arming callout for a closed connection.
Reviewed by: rscheff
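The problem described above can be modeled with a small sketch: if a callout handler disables only the timer that fired, any other armed timer causes the callout to be scheduled again on return. Everything here is a hypothetical model (the `ex_` names, the timer count, and the "disarmed" sentinel are assumptions), not the kernel's timer code.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_NTIMERS  5
#define EX_DISARMED UINT64_MAX

struct ex_tcpcb {
	uint64_t deadline[EX_NTIMERS];	/* EX_DISARMED when not armed */
};

/* On return from a handler: would the callout be scheduled again? */
bool
ex_would_rearm(const struct ex_tcpcb *tp)
{
	for (int i = 0; i < EX_NTIMERS; i++)
		if (tp->deadline[i] != EX_DISARMED)
			return (true);
	return (false);
}

/* Stop path used from inside a running callout: clear every timer. */
void
ex_stop_all(struct ex_tcpcb *tp)
{
	for (int i = 0; i < EX_NTIMERS; i++)
		tp->deadline[i] = EX_DISARMED;
}
```

Clearing only the timer that fired leaves `ex_would_rearm()` true whenever another timer is pending; clearing all of them makes it false, which is the state a closed connection should be in.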
|
#
dd7b86e2 |
| 18-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
A bigger problem was that declaration of the macro in tcp_var.h depended on a kernel option. It is a bad practice to create such definitions in installable headers.
Reviewed by: rscheff, tuexen, kib Differential Revision: https://reviews.freebsd.org/D44362
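A direct flag test of the kind the commit prefers can be sketched as follows. The flag value and the `ex_` names are placeholders chosen for illustration, not taken from tcp_var.h; the point is that the check is a plain bit test with no dependence on a kernel option.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_TF_FASTOPEN 0x00080000u	/* placeholder bit value */

struct ex_tcpcb {
	uint32_t t_flags;
};

/* Plain bit test on t_flags, the same way other flags are checked. */
bool
ex_uses_fastopen(const struct ex_tcpcb *tp)
{
	return ((tp->t_flags & EX_TF_FASTOPEN) != 0);
}
```

Because the test does not change with the kernel configuration, code that includes the installed header sees the same definition regardless of how the kernel was built.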
|
#
e18b97bd |
| 12-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal with the changes that have been in hpts for a while i.e. only one call no matter if mbuf queue or tcp_output.
The new file rack_pcm.c basically does little except BBlogs and is a placeholder for future work on doing path capacity measurements.
With a bit of a struggle with git I finally got rack_pcm.c into place (apologies for not noticing this error). The LINT kernel is running on my box now .. sigh.
Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D43986
|
#
c112243f |
| 11-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
Revert "Update to bring the rack stack with all its fixes in."
This commit was incomplete and breaks LINT kernels. The tree has been broken for 8+ hours.
This reverts commit f6d489f402c320f1a6eaa473491a0b8c3878113e.
|
#
f6d489f4 |
| 11-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal with the changes that have been in hpts for a while i.e. only one call no matter if mbuf queue or tcp_output.
Note there is a new file, rack_pcm.c, that I can't figure out how to get in.
It basically does little except BBlogs and is a placeholder for future work on doing path capacity measurements.
Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D43986
|
#
57e27ff0 |
| 12-Feb-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: partially undo D43792
At the destruction of the tcpcb, no timers are supposed to be running. However, it turns out that stopping them in the close() / shutdown() call does not have the desired effect under all circumstances.
This partially reverts 62d47d73b7eb to reduce the nuisance caused.
PR: 277009 Reported-by: syzbot+9a9aa434a14a2b35c3ba@syzkaller.appspotmail.com Reported-by: syzbot+e82856782410e895bae7@syzkaller.appspotmail.com Reviewed By: glebius, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43855
|
#
62d47d73 |
| 10-Feb-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: stop timers and clean scoreboard in tcp_close()
Stop timers when in tcp_close() instead of doing that in tcp_discardcb(). A connection in CLOSED state shall not need any timers. Assert that no timer is rescheduled after that in tcp_timer_activate() and verify that this is also the expected state in tcp_discardcb().
PR: 276761 Reviewed By: glebius, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43792
|
#
3eeb22cb |
| 10-Feb-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: clean scoreboard when releasing the socket buffer
The SACK scoreboard is conceptually an extension of the socket buffer. Remove it when the socket buffer goes away with soisdisconnected(). Verify that this is also the expected state in tcp_discardcb().
PR: 276761 Reviewed by: glebius, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43805
|
#
3f46be6a |
| 07-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: let tcp_hpts_init() set a random CPU only once
After d2ef52ef3dee the tcp_hpts_init() function can be called multiple times on a tcpcb if it is switched there and back between two TCP stacks. First, this makes existing assertion in tcp_hpts_init() incorrect. Second, it creates possibility to change a randomly set t_hpts_cpu to a different random value, while a tcpcb is already in the HPTS wheel, triggering other assertions later in tcp_hptsi().
The best approach here would be to work on the stacks to really clear a tcpcb out of HPTS wheel in tfb_tcp_fb_fini, draining the IHPTS_MOVING state. But that's pretty intrusive change, so let's just get back to the old logic (pre d2ef52ef3dee) where t_hpts_cpu was set to a random value only once in a CPU lifetime and a newly switched stack inherits t_hpts_cpu from the previous stack.
Reviewed by: rrs, tuexen Differential Revision: https://reviews.freebsd.org/D42946 Reported-by: syzbot+fab29fe1ab089c52998d@syzkaller.appspotmail.com Reported-by: syzbot+ca5f2aa0fda15dcfe6d7@syzkaller.appspotmail.com Fixes: 2b3a77467dd3d74a7170f279fb25f9736b46ef8a
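The "set a random CPU only once" behavior can be sketched with a sentinel check. This is a hedged model: `ex_hpts_init`, `EX_CPU_NONE`, and the use of `rand()` are assumptions made for illustration; only the `t_hpts_cpu` field name and the once-only semantics come from the commit message.

```c
#include <stdint.h>
#include <stdlib.h>

#define EX_CPU_NONE ((uint16_t)-1)	/* sentinel: no CPU chosen yet */

struct ex_tcpcb {
	uint16_t t_hpts_cpu;
};

/*
 * May be called once per stack switch. Only the first call picks a CPU;
 * later calls leave the inherited value alone. ncpus must be non-zero.
 */
void
ex_hpts_init(struct ex_tcpcb *tp, uint16_t ncpus)
{
	if (tp->t_hpts_cpu == EX_CPU_NONE)
		tp->t_hpts_cpu = (uint16_t)(rand() % ncpus);
}
```

A tcpcb that is switched to another stack and back keeps its original CPU, so it cannot be moved while it may still be in the HPTS wheel.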
|
#
ade05d63 |
| 07-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: stop stack timers in tcp_switch_back_to_default()
This function is an alternative code path that detaches an alternative TCP stack, missed in d2ef52ef3dee38cccb7f54d33ecc2a4b944dad9d.
Reviewed by: rrs, tuexen Differential Revision: https://reviews.freebsd.org/D42917 Reported-by: syzbot+186130be9f0ca5557d4e@syzkaller.appspotmail.com Fixes: d2ef52ef3dee38cccb7f54d33ecc2a4b944dad9d
|
#
d2ef52ef |
| 04-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp/hpts: make stacks responsible for clearing themselves out HPTS
There already is the tfb_tcp_timer_stop_all method that is supposed to stop all time events associated with a given tcpcb by a given stack. Some time ago it did an actual callout_stop(). Today bbr/rack just mark their internal state as inactive in their tfb_tcp_timer_stop_all methods, but the tcpcb stays in the HPTS wheel and can potentially be called in from HPTS. Change the methods to also call tcp_hpts_remove(). Note: I'm not sure if the internal flag is still relevant once we are out of the HPTS wheel.
Call the method when connection goes into TCP_CLOSED state, instead of calling it later when tcpcb is freed. Also call it when we switch between stacks.
Reviewed by: tuexen, rrs Differential Revision: https://reviews.freebsd.org/D42857
|
#
2b3a7746 |
| 04-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
hpts: make stacks responsible for tcp_hpts_init()
Those stacks that use HPTS should care about init, not generic code.
Reviewed by: imp, tuexen, rrs Differential Revision: https://reviews.freebsd.org/D42856
|
#
8e907391 |
| 04-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
hpts: don't ifdef tcp_in_hpts()
This small inline function is always available.
Reviewed by: imp, tuexen, rrs Differential Revision: https://reviews.freebsd.org/D42855
|
#
29363fb4 |
| 23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script.
Sponsored by: Netflix
|
#
219a6ca9 |
| 21-Nov-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: uninline tcp_account_for_send()
This allows removing the inclusion of "opt_kern_tls.h" from a system header.
Reviewed by: rscheff, tuexen Differential Revision: https://reviews.freebsd.org/D42696
|
#
38ecc80b |
| 08-Oct-2023 |
Zhenlei Huang <zlei@FreeBSD.org> |
tcp: Simplify the initialization of loader tunable 'net.inet.tcp.tcbhashsize'
No functional change intended.
Reviewed by: cc, rscheff, #transport MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D41998
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
d66540e8 |
| 05-Jun-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: improve sending of TTL/hoplimit and DSCP
Ensure that a user specified value of TTL/hoplimit and DSCP is used when sending packets.
Reviewed by: cc, rscheff MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D40423
|