History log of /freebsd/sys/netinet/tcp_subr.c (Results 1 – 25 of 661)
Revision Date Author Comments
# 86c9325d 06-Jun-2024 Michael Tuexen <tuexen@FreeBSD.org>

tcp: simplify stack switching protocol

Before this patch, a stack (tfb) accepts a tcpcb (tp), if the
tp->t_state is TCPS_CLOSED or tfb->tfb_tcp_handoff_ok is not NULL
and tfb->tfb_tcp_handoff_ok(tp) returns 0.
After this patch, the only check is tfb->tfb_tcp_handoff_ok(tp)
returns 0. tfb->tfb_tcp_handoff_ok must always be provided.
For existing TCP stacks (FreeBSD, RACK and BBR) there is no
functional change. However, the logic is simpler.

Reviewed by: lstewart, peter_lei_ieee_.org, rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D45253
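
The acceptance rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the kernel code: the structures are stripped-down stand-ins, and handoff_accept(), handoff_refuse(), stack_accepts_old() and stack_accepts_new() are hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel definitions. */
#define TCPS_CLOSED 0

struct tcpcb {
	unsigned int t_state;
};

struct tcp_function_block {
	/* Returns 0 when the stack is willing to take over tp. */
	int (*tfb_tcp_handoff_ok)(struct tcpcb *tp);
};

/* Illustrative callbacks, not taken from any real stack. */
static int
handoff_accept(struct tcpcb *tp)
{
	(void)tp;
	return (0);
}

static int
handoff_refuse(struct tcpcb *tp)
{
	(void)tp;
	return (1);
}

/* Rule before the patch: CLOSED is always accepted, callback is optional. */
static bool
stack_accepts_old(const struct tcp_function_block *tfb, struct tcpcb *tp)
{
	if (tp->t_state == TCPS_CLOSED)
		return (true);
	return (tfb->tfb_tcp_handoff_ok != NULL &&
	    tfb->tfb_tcp_handoff_ok(tp) == 0);
}

/* Rule after the patch: the callback is mandatory and is the only check. */
static bool
stack_accepts_new(const struct tcp_function_block *tfb, struct tcpcb *tp)
{
	return (tfb->tfb_tcp_handoff_ok(tp) == 0);
}
```

The two rules differ only for a stack whose callback rejects a tcpcb in TCPS_CLOSED or that provides no callback at all. The commit message states that the existing stacks are unaffected.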


# ea916b64 18-May-2024 Randall Stewart <rrs@FreeBSD.org>

Remove TCP_SAD optional code now that the sack filter performs this function.

With the commit of D44903 we no longer need the SAD option. Instead all stacks that
use the sack filter inherit its protection against sack-attack.

Reviewed by: tuexen@
Differential Revision: https://reviews.freebsd.org/D45216


# 59884aea 04-May-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: clean up macro usage in tcp_fixed_maxseg()

Replace local PAD macro with PADTCPOLEN macro
No functional change.

Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D45076


# fce03f85 05-May-2024 Randall Stewart <rrs@FreeBSD.org>

TCP can be subject to SACK attacks; let's fix this issue.

There is a type of attack that a TCP peer can launch on a connection. RACK and BBR are certainly affected, and the default stack probably is too if it uses lists in SACK processing. The attacker drives you to look at hundreds of SACK blocks that each update only 1 byte. For example, if you have bytes 1 - 10,000 outstanding, the attacker sends something like:

ACK 0 SACK(1-512) SACK(1024 - 1536), SACK(2048-2536), SACK(4096 - 4608), SACK(8192-8704)
This first sack looks fine but then the attacker sends

ACK 0 SACK(1-512) SACK(1025 - 1537), SACK(2049-2537), SACK(4097 - 4609), SACK(8193-8705)
ACK 0 SACK(1-512) SACK(1027 - 1539), SACK(2051-2539), SACK(4099 - 4611), SACK(8195-8707)
...
These blocks make you hunt across your linked list and split things up so that you have an entry for every other byte. As your list grows, you spend more and more CPU running through it. The attacker chooses entries as far apart as possible so that you have to run through the list. This example is small, but in theory, if the window is open to, say, 1 MB, you could end up with hundreds of thousands of linked-list entries.

To combat this we introduce three things:

1. When the peer requests a very small MSS, we stop processing SACKs from them. This prevents a malicious peer from just using a small MSS to do the same thing.
2. Any time we get a SACK block, we use the sack-filter to remove SACK blocks that are smaller than the smallest v4 MSS (minus 40 for max TCP options), unless the block ties up to snd_max (since that is legal). All other SACK blocks should in theory be at least an MSS. Against such an attacker, this means we basically skip all but MSS-sized SACK blocks.
3. The sack filter used to throw away data when its bounds were exceeded. Now we instead increase its size to 15 and throw away SACKs if the filter gets overrun, so that a malicious attacker cannot overrun the sack filter and make us process things anyway.

The default stack will need to start using the sack-filter, which we have talked about in past conference calls, to take full advantage of the protections it offers (and to reduce CPU consumption when processing SACKs).

After this set of changes is in, rack can drop its SAD detection completely.

Reviewed by: tuexen@, rscheff@
Differential Revision: https://reviews.freebsd.org/D44903
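
The size-based filtering described above (drop SACK blocks shorter than a minimum length unless they end at snd_max) can be sketched roughly as follows. This is an illustration only, not the kernel's sack filter, which also tracks previously seen blocks. The function name filter_small_sacks() and the min_len parameter are assumptions made for this example; the real threshold is derived from the minimum IPv4 MSS as the commit message explains.

```c
#include <stddef.h>
#include <stdint.h>

struct sackblk {
	uint32_t start;	/* first sequence number covered */
	uint32_t end;	/* one past the last sequence number covered */
};

/*
 * Compact blks[] in place, keeping only blocks that are at least
 * min_len bytes long or that end exactly at snd_max.  Returns the
 * number of blocks kept.  Hypothetical helper for illustration.
 */
static size_t
filter_small_sacks(struct sackblk *blks, size_t n, uint32_t snd_max,
    uint32_t min_len)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		/* Unsigned subtraction stays correct across sequence wrap. */
		uint32_t len = blks[i].end - blks[i].start;

		if (len >= min_len || blks[i].end == snd_max)
			blks[kept++] = blks[i];
	}
	return (kept);
}
```

With this rule, the one-byte shifts in the attack example stop producing new list entries, because each newly reported fragment is far shorter than the threshold.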


# 6b454da6 03-Apr-2024 Michael Tuexen <tuexen@FreeBSD.org>

tcp: address a warning

t_state is an unsigned variable, so no need for testing that it is
non-negative.

Reported by: Coverity Scan
CID: 1390885
Reviewed by: glebius
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D44619
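
The kind of warning addressed here can be illustrated with a small sketch. The function name state_is_valid() is hypothetical, and the value of TCP_NSTATES is an assumption for this example; only the point about unsigned comparison comes from the commit message.

```c
#include <stdbool.h>

/* Assumed number of TCP states for this example. */
#define TCP_NSTATES 11

/*
 * Range check for an unsigned state value.  A test such as
 * "t_state >= 0" is always true for an unsigned type, so only the
 * upper bound needs checking.  A negative value converted to
 * unsigned becomes very large and fails the upper bound anyway.
 */
static bool
state_is_valid(unsigned int t_state)
{
	return (t_state < TCP_NSTATES);
}
```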


# e0bd1801 03-Apr-2024 Michael Tuexen <tuexen@FreeBSD.org>

tcp: fix conversion of rttvar

A wrong variable and wrong scaling factors were used.

Reported by: Coverity Scan
CID: 1508689
Reviewed by: rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D44612


# 1a8d1764 29-Mar-2024 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: fully retire inp_ppcb pointer

Before a protocol specific control block started to embed inpcb in self
(see 0aa120d52f3c, e68b3792440c, 483fe96511ec) this pointer used to point
at it.

Retain kf_sock_inpcb field in the struct kinfo_file in <sys/user.h>. The
exp-run detected a minimal use of the field in ports:
* sysutils/lsof - patched upstream
* net-mgmt/netdata - patch accepted upstream
* emulators/qemu-user-static - upstream master branch no longer seems to
use the field
We can keep the field around for some time, but eventually it may be
reused for something else.

PR: 277659 (exp-run)
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D44491


# e34ea019 18-Mar-2024 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: clear all TCP timers in tcp_timer_stop() when in callout

When a TCP callout decides to disable self, e.g. tcp_timer_2msl() calling
tcp_close(), we must also clear all other possible timers. Otherwise,
upon return, the callout would be scheduled again in tcp_timer_enter().

Revert 57e27ff07aff, which was a temporary partial revert of otherwise
correct 62d47d73b7eb, that exposed the problem being fixed now. Add an
extra assertion in tcp_timer_enter() to check we aren't arming callout for
a closed connection.

Reviewed by: rscheff


# dd7b86e2 18-Mar-2024 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: remove IS_FASTOPEN() macro

The macro is more obfuscating than helping as it just checks a single flag
of t_flags. All other t_flags bits are checked without a macro.

A bigger problem was that declaration of the macro in tcp_var.h depended
on a kernel option. It is a bad practice to create such definitions in
installable headers.

Reviewed by: rscheff, tuexen, kib
Differential Revision: https://reviews.freebsd.org/D44362


# e18b97bd 12-Mar-2024 Randall Stewart <rrs@FreeBSD.org>

Update to bring the rack stack with all its fixes in.

This brings the rack stack up to the current level used at NF. Many fixes
and improvements have been added. I also add in a fix to BBR to deal with
the changes that have been in hpts for a while i.e. only one call no matter
if mbuf queue or tcp_output.

It basically does little except BBlogs and is a placemark for future work on
doing path capacity measurements.

With a bit of a struggle with git I finally got rack_pcm.c into place (apologies
for not noticing this error). The LINT kernel is running on my box now .. sigh.

Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D43986


# c112243f 11-Mar-2024 Brooks Davis <brooks@FreeBSD.org>

Revert "Update to bring the rack stack with all its fixes in."

This commit was incomplete and breaks LINT kernels. The tree has been
broken for 8+ hours.

This reverts commit f6d489f402c320f1a6eaa473491a0b8c3878113e.


# f6d489f4 11-Mar-2024 Randall Stewart <rrs@FreeBSD.org>

Update to bring the rack stack with all its fixes in.

This brings the rack stack up to the current level used at NF. Many fixes
and improvements have been added. I also add in a fix to BBR to deal with
the changes that have been in hpts for a while i.e. only one call no matter
if mbuf queue or tcp_output.

Note there is a new file that I can't figure out how to get in rack_pcm.c

It basically does little except BBlogs and is a placemark for future work on
doing path capacity measurements.

Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D43986


# 57e27ff0 12-Feb-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: partially undo D43792

At the destruction of the tcpcb, no timers are supposed to
be running. However, it turns out that stopping them in the
close() / shutdown() call does not have the desired effect
under all circumstances.

This partially reverts 62d47d73b7eb to reduce the nuisance
caused.

PR: 277009
Reported-by: syzbot+9a9aa434a14a2b35c3ba@syzkaller.appspotmail.com
Reported-by: syzbot+e82856782410e895bae7@syzkaller.appspotmail.com
Reviewed By: glebius, tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43855


# 62d47d73 10-Feb-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: stop timers and clean scoreboard in tcp_close()

Stop timers when in tcp_close() instead of doing that in tcp_discardcb().
A connection in CLOSED state shall not need any timers. Assert that no
timer is rescheduled after that in tcp_timer_activate() and verify that
this is also the expected state in tcp_discardcb().

PR: 276761
Reviewed By: glebius, tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43792


# 3eeb22cb 10-Feb-2024 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: clean scoreboard when releasing the socket buffer

The SACK scoreboard is conceptually an extension of the socket
buffer. Remove it when the socket buffer goes away with
soisdisconnected(). Verify that this is also the expected
state in tcp_discardcb().

PR: 276761
Reviewed by: glebius, tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43805


# 3f46be6a 07-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: let tcp_hpts_init() set a random CPU only once

After d2ef52ef3dee the tcp_hpts_init() function can be called multiple
times on a tcpcb if it is switched there and back between two TCP stacks.
First, this makes existing assertion in tcp_hpts_init() incorrect. Second,
it creates possibility to change a randomly set t_hpts_cpu to a different
random value, while a tcpcb is already in the HPTS wheel, triggering other
assertions later in tcp_hptsi().

The best approach here would be to work on the stacks to really clear a
tcpcb out of HPTS wheel in tfb_tcp_fb_fini, draining the IHPTS_MOVING
state. But that's pretty intrusive change, so let's just get back to the
old logic (pre d2ef52ef3dee) where t_hpts_cpu was set to a random value
only once in a CPU lifetime and a newly switched stack inherits t_hpts_cpu
from the previous stack.

Reviewed by: rrs, tuexen
Differential Revision: https://reviews.freebsd.org/D42946
Reported-by: syzbot+fab29fe1ab089c52998d@syzkaller.appspotmail.com
Reported-by: syzbot+ca5f2aa0fda15dcfe6d7@syzkaller.appspotmail.com
Fixes: 2b3a77467dd3d74a7170f279fb25f9736b46ef8a
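
The "set a random CPU only once" behaviour can be sketched as an idempotent initializer. This is a simplified illustration, not the kernel implementation: the sentinel HPTS_CPU_NONE, the function hpts_init_cpu() and the use of rand() are all assumptions made for this example, and struct tcpcb is reduced to the one field under discussion.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sentinel meaning "no CPU chosen yet". */
#define HPTS_CPU_NONE 0xffff

struct tcpcb {
	uint16_t t_hpts_cpu;
};

/*
 * Pick a random CPU the first time only.  Later calls, for example
 * after switching to another stack and back, leave the value alone so
 * the new stack inherits it.  ncpus must be greater than zero.
 */
static void
hpts_init_cpu(struct tcpcb *tp, uint16_t ncpus)
{
	if (tp->t_hpts_cpu != HPTS_CPU_NONE)
		return;
	tp->t_hpts_cpu = (uint16_t)(rand() % ncpus);
}
```

Because repeated calls are harmless, an assertion that the function runs exactly once per tcpcb is no longer needed, and the CPU value cannot change while the tcpcb is still in the HPTS wheel.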


# ade05d63 07-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: stop stack timers in tcp_switch_back_to_default()

This function is an alternative code path that detaches an alternative
TCP stack, missed in d2ef52ef3dee38cccb7f54d33ecc2a4b944dad9d.

Reviewed by: rrs, tuexen
Differential Revision: https://reviews.freebsd.org/D42917
Reported-by: syzbot+186130be9f0ca5557d4e@syzkaller.appspotmail.com
Fixes: d2ef52ef3dee38cccb7f54d33ecc2a4b944dad9d


# d2ef52ef 04-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp/hpts: make stacks responsible for clearing themselves out HPTS

There already is the tfb_tcp_timer_stop_all method that is supposed to stop
all time events associated with a given tcpcb by given stack. Some time
ago it was doing actual callout_stop(). Today bbr/rack just mark their
internal state as inactive in their tfb_tcp_timer_stop_all methods, but
tcpcb stays in HPTS wheel and potentially called in from HPTS. Change the
methods to also call tcp_hpts_remove(). Note: I'm not sure if internal
flag is still relevant once we are out of HPTS wheel.

Call the method when connection goes into TCP_CLOSED state, instead of
calling it later when tcpcb is freed. Also call it when we switch between
stacks.

Reviewed by: tuexen, rrs
Differential Revision: https://reviews.freebsd.org/D42857


# 2b3a7746 04-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

hpts: make stacks responsible for tcp_hpts_init()

Those stacks that use HPTS should care about init, not generic code.

Reviewed by: imp, tuexen, rrs
Differential Revision: https://reviews.freebsd.org/D42856


# 8e907391 04-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

hpts: don't ifdef tcp_in_hpts()

This small inline function is always available.

Reviewed by: imp, tuexen, rrs
Differential Revision: https://reviews.freebsd.org/D42855


# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 219a6ca9 21-Nov-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: uninline tcp_account_for_send()

This allows removing the inclusion of "opt_kern_tls.h" from a system header.

Reviewed by: rscheff, tuexen
Differential Revision: https://reviews.freebsd.org/D42696


# 38ecc80b 08-Oct-2023 Zhenlei Huang <zlei@FreeBSD.org>

tcp: Simplify the initialization of loader tunable 'net.inet.tcp.tcbhashsize'

No functional change intended.

Reviewed by: cc, rscheff, #transport
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41998


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# d66540e8 05-Jun-2023 Michael Tuexen <tuexen@FreeBSD.org>

tcp: improve sending of TTL/hoplimit and DSCP

Ensure that a user specified value of TTL/hoplimit and DSCP is
used when sending packets.

Reviewed by: cc, rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D40423
