History log of /openbsd/sys/net/pf.c (Results 1 – 25 of 1196)
Revision Date Author Comments
# 3b372c34 14-May-2024 jsg <jsg@openbsd.org>

remove prototypes with no matching function


# 8de813b2 10-May-2024 jsg <jsg@openbsd.org>

make pf_match_rule() prototype match the function


# 93536db2 12-Apr-2024 bluhm <bluhm@openbsd.org>

Split single TCP inpcb table into IPv4 and IPv6 parts.

With two separate TCP hash tables, each one becomes smaller. When
we remove the exclusive net lock from TCP, contention on internet
PCB table

Split single TCP inpcb table into IPv4 and IPv6 parts.

With two separate TCP hash tables, each one becomes smaller. When
we remove the exclusive net lock from TCP, contention on internet
PCB table mutex will be reduced. UDP has been split earlier into
IPv4 and IPv6. Replace branch conditions based on INP_IPV6 with
assertions.

OK mvs@

show more ...


# 1ef7e4b4 10-Jan-2024 bluhm <bluhm@openbsd.org>

Split UDP PCB table into IPv4 and IPv6.

Having two hash tables instead of a common one, reduces table size
and contention on the per table lock. The address family is always
known in advance. The

Split UDP PCB table into IPv4 and IPv6.

Having two hash tables instead of a common one, reduces table size
and contention on the per table lock. The address family is always
known in advance. The lookups and loops are more specific.

OK sashan@

show more ...


# 034f31ce 01-Jan-2024 bluhm <bluhm@openbsd.org>

Protect link between pf and inp with mutex.

Introduce global mutex to protect the pointers between pf state key
and internet PCB. Then in_pcbdisconnect() and in_pcbdetach() do
not need exclusive ne

Protect link between pf and inp with mutex.

Introduce global mutex to protect the pointers between pf state key
and internet PCB. Then in_pcbdisconnect() and in_pcbdetach() do
not need exclusive netlock anymore. Use a bunch of read once
unlocked access to reduce performance impact.

OK sashan@

show more ...


# 5851f6d7 01-Jan-2024 bluhm <bluhm@openbsd.org>

Fix white space in pf.c.


# 00a8a9e1 28-Dec-2023 aisha <aisha@openbsd.org>

use RB_FOREACH_SAFE for pf_purge_expired_src_nodes

OK bluhm@


# 9d9f4dc6 01-Dec-2023 sashan <sashan@openbsd.org>

Prevent race between pf_test() and pf_purge_expired_states().
Packets (callers to pf_test()) must alter pf_state::timeout
under protection of pf_state::mtx. We also have to make sure
the packet does

Prevent race between pf_test() and pf_purge_expired_states().
Packets (callers to pf_test()) must alter pf_state::timeout
under protection of pf_state::mtx. We also have to make sure
the packet does not update pf_state::timeout when ::timeout
reaches PFTM_UNLINKED.

The first report came from Johan Huldtgren, but he is not
the single user who has noticed "st->timeout == PFTM_UNLINKED"
assert violation.

OK bluhm@

show more ...


# 89860f87 10-Oct-2023 bluhm <bluhm@openbsd.org>

pf(4) must not pass packet if state cannot be created.

The behavior of the PFRULE_SRCTRACK and max_states check was
unintentionally changed by commit revision 1.964. If the state was
not created du

pf(4) must not pass packet if state cannot be created.

The behavior of the PFRULE_SRCTRACK and max_states check was
unintentionally changed by commit revision 1.964. If the state was
not created due to some limit had been reached, pf still passed the
packet. Restore the old logic by setting action to pass later,
after the checks. In pf_test_rule() action is initialized to drop.

OK sashan@

show more ...


# 46650f23 10-Oct-2023 bluhm <bluhm@openbsd.org>

Remove dead code in pf_pull_hdr().

pf_pull_hdr() allows to pass an action pointer parameter as output
value. This is never used, all callers pass a NULL argument. Remove
ACTION_SET() entirely.

Th

Remove dead code in pf_pull_hdr().

pf_pull_hdr() allows to pass an action pointer parameter as output
value. This is never used, all callers pass a NULL argument. Remove
ACTION_SET() entirely.

The logic (fragoff >= len) in pf_pull_hdr() does not work since
revision 1.4. Before it was used to drop short TCP or UDP fragments
that contained only part of the header. Current code in pf_pull_hdr()
drops the packets anyway, so always set reason PFRES_FRAG.

OK kn@ sashan@

show more ...


# dbfb0ac1 08-Sep-2023 naddy <naddy@openbsd.org>

revert previous

The change broke IPv6 neighbor discovery, and anton@ reports several
regression test failures.

ok bluhm@


# 37052907 07-Sep-2023 sashan <sashan@openbsd.org>

pf(4) ignores 'keep state' and 'nat-to' actions for unsolicited
icmp error responses. Fix tightens rule matching logic so icmp
error responses no longer match 'keep state' rule. In typical
scenarios

pf(4) ignores 'keep state' and 'nat-to' actions for unsolicited
icmp error responses. Fix tightens rule matching logic so icmp
error responses no longer match 'keep state' rule. In typical
scenarios icmp errors (if solicited) should match existing state.
The change is going to bite firewalls which deal with asymmetric
routes. In those cases the 'keep state' action should be relaxed
to sloppy or new 'no state' rule to explicitly match icmp
errors should be added.

The issue has been reported by Peter J. Philip (pjp _at_ delphinusdns.org).

Discussed with bluhm@ and florian@

OK bluhm@

show more ...


# 740063f5 31-Jul-2023 dlg <dlg@openbsd.org>

don't let pfsync send an insert message for a state pfsync just inserted

sthen@ upgraded and ended up with a lot of pfsync traffic which was
mostly made up of the two firewalls telling each other to

don't let pfsync send an insert message for a state pfsync just inserted

sthen@ upgraded and ended up with a lot of pfsync traffic which was
mostly made up of the two firewalls telling each other to insert
the same state over and over again.

this has each of the paths that insert states (actual pf, ioctls,
and pfsync) identify themselves so pfsync can enter them into its
own state machine in the right place. when pfsync inserts a state
into pf, it knows it should just swallow the state silently without
sending out another insert for it.

ok sthen@ sashan@

show more ...


# 5ebaba9d 07-Jul-2023 bluhm <bluhm@openbsd.org>

Fix path MTU discovery for TCP LRO/TSO when forwarding.

When doing LRO (Large Receive Offload), the drivers, currently ix(4)
and lo(4) only, record an upper bound of the size of the original
packets

Fix path MTU discovery for TCP LRO/TSO when forwarding.

When doing LRO (Large Receive Offload), the drivers, currently ix(4)
and lo(4) only, record an upper bound of the size of the original
packets in ph_mss. When sending, either stack or hardware must
chop the packets with TSO (TCP Segmentation Offload) to that size.
That means we have to call tcp_if_output_tso() before ifp->if_output().
Put that logic into if_output_tso() to avoid code duplication. As
TCP packets on the wire do not get larger that way, path MTU discovery
should still work.

tested by and OK jan@

show more ...


# cc90b7e6 06-Jul-2023 dlg <dlg@openbsd.org>

big update to pfsync to try and clean up locking in particular.

moving pf forward has been a real struggle, and pfsync has been a
constant source of pain. we have been papering over the problems
for

big update to pfsync to try and clean up locking in particular.

moving pf forward has been a real struggle, and pfsync has been a
constant source of pain. we have been papering over the problems
for a while now, but it reached the point that it needed a fundamental
restructure, which is what this diff is.

the big headliner changes in this diff are:

- pfsync specific locks

this is the whole reason for this diff.

rather than rely on NET_LOCK or KERNEL_LOCK or whatever, pfsync now
has it's own locks to protect it's internal data structures. this
is important because pfsync runs a bunch of timeouts and tasks to
push pfsync packets out on the wire, or when it's handling requests
generated by incoming pfsync packets, both of which happen outside
pf itself running. having pfsync specific locks around pfsync data
structures makes the mutations of these data structures a lot more
explicit and auditable.

- partitioning

to enable future parallelisation of the network stack, this rewrite
includes support for pfsync to partition states into different "slices".
these slices run independently, ie, the states collected by one slice
are serialised into a separate packet to the states collected and
serialised by another slice.

states are mapped to pfsync slices based on the pf state hash, which
is the same hash that the rest of the network stack and multiq
hardware uses.

- no more pfsync called from netisr

pfsync used to be called from netisr to try and bundle packets, but now
that there's multiple pfsync slices this doesnt make sense. instead it
uses tasks in softnet tqs.

- improved bulk transfer handling

there's shiny new state machines around both the bulk transmit and
receive handling. pfsync used to do horrible things to carp demotion
counters, but now it is very predictable and returns the counters back
where they started.

- better tdb handling

the tdb handling was pretty hairy, but hrvoje has kicked this around
a lot with ipsec and sasyncd and we've found and fixed a bunch of
issues as a result of that testing.

- mpsafe pf state purges

this was committed previously, but because the locks pfsync relied on
weren't clear this just caused a ton of bugs. as part of this diff it's
now reliable, and moves a big chunk of work out from under KERNEL_LOCK,
which in turn improves the responsiveness and throughput of a firewall
even if you're not using pfsync.

there's a bunch of other little changes along the way, but the above are
the big ones.

hrvoje has done performance testing with this diff and notes a big
improvement when pfsync is not in use. performance when pfsync is
enabled is about the same, but im hoping the slices means we can scale
along with pf as it improves.

lots (months) of testing by me and hrvoje on pfsync boxes
tests and ok sashan@
deraadt@ says this is a good time to put it in

show more ...


# 0c23d001 05-Jun-2023 sashan <sashan@openbsd.org>

pf_remove_state() should not attempt to remove state which
is already removed.

OK dlg@


# 510f4386 15-May-2023 bluhm <bluhm@openbsd.org>

Implement the TCP/IP layer for hardware TCP segmentation offload.
If the driver of a network interface claims to support TSO, do not
chop the packet in software, but pass it down to the interface
lay

Implement the TCP/IP layer for hardware TCP segmentation offload.
If the driver of a network interface claims to support TSO, do not
chop the packet in software, but pass it down to the interface
layer.
Precalculate parts of the pseudo header checksum, but without the
packet length. The length of all generated smaller packets is not
known yet. Driver and hardware will use the mbuf packet header
field ph_mss to calculate it and update checksum.
Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware
might support ony one protocol family. The old flag IFXF_TSO is
only relevant for large receive offload. It is missnamed, but keep
that for now.
Note that drivers do not set TSO capabilites yet. Also the ifconfig
flags and pseudo interfaces capabilities will be done separately.
So this commit should not change behavior.
heavily based on the work from jan@; OK sashan@

show more ...


# 55055d61 13-May-2023 bluhm <bluhm@openbsd.org>

Instead of implementing IPv4 header checksum creation everywhere,
introduce in_hdr_cksum_out(). It is used like in_proto_cksum_out().
OK claudio@


# c06845b1 10-May-2023 bluhm <bluhm@openbsd.org>

Implement TCP send offloading, for now in software only. This is
meant as a fallback if network hardware does not support TSO. Driver
support is still work in progress. TCP output generates large

Implement TCP send offloading, for now in software only. This is
meant as a fallback if network hardware does not support TSO. Driver
support is still work in progress. TCP output generates large
packets. In IP output the packet is chopped to TCP maximum segment
size. This reduces the CPU cycles used by pf. The regular output
could be assisted by hardware later, but pf route-to and IPsec needs
the software fallback in general.
For performance comparison or to workaround possible bugs, sysctl
net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows
TSO counter with chopped and generated packets.
based on work from jan@
tested by jmc@ jan@ Hrvoje Popovski
OK jan@ claudio@

show more ...


# 003cfddf 08-May-2023 bluhm <bluhm@openbsd.org>

The call to in_proto_cksum_out() is only needed before the packet
is passed to ifp->if_output(). The fragment code has its own
checksum calculation and the other paths end in goto bad.
OK claudio@


# b8646e37 07-May-2023 bluhm <bluhm@openbsd.org>

I preparation for TSO in software, cleanup the fragment code. Use
if_output_ml() to send mbuf lists to interfaces. This can be used
for TSO, fragments, ARP and ND6. Rename variable fml to ml. In

I preparation for TSO in software, cleanup the fragment code. Use
if_output_ml() to send mbuf lists to interfaces. This can be used
for TSO, fragments, ARP and ND6. Rename variable fml to ml. In
pf_route6() split the if else block. Put the safety check (hlen +
firstlen < tlen) into ip_fragment(). It makes the code correct in
case the packet is too short to be fragmented. This should not
happen, but other functions also have this logic.
No functional change. OK sashan@

show more ...


# 3e519755 03-May-2023 kn <kn@openbsd.org>

Remove net lock from DIOCGETRULESET and DIOCGETRULESETS

Both walk the list of rulesets aka. anchors, to yield a total count and
specific anchor name, respectively. Same access, different copy out.

Remove net lock from DIOCGETRULESET and DIOCGETRULESETS

Both walk the list of rulesets aka. anchors, to yield a total count and
specific anchor name, respectively. Same access, different copy out.

pf_anchor_global are contained within pf_ioctl.c and pf_ruleset.c and
fully protected by the pf lock, as is pf_main_ruleset and its pf.c usage.

Rely on and assert for pf lock alone. 'pfctl -sr' on 60k unique rules gets
noticably faster, around 2.1s instead of 3.5s.

OK sashan

show more ...


# 49f39043 28-Apr-2023 phessler <phessler@openbsd.org>

Relax the "pass all" rule so all forms of neighbor advertisements are allowed
in either direction.

This more closely matches the IPv4 ARP behaviour.

From sashan@
discussed with kn@ deraadt@


# 74537007 23-Mar-2023 jsg <jsg@openbsd.org>

fix off-by-one in pf_state_expires() bounds test
such a value would have triggered a KASSERT()
ok sashan@ deraadt@


# 58feb3ff 04-Mar-2023 sashan <sashan@openbsd.org>

pf(4) should be enforcing TTL=1 to packets sent to 224.0.0.1 only.
Issue found and kindly reported by Luca Di Gregorio <lucdig _at_ gmail>

OK bluhm@


12345678910>>...48