pf.c - OpenGrok history log for /openbsd/sys/net/pf.c

Revision	Date	Author	Comments
# 3b372c34	14-May-2024	jsg <jsg@openbsd.org>	remove prototypes with no matching function
# 8de813b2	10-May-2024	jsg <jsg@openbsd.org>	make pf_match_rule() prototype match the function
# 93536db2	12-Apr-2024	bluhm <bluhm@openbsd.org>	Split single TCP inpcb table into IPv4 and IPv6 parts. With two separate TCP hash tables, each one becomes smaller. When we remove the exclusive net lock from TCP, contention on internet PCB table mutex will be reduced. UDP has been split earlier into IPv4 and IPv6. Replace branch conditions based on INP_IPV6 with assertions. OK mvs@
# 1ef7e4b4	10-Jan-2024	bluhm <bluhm@openbsd.org>	Split UDP PCB table into IPv4 and IPv6. Having two hash tables instead of a common one, reduces table size and contention on the per table lock. The address family is always known in advance. The lookups and loops are more specific. OK sashan@
# 034f31ce	01-Jan-2024	bluhm <bluhm@openbsd.org>	Protect link between pf and inp with mutex. Introduce global mutex to protect the pointers between pf state key and internet PCB. Then in_pcbdisconnect() and in_pcbdetach() do not need exclusive netlock anymore. Use a bunch of read once unlocked access to reduce performance impact. OK sashan@
# 5851f6d7	01-Jan-2024	bluhm <bluhm@openbsd.org>	Fix white space in pf.c.
# 00a8a9e1	28-Dec-2023	aisha <aisha@openbsd.org>	use RB_FOREACH_SAFE for pf_purge_expired_src_nodes OK bluhm@
# 9d9f4dc6	01-Dec-2023	sashan <sashan@openbsd.org>	Prevent race between pf_test() and pf_purge_expired_states(). Packets (callers to pf_test()) must alter pf_state::timeout under protection of pf_state::mtx. We also have to make sure the packet does not update pf_state::timeout when ::timeout reaches PFTM_UNLINKED. The first report came from Johan Huldtgren, but he is not the single user who has noticed "st->timeout == PFTM_UNLINKED" assert violation. OK bluhm@
# 89860f87	10-Oct-2023	bluhm <bluhm@openbsd.org>	pf(4) must not pass packet if state cannot be created. The behavior of the PFRULE_SRCTRACK and max_states check was unintentionally changed by commit revision 1.964. If the state was not created due to some limit had been reached, pf still passed the packet. Restore the old logic by setting action to pass later, after the checks. In pf_test_rule() action is initialized to drop. OK sashan@
# 46650f23	10-Oct-2023	bluhm <bluhm@openbsd.org>	Remove dead code in pf_pull_hdr(). pf_pull_hdr() allows to pass an action pointer parameter as output value. This is never used, all callers pass a NULL argument. Remove ACTION_SET() entirely. The logic (fragoff >= len) in pf_pull_hdr() does not work since revision 1.4. Before it was used to drop short TCP or UDP fragments that contained only part of the header. Current code in pf_pull_hdr() drops the packets anyway, so always set reason PFRES_FRAG. OK kn@ sashan@
# dbfb0ac1	08-Sep-2023	naddy <naddy@openbsd.org>	revert previous The change broke IPv6 neighbor discovery, and anton@ reports several regression test failures. ok bluhm@
# 37052907	07-Sep-2023	sashan <sashan@openbsd.org>	pf(4) ignores 'keep state' and 'nat-to' actions for unsolicited icmp error responses. Fix tightens rule matching logic so icmp error responses no longer match 'keep state' rule. In typical scenarios icmp errors (if solicited) should match existing state. The change is going to bite firewalls which deal with asymmetric routes. In those cases the 'keep state' action should be relaxed to sloppy or new 'no state' rule to explicitly match icmp errors should be added. The issue has been reported by Peter J. Philip (pjp _at_ delphinusdns.org). Discussed with bluhm@ and florian@ OK bluhm@
# 740063f5	31-Jul-2023	dlg <dlg@openbsd.org>	don't let pfsync send an insert message for a state pfsync just inserted sthen@ upgraded and ended up with a lot of pfsync traffic which was mostly made up of the two firewalls telling each other to insert the same state over and over again. this has each of the paths that insert states (actual pf, ioctls, and pfsync) identify themselves so pfsync can enter them into its own state machine in the right place. when pfsync inserts a state into pf, it knows it should just swallow the state silently without sending out another insert for it. ok sthen@ sashan@
# 5ebaba9d	07-Jul-2023	bluhm <bluhm@openbsd.org>	Fix path MTU discovery for TCP LRO/TSO when forwarding. When doing LRO (Large Receive Offload), the drivers, currently ix(4) and lo(4) only, record an upper bound of the size of the original packets in ph_mss. When sending, either stack or hardware must chop the packets with TSO (TCP Segmentation Offload) to that size. That means we have to call tcp_if_output_tso() before ifp->if_output(). Put that logic into if_output_tso() to avoid code duplication. As TCP packets on the wire do not get larger that way, path MTU discovery should still work. tested by and OK jan@
# cc90b7e6	06-Jul-2023	dlg <dlg@openbsd.org>	big update to pfsync to try and clean up locking in particular. moving pf forward has been a real struggle, and pfsync has been a constant source of pain. we have been papering over the problems for a while now, but it reached the point that it needed a fundamental restructure, which is what this diff is. the big headliner changes in this diff are: - pfsync specific locks this is the whole reason for this diff. rather than rely on NET_LOCK or KERNEL_LOCK or whatever, pfsync now has it's own locks to protect it's internal data structures. this is important because pfsync runs a bunch of timeouts and tasks to push pfsync packets out on the wire, or when it's handling requests generated by incoming pfsync packets, both of which happen outside pf itself running. having pfsync specific locks around pfsync data structures makes the mutations of these data structures a lot more explicit and auditable. - partitioning to enable future parallelisation of the network stack, this rewrite includes support for pfsync to partition states into different "slices". these slices run independently, ie, the states collected by one slice are serialised into a separate packet to the states collected and serialised by another slice. states are mapped to pfsync slices based on the pf state hash, which is the same hash that the rest of the network stack and multiq hardware uses. - no more pfsync called from netisr pfsync used to be called from netisr to try and bundle packets, but now that there's multiple pfsync slices this doesnt make sense. instead it uses tasks in softnet tqs. - improved bulk transfer handling there's shiny new state machines around both the bulk transmit and receive handling. pfsync used to do horrible things to carp demotion counters, but now it is very predictable and returns the counters back where they started. - better tdb handling the tdb handling was pretty hairy, but hrvoje has kicked this around a lot with ipsec and sasyncd and we've found and fixed a bunch of issues as a result of that testing. - mpsafe pf state purges this was committed previously, but because the locks pfsync relied on weren't clear this just caused a ton of bugs. as part of this diff it's now reliable, and moves a big chunk of work out from under KERNEL_LOCK, which in turn improves the responsiveness and throughput of a firewall even if you're not using pfsync. there's a bunch of other little changes along the way, but the above are the big ones. hrvoje has done performance testing with this diff and notes a big improvement when pfsync is not in use. performance when pfsync is enabled is about the same, but im hoping the slices means we can scale along with pf as it improves. lots (months) of testing by me and hrvoje on pfsync boxes tests and ok sashan@ deraadt@ says this is a good time to put it in
# 0c23d001	05-Jun-2023	sashan <sashan@openbsd.org>	pf_remove_state() should not attempt to remove state which is already removed. OK dlg@
# 510f4386	15-May-2023	bluhm <bluhm@openbsd.org>	Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support ony one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is missnamed, but keep that for now. Note that drivers do not set TSO capabilites yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
# 55055d61	13-May-2023	bluhm <bluhm@openbsd.org>	Instead of implementing IPv4 header checksum creation everywhere, introduce in_hdr_cksum_out(). It is used like in_proto_cksum_out(). OK claudio@
# c06845b1	10-May-2023	bluhm <bluhm@openbsd.org>	Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
# 003cfddf	08-May-2023	bluhm <bluhm@openbsd.org>	The call to in_proto_cksum_out() is only needed before the packet is passed to ifp->if_output(). The fragment code has its own checksum calculation and the other paths end in goto bad. OK claudio@
# b8646e37	07-May-2023	bluhm <bluhm@openbsd.org>	I preparation for TSO in software, cleanup the fragment code. Use if_output_ml() to send mbuf lists to interfaces. This can be used for TSO, fragments, ARP and ND6. Rename variable fml to ml. In pf_route6() split the if else block. Put the safety check (hlen + firstlen < tlen) into ip_fragment(). It makes the code correct in case the packet is too short to be fragmented. This should not happen, but other functions also have this logic. No functional change. OK sashan@
# 3e519755	03-May-2023	kn <kn@openbsd.org>	Remove net lock from DIOCGETRULESET and DIOCGETRULESETS Both walk the list of rulesets aka. anchors, to yield a total count and specific anchor name, respectively. Same access, different copy out. pf_anchor_global are contained within pf_ioctl.c and pf_ruleset.c and fully protected by the pf lock, as is pf_main_ruleset and its pf.c usage. Rely on and assert for pf lock alone. 'pfctl -sr' on 60k unique rules gets noticably faster, around 2.1s instead of 3.5s. OK sashan
# 49f39043	28-Apr-2023	phessler <phessler@openbsd.org>	Relax the "pass all" rule so all forms of neighbor advertisements are allowed in either direction. This more closely matches the IPv4 ARP behaviour. From sashan@ discussed with kn@ deraadt@
# 74537007	23-Mar-2023	jsg <jsg@openbsd.org>	fix off-by-one in pf_state_expires() bounds test such a value would have triggered a KASSERT() ok sashan@ deraadt@
# 58feb3ff	04-Mar-2023	sashan <sashan@openbsd.org>	pf(4) should be enforcing TTL=1 to packets sent to 224.0.0.1 only. Issue found and kindly reported by Luca Di Gregorio <lucdig _at_ gmail> OK bluhm@