#
181b71c4 |
| 05-Nov-2024 |
jsg <jsg@openbsd.org> |
remove unused M_MAXCOMPRESS MCLOFSET; ok claudio@ miod@
|
#
5b00b7dd |
| 29-Aug-2024 |
bluhm <bluhm@openbsd.org> |
Show expensive mbuf operations in netstat(1) statistics.
If the memory layout is not optimal, m_defrag(), m_prepend(), m_pullup(), and m_pulldown() will allocate mbufs or copy memory. Count these operations to find possible optimizations.
input dhill@; OK mvs@
|
#
7019ae97 |
| 14-Apr-2024 |
bluhm <bluhm@openbsd.org> |
Run raw IP input in parallel.
Running raw IPv4 input with shared net lock in parallel is less complex than UDP. In particular, there is no socket splicing.
New ip_deliver() may run with shared or exclusive net lock. The last parameter indicates the mode. If it is running with shared net lock and encounters a protocol that needs the exclusive lock, the packet is queued. Old ip_ours() always queued the packet. Now it calls ip_deliver() with shared net lock, and if that cannot handle the packet completely, the packet is queued and later processed with exclusive net lock.
In case of an IPv6 header chain that switches from shared to exclusive processing, the next protocol and mbuf offset are stored in an mbuf tag.
OK mvs@
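
A minimal sketch of the dispatch pattern described above. The "needs exclusive lock" test and the requeue helper are illustrative placeholders, not the symbols from the commit.

```c
/*
 * Hedged sketch of the shared/exclusive dispatch described above; the
 * protocol test and requeue_for_exclusive() are hypothetical.
 */
#include <sys/mbuf.h>
#include <netinet/in.h>

int	requeue_for_exclusive(struct mbuf *);	/* hypothetical helper */

#define IP_DELIVER_SHARED	0	/* caller holds the shared net lock */
#define IP_DELIVER_EXCLUSIVE	1	/* caller holds the exclusive net lock */

int
ip_deliver_sketch(struct mbuf *m, int nxt, int lockmode)
{
	/* illustrative: treat everything except UDP and raw IP as not MP safe */
	int needs_exclusive = (nxt != IPPROTO_UDP && nxt != IPPROTO_RAW);

	if (lockmode == IP_DELIVER_SHARED && needs_exclusive) {
		/*
		 * Cannot finish under the shared lock: park the packet and
		 * let it be reprocessed later with the exclusive net lock,
		 * as ip_ours() used to do unconditionally.
		 */
		return requeue_for_exclusive(m);
	}

	/* ... otherwise hand the packet to the protocol input here ... */
	return nxt;
}
```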
|
#
8788436e |
| 21-Feb-2024 |
bluhm <bluhm@openbsd.org> |
Add missing checksum flag M_TCP_TSO to ddb show mbuf.
OK mglocker@ claudio@
|
#
cea35e79 |
| 16-Jul-2023 |
yasuoka <yasuoka@openbsd.org> |
Make mbstat preserve the same size that is actually used. Also revert the previous change that placed mbstat on the stack.
ok claudio
|
#
032e1cec |
| 07-Jul-2023 |
yasuoka <yasuoka@openbsd.org> |
Expand the counters in struct mbstat from u_short to u_long.
ok bluhm mvs
|
#
aba40e07 |
| 04-Jul-2023 |
jsg <jsg@openbsd.org> |
Remove mbuf low watermark vars. Unused since uipc_mbuf.c rev 1.244. ok kn@ bluhm@
|
#
49ad7235 |
| 04-Jul-2023 |
jsg <jsg@openbsd.org> |
m_reclaim() was removed in uipc_mbuf.c rev 1.195
|
#
c06845b1 |
| 10-May-2023 |
bluhm <bluhm@openbsd.org> |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress.
TCP output generates large packets. In IP output the packet is chopped to the TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec need the software fallback in general.
For performance comparison or to work around possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counters with chopped and generated packets.
based on work from jan@; tested by jmc@ jan@ Hrvoje Popovski
OK jan@ claudio@
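
A rough sketch of the software fallback decision described here. tcp_tso_enabled, output_directly() and tso_chop() are hypothetical stand-ins for the committed tunable and helpers; the hardware-capability check is elided.

```c
/*
 * Hedged sketch of the software TSO fallback path described above.
 * All external names below are hypothetical placeholders.
 */
#include <sys/mbuf.h>
#include <net/if.h>

extern int tcp_tso_enabled;				/* hypothetical: net.inet.tcp.tso */
int	output_directly(struct ifnet *, struct mbuf *);	/* hypothetical */
int	tso_chop(struct ifnet *, struct mbuf *, u_int);	/* hypothetical */

int
tso_output_sketch(struct ifnet *ifp, struct mbuf *m, u_int mss)
{
	/* feature disabled, or the packet already fits in one segment */
	if (!tcp_tso_enabled || m->m_pkthdr.len <= (int)mss)
		return output_directly(ifp, m);

	/*
	 * Software fallback: chop the large TCP packet into MSS-sized
	 * segments before handing them to the driver.  Hardware TSO,
	 * once a driver supports it, would skip this step.
	 */
	return tso_chop(ifp, m, mss);
}
```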
|
#
16d357f8 |
| 05-May-2023 |
bluhm <bluhm@openbsd.org> |
The mbuf_queue API allows read access to integer variables which another CPU may change simultaneously. To prevent mis-optimisation by the compiler, they need the READ_ONCE() macro. Otherwise there could be two read operations with inconsistent values.
Writing to the integer in mq_set_maxlen() needs mutex protection. Otherwise the value could change within critical sections. Again the compiler could optimize to multiple read operations within the critical section. With inconsistent values, the behavior is undefined.
OK dlg@
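
A small illustration of the pattern described here. The struct below is a simplified stand-in, not the exact layout of struct mbuf_queue in sys/mbuf.h.

```c
/*
 * Hedged illustration of the READ_ONCE()/mutex pattern described above,
 * on a simplified queue structure.
 */
#include <sys/atomic.h>		/* READ_ONCE() */
#include <sys/mutex.h>

struct mq_sketch {
	struct mutex	mtx;
	unsigned int	len;		/* written under mtx, read locklessly */
	unsigned int	maxlen;		/* written under mtx, read locklessly */
};

/* Lockless readers must load each shared integer exactly once. */
static inline int
mq_sketch_full(struct mq_sketch *mq)
{
	return READ_ONCE(mq->len) >= READ_ONCE(mq->maxlen);
}

/* Writers take the mutex so the value cannot change mid critical section. */
static inline void
mq_sketch_set_maxlen(struct mq_sketch *mq, unsigned int maxlen)
{
	mtx_enter(&mq->mtx);
	mq->maxlen = maxlen;
	mtx_leave(&mq->mtx);
}
```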
|
#
0b448d84 |
| 15-Aug-2022 |
bluhm <bluhm@openbsd.org> |
Run IPv6 hop-by-hop options processing in parallel. The ip6_hbhchcheck() code is MP safe and moves from ip6_local() to ip6_ours(). If there are any options, store the chain offset and next protocol in a mbuf tag. When dequeuing without tag, it is a regular IPv6 header. As mbuf tags degrade performance, use them only if a hop-by-hop header is present. Such packets are rare and pf drops them by default. OK mvs@
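
A sketch of the mbuf-tag hand-off described here, using the generic m_tag API. PACKET_TAG_SKETCH and struct chain_state are illustrative; the commit defines its own tag type and payload layout.

```c
/*
 * Hedged sketch of parking header-chain state in an mbuf tag, as
 * described above.
 */
#include <sys/mbuf.h>
#include <sys/malloc.h>
#include <sys/errno.h>

#define PACKET_TAG_SKETCH	0x8000	/* hypothetical tag type */

struct chain_state {
	int	cs_off;		/* offset of the next header in the packet */
	int	cs_nxt;		/* next protocol to process */
};

int
chain_state_save(struct mbuf *m, int off, int nxt)
{
	struct m_tag *mtag;
	struct chain_state *cs;

	mtag = m_tag_get(PACKET_TAG_SKETCH, sizeof(*cs), M_NOWAIT);
	if (mtag == NULL)
		return (ENOBUFS);
	cs = (struct chain_state *)(mtag + 1);
	cs->cs_off = off;
	cs->cs_nxt = nxt;
	m_tag_prepend(m, mtag);
	return (0);
}
```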
|
#
c8502062 |
| 14-Feb-2022 |
dlg <dlg@openbsd.org> |
update sbchecklowmem() to better detect actual mbuf memory usage.
previously sbchecklowmem() (and sonewconn()) would look at the mbuf and mbuf cluster pools to see if they were approaching their hard limits. based on how many mbufs/clusters were allocated against the limits, socket operations would start to fail with ENOBUFS until utilisation went down.
mbufs and clusters have changed a lot since then though. there are now many mbuf cluster pools, not just one for 2k clusters. because of this the mbuf layer now limits the amount of memory all the mbuf pools can allocate backend pages from, rather than limiting the individual pools. this means sbchecklowmem() ends up looking at the default pool hard limit, which is UINT_MAX, which in turn means sbchecklowmem() probably never applies backpressure. this is made worse on multiprocessor systems where per cpu caches of mbuf and cluster pool items are enabled, because the number of in use pool items is distorted by the cpu caches.
this switches sbchecklowmem to looking at the page allocations made by all the pools instead. the big benefit of this is that the page allocations are much more representative of the overall mbuf memory usage in the system. the downside is that the backend page allocation accounting does not see idle memory held by pools. pools cannot release partially free pages to the page backend (obviously), and pools cache idle items to avoid thrashing on the backend page allocator. this means the page allocation level is higher than the memory used by actual in-flight mbufs.
however, this can also be a benefit. the backend page allocation is a kind of smoothed out "trend" line. mbuf utilisation over short periods can be extremely bursty because of things like rx ring dequeue and fill cycles, or large socket sends. if you're trying to grow socket buffers while these things are happening, luck becomes an important factor in whether it will work or not. because pools cache idle items, the backend page utilisation better represents the overall trend of activity in the system and will give more consistent behaviour here.
this diff is deliberately simple. we're basically going from "no limits" to "some sort of limit" for sockets again, so keeping the code simple means it should be easy to understand and tweak in the future.
ok djm@ visa@ claudio@
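
A heavily simplified sketch of the idea: compare backend page usage across the mbuf pools against the configured limit instead of any single pool's hard limit. Both helpers below are placeholders, not real kernel symbols.

```c
/*
 * Hedged sketch of the approach described above: judge memory pressure
 * by the pages the mbuf pools have taken from the backend allocator.
 */
unsigned long	mbuf_pages_allocated(void);	/* placeholder */
unsigned long	mbuf_pages_limit(void);		/* placeholder */

int
sbchecklowmem_sketch(void)
{
	unsigned long used = mbuf_pages_allocated();
	unsigned long limit = mbuf_pages_limit();

	/*
	 * Start refusing socket buffer growth once backend page usage
	 * crosses a threshold, here roughly 80% of the limit.  The page
	 * count moves slowly compared to individual pool items, so it
	 * acts as the smoothed "trend" line mentioned above.
	 */
	return (used >= limit - limit / 5);
}
```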
|
#
a1c674a4 |
| 15-May-2021 |
yasuoka <yasuoka@openbsd.org> |
Fix IPsec NAT-T to work with pipex(4). Introduce a new packet tag PACKET_TAG_IPSEC_FLOWINFO to specify the IPsec flow.
ok mvs
|
#
bbe404a3 |
| 25-Feb-2021 |
dlg <dlg@openbsd.org> |
let m_copydata use a void * instead of caddr_t
i'm not a fan of having to cast to caddr_t when we have modern inventions like void *s we can take advantage of.
ok claudio@ mvs@ bluhm@
|
#
471f2571 |
| 12-Dec-2020 |
jan <jan@openbsd.org> |
Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.
OK dlg@, bluhm@ No Opinion mpi@ Not against it claudio@
|
#
ec4d335c |
| 08-Aug-2020 |
florian <florian@openbsd.org> |
Remove now unused M_ACAST flag. Reminded by, input & OK jca
|
#
87c66d4c |
| 21-Jun-2020 |
dlg <dlg@openbsd.org> |
wireguard is taking over the gif mbuf tag.
gif used its mbuf tag to store its interface index so it could detect loops. gre also did this, and i cut most of the drivers (including gif) over to using the gre tag. so the gif tag is unused.
wireguard uses the tag to store peer information between different contexts the packet is processed in. it also needs a bit more space to do that.
from Matt Dunwoodie and Jason A. Donenfeld
ok deraadt@
|
#
6281916f |
| 21-Jun-2020 |
dlg <dlg@openbsd.org> |
add mq_push. it's like mq_enqueue, but drops from the head, not the tail.
from Matt Dunwoodie and Jason A. Donenfeld
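
A sketch of the behaviour described above: like mq_enqueue(), but when the queue is full the oldest packet at the head is dropped to make room. Field and helper names follow sys/mbuf.h as closely as memory allows; treat it as a sketch, not the committed code.

```c
/* Hedged sketch of a push-style enqueue that drops from the head. */
#include <sys/mbuf.h>
#include <sys/mutex.h>

int
mq_push_sketch(struct mbuf_queue *mq, struct mbuf *m)
{
	struct mbuf *dropped = NULL;

	mtx_enter(&mq->mq_mtx);
	if (ml_len(&mq->mq_list) >= mq->mq_maxlen) {
		dropped = ml_dequeue(&mq->mq_list);	/* drop from the head */
		mq->mq_drops++;
	}
	ml_enqueue(&mq->mq_list, m);			/* new packet at the tail */
	mtx_leave(&mq->mq_mtx);

	if (dropped != NULL)
		m_freem(dropped);

	return (dropped != NULL);
}
```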
|
#
379654e8 |
| 17-Jun-2020 |
dlg <dlg@openbsd.org> |
make ph_flowid in mbufs 16 bits by storing whether it's set in csum_flags.
i've been wanting to do this for a while, and now that we've got stoeplitz and it gives us 16 bits, it seems like the right time.
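
A small sketch of the described encoding: the 16-bit flow id lives in m->m_pkthdr.ph_flowid and a csum_flags bit records whether it is valid; M_FLOWID is assumed to be that bit.

```c
/* Hedged sketch of setting and reading the 16-bit packet flow id. */
#include <sys/mbuf.h>

static inline void
flowid_set_sketch(struct mbuf *m, uint16_t flowid)
{
	m->m_pkthdr.ph_flowid = flowid;
	m->m_pkthdr.csum_flags |= M_FLOWID;	/* mark the field as valid */
}

static inline int
flowid_get_sketch(struct mbuf *m, uint16_t *flowid)
{
	if ((m->m_pkthdr.csum_flags & M_FLOWID) == 0)
		return 0;			/* no flow id was set */
	*flowid = m->m_pkthdr.ph_flowid;
	return 1;
}
```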
|
#
50ec4b0e |
| 22-Jan-2020 |
dlg <dlg@openbsd.org> |
add ml_hdatalen and mq_hdatalen as workalikes of ifq_hdatalen.
this is so pppx(4) and the upcoming pppac(4) can give kq read data and FIONREAD values that make sense, like the ones tun(4) and tap(4) provide with ifq_hdatalen.
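
A sketch of what an "hdatalen" style helper looks like: report how much data the packet at the head of the list carries, so FIONREAD and kevent read filters can describe the next read. Field names follow sys/mbuf.h as remembered; treat it as a sketch.

```c
/* Hedged sketch of an hdatalen helper for an mbuf list. */
#include <sys/mbuf.h>

static unsigned int
ml_hdatalen_sketch(struct mbuf_list *ml)
{
	struct mbuf *m = ml->ml_head;	/* head of the list, NULL if empty */

	return (m != NULL ? m->m_pkthdr.len : 0);
}
```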
|
#
9392a735 |
| 16-Jul-2019 |
bluhm <bluhm@openbsd.org> |
Prevent integer overflow in kernel and userland when checking mbuf limits. Convert kernel variables and calculations for mbuf memory into long to allow larger values on 64 bit machines. Put a range check into the kernel sysctl. For the interface itself int is still sufficient. In netstat -m cast all multiplications to unsigned long to hold the product of two unsigned int. input and OK visa@
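
The userland side of the fix amounts to widening one operand before multiplying; a minimal illustration of the kind of cast described, with illustrative names:

```c
/* Widen before multiplying so two unsigned ints cannot wrap the product. */
unsigned long
pool_bytes(unsigned int nitems, unsigned int itemsize)
{
	return (unsigned long)nitems * itemsize;
}
```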
|
#
5bac5b4f |
| 10-Jun-2019 |
dlg <dlg@openbsd.org> |
add m_microtime for getting the wall clock time associated with a packet
if the packet has the M_TIMESTAMP csum_flag, ph_timestamp is added to the boottime clock, otherwise it just uses microtime().
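
A sketch of the described behaviour, assuming ph_timestamp holds nanoseconds since boot and that microboottime() is available for the boot-time offset; the nanosecond-to-timeval conversion is done by hand to keep the example self-contained.

```c
/* Hedged sketch of deriving a wall clock time for a packet. */
#include <sys/time.h>
#include <sys/mbuf.h>

void
m_microtime_sketch(const struct mbuf *m, struct timeval *tv)
{
	if (m->m_pkthdr.csum_flags & M_TIMESTAMP) {
		struct timeval bt, pkt;

		microboottime(&bt);	/* wall clock at boot */
		pkt.tv_sec = m->m_pkthdr.ph_timestamp / 1000000000ULL;
		pkt.tv_usec = (m->m_pkthdr.ph_timestamp % 1000000000ULL) / 1000;
		timeradd(&bt, &pkt, tv);
	} else
		microtime(tv);		/* no packet timestamp: use current time */
}
```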
|
#
2f01f25d |
| 10-Jun-2019 |
dlg <dlg@openbsd.org> |
add M_TIMESTAMP as a csum_flags option to say ph_timestamp is set
this is so hardware that supports timestamping can set the time and say so for things like bpf and the SO_TIMESTAMP socket option to use.
the intention is that ph_timestamp will store the nanoseconds since the system booted, which is in line with how fq_codel (the only user of the field at the moment) uses it.
|
#
12172476 |
| 11-Feb-2019 |
bluhm <bluhm@openbsd.org> |
In ddb, add a description for the show mbuf flags bit SYNCOOKIE_RECREATED of struct pkthdr_pf. from Jan Klemkow
|
#
103e7fff |
| 07-Dec-2018 |
claudio <claudio@openbsd.org> |
All the references to the M_ALIGN and MH_ALIGN macros are gone. Time to bring them behind the shed and free them. Use m_align() instead. OK mpi@ henning@ florian@ kn@
|