History log of /openbsd/sys/sys/socketvar.h (Results 1 – 25 of 130)
Revision Date Author Comments
# bb0cd11a 03-May-2024 mvs <mvs@openbsd.org>

Push solock() down to sosend() and remove it from soreceive() paths fro
unix(4) sockets.

Push solock() deep down to sosend() and remove it from soreceive() paths
for unix(4) sockets.

The transmissi

Push solock() down to sosend() and remove it from soreceive() paths fro
unix(4) sockets.

Push solock() deep down to sosend() and remove it from soreceive() paths
for unix(4) sockets.

The transmission of unix(4) sockets already half-unlocked because
connected peer is not locked by solock() during sbappend*() call. Use
`sb_mtx' mutex(9) and `sb_lock' rwlock(9) to protect both `so_snd' and
`so_rcv'.

Since the `so_snd' is protected by `sb_mtx' mutex(9) the re-locking
is not required in uipc_rcvd().

Do direct `so_rcv' dispose and cleanup in sofree(). This sockets is
almost dead and unlinked from everywhere include spliced peer, so
concurrent sotask() thread will just exit. This required to keep locks
order between `i_lock' and `sb_lock'. Also this removes re-locking from
sofree() for all sockets.

SB_OWNLOCK became redundant with SB_MTXLOCK, so remove it. SB_MTXLOCK
was kept because checks against SB_MTXLOCK within sb*() routines are mor
consistent.

Feedback and ok bluhm

show more ...


# 1baa51d9 11-Apr-2024 mvs <mvs@openbsd.org>

Don't take solock() in soreceive() for SOCK_RAW inet sockets.

For inet sockets solock() is the netlock wrapper, so soreceive() could
be performed simultaneous with exclusively locked code paths.

Th

Don't take solock() in soreceive() for SOCK_RAW inet sockets.

For inet sockets solock() is the netlock wrapper, so soreceive() could
be performed simultaneous with exclusively locked code paths.

These sockets are not connection oriented, they don't call pru_rcvd(),
they can't be spliced, they don't set `so_error'. Nothing to protect
with solock() in soreceive() path.

`so_rcv' buffer protected by `sb_mtx' mutex(9), but since it released,
sblock() required to serialize concurrent soreceive() and sorflush()
threads. Current sblock() is some kind of rwlock(9) implementation, so
introduce `sb_lock' rwlock(9) and use it directly for that purpose.

The sorflush() and callers were refactored to avoid solock() for raw
inet sockets. This was done to avoid packet processing stop.

Tested and ok bluhm.

show more ...


# b11de491 10-Apr-2024 mvs <mvs@openbsd.org>

Remove `head' socket re-locking in sonewconn().

uipc_attach() releases solock() because it should be taken after
`unp_gc_lock' rwlock(9) which protects the `unp_link' list. For this
reason, the list

Remove `head' socket re-locking in sonewconn().

uipc_attach() releases solock() because it should be taken after
`unp_gc_lock' rwlock(9) which protects the `unp_link' list. For this
reason, the listening `head' socket should be unlocked too while
sonewconn() calls uipc_attach(). This could be reworked because now
`so_rcv' sockbuf relies on `sb_mtx' mutex(9).

The last one `unp_link' foreach loop within unp_gc() discards sockets
previously marked as UNP_GCDEAD. These sockets are not accessed from the
userland. The only exception is the sosend() threads of connected
sending peers, but they only sbappend*() mbuf(9) to `so_rcv'. So it's
enough to unlink mbuf(9) chain with `sb_mtx' held and discard lockless.

Please note, the existing SS_NEWCONN_WAIT logic was never used because
the listening unix(4) socket protected from concurrent unp_detach() by
vnode(9) lock, however `head' re-locked all times.

ok bluhm

show more ...


# 47517268 27-Mar-2024 mvs <mvs@openbsd.org>

Introduce SB_OWNLOCK to mark sockets which `so_rcv' buffer modified
outside socket lock.

`sb_mtx' mutex(9) used for this case and it should not be released between
`so_rcv' usage check and correspon

Introduce SB_OWNLOCK to mark sockets which `so_rcv' buffer modified
outside socket lock.

`sb_mtx' mutex(9) used for this case and it should not be released between
`so_rcv' usage check and corresponding sbwait() sleep. Otherwise wakeup()
could be lost sometimes.

ok bluhm

show more ...


# 02a82712 26-Mar-2024 mvs <mvs@openbsd.org>

Use `sb_mtx' to protect `so_rcv' receive buffer of unix(4) sockets.

This makes re-locking unnecessary in the uipc_*send() paths, because
it's enough to lock one socket to prevent peer from concurren

Use `sb_mtx' to protect `so_rcv' receive buffer of unix(4) sockets.

This makes re-locking unnecessary in the uipc_*send() paths, because
it's enough to lock one socket to prevent peer from concurrent
disconnection. As the little bonus, one unix(4) socket can perform
simultaneous transmission and reception with one exception for
uipc_rcvd(), which still requires the re-lock for connection oriented
sockets.

The socket lock is not held while filt_soread() and filt_soexcept()
called from uipc_*send() through sorwakeup(). However, the unlocked
access to the `so_options', `so_state' and `so_error' is fine.

The receiving socket can't be or became listening socket. It also can't
be disconnected concurrently. This makes immutable SO_ACCEPTCONN,
SS_ISDISCONNECTED and SS_ISCONNECTED bits which are clean and set
respectively.

`so_error' is set on the peer sockets only by unp_detach(), which also
can't be called concurrently on sending socket.

This is also true for filt_fiforead() and filt_fifoexcept(). For other
callers like kevent(2) or doaccept() the socket lock is still held.

ok bluhm

show more ...


# fdada4b1 22-Mar-2024 mvs <mvs@openbsd.org>

Use sorflush() instead of direct unp_scan(..., unp_discard) to discard
dead unix(4) sockets.

The difference in direct unp_scan() and sorflush() is the mbuf(9) chain.
For the first case it is still l

Use sorflush() instead of direct unp_scan(..., unp_discard) to discard
dead unix(4) sockets.

The difference in direct unp_scan() and sorflush() is the mbuf(9) chain.
For the first case it is still linked to the `so_rcv', for the second it
is not. This is required to make `sb_mtx' mutex(9) the only `so_rcv'
sockbuf protection and remove socket re-locking from the most of
uipc_*send() paths. The unlinked mbuf(9) chain doesn't require any
protection, so this allows to perform sleeping unp_discard() lockless.

Also, the mbuf(9) chain of the discarded socket still contains addresses
of file descriptors and it is much safer to unlink it before FRELE()
them. This is the reason to commit this diff standalone.

ok bluhm

show more ...


# 94b08229 12-Feb-2024 mvs <mvs@openbsd.org>

Pass protosw instead of domain structure to soalloc() to get real
`pr_type'. The corresponding domain is referenced as `pr_domain'.
Otherwise dp->dom_protosw->pr_type of inet sockets always points
to

Pass protosw instead of domain structure to soalloc() to get real
`pr_type'. The corresponding domain is referenced as `pr_domain'.
Otherwise dp->dom_protosw->pr_type of inet sockets always points
to inetsw[0].

ok bluhm

show more ...


# c75e434f 11-Feb-2024 mvs <mvs@openbsd.org>

Use `sb_mtx' instead of `inp_mtx' in receive path for inet sockets.

In soreceve(), we only touch `so_rcv' socket buffer, which has it's own
`sb_mtx' mutex(9) for protection. So, we can avoid solock(

Use `sb_mtx' instead of `inp_mtx' in receive path for inet sockets.

In soreceve(), we only touch `so_rcv' socket buffer, which has it's own
`sb_mtx' mutex(9) for protection. So, we can avoid solock() in this
path - it's enough to hold `sb_mtx' in soreceive() and around
corresponding sbappend*(). But not right now :)

This time we use shared netlock for some inet sockets in the soreceive()
path. To protect `so_rcv' buffer we use `inp_mtx' mutex(9) and the
pru_lock() to acquire this mutex(9) in socket layer. But the `inp_mtx'
mutex belongs to the PCB. We initialize socket before PCB, tcp(4)
sockets could exist without PCB, so use `sb_mtx' mutex(9) to protect
sockbuf stuff.

This diff mechanically replaces `inp_mtx' by `sb_mtx' in the receive
path. Only for sockets which already use `inp_mtx'. All other sockets
left as is. They will be converted later.

Since the `sb_mtx' is optional, the new SB_MTXLOCK flag introduced. If
this flag is set on `sb_flags', the `sb_mtx' mutex(9) should be taken.
New sb_mtx_lock() and sb_mtx_unlock() was introduced to hide this check.
They are temporary and will be replaced by mtx_enter() when all this
area will be converted to `sb_mtx' mutex(9).

Also, the new sbmtxassertlocked() function introduced to throw
corresponding assertion for SB_MTXLOCK marked buffers. This time only
sbappendaddr() calls it. This function is also temporary and will be
replaced by MTX_ASSERT_LOCKED() later.

ok bluhm

show more ...


# 104c0a83 03-Feb-2024 mvs <mvs@openbsd.org>

Rework socket buffers locking for shared netlock.

Shared netlock is not sufficient to call so{r,w}wakeup(). The following
sowakeup() modifies `sb_flags' and knote(9) stuff. Unfortunately, we
can't c

Rework socket buffers locking for shared netlock.

Shared netlock is not sufficient to call so{r,w}wakeup(). The following
sowakeup() modifies `sb_flags' and knote(9) stuff. Unfortunately, we
can't call so{r,w}wakeup() with `inp_mtx' mutex(9) because sowakeup()
also calls pgsigio() which grabs kernel lock.

However, `so*_filtops' callbacks only perform read-only access to the
socket stuff, so it is enough to hold shared netlock only, but the klist
stuff needs to be protected.

This diff introduces `sb_mtx' mutex(9) to protect sockbuf. This time
`sb_mtx' used to protect only `sb_flags' and `sb_klist'.

Now we have soassertlocked_readonly() and soassertlocked(). The first
one is happy if only shared netlock is held, meanwhile the second wants
`so_lock' or pru_lock() be held together with shared netlock.

To keep soassertlocked*() assertions soft, we need to know mutex(9)
state, so new mtx_owned() macro was introduces. Also, the new optional
(*pru_locked)() handler brings the state of pru_lock().

Tests and ok from bluhm.

show more ...


# 66bd633e 11-Jan-2024 bluhm <bluhm@openbsd.org>

Use domain name for socket lock.

Syzkaller with witness complains about lock ordering of pf lock
with socket lock. Socket lock for inet is taken before pf lock.
Pf lock is taken before socket lock

Use domain name for socket lock.

Syzkaller with witness complains about lock ordering of pf lock
with socket lock. Socket lock for inet is taken before pf lock.
Pf lock is taken before socket lock for route. This is a false
positive as route and inet socket locks are distinct. Witness does
not know this. Name the socket lock like the domain of the socket,
then rwlock name is used in witness lo_name subtype. Make domain
names more consistent for locking, they were not used anyway.
Regardless of witness problem, unique lock name for each socket
type make sense.

Reported-by: syzbot+34d22dcbf20d76629c5a@syzkaller.appspotmail.com
Reported-by: syzbot+fde8d07ba74b69d0adfe@syzkaller.appspotmail.com
OK mvs@

show more ...


# 9f40d4ac 04-Jul-2023 mvs <mvs@openbsd.org>

Introduce SBL_WAIT and SBL_NOINTR sbwait() flags.

This refactoring is another step to make standalone socket buffers
locking. sblock() uses M_WAITOK and M_NOWAIT flags passed as the third
argument t

Introduce SBL_WAIT and SBL_NOINTR sbwait() flags.

This refactoring is another step to make standalone socket buffers
locking. sblock() uses M_WAITOK and M_NOWAIT flags passed as the third
argument together with the SB_NOINTR flag on the `sb_flags' to control
sleep behaviour. To perform uninterruptible acquisition, SB_NOINTR flag
should be set before sblock() call. `sb_flags' modification requires to
hold solock() around sblock()/sbunlock() that makes standalone call
impossible.

Also `sb_flags' modifications outside sblock()/sbunlock() makes
uninterruptible acquisition code huge enough. This time only sorflush()
does this (and forgets to restore SB_NOINTR flag, so shutdown(SHUT_RDWR)
call permanently modifies socket locking behaviour) and this looks not
the big problem. But with the standalone socket buffer locking it will
be many such places, so this huge construction is unwanted.

Introduce new SBL_NOINTR flag passed as third sblock() argument. The
sblock() acquisition will be uninterruptible when existing SB_NOINTR
flag is set on `sb_flags' or SBL_NOINTR was passed.

The M_WAITOK and M_NOWAIT flags belongs to malloc(9). It has no M_NOINTR
flag and there is no reason to introduce it. So for consistency reasons
introduce new SBL_WAIT and use it together with SBL_NOINTR instead of
M_WAITOK and M_NOINTR respectively.

ok bluhm

show more ...


# 80b1b8fb 27-Jan-2023 mvs <mvs@openbsd.org>

Replace selinfo structure by klist in sockbuf. No reason to keep it,
selinfo is just wrapper to klist. netstat(1) and libkvm use socket
structure, but don't touch so_{snd,rcv}.sb_sel.

ok visa@


# 734e0323 23-Jan-2023 mvs <mvs@openbsd.org>

Remove "sb == &so->so_rcv || sb == &so->so_snd" assertion from
sb_notify() and sbspace(). Now it's overkilling.

ok bluhm@


# a651bdb4 23-Jan-2023 mvs <mvs@openbsd.org>

Move SS_ISSENDING flag to `sb_state'. It should belong to the send
buffer as the SS_CANTSENDMORE flag.

ok bluhm@


# 4b9bfff3 22-Jan-2023 mvs <mvs@openbsd.org>

Move SS_CANTRCVMORE and SS_RCVATMARK bits from `so_state' to `sb_state' of
receive buffer. As it was done for SS_CANTSENDMORE bit, the definition
kept as is, but now these bits belongs to the `sb_sta

Move SS_CANTRCVMORE and SS_RCVATMARK bits from `so_state' to `sb_state' of
receive buffer. As it was done for SS_CANTSENDMORE bit, the definition
kept as is, but now these bits belongs to the `sb_state' of receive
buffer. `sb_state' ored with `so_state' when socket data exporting to the
userland.

ok bluhm@

show more ...


# 9e437519 21-Jan-2023 mvs <mvs@openbsd.org>

Introduce per-sockbuf `sb_state' to use it with SS_CANTSENDMORE.

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state

Introduce per-sockbuf `sb_state' to use it with SS_CANTSENDMORE.

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state.

Opposing the previous reverted diff, the SS_CANTSENDMORE definition left
as is, but it used only with `sb_state'. `sb_state' ored with original
`so_state' when socket's data exported to the userland, so the ABI kept as
it was.

Inputs from deraadt@.

ok bluhm@

show more ...


# 33da1efb 12-Dec-2022 tb <tb@openbsd.org>

Revert sb_state changes to unbreak tree.


# cba97bf9 11-Dec-2022 mvs <mvs@openbsd.org>

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state. Introd

This time, socket's buffer lock requires solock() to be held. As a part of
socket buffers standalone locking work, move socket state bits which
represent its buffers state to per buffer state. Introduce `sb_state' and
turn SS_CANTSENDMORE to SBS_CANTSENDMORE. This bit will be processed on
`so_snd' buffer only.

Move SS_CANTRCVMORE and SS_RCVATMARK bits with separate diff to make
review easier and exclude possible so_rcv/so_snd mistypes.

Also, don't adjust the remaining SS_* bits right now.

ok millert@

show more ...


# d452b926 26-Nov-2022 mvs <mvs@openbsd.org>

Turn sowriteable(), sballoc() and sbfree() macro to inline functions.

soreadable() is already presented as inline function, but corresponding
sowriteable() is still macro. Also it's no reason to kee

Turn sowriteable(), sballoc() and sbfree() macro to inline functions.

soreadable() is already presented as inline function, but corresponding
sowriteable() is still macro. Also it's no reason to keep sballoc() and
sbfree() as macro.

The first argument of sballoc() and sbfree() is not used, but keep it for
a while.

ok kn@ bluhm@

show more ...


# 62440853 03-Oct-2022 bluhm <bluhm@openbsd.org>

System calls should not fail due to temporary memory shortage in
malloc(9) or pool_get(9).
Pass down a wait flag to pru_attach(). During syscall socket(2)
it is ok to wait, this logic was missing fo

System calls should not fail due to temporary memory shortage in
malloc(9) or pool_get(9).
Pass down a wait flag to pru_attach(). During syscall socket(2)
it is ok to wait, this logic was missing for internet pcb. Pfkey
and route sockets were already waiting.
sonewconn() must not wait when called during TCP 3-way handshake.
This logic has been preserved. Unix domain stream socket connect(2)
can wait until the other side has created the socket to accept.
OK mvs@

show more ...


# 9bbbb445 05-Sep-2022 bluhm <bluhm@openbsd.org>

Use shared netlock in soreceive(). The UDP and IP divert layer
provide locking of the PCB. If that is possible, use shared instead
of exclusive netlock in soreceive(). The PCB mutex provides a per

Use shared netlock in soreceive(). The UDP and IP divert layer
provide locking of the PCB. If that is possible, use shared instead
of exclusive netlock in soreceive(). The PCB mutex provides a per
socket lock against multiple soreceive() running in parallel.
Release and regrab both locks in sosleep_nsec().
OK mvs@

show more ...


# 70f8daa1 01-Sep-2022 jsg <jsg@openbsd.org>

remove sb_lock() prototype; removed in uipc_socket2.c 1.64


# 621c49c5 21-Aug-2022 mvs <mvs@openbsd.org>

Change soabort() return value to void. We never interesting on it.

ok bluhm@


# 69690c9b 13-Aug-2022 mvs <mvs@openbsd.org>

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents the
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both 'socket' and 'protosw' structures are required to be
defined before pru_*() wrappers, so we need to include sys/socketvar.h to
sys/protosw.h.

ok bluhm@

show more ...


# 88d6f328 15-Jul-2022 deraadt <deraadt@openbsd.org>

pledge "getpw" would notice access to /var/run/ypbind.lock, and grant "inet"
rights, so that libc/yp could access YP services via a fairly complex 'protocol'
including file access, sockets, etc. Thi

pledge "getpw" would notice access to /var/run/ypbind.lock, and grant "inet"
rights, so that libc/yp could access YP services via a fairly complex 'protocol'
including file access, sockets, etc. This YP protocol is also used by ypldap --
this is our way of bringing 'NIS' services into libc without monster sub-libraries.
I have managed to remove this "inet" right by creating a new ypconnect() system
call, which performs parts of the yp_bind.c dance inside the kernel.. It checks if
domainname is set, looks for a binding file with advisory lock, reads it to
get the IP and udp/tcp port numbers, and then establishes a connnected socket
direct to that ypserv. This socket has a SS_YP flag set, and non-required system
calls are prohibited. libc maintains lifetime on this socket so a process
should never see it, but it seems safer to block udp re-connect and other calls
even in non-pledge mode.
Userland changes to use this will follow in a few days.
Lots of help from claudio and jmatthew, also ok miod

show more ...


123456