#
938ff1ae |
| 23-Dec-2023 |
bluhm <bluhm@openbsd.org> |
Backout always allocate per-CPU statistics counters for network interface descriptor. It panics during attach of em(4) device at boot.
|
#
4046f503 |
| 22-Dec-2023 |
mvs <mvs@openbsd.org> |
Always allocate per-CPU statistics counters for network interface descriptor.
We have a mess in network interface statistics. Only pseudo drivers do per-CPU counters allocation, all other network devices use the old `if_data'. The network stack partially uses per-CPU counters and partially uses `if_data', but the protection is inconsistent: sometimes counters are accessed with exclusive netlock, sometimes with shared netlock, sometimes with kernel lock but without netlock, sometimes with other locks.
To make network interface statistics more consistent, always allocate per-CPU counters at interface attachment time and use them instead of `if_data'. At this step only move counters allocation to the if_attach() internals. The `if_data' removal will be performed with the following diffs to make review and tests easier.
ok bluhm
|
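The per-CPU counter idea described in 4046f503 can be sketched roughly as below. This is a userland model, not the kernel's counters API: `NCPU`, `struct percpu_counters`, `counters_inc()` and `counters_read()` are hypothetical names chosen for illustration. Each CPU updates only its own slot, and a reader sums all slots to get the interface total.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for this model */

enum { CNT_IPACKETS, CNT_OPACKETS, CNT_NCOUNTERS };

/* One row of counters per CPU, so writers never share a slot. */
struct percpu_counters {
	uint64_t c[NCPU][CNT_NCOUNTERS];
};

/* Bump one counter in the calling CPU's private row. */
void
counters_inc(struct percpu_counters *pc, int cpu, int which)
{
	assert(cpu >= 0 && cpu < NCPU);
	assert(which >= 0 && which < CNT_NCOUNTERS);
	pc->c[cpu][which]++;
}

/* Readers fold all per-CPU rows into a single total. */
uint64_t
counters_read(const struct percpu_counters *pc, int which)
{
	uint64_t sum = 0;
	int cpu;

	assert(which >= 0 && which < CNT_NCOUNTERS);
	for (cpu = 0; cpu < NCPU; cpu++)
		sum += pc->c[cpu][which];
	return sum;
}
```

A real kernel implementation also has to make reads consistent against concurrent updates, which this model leaves out.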
#
6e862afc |
| 10-Feb-2023 |
visa <visa@openbsd.org> |
Make tun(4) and tap(4) event filters MP-safe.
OK mvs@
|
#
1525749f |
| 02-Jul-2022 |
visa <visa@openbsd.org> |
Remove unused device poll functions.
Also remove unneeded includes of <sys/poll.h> and <sys/select.h>.
Some addenda from jsg@.
OK miod@ mpi@
|
#
1c9104c3 |
| 26-Feb-2022 |
dlg <dlg@openbsd.org> |
have another go at fixing assert "sc->sc_dev == NUM" failed.
claudio figured it out. his clue was that multiple concurrent calls to tunopen (or tapopen) will share a vnode. because tunopen can sleep, multiple programs can be inside tunopen for the same tun interface at the same time, all with references against the same vnode.
at the same time as this another thread/program can call VOP_REVOKE via tun_clone_destroy (eg, ifconfig tun1 destroy does this). VOP_REVOKE marks a vnode as bad, which in turn means that subsequent open()s of a tun interface will get a brand new vnode.
so multiple threads holding references to a vnode can be sleeping in tun_dev_open on the interface cloner lock. one thread wins and takes ownership of the tun interface, then another thread can destroy that tun interface, calls VOP_REVOKE which calls tun_dev_close to tear down the vnode's association with the tun interface and mark the vnode as bad. the thread that called tun_clone_destroy then creates another instance of the interface by calling tun_clone_create immediately.
one of the original threads with the old vnode reference wakes up and takes ownership of the new tun_softc. however, because the vnode is bad, all the vnode ops have been replaced with the deadfs ops. the close() op on the old vnode is now a nop from the point of view of tun interfaces. the old vnode is no longer associated with tun and tap and will now never call tun_dev_close (via tunclose or tapclose), which in turn means sc_dev won't get cleared.
another thread can now call tun_clone_destroy against the new instance of tun_softc. this instance has sc_dev set, so it tries to revoke it, but there's no vnode associated with it because the old vnode reference is dead.
because this second call to VOP_REVOKE couldn't find a vnode, it can't call tunclose against it, so sc_dev is still set and this KASSERT fires.
claudio and i came up with the following, which is to have tun_dev_open check the state of the vnode associated with the current open call after all the sleeping and potential tun_clone_destroy and tun_clone_create calls. if the vnode has been made bad/dead after all the sleeping, it returns with ENXIO.
Reported-by: syzbot+5e13201866c43afbfbf6@syzkaller.appspotmail.com
ok claudio@ visa@
|
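The fix described in 1c9104c3 (recheck the vnode after sleeping, fail with ENXIO if it was revoked) can be modeled in userland roughly as follows. All names here (`struct vnode_model`, `struct softc_model`, `revoke_model()`, `open_after_sleep()`) are hypothetical; the `sleep_hook` callback merely stands in for whatever happens while the opener sleeps on the cloner lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a vnode: revoke marks it dead. */
struct vnode_model {
	bool dead;
};

/* Hypothetical model of the softc: owner plays the role of sc_dev. */
struct softc_model {
	struct vnode_model *owner;
};

/* Stand-in for VOP_REVOKE: the vnode can no longer reach the driver. */
void
revoke_model(struct vnode_model *vp)
{
	vp->dead = true;
}

/*
 * Open path: sleep_hook models the sleep during which a destroy and
 * re-create may happen. The vnode state is checked only afterwards.
 */
int
open_after_sleep(struct softc_model *sc, struct vnode_model *vp,
    void (*sleep_hook)(struct vnode_model *))
{
	if (sleep_hook != NULL)
		sleep_hook(vp);

	/* A dead vnode would never call close, so refuse ownership. */
	if (vp->dead)
		return ENXIO;

	assert(sc->owner == NULL);
	sc->owner = vp;
	return 0;
}
```

Without the `dead` check, the revoked opener would take ownership and nothing would ever clear `owner` again, which is the shape of the failed assertion.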
#
7eb8d89d |
| 22-Feb-2022 |
guenther <guenther@openbsd.org> |
Delete unnecessary #includes of <sys/domain.h> and/or <sys/protosw.h>
net/if_pppx.c pointed out by jsg@ ok gnezdo@ deraadt@ jsg@ mpi@ millert@
|
#
156bbf72 |
| 16-Feb-2022 |
dlg <dlg@openbsd.org> |
prevent (re)opening of tun/tap interfaces that are being destroyed.
if an open tun (or tap) device is destroyed via the clone destroy ioctl (eg, like what ifconfig destroy does), there is a window while the open device is being revoked on the vfs side that a third thread can come and open it again. this in turn triggers a kassert in the ifconfig destroy path where it expects the device to be closed.
fix this by having tun_dev_open check for the TUN_DEAD flag that the destroy function sets. this still relies on the kernel lock for serialisation.
Reported-by: syzbot+5df2ad232f5f8b671442@syzkaller.appspotmail.com
ok visa@
|
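A minimal sketch of the dead-flag check from 156bbf72, assuming a single-threaded model where the caller provides serialisation (the commit relies on the kernel lock for that). `DEV_DEAD` is modeled on TUN_DEAD; the other names and the specific errno values are assumptions for illustration, since the commit message does not state them.

```c
#include <assert.h>
#include <errno.h>

#define DEV_DEAD	0x1	/* hypothetical, modeled on TUN_DEAD */
#define DEV_OPEN	0x2	/* hypothetical "device is open" flag */

struct dev_model {
	unsigned int flags;
};

/* Destroy path: mark the device dead before tearing it down. */
void
dev_destroy_begin(struct dev_model *d)
{
	d->flags |= DEV_DEAD;
}

/* Open path: refuse devices that are in the middle of being destroyed. */
int
dev_open(struct dev_model *d)
{
	if (d->flags & DEV_DEAD)
		return ENXIO;	/* assumed errno */
	if (d->flags & DEV_OPEN)
		return EBUSY;	/* assumed errno */
	d->flags |= DEV_OPEN;
	return 0;
}

void
dev_close(struct dev_model *d)
{
	assert(d->flags & DEV_OPEN);
	d->flags &= ~DEV_OPEN;
}
```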
#
4ad9c8ac |
| 15-Feb-2022 |
dlg <dlg@openbsd.org> |
only tweak ifp if_flags while holding NET_LOCK.
tun_dev_open and tun_dev_close were being optimistic.
|
#
0e8c468f |
| 15-Feb-2022 |
dlg <dlg@openbsd.org> |
make tun_link_state take the ifnet pointer instead of tun_softc.
it only works on struct ifnet data, so passing ifp makes it clearer what's actually being manipulated. also fix tun_dev_open so tun_link_state is called before if_put instead of immediately after.
|
#
43dfcaac |
| 09-Mar-2021 |
anton <anton@openbsd.org> |
Issuing FIOSETOWN and TIOCSPGRP ioctl commands on a tun(4) device leaks device references causing a hang while trying to remove the same interface since the reference count will never reach zero. Instead of returning, break out of the switch in order to ensure that tun_put() gets called.
ok deraadt@ mvs@
Reported-by: syzbot+2ca11c73711a1d0b5c6c@syzkaller.appspotmail.com
|
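The reference leak fixed in 43dfcaac follows a common pattern: a `return` inside a `switch` skips the release at the end of the function. The sketch below is a userland model with hypothetical names (`struct ref_dev`, `dev_get()`, `dev_put()`, `CMD_SETOWN`); it is not the driver's actual ioctl code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reference-counted device, not the real tun_softc. */
struct ref_dev {
	int refs;
	int pgid;
};

enum cmd { CMD_SETOWN, CMD_GETOWN, CMD_OTHER };

struct ref_dev *
dev_get(struct ref_dev *d)
{
	d->refs++;
	return d;
}

void
dev_put(struct ref_dev *d)
{
	assert(d->refs > 0);
	d->refs--;
}

/* Buggy shape: the early return skips dev_put(), leaking a reference. */
int
dev_ioctl_leaky(struct ref_dev *dev, enum cmd cmd, int *data)
{
	struct ref_dev *d = dev_get(dev);
	int error = 0;

	switch (cmd) {
	case CMD_SETOWN:
		d->pgid = *data;
		return 0;	/* reference is never released */
	case CMD_GETOWN:
		*data = d->pgid;
		break;
	default:
		error = ENOTTY;
		break;
	}
	dev_put(d);
	return error;
}

/* Fixed shape: every case breaks, so the single dev_put() always runs. */
int
dev_ioctl_fixed(struct ref_dev *dev, enum cmd cmd, int *data)
{
	struct ref_dev *d = dev_get(dev);
	int error = 0;

	switch (cmd) {
	case CMD_SETOWN:
		d->pgid = *data;
		break;
	case CMD_GETOWN:
		*data = d->pgid;
		break;
	default:
		error = ENOTTY;
		break;
	}
	dev_put(d);
	return error;
}
```

With the leaky variant the count stays above zero after the call, so a destroy path that waits for it to reach zero would wait forever.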
#
42c39955 |
| 20-Feb-2021 |
dlg <dlg@openbsd.org> |
let tun use bpf_mtap for handling input packets.
tun (not tap) input packets are written from userland in the same format that its bpf dlt is expecting, so we can push the packet straight into bpf with bpf_mtap. this is more correct than using bpf_mtap_ether for tun.
|
#
7e26cc3c |
| 19-Jan-2021 |
mvs <mvs@openbsd.org> |
pipex(4): convert ifunit() to if_unit(9)
ok dlg@
|
#
9b0cf67b |
| 25-Dec-2020 |
visa <visa@openbsd.org> |
Refactor klist insertion and removal
Rename klist_{insert,remove}() to klist_{insert,remove}_locked(). These functions assume that the caller has locked the klist. The current state of locking remains intact because the kernel lock is still used with all klists.
Add new functions klist_insert() and klist_remove() that lock the klist internally. This allows some code simplification.
OK mpi@
|
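The split described in 9b0cf67b, a `_locked` variant that assumes the caller holds the lock plus a wrapper that takes the lock itself, can be sketched as below. This is a toy userland model: `struct list_model`, its boolean "lock" and the `list_*` functions are hypothetical and only mirror the naming scheme, not the kernel's klist implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node_model {
	struct node_model *next;
};

/* Toy list with a boolean standing in for a real lock. */
struct list_model {
	bool locked;
	struct node_model *head;
	int count;
};

void
list_lock(struct list_model *l)
{
	assert(!l->locked);
	l->locked = true;
}

void
list_unlock(struct list_model *l)
{
	assert(l->locked);
	l->locked = false;
}

/* Caller must already hold the lock. */
void
list_insert_locked(struct list_model *l, struct node_model *n)
{
	assert(l->locked);
	n->next = l->head;
	l->head = n;
	l->count++;
}

/* Convenience wrapper: takes and releases the lock internally. */
void
list_insert(struct list_model *l, struct node_model *n)
{
	list_lock(l);
	list_insert_locked(l, n);
	list_unlock(l);
}
```

Callers that already hold the lock for other reasons use the `_locked` form; everyone else uses the wrapper and no longer needs lock handling at each call site.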
#
17a53f5a |
| 04-Oct-2020 |
anton <anton@openbsd.org> |
fix indent
|
#
0e70c421 |
| 21-Aug-2020 |
kn <kn@openbsd.org> |
Leave default ifq_maxlen handling to ifq_init()
Most clonable interface drivers (except bridge, enc, loop, pppx, switch, trunk and vlan) initialise the send queue's length to IFQ_MAXLEN during *_clone_create() even though ifq_init(), which is eventually called through if_attach(), does the same.
Remove all early "ifq_set_maxlen(&ifq->if_snd, IFQ_MAXLEN);" lines to leave it to ifq_init() and have clonable drivers a tad more in sync.
OK mvs
|
#
23293512 |
| 22-Jul-2020 |
dlg <dlg@openbsd.org> |
deprecate interface input handler lists, just use one input function.
the interface input handler lists were originally set up to help us during the initial mpsafe network stack work. at the time not all the virtual ethernet interfaces (vlan, svlan, bridge, trunk, etc) were mpsafe, so we wanted a way to avoid them by default, and only take the kernel lock hit when they were specifically enabled on the interface. since then, they have been fixed up to be mpsafe.
i could leave the list in place, but it has some semantic problems. because virtual interfaces filter packets based on the order they were attached to the parent interface, you can get packets taken away in surprising ways, especially when you reboot and netstart does something different to what you did by hand. by hardcoding the order that things like vlan and bridge get to look at packets, we can document the behaviour and get consistency.
it also means we can get rid of a use of SRPs which were difficult to replace with SMRs. the interface input handler list is an SRPL, which we would like to deprecate. it turns out that you can sleep during stack processing, which you're not supposed to do with SRPs or SMRs, but SRPs are a lot more forgiving and it worked.
lastly, it turns out that this code is faster than the input list handling, so lots of winning all around.
special thanks to hrvoje popovski and aaron bieber for testing. this has been in snaps as part of a larger diff for over a week.
|
#
0cae21bd |
| 10-Jul-2020 |
patrick <patrick@openbsd.org> |
Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.
ok dlg@ tobhe@
|
#
63bcfa73 |
| 10-Jul-2020 |
patrick <patrick@openbsd.org> |
Change users of IFQ_DEQUEUE(), IFQ_ENQUEUE() and IFQ_LEN() to use the "new" API.
ok dlg@ tobhe@
|
#
ec906fa8 |
| 13-May-2020 |
dlg <dlg@openbsd.org> |
only pass the IO_NDELAY flag to ifq_deq_sleep as the nbio argument.
|
#
3c6c3993 |
| 12-Apr-2020 |
mpi <mpi@openbsd.org> |
Stop processing packets under non-exclusive (read) netlock.
Prevent concurrency in the socket layer which is not ready for that.
Two recent data corruptions in pfsync(4) and the socket layer pointed out that, at least, tun(4) was incorrectly using NET_RUNLOCK(). Until we find a way in software to avoid future mistakes and to make sure that only the softnet thread and some ioctls are safe to use a read version of the lock, put everything back to the exclusive version.
ok stsp@, visa@
|
#
9c969c9a |
| 07-Apr-2020 |
visa <visa@openbsd.org> |
Abstract the head of knote lists. This allows extending the lists, for example, with locking assertions.
OK mpi@, anton@
|
#
b8213689 |
| 20-Feb-2020 |
visa <visa@openbsd.org> |
Replace field f_isfd with field f_flags in struct filterops to allow adding more filter properties without cluttering the struct.
OK mpi@, anton@
|
#
112e0b7c |
| 14-Feb-2020 |
mpi <mpi@openbsd.org> |
Push the KERNEL_LOCK() inside pgsigio() and selwakeup().
The 3 subsystems: signal, poll/select and kqueue can now be addressed separately.
Note that bpf(4) and audio(4) currently delay the wakeups to a separate context in order to respect the KERNEL_LOCK() requirement. Sockets (UDP, TCP) and pipes spin to grab the lock for the same reasons.
ok anton@, visa@
|
#
3c2d0735 |
| 31-Jan-2020 |
dlg <dlg@openbsd.org> |
actually set the link state down when the /dev entry is closed.
this means a route message is sent when the interface is closed and goes down, but also causes another route message to be sent when the interface comes up on the next open. this is important for things like ospfd and the ospfd regress test because they want to know when link comes up.
the regression was pointed out by bluhm, who also helped me isolate the problem.
|
#
4f8d0d25 |
| 30-Jan-2020 |
dlg <dlg@openbsd.org> |
device poll handlers should return POLL flags, not errnos.
this restores returning POLLERR when the device is gone. ENXIO doesn't make much sense as part of a pollfd revents field.
|
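The point of 4f8d0d25 is that a poll handler's return value is a bitmask of POLL flags that ends up in `revents`, so an errno such as ENXIO does not belong there. A rough userland sketch, with a hypothetical `struct poll_dev` and `dev_poll()` rather than the driver's real handler:

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state for this model. */
struct poll_dev {
	bool gone;	/* device has been destroyed */
	int queued;	/* packets waiting to be read */
};

/*
 * Returns a mask of POLL* flags, never an errno: the caller stores
 * the value directly in a pollfd's revents field.
 */
int
dev_poll(const struct poll_dev *d, int events)
{
	int revents = 0;

	assert(d != NULL);

	if (d->gone)
		return POLLERR;

	/* Readable only when something is queued. */
	if ((events & POLLIN) && d->queued > 0)
		revents |= POLLIN;

	/* This model treats the device as always writable. */
	if (events & POLLOUT)
		revents |= POLLOUT;

	return revents;
}
```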