History log of /openbsd/sys/kern/sched_bsd.c (Results 1 – 25 of 96)
Revision Date Author Comments
# 133ff0d4 09-Oct-2024 claudio <claudio@openbsd.org>

Clear ps_xsig when continuing after a PS_TRACED stop.

Also remove the ps_xsig handling in setrunnable(); it is in the wrong spot
and causes signals to be delivered over and over again.

Attaching to an already stopped process is affected by this. The SIGSTOP
sent by ptrace is now ignored in ptsignal() and as a result gdb will hang
in wait4() until a SIGCONT is delivered to the process. After that all
works as usual.

OK mpi@



# 7b3f8d1d 08-Oct-2024 claudio <claudio@openbsd.org>

Move common code to update the proc runtime into tuagg_add_runtime().

OK mpi@ kn@


# 241d6723 08-Jul-2024 claudio <claudio@openbsd.org>

Rework per proc and per process time usage accounting

For procs (threads) the accounting happens now lockless by curproc using
a generation counter. Callers need to use tu_enter() and tu_leave() for this.
To read a proc's p_tu struct, tuagg_get_proc() should be used. It ensures
that the values read are consistent.

For processes only the time of exited threads is accumulated in ps_tu and
to get the proper process time usage tuagg_get_process() needs to be called.
tuagg_get_process() will sum up all procs p_tu plus the ps_tu.

This removes another SCHED_LOCK() dependency. Adjust the code in
exit1() and exit2() to correctly account for the full run time.
For this adjust sched_exit() to do the runtime accounting like it is done
in mi_switch().

OK jca@ dlg@
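
A rough userland model of the generation-counter scheme may help; the
struct layout and helper bodies below are illustrative assumptions, not
the actual kernel definitions.

    /* Seqlock-style accounting sketch, not the kernel code. */
    #include <stdint.h>

    struct tusage {
            uint64_t tu_gen;        /* even = stable, odd = update open */
            uint64_t tu_runtime;    /* accumulated run time */
    };

    static void
    tu_enter(struct tusage *tu)
    {
            tu->tu_gen++;                   /* odd: update in progress */
            __sync_synchronize();
    }

    static void
    tu_leave(struct tusage *tu)
    {
            __sync_synchronize();
            tu->tu_gen++;                   /* even again: update done */
    }

    /* Readers retry until they see the same even generation twice. */
    static uint64_t
    tuagg_get(const struct tusage *tu)
    {
            uint64_t gen, runtime;

            do {
                    while ((gen = tu->tu_gen) & 1)
                            ;               /* writer active, retry */
                    __sync_synchronize();
                    runtime = tu->tu_runtime;
                    __sync_synchronize();
            } while (gen != tu->tu_gen);
            return runtime;
    }

Only curproc updates its own p_tu, so a single writer is assumed; that
is what makes the lockless scheme safe.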



# a09e9584 03-Jun-2024 claudio <claudio@openbsd.org>

Remove the now unused s argument to SCHED_LOCK and SCHED_UNLOCK.

The SPL level is now tracked by the mutex itself and we no longer need to
track it in the callers.
OK miod@ mlarkin@ tb@ jca@


# de29a8a5 29-May-2024 claudio <claudio@openbsd.org>

Convert SCHED_LOCK from a recursive kernel lock to a mutex.

Over the last weeks the last SCHED_LOCK recursion was removed, so this
is now possible and will allow splitting up the SCHED_LOCK in an upcoming
step.

Instead of implementing an MP and SP version of SCHED_LOCK this just
always uses the mutex implementation.
While this makes the local s argument unused (the spl is now tracked by
the mutex itself) it is still there to keep this diff minimal.

Tested by many.
OK jca@ mpi@
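
The shape of the change, sketched from the description (not the verbatim
diff; the macro bodies are assumptions):

    /* Before: recursive kernel lock, caller saves the spl in s.
     * After: one global mutex; the spl is saved inside the mutex. */
    struct mutex sched_lock = MUTEX_INITIALIZER(IPL_SCHED);

    /* s is now unused; it survives only to keep this diff minimal. */
    #define SCHED_LOCK(s)   mtx_enter(&sched_lock)
    #define SCHED_UNLOCK(s) mtx_leave(&sched_lock)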



# e1edc428 30-Mar-2024 mpi <mpi@openbsd.org>

Prevent a recursion inside wakeup(9) when scheduler tracepoints are enabled.

Tracepoints like "sched:enqueue" and "sched:unsleep" were called from inside
the loop iterating over sleeping threads as part of wakeup_proc(). When such
tracepoints were enabled they could result in another wakeup(9) possibly
corrupting the sleepqueue.

Rewrite wakeup(9) in two stages: first dequeue threads from the sleepqueue,
then call setrunnable() and any enabled tracepoints for each of them.

This requires moving unsleep() outside of setrunnable() because it messes with
the sleepqueue.

ok claudio@
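
A compilable model of the two-stage structure; the real sleepqueue is
hashed and protected by its own lock, so the names and layout here are
simplified assumptions.

    #include <sys/queue.h>

    struct proc {
            TAILQ_ENTRY(proc)        p_runq;
            const void              *p_wchan;      /* sleep channel */
    };
    TAILQ_HEAD(prochead, proc);

    static struct prochead sleepqueue = TAILQ_HEAD_INITIALIZER(sleepqueue);

    void setrunnable(struct proc *);   /* may fire tracepoints, wakeup() */

    void
    wakeup(const void *ident)
    {
            struct prochead wakeq = TAILQ_HEAD_INITIALIZER(wakeq);
            struct proc *p, *next;

            /* Stage 1: unlink every matching thread from the sleepqueue. */
            TAILQ_FOREACH_SAFE(p, &sleepqueue, p_runq, next) {
                    if (p->p_wchan == ident) {
                            TAILQ_REMOVE(&sleepqueue, p, p_runq);
                            TAILQ_INSERT_TAIL(&wakeq, p, p_runq);
                    }
            }

            /*
             * Stage 2: the sleepqueue is consistent again, so tracepoints
             * and setrunnable() may run -- even a nested wakeup() can no
             * longer corrupt it.
             */
            while ((p = TAILQ_FIRST(&wakeq)) != NULL) {
                    TAILQ_REMOVE(&wakeq, p, p_runq);
                    setrunnable(p);
            }
    }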



# 1d970828 24-Jan-2024 cheloha <cheloha@openbsd.org>

clockintr: switch from callee- to caller-allocated clockintr structs

Currently, clockintr_establish() calls malloc(9) to allocate a
clockintr struct on behalf of the caller. mpi@ says this behavior is
incompatible with dt(4). In particular, calling malloc(9) during the
initialization of a PCB outside of dt_pcb_alloc() is (a) awkward and
(b) may conflict with future changes/optimizations to PCB allocation.

To side-step the problem, this patch changes the clockintr subsystem
to use caller-allocated clockintr structs instead of callee-allocated
structs.

clockintr_establish() is named after softintr_establish(), which uses
malloc(9) internally to create softintr objects. The clockintr subsystem
is no longer using malloc(9), so the "establish" naming is no longer apt.
To avoid confusion, this patch also renames "clockintr_establish" to
"clockintr_bind".

Requested by mpi@. Tweaked by mpi@.

Thread: https://marc.info/?l=openbsd-tech&m=170597126103504&w=2

ok claudio@ mlarkin@ mpi@
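
The gist of the API change, as a hedged sketch (argument lists are
abbreviated from the description above, not copied from the tree):

    /* Before: the subsystem malloc(9)s the struct on your behalf. */
    struct clockintr *cl = clockintr_establish(ci, callback, arg);

    /* After: the caller owns the storage -- e.g. embedded in a dt(4)
     * PCB -- and binds it; no allocation inside the subsystem. */
    struct clockintr cl;                    /* caller-allocated */
    clockintr_bind(&cl, ci, callback, arg);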



# 106c68c4 17-Oct-2023 cheloha <cheloha@openbsd.org>

clockintr: move callback-specific API behaviors to "clockrequest" namespace

The API's behavior when invoked from a callback function is impossible
to document. Move the special behavior into a distinct namespace,
"clockrequest".

- Add a 'struct clockrequest'. Basically a stripped-down 'struct clockintr'
for exclusive use during clockintr_dispatch().
- In clockintr_queue, replace the "cq_shadow" clockintr with a "cq_request"
clockrequest. They serve the same purpose.
- CLST_SHADOW_PENDING -> CR_RESCHEDULE; different namespace, same meaning.
- CLST_IGNORE_SHADOW -> CLST_IGNORE_REQUEST; same meaning.
- Move shadow branch in clockintr_advance() to clockrequest_advance().
- clockintr_request_random() becomes clockrequest_advance_random().
- Delete dead shadow branches in clockintr_cancel(), clockintr_schedule().
- Callback functions now get a clockrequest pointer instead of a special
clockintr pointer: update all prototypes, callers.

No functional change intended.
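
Sketch of a callback after the change; example_clock() and
example_period are hypothetical, but the clockrequest_advance() call
mirrors the rename described above.

    /* Dispatch now hands callbacks a request, not a clockintr. */
    void
    example_clock(struct clockrequest *cr, void *frame, void *arg)
    {
            uint64_t count;

            /* Rescheduling goes through the request object, which
             * also reports how many periods have expired. */
            count = clockrequest_advance(cr, example_period);
            if (count > 1) {
                    /* catch up after missed expirations */
            }
    }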



# 961828bc 11-Oct-2023 cheloha <cheloha@openbsd.org>

kernel: expand fixed clock interrupt periods to 64-bit values

Technically, all the current fixed clock interrupt periods fit within
an unsigned 32-bit value. But 32-bit multiplication is an accident
waiting to happen. So, expand the fixed periods for hardclock,
statclock, profclock, and roundrobin to 64-bit values.

One exception: statclock_mask remains 32-bit because random(9) yields
32-bit values. Update the initclocks() comment to make it clear that
this is not an accident.
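
A worked example of the accident being avoided: with a 10 ms period
stored in nanoseconds, a 32-bit product already wraps after roughly 4.3
seconds' worth of ticks.

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
            uint32_t period = 10000000;     /* 10 ms in nanoseconds */
            uint32_t n = 430;               /* ~4.3 seconds of ticks */

            /* 4,300,000,000 > UINT32_MAX: wraps to 5,032,704. */
            printf("32-bit: %u\n", period * n);
            printf("64-bit: %llu\n", (unsigned long long)period * n);
            return 0;
    }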



# 6c64bd7f 17-Sep-2023 cheloha <cheloha@openbsd.org>

scheduler_start: move static timeout structs into callback functions

Move the schedcpu() and update_loadavg() timeout structs from
scheduler_start() into their respective callback functions and
statically initialize them with TIMEOUT_INITIALIZER(9).

The structs are already hidden from the global namespace and the
timeouts are already self-managing, so we may as well fully
consolidate things.

Thread: https://marc.info/?l=openbsd-tech&m=169488184019047&w=2

"Sure." claudio@



# a3464c93 10-Sep-2023 cheloha <cheloha@openbsd.org>

clockintr: support an arbitrary callback function argument

Callers can now provide an argument pointer to clockintr_establish().
The pointer is kept in a new struct clockintr member, cl_arg. The
pointer is passed as the third parameter to clockintr.cl_func when it
is executed during clockintr_dispatch(). Like the callback function,
the callback argument is immutable after the clockintr is established.

At present, nothing uses this. All current clockintr_establish()
callers pass a NULL arg pointer. However, I am confident that dt(4)'s
profile provider will need this in the near future.

Requested by dlg@ back in March.
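
A hedged illustration with a hypothetical consumer; struct myctx and
mycallback() are invented for the example.

    struct myctx { int ticks; };            /* hypothetical state */

    void
    mycallback(struct clockintr *cl, void *frame, void *arg)
    {
            struct myctx *ctx = arg;        /* cl_arg arrives here */

            ctx->ticks++;
    }

    /* At establish time the pointer rides along in cl_arg: */
    cl = clockintr_establish(ci, mycallback, &ctx);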



# 3a63f5a8 30-Aug-2023 claudio <claudio@openbsd.org>

Preempt a running proc even if there is no other process/thread queued
on that CPU's runqueue. This way mi_switch() is invoked, which is necessary
to a) signal smr that the cpu changed context, b) update runtime stats, and
c) check requests to stop the CPU.
This should fix the issue reported by Eric Wong (e at 80x24 org) that
RLIMIT_CPU is unreliable on idle systems.
OK kettenis@ cheloha@



# 94c38e45 29-Aug-2023 claudio <claudio@openbsd.org>

Remove p_rtime from struct proc and replace it by passing the timespec
as argument to the tuagg_locked function.

- Remove incorrect use of p_rtime in other parts of the tree. p_rtime was
almost always 0 so including it in any sum did not alter the result.
- In main() the update of time can be further simplified since at that time
only the primary cpu is running.
- Add a missing nanouptime() call in cpu_hatch() for hppa.
- Rename tuagg_unlocked to tuagg_locked like it is done in the rest of
the tree.

OK cheloha@ dlg@



# 7927db41 19-Aug-2023 claudio <claudio@openbsd.org>

Refetch the spc pointer after cpu_switchto() since the value is stale
after the proc switch. With the value refetched the rest of the code
can be simplified.
Input guenther@, OK cheloha@, miod@


# 02d561d0 18-Aug-2023 claudio <claudio@openbsd.org>

Move the loadavg calculation to sched_bsd.c as update_loadavg()

With this uvm_meter() is no more and update_loadavg() uses a simple timeout
instead of getting called via schedcpu().

OK deraadt@ mpi@ cheloha@



# 9b3d5a4a 14-Aug-2023 mpi <mpi@openbsd.org>

Extend scheduler tracepoints to follow CPU jumping.

- Add two new tracepoints sched:fork & sched:steal
- Include selected CPU number in sched:wakeup
- Add sched:unsleep corresponding to sched:sleep, matching the addition and
removal of threads on the sleep queue

ok claudio@



# 9ac452c7 11-Aug-2023 cheloha <cheloha@openbsd.org>

hardclock(9), roundrobin: make roundrobin() an independent clock interrupt

- Remove the roundrobin() call from hardclock(9).

- Revise roundrobin() to make it a valid clock interrupt callback.
It is still periodic and it still runs at one tenth of the hardclock
frequency.

- Account for multiple expirations in roundrobin(): if two or more
roundrobin periods have elapsed, set SPCF_SHOULDYIELD on the running
thread immediately to simulate normal behavior.

- Each schedstate_percpu has its own roundrobin() handle, spc_roundrobin.
spc_roundrobin is started/advanced during clockintr_cpu_init().
Intervals elapsed across suspend/resume are discarded.

- rrticks_init and schedstate_percpu.spc_rrticks are now useless:
delete them.

Tweaked by mpi@. With input from mpi@ and claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169127381314651&w=2

ok mpi@ claudio@
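
A sketch of the multiple-expiration handling, using the interfaces as of
this commit (they were later renamed to clockrequest_*, see above); the
body is an approximation, not the committed code.

    void
    roundrobin(struct clockintr *cl, void *cf)
    {
            struct schedstate_percpu *spc = &curcpu()->ci_schedstate;
            uint64_t count;

            count = clockintr_advance(cl, roundrobin_period);
            if (count > 1) {
                    /* Two or more periods elapsed: yield right away. */
                    atomic_setbits_int(&spc->spc_schedflags,
                        SPCF_SHOULDYIELD);
            }
    }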



# 44e0cbf2 05-Aug-2023 cheloha <cheloha@openbsd.org>

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@
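
Why the flag is cheaper than the mutex, in sketch form; the body is
approximated from the description, not taken from the tree.

    void
    itimer_update(struct clockintr *cl, void *cf, void *arg)
    {
            struct process *pr = curproc->p_p;

            if (!ISSET(pr->ps_flags, PS_ITIMER))
                    return;                 /* common case: no mutex */

            mtx_enter(&itimer_mtx);
            /* decrement ITIMER_VIRTUAL / ITIMER_PROF here */
            mtx_leave(&itimer_mtx);
    }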



# 671537bf 25-Jul-2023 cheloha <cheloha@openbsd.org>

statclock: move profil(2), GPROF code to profclock(), gmonclock()

This patch isolates profil(2) and GPROF from statclock(). Currently,
statclock() implements both profil(2) and GPROF through a complex
mechanism involving both platform code (setstatclockrate) and the
scheduler (pscnt, psdiv, and psratio). We have a machine-independent
interface to the clock interrupt hardware now, so we no longer need to
do it this way.

- Move profil(2)-specific code from statclock() to a new clock
interrupt callback, profclock(), in subr_prof.c. Each
schedstate_percpu has its own profclock handle. The profclock is
enabled/disabled for a given CPU when it is needed by the running
thread during mi_switch() and sched_exit().

- Move GPROF-specific code from statclock() to a new clock interrupt
callback, gmonclock(), in subr_prof.c. Where available, each cpu_info
has its own gmonclock handle. The gmonclock is enabled/disabled for
a given CPU via sysctl(2) in prof_state_toggle().

- Both profclock() and gmonclock() have a fixed period, profclock_period,
that is initialized during initclocks().

- Export clockintr_advance(), clockintr_cancel(), clockintr_establish(),
and clockintr_stagger() via <sys/clockintr.h>. They have external
callers now.

- Delete pscnt, psdiv, psratio. From schedstate_percpu, also delete
spc_pscnt and spc_psdiv. The statclock frequency is not dynamic
anymore so these variables are now useless.

- Delete code/state related to the dynamic statclock frequency from
kern_clockintr.c. The statclock frequency can still be pseudo-random,
so move the contents of clockintr_statvar_init() into clockintr_init().

With input from miod@, deraadt@, and claudio@. Early revisions
cleaned up by claudio. Early revisions tested by claudio@. Tested by
cheloha@ on amd64, arm64, macppc, octeon, and sparc64 (sun4v).
Compile- and boot- tested on i386 by mlarkin@. riscv64 compilation
bugs found by mlarkin@. Tested on riscv64 by jca@. Tested on
powerpc64 by gkoehler@.
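
The rough shape of the split-out profil(2) callback, approximated from
the description (consult subr_prof.c for the real code):

    void
    profclock(struct clockintr *cl, void *cf, void *arg)
    {
            struct clockframe *frame = cf;
            struct proc *p = curproc;
            uint64_t count;

            count = clockintr_advance(cl, profclock_period);
            if (CLKF_USERMODE(frame) &&
                ISSET(p->p_p->ps_flags, PS_PROFIL))
                    addupc_intr(p, CLKF_PC(frame), count);
    }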



# aa563902 11-Jul-2023 claudio <claudio@openbsd.org>

Rework sleep_setup()/sleep_finish() to no longer hold the scheduler lock
between calls.

Instead of forcing an atomic operation across multiple calls use a three
step transaction.
1. setup sleep state by calling sleep_setup()
2. recheck sleep condition to ensure that the event did not fire before
sleep_setup() registered the proc onto the sleep queue
3. call sleep_finish() to either sleep or keep on running based on the
step 2 outcome and any possible signal delivery

To make this work, wakeup from signals, the single-thread API, and wakeup(9)
need to be aware of whether a process is between step 1 and step 3, so that
the process is not enqueued back onto the runqueue while going to sleep.
Introduce
the p_flag P_WSLEEP to detect this situation.

On top of this remove the spl dance in msleep() which is no longer required.
It is ok to process interrupts between step 1 and 3.

OK mpi@ cheloha@
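
In caller terms the transaction looks roughly like this; event_fired()
is a stand-in for the caller's own condition check.

    int
    example_sleep(const volatile void *ident, int timo)
    {
            int do_sleep;

            sleep_setup(ident, PWAIT, "example");   /* 1: register */
            do_sleep = !event_fired(ident);         /* 2: recheck */
            return sleep_finish(timo, do_sleep);    /* 3: sleep or not */
    }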



# 2e520600 21-Jun-2023 cheloha <cheloha@openbsd.org>

Revert "schedcpu, uvm_meter(9): make uvm_meter() an independent timeout"

Sometimes causes boot hang after mounting root partition.

Thread 1: https://marc.info/?l=openbsd-misc&m=168736497407357&w=2

Revert "schedcpu, uvm_meter(9): make uvm_meter() an independent timeout"

Sometimes causes boot hang after mounting root partition.

Thread 1: https://marc.info/?l=openbsd-misc&m=168736497407357&w=2
Thread 2: https://marc.info/?l=openbsd-misc&m=168737429214370&w=2



# 71d823ac 20-Jun-2023 cheloha <cheloha@openbsd.org>

schedcpu, uvm_meter(9): make uvm_meter() an independent timeout

uvm_meter(9) should not base its periodic uvm_loadav() call on the UTC
clock. It also no longer needs to periodically wake up proc0 because
proc0 doesn't do any work. schedcpu() itself may change or go away,
but as kettenis@ notes we probably can't completely remove the concept
of a "load average" from OpenBSD, given its long Unix heritage.

So, (1) remove the uvm_meter() call from schedcpu(), (2) make
uvm_meter() an independent timeout started alongside schedcpu() during
scheduler_start(), and (3) delete the vestigial periodic proc0 wakeup.

With input from deraadt@, kettenis@, and claudio@. deraadt@ cautions
that this change may confuse administrators who hold the load average
in high regard.

Thread: https://marc.info/?l=openbsd-tech&m=168710929409153&w=2

general agreement with this direction from kettenis@
ok claudio@



# 9bcfcad5 04-Feb-2023 cheloha <cheloha@openbsd.org>

kernel: stathz is always non-zero after cpu_initclocks()

Now that the clockintr switch is complete, cpu_initclocks() always
initializes stathz to a non-zero value. We don't call statclock()
from hardclock(9) anymore and, more broadly, we don't need to test
whether stathz is non-zero before using it.

With input from kettenis@.

Link: https://marc.info/?l=openbsd-tech&m=167434223309668&w=2

ok kettenis@ miod@



# 2b46a8cb 05-Dec-2022 deraadt <deraadt@openbsd.org>

zap a pile of dangling tabs


# 0d280c5f 14-Aug-2022 jsg <jsg@openbsd.org>

remove unneeded includes in sys/kern
ok mpi@ miod@

