History log of /openbsd/sys/kern/kern_sched.c (Results 1 – 25 of 96)
Revision Date Author Comments
# a09e9584 03-Jun-2024 claudio <claudio@openbsd.org>

Remove the now unused s argument to SCHED_LOCK and SCHED_UNLOCK.

The SPL level is now tracked by the mutex and we no longer need to track
this in the callers.
OK miod@ mlarkin@ tb@ jca@


# 2286c11f 28-Feb-2024 mpi <mpi@openbsd.org>

No need to kick a CPU twice when putting a thread on its runqueue.

From Christian Ludwig, ok claudio@


# 1d970828 24-Jan-2024 cheloha <cheloha@openbsd.org>

clockintr: switch from callee- to caller-allocated clockintr structs

Currently, clockintr_establish() calls malloc(9) to allocate a
clockintr struct on behalf of the caller. mpi@ says this behavior is
incompatible with dt(4). In particular, calling malloc(9) during the
initialization of a PCB outside of dt_pcb_alloc() is (a) awkward and
(b) may conflict with future changes/optimizations to PCB allocation.

To side-step the problem, this patch changes the clockintr subsystem
to use caller-allocated clockintr structs instead of callee-allocated
structs.

clockintr_establish() is named after softintr_establish(), which uses
malloc(9) internally to create softintr objects. The clockintr subsystem
is no longer using malloc(9), so the "establish" naming is no longer apt.
To avoid confusion, this patch also renames "clockintr_establish" to
"clockintr_bind".

Requested by mpi@. Tweaked by mpi@.

Thread: https://marc.info/?l=openbsd-tech&m=170597126103504&w=2

ok claudio@ mlarkin@ mpi@
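
The caller-allocated approach described in this commit can be sketched in plain C. This is a minimal userland model, not the kernel API: struct clockintr_model, struct percpu_model, model_bind(), tick() and model_demo() are hypothetical names invented for illustration. It only shows why no allocator is needed once the caller embeds the struct in storage it already owns.

```c
/* Hypothetical stand-in for the kernel's struct clockintr. */
struct clockintr_model {
	void (*cl_func)(void *);	/* callback run when the interrupt fires */
	void *cl_arg;			/* opaque argument handed to cl_func */
	int cl_bound;			/* nonzero once the struct is bound */
};

/*
 * Caller-allocated style: the caller owns the storage, so binding
 * needs no allocator and cannot fail for lack of memory.
 */
static void
model_bind(struct clockintr_model *cl, void (*func)(void *), void *arg)
{
	cl->cl_func = func;
	cl->cl_arg = arg;
	cl->cl_bound = 1;
}

/* Example owner that embeds the struct, as a per-CPU state struct might. */
struct percpu_model {
	struct clockintr_model pm_tick;
	int pm_count;
};

/* Example callback: count how often it ran. */
static void
tick(void *arg)
{
	struct percpu_model *pm = arg;

	pm->pm_count++;
}

/* Bind the embedded struct, fire it once, and return the resulting count. */
static int
model_demo(void)
{
	struct percpu_model pm = { .pm_count = 0 };

	model_bind(&pm.pm_tick, tick, &pm);
	pm.pm_tick.cl_func(pm.pm_tick.cl_arg);
	return pm.pm_count;
}
```

A callee-allocated variant would instead return a pointer obtained from an allocator, which is the behavior the commit message describes as awkward during PCB initialization.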


# bb00e811 24-Oct-2023 claudio <claudio@openbsd.org>

Normally context switches happen in mi_switch() but there are 3 cases
where a switch happens outside. Clean up these code paths and make them
machine independent.

- when a process forks (fork, tfork, kthread), the new proc needs to
somehow be scheduled for the first time. This is done by proc_trampoline.
Since proc_trampoline is machine dependent assembler code, change
the MP specific proc_trampoline_mp() to proc_trampoline_mi() and make
sure it is now always called.
- cpu_hatch: when booting APs the code needs to jump to the first proc
running on that CPU. This should be the idle thread for that CPU.
- sched_exit: when a proc exits it needs to switch away from itself and
then instruct the reaper to clean up the rest. This is done by switching
to the idle loop.

Since the last two cases require a context switch to the idle proc, factor
out the common code to sched_toidle() and use it in those places.

Tested by many on all archs.
OK miod@ mpi@ cheloha@


# 709f9596 19-Sep-2023 claudio <claudio@openbsd.org>

Add a KASSERT for p->p_wchan == NULL to setrunqueue()

There is the same check in sched_chooseproc() but that is too late
to know where the bad insertion into the runqueue was done.
OK mpi@
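
The reasoning in this commit, checking at insertion time rather than at selection time, can be illustrated with a small userland model. The names rq_proc, rq_model, rq_insert() and rq_choose() are hypothetical, and assert() stands in for the kernel's KASSERT(). It is a sketch of the idea, not the scheduler's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct proc and a run queue. */
struct rq_proc {
	const void *p_wchan;		/* non-NULL while on a sleep queue */
	struct rq_proc *p_next;
};

struct rq_model {
	struct rq_proc *head;
};

/* Check at insertion time, so a failure points at the offending caller. */
static void
rq_insert(struct rq_model *rq, struct rq_proc *p)
{
	assert(p->p_wchan == NULL);	/* stand-in for the kernel's KASSERT() */
	p->p_next = rq->head;
	rq->head = p;
}

/* Remove and return the most recently inserted proc, or NULL if empty. */
static struct rq_proc *
rq_choose(struct rq_model *rq)
{
	struct rq_proc *p = rq->head;

	if (p != NULL) {
		/* A check here fires long after the bad insert happened. */
		assert(p->p_wchan == NULL);
		rq->head = p->p_next;
		p->p_next = NULL;
	}
	return p;
}
```

With the check in rq_insert(), a backtrace from a failed assertion contains the caller that queued the sleeping proc. With only the check in rq_choose(), that caller has already returned.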


# a332869a 14-Sep-2023 cheloha <cheloha@openbsd.org>

clockintr, scheduler: move statclock handle from clockintr_queue to schedstate_percpu

Move the statclock handle from clockintr_queue.cq_statclock to
schedstate_percpu.spc_statclock. Establish spc_statclock during
sched_init_cpu() alongside the other scheduler clock interrupts.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# a3464c93 10-Sep-2023 cheloha <cheloha@openbsd.org>

clockintr: support an arbitrary callback function argument

Callers can now provide an argument pointer to clockintr_establish().
The pointer is kept in a new struct clockintr member, cl_arg. The
pointer is passed as the third parameter to clockintr.cl_func when it
is executed during clockintr_dispatch(). Like the callback function,
the callback argument is immutable after the clockintr is established.

At present, nothing uses this. All current clockintr_establish()
callers pass a NULL arg pointer. However, I am confident that dt(4)'s
profile provider will need this in the near future.

Requested by dlg@ back in March.


# 529ac442 06-Sep-2023 cheloha <cheloha@openbsd.org>

clockintr: clockintr_establish: change first argument to a cpu_info pointer

All CPUs control a single clockintr_queue. clockintr_establish()
callers don't need to know about the underlying clockintr_queue.
Accepting a cpu_info pointer as argument simplifies the API.

From mpi@.

ok mpi@


# 3cdedeae 31-Aug-2023 cheloha <cheloha@openbsd.org>

sched_cpu_init: remove unnecessary NULL-checks for clockintr pointers

sched_cpu_init() is only run once per cpu_info struct, so we don't
need these NULL-checks.

The NULL-checks are a vestige of clockintr_cpu_init(), which runs more
than once per CPU and uses the checks to avoid leaking clockintr handles.

Thread: https://marc.info/?l=openbsd-tech&m=169349579804340&w=2

ok claudio@


# 94c38e45 29-Aug-2023 claudio <claudio@openbsd.org>

Remove p_rtime from struct proc and replace it by passing the timespec
as argument to the tuagg_locked function.

- Remove incorrect use of p_rtime in other parts of the tree. p_rtime was
almost always 0 so including it in any sum did not alter the result.
- In main() the update of time can be further simplified since at that time
only the primary cpu is running.
- Add missing nanouptime() call in cpu_hatch() for hppa
- Rename tuagg_unlocked to tuagg_locked like it is done in the rest of
the tree.

OK cheloha@ dlg@


# 9b3d5a4a 14-Aug-2023 mpi <mpi@openbsd.org>

Extend scheduler tracepoints to follow CPU jumping.

- Add two new tracepoints sched:fork & sched:steal
- Include selected CPU number in sched:wakeup
- Add sched:unsleep corresponding to sched:sleep which matches add/removal
of threads on the sleep queue

ok claudio@


# 9ac452c7 11-Aug-2023 cheloha <cheloha@openbsd.org>

hardclock(9), roundrobin: make roundrobin() an independent clock interrupt

- Remove the roundrobin() call from hardclock(9).

- Revise roundrobin() to make it a valid clock interrupt callback.
It is still periodic and it still runs at one tenth of the hardclock
frequency.

- Account for multiple expirations in roundrobin(): if two or more
roundrobin periods have elapsed, set SPCF_SHOULDYIELD on the running
thread immediately to simulate normal behavior.

- Each schedstate_percpu has its own roundrobin() handle, spc_roundrobin.
spc_roundrobin is started/advanced during clockintr_cpu_init().
Intervals elapsed across suspend/resume are discarded.

- rrticks_init and schedstate_percpu.spc_rrticks are now useless:
delete them.

Tweaked by mpi@. With input from mpi@ and claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169127381314651&w=2

ok mpi@ claudio@
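
The handling of multiple expirations described in this commit can be modelled in userland C. This is a sketch under assumptions: rr_model, rr_advance(), rr_tick() and the MODEL_* flags are hypothetical names, and the rule that a thread is normally asked to yield on its second elapsed period is an assumption made to give "normal behavior" a concrete meaning here.

```c
#include <stdint.h>

#define MODEL_SEENRR		0x1	/* one period already charged to the thread */
#define MODEL_SHOULDYIELD	0x2	/* thread should give up the CPU */

struct rr_model {
	uint64_t expiration;	/* next deadline, in nanoseconds */
	unsigned int flags;
};

/* Move the deadline past `now` and return how many periods elapsed. */
static uint64_t
rr_advance(struct rr_model *rr, uint64_t period, uint64_t now)
{
	uint64_t count = 0;

	if (now >= rr->expiration) {
		count = (now - rr->expiration) / period + 1;
		rr->expiration += count * period;
	}
	return count;
}

/* Periodic callback: decide whether the running thread should yield. */
static void
rr_tick(struct rr_model *rr, uint64_t period, uint64_t now)
{
	uint64_t count = rr_advance(rr, period, now);

	if (count == 0)
		return;		/* fired early, nothing to account */
	if ((rr->flags & MODEL_SEENRR) || count >= 2)
		rr->flags |= MODEL_SHOULDYIELD;	/* second period, or several at once */
	else
		rr->flags |= MODEL_SEENRR;	/* first period: only remember it */
}
```

In this model a single late callback that covers two or more periods sets the yield flag at once, which matches the outcome of two on-time callbacks.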


# 44e0cbf2 05-Aug-2023 cheloha <cheloha@openbsd.org>

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1588c842 05-Aug-2023 claudio <claudio@openbsd.org>

Remove the P_WSLEEP specific KASSERT(). Not only procs in state SSTOP
can be added to the run queue but also procs in state SRUN. The latter
happens when schedcpu() kicks in before the proc had a chance to run.
Problem spotted by gkoehler@
OK cheloha@


# 834cc80d 03-Aug-2023 claudio <claudio@openbsd.org>

Remove the per-cpu loadavg calculation.
The current scheduler usage is highly questionable and probably not helpful.
OK kettenis@ cheloha@ deraadt@


# 96496668 27-Jul-2023 cheloha <cheloha@openbsd.org>

sched_init_cpu: move profclock staggering to clockintr_cpu_init()

initclocks() runs after sched_init_cpu() is called for secondary CPUs,
so profclock_period is still zero and the clockintr_stagger() call for
spc_profclock is useless. For now, just stagger spc_profclock during
clockintr_cpu_init() along with everything else.


# 671537bf 25-Jul-2023 cheloha <cheloha@openbsd.org>

statclock: move profil(2), GPROF code to profclock(), gmonclock()

This patch isolates profil(2) and GPROF from statclock(). Currently,
statclock() implements both profil(2) and GPROF through a complex
mechanism involving both platform code (setstatclockrate) and the
scheduler (pscnt, psdiv, and psratio). We have a machine-independent
interface to the clock interrupt hardware now, so we no longer need to
do it this way.

- Move profil(2)-specific code from statclock() to a new clock
interrupt callback, profclock(), in subr_prof.c. Each
schedstate_percpu has its own profclock handle. The profclock is
enabled/disabled for a given CPU when it is needed by the running
thread during mi_switch() and sched_exit().

- Move GPROF-specific code from statclock() to a new clock interrupt
callback, gmonclock(), in subr_prof.c. Where available, each cpu_info
has its own gmonclock handle. The gmonclock is enabled/disabled for
a given CPU via sysctl(2) in prof_state_toggle().

- Both profclock() and gmonclock() have a fixed period, profclock_period,
that is initialized during initclocks().

- Export clockintr_advance(), clockintr_cancel(), clockintr_establish(),
and clockintr_stagger() via <sys/clockintr.h>. They have external
callers now.

- Delete pscnt, psdiv, psratio. From schedstate_percpu, also delete
spc_pscnt and spc_psdiv. The statclock frequency is not dynamic
anymore so these variables are now useless.

- Delete code/state related to the dynamic statclock frequency from
kern_clockintr.c. The statclock frequency can still be pseudo-random,
so move the contents of clockintr_statvar_init() into clockintr_init().

With input from miod@, deraadt@, and claudio@. Early revisions
cleaned up by claudio@. Early revisions tested by claudio@. Tested by
cheloha@ on amd64, arm64, macppc, octeon, and sparc64 (sun4v).
Compile- and boot-tested on i386 by mlarkin@. riscv64 compilation
bugs found by mlarkin@. Tested on riscv64 by jca@. Tested on
powerpc64 by gkoehler@.


# f2e7dc09 14-Jul-2023 claudio <claudio@openbsd.org>

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish(); the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# aa563902 11-Jul-2023 claudio <claudio@openbsd.org>

Rework sleep_setup()/sleep_finish() to no longer hold the scheduler lock
between calls.

Instead of forcing an atomic operation across multiple calls use a three
step transaction.
1. setup sleep state by calling sleep_setup()
2. recheck sleep condition to ensure that the event did not fire before
sleep_setup() registered the proc onto the sleep queue
3. call sleep_finish() to either sleep or keep on running based on the
step 2 outcome and any possible signal delivery

To make this work, wakeup from signals, the single thread API and wakeup(9) need
to be aware if a process is between step 1 and step 3 so that the process
is not enqueued back onto the runqueue while going to sleep. Introduce
the p_flag P_WSLEEP to detect this situation.

On top of this remove the spl dance in msleep() which is no longer required.
It is ok to process interrupts between step 1 and 3.

OK mpi@ cheloha@
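
The three-step transaction above can be modelled as a small single-threaded state machine. This is a toy under stated assumptions, not the kernel implementation: struct sl_proc, sl_setup(), sl_wakeup(), sl_finish() and MODEL_WSLEEP are hypothetical names, and the "race" is simulated by calling sl_wakeup() between the other two steps.

```c
#include <stdbool.h>

#define MODEL_WSLEEP	0x1	/* proc is between step 1 and step 3 */

enum sl_state { SL_ONPROC, SL_SLEEP };

struct sl_proc {
	enum sl_state state;
	unsigned int flag;
	bool on_sleepq;
};

/* Step 1: register on the sleep queue and open the transaction. */
static void
sl_setup(struct sl_proc *p)
{
	p->on_sleepq = true;
	p->flag |= MODEL_WSLEEP;
}

/*
 * Wakeup side: always take the proc off the sleep queue, but only make
 * it runnable again if it had really gone to sleep.
 */
static void
sl_wakeup(struct sl_proc *p)
{
	if (!p->on_sleepq)
		return;
	p->on_sleepq = false;
	if ((p->flag & MODEL_WSLEEP) == 0)
		p->state = SL_ONPROC;	/* stand-in for putting it on a runqueue */
}

/*
 * Step 3: sleep only if the caller's recheck (step 2) asked for it and
 * no wakeup arrived in the meantime. Returns true if the proc slept.
 */
static bool
sl_finish(struct sl_proc *p, bool do_sleep)
{
	if (!p->on_sleepq)
		do_sleep = false;	/* a wakeup raced with us */
	p->flag &= ~MODEL_WSLEEP;
	if (do_sleep) {
		p->state = SL_SLEEP;
		return true;
	}
	p->on_sleepq = false;		/* unwind the registration */
	return false;
}
```

Step 2 is the caller's own recheck of the sleep condition, and its result is what the caller passes as do_sleep.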


# b2536c64 28-Jun-2023 claudio <claudio@openbsd.org>

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


# 2b46a8cb 05-Dec-2022 deraadt <deraadt@openbsd.org>

zap a pile of dangling tabs


# 0d280c5f 14-Aug-2022 jsg <jsg@openbsd.org>

remove unneeded includes in sys/kern
ok mpi@ miod@


# 41d7544a 20-Jan-2022 bluhm <bluhm@openbsd.org>

Shifting signed integers left by 31 is undefined behavior in C.
found by kubsan; joint work with tobhe@; OK miod@
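
For context, a minimal sketch of the class of bug this commit addresses, assuming a platform where int is 32 bits wide. The helper name bit_mask() is hypothetical. It shows one common way to avoid the problem and is not claimed to be the exact change made in this file.

```c
#include <stdint.h>

/*
 * With a 32-bit int, the expression (1 << 31) shifts a 1 into the sign
 * bit of a signed type, which the C standard leaves undefined.
 * Doing the shift on an unsigned operand is well defined.
 */
static inline uint32_t
bit_mask(unsigned int bit)
{
	return 1U << bit;	/* caller must ensure bit < 32 */
}
```

For example, bit_mask(31) yields 0x80000000 without relying on undefined behavior.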


# 9fde647a 09-Sep-2021 mpi <mpi@openbsd.org>

Add THREAD_PID_OFFSET to tracepoint arguments that pass a TID to userland.

Bring these values in sync with the `tid' builtin which already includes
the offset. This is necessary to build scripts comparing them, like:

tracepoint:sched:enqueue
{
	@ts[arg0] = nsecs;
}

tracepoint:sched:on__cpu
/@ts[tid]/
{
	latency = nsecs - @ts[tid];
}

Discussed with and ok bluhm@


# d73de46f 06-Jul-2021 kettenis <kettenis@openbsd.org>

Introduce CPU_IS_RUNNING() and use it in scheduler-related code to prevent
waiting on CPUs that didn't spin up. This will allow us to spin down
CPUs in the future to save power as well.

ok mpi@

