kern_proc.c - OpenGrok history log for /dragonfly/sys/kern/kern

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# ec1c3f3a	17-Nov-2022	Matthew Dillon <dillon@apollo.backplane.com>	KERN_PROC - Change behavior and bump version to 600302 * Change default behavior to not include pure LWPs. That is, to not include pure kernel threads without a process (pid returned as -1). * A KERN_PROC - Change behavior and bump version to 600302 * Change default behavior to not include pure LWPs. That is, to not include pure kernel threads without a process (pid returned as -1). * Add a flag KERN_PROC_FLAG_LWKT to re-include the LWPs for programs that don't get confused by them. * Adjust /bin/ps and /usr/bin/top to use the flag. Also conditionalized on the existance of the flag so buildworld on older systems doesn't fail. * Clean-up the sysctl kernel interface for KERN_PROC a bit, since adding the flag creates a lot more combinations that need to be handled as discrete sysctls. show more ...
Revision tags: v6.2.2, v6.2.1, v6.2.0, v6.3.0, v6.0.1
# a23ed7f5	19-May-2021	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Fix kernel crash in sysctl path * Fix a kernel crash which can occur in a particular sysctl due to some processes not having a p_textnch path. The sysctl code was assuming that p->p_te kernel - Fix kernel crash in sysctl path * Fix a kernel crash which can occur in a particular sysctl due to some processes not having a p_textnch path. The sysctl code was assuming that p->p_textnch would always be valid. procfs already has the added check. * Fix a race against exit, requiring the proc->p_token to be held. Reported-by: htop devs, BenBE, cgzones show more ...
Revision tags: v6.0.0, v6.0.0rc1, v6.1.0, v5.8.3, v5.8.2, v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1
# f071b5e0	15-Feb-2020	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Fix rare wait() deadlock It is possible for the kernel to deadlock two processes or process threads attempting to wait() on the same pid. Fix by adding a bit of magic to give owner kernel - Fix rare wait() deadlock It is possible for the kernel to deadlock two processes or process threads attempting to wait() on the same pid. Fix by adding a bit of magic to give ownership of the reaping operation to one of the waiters, and causing the other waiters to skip/reject that pid. show more ...
Revision tags: v5.6.3
# 1cb34a03	02-Feb-2020	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Rejigger random number generator to be per-cpu 1/2 * Refactor all the kernel random number generation code to operate on a per-cpu basis. The csprng, ibaa, and l15 structures are now p kernel - Rejigger random number generator to be per-cpu 1/2 * Refactor all the kernel random number generation code to operate on a per-cpu basis. The csprng, ibaa, and l15 structures are now per-cpu. * RDRAND now runs a periodic timer callback on all available cpus rather than just on cpu 0, allowing rdrand data to mix on each cpu's rng independently. * The nrandom helper thread now chains state with an iteration between cpus, injecting a random data buffer generated from the previous cpu into the mix of the current. show more ...
# 2ac7d105	01-Dec-2019	Sascha Wildner <saw@online.de>	Rename some functions to better names. devfs_find_device_by_udev() -> devfs_find_device_by_devid() dev2udev() -> devid_from_dev() udev2dev() -> dev_from_devid() Th Rename some functions to better names. devfs_find_device_by_udev() -> devfs_find_device_by_devid() dev2udev() -> devid_from_dev() udev2dev() -> dev_from_devid() This fits with the rest of the code. 'dev' usually means a cdev_t, such as in make_dev(), etc. Instead of 'udev', use 'devid', since that's what dev_t is, a "Device ID". show more ...
# 91ffdfc5	01-Dec-2019	Sascha Wildner <saw@online.de>	<sys/types.h>: Get rid of udev_t. In a time long long ago, dev_t was a pointer, which later became cdev_t during the great cleanups, until it ended up being a uint32_t, just like udev_t. See for exa <sys/types.h>: Get rid of udev_t. In a time long long ago, dev_t was a pointer, which later became cdev_t during the great cleanups, until it ended up being a uint32_t, just like udev_t. See for example the definitions of __dev_t in <sys/stat.h>. This commit cleans up further by removing the udev_t type, leaving just the POSIX dev_t type for both kernel and userland. Put it inside a _DEV_T_DECLARED to prepare for further cleanups in <sys/stat.h>. show more ...
# e7126f0a	12-Nov-2019	Matthew Dillon <dillon@apollo.backplane.com>	kernel and libc - Reimplement lwp_setname() using /dev/lpmap Generally speaking we are implementing the features necessary to allow per-thread titling set via pthread_set_name_np() to show up kernel and libc - Reimplement lwp_setname() using /dev/lpmap Generally speaking we are implementing the features necessary to allow per-thread titling set via pthread_set_name_np() to show up in 'ps' output, and to use lpmap to make it fast. * The lwp_setname() system call now stores the title in lpmap->thread_title[]. * Implement a libc fast-path for lwp_setname() using lpmap. If called more than 10 times, libc will use lpmap for any further calls, which omits the need to make any system calls. * setproctitle() now stores the title in upmap->proc_title[] instead of replacing proc->p_args. proc->p_args is now no longer modified from its original contents. * The kernel now includes lpmap->thread_title[] in the following priority order when retrieving the process command line: lpmap->thread_title[] User-supplied thread title, if not empty upmap->proc_title[] User-supplied process title, if not empty proc->p_args Original process arguments (no longer modified) * Put the TID in /dev/lpmap for convenient access * Enhance the KERN_PROC_ARGS sysctl to allow the TID to be specified. The sysctl now accepts { KERN_PROC, KERN_PROC_ARGS, pid, tid } in addition to the existing { KERN_PROC, KERN_PROC_ARGS, pid } mechanism. Enhance libkvm to use the new feature. libkvm will fall-back to the old version if necessary. show more ...
# 4aa6d05c	12-Nov-2019	Matthew Dillon <dillon@apollo.backplane.com>	libc - Implement sigblockall() and sigunblockall() (2) * Cleanup the logic a bit. Store the lwp or proc pointer in the vm_map_backing structure and make vm_map_fork() and friends more aware of libc - Implement sigblockall() and sigunblockall() (2) * Cleanup the logic a bit. Store the lwp or proc pointer in the vm_map_backing structure and make vm_map_fork() and friends more aware of it. * Rearrange lwp allocation in [v]fork() to make the pointer(s) available to vm_fork(). * Put the thread mappings on the lwp's list immediately rather than waiting for the first fault, which means that per-thread mappings will be deterministically removed on thread exit whether any faults happened or not. * Adjust vmspace_fork*() functions to not propagate 'dead' lwp mappings for threads that won't exist in the forked process. Only the lwp mappings for the thread doing the [v]fork() is retained. show more ...
# 64b5a8a5	12-Nov-2019	Matthew Dillon <dillon@apollo.backplane.com>	kernel - sigblockall()/sigunblockall() support (per thread shared page) * Implement /dev/lpmap, a per-thread RW shared page between userland and the kernel. Each thread in the process will receiv kernel - sigblockall()/sigunblockall() support (per thread shared page) * Implement /dev/lpmap, a per-thread RW shared page between userland and the kernel. Each thread in the process will receive a unique shared page for communication with the kernel when memory-mapping /dev/lpmap and can access varous variables via this map. * The current thread's TID is retained for both fork() and vfork(). Previously it was only retained for vfork(). This avoids userland code confusion for any bits and pieces that are indexed based on the TID. * Implement support for a per-thread block-all-signals feature that does not require any system calls (see next commit to libc). The functions will be called sigblockall() and sigunblockall(). The lpmap->blockallsigs variable prevents normal signals from being dispatched. They will still be queued to the LWP as per normal. The behavior is not quite that of a signal mask when dealing with global signals. The low 31 bits represents a recursion counter, allowing recursive use of the functions. The high bit (bit 31) is set by the kernel if a signal was prevented from being dispatched. When userland decrements the counter to 0 (the low 31 bits), it can check and clear bit 31 and if found to be set userland can then make a dummy 'real' system call to cause pending signals to be delivered. Synchronous TRAPs (e.g. kernel-generated SIGFPE, SIGSEGV, etc) are not affected by this feature and will still be dispatched synchronously. * PThreads is expected to unmap the mapped page upon thread exit. The kernel will force-unmap the page upon thread exit if pthreads does not. XXX needs work - currently if the page has not been faulted in the kernel has no visbility into the mapping and will not unmap it, but neither will it get confused if the address is accessed. To be fixed soon. Because if we don't, programs using LWP primitives instead of pthreads might not realize that libc has mapped the page. * The TID is reset to 1 on a successful exec() On [v]fork(), if lpmap exists for the current thread, the kernel will copy the lpmap->blockallsigs value to the lpmap for the new thread in the new process. This way sigblock() state is retained across the [v]fork(). This feature not only reduces code confusion in userland, it also allows [v]fork() to be implemented by the userland program in a way that ensures no signal races in either the parent or the new child process until it is ready for them. The implementation leverages our vm_map_backing extents by having the per-thread memory mappings indexed within the lwp. This allows the lwp to remove the mappings when it exits (since not doing so would result in a wild pmap entry and kernel memory disclosure). * The implementation currently delays instantiation of the mapped page(s) and some side structures until the first fault. XXX this will have to be changed. show more ...
# ed183f8c	23-Oct-2019	Sascha Wildner <saw@online.de>	world/kernel: Use the rounddown() macro in various places. Tested-by: zrj
Revision tags: v5.6.2
# 5deae5c6	19-Jun-2019	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Change fill_kinfo_lwp() and fix top * Change fill_kinfo_lwp(), an internal function used by kern_proc.c and libkvm, to aggregate lwp data instead of replace it. Note that fill_kinfo_pr kernel - Change fill_kinfo_lwp() and fix top * Change fill_kinfo_lwp(), an internal function used by kern_proc.c and libkvm, to aggregate lwp data instead of replace it. Note that fill_kinfo_proc() will zero the lwp sub-structure and is already typically called before zero or more fill_kinfo_lwp() calls, so the new aggregation essentially just works even though the API is a bit different. In addition, when getprocs is told to aggregate lwps the tid field will be set to -1 since it is not applicable in the aggregation case. * 'top' will now properly aggregate the threads belonging to a process when thread mode 'H' is not in effect. * Also allow top to display cpu percentages above 100%, since in the aggregation case the sum of threads can easily exceed 100% of one core. Requested-by: hsw show more ...
Revision tags: v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0
# 2a7bd4d8	18-May-2019	Sascha Wildner <saw@online.de>	kernel: Don't include <sys/user.h> in kernel code. There is really no point in doing that because its main purpose is to expose kernel structures to userland. The majority of cases wasn't needed at kernel: Don't include <sys/user.h> in kernel code. There is really no point in doing that because its main purpose is to expose kernel structures to userland. The majority of cases wasn't needed at all and the rest required only a couple of other includes. show more ...
Revision tags: v5.4.3, v5.4.2
# 502d982c	28-Dec-2018	Sascha Wildner <saw@online.de>	kernel: Remove more duplicate includes.
Revision tags: v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1
# 2efb75f3	04-Oct-2018	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Refactor tty_token, fix SMP performance issues * Remove most uses of tty_token in favor of per-tty tp->t_token. This is particularly important for removing bottlenecks related to PTYs, kernel - Refactor tty_token, fix SMP performance issues * Remove most uses of tty_token in favor of per-tty tp->t_token. This is particularly important for removing bottlenecks related to PTYs, which are used all over the place. tty_token remains in a few places managing overall registration and global list manipulation. * tty structures are now required to be persistent. Implement a sepearate ttyinit() function. Continue to allow ttyregister() and ttyunregister() calls, but these no longer presume destruction of the structure. * Refactor ttymalloc() to take a *tty pointer and interlock allocations. Allocations are intended to be one-time. ttymalloc() only requires the tty_token for initial allocations. Remove all critical section use that was combined with tty_token and tp->t_token. Leave only the tokens. The critical sections were hold-overs going all the way back to pre-SMP days. * syscons now gets its own token, vga_token. The ISA VGA code and the framebuffer code also now use this token instead of tty_token. * The keyboard subsystem now uses kbd_token instead of tty_token. * A few remaining serial-like devices (snp, nmdm) also get their own tokens, as well as use the now required tp->t_token. * Remove use of tty_token in the session management code. This fixes a niggling performance path since sessions almost universally go hand-in-hand with fork/exec/exit sequences. Instead we use the already-existing per-hash session token. show more ...
# 33b81dc9	30-Sep-2018	Matthew Dillon <dillon@apollo.backplane.com>	system - Add wait6(), waitid(), and si_pid/si_uid siginfo support * Add the wait6() system call (header definitions taken from FreeBSD). This required rearranging kern_wait() a bit. In particular system - Add wait6(), waitid(), and si_pid/si_uid siginfo support * Add the wait6() system call (header definitions taken from FreeBSD). This required rearranging kern_wait() a bit. In particular, we now maintain a hold count of 1 on the process during processing instead of releasing the hold count early. * Add waitid() to libc (waitid.c taken from FreeBSD). * Adjust manual pages (taken from FreeBSD). * Add siginfo si_pid and si_uid support. This basically allows a process taking a signal to determine where the signal came from. The fields already existed in siginfo but were not implemented. Implemented using a non-queued per-process array of signal numbers. The last originator sending any given signal is recorded and passed through to userland in the siginfo. * Fixes the 'lightdm' X display manager. lightdm relies on si_pid support. In addition, note that avoiding long lightdm related latencies and timeouts require a softlink from libmozjs-52.so to libmozjs-52.so.0 (must be addressed in dports, not addressed in this commit). Loosely-taken-from: FreeBSD (wait6, waitid support only) Reviewed-by: swildner show more ...
Revision tags: v5.2.2, v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2, v5.0.1
# 39b9b6cd	19-Oct-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Add p_ppid * We have proc->p_pptr, but still needed a shared p->p_token to access the ppid. Buckle under and add proc->p_ppid as well so getppid() can run lockless. * Adjust the vmtot kernel - Add p_ppid * We have proc->p_pptr, but still needed a shared p->p_token to access the ppid. Buckle under and add proc->p_ppid as well so getppid() can run lockless. * Adjust the vmtotal proc scan to use a shared proc->p_token instead of an exclusive one. show more ...
# 618537cf	19-Oct-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Make certain sysctl's unlocked * Automatically flag all SYSCTL_[U]INT, [U]LONG, and [U]QUAD definitions CTLFLAG_NOLOCK. These do not have to be locked. Will improve program startup per kernel - Make certain sysctl's unlocked * Automatically flag all SYSCTL_[U]INT, [U]LONG, and [U]QUAD definitions CTLFLAG_NOLOCK. These do not have to be locked. Will improve program startup performance a tad. * Flag a ton of other sysctls used in program startup and also 'ps' CTLFLAG_NOLOCK. * For kern.hostname, interlock changes using XLOCK and allow the sysctl to run NOLOCK, avoiding unnecessary cache line bouncing. show more ...
Revision tags: v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1
# 6cbfbdb9	27-Sep-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Fix rare allproc scan vs p_ucred race * This race can occur because p->p_ucred can change out from under an allproc scan when the allproc scan is filtering based on credentials. * Acce kernel - Fix rare allproc scan vs p_ucred race * This race can occur because p->p_ucred can change out from under an allproc scan when the allproc scan is filtering based on credentials. * Access p->p_ucred via the per-process spinlock (p->p_spin). Also maintain a cache of the last ucred during the loop in order to avoid having to spin-lock every process. * Add missing spinlock around p->p_ucred = NULL in exit1(). This is also only applicable to races against allproc scans since p_token is held during exit1(). Reported-by: mjg_ show more ...
# 586c4308	12-Aug-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Break up scheduler and loadavg callout * Change the scheduler and loadavg callouts from cpu 0 to all cpus, and adjust the allproc_scan() and alllwp_scan() to segment the hash table when kernel - Break up scheduler and loadavg callout * Change the scheduler and loadavg callouts from cpu 0 to all cpus, and adjust the allproc_scan() and alllwp_scan() to segment the hash table when asked. Every cpu is now tasked with handling the nominal scheduler recalc and nominal load calculation for a portion of the process list. The portion is unrelated to which cpu(s) the processes are actually scheduled on, it is strictly a way to spread the work around, split up by hash range. * Significantly reduces cpu 0 stalls when a large number of user processes or threads are present (that is, in the tens of thousands or more). In the test below, before this change, cpu 0 was straining under 40%+ interupt load (from the callout). After this change the load is spread across all cpus, approximately 1.5% per cpu. * Tested with 400,000 running user processes on a 32-thread dual-socket xeon (yes, these numbers are real): 12:27PM up 8 mins, 3 users, load avg: 395143.28, 270541.13, 132638.33 12:33PM up 14 mins, 3 users, load avg: 399496.57, 361405.54, 225669.14 * NOTE: There are still a number of other non-segmented allproc scans in the system, particularly related to paging and swapping. * NOTE: Further spreading-out of the work may be needed, by using a more frequent callout and smaller hash index range for each. show more ...
# da1e1cb6	06-Aug-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Move sigtramp even lower * Attempt to work around a Ryzen cpu bug by moving sigtramp even lower than we have already.
Revision tags: v4.8.1
# 3e925ec2	03-Apr-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Implement NX * Implement the NX (no-execute) pmap bit. * Shift sigtramp down to a page-bound and protect it prot\|VM_PROT_EXECUTE. * Map the rest of the user stack VM_PROT_READ\|VM_PROT_WRI kernel - Implement NX * Implement the NX (no-execute) pmap bit. * Shift sigtramp down to a page-bound and protect it prot\|VM_PROT_EXECUTE. * Map the rest of the user stack VM_PROT_READ\|VM_PROT_WRITE without VM_PROT_EXECUTE. show more ...
# e6141a7f	29-Mar-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Add KERN_PROC_SIGTRAMP * Add a sysctl to retrieve the sigtramp address range for gdb. Reported-by: marino
Revision tags: v4.8.0, v4.6.2, v4.9.0, v4.8.0rc
# 282f3194	11-Jan-2017	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Incidental MPLOCK removal * Remove misc #include <sys/mplock2.h> statements that are no longer needed. * Replace mplock with acct_lock in kern_acct.c * Replace mplock with msg_token in sy kernel - Incidental MPLOCK removal * Remove misc #include <sys/mplock2.h> statements that are no longer needed. * Replace mplock with acct_lock in kern_acct.c * Replace mplock with msg_token in sysv_msg.c * Replace mplock with p->p_token in the profiling code. show more ...
# 8cdef6cb	06-Dec-2016	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Increase worst-case maximum exec rate * The pid reuse algorithm limits the maximum fork rate. This limit was set too low. Increase the limit from 10000/sec to 100000/sec. Currently ou kernel - Increase worst-case maximum exec rate * The pid reuse algorithm limits the maximum fork rate. This limit was set too low. Increase the limit from 10000/sec to 100000/sec. Currently our opteron maxes out at 43000/sec. Note that with 999999 pids and a 10-second mandatory reuse time floor there isn't much of a point increasing the limit beyond 100000/sec. 100,000/sec. Currently our opteron maxes out at around 43,000/sec (vfork/exec/wait3/exit of a small static binary). * The domain reuse array was increased to 1MB to accomodate this change. In addition, update the array in a cache-friendly manner. * Modify test/sysperf/exec1 to take a nprocesses argument for the timing run. show more ...
# 43fdf490	05-Dec-2016	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Make kern_proc cache-friendly * Make the proc_tokens[], allprocs[], allpgrps[], and allsessn[] arrays cache-friendly by aggregating them into a cache-aligned struct procglob. * Doesn't kernel - Make kern_proc cache-friendly * Make the proc_tokens[], allprocs[], allpgrps[], and allsessn[] arrays cache-friendly by aggregating them into a cache-aligned struct procglob. * Doesn't do much for the token array, but should help allprocs/allpgrps/allsessn scans whos structures were previously 8-byte aligned. show more ...
12 3 4 5