History log of /dragonfly/sys/kern/kern_proc.c (Results 26 – 50 of 123)
Revision  Date  Author  Comments
Revision tags: v4.6.1
# 726f7ca0 05-Aug-2016 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix kern.proc.pathname sysctl

* kern.proc.pathname is a sysctl used by programs to find the path
of the running program. This sysctl was created before we stored
sufficient information in the proc structure to construct the
correct path when multiple aliases are present (due to e.g. null-mounts)
to the same file.

* We do have this information, in p->p_textnch, so change the sysctl to
use it. The sysctl will now return the actual full path in the context
of whoever ran the program, so it should properly take into account
chroots and such.



Revision tags: v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc
# ced589cb 30-May-2015 Sepherosa Ziehau <sephe@dragonflybsd.org>

lwkt: Move LWKT objcache initialization to an earlier place

So that calling lwkt_create() without a thread template works during
early boot.

LWKT is now initialized before SOFTCLOCK, since SOFTCLOCK creates
per-cpu callout threads, though that creation uses a thread template.



Revision tags: v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0
# 87116512 17-Oct-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Add /dev/upmap and /dev/kpmap and sys/upmap.h (3)

* Add upmap->invfork. When a vforked child is trying to access the upmap
prior to exec we must still access the parent's map and not the child's,
which means that the stored PID will be incorrect.

To fix this issue we add the invfork field which allows userland to
determine whether this is a vforked child accessing the parent's map.
If it is, getpid() will use the system call.

* Fix a bug where a vfork()d child creates p->p_upmap for itself but then
maps it into the parent's address space as a side effect of a getpid()
or other call. When this situation is detected, /dev/upmap will use
the parent's p_upmap and not the child's, and also properly set the
invfork flag.

* Implement system call overrides for getpid(), setproctitle(), and
clock_gettime() (*_FAST and *_SECOND clock ids). When more than 10 calls
are made to one of these functions the new libc upmap/kpmap support is
activated. /dev/upmap and /dev/kpmap will be memory-mapped into the
address space and further accesses will run through the maps instead of
making system calls.

This will obviously reduce overhead for these calls by a very significant
multiplier.

* NOTE! gettimeofday() is still a system call and will likely remain a system
call in order to return a fine-grained time value. Third-party code
that doesn't need a fine-grained time value must use clock_gettime()
to obtain the new performance efficiencies.
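The ">10 calls" activation described above can be sketched as follows. This is an illustrative stand-in, not libc's actual implementation: the names fast_getpid/slow_getpid and the cached integer (standing in for the mmap'd upmap pid field) are invented for the sketch.

```c
#include <assert.h>

/*
 * Illustrative sketch of the call-count threshold described above.
 * All names are invented stand-ins; the real libc maps /dev/upmap and
 * reads the pid field from the mapping instead of caching an int.
 */
static int call_count;
static int cached_pid = -1;            /* stands in for the mmap'd pid field */

static int slow_getpid(void) { return 1234; }  /* stands in for the syscall */

static int fast_getpid(void)
{
    if (cached_pid != -1)
        return cached_pid;             /* map active: no system call */
    if (++call_count > 10)
        cached_pid = slow_getpid();    /* "mmap /dev/upmap" on the 11th call */
    return slow_getpid();
}
```

The first ten calls go through the slow path; from the eleventh on, the cached mapping answers without a system call.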



# 0adbcbd6 16-Oct-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Add /dev/upmap and /dev/kpmap and sys/upmap.h

* Add two memory-mappable devices for accessing a per-process and a global
kernel shared memory space. These can be mapped to acquire certain
information from the kernel more efficiently than the system calls that
would normally be required.

Userland programs using this feature should NOT directly map the sys_upmap
and sys_kpmap structures (which is why they are in #ifdef _KERNEL sections
in sys/upmap.h). Instead, mmap the devices using UPMAP_MAPSIZE and
KPMAP_MAPSIZE and parse the ukpheader[] array at the front of each area
to locate the desired fields. You can then simply cache a pointer to
the desired field.

The width of the field is encoded in the UPTYPE/KPTYPE elements and
can be asserted if desired; user programs are not expected to handle
integers of multiple sizes for the same field type.
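A userland lookup over the header array might look like the sketch below. The struct layout and the UKPTYPE_PID value are invented for illustration; the real definitions live in sys/upmap.h and should be taken from there.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative header-scan sketch.  The field layout and type encoding
 * are invented stand-ins for the real sys/upmap.h definitions.
 */
struct ukpheader {
    uint16_t type;          /* field id + width encoding; 0 terminates */
    uint16_t offset;        /* byte offset of the field within the map */
};

#define UKPTYPE_PID 0x2004  /* hypothetical: pid field, 4 bytes wide */

/* Scan the header array at the front of the mapping for a field type. */
static const void *
ukp_find_field(const uint8_t *map, uint16_t type)
{
    const struct ukpheader *h = (const struct ukpheader *)map;

    for (; h->type != 0; h++)
        if (h->type == type)
            return map + h->offset; /* cacheable pointer to the field */
    return NULL;
}
```

Once found, the pointer can simply be cached, as the text above suggests.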

* Add /dev/upmap. A program can open and mmap() this device R+W and use
it to access:

header[...] - See sys/upmap.h. An array of headers terminating with
a type=0 header indicating where various fields are in
the mapping. This should be used by userland instead
of directly mapping to the struct sys_upmap structure.

version - The sys_upmap version, typically 1.

runticks - Scheduler run ticks (aggregate, all threads). This
may be used by userland interpreters to determine
when to soft-switch.

forkid - A unique non-zero 64-bit fork identifier. This is NOT a
pid. This may be used by userland libraries to determine
if a fork has occurred by comparing against a stored
value.

pid - The current process pid. This may be used to acquire the
process pid without having to make further system calls.

proc_title - This starts out as an empty buffer and may be used to set
the process title. To revert to the original process title,
set proc_title[0] to 0.

NOTE! Userland may write to the entire buffer, but it is recommended
that userland only write to fields intended to be writable.

NOTE! When a program forks, an area already mmap()d remains mmap()d but
will point to the new process's area and not the old, so libraries
do not need to do anything special atfork.

NOTE! Access to this structure is cpu localized.

* Add /dev/kpmap. A program can open and mmap() this device RO and use
it to access:

header[...] - See sys/upmap.h. An array of headers terminating with
a type=0 header indicating where various fields are in
the mapping. This should be used by userland instead
of directly mapping to the struct sys_upmap structure.

version - The sys_kpmap version, typically 1.

upticks - System uptime tick counter (32 bit integer). Monotonic,
uncompensated.

ts_uptime - System uptime in struct timespec format at tick-resolution.
Monotonic, uncompensated.

ts_realtime - System realtime in struct timespec format at tick-resolution.
This is compensated so reverse-indexing is possible.

tsc_freq - If the system supports a TSC of some sort, the TSC
frequency is recorded here, else 0.

tick_freq - The tick resolution of ts_uptime and ts_realtime and
approximate tick resolution for the scheduler. Typically
100.

NOTE! Userland may only read from this buffer.

NOTE! Access to this structure is NOT cpu localized. A memory fence
and double-check should be used when accessing non-atomic structures
which might change such as ts_uptime and ts_realtime.

XXX needs work.
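The "memory fence and double-check" note above can be sketched as a seqlock-style read loop, with upticks serving as the generation word. The struct is a mock, not the real sys_kpmap layout; a real reader would locate the fields via the ukpheader[] array.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Sketch of the fence-and-double-check read described above.  Mock
 * layout: the kernel updates ts_* and bumps upticks; the reader retries
 * until the tick value was stable across the whole copy.
 */
struct mock_kpmap {
    _Atomic uint32_t upticks;
    int64_t ts_sec;
    int64_t ts_nsec;
};

static void
kpmap_read_ts(struct mock_kpmap *kp, int64_t *sec, int64_t *nsec)
{
    uint32_t t1, t2;

    do {
        t1 = atomic_load(&kp->upticks);
        atomic_thread_fence(memory_order_acquire);
        *sec = kp->ts_sec;
        *nsec = kp->ts_nsec;
        atomic_thread_fence(memory_order_acquire);
        t2 = atomic_load(&kp->upticks);
    } while (t1 != t2);     /* tick moved mid-copy: re-read */
}
```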



Revision tags: v3.8.2
# c07315c4 04-Jul-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Refactor cpumask_t to extend cpus past 64, part 1/2

* 64-bit systems only. 32-bit builds use the macros but cannot be expanded
past 32 cpus.

* Change cpumask_t from __uint64_t to a structure. This commit implements
one 64-bit sub-element (the next one will implement four for 256 cpus).

* Create a CPUMASK_*() macro API for non-atomic and atomic cpumask
manipulation. These macros generally take lvalues as arguments, allowing
for a fairly optimal implementation.
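The structure-plus-lvalue-macro shape can be sketched for the one sub-element case as below. These are simplified illustrations, not the kernel's actual CPUMASK_*() definitions (which also include atomic ATOMIC_CPUMASK_* variants and, in part 2, more sub-elements).

```c
#include <stdint.h>

/*
 * Simplified sketch of cpumask_t as a structure with one 64-bit
 * sub-element, manipulated through macros taking lvalues.  Not the
 * kernel's actual definitions.
 */
typedef struct {
    uint64_t ary[1];
} cpumask_t;

#define CPUMASK_ASSZERO(m)      ((m).ary[0] = 0)
#define CPUMASK_ORBIT(m, cpu)   ((m).ary[0] |= (uint64_t)1 << (cpu))
#define CPUMASK_NANDBIT(m, cpu) ((m).ary[0] &= ~((uint64_t)1 << (cpu)))
#define CPUMASK_TESTBIT(m, cpu) (((m).ary[0] >> (cpu)) & 1)
#define CPUMASK_TESTZERO(m)     ((m).ary[0] == 0)
```

Because the macros take lvalues, the compiler can still generate near-optimal code while the representation stays free to grow past 64 cpus.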

* Change all C code operating on cpumask's to use the newly created CPUMASK_*()
macro API.

* Compile-test 32 and 64-bit. Run-test 64-bit.

* Adjust sbin/usched, usr.sbin/powerd. usched currently needs more work.



Revision tags: v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc
# 3a877e44 21-Apr-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Improve pid-reuse algorithm, fix bug

* Fix a bug where under extreme loads it was possible for a PID to be
allocated twice.

* Implement a minimum pid-reuse delay of 10 seconds. No pid, session id,
or pgid will be reused for at least 10 seconds after being reaped.

This shouldn't really be necessary but it should help scripts, particularly
bulk builds, which rely on testing out-of-band PIDs with pwait.
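The minimum-reuse rule can be illustrated with a small predicate; the slot layout and field names here are invented for the sketch, not the kernel's own structures.

```c
/*
 * Illustrative sketch of the 10-second minimum reuse delay.  The slot
 * layout is invented; the kernel tracks reap times in its own per-id
 * structures for pids, session ids, and pgids alike.
 */
#define PID_REUSE_DELAY 10      /* seconds */

struct pid_slot {
    int  in_use;
    long reap_time;             /* uptime seconds when the id was reaped */
};

/* An id may be handed out again only if it is free AND stale enough. */
static int
pid_reusable(const struct pid_slot *ps, long now)
{
    return !ps->in_use && (now - ps->reap_time) >= PID_REUSE_DELAY;
}
```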

* Increase PID_MAX from 99999 to 999999

Reported-by: marino



Revision tags: v3.6.2
# e3790519 20-Mar-2014 Joris Giovannangeli <joris@giovannangeli.fr>

kernel: check that p_ucred is not NULL before calling p_trespass


# ca3546a8 14-Mar-2014 Antonio Huete Jimenez <tuxillo@quantumachine.net>

kernel - Add allproc_hsize global

- Used by kvm(3) to determine the proc hash size.


# 35c7df0f 09-Mar-2014 Markus Pfeiffer <markus.pfeiffer@morphism.de>

kernel: Fixup KERN_PROC_PATHNAME sysctl

The code for the sysctl uses pfind, which PHOLDs the process
if a pid is passed to the sysctl, so we PRELE the process
if necessary.


Revision tags: v3.6.1
# f849311b 04-Jan-2014 François Tigeot <ftigeot@wolfpond.org>

kernel: Add the KERN_PROC_PATHNAME sysctl

Which returns the full path of a process text file.

Obtained-from: FreeBSD


Revision tags: v3.6.0, v3.7.1, v3.6.0rc, v3.7.0
# b22b6ecd 25-Oct-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Remove proc_token, replace proc, pgrp, and session structure backend

* Isolate the remaining exposed topology for proc, pgrp, and session
into one source file (kern_proc.c).

* Remove allproc, zombproc, pgrp's spinlocks, and start tracking session
structures so we don't have to indirect through other system structures.

* Replace with arrays-of-lists, 1024 elements, including a 1024 element
token lock array to protect each list.

proc_tokens[1024]
allprocs[1024]
allpgrps[1024]
allsessn[1024]

This removes nearly all the prior proc_token contention and also removes
process-group processing contention and makes it easier to track tty
sessions.

* Normal processes, zombie processes, the original linear list, and the
original hash mechanic are now all combined into a single allprocs[]
table. The various API functions will filter out zombie vs non-zombie
based on the type of request.

* Rewrite the PID allocator to take advantage of the hashed array topology.
An atomic_fetchadd_int() is used on the static base value which will cause
each cpu to start at a different array entry, thus removing SMP conflicts.

At the moment we iterate the relatively small number of elements in the
bucket to find a free pid.

Since the same proc_tokens[n] lock applies to all three arrays (proc,
pgrp, and session), we can validate the pid against all three at the
same time with a single lock.
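The allocator's starting-point trick can be sketched as follows. Names are illustrative; the real code then also validates the candidate pid against the proc, pgrp, and session lists of the bucket under proc_tokens[n].

```c
#include <stdatomic.h>

/*
 * Sketch of the hashed-array starting point described above.  An atomic
 * fetch-add on a static base makes concurrent allocators begin their
 * search in different buckets, avoiding SMP conflicts.  Names here are
 * illustrative, not the kernel's.
 */
#define ALLPROC_HSIZE 1024

static _Atomic unsigned int pid_alloc_base;

/* Bucket a given pid hashes into (same index for all three arrays). */
static unsigned int
allproc_hash(unsigned int pid)
{
    return pid % ALLPROC_HSIZE;
}

/* Each allocation attempt starts from a different bucket. */
static unsigned int
pid_alloc_start(void)
{
    return atomic_fetch_add(&pid_alloc_base, 1) % ALLPROC_HSIZE;
}
```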

* Rewrite the procs sysctl to iterate the hash table. Since there are
1024 different locks, a 'ps' or similar operation no longer has any
significant effect on system performance, and 'ps' is VERY fast now
regardless of the load.

* poudriere bulk build tests on a blade (4 core / 8 thread) shows virtually
no SMP collisions even under extreme loads.

* poudriere bulk build tests on monster (48-core opteron) show very low
SMP collision statistics outside of filesystem writes in most situations.
Pipes (which are already fine-grained) sometimes show significant
collisions.

Most importantly, NO collisions on the process fork/exec/exit critical
path, end-to-end. Not even in the VM system.



# a8d3ab53 25-Oct-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - proc_token removal pass stage 1/2

* Remove proc_token use from all subsystems except kern/kern_proc.c.

* The token had become mostly useless in these subsystems now that process
locking is more fine-grained. Do the final wipe of proc_token except for
allproc/zombproc list use in kern_proc.c



# 51818c08 24-Oct-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Improve vfork/exec and wait*() performance

* Use a flags interlock instead of a token interlock for PPWAIT for
vfork/exec (handling when the parent must wait for the child to
finish exec'ing).

* The exit1() code must wakeup the parent's wait*()'s. Delay the wakeup
until after the token has been released.

* Change the interlock in the parent's wait*() code to use a generation
counter.

* Do not wakeup p_nthreads on exit if the program was never multi-threaded,
saving a few cycles.



# fa3d6eac 23-Oct-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - proc_token performance cleanups

* pfind()/pfindn()/zpfind() now acquire proc_token shared.

* Fix a bug in alllwp_scan(). Must hold p->p_token while scanning
its lwp's.

* Process list scan can use a shared token, use pfind() instead of
pfindn() and remove proc_token for individual pid lookups.

* cwd can use a shared p->p_token.

* getgroups(), seteuid(), and numerous other uid/gid access and setting
functions need to use p->p_token, not proc_token (Reported by enjolras).



# 849425f7 16-Oct-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix panic in sysctl_kern_proc()

* sysctl_kern_proc() loops through ncpus and moves the thread to each cpu
in turn in order to access its local thread list.

* Fix a panic where the function does not return on the same cpu it was
called on. The userland scheduler expects threads to return to usermode
on the same cpu they left usermode on and is responsible for moving the
thread to another cpu (for userland scheduling purposes) itself.



# e6a0f74a 09-Oct-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix pgrp and session ref-count races

* Fix some tight timing windows where the ref count on these structures
could race.

* Protect the pgrp hash table with a spinlock instead of using proc_token.

* Improve pgfind() performance by using the spinlock in shared mode.

* Do not transition p_pgrp through NULL when changing a process's pgrp.
Atomically transition the process (protected by p->p_token and
pg->pg_token).



# a86ce0cd 20-Sep-2013 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Merge Mihai Carabas's VKERNEL/VMM GSOC project into the main tree

* This merge contains work primarily by Mihai Carabas, with some misc
fixes also by Matthew Dillon.

* Special note on GSOC core

This is, needless to say, a huge amount of work compressed down into a
few paragraphs of comments. Adds the pc64/vmm subdirectory and tons
of stuff to support hardware virtualization in guest-user mode, plus
the ability for programs (vkernels) running in this mode to make normal
system calls to the host.

* Add system call infrastructure for VMM mode operations in kern/sys_vmm.c
which vectors through a structure to machine-specific implementations.

vmm_guest_ctl_args()
vmm_guest_sync_addr_args()

vmm_guest_ctl_args() - bootstrap VMM and EPT modes. Copydown the original
user stack for EPT (since EPT 'physical' addresses cannot reach that far
into the backing store represented by the process's original VM space).
Also installs the GUEST_CR3 for the guest using parameters supplied by
the guest.

vmm_guest_sync_addr_args() - A host helper function that the vkernel can
use to invalidate page tables on multiple real cpus. This is a lot more
efficient than having the vkernel try to do it itself with IPI signals
via cpusync*().

* Add Intel VMX support to the host infrastructure. Again, tons of work
compressed down into a one paragraph commit message. Intel VMX support
added. AMD SVM support is not part of this GSOC and not yet supported
by DragonFly.

* Remove PG_* defines for PTE's and related mmu operations. Replace with
a table lookup so the same pmap code can be used for normal page tables
and also EPT tables.

* Also include X86_PG_V defines specific to normal page tables for a few
situations outside the pmap code.

* Adjust DDB to disassemble SVM related (intel) instructions.

* Add infrastructure to exit1() to deal with related structures.

* Optimize pfind() and pfindn() to remove the global token when looking
up the current process's PID (Matt)

* Add support for EPT (double layer page tables). This primarily required
adjusting the pmap code to use a table lookup to get the PG_* bits.

Add an indirect vector for copyin, copyout, and other user address space
copy operations to support manual walks when EPT is in use.

A multitude of system calls which manually looked up user addresses via
the vm_map now need a VMM layer call to translate EPT.

* Remove the MP lock from trapsignal() use cases in trap().

* (Matt) Add pthread_yield()s in most spin loops to help situations where
the vkernel is running on more cpu's than the host has, and to help with
scheduler edge cases on the host.

* (Matt) Add a pmap_fault_page_quick() infrastructure that vm_fault_page()
uses to try to shortcut operations and avoid locks. Implement it for
pc64. This function checks whether the page is already faulted in as
requested by looking up the PTE. If not it returns NULL and the full
blown vm_fault_page() code continues running.

* (Matt) Remove the MP lock from most of the vkernel's trap() code

* (Matt) Use a shared spinlock when possible for certain critical paths
related to the copyin/copyout path.



# 8cee56f4 14-Sep-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Misc adjustments used by the vkernel and VMM, misc optimizations

* This section committed separately because it is basically independent
of VMM.

* Improve pfind(). Don't get proc_token if the process being looked up
is the current process.

* Improve kern_kill(). Do not obtain proc_token any more. p->p_token
is sufficient and the process group has its own lock now.

* Call pthread_yield() when spinning on various things.

x Spinlocks
x Tokens (spinning in lwkt_switch)
x cpusync (ipiq)

* Rewrite sched_yield() -> dfly_yield(). dfly_yield() will
unconditionally round-robin the LWP, ignoring estcpu. It isn't
perfect but it works fairly well.

The dfly scheduler will also no longer attempt to migrate threads
across cpus when handling yields. They migrate normally in all
other circumstances.

This fixes situations where the vkernel is spinning waiting for multiple
events from other cpus and in particular when it is doing a global IPI
for pmap synchronization of the kernel_pmap.



Revision tags: v3.4.3
# dc71b7ab 31-May-2013 Justin C. Sherrill <justin@shiningsilence.com>

Correct BSD License clause numbering from 1-2-4 to 1-2-3.

Apparently everyone's doing it:
http://svnweb.freebsd.org/base?view=revision&revision=251069

Submitted-by: "Eitan Adler" <lists at eitanadler.com>



Revision tags: v3.4.2
# 2702099d 06-May-2013 Justin C. Sherrill <justin@shiningsilence.com>

Remove advertising clause from all that isn't contrib or userland bin.

By: Eitan Adler <lists@eitanadler.com>


Revision tags: v3.4.0, v3.4.1, v3.4.0rc, v3.5.0, v3.2.2
# 9072066a 08-Dec-2012 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Adjust NFS server for new allocvnode() code

* Adjust the NFS server to check for LWP_MP_VNLRU garbage collection
requests and act on them.

This prevents excessive allocation of vnodes by the nfsd's.



# 62ae46c9 08-Dec-2012 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Change allocvnode() to not recursively block freeing vnodes

allocvnode() has caused many deadlock issues over the years, including
recent issues with softupdates, because it is often called from deep
within VFS modules and attempts to clean and free unrelated vnodes when
the vnode limit is reached to make room for the new one.

* numvnodes is not protected by any locks and needs atomic ops.

* Change allocvnode() to always allocate and not attempt to free
other vnodes.

* allocvnode() now flags the LWP to handle reducing the number of vnodes
in the system as of when it returns to userland instead. Consolidate
several flags into a single conditional function call, lwpuserret().

When triggered, this code will do a limited scan of the free list to
try to find vnodes to free.

* The vnlru_proc_wait() code existed to handle a separate algorithm
related to vnodes with cached buffers and VM pages but represented
a major bottleneck in the system.

Remove vnlru_proc_wait() and allow vnodes with buffers and/or non-empty
VM objects to be placed on the free list.

This also requires not vhold()ing the vnode for related buffer cache
buffer since the vnode will not go away until related buffers have been
cleaned out. We shouldn't need those holds.

Testing-by: vsrinivas



# 8d67cbb3 06-Nov-2012 Sascha Wildner <saw@online.de>

Fix some typos in user visible messages, etc.


Revision tags: v3.2.1
# 08a7d6d8 10-Oct-2012 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Adjust cache_fullpath() API

* Add another argument to explicitly specify the base directory that the
path is to be relative to.


Revision tags: v3.2.0, v3.3.0, v3.0.3
# 0730ed66 16-Aug-2012 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix exit races which can lead to a corrupt p_children list

* There are a few races when getting multiple tokens where a threaded
process is wait*()ing for exiting children from multiple threads
at once.

Fix the problem by serializing the operation on a per-child basis,
and by using PHOLD/PRELE prior to acquiring the child's p_token.
Then re-check the conditions before accepting the child.

* There is a small chance this will also reduce or fix VM on-exit races
in i386, as this bug could result in an already-destroyed process
being pulled off by the racing wait*(). Maybe 25% chance.


