History log of /dragonfly/sys/platform/pc64/x86_64/swtch.s (Results 1 – 25 of 34)
Revision Date Author Comments
# a653ba3a 05-Apr-2023 Sascha Wildner <saw@online.de>

kernel/platform: Remove useless commented out includes of use_npx.h.


Revision tags: v6.4.0, v6.4.0rc1, v6.5.0, v6.2.2, v6.2.1, v6.2.0, v6.3.0, v6.0.1, v6.0.0, v6.0.0rc1, v6.1.0, v5.8.3, v5.8.2, v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3
# 00780082 03-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Permanently fix FP bug - completely remove lazy heuristic

* Remove the FP lazy heuristic. When the FP unit is being used by a
thread, it will now *always* be actively saved and restored on
context switch.

This means that if a process uses the FP unit at all, its context
switches (to another thread) will actively save/restore the state
forevermore.

* This fixes a known hardware bug on Intel CPUs that we thought was fixed
before (by not saving the FP context from thread A in the DNA interrupt
taken on thread B)... but it turns out it wasn't.

We could tickle the bug on Intel CPUs by forcing synth to regenerate
its flavor index over and over again. This regeneration fork/execs
about 60,000 makes, running concurrently on all cores, and usually
hits the bug in less than 5 minutes.

* We no longer support lazy FP restores, period. This is like the fourth
time I've tried to deal with this, so now it's time to give up and not
use lazy restoration at all, ever again.
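For illustration only, a minimal user-space sketch (not the swtch.s code) of what an unconditional FP save/restore looks like at the instruction level; the fpu_save/fpu_restore helpers and the fpu_area layout are invented for the example:

    /* Sketch: fxsave/fxrstor capture and restore the full x87/SSE register
     * file, which is what the switch path now does unconditionally for any
     * thread that has touched the FP unit.
     */
    #include <stdint.h>
    #include <stdio.h>

    /* fxsave/fxrstor need a 512-byte, 16-byte-aligned save area. */
    struct fpu_area {
        uint8_t bytes[512];
    } __attribute__((aligned(16)));

    static void fpu_save(struct fpu_area *a)
    {
        __asm__ volatile("fxsave %0" : "=m"(*a));
    }

    static void fpu_restore(const struct fpu_area *a)
    {
        __asm__ volatile("fxrstor %0" : : "m"(*a));
    }

    int main(void)
    {
        struct fpu_area saved;
        volatile double x = 1.5;

        x *= 2.0;              /* touch the FP unit */
        fpu_save(&saved);      /* "switch-out": always save               */
        fpu_restore(&saved);   /* "switch-in": always restore, no DNA trap */
        printf("%f\n", x);
        return 0;
    }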



Revision tags: v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1, v5.2.2
# e5aace14 11-Jun-2018 Matthew Dillon <dillon@apollo.backplane.com>

Kernel - Additional cpu bug hardening part 1/2

* OpenBSD recently made a commit that scraps the use of delayed FP
state saving due to a rumor that the content of FP registers owned
by another process can be speculatively detected when they are
present for the current process, even when the TS bit is used to
force a DNA trap.

This rumor has been circulating for a while. OpenBSD felt that the
lack of responsiveness from Intel forced their hand. Since they've
gone ahead and pushed a fix for this potential problem, we are
going to as well.

* DragonFlyBSD already synchronously saves FP state on switch-out.
However, it only cleans the state up afterwards by calling fninit
and this isn't enough to actually erase the content in the %xmm
registers. We want to continue to use delayed FP state restores
because it saves a considerable amount of switching time when we do
not have to do a FP restore.

Most programs touch the FP registers at startup due to rtld linking,
and more and more programs use the %xmm registers as general purpose
registers. OpenBSD's solution of always proactively saving and
restoring FP state is a reasonable one. DragonFlyBSD is going to
take a slightly different tack in order to try to retain more optimal
switching behavior when the FP unit is not in continuous use.

* Our first fix is to issue an FP restore on dummy state after our
FP save to guarantee that all FP registers are zeroed after FP state
is saved. This allows us to continue to support delayed FP state
loads with zero chance of leakage between processes.

* Our second fix is to add a heuristic which allows the kernel to
detect when the FP unit is being used seriously (versus just during
program startup).

We have added a sysctl, machdep.npx_fpu_heuristic, for this
purpose, which defaults to the value 32. Values can be:

0 - Proactive FPU state loading disabled (old behavior retained).
Note that the first fix remains active; the FP register state
is still cleared after saving, so no leakage can occur. All
processes will take a DNA trap after a thread switch when they
access the FP state.

1 - Proactive FPU state loading is enabled at all times for a thread
after the first FP access made by that thread. Upon returning
from a thread switch, the FPU state will already be ready to go
and a DNA trap will not occur.

N - Proactive FPU state loading enabled for N context switches, then
disabled. The next DNA fault after disablement then re-enables
proactive loading for the next N context switches.

Default value is 32. This ensures that program startup can use
the FP unit while integer-centric programs that don't use it afterwards
will not incur the FP switching overhead (for either switch-away
or switch-back).
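As a rough illustration of the 0/1/N policy above (a sketch only; npx_fpu_heuristic is the sysctl named above, but the thread fields and helper function are made up for the example):

    #include <stdbool.h>

    static int npx_fpu_heuristic = 32;    /* machdep.npx_fpu_heuristic */

    struct thread {
        bool fpu_used;      /* thread has touched the FP unit at least once */
        int  fpu_budget;    /* remaining proactive restores (N mode)        */
    };

    /* Switch-in decision: restore FP state now, or let the next FP access
     * take a DNA trap and restore lazily.
     */
    static bool proactive_fpu_restore(struct thread *td)
    {
        if (!td->fpu_used || npx_fpu_heuristic == 0)
            return false;                 /* value 0: always lazy          */
        if (npx_fpu_heuristic == 1)
            return true;                  /* value 1: always proactive     */
        if (td->fpu_budget > 0) {
            --td->fpu_budget;             /* value N: proactive for N      */
            return true;                  /* switches, then lazy again     */
        }
        return false;
    }

    /* The DNA trap taken after the budget runs out re-arms it:
     *     td->fpu_budget = npx_fpu_heuristic;
     */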



Revision tags: v5.2.1, v5.2.0, v5.3.0, v5.2.0rc
# fc921477 04-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Intel user/kernel separation MMU bug fix part 2/3

* Cleanup pass. Throw in some documentation.

* Move the gd_pcb_* fields into the trampoline page to allow
kernel memory to be further restricted in part 3.



# 4611d87f 03-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Intel user/kernel separation MMU bug fix part 1/3

* Part 1/3 of the fix for the Intel user/kernel separation MMU bug.
It appears that it is possible to discern the contents of kernel
memory with careful timing measurements of instructions due to
speculative memory reads and speculative instruction execution
by Intel cpus. This can happen because Intel will allow both to
occur even when the memory access is later disallowed due to
privilege separation in the PTE.

Even though the execution is always aborted, the speculative
reads and speculative execution results in timing artifacts which
can be measured. A speculative compare/branch can lead to timing
artifacts that allow the actual contents of kernel memory to be
discerned.

While there are multiple speculative attacks possible, the Intel
bug is particularly bad because it allows a user program to more
or less effortlessly access kernel memory (and if a DMAP is
present, all of physical memory).

* Part 1 implements all the logic required to load an 'isolated'
version of the user process's PML4e into %cr3 on all user
transitions, and to load the 'normal' U+K version into %cr3 on
all transitions from user to kernel.

* Part 1 fully allocates, copies, and implements the %cr3 loads for
the 'isolated' version of the user process PML4e.

* Part 1 does not yet actually adjust the contents of this isolated
version to replace the kernel map with just a trampoline map in
kernel space. It does remove the DMAP as a test, though. The
full separation will be done in part 3.
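A deliberately simplified model of the dual-PML4 bookkeeping described above (field and function names are invented for the sketch; the real work happens in the pmap code and the exception/syscall trampolines):

    #include <stdint.h>
    #include <stdio.h>

    struct pmap {
        uint64_t pm_cr3;       /* "normal" PML4: user + full kernel mappings */
        uint64_t pm_cr3_iso;   /* isolated PML4: user + trampoline only      */
    };

    static uint64_t cr3;       /* stands in for the real %cr3 register */

    static void exit_to_user(const struct pmap *pm) { cr3 = pm->pm_cr3_iso; }
    static void enter_kernel(const struct pmap *pm) { cr3 = pm->pm_cr3; }

    int main(void)
    {
        struct pmap p = { .pm_cr3 = 0x1000, .pm_cr3_iso = 0x2000 };

        exit_to_user(&p);   /* user mode can only walk the isolated tables */
        enter_kernel(&p);   /* trampoline reloads the full U+K tables      */
        printf("cr3=%#llx\n", (unsigned long long)cr3);
        return 0;
    }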



# 466d4f43 19-Dec-2017 zrj <rimvydas.jasinskas@gmail.com>

kernel/pc64: Adjust some references to already removed i386.

While there, perform some whitespace fixes.
No functional change.


Revision tags: v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1
# b1d2a2de 23-Jul-2017 zrj <rimvydas.jasinskas@gmail.com>

sys: Add size directives to assembly functions.

No functional change intended.


# 87ef2da6 23-Jul-2017 zrj <rimvydas.jasinskas@gmail.com>

sys: Some whitespace cleanup.

While there, fix indentation and a few typos.
No functional change.


Revision tags: v4.8.0, v4.6.2, v4.9.0, v4.8.0rc
# 95270b7e 01-Feb-2017 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Many fixes for vkernel support, plus a few main kernel fixes

REAL KERNEL

* The big enchilada is that the main kernel's thread switch code has
a small timing window where it clears the PM_ACTIVE bit for the cpu
while switching between two threads. However, it *ALSO* checks and
avoids loading the %cr3 if the two threads have the same pmap.

This results in a situation where an invalidation on the pmap in another
cpu may not have visibility to the cpu doing the switch, and yet the
cpu doing the switch also decides not to reload %cr3 and so does not
invalidate the TLB either. The result is a stale TLB and bad things
happen.

For now just unconditionally load %cr3 until I can come up with code
to handle the case (see the sketch after this entry).

This bug is very difficult to reproduce on a normal system, it requires
a multi-threaded program doing nasty things (munmap, etc) on one cpu
while another thread is switching to a third thread on some other cpu.

* KNOTE after handling the vkernel trap in postsig() instead of before.

* Change the kernel's pmap_inval_smp() code to take a 64-bit npgs
argument instead of a 32-bit npgs argument. This fixes situations
that crop up when a process uses more than 16TB of address space.

* Add an lfence to the pmap invalidation code that I think might be
needed.

* Handle some wrap/overflow cases in pmap_scan() related to the use of
large address spaces.

* Fix an unnecessary invltlb in pmap_clearbit() for unmanaged PTEs.

* Test PG_RW after locking the pv_entry to handle potential races.

* Add bio_crc to struct bio. This field is only used for debugging for
now but may come in useful later.

* Add some global debug variables in the pmap_inval_smp() and related
paths. Refactor the npgs handling.

* Load the tsc_target field after waiting for completion of the previous
invalidation op instead of before. Also add a conservative mfence()
in the invalidation path before loading the info fields.

* Remove the global pmap_inval_bulk_count counter.

* Adjust swtch.s to always reload the user process %cr3, with an
explanation. FIXME LATER!

* Add some test code to vm/swap_pager.c which double-checks that the page
being paged out does not get corrupted during the operation. This code
is #if 0'd.

* We must hold an object lock around the swp_pager_meta_ctl() call in
swp_pager_async_iodone(). I think.

* Reorder when PG_SWAPINPROG is cleared. Finish the I/O before clearing
the bit.

* Change the vm_map_growstack() API to pass a vm_map in instead of
curproc.

* Use atomic ops for vm_object->generation counts, since objects can be
locked shared.

VKERNEL

* Unconditionally save the FP state after returning from VMSPACE_CTL_RUN.
This solves a severe FP corruption bug in the vkernel due to calls it
makes into libc (which uses %xmm registers all over the place).

This is not a complete fix. We need a formal userspace/kernelspace FP
abstraction. Right now the vkernel doesn't have a kernelspace FP
abstraction so if a kernel thread switches preemptively bad things
happen.

* The kernel tracks and locks pv_entry structures to interlock pte's.
The vkernel never caught up, and does not really have a pv_entry or
placemark mechanism. The vkernel's pmap really needs a complete
re-port from the real-kernel pmap code. Until then, we use poor hacks.

* Use the vm_page's spinlock to interlock pte changes.

* Make sure that PG_WRITEABLE is set or cleared with the vm_page
spinlock held.

* Have pmap_clearbit() acquire the pmobj token for the pmap in the
iteration. This appears to be necessary, currently, as most of the
rest of the vkernel pmap code also uses the pmobj token.

* Fix bugs in the vkernel's swapu32() and swapu64().

* Change pmap_page_lookup() and pmap_unwire_pgtable() to fully busy
the page. Note however that a page table page is currently never
soft-busied. The same applies to other vkernel code that busies a page table page.

* Fix some sillycode in a pmap->pm_ptphint test.

* Don't inherit e.g. PG_M from the previous pte when overwriting it
with a pte of a different physical address.

* Change the vkernel's pmap_clear_modify() function to clear VPTE_RW
(which also clears VPTE_M), and not just VPTE_M. Formally we want
the vkernel to be notified when a page becomes modified and it won't
be unless we also clear VPTE_RW and force a fault. <--- I may change
this back after testing.

* Wrap pmap_replacevm() with a critical section.

* Scrap the old grow_stack() code. vm_fault() and vm_fault_page() handle
it (vm_fault_page() just now got the ability).

* Properly flag VM_FAULT_USERMODE.



Revision tags: v4.6.1, v4.6.0
# b4758707 26-Jul-2016 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Disable lwp->lwp optimization in thread switcher

* Put #ifdef around the existing lwp->lwp switch optimization and then
disable it. This optimization tries to avoid reloading %cr3 and avoid
pmap->pm_active atomic ops when switching to an lwp that shares the same
process.

This optimization is no longer applicable on multi-core systems as such
switches are very rare. LWPs are usually distributed across multiple cores
so rarely does one switch to another on the same core (and in cpu-bound
situations, the scheduler will already be in batch mode). The conditionals
in the optimization, on the other hand, did measurably (just slightly)
reduce performance for normal switches. So turn it off.

* Implement an optimization for interrupt preemptions, but disable it for
now. I want to keep the code handy but so far my tests show no improvement
in performance with huge interrupt rates (from nvme devices), so it is
#undef'd for now.



# 405f56bc 26-Jul-2016 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Minor cleanup swtch.s

* Minor cleanup


# ee89e80b 26-Jul-2016 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Reduce atomic ops in switch code

* Instead of using four atomic 'and' ops and four atomic 'or' ops, use
one atomic 'and' and one atomic 'or' when adjusting the pmap->pm_active.

* Store the array index and simplified cpu mask in the globaldata structure
for the above operation.
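Roughly, the change amounts to something like the following (a sketch; the gd_cpuindex and gd_cpumask_simple names are placeholders for whatever fields the globaldata structure actually gained):

    #include <stdatomic.h>
    #include <stdint.h>

    #define CPUMASK_ELEMENTS 4            /* 256 cpus / 64 bits per word */

    typedef struct { _Atomic uint64_t ary[CPUMASK_ELEMENTS]; } cpumask_t;

    struct globaldata {
        int      gd_cpuindex;             /* cpuid / 64, computed once      */
        uint64_t gd_cpumask_simple;       /* 1 << (cpuid % 64), precomputed */
    };

    /* Switch path: one atomic 'and' to leave the old pmap's pm_active and
     * one atomic 'or' to join the new one, instead of touching every word.
     */
    static void
    pmap_switch_active(cpumask_t *oldact, cpumask_t *newact, struct globaldata *gd)
    {
        atomic_fetch_and(&oldact->ary[gd->gd_cpuindex], ~gd->gd_cpumask_simple);
        atomic_fetch_or(&newact->ary[gd->gd_cpuindex], gd->gd_cpumask_simple);
    }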



Revision tags: v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2
# 2c64e990 25-Jan-2016 zrj <rimvydas.jasinskas@gmail.com>

Remove advertising header from sys/

Correct BSD License clause numbering from 1-2-4 to 1-2-3.

Some less clear cases were handled as was done in FreeBSD.


Revision tags: v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3
# 32d3bd25 06-Jan-2015 Sascha Wildner <saw@online.de>

kernel/pc64: Change all the remaining #if JG's to #if 0 (fixing -Wundef).


Revision tags: v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2
# 06c66eb2 04-Jul-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Refactor cpumask_t to extend cpus past 64, part 2/2

* Expand SMP_MAXCPU from 64 to 256 (64-bit only)

* Expand cpumask_t from 64 to 256 bits

* Refactor the C macros and the assembly code.

* Add misc cpu_pause()s and do a bit of work on the boot sequencing.
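The general shape of a 256-bit mask built from 64-bit words, for orientation (a sketch with made-up helper names, not the real cpumask macros):

    #include <stdbool.h>
    #include <stdint.h>

    #define SMP_MAXCPU 256

    /* 256 bits as four 64-bit words; ordinary bit-for-bit operations replace
     * the old single 64-bit integer.
     */
    typedef struct { uint64_t ary[SMP_MAXCPU / 64]; } cpumask_t;

    static inline void cpumask_set(cpumask_t *m, int cpu)
    {
        m->ary[cpu >> 6] |= 1ULL << (cpu & 63);
    }

    static inline void cpumask_clear(cpumask_t *m, int cpu)
    {
        m->ary[cpu >> 6] &= ~(1ULL << (cpu & 63));
    }

    static inline bool cpumask_test(const cpumask_t *m, int cpu)
    {
        return (m->ary[cpu >> 6] >> (cpu & 63)) & 1;
    }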



# cc694a4a 30-Jun-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Move CPUMASK_LOCK out of the cpumask_t

* Add cpulock_t (a 32-bit integer on all platforms) and implement
CPULOCK_EXCL as well as space for a counter.

* Break out CPUMASK_LOCK, add a new field to the pmap (pm_active_lock)
and to the process vmm (p_vmm_cpulock), and implement the mmu interlock
there.

The VMM subsystem uses additional bits in cpulock_t as a mask counter
for implementing its interlock.

The PMAP subsystem just uses the CPULOCK_EXCL bit in pm_active_lock for
its own interlock.

* Max cpus on 64-bit systems is now 64 instead of 63.

* cpumask_t is now just a pure cpu mask and no longer requires all-or-none
atomic ops, just normal bit-for-bit atomic ops. This will allow us to
hopefully extend it past the 64-cpu limit soon.
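A sketch of how a 32-bit cpulock_t can carry an exclusive bit plus room for a counter (the constant values and helpers here are illustrative, not the actual header):

    #include <stdatomic.h>
    #include <stdint.h>

    typedef uint32_t cpulock_t;

    #define CPULOCK_EXCL    0x80000000u   /* exclusive (interlock) bit      */
    #define CPULOCK_CNTMASK 0x7fffffffu   /* room for a mask/counter below  */

    /* pmap-style use: acquire and release just the exclusive bit. */
    static void
    cpulock_excl_acquire(_Atomic cpulock_t *lk)
    {
        for (;;) {
            cpulock_t v = atomic_load(lk) & ~CPULOCK_EXCL;  /* expect clear */
            if (atomic_compare_exchange_weak(lk, &v, v | CPULOCK_EXCL))
                return;
            /* bit was set (or the CAS raced); spin and retry */
        }
    }

    static void
    cpulock_excl_release(_Atomic cpulock_t *lk)
    {
        atomic_fetch_and(lk, (cpulock_t)~CPULOCK_EXCL);
    }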



Revision tags: v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2
# 902419bf 02-Apr-2014 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix mid-kernel-boot lockups due to module order

* Sometimes the kernel would just lock up for no apparent reason. Changing
the preloaded module set (even randomly) could affect the behavior.

* The problem turned out to be an issue with kernel modules needing to
temporarily migrate to another cpu (such as when installing a SWI or
other interrupt handler). If the idle thread on cpu 0 had not yet
bootstrapped, lwkt_switch_return() would not be called properly and
the LWKT cpu migration code would never schedule on the target cpu.

* Fix the problem by handling the idle thread bootstrap case for cpu 0
properly in pc64.



Revision tags: v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0, v3.4.3, v3.4.2, v3.4.0, v3.4.1, v3.4.0rc, v3.5.0, v3.2.2
# 1918fc5c 24-Oct-2012 Sascha Wildner <saw@online.de>

kernel: Make SMP support default (and non-optional).

The 'SMP' kernel option gets removed with this commit, so it has to
be removed from everybody's configs.

Reviewed-by: sjg
Approved-by: many


Revision tags: v3.2.1, v3.2.0, v3.3.0, v3.0.3, v3.0.2, v3.0.1, v3.1.0, v3.0.0
# 3338cc67 12-Dec-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Misc fixes and debugging

* Add required CLDs in the exception paths. The interrupt paths already
CLD in PUSH_FRAME.

* Note that the fast_syscall (SYSENTER) path has an implied CLD due to the
hardware mask applied to rflags.

* Add the IOPL bits to the set of bits set to 0 during a fast_syscall.

* When creating a dummy interrupt frame we don't have to push the
actual %cs. Just push $0 so the frame isn't misinterpreted as coming
from userland.

* Additional debug verbosity for freeze_on_seg_fault.

* Reserve two void * fields for LWP debugging (for a later commit)



# 121f93bc 08-Dec-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Add TDF_RUNNING assertions

* Assert that the target lwkt thread being switched to is not flagged as
running.

* Assert that the originating lwkt thread being switched from is flagged as
running.

* Fix the running flag initial condition for the idle thread.



# d8d8c8c5 01-Dec-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix pmap->pm_active race in switch code

* Use an atomic cmpxchg to set the cpu bit in the pmap->pm_active bitmap
AND test the pmap interlock bit at the same time, instead of testing
the interlock bit afterwards.

* In addition, if we find the lock bit set and must spin-wait for it to
clear, we skip the %cr3 comparison check and unconditionally load %cr3.

* It is unclear if the race could be realized in any way. It was probably
not responsible for the seg-fault issue as prior tests with an unconditional
load of %cr3 did not fix the problem. Plus in the same-%cr3-as-last-thread
case the cpu bit is already set so there should be no possibility of
losing a TLB interlock IPI (and %cr3 is loaded unconditionally when it
doesn't match, so....).

But fix the race anyway.
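In outline, the combined set-and-test looks something like this (a sketch; at the time the interlock lived in the pm_active mask itself, modeled here as a single 64-bit word with an invented CPUMASK_LOCK value):

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define CPUMASK_LOCK (1ULL << 63)     /* interlock bit (illustrative value) */

    /* Set our cpu's bit in pm_active and test the interlock bit in one atomic
     * step.  Returns false if the pmap was interlocked, in which case the
     * caller spin-waits and then loads %cr3 unconditionally.
     */
    static bool
    pmap_activate_cpu(_Atomic uint64_t *pm_active, int cpuid)
    {
        uint64_t old = atomic_load(pm_active);
        uint64_t bit = 1ULL << cpuid;

        for (;;) {
            if (old & CPUMASK_LOCK)
                return false;             /* interlocked: take the slow path */
            if (atomic_compare_exchange_weak(pm_active, &old, old | bit))
                return true;              /* bit set, no interlock observed  */
            /* 'old' was refreshed by the failed cmpxchg; re-test and retry */
        }
    }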



# 0ae279a9 30-Nov-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - swtch.s cleanup

* Cleanup the code a bit, no functional changes.


# 86d7f5d3 26-Nov-2011 John Marino <draco@marino.st>

Initial import of binutils 2.22 on the new vendor branch

Future versions of binutils will also reside on this branch rather
than continuing to create new binutils branches for each new version.


# f2081646 12-Nov-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Add syscall quick return path for x86-64

* Flag the case where a sysretq can be performed to quickly return
from a system call instead of having to execute the slower doreti
code.

* This about halves syscall times for simple system calls such as
getuid(), and reduces longer syscalls by ~80ns or so on a fast
3.4GHz SandyBridge, but does not seem to really affect performance
a whole lot.

Taken-From: FreeBSD (loosely)
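Conceptually the flagging works like the following sketch (the flag name and helper are invented; the real test and the sysretq itself live in assembly):

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical per-thread flag meaning "the frame was modified or AST
     * processing is pending, so take the full doreti path".
     */
    #define TDF_FULL_RETURN 0x0001u

    struct thread { uint32_t td_flags; };

    /* sysretq only reloads %rip and %rflags (from %rcx/%r11), so the quick
     * path is safe only when nothing else in the trapframe needs attention.
     */
    static bool
    syscall_can_sysret(const struct thread *td)
    {
        return (td->td_flags & TDF_FULL_RETURN) == 0;
    }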



Revision tags: v2.12.0, v2.13.0, v2.10.1, v2.11.0, v2.10.0, v2.9.1, v2.8.2, v2.8.1, v2.8.0, v2.9.0, v2.6.3, v2.7.3, v2.6.2, v2.7.2, v2.7.1, v2.6.1, v2.7.0, v2.6.0
# b2b3ffcd 04-Nov-2009 Simon Schubert <corecode@dragonflybsd.org>

rename amd64 architecture to x86_64

The rest of the world seems to call amd64 x86_64. Bite the bullet and
rename all of the architecture files and references. This will
hopefully make pkgsrc builds less painful.

Discussed-with: dillon@


