History log of /dragonfly/sys/platform/pc64/x86_64/npx.c (Results 1 – 25 of 33)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.2.1, v6.2.0, v6.3.0
# 1be00ff1 31-Oct-2021 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Add kernel_fpu_begin() and kernel_fpu_end()

* Add kernel_fpu_begin() and kernel_fpu_end(). Some linux stuff
in amdgpu will need it.

Generally speaking the entire FP system needs a rew

kernel - Add kernel_fpu_begin() and kernel_fpu_end()

* Add kernel_fpu_begin() and kernel_fpu_end(). Some linux stuff
in amdgpu will need it.

Generally speaking the entire FP system needs a rewrite, but I'm
not doing that now.

show more ...


Revision tags: v6.0.1
# 5f684046 07-Jul-2021 Aaron LI <aly@aaronly.me>

npx: Remove an unused typedef and clean up a bit

* Remove unused typedef bool_t
* Fix a minor typo: FXRSTR -> FXRSTOR
* Fix indentation
* Reorganize a bit


# 819b69cf 07-Jul-2021 Aaron LI <aly@aaronly.me>

npx: Fix inline ASM error in fpu_clean_state()

I made a mistake in commit 6becaabbb80a1d1b37c868ce7f22fca2ef6a743f when
I changed it use 'fldz' in the inline ASM. Fix it.


# 59661255 05-Jul-2021 Aaron LI <aly@aaronly.me>

npx: Use 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64

Since DragonFly is 64-bit only, use the 64-bit version
FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64.

The new FXSAVE64/FXRSTOR64 version re

npx: Use 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64

Since DragonFly is 64-bit only, use the 64-bit version
FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64.

The new FXSAVE64/FXRSTOR64 version represents FIP/FDP as 64-bit fields
(union fp_addr.fa_64), while the legacy FXSAVE/FXRSTOR version uses
split fields: 32-bit offset, 16-bit segment and 16-bit reserved field
(union fp_addr.fa_32). The latter implies that the actual addresses are
truncated to 32 bits which is insufficient in modern programs.

Improve the inline ASM code a bit to use 'xsave64'/'xrstor64' names.
The extra 'area' variable is introduced to help avoid dereferencing
'void *' pointer.

Referred-to: NetBSD

show more ...


# 3fe3fa25 06-Jul-2021 Aaron LI <aly@aaronly.me>

npx: Extend fpusave/fpurstor() to accept an XSAVE mask argument

XSAVE/XRSTOR requires a mask argument that determines the
components/states to save/restore. Thus this argument controls the
save are

npx: Extend fpusave/fpurstor() to accept an XSAVE mask argument

XSAVE/XRSTOR requires a mask argument that determines the
components/states to save/restore. Thus this argument controls the
save area size.

Extend fpusave/fpurstor() functions to accept an XSAVE mask argument,
so the caller can choose the wanted components/states to save/restore
and knows the exact area size.

NVMM will use this feature.

show more ...


# 6becaabb 06-Jul-2021 Aaron LI <aly@aaronly.me>

npx: Use 'fldz' in fpu_clean_state() to load dummy zero

The 'fldz' instruction pushs +0.0 onto the FPU register stack. Use it
to replace the 'dummy_variable' variable.

Referred-to: NetBSD


# 6379cf29 06-Jun-2021 Aaron LI <aly@aaronly.me>

kernel: Various minor whitespace adjustments and tweaks


# 53150464 16-May-2021 Aaron LI <aly@aaronly.me>

npx: Export fpusave()/fpurstor() functions for NVMM


# fb3360ae 11-May-2021 Aaron LI <aly@aaronly.me>

x86_64/specialreg.h: Rename several CR4 defines

Rename CR4_FXSR -> CR4_OSFXSR, CR4_XMM -> CR4_OSXMMEXCPT, and
CR4_XSAVE -> CR4_OSXSAVE, so that they match the naming conventions in
the Intel specifi

x86_64/specialreg.h: Rename several CR4 defines

Rename CR4_FXSR -> CR4_OSFXSR, CR4_XMM -> CR4_OSXMMEXCPT, and
CR4_XSAVE -> CR4_OSXSAVE, so that they match the naming conventions in
the Intel specification and look more clear.

Submitted in bug #3265 by chicken:
https://bugs.dragonflybsd.org/issues/3265

show more ...


Revision tags: v6.0.0, v6.0.0rc1, v6.1.0
# 7c656f7b 28-Jan-2021 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Separate XSAVE support from AVX support

* XSAVE may be supported without AVX, on certain CPUs and also under
hypervisors that expose XSAVEOPT without AVX.

* Always use XSAVE, when suppor

kernel - Separate XSAVE support from AVX support

* XSAVE may be supported without AVX, on certain CPUs and also under
hypervisors that expose XSAVEOPT without AVX.

* Always use XSAVE, when supported, regardless of whether AVX is supported
or not.

Submitted-by: chicken
Bug-ID: #3264

show more ...


Revision tags: v5.8.3, v5.8.2, v5.8.1
# eca1e48f 28-Mar-2020 Sascha Wildner <saw@online.de>

kernel: Remove <sys/mplock2.h> from all files that do not need it.


Revision tags: v5.8.0
# c2830aa6 27-Feb-2020 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Continue pmap work

* Conditionalize this work on PMAP_ADVANCED, default enabled.

* Remove md_page.pmap_count and md_page.writeable_count, no longer
track these counts which cause tons of

kernel - Continue pmap work

* Conditionalize this work on PMAP_ADVANCED, default enabled.

* Remove md_page.pmap_count and md_page.writeable_count, no longer
track these counts which cause tons of cache line interactions.

However, there are still a few stubborn hold-overs.

* The vm_page still needs to be soft-busied in the page fault path

* For now we need to have a md_page.interlock_count to flag pages
being replaced by pmap_enter() (e.g. COW faults) in order to be
able to safely dispose of the page without busying it.

This need will eventually go away, hopefully just leaving us with
the soft-busy-count issue.

show more ...


Revision tags: v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3
# 00780082 03-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Permanently fix FP bug - completely remove lazy heuristic

* Remove the FP lazy heuristic. When the FP unit is being used by a
thread, it will now *always* be actively saved and restored

kernel - Permanently fix FP bug - completely remove lazy heuristic

* Remove the FP lazy heuristic. When the FP unit is being used by a
thread, it will now *always* be actively saved and restored on
context switch.

This means that if a process uses the FP unit at all, its context
switches (to another thread) will active save/restore the state forever
more.

* This fixes a known hardware bug on Intel CPUs that we thought was fixed
before (by not saving The FP context from thread A from the DNA interrupt
on thread B)... but it turns out it wasn't.

We could tickle the bug on Intel CPUs by forcing synth to regenerate
its flavor index over and over again. This regeneration fork/exec's
about 60,000 make's, sequencing concurrently on all cores, and usually
hits the bug in less than 5 minutes.

* We no longer support lazy FP restores, period. This is like the fourth
time I've tried to deal with this, so now its time to give up and not
use lazy restoration at all, ever again.

show more ...


Revision tags: v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1, v5.2.2
# e5aace14 11-Jun-2018 Matthew Dillon <dillon@apollo.backplane.com>

Kernel - Additional cpu bug hardening part 1/2

* OpenBSD recently made a commit that scraps the use of delayed FP
state saving due to a rumor that the content of FP registers owned
by another pr

Kernel - Additional cpu bug hardening part 1/2

* OpenBSD recently made a commit that scraps the use of delayed FP
state saving due to a rumor that the content of FP registers owned
by another process can be speculatively detected when they are
present for the current process, even when the TS bit is used to
force a DNA trap.

This rumor has been circulating for a while. OpenBSD felt that the
lack of responsiveness from Intel forced their hand. Since they've
gone ahead and pushed a fix for this potential problem, we are
going to as well.

* DragonFlyBSD already synchronously saves FP state on switch-out.
However, it only cleans the state up afterwords by calling fninit
and this isn't enough to actually erase the content in the %xmm
registers. We want to continue to use delayed FP state restores
because it saves a considerable amount of switching time when we do
not have to do a FP restore.

Most programs touch the FP registers at startup due to rtld linking,
and more and more programs use the %xmm registers as general purpose
registers. OpenBSD's solution of always proactively saving and
restoring FP state is a reasonable one. DragonFlyBSD is going to
take a slightly different tact in order to try to retain more optimal
switching behavior when the FP unit is not in continuous use.

* Our first fix is to issue a FP restore on dummy state after our
FP save to guarantee that all FP registers are zerod after FP state
is saved. This allows us to continue to support delayed FP state
loads with zero chance of leakage between processes.

* Our second fix is to add a heuristic which allows the kernel to
detect when the FP unit is being used seriously (verses just during
program startup).

We have added a sysctl machdep.npx_fpu_heuristic heuristic for this
purpose which defaults to the value 32. Values can be:

0 - Proactive FPU state loading disabled (old behavior retained).
Note that the first fix remains active, the FP register state
is still cleared after saving so no leakage can occur. All
processes will take a DNA trap after a thread switch when they
access the FP state.

1 - Proactive FPU state loading is enabled at all times for a thread
after the first FP access made by that thread. Upon returning
from a thread switch, the FPU state will already be ready to go
and a DNA trap will not occur.

N - Proactive FPU state loading enabled for N context switches, then
disabled. The next DNA fault after disablement then re-enables
proactive loading for the next N context switches.

Default value is 32. This ensures that program startup can use
the FP unit but integer-centric programs which don't afterwords
will not incur the FP switching overhead (for either switch-away
or switch-back).

show more ...


Revision tags: v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc, v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2
# 846ab393 30-Jun-2014 Markus Pfeiffer <markus.pfeiffer@morphism.de>

Set CR4.OSFXSR before probing the mxcsr mask

This fixes Bug #2691

Reported-By: Antonio Huete Jiménez <tuxillo@quantumachine.net>


Revision tags: v3.8.1, v3.6.3
# 52c23ea0 12-Jun-2014 Sascha Wildner <saw@online.de>

kernel: GC never true CPU_DISABLE_SSE checks from x86_64/vkernel64.

It is only an option in i386.

No functional changes.

Reported-by: profmakx


# 98d2b258 12-Jun-2014 Markus Pfeiffer <markus.pfeiffer@morphism.de>

vkernel64: fix compilation after npx mask work


# 186c803f 11-Jun-2014 Markus Pfeiffer <markus.pfeiffer@morphism.de>

kernel/npx: Add detection code for default MXCSR mask

As per Intel/AMD manuals the default MXCSR mask can be probed
by executing fxstor (if supported) and reading the mask from the
stored state. Thi

kernel/npx: Add detection code for default MXCSR mask

As per Intel/AMD manuals the default MXCSR mask can be probed
by executing fxstor (if supported) and reading the mask from the
stored state. This patch adds detection of the mask when it is
supported. Otherwise a default mask of 0xFFBF is used as before.

show more ...


Revision tags: v3.8.0
# b7dc54f2 29-May-2014 Markus Pfeiffer <markus.pfeiffer@morphism.de>

kernel/npx: add process name to error message


Revision tags: v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0, v3.4.3, v3.4.2
# c68c010a 18-May-2013 Markus Pfeiffer <markus.pfeiffer@morphism.de>

fix MXCSR default value

XEN fails to initialise its vcpus to behave like actual cpus. One
instance of this is that the MXCSR is not setup to the default
value documented in as documented in AMD64 Ar

fix MXCSR default value

XEN fails to initialise its vcpus to behave like actual cpus. One
instance of this is that the MXCSR is not setup to the default
value documented in as documented in AMD64 Architecture
Programmer's Manual Volume 1: Application Programming, Section
Section 4.3.2

show more ...


Revision tags: v3.4.0, v3.4.1, v3.4.0rc, v3.5.0
# e6e019a8 13-Jan-2013 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix signal FP save/restore issues when AVX is enabled

* The kernel was not saving/restoring the full FP context when entering into
or returning from a signal, leading to corrupt FP regist

kernel - Fix signal FP save/restore issues when AVX is enabled

* The kernel was not saving/restoring the full FP context when entering into
or returning from a signal, leading to corrupt FP registers even when
AVX is not used, when AVX is enabled in the kernel.

ANY SIGNAL COULD CORRUPT THE FP STATE.

* Fixed by adjusting the on-user-stack fpsave area sizes and operation.

* This unfortunately changes a number of user visible structures.
ucontext_t, mcontext_t, sigcontext, sigframe.

It is POSSIBLE that most userland use cases will be unaffected, but I'm
not holding my breath.

Major-Sleuthing-by: ftigeot
Testing-by: ftigeot, dillon

show more ...


# 70c57cb3 03-Jan-2013 Sascha Wildner <saw@online.de>

kernel: The NPX_DEBUG kernel option is pc32 specific, too.


# 5cf56a8d 29-Dec-2012 Alex Hornung <alex@alexhornung.com>

x86_64 - support for AVX instructions

* CPU will be checked for XSAVE and AVX support on boot. If both are
found, they will be enabled.

* If enabled, the kernel will use the XSAVE and XRSTOR i

x86_64 - support for AVX instructions

* CPU will be checked for XSAVE and AVX support on boot. If both are
found, they will be enabled.

* If enabled, the kernel will use the XSAVE and XRSTOR instructions to
save and restore FPU, SSE and AVX registers.

Originally-Submitted-by: Adam Sakareassen (with modifications)

show more ...


Revision tags: v3.2.2
# 1918fc5c 24-Oct-2012 Sascha Wildner <saw@online.de>

kernel: Make SMP support default (and non-optional).

The 'SMP' kernel option gets removed with this commit, so it has to
be removed from everybody's configs.

Reviewed-by: sjg
Approved-by: many


Revision tags: v3.2.1, v3.2.0, v3.3.0, v3.0.3
# a8f1df17 12-Jul-2012 John Marino <netbsd@marino.st>

x86_64 FPU: Set 64-bit precision for fadd/fsub/fsqrt etc.

On AMD64, GCC and the ABI expects the x87 unit to be running in 80/64
mode rather than 64/53 mode seen on i386. This corrects errors seen
i

x86_64 FPU: Set 64-bit precision for fadd/fsub/fsqrt etc.

On AMD64, GCC and the ABI expects the x87 unit to be running in 80/64
mode rather than 64/53 mode seen on i386. This corrects errors seen
in long double tests involving runtime calculations. Previously, the
results of these runtime calculations would get rounded due to use
of 53-bit mantissas.

show more ...


12