Revision tags: v6.2.1, v6.2.0, v6.3.0 |
|
#
1be00ff1 |
| 31-Oct-2021 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Add kernel_fpu_begin() and kernel_fpu_end()
* Add kernel_fpu_begin() and kernel_fpu_end(). Some linux stuff in amdgpu will need it.
Generally speaking the entire FP system needs a rew
kernel - Add kernel_fpu_begin() and kernel_fpu_end()
* Add kernel_fpu_begin() and kernel_fpu_end(). Some linux stuff in amdgpu will need it.
Generally speaking the entire FP system needs a rewrite, but I'm not doing that now.
show more ...
|
Revision tags: v6.0.1 |
|
#
5f684046 |
| 07-Jul-2021 |
Aaron LI <aly@aaronly.me> |
npx: Remove an unused typedef and clean up a bit
* Remove unused typedef bool_t * Fix a minor typo: FXRSTR -> FXRSTOR * Fix indentation * Reorganize a bit
|
#
819b69cf |
| 07-Jul-2021 |
Aaron LI <aly@aaronly.me> |
npx: Fix inline ASM error in fpu_clean_state()
I made a mistake in commit 6becaabbb80a1d1b37c868ce7f22fca2ef6a743f when I changed it use 'fldz' in the inline ASM. Fix it.
|
#
59661255 |
| 05-Jul-2021 |
Aaron LI <aly@aaronly.me> |
npx: Use 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64
Since DragonFly is 64-bit only, use the 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64.
The new FXSAVE64/FXRSTOR64 version re
npx: Use 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64
Since DragonFly is 64-bit only, use the 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64.
The new FXSAVE64/FXRSTOR64 version represents FIP/FDP as 64-bit fields (union fp_addr.fa_64), while the legacy FXSAVE/FXRSTOR version uses split fields: 32-bit offset, 16-bit segment and 16-bit reserved field (union fp_addr.fa_32). The latter implies that the actual addresses are truncated to 32 bits which is insufficient in modern programs.
Improve the inline ASM code a bit to use 'xsave64'/'xrstor64' names. The extra 'area' variable is introduced to help avoid dereferencing 'void *' pointer.
Referred-to: NetBSD
show more ...
|
#
3fe3fa25 |
| 06-Jul-2021 |
Aaron LI <aly@aaronly.me> |
npx: Extend fpusave/fpurstor() to accept an XSAVE mask argument
XSAVE/XRSTOR requires a mask argument that determines the components/states to save/restore. Thus this argument controls the save are
npx: Extend fpusave/fpurstor() to accept an XSAVE mask argument
XSAVE/XRSTOR requires a mask argument that determines the components/states to save/restore. Thus this argument controls the save area size.
Extend fpusave/fpurstor() functions to accept an XSAVE mask argument, so the caller can choose the wanted components/states to save/restore and knows the exact area size.
NVMM will use this feature.
show more ...
|
#
6becaabb |
| 06-Jul-2021 |
Aaron LI <aly@aaronly.me> |
npx: Use 'fldz' in fpu_clean_state() to load dummy zero
The 'fldz' instruction pushs +0.0 onto the FPU register stack. Use it to replace the 'dummy_variable' variable.
Referred-to: NetBSD
|
#
6379cf29 |
| 06-Jun-2021 |
Aaron LI <aly@aaronly.me> |
kernel: Various minor whitespace adjustments and tweaks
|
#
53150464 |
| 16-May-2021 |
Aaron LI <aly@aaronly.me> |
npx: Export fpusave()/fpurstor() functions for NVMM
|
#
fb3360ae |
| 11-May-2021 |
Aaron LI <aly@aaronly.me> |
x86_64/specialreg.h: Rename several CR4 defines
Rename CR4_FXSR -> CR4_OSFXSR, CR4_XMM -> CR4_OSXMMEXCPT, and CR4_XSAVE -> CR4_OSXSAVE, so that they match the naming conventions in the Intel specifi
x86_64/specialreg.h: Rename several CR4 defines
Rename CR4_FXSR -> CR4_OSFXSR, CR4_XMM -> CR4_OSXMMEXCPT, and CR4_XSAVE -> CR4_OSXSAVE, so that they match the naming conventions in the Intel specification and look more clear.
Submitted in bug #3265 by chicken: https://bugs.dragonflybsd.org/issues/3265
show more ...
|
Revision tags: v6.0.0, v6.0.0rc1, v6.1.0 |
|
#
7c656f7b |
| 28-Jan-2021 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Separate XSAVE support from AVX support
* XSAVE may be supported without AVX, on certain CPUs and also under hypervisors that expose XSAVEOPT without AVX.
* Always use XSAVE, when suppor
kernel - Separate XSAVE support from AVX support
* XSAVE may be supported without AVX, on certain CPUs and also under hypervisors that expose XSAVEOPT without AVX.
* Always use XSAVE, when supported, regardless of whether AVX is supported or not.
Submitted-by: chicken Bug-ID: #3264
show more ...
|
Revision tags: v5.8.3, v5.8.2, v5.8.1 |
|
#
eca1e48f |
| 28-Mar-2020 |
Sascha Wildner <saw@online.de> |
kernel: Remove <sys/mplock2.h> from all files that do not need it.
|
Revision tags: v5.8.0 |
|
#
c2830aa6 |
| 27-Feb-2020 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Continue pmap work
* Conditionalize this work on PMAP_ADVANCED, default enabled.
* Remove md_page.pmap_count and md_page.writeable_count, no longer track these counts which cause tons of
kernel - Continue pmap work
* Conditionalize this work on PMAP_ADVANCED, default enabled.
* Remove md_page.pmap_count and md_page.writeable_count, no longer track these counts which cause tons of cache line interactions.
However, there are still a few stubborn hold-overs.
* The vm_page still needs to be soft-busied in the page fault path
* For now we need to have a md_page.interlock_count to flag pages being replaced by pmap_enter() (e.g. COW faults) in order to be able to safely dispose of the page without busying it.
This need will eventually go away, hopefully just leaving us with the soft-busy-count issue.
show more ...
|
Revision tags: v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3 |
|
#
00780082 |
| 03-May-2019 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Permanently fix FP bug - completely remove lazy heuristic
* Remove the FP lazy heuristic. When the FP unit is being used by a thread, it will now *always* be actively saved and restored
kernel - Permanently fix FP bug - completely remove lazy heuristic
* Remove the FP lazy heuristic. When the FP unit is being used by a thread, it will now *always* be actively saved and restored on context switch.
This means that if a process uses the FP unit at all, its context switches (to another thread) will active save/restore the state forever more.
* This fixes a known hardware bug on Intel CPUs that we thought was fixed before (by not saving The FP context from thread A from the DNA interrupt on thread B)... but it turns out it wasn't.
We could tickle the bug on Intel CPUs by forcing synth to regenerate its flavor index over and over again. This regeneration fork/exec's about 60,000 make's, sequencing concurrently on all cores, and usually hits the bug in less than 5 minutes.
* We no longer support lazy FP restores, period. This is like the fourth time I've tried to deal with this, so now its time to give up and not use lazy restoration at all, ever again.
show more ...
|
Revision tags: v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1, v5.2.2 |
|
#
e5aace14 |
| 11-Jun-2018 |
Matthew Dillon <dillon@apollo.backplane.com> |
Kernel - Additional cpu bug hardening part 1/2
* OpenBSD recently made a commit that scraps the use of delayed FP state saving due to a rumor that the content of FP registers owned by another pr
Kernel - Additional cpu bug hardening part 1/2
* OpenBSD recently made a commit that scraps the use of delayed FP state saving due to a rumor that the content of FP registers owned by another process can be speculatively detected when they are present for the current process, even when the TS bit is used to force a DNA trap.
This rumor has been circulating for a while. OpenBSD felt that the lack of responsiveness from Intel forced their hand. Since they've gone ahead and pushed a fix for this potential problem, we are going to as well.
* DragonFlyBSD already synchronously saves FP state on switch-out. However, it only cleans the state up afterwords by calling fninit and this isn't enough to actually erase the content in the %xmm registers. We want to continue to use delayed FP state restores because it saves a considerable amount of switching time when we do not have to do a FP restore.
Most programs touch the FP registers at startup due to rtld linking, and more and more programs use the %xmm registers as general purpose registers. OpenBSD's solution of always proactively saving and restoring FP state is a reasonable one. DragonFlyBSD is going to take a slightly different tact in order to try to retain more optimal switching behavior when the FP unit is not in continuous use.
* Our first fix is to issue a FP restore on dummy state after our FP save to guarantee that all FP registers are zerod after FP state is saved. This allows us to continue to support delayed FP state loads with zero chance of leakage between processes.
* Our second fix is to add a heuristic which allows the kernel to detect when the FP unit is being used seriously (verses just during program startup).
We have added a sysctl machdep.npx_fpu_heuristic heuristic for this purpose which defaults to the value 32. Values can be:
0 - Proactive FPU state loading disabled (old behavior retained). Note that the first fix remains active, the FP register state is still cleared after saving so no leakage can occur. All processes will take a DNA trap after a thread switch when they access the FP state.
1 - Proactive FPU state loading is enabled at all times for a thread after the first FP access made by that thread. Upon returning from a thread switch, the FPU state will already be ready to go and a DNA trap will not occur.
N - Proactive FPU state loading enabled for N context switches, then disabled. The next DNA fault after disablement then re-enables proactive loading for the next N context switches.
Default value is 32. This ensures that program startup can use the FP unit but integer-centric programs which don't afterwords will not incur the FP switching overhead (for either switch-away or switch-back).
show more ...
|
Revision tags: v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc, v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2 |
|
#
846ab393 |
| 30-Jun-2014 |
Markus Pfeiffer <markus.pfeiffer@morphism.de> |
Set CR4.OSFXSR before probing the mxcsr mask
This fixes Bug #2691
Reported-By: Antonio Huete Jiménez <tuxillo@quantumachine.net>
|
Revision tags: v3.8.1, v3.6.3 |
|
#
52c23ea0 |
| 12-Jun-2014 |
Sascha Wildner <saw@online.de> |
kernel: GC never true CPU_DISABLE_SSE checks from x86_64/vkernel64.
It is only an option in i386.
No functional changes.
Reported-by: profmakx
|
#
98d2b258 |
| 12-Jun-2014 |
Markus Pfeiffer <markus.pfeiffer@morphism.de> |
vkernel64: fix compilation after npx mask work
|
#
186c803f |
| 11-Jun-2014 |
Markus Pfeiffer <markus.pfeiffer@morphism.de> |
kernel/npx: Add detection code for default MXCSR mask
As per Intel/AMD manuals the default MXCSR mask can be probed by executing fxstor (if supported) and reading the mask from the stored state. Thi
kernel/npx: Add detection code for default MXCSR mask
As per Intel/AMD manuals the default MXCSR mask can be probed by executing fxstor (if supported) and reading the mask from the stored state. This patch adds detection of the mask when it is supported. Otherwise a default mask of 0xFFBF is used as before.
show more ...
|
Revision tags: v3.8.0 |
|
#
b7dc54f2 |
| 29-May-2014 |
Markus Pfeiffer <markus.pfeiffer@morphism.de> |
kernel/npx: add process name to error message
|
Revision tags: v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0, v3.4.3, v3.4.2 |
|
#
c68c010a |
| 18-May-2013 |
Markus Pfeiffer <markus.pfeiffer@morphism.de> |
fix MXCSR default value
XEN fails to initialise its vcpus to behave like actual cpus. One instance of this is that the MXCSR is not setup to the default value documented in as documented in AMD64 Ar
fix MXCSR default value
XEN fails to initialise its vcpus to behave like actual cpus. One instance of this is that the MXCSR is not setup to the default value documented in as documented in AMD64 Architecture Programmer's Manual Volume 1: Application Programming, Section Section 4.3.2
show more ...
|
Revision tags: v3.4.0, v3.4.1, v3.4.0rc, v3.5.0 |
|
#
e6e019a8 |
| 13-Jan-2013 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Fix signal FP save/restore issues when AVX is enabled
* The kernel was not saving/restoring the full FP context when entering into or returning from a signal, leading to corrupt FP regist
kernel - Fix signal FP save/restore issues when AVX is enabled
* The kernel was not saving/restoring the full FP context when entering into or returning from a signal, leading to corrupt FP registers even when AVX is not used, when AVX is enabled in the kernel.
ANY SIGNAL COULD CORRUPT THE FP STATE.
* Fixed by adjusting the on-user-stack fpsave area sizes and operation.
* This unfortunately changes a number of user visible structures. ucontext_t, mcontext_t, sigcontext, sigframe.
It is POSSIBLE that most userland use cases will be unaffected, but I'm not holding my breath.
Major-Sleuthing-by: ftigeot Testing-by: ftigeot, dillon
show more ...
|
#
70c57cb3 |
| 03-Jan-2013 |
Sascha Wildner <saw@online.de> |
kernel: The NPX_DEBUG kernel option is pc32 specific, too.
|
#
5cf56a8d |
| 29-Dec-2012 |
Alex Hornung <alex@alexhornung.com> |
x86_64 - support for AVX instructions
* CPU will be checked for XSAVE and AVX support on boot. If both are found, they will be enabled.
* If enabled, the kernel will use the XSAVE and XRSTOR i
x86_64 - support for AVX instructions
* CPU will be checked for XSAVE and AVX support on boot. If both are found, they will be enabled.
* If enabled, the kernel will use the XSAVE and XRSTOR instructions to save and restore FPU, SSE and AVX registers.
Originally-Submitted-by: Adam Sakareassen (with modifications)
show more ...
|
Revision tags: v3.2.2 |
|
#
1918fc5c |
| 24-Oct-2012 |
Sascha Wildner <saw@online.de> |
kernel: Make SMP support default (and non-optional).
The 'SMP' kernel option gets removed with this commit, so it has to be removed from everybody's configs.
Reviewed-by: sjg Approved-by: many
|
Revision tags: v3.2.1, v3.2.0, v3.3.0, v3.0.3 |
|
#
a8f1df17 |
| 12-Jul-2012 |
John Marino <netbsd@marino.st> |
x86_64 FPU: Set 64-bit precision for fadd/fsub/fsqrt etc.
On AMD64, GCC and the ABI expects the x87 unit to be running in 80/64 mode rather than 64/53 mode seen on i386. This corrects errors seen i
x86_64 FPU: Set 64-bit precision for fadd/fsub/fsqrt etc.
On AMD64, GCC and the ABI expects the x87 unit to be running in 80/64 mode rather than 64/53 mode seen on i386. This corrects errors seen in long double tests involving runtime calculations. Previously, the results of these runtime calculations would get rounded due to use of 53-bit mantissas.
show more ...
|