npx.c - OpenGrok history log for /dragonfly/sys/platform/pc64/x86

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: v6.2.1, v6.2.0, v6.3.0
# 1be00ff1	31-Oct-2021	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Add kernel_fpu_begin() and kernel_fpu_end() * Add kernel_fpu_begin() and kernel_fpu_end(). Some linux stuff in amdgpu will need it. Generally speaking the entire FP system needs a rew kernel - Add kernel_fpu_begin() and kernel_fpu_end() * Add kernel_fpu_begin() and kernel_fpu_end(). Some linux stuff in amdgpu will need it. Generally speaking the entire FP system needs a rewrite, but I'm not doing that now. show more ...
Revision tags: v6.0.1
# 5f684046	07-Jul-2021	Aaron LI <aly@aaronly.me>	npx: Remove an unused typedef and clean up a bit * Remove unused typedef bool_t * Fix a minor typo: FXRSTR -> FXRSTOR * Fix indentation * Reorganize a bit
# 819b69cf	07-Jul-2021	Aaron LI <aly@aaronly.me>	npx: Fix inline ASM error in fpu_clean_state() I made a mistake in commit 6becaabbb80a1d1b37c868ce7f22fca2ef6a743f when I changed it use 'fldz' in the inline ASM. Fix it.
# 59661255	05-Jul-2021	Aaron LI <aly@aaronly.me>	npx: Use 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64 Since DragonFly is 64-bit only, use the 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64. The new FXSAVE64/FXRSTOR64 version re npx: Use 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64 Since DragonFly is 64-bit only, use the 64-bit version FXSAVE64/FXRSTOR64 and XSAVE64/XRSTOR64. The new FXSAVE64/FXRSTOR64 version represents FIP/FDP as 64-bit fields (union fp_addr.fa_64), while the legacy FXSAVE/FXRSTOR version uses split fields: 32-bit offset, 16-bit segment and 16-bit reserved field (union fp_addr.fa_32). The latter implies that the actual addresses are truncated to 32 bits which is insufficient in modern programs. Improve the inline ASM code a bit to use 'xsave64'/'xrstor64' names. The extra 'area' variable is introduced to help avoid dereferencing 'void *' pointer. Referred-to: NetBSD show more ...
# 3fe3fa25	06-Jul-2021	Aaron LI <aly@aaronly.me>	npx: Extend fpusave/fpurstor() to accept an XSAVE mask argument XSAVE/XRSTOR requires a mask argument that determines the components/states to save/restore. Thus this argument controls the save are npx: Extend fpusave/fpurstor() to accept an XSAVE mask argument XSAVE/XRSTOR requires a mask argument that determines the components/states to save/restore. Thus this argument controls the save area size. Extend fpusave/fpurstor() functions to accept an XSAVE mask argument, so the caller can choose the wanted components/states to save/restore and knows the exact area size. NVMM will use this feature. show more ...
# 6becaabb	06-Jul-2021	Aaron LI <aly@aaronly.me>	npx: Use 'fldz' in fpu_clean_state() to load dummy zero The 'fldz' instruction pushs +0.0 onto the FPU register stack. Use it to replace the 'dummy_variable' variable. Referred-to: NetBSD
# 6379cf29	06-Jun-2021	Aaron LI <aly@aaronly.me>	kernel: Various minor whitespace adjustments and tweaks
# 53150464	16-May-2021	Aaron LI <aly@aaronly.me>	npx: Export fpusave()/fpurstor() functions for NVMM
# fb3360ae	11-May-2021	Aaron LI <aly@aaronly.me>	x86_64/specialreg.h: Rename several CR4 defines Rename CR4_FXSR -> CR4_OSFXSR, CR4_XMM -> CR4_OSXMMEXCPT, and CR4_XSAVE -> CR4_OSXSAVE, so that they match the naming conventions in the Intel specifi x86_64/specialreg.h: Rename several CR4 defines Rename CR4_FXSR -> CR4_OSFXSR, CR4_XMM -> CR4_OSXMMEXCPT, and CR4_XSAVE -> CR4_OSXSAVE, so that they match the naming conventions in the Intel specification and look more clear. Submitted in bug #3265 by chicken: https://bugs.dragonflybsd.org/issues/3265 show more ...
Revision tags: v6.0.0, v6.0.0rc1, v6.1.0
# 7c656f7b	28-Jan-2021	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Separate XSAVE support from AVX support * XSAVE may be supported without AVX, on certain CPUs and also under hypervisors that expose XSAVEOPT without AVX. * Always use XSAVE, when suppor kernel - Separate XSAVE support from AVX support * XSAVE may be supported without AVX, on certain CPUs and also under hypervisors that expose XSAVEOPT without AVX. * Always use XSAVE, when supported, regardless of whether AVX is supported or not. Submitted-by: chicken Bug-ID: #3264 show more ...
Revision tags: v5.8.3, v5.8.2, v5.8.1
# eca1e48f	28-Mar-2020	Sascha Wildner <saw@online.de>	kernel: Remove <sys/mplock2.h> from all files that do not need it.
Revision tags: v5.8.0
# c2830aa6	27-Feb-2020	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Continue pmap work * Conditionalize this work on PMAP_ADVANCED, default enabled. * Remove md_page.pmap_count and md_page.writeable_count, no longer track these counts which cause tons of kernel - Continue pmap work * Conditionalize this work on PMAP_ADVANCED, default enabled. * Remove md_page.pmap_count and md_page.writeable_count, no longer track these counts which cause tons of cache line interactions. However, there are still a few stubborn hold-overs. * The vm_page still needs to be soft-busied in the page fault path * For now we need to have a md_page.interlock_count to flag pages being replaced by pmap_enter() (e.g. COW faults) in order to be able to safely dispose of the page without busying it. This need will eventually go away, hopefully just leaving us with the soft-busy-count issue. show more ...
Revision tags: v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3
# 00780082	03-May-2019	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Permanently fix FP bug - completely remove lazy heuristic * Remove the FP lazy heuristic. When the FP unit is being used by a thread, it will now always be actively saved and restored kernel - Permanently fix FP bug - completely remove lazy heuristic * Remove the FP lazy heuristic. When the FP unit is being used by a thread, it will now always be actively saved and restored on context switch. This means that if a process uses the FP unit at all, its context switches (to another thread) will active save/restore the state forever more. * This fixes a known hardware bug on Intel CPUs that we thought was fixed before (by not saving The FP context from thread A from the DNA interrupt on thread B)... but it turns out it wasn't. We could tickle the bug on Intel CPUs by forcing synth to regenerate its flavor index over and over again. This regeneration fork/exec's about 60,000 make's, sequencing concurrently on all cores, and usually hits the bug in less than 5 minutes. * We no longer support lazy FP restores, period. This is like the fourth time I've tried to deal with this, so now its time to give up and not use lazy restoration at all, ever again. show more ...
Revision tags: v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1, v5.2.2
# e5aace14	11-Jun-2018	Matthew Dillon <dillon@apollo.backplane.com>	Kernel - Additional cpu bug hardening part 1/2 * OpenBSD recently made a commit that scraps the use of delayed FP state saving due to a rumor that the content of FP registers owned by another pr Kernel - Additional cpu bug hardening part 1/2 * OpenBSD recently made a commit that scraps the use of delayed FP state saving due to a rumor that the content of FP registers owned by another process can be speculatively detected when they are present for the current process, even when the TS bit is used to force a DNA trap. This rumor has been circulating for a while. OpenBSD felt that the lack of responsiveness from Intel forced their hand. Since they've gone ahead and pushed a fix for this potential problem, we are going to as well. * DragonFlyBSD already synchronously saves FP state on switch-out. However, it only cleans the state up afterwords by calling fninit and this isn't enough to actually erase the content in the %xmm registers. We want to continue to use delayed FP state restores because it saves a considerable amount of switching time when we do not have to do a FP restore. Most programs touch the FP registers at startup due to rtld linking, and more and more programs use the %xmm registers as general purpose registers. OpenBSD's solution of always proactively saving and restoring FP state is a reasonable one. DragonFlyBSD is going to take a slightly different tact in order to try to retain more optimal switching behavior when the FP unit is not in continuous use. * Our first fix is to issue a FP restore on dummy state after our FP save to guarantee that all FP registers are zerod after FP state is saved. This allows us to continue to support delayed FP state loads with zero chance of leakage between processes. * Our second fix is to add a heuristic which allows the kernel to detect when the FP unit is being used seriously (verses just during program startup). We have added a sysctl machdep.npx_fpu_heuristic heuristic for this purpose which defaults to the value 32. Values can be: 0 - Proactive FPU state loading disabled (old behavior retained). Note that the first fix remains active, the FP register state is still cleared after saving so no leakage can occur. All processes will take a DNA trap after a thread switch when they access the FP state. 1 - Proactive FPU state loading is enabled at all times for a thread after the first FP access made by that thread. Upon returning from a thread switch, the FPU state will already be ready to go and a DNA trap will not occur. N - Proactive FPU state loading enabled for N context switches, then disabled. The next DNA fault after disablement then re-enables proactive loading for the next N context switches. Default value is 32. This ensures that program startup can use the FP unit but integer-centric programs which don't afterwords will not incur the FP switching overhead (for either switch-away or switch-back). show more ...
Revision tags: v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc, v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2
# 846ab393	30-Jun-2014	Markus Pfeiffer <markus.pfeiffer@morphism.de>	Set CR4.OSFXSR before probing the mxcsr mask This fixes Bug #2691 Reported-By: Antonio Huete Jiménez <tuxillo@quantumachine.net>
Revision tags: v3.8.1, v3.6.3
# 52c23ea0	12-Jun-2014	Sascha Wildner <saw@online.de>	kernel: GC never true CPU_DISABLE_SSE checks from x86_64/vkernel64. It is only an option in i386. No functional changes. Reported-by: profmakx
# 98d2b258	12-Jun-2014	Markus Pfeiffer <markus.pfeiffer@morphism.de>	vkernel64: fix compilation after npx mask work
# 186c803f	11-Jun-2014	Markus Pfeiffer <markus.pfeiffer@morphism.de>	kernel/npx: Add detection code for default MXCSR mask As per Intel/AMD manuals the default MXCSR mask can be probed by executing fxstor (if supported) and reading the mask from the stored state. Thi kernel/npx: Add detection code for default MXCSR mask As per Intel/AMD manuals the default MXCSR mask can be probed by executing fxstor (if supported) and reading the mask from the stored state. This patch adds detection of the mask when it is supported. Otherwise a default mask of 0xFFBF is used as before. show more ...
Revision tags: v3.8.0
# b7dc54f2	29-May-2014	Markus Pfeiffer <markus.pfeiffer@morphism.de>	kernel/npx: add process name to error message
Revision tags: v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0, v3.4.3, v3.4.2
# c68c010a	18-May-2013	Markus Pfeiffer <markus.pfeiffer@morphism.de>	fix MXCSR default value XEN fails to initialise its vcpus to behave like actual cpus. One instance of this is that the MXCSR is not setup to the default value documented in as documented in AMD64 Ar fix MXCSR default value XEN fails to initialise its vcpus to behave like actual cpus. One instance of this is that the MXCSR is not setup to the default value documented in as documented in AMD64 Architecture Programmer's Manual Volume 1: Application Programming, Section Section 4.3.2 show more ...
Revision tags: v3.4.0, v3.4.1, v3.4.0rc, v3.5.0
# e6e019a8	13-Jan-2013	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Fix signal FP save/restore issues when AVX is enabled * The kernel was not saving/restoring the full FP context when entering into or returning from a signal, leading to corrupt FP regist kernel - Fix signal FP save/restore issues when AVX is enabled * The kernel was not saving/restoring the full FP context when entering into or returning from a signal, leading to corrupt FP registers even when AVX is not used, when AVX is enabled in the kernel. ANY SIGNAL COULD CORRUPT THE FP STATE. * Fixed by adjusting the on-user-stack fpsave area sizes and operation. * This unfortunately changes a number of user visible structures. ucontext_t, mcontext_t, sigcontext, sigframe. It is POSSIBLE that most userland use cases will be unaffected, but I'm not holding my breath. Major-Sleuthing-by: ftigeot Testing-by: ftigeot, dillon show more ...
# 70c57cb3	03-Jan-2013	Sascha Wildner <saw@online.de>	kernel: The NPX_DEBUG kernel option is pc32 specific, too.
# 5cf56a8d	29-Dec-2012	Alex Hornung <alex@alexhornung.com>	x86_64 - support for AVX instructions * CPU will be checked for XSAVE and AVX support on boot. If both are found, they will be enabled. * If enabled, the kernel will use the XSAVE and XRSTOR i x86_64 - support for AVX instructions * CPU will be checked for XSAVE and AVX support on boot. If both are found, they will be enabled. * If enabled, the kernel will use the XSAVE and XRSTOR instructions to save and restore FPU, SSE and AVX registers. Originally-Submitted-by: Adam Sakareassen (with modifications) show more ...
Revision tags: v3.2.2
# 1918fc5c	24-Oct-2012	Sascha Wildner <saw@online.de>	kernel: Make SMP support default (and non-optional). The 'SMP' kernel option gets removed with this commit, so it has to be removed from everybody's configs. Reviewed-by: sjg Approved-by: many
Revision tags: v3.2.1, v3.2.0, v3.3.0, v3.0.3
# a8f1df17	12-Jul-2012	John Marino <netbsd@marino.st>	x86_64 FPU: Set 64-bit precision for fadd/fsub/fsqrt etc. On AMD64, GCC and the ABI expects the x87 unit to be running in 80/64 mode rather than 64/53 mode seen on i386. This corrects errors seen i x86_64 FPU: Set 64-bit precision for fadd/fsub/fsqrt etc. On AMD64, GCC and the ABI expects the x87 unit to be running in 80/64 mode rather than 64/53 mode seen on i386. This corrects errors seen in long double tests involving runtime calculations. Previously, the results of these runtime calculations would get rounded due to use of 53-bit mantissas. show more ...
12