History log of /dragonfly/sys/platform/pc64/x86_64/exception.S (Results 1 – 25 of 31)
Revision Date Author Comments
Revision tags: v6.2.1, v6.2.0, v6.3.0, v6.0.1, v6.0.0, v6.0.0rc1, v6.1.0, v5.8.3, v5.8.2
# add2647e 28-Jul-2020 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Improve debugging of spurious interrupts

* Report spurious T_RESERVED interrupt vectors / trap numbers. Report
the actual trap number and try to ignore it. If this occurs, it
helps with debugging, as a cold-boot 'vmstat -i -v' can be matched
up against the spurious interrupt number (usually spurious# - 0x20)
to locate the PCI device causing the problem.

Revision tags: v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3
# 3cc72d3d 17-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

Revert "kernel - Clean up direction flag on syscall entry"

Actually not needed, the D flag is cleared via the mask
set in MSR_SF_MASK. Revert.

This reverts commit cea0e49dc0b2e5aea1b929d02f12d00df66528e2.

# cea0e49d 17-May-2019 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Clean up direction flag on syscall entry

* Make sure the direction flag is clear on syscall entry. Don't
trust userland.


Revision tags: v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1
# 9ee3b786 21-Sep-2018 Sascha Wildner <saw@online.de>

kernel: Remove some obsolete commented out code.


Revision tags: v5.2.2
# 9474cbef 11-Jun-2018 Matthew Dillon <dillon@apollo.backplane.com>

Kernel - Additional cpu bug hardening part 2/2

* Due to speculative instruction execution, the kernel may
speculatively execute instructions using data from registers that
still contain userland-controlled content.

Reduce the chance of this situation arising by proactively clearing
all user registers after saving them for syscalls, exceptions, and
interrupts. In addition, for system calls, zero out any
unrestored registers on return to avoid leaking kernel data back to
userland.

* This was discussed over the last few months in various
OS groups and I've decided to implement it. After the FP
debacle, it is prudent to also give general registers similar
protections.

Revision tags: v5.2.1
# 85b33048 01-May-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix CVE-2018-8897, debug register issue

* #DB can be delayed in a way that causes it to occur on the first
instruction of the int $3 or syscall handlers. These handlers must
be able to detect and handle the condition. This is a historical
artifact of cpu operation that has existed for a very long time on
both AMD and Intel CPUs.

* Fix by giving #DB its own trampoline stack and a way to load a
deterministic %gs and %cr3 independent of the normal CS check.
This is CVE-2018-8897.

* Also fix the NMI trampoline while I'm here.

* Also fix an old issue with debug register trace traps which can
occur when the kernel is accessing the user's address space.
This fix was lost years ago, now recovered.

Credits: Nick Peterson of Everdox Tech, LLC (original reporter)
Credits: Thanks to Microsoft for coordinating the OS vendor response

Revision tags: v5.2.0, v5.3.0, v5.2.0rc
# 26c7e964 11-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Implement spectre mitigations part 3 (stabilization)

* Fix a bug in the system call entry code. The wrong stack pointer
was being loaded for KMMUENTRY_SYSCALL and KMMUENTRY_SYSCALL was
using an offset that did not exist in certain situations.

* Load the correct stack pointer, but also change KMMUENTRY_CORE
to not use stack-relative loads and stores. Instead it uses
the trampframe directly via %gs:BLAH

Reported-by: zrj

# 9283c84b 10-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Implement spectre mitigations part 2

* NOTE: The last few commits may have said 'IBPB' but they really
meant 'IBRS'. The last few commits added IBRS support; this one
cleans that up and adds IBPB support.

* Intel says for IBRS always-on mode (mode 2), SPEC_CTRL still has
to be poked on every user->kernel entry as a barrier, even though
the value is not being changed. So make this change. This actually
improves performance somewhat on Skylake and later versus
when I just set it globally and left it that way.

* Implement IBPB detection and support on Intel. At the moment
we default to turning it off because the performance hit is pretty
massive. Currently the work on linux is only using IBPB for
VMM related operations and not for user->kernel entry.

* Enhance the machdep.spectre_mitigation sysctl to print out
what the mode matrix is whenever you change it, in human
readable terms.

0    IBRS disabled                IBPB disabled
1    IBRS mode 1 (kernel-only)    IBPB disabled
2    IBRS mode 2 (at all times)   IBPB disabled

4    IBRS disabled                IBPB enabled
5    IBRS mode 1 (kernel-only)    IBPB enabled
6    IBRS mode 2 (at all times)   IBPB enabled

Currently we default to (1) instead of (5) when we detect that
the microcode detects both features. IBPB is not turned on by default
(you can see why below).

* Haswell and Skylake performance loss matrix using the following
test. This tests a high-concurrency compile, which is approximately
a 5:1 user:kernel test with high concurrency.

The haswell box is: i3-4130 CPU @ 3.40GHz (2-core/4-thread)
The skylake box is: i5-6500 CPU @ 3.20GHz (4-core/4-thread)

This does not include MMU isolation losses, which will add another
3-4% or so in losses.

(/usr/obj on tmpfs)
time make -j 8 nativekernel NO_MODULES=TRUE

PERFORMANCE LOSS MATRIX
             HASWELL             SKYLAKE
         IBPB=0   IBPB=1     IBPB=0   IBPB=1
IBRS=0     0%      12%         0%      17%
IBRS=1   >12%<     21%       >2.4%<    15%
IBRS=2    58%      60%        23%      32%

Note that the automatic default when microcode support is detected
is IBRS=1, IBPB=0 (12% loss on Haswell and 2.4% loss on Skylake
for this test). If we add 3-4% or so for MMU isolation, a Haswell
cpu loses around 16% and a Skylake cpu loses around 6% or so in
performance.

PERFORMANCE LOSS MATRIX
(including 3% MMU isolation losses)
             HASWELL             SKYLAKE
         IBPB=0   IBPB=1     IBPB=0   IBPB=1
IBRS=0     3%      15%         3%      20%
IBRS=1   >15%<     24%       >5.4%<    18%
IBRS=2    61%      63%        26%      35%

# 8ed06571 10-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Implement spectre mitigations part 1

* Implement machdep.spectre_mitigation. This can be set as a tunable
or sysctl'd later. The tunable is only applicable if the BIOS has
the appropriate microcode, otherwise you have to update the microcode
first and then use sysctl to set the mode.

This works similarly to Linux's IBRS support.

mode 0 - Spectre IBPB MSRs disabled

mode 1 - Sets IBPB MSR on USER->KERN transition and clear it
on KERN->USER.

mode 2 - Leave IBPB set globally. Do not toggle on USER->KERN or
KERN->USER transitions.

* Retest spectre microcode MSRs on microcode update.

* Spectre mode 1 is enabled by default if the microcode supports it.
(we might change this to disabled by default, I'm still mulling it
over).

* General performance effects (not counting the MMU separation mode,
which is machdep.meltdown_mitigation and adds another 3% in overhead):

Skylake loses around 5% for mode 1 and 12% for mode 2, versus mode 0.
Haswell loses around 12% for mode 1 and 53% for mode 2, versus mode 0.

Add another 3% if MMU separation is also turned on (aka
machdep.meltdown_mitigation).

* General system call overhead effects on Skylake:

machdep.meltdown_mitigation=0, machdep.spectre_mitigation=0 103ns
machdep.meltdown_mitigation=1, machdep.spectre_mitigation=0 360ns
machdep.meltdown_mitigation=1, machdep.spectre_mitigation=1 848ns
machdep.meltdown_mitigation=1, machdep.spectre_mitigation=2 404ns

Note that mode 1 has better overall performance for mixed user+kernel
workloads despite having a much higher system call overhead, whereas
mode 2 has lower system call overhead but generally lower overall
performance because IBPB is enabled in usermode.

# 6a65d560 06-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Intel user/kernel separation MMU bug fix part 5

* Fix iretq fault handling. As I thought, I messed it up with
the trampoline patches. Fixing it involves issuing the correct
KMMU* macros to ensure that the code is on the correct stack
and has the correct mmu context.

Revalidate with a test program that uses a signal handler to
change the stack segment descriptor to something it shouldn't
be.

* Get rid of the "kernel trap 9..." console message for the iretq
fault case.

# 9e24b495 05-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Intel user/kernel separation MMU bug fix part 3/3

* Implement the isolated pmap template, iso_pmap. The pmap code will
generate a dummy iso_pmap containing only the kernel mappings required
for userland to be able to transition into the kernel and vice versa.

The mappings needed are:

(1) The per-cpu trampoline area for our stack (rsp0)
(2) The global descriptor table (gdt) for all cpus
(3) The interrupt descriptor table (idt) for all cpus
(4) The TSS block for all cpus (we store this in the trampoline page)
(5) Kernel code addresses for the interrupt vector entry and exit

* In this implementation the 'kernel code' addresses are currently just
btext to etext. That is, the kernel's primary text area. Kernel
data and bss are not part of the isolation map.

TODO - just put the vector entry and exit points in the map, and
not the entire kernel.

* System call performance is reduced when isolation is turned on.
100ns -> 350ns or so. However, typical workloads should not lose
more than 5% performance or so. System-call heavy and interrupt-heavy
workloads (network, database, high-speed storage, etc) can lose a lot
more performance.

We leave the trampoline code in place whether isolation is turned on
or not. The trampoline overhead, without isolation, is only 5ns or so.

* Fix a missing exec-related trampoline initialization.

* Clean-up kernel page table PTEs a bit. PG_M is ignored on non-terminal
PTEs, so don't set it. Also don't set PG_U in non-terminal kernel
page table pages (PG_U is never set on terminal PTEs so this wasn't
a problem, but we should be correct).

* Fix a bug in fast_syscall's trampoline stack. The wrong stack
pointer was being loaded.

* Move mdglobaldata->gd_common_tss to privatespace->common_tss.
Place common_tss in the same page as the trampoline to reduce
exposure to globaldata from the isolated MMU context.

* 16-byte align struct trampframe for convenience.

* Fix a bug in POP_FRAME. Always cli in order to avoid getting
an interrupt just at the iretq instruction, which might be
misinterpreted.

# fc921477 04-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Intel user/kernel separation MMU bug fix part 2/3

* Cleanup pass. Throw in some documentation.

* Move the gd_pcb_* fields into the trampoline page to allow
kernel memory to be further restricted in part 3.

# 4611d87f 03-Jan-2018 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Intel user/kernel separation MMU bug fix part 1/3

* Part 1/3 of the fix for the Intel user/kernel separation MMU bug.
It appears that it is possible to discern the contents of kernel
memory with careful timing measurements of instructions due to
speculative memory reads and speculative instruction execution
by Intel cpus. This can happen because Intel will allow both to
occur even when the memory access is later disallowed due to
privilege separation in the PTE.

Even though the execution is always aborted, the speculative
reads and speculative execution results in timing artifacts which
can be measured. A speculative compare/branch can lead to timing
artifacts that allow the actual contents of kernel memory to be
discerned.

While there are multiple speculative attacks possible, the Intel
bug is particularly bad because it allows a user program to more
or less effortlessly access kernel memory (and if a DMAP is
present, all of physical memory).

* Part 1 implements all the logic required to load an 'isolated'
version of the user process's PML4e into %cr3 on all user
transitions, and to load the 'normal' U+K version into %cr3 on
all transitions from user to kernel.

* Part 1 fully allocates, copies, and implements the %cr3 loads for
the 'isolated' version of the user process PML4e.

* Part 1 does not yet actually adjust the contents of this isolated
version to replace the kernel map with just a trampoline map in
kernel space. It does remove the DMAP as a test, though. The
full separation will be done in part 3.

# e1caeca9 19-Dec-2017 zrj <rimvydas.jasinskas@gmail.com>

i386 removal, part 63/x: Remove some leftovers in segments.h

Last users were removed in
8c2a9b77413a2154aa54084381f6703dd5446fa4


Revision tags: v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1
# 87ef2da6 23-Jul-2017 zrj <rimvydas.jasinskas@gmail.com>

sys: Some whitespace cleanup.

While there, fix indentation and few typos a bit.
No functional change.


Revision tags: v4.8.0, v4.6.2, v4.9.0, v4.8.0rc
# d6e8ab2d 18-Oct-2016 Sascha Wildner <saw@online.de>

kernel: Remove the COMPAT_43 kernel option along with all related code.

It has been commented out in our default kernel config files for almost five
years now, since 9466f37df5258f3bc3d99ae43627a71c1c085e7d.

Approved-by: dillon
Dragonfly-bug: <https://bugs.dragonflybsd.org/issues/2946>

Revision tags: v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2
# c66c7e2f 25-Jan-2016 zrj <rimvydas.jasinskas@gmail.com>

Correct BSD License clause numbering from 1-2-4 to 1-2-3.


Revision tags: v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3
# 32d3bd25 06-Jan-2015 Sascha Wildner <saw@online.de>

kernel/pc64: Change all the remaining #if JG's to #if 0 (fixing -Wundef).


Revision tags: v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2, v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0, v3.4.3, v3.4.2, v3.4.0, v3.4.1, v3.4.0rc, v3.5.0, v3.2.2, v3.2.1, v3.2.0, v3.3.0, v3.0.3, v3.0.2, v3.0.1, v3.1.0, v3.0.0
# 3338cc67 12-Dec-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Misc fixes and debugging

* Add required CLDs in the exception paths. The interrupt paths already
CLD in PUSH_FRAME.

* Note that the fast_syscall (SYSENTER) path has an implied CLD due to the
hardware mask applied to rflags.

* Add the IOPL bits to the set of bits set to 0 during a fast_syscall.

* When creating a dummy interrupt frame we don't have to push the
actual %cs. Just push $0 so the frame isn't misinterpreted as coming
from userland.

* Additional debug verbosity for freeze_on_seg_fault.

* Reserve two void * fields for LWP debugging (for a later commit)

# 86d7f5d3 26-Nov-2011 John Marino <draco@marino.st>

Initial import of binutils 2.22 on the new vendor branch

Future versions of binutils will also reside on this branch rather
than continuing to create new binutils branches for each new version.


# f2081646 12-Nov-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Add syscall quick return path for x86-64

* Flag the case where a sysretq can be performed to quickly return
from a system call instead of having to execute the slower doreti
code.

* This about halves syscall times for simple system calls such as
getuid(), and reduces longer syscalls by ~80ns or so on a fast
3.4GHz SandyBridge, but does not seem to really affect performance
a whole lot.

Taken-From: FreeBSD (loosely)

Revision tags: v2.12.0, v2.13.0, v2.10.1, v2.11.0, v2.10.0, v2.9.1, v2.8.2, v2.8.1, v2.8.0, v2.9.0, v2.6.3, v2.7.3, v2.6.2, v2.7.2, v2.7.1, v2.6.1, v2.7.0, v2.6.0
# b2b3ffcd 04-Nov-2009 Simon Schubert <corecode@dragonflybsd.org>

rename amd64 architecture to x86_64

The rest of the world seems to call amd64 x86_64. Bite the bullet and
rename all of the architecture files and references. This will
hopefully make pkgsrc builds less painful.

Discussed-with: dillon@

# c1543a89 04-Nov-2009 Simon Schubert <corecode@dragonflybsd.org>

rename amd64 architecture to x86_64

The rest of the world seems to call amd64 x86_64. Bite the bullet and
rename all of the architecture files and references. This will
hopefully make pkgsrc builds less painful.

Discussed-with: dillon@

# cc9b6223 24-Mar-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Revamp LWKT thread migration

* Rearrange the handling of TDF_RUNNING, making lwkt_switch() responsible
for it instead of the assembly switch code. Adjust td->td_switch() to
return the previously running thread.

This allows lwkt_switch() to process thread migration between cpus after
the thread has been completely and utterly switched out, removing the
need to loop on TDF_RUNNING on the target cpu.

* Fixes lwkt_setcpu_remote livelock failure

* This required major surgery on the core thread switch assembly; testing
is needed. I tried to avoid doing this, but the livelock problems persisted,
so the only solution was to remove the need for the loops that were causing
the livelocks.

* NOTE: The user process scheduler is still using the old giveaway/acquire
method. More work is needed here.

Reported-by: "Magliano Andre'" <masterblaster@tiscali.it>

# bd52bedf 31-Jan-2011 Matthew Dillon <dillon@apollo.backplane.com>

kernel64 - Fix disabled interrupts during dbg/bpt trap

* Interrupts were left improperly disabled during a dbg or bpt trap.
i386 enables interrupts for these traps. x86-64 needs to as well
or it will hit an assertion in lwkt_switch() under certain circumstances.

* Make debug code in lwkt_switch() also require INVARIANTS to function.

NOTE: This is temporary debug code and should be removed at some point
after 48-core testing is complete.
