History log of /openbsd/sys/arch/amd64/amd64/mptramp.S (Results 1 – 23 of 23)
Revision Date Author Comments
# 49b6d442 12-Jul-2024 deraadt <deraadt@openbsd.org>

manual ret-clean; ok mlarkin


# 4ce05526 01-Dec-2022 guenther <guenther@openbsd.org>

_C_LABEL() is no longer useful in the "everything is ELF" world.
Start eliminating it.

ok mpi@ mlarkin@ krw@


# ae37612f 29-Nov-2022 guenther <guenther@openbsd.org>

Put the original image of the MP-startup and ACPI-suspend/hibernate
trampolines into .rodata instead of .text. While here, give types
and sizes to all the global symbols and delete some superfluous
directives and unrelocated symbols in the ACPI trampoline image.

ok mlarkin@


# fa231009 08-Oct-2019 mlarkin <mlarkin@openbsd.org>

amd64: ensure %fs is loaded after final lgdt

Mark Patruck reported this issue on a new EPYC 7402P CPU and tracked down
a DragonflyBSD diff that fixed the issue. I solicited some feedback/advice
from Matt Dillon of Dragonfly and discussions between him, guenther@ and
myself led to this fix which appears to resolve the problem.

ok guenther@


# 49578e13 24-Jan-2019 deraadt <deraadt@openbsd.org>

mptramp.S does not export gdt64, another historical locore.s splitup error


# 417d3936 23-Jan-2019 deraadt <deraadt@openbsd.org>

RELOC() is not performed here (remained from when locore was split up)


# c9de630f 05-Jun-2018 guenther <guenther@openbsd.org>

Switch from lazy FPU switching to semi-eager FPU switching: track whether
curproc's xstate ("extended state") is loaded in the CPU or not.
 - context switch, sendsig(), vmm, and doing CPU crypto in the kernel all
   check the flag and, if set, save the old thread's state to the PCB,
   clear the flag, and then load the _blank_ state
 - when returning to userspace, if the flag is clear then set it and restore
   the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no
longer happen, and IPIs are no longer necessary for flushing or
syncing FPU state; on the other hand, restoring xstate while returning
to userspace means we have to handle xrstor faulting if we could
be loading an altered state. If that happens, reset the state,
fake a #GP fault (SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by
using codepatching to switch to xsave/xrstor when present in the
CPU. In addition, code patch in use of xsaveopt in most places
when the CPU supports that. Use the 64bit-wide variants of the
instructions in all cases so that x87 instruction fault IPs are
reported correctly.

This change has three motivations:
 1) with modern clang, SSE registers are used even in rcrt0.o, making
    lazy FPU switching a smaller benefit vs trap costs
 2) the Intel SDM warns that lazy FPU switching may increase power costs
 3) post-Spectre rumors suggest that the %cr0 TS flag might not block
    speculation, permitting leaking of information about FPU state
    (AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@
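
The flag tracking described in this commit can be modelled in a few lines of C. This is only a sketch of the idea, not the kernel code: every name below (struct xstate, struct fpu_cpu, fpu_kernel_enter(), fpu_user_return()) is hypothetical, and a plain struct copy stands in for the xsave/xrstor instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the OpenBSD tree. */
struct xstate {
	unsigned char	regs[16];
};

struct fpu_thread {
	struct xstate	saved;		/* copy kept in the PCB */
};

struct fpu_cpu {
	struct xstate	hw;		/* what the FPU registers hold now */
	bool		user_loaded;	/* is the thread's xstate in hw? */
};

static const struct xstate blank_state = { { 0 } };

/* Kernel paths (context switch, sendsig(), ...): save, clear flag, go blank. */
void
fpu_kernel_enter(struct fpu_cpu *ci, struct fpu_thread *cur)
{
	assert(ci != NULL && cur != NULL);
	if (ci->user_loaded) {
		cur->saved = ci->hw;	/* models xsave to the PCB */
		ci->user_loaded = false;
		ci->hw = blank_state;
	}
}

/* Return to userspace: restore only when the state is not already loaded. */
void
fpu_user_return(struct fpu_cpu *ci, const struct fpu_thread *cur)
{
	assert(ci != NULL && cur != NULL);
	if (!ci->user_loaded) {
		ci->user_loaded = true;
		ci->hw = cur->saved;	/* models xrstor from the PCB */
	}
}
```

Because both functions test the flag first, calling either one twice in a row is harmless, which is what lets several kernel paths share the same check.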


# 34a5a4a3 22-May-2018 guenther <guenther@openbsd.org>

Define CR0_DEFAULT with our default CR0_* flags for various .S files.
Replace a hex constant with the correct CR0_* define in mptramp.S.
Clean up lots and lots of whitespace glitches.

no binary change.
ok mlarkin@


# 704fc8cd 29-Jun-2017 mlarkin <mlarkin@openbsd.org>

suppress local symbols in mptramp. Matches a similar diff in
acpi_wakecode.s that was committed previously. Also remove an extra
symbol (mp_tramp_pdirpa) that was duplicated with mp_pdirpa.

Tested MP boot, un-zzz, un-ZZZ, no issues seen.


# a1752446 19-Dec-2016 kettenis <kettenis@openbsd.org>

Generating mixed 16-bit/32-bit/64-bit code with clang's integrated
assembler is a bit tricky. It supports the .code16, .code32 and
.code64 directives. But it doesn't know about the data16/data32 and
addr16/addr32 instruction prefixes. Instead it tries to determine
those from the instruction opcode. It mostly succeeds, but there are
a couple of corner cases where clang will generate the "addr32" form
where gas generates the "addr16" form in .code16 segments. That
should be no problem (and just waste a couple of bytes), but it makes
comparing the generated code a bit difficult.

Allow the trampoline code to be compiled with both. For clang #define
away the addr32 prefix and avoid using the data32 prefix by using a
mnemonic that explicitly encodes the size of the operand. Add a few
addr32 prefixes in .code16 blocks to reduce the differences between
code generated by clang and gas.

ok patrick@, deraadt@, mlarkin@


# 3be3713e 16-May-2016 mlarkin <mlarkin@openbsd.org>

default to int3 padding if we ever introduce ENTRY/NENTRY pads here

ok deraadt@


# 984d4744 19-Apr-2015 sf <sf@openbsd.org>

Add support for x2apic mode

This is currently only enabled on hypervisors because on real hardware, it
requires interrupt remapping which we don't support yet. But on virtualization
it reduces the number of vmexits required per IPI from 4 to 1, causing a
significant speed-up for MP guests.

ok kettenis@
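
For background on why x2apic mode needs fewer register accesses per IPI: in x2APIC mode the local APIC is reached through MSRs rather than memory-mapped registers, and the interrupt command register (ICR) becomes a single 64-bit MSR. The sketch below shows only that architectural mapping. The function and macro names are made up for illustration, and it does not try to reproduce the vmexit counts quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define X2APIC_MSR_BASE	0x800u	/* first x2APIC MSR */
#define LAPIC_ICR_LO	0x300u	/* xAPIC MMIO offset of ICR bits 0-31 */

/* Map an xAPIC MMIO register offset to its x2APIC MSR number. */
uint32_t
x2apic_msr(uint32_t mmio_off)
{
	assert((mmio_off & 0xfu) == 0);	/* registers are 16-byte aligned */
	return X2APIC_MSR_BASE + (mmio_off >> 4);
}

/*
 * Build the 64-bit ICR value for a fixed-delivery IPI to one APIC ID:
 * destination in the upper 32 bits, vector in the low 8 bits.
 */
uint64_t
x2apic_icr(uint32_t dest_apicid, uint8_t vector)
{
	return ((uint64_t)dest_apicid << 32) | vector;
}
```

With this layout a sender issues one wrmsr to MSR 0x830 carrying both destination and vector, instead of separate accesses to the high and low halves of the memory-mapped ICR.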


# b2ad5bc0 08-Dec-2014 mlarkin <mlarkin@openbsd.org>

Move the data part of the mp trampoline to .rodata (initially). The kernel
moves a copy of this to the RW tramp data page during bootup.

ok deraadt@


# 031569f1 22-Nov-2014 mlarkin <mlarkin@openbsd.org>

Split the MP trampoline into two pages, one for code and one for data/stack
and then protect the code page as RX and the data/stack page as RW (NX).

ok deraadt@


# c3734e39 21-Nov-2014 mlarkin <mlarkin@openbsd.org>

remove unused #defines and labels.

ok deraadt, guenther


# 71ce0f95 05-Jan-2014 mlarkin <mlarkin@openbsd.org>

Don't use the first 64KB for anything, including tramps. Move tramps and
hibernate goo up after 64KB to avoid possible corruption by buggy BIOS SMM
code. Diff also ensures the first 64KB doesn't get handed to UVM either.

ok deraadt@, tested by many with no regressions reported


# fd94711f 13-Nov-2010 guenther <guenther@openbsd.org>

Switch from TSS-per-process to TSS-per-CPU, placing the TSS right
next to the cpu's GDT, also making the double-fault stack per-CPU,
leaving it at the top of the page of the CPU's idle process. Inline
pmap_activate() and pmap_deactivate() into the asm cpu_switchto
routine, adding a check for the new pmap already being marked as
active on the CPU. Garbage collect the hasn't-been-used-in-years
GDT update IPI.

Tested by many; ok mikeb@, kettenis@


# a2cf0d36 01-Apr-2010 kettenis <kettenis@openbsd.org>

Don't index cpu_info by apic id, but by device unit number instead. Recent
Intel CPUs come up with apic id's >= 32, even on systems with less than 32
logical CPUs.

ok krw@, marco@; tested by deraadt@
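
The problem with indexing by APIC ID can be illustrated with a small table. This is a sketch under assumptions: the limit of 32, the field names, and both helper functions are invented for the example and are not taken from the OpenBSD sources.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAXCPUS	32	/* illustrative table size, assumed */

struct sketch_cpu_info {
	int	unit;		/* device unit number: 0, 1, 2, ... */
	int	apicid;		/* hardware-assigned, may be sparse */
};

static struct sketch_cpu_info *cpu_table[SKETCH_MAXCPUS];

/* Register a CPU under its unit number; returns 0 on success, -1 if out of range. */
int
cpu_register(struct sketch_cpu_info *ci)
{
	assert(ci != NULL);
	if (ci->unit < 0 || ci->unit >= SKETCH_MAXCPUS)
		return -1;
	cpu_table[ci->unit] = ci;
	return 0;
}

/* Find a CPU by APIC ID with a linear scan instead of direct indexing. */
struct sketch_cpu_info *
cpu_by_apicid(int apicid)
{
	for (int i = 0; i < SKETCH_MAXCPUS; i++)
		if (cpu_table[i] != NULL && cpu_table[i]->apicid == apicid)
			return cpu_table[i];
	return NULL;
}
```

A two-CPU machine whose second CPU reports APIC ID 34 fits in slots 0 and 1 here, whereas a table indexed directly by APIC ID would need at least 35 entries for the same machine.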


# d874cce4 26-Jun-2008 ray <ray@openbsd.org>

First pass at removing clauses 3 and 4 from NetBSD licenses.

Not sure what's more surprising: how long it took for NetBSD to
catch up to the rest of the BSDs (including UCB), or the amount of
code that NetBSD has claimed for itself without attributing to the
actual authors.

OK deraadt@


# 45053f4a 10-Oct-2007 art <art@openbsd.org>

Make context switching much more MI:
 - Move the functionality of choosing a process from cpu_switch into
   a much simpler function: cpu_switchto. Instead of having the locore
   code walk the run queues, let the MI code choose the process we
   want to run and only implement the context switching itself in MD
   code.
 - Let MD context switching run without worrying about spls or locks.
 - Instead of having the idle loop implemented with special contexts
   in MD code, implement one idle proc for each cpu. make the idle
   loop MI with MD hooks.
 - Change the proc lists from the old style vax queues to TAILQs.
 - Change the sleep queue from vax queues to TAILQs. This makes
   wakeup() go from O(n^2) to O(n)

there will be some MD fallout, but it will be fixed shortly.
There's also a few cleanups to be done after this.

deraadt@, kettenis@ ok
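
To show why a TAILQ suits a linear-time wakeup, here is a minimal sketch using the <sys/queue.h> macros. The struct and function names are hypothetical and the real sleep queue code is more involved; the point is only that each entry is visited once and unlinked in constant time.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct sproc {
	const void		*wchan;		/* channel the process sleeps on */
	int			 runnable;
	TAILQ_ENTRY(sproc)	 link;
};

TAILQ_HEAD(sleepq, sproc);

/* Wake every sleeper on wchan in a single pass; returns the number woken. */
int
sleepq_wakeup(struct sleepq *q, const void *wchan)
{
	struct sproc *p, *next;
	int n = 0;

	assert(q != NULL);
	for (p = TAILQ_FIRST(q); p != NULL; p = next) {
		next = TAILQ_NEXT(p, link);	/* fetch before unlinking p */
		if (p->wchan == wchan) {
			TAILQ_REMOVE(q, p, link);	/* O(1) unlink */
			p->runnable = 1;
			n++;
		}
	}
	return n;
}
```

The successor is read before TAILQ_REMOVE() so the walk stays valid while entries are being taken off the queue.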


# 49e2e7f9 26-Jul-2005 art <art@openbsd.org>

Instead of juggling around with cr4 and enabling parts of it sometimes,
other parts later, etc. Just set it to the same default value everywhere.
We won't survive without PSE and it's not like someone will suddenly make
an amd64 that doesn't support PGE.

This will allow us to make the bootstrap process slightly more sane.


# 4814b8ce 19-Jul-2004 art <art@openbsd.org>

Implement __HAVE_PMAP_DIRECT on amd64 using large pages. At this moment
it's limited to 512GB (one L4 page table entry) physical memory. Only
used carefully at this moment, but more improvements are in the pipeline.

tested by many, deraadt@ ok.
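
The 512GB figure follows from the amd64 4-level page table layout: each table has 512 entries and the base page size is 4KB, so one L4 entry spans 512 * 512 * 512 * 4KB. The helper below is a hypothetical illustration of that arithmetic, numbering levels from 1 (page table entry) to 4.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES_PER_TABLE	512u			/* 9 index bits per level */
#define PAGE_4K			(UINT64_C(1) << 12)

/* Bytes mapped by one entry at a given level: 1 = L1 (PTE) ... 4 = L4. */
uint64_t
bytes_per_entry(int level)
{
	uint64_t span = PAGE_4K;

	assert(level >= 1 && level <= 4);
	for (int i = 1; i < level; i++)
		span *= ENTRIES_PER_TABLE;
	return span;
}
```

An L2 entry covers 2MB, which is the large page size, so a direct map filling one L4 entry consists of 512 * 512 large pages.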


# b5b9857b 25-Jun-2004 art <art@openbsd.org>

SMP support. Big parts from NetBSD, but with some really serious debugging
done by me, niklas and others. Especially wrt. NXE support.

Still needs some polishing, especially in dmesg messages, but we're now
building kernel faster than ever.
