#
6651c3e5 |
| 17-Mar-2024 |
guenther <guenther@openbsd.org> |
Use VERW to mitigate the RFDS (Register File Data Sampling) vulnerability present in Intel Atom CPUs, reordering some ASM in return-to-userspace and start/resume-vmx-guest to reduce the number of kernel values still live in registers when VERW is used. This mitigation requires updated firmware; with it, affected CPUs report RFDS_CLEAR in dmesg.
Firmware packaging by jsg@ and sthen@. Logic for interpreting Intel's flags by jsg@ after lots of discussion between him, deraadt@, and me. ok deraadt@
|
#
073d4874 |
| 25-Feb-2024 |
guenther <guenther@openbsd.org> |
We don't do compat32 so MSR_CSTAR shouldn't be set up: delete the Xsyscall32 stub and UCODE32 selector, set MSR_CSTAR to zero at CPU startup, and rezero on ACPI resume and VM exit.
requested a while ago by deraadt@; AMD VM testing by chris@; testing and ok krw@
|
#
6cbac32f |
| 12-Feb-2024 |
guenther <guenther@openbsd.org> |
Retpolines are an anti-pattern for IBT, so we need to shift protecting userspace from cross-process BTI to the kernel. Have each CPU track the last pmap run on in userspace and the last vmm VCPU in guest-mode and use the IBPB MSR to flush predictors right before running in userspace on a different pmap or entering guest-mode on a different VCPU. Codepatch-nop the userspace bits and conditionalize the vmm bits to keep working if IBPB isn't supported.
ok deraadt@ kettenis@
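The "only flush when the address space actually changed" idea in this commit can be modelled in a few lines of plain C. This is an illustrative sketch, not the kernel's actual code: `cpu_state`, `maybe_ibpb`, and the counter standing in for the MSR write are all made-up names.

```c
#include <stdbool.h>
#include <stddef.h>

struct pmap { int id; };                   /* stand-in for the real pmap */

struct cpu_state {
	const struct pmap *last_user_pmap; /* last pmap run in userspace */
	unsigned ibpb_count;               /* how many barriers we issued */
};

/* Called just before returning to userspace on 'pm'.  Issues the
 * barrier (here just a counter bump, standing in for writing
 * PRED_CMD_IBPB to the IBPB MSR) only on an address-space change. */
static bool
maybe_ibpb(struct cpu_state *ci, const struct pmap *pm)
{
	if (ci->last_user_pmap == pm)
		return false;              /* same pmap: skip the flush */
	ci->ibpb_count++;
	ci->last_user_pmap = pm;
	return true;
}
```

The point of the design is on the fast path: switching between threads of one process never pays for a predictor flush.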
|
#
cafeb892 |
| 12-Dec-2023 |
deraadt <deraadt@openbsd.org> |
remove support for syscall(2) -- the "indirection system call" -- because it is a dangerous alternative entry point for all system calls, and thus incompatible with the precision system call entry point scheme we are heading towards.
This has been a 3-year mission: First perl needed a code-generated wrapper to fake syscall(2) as a giant switch table, then all the ports were cleaned with relatively minor fixes, except for "go". "go" required two fixes -- 1) a framework issue with old library versions, and 2) like perl, a fake syscall(2) wrapper to handle ioctl(2) and sysctl(2), because "syscall(SYS_ioctl" occurs all over the place in the "go" ecosystem because the "go developers" are plan9-loving unix-hating folk who tried to build an ecosystem without allowing "ioctl".
ok kettenis, jsing, afresh1, sthen
|
#
d8417bd7 |
| 12-Dec-2023 |
deraadt <deraadt@openbsd.org> |
The sigtramp was calling sigreturn(2), and upon failure exit(2), which doesn't make sense anymore. It is better to just issue an illegal instruction. ok kettenis, with some misgivings about inconsistent approaches between architectures. In the future we could change sigreturn(2) to never return an exit code, but always just terminate the process. We stopped this system call from being callable ages ago with msyscall(2), and there is no stub for it in libc... maybe that's the next step to take?
|
#
20cef513 |
| 10-Dec-2023 |
deraadt <deraadt@openbsd.org> |
Add a new label "sigcodecall" inside every sigtramp definition, directly in front of the syscall instruction. This is used to calculate the start of the syscall for SYS_sigreturn and pinned system calls. ok kettenis
|
#
bb00e811 |
| 24-Oct-2023 |
claudio <claudio@openbsd.org> |
Normally context switches happen in mi_switch() but there are 3 cases where a switch happens outside it. Clean up these code paths and make them machine independent.
- when a process forks (fork, tfork, kthread), the new proc needs to somehow be scheduled for the first time. This is done by proc_trampoline. Since proc_trampoline is machine-dependent assembler code, change the MP-specific proc_trampoline_mp() to proc_trampoline_mi() and make sure it is now always called.
- cpu_hatch: when booting APs the code needs to jump to the first proc running on that CPU. This should be the idle thread for that CPU.
- sched_exit: when a proc exits it needs to switch away from itself and then instruct the reaper to clean up the rest. This is done by switching to the idle loop.
Since the last two cases require a context switch to the idle proc factor out the common code to sched_toidle() and use it in those places.
Tested by many on all archs. OK miod@ mpi@ cheloha@
|
#
1538f8cb |
| 31-Jul-2023 |
guenther <guenther@openbsd.org> |
On CPUs with eIBRS ("enhanced Indirect Branch Restricted Speculation") or IBT enabled in the kernel, the hardware should block the attacks which retpolines were created to prevent. In those cases, retpolines should be a net negative for security as they are an indirect branch gadget. They're also slower.
* use -mretpoline-external-thunk to give us control of the code used for indirect branches
* default to using a retpoline as before, but mark it and the other ASM kernel retpolines for code patching
* if the CPU has eIBRS, then enable it
* if the CPU has eIBRS *or* IBT, then codepatch the three different retpolines to just indirect jumps
make clean && make config required after this
ok kettenis@
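The decision table in the bullet list above is small enough to model directly. A sketch under assumed names (`cpu_feats`, `pick_mitigation`, and the flag fields are illustrative; the kernel reads the real bits from CPUID/MSRs):

```c
#include <stdbool.h>

struct cpu_feats { bool eibrs; bool ibt; };

struct mitigation {
	bool enable_eibrs;   /* set the IBRS bit once at boot */
	bool patch_to_jmp;   /* codepatch retpolines into plain indirect jmps */
};

static struct mitigation
pick_mitigation(struct cpu_feats f)
{
	struct mitigation m = { false, false };

	if (f.eibrs)
		m.enable_eibrs = true;
	/* with either feature the hardware covers the attack, so the
	 * retpoline is pure cost (and an indirect branch gadget) */
	if (f.eibrs || f.ibt)
		m.patch_to_jmp = true;
	return m;
}
```

Note the asymmetry: IBT alone is enough to drop the retpolines, but only eIBRS gets explicitly enabled.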
|
#
40ce500b |
| 28-Jul-2023 |
guenther <guenther@openbsd.org> |
Add CODEPATCH_CODE() macro to simplify defining a symbol for a chunk of code to use in codepatching. Use that for all the existing codepatching snippets.
Similarly, add CODEPATCH_CODE_LEN() which is CODEPATCH_CODE() but also provides a short variable holding the length of the codepatch snippet. Use that for some snippets that will be used for retpoline replacement.
ok kettenis@ deraadt@
|
#
d8c6becd |
| 27-Jul-2023 |
guenther <guenther@openbsd.org> |
Follow the lead of mips64 and make cpu_idle_cycle() just call the indirect pointer itself, and provide an initializer for it that goes to the default "just enable interrupts and halt" path.
ok kettenis@
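The pattern here is a function pointer that is never NULL because it starts out pointing at the default implementation, so callers need no test. A minimal userland sketch with made-up names (the real amd64 symbols differ, and the default body is of course `sti; hlt`, not a counter):

```c
static int idle_calls;

/* stands in for "enable interrupts and halt" */
static void
default_idle(void)
{
	idle_calls++;
}

/* platform code may overwrite this at boot; it is never NULL */
static void (*cpu_idle_cycle_fcn)(void) = default_idle;

static void
cpu_idle_cycle(void)
{
	(*cpu_idle_cycle_fcn)();   /* no NULL check needed anywhere */
}
```

The initializer is the whole point: it removes a conditional from a path that runs every time a CPU goes idle.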
|
#
0c7aa8fc |
| 25-Jul-2023 |
guenther <guenther@openbsd.org> |
cpu_idle_{enter,leave} are no-ops on amd64 now, so just #define away the calls
ok deraadt@ mpi@ miod@
|
#
55fdb5fa |
| 10-Jul-2023 |
guenther <guenther@openbsd.org> |
Enable Indirect Branch Tracking for amd64 userland, using XSAVES/XRSTORS to save/restore the state and enabling it at exec-time (and for signal handling) if the PS_NOBTCFI flag isn't set.
Note: this changes the format of the sc_fpstate data in the signal context to possibly be in compressed format: starting now we guarantee only that the state is in a format understood by the XRSTOR instruction of the system it is executing on.
At this time, passing sigreturn a corrupt sc_fpstate now results in the process exiting with no attempt to fix it up or send a T_PROTFLT trap. That may change.
prodding by deraadt@; issues with my original signal handling design identified by kettenis@
lots of base and ports preparation for this by deraadt@ and the libressl and ports teams
ok deraadt@ kettenis@
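The "possibly compressed" caveat above is detectable in the save area itself: per the Intel SDM, XSAVES writes the compacted format and flags it via bit 63 of the XCOMP_BV word in the XSAVE header, which starts at byte 512 of the area. A sketch of checking a raw sc_fpstate-style buffer (the function and macro names are illustrative):

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define XSAVE_HDR_OFF	512			/* XSAVE header offset */
#define XCOMP_BV_OFF	(XSAVE_HDR_OFF + 8)	/* XCOMP_BV within header */
#define XCOMP_COMPACT	(1ULL << 63)		/* compacted-format flag */

static bool
xsave_is_compacted(const uint8_t *area)
{
	uint64_t xcomp_bv;

	/* memcpy avoids unaligned-access assumptions on the buffer */
	memcpy(&xcomp_bv, area + XCOMP_BV_OFF, sizeof(xcomp_bv));
	return (xcomp_bv & XCOMP_COMPACT) != 0;
}
```

A consumer that only understands the legacy standard format can use a check like this to reject (or translate) compacted state.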
|
#
bc3c2f61 |
| 05-Jul-2023 |
anton <anton@openbsd.org> |
The hypercall page populated with instructions by the hypervisor is not IBT compatible due to lack of endbr64. Replace the indirect call with a new hv_hypercall_trampoline() routine which jumps to the hypercall page without any indirection.
Allows me to boot OpenBSD using Hyper-V on Windows 11 again.
ok guenther@
|
#
339eb9d2 |
| 17-Apr-2023 |
deraadt <deraadt@openbsd.org> |
For future userland IBT, the sigcode needs to start with an endbr64. This is simpler than clearing the cet_u bits in the kernel. ok guenther, kettenis
|
#
0e2deb64 |
| 17-Apr-2023 |
deraadt <deraadt@openbsd.org> |
IDTVEC_NOALIGN() was the incorrect way to create a label in two places; use GENTRY() instead. Also add two endbr64 instructions which cannot be supplied by macros. ok guenther
|
#
e9e0c464 |
| 20-Jan-2023 |
deraadt <deraadt@openbsd.org> |
On cpus with the PKU feature, prot=PROT_EXEC pages now create ptes which contain PG_XO, which is PKU key 1. On every exit from kernel to userland, force the PKU register to inhibit data reads against key-1 memory. On (some) traps into the kernel, if the PKU register has been changed, abort the process (processes have no reason to change the PKU register). This provides us with viable xonly functionality on most modern Intel & AMD cpus.
I started with an xsave-based diff from dv@, but discovered the fpu save/restore logic wasn't a good fit and went to direct register management. Disabled on HV (vm) systems until we know they handle PKU correctly. ok kettenis, dv, guenther, etc
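The PKRU register layout makes the "inhibit data read against key 1" value easy to derive: PKRU holds two bits per protection key, AD (access-disable) at bit 2*key and WD (write-disable) at bit 2*key+1. A sketch of building such a value, with illustrative names (the actual value the kernel loads may differ, e.g. in how it treats unused keys):

```c
#include <stdint.h>

#define PKRU_AD(key)	(1U << (2 * (key)))      /* access-disable bit */
#define PKRU_WD(key)	(1U << (2 * (key) + 1))  /* write-disable bit */

/* Key 0 (ordinary memory) stays fully open; key 1 (the PG_XO
 * execute-only key) gets all data access denied.  Instruction
 * fetches are not governed by PKRU, so execution still works. */
static uint32_t
pkru_xonly(void)
{
	return PKRU_AD(1) | PKRU_WD(1);
}
```

The asymmetry between fetch and data access is exactly what makes PKU usable for execute-only mappings.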
|
#
4ce05526 |
| 01-Dec-2022 |
guenther <guenther@openbsd.org> |
_C_LABEL() is no longer useful in the "everything is ELF" world. Start eliminating it.
ok mpi@ mlarkin@ krw@
|
#
36d473f7 |
| 29-Nov-2022 |
guenther <guenther@openbsd.org> |
Move the generic variable definitions from the ASM at the top of locore.S to be in C in cpu.c, machdep.c, pmap.c, or bus_space.c for better typing/debug info. Delete REALBASEMEM, REALEXTMEM, and biosextmem as unused/ignored.
ok mpi@ krw@ mlarkin@
|
#
f3c5c958 |
| 04-Nov-2022 |
kettenis <kettenis@openbsd.org> |
EFI firmware has bugs which may mean that calling EFI runtime services will fault because it does memory accesses outside of the regions it told us to map. Try to mitigate this by installing a fault handler (using the pcb_onfault mechanism) and bail out using longjmp(9) if we encounter a page fault while executing an EFI runtime services call.
Since some firmware bugs result in us executing code that isn't mapped, make kpageflttrap() handle execution faults as well as data faults.
ok guenther@
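The pcb_onfault technique described above has a close userland analogue: arm a recovery point before calling untrusted code, and have the fault handler longjmp back to it instead of panicking. A self-contained model with made-up names (`onfault`, `simulated_fault`, `call_with_recovery`; the kernel's real mechanism hangs the pointer off the pcb and runs from the trap handler):

```c
#include <setjmp.h>
#include <stddef.h>

static jmp_buf *onfault;	/* recovery point, NULL when disarmed */

/* What the page-fault handler does when onfault is armed;
 * with it disarmed, a real kernel would panic instead. */
static void
simulated_fault(void)
{
	if (onfault != NULL)
		longjmp(*onfault, 1);
}

/* Returns 0 if fn ran to completion, -1 if it "faulted". */
static int
call_with_recovery(void (*fn)(void))
{
	jmp_buf env;
	int rv = 0;

	if (setjmp(env) == 0) {
		onfault = &env;
		fn();
	} else {
		rv = -1;	/* reached via longjmp from the handler */
	}
	onfault = NULL;		/* always disarm on the way out */
	return rv;
}

static void ok_fn(void) { }
static void faulting_fn(void) { simulated_fault(); }
```

Disarming the pointer on both exit paths matters: a stale recovery pointer would send some future unrelated fault into a dead stack frame.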
|
#
0403d5bc |
| 07-Aug-2022 |
guenther <guenther@openbsd.org> |
Start to add annotations to the cpu_info members, doing I/a/o for immutable/atomic/owned ala <sys/proc.h>. Move CPUF_USERSEGS and CPUF_USERXSTATE, which really are private to the CPU, into a new ci_pflags and rename s/CPUF_/CPUPF_/. Make all (remaining) ci_flags alterations via atomic_{set,clear}bits_int(), so its annotation isn't a lie. Delete ci_info member as unused all the way from rev 1.1
ok jsg@ mlarkin@
|
#
4039a24b |
| 31-Dec-2021 |
jsg <jsg@openbsd.org> |
specifed -> specified
|
#
3dd0809f |
| 04-Sep-2021 |
bluhm <bluhm@openbsd.org> |
To mitigate against Spectre attacks, AMD processors without the IBRS feature need an lfence instruction after every near ret. Place them after all functions in the kernel which are implemented in assembler.
Change the retguard macro so that the end of the lfence instruction is now 16-byte aligned. This prevents the ret instruction from landing at the end of a 32-byte boundary, which would cause a performance impact on certain Intel processors that have a microcode update to mitigate the jump conditional code erratum.
See "Software Techniques for Managing Speculation on AMD Processors", revision 9.17.20, mitigation G-5. See "Intel Mitigations for Jump Conditional Code Erratum", revision 1.0, November 2019, section 2.4, Software Guidance and Optimization Methods.
OK deraadt@ mortimer@
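The alignment arithmetic behind that retguard change is simple to state: lfence encodes as 3 bytes (0f ae e8), and we want its *end* on a 16-byte boundary so the following 1-byte ret starts there, never ending at a 32-byte line. The real macro does this with assembler alignment directives; this C sketch (names illustrative) just computes the padding:

```c
/* lfence is 3 bytes: 0f ae e8 */
#define LFENCE_LEN	3

/* Bytes of padding to emit at 'offset' so that an lfence placed
 * after the padding ends exactly on a 16-byte boundary. */
static unsigned
pad_for_lfence(unsigned offset)
{
	unsigned end = offset + LFENCE_LEN;

	return (16 - (end % 16)) % 16;
}
```

With the lfence ending 16-aligned, the ret occupies the first byte after a 16-byte boundary, so it can never straddle or terminate a 32-byte boundary.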
|
#
24056ac0 |
| 18-Jun-2021 |
guenther <guenther@openbsd.org> |
The pmap needs to know which CPUs to send IPIs when TLB entries need to be invalidated. Instead of keeping a bitset of CPUs in each pmap, have each cpu_info track which pmap it has loaded: replace pmap->pm_cpus with cpu_info->ci_proc_pmap. This reduces the atomic operations (and cache thrashing) and simplifies cpu_switchto().
Also, fix a defect in cpu_switchto()'s "am I loading the same cr3?" test: ignore the CR3_REUSE_PCID bit when checking that. This makes switching between kernel threads slightly less costly.
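The defect fix is worth spelling out: on amd64, OpenBSD uses the top bit of the %cr3 value as the "reuse PCID, don't flush the TLB" request, so that bit is not part of the address-space identity and must be masked off before comparing. A sketch (assuming CR3_REUSE_PCID is bit 63, as the noflush bit is on amd64; `same_cr3` is an illustrative name):

```c
#include <stdbool.h>
#include <stdint.h>

#define CR3_REUSE_PCID	(1ULL << 63)	/* "don't flush TLB" request bit */

/* Are two cr3 values the same address space, ignoring the
 * flush-control bit? */
static bool
same_cr3(uint64_t new_cr3, uint64_t cur_cr3)
{
	return (new_cr3 & ~CR3_REUSE_PCID) == (cur_cr3 & ~CR3_REUSE_PCID);
}
```

Without the masking, two loads of the same page tables that differ only in the noflush bit look "different", forcing a needless %cr3 reload when switching between kernel threads.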
over a week in snaps with no complaints looks ok to mlarkin@ kettenis@ mpi@
|
#
9f1181d5 |
| 01-Jun-2021 |
guenther <guenther@openbsd.org> |
Don't clear the cpu's bit in the old pmap's pm_cpus until we're off the old one and set it in the new pmap's pm_cpus before loading %cr3 with the new value. In particular, do neither if %cr3 isn't changing.
This eliminates a window where, when switching between threads in a single process, the pmap wouldn't have this cpu's bit set even though we didn't change %cr3. With more of uvm unlocked, it was possible for another cpu to update the page tables but not see a need to send an IPI to this cpu, leading to crashes when TLB entries that should have been invalidated were used.
malloc_duel testing by abluhm@; ok abluhm@ kettenis@ mlarkin@
|
#
ae97d4fc |
| 25-May-2021 |
guenther <guenther@openbsd.org> |
clang's assembler now supports 64-suffixed versions of the fxsave/xsave/fxrstor/xrstor family of instructions. Use them directly instead of inserting the 0x48 prefix manually.
ok kettenis@ deraadt@
|