94da7b6e | 18-Apr-2024 |
Zhao Liu <zhao1.liu@intel.com> |
accel/tcg/icount-common: Consolidate the use of warn_report_once()
Use warn_report_once() to get rid of the static local variable "notified".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-I
accel/tcg/icount-common: Consolidate the use of warn_report_once()
Use warn_report_once() to get rid of the static local variable "notified".
Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240418100716.1085491-1-zhao1.liu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
565f4768 | 29-Feb-2024 |
Isaku Yamahata <isaku.yamahata@intel.com> |
kvm/tdx: Ignore memory conversion to shared of unassigned region
TDX requires vMMIO region to be shared. For KVM, MMIO region is the region which kvm memslot isn't assigned to (except in-kernel emu
kvm/tdx: Ignore memory conversion to shared of unassigned region
TDX requires vMMIO region to be shared. For KVM, MMIO region is the region which kvm memslot isn't assigned to (except in-kernel emulation). qemu has the memory region for vMMIO at each device level.
While OVMF issues MapGPA(to-shared) conservatively on 32bit PCI MMIO region, qemu doesn't find corresponding vMMIO region because it's before PCI device allocation and memory_region_find() finds the device region, not PCI bus region. It's safe to ignore MapGPA(to-shared) because when guest accesses those region they use GPA with shared bit set for vMMIO. Ignore memory conversion request of non-assigned region to shared and return success. Otherwise OVMF is confused and panics there.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-35-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
c5d9425e | 29-Feb-2024 |
Isaku Yamahata <isaku.yamahata@intel.com> |
kvm/tdx: Don't complain when converting vMMIO region to shared
Because vMMIO region needs to be shared region, guest TD may explicitly convert such region from private to shared. Don't complain suc
kvm/tdx: Don't complain when converting vMMIO region to shared
Because vMMIO region needs to be shared region, guest TD may explicitly convert such region from private to shared. Don't complain such conversion.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-34-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
c15e5684 | 20-Mar-2024 |
Chao Peng <chao.p.peng@linux.intel.com> |
kvm: handle KVM_EXIT_MEMORY_FAULT
Upon an KVM_EXIT_MEMORY_FAULT exit, userspace needs to do the memory conversion on the RAMBlock to turn the memory into desired attribute, switching between private
kvm: handle KVM_EXIT_MEMORY_FAULT
Upon an KVM_EXIT_MEMORY_FAULT exit, userspace needs to do the memory conversion on the RAMBlock to turn the memory into desired attribute, switching between private and shared.
Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when KVM_EXIT_MEMORY_FAULT happens.
Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has guest_memfd memory backend.
Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is added.
When page is converted from shared to private, the original shared memory can be discarded via ram_block_discard_range(). Note, shared memory can be discarded only when it's not back'ed by hugetlb because hugetlb is supposed to be pre-allocated and no need for discarding.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240320083945.991426-13-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
bd3bcf69 | 20-Mar-2024 |
Xiaoyao Li <xiaoyao.li@intel.com> |
kvm/memory: Make memory type private by default if it has guest memfd backend
KVM side leaves the memory to shared by default, which may incur the overhead of paging conversion on the first visit of
kvm/memory: Make memory type private by default if it has guest memfd backend
KVM side leaves the memory to shared by default, which may incur the overhead of paging conversion on the first visit of each page. Because the expectation is that page is likely to private for the VMs that require private memory (has guest memfd).
Explicitly set the memory to private when memory region has valid guest memfd backend.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240320083945.991426-16-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
ce5a9832 | 20-Mar-2024 |
Chao Peng <chao.p.peng@linux.intel.com> |
kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.
With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that backend'ed both
kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.
With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that backend'ed both by hva-based shared memory and guest memfd based private memory.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240320083945.991426-10-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
15f7a80c | 20-Mar-2024 |
Xiaoyao Li <xiaoyao.li@intel.com> |
RAMBlock: Add support of KVM private guest memfd
Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock.
RAMBlock: Add support of KVM private guest memfd
Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock.
Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to create private guest_memfd during RAMBlock setup.
Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd is more flexible and extensible than simply relying on the VM type because in the future we may have the case that not all the memory of a VM need guest memfd. As a benefit, it also avoid getting MachineState in memory subsystem.
Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of confidential guests, such as TDX VM. How and when to set it for memory backends will be implemented in the following patches.
Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has KVM guest_memfd allocated.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-ID: <20240320083945.991426-7-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
0811baed | 20-Mar-2024 |
Xiaoyao Li <xiaoyao.li@intel.com> |
kvm: Introduce support for memory_attributes
Introduce the helper functions to set the attributes of a range of memory to private or shared.
This is necessary to notify KVM the private/shared attri
kvm: Introduce support for memory_attributes
Introduce the helper functions to set the attributes of a range of memory to private or shared.
This is necessary to notify KVM the private/shared attribute of each gpa range. KVM needs the information to decide the GPA needs to be mapped at hva-based shared memory or guest_memfd based private memory.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240320083945.991426-11-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
72853afc | 29-Feb-2024 |
Xiaoyao Li <xiaoyao.li@intel.com> |
trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()
The upper 16 bits of kvm_userspace_memory_region::slot are address space id. Parse it separately in trace_kvm_set_user_memor
trace/kvm: Split address space and slot id in trace_kvm_set_user_memory()
The upper 16 bits of kvm_userspace_memory_region::slot are address space id. Parse it separately in trace_kvm_set_user_memory().
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-5-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
a99c0c66 | 18-Mar-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: remove kvm_arch_cpu_check_are_resettable
Board reset requires writing a fresh CPU state. As far as KVM is concerned, the only thing that blocks reset is that CPU state is encrypted; therefore,
KVM: remove kvm_arch_cpu_check_are_resettable
Board reset requires writing a fresh CPU state. As far as KVM is concerned, the only thing that blocks reset is that CPU state is encrypted; therefore, kvm_cpus_are_resettable() can simply check if that is the case.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
5c3131c3 | 18-Mar-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: track whether guest state is encrypted
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the guest state is encrypted, in which case they do nothing. For the new API using VM typ
KVM: track whether guest state is encrypted
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the guest state is encrypted, in which case they do nothing. For the new API using VM types, instead, the ioctls will fail which is a safer and more robust approach.
The new API will be the only one available for SEV-SNP and TDX, but it is also usable for SEV and SEV-ES. In preparation for that, require architecture-specific KVM code to communicate the point at which guest state is protected (which must be after kvm_cpu_synchronize_post_init(), though that might change in the future in order to suppor migration). From that point, skip reading registers so that cpu->vcpu_dirty is never true: if it ever becomes true, kvm_arch_put_registers() will fail miserably.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
1e1e4879 | 22-Mar-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
kvm: use configs/ definition to conditionalize debug support
If an architecture adds support for KVM_CAP_SET_GUEST_DEBUG but QEMU does not have the necessary code, QEMU will fail to build after upda
kvm: use configs/ definition to conditionalize debug support
If an architecture adds support for KVM_CAP_SET_GUEST_DEBUG but QEMU does not have the necessary code, QEMU will fail to build after updating kernel headers. Avoid this by using a #define in config-target.h instead of KVM_CAP_SET_GUEST_DEBUG.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
dcd092a0 | 06-Apr-2024 |
Richard Henderson <richard.henderson@linaro.org> |
accel/tcg: Improve can_do_io management
We already attempted to set and clear can_do_io before the first and last insns, but only used the initial value of max_insns and the call to translator_io_st
accel/tcg: Improve can_do_io management
We already attempted to set and clear can_do_io before the first and last insns, but only used the initial value of max_insns and the call to translator_io_start to find those insns.
Now that we track insn_start in DisasContextBase, and now that we have emit_before_op, we can wait until we have finished translation to identify the true first and last insns and emit the sets of can_do_io at that time.
This fixes the case of a translation block which crossed a page boundary, and for which the second page turned out to be mmio. In this case we truncate the block, and the previous logic for can_do_io could leave a block with a single insn with can_do_io set to false, which would fail an assertion in cpu_io_recompile.
Reported-by: Jørgen Hansen <Jorgen.Hansen@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Jørgen Hansen <Jorgen.Hansen@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
e7face70 | 06-Apr-2024 |
Richard Henderson <richard.henderson@linaro.org> |
accel/tcg: Add insn_start to DisasContextBase
This is currently target-specific for many; begin making it target independent.
Tested-by: Jørgen Hansen <Jorgen.Hansen@wdc.com> Reviewed-by: Philippe
accel/tcg: Add insn_start to DisasContextBase
This is currently target-specific for many; begin making it target independent.
Tested-by: Jørgen Hansen <Jorgen.Hansen@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
e34f4d87 | 08-Apr-2024 |
Igor Mammedov <imammedo@redhat.com> |
kvm: error out of kvm_irqchip_add_msi_route() in case of full route table
subj is calling kvm_add_routing_entry() which simply extends KVMState::irq_routes::entries[] but doesn't check if number o
kvm: error out of kvm_irqchip_add_msi_route() in case of full route table
subj is calling kvm_add_routing_entry() which simply extends KVMState::irq_routes::entries[] but doesn't check if number of routes goes beyond limit the kernel is willing to accept. Which later leads toi the assert
qemu-kvm: ../accel/kvm/kvm-all.c:1833: kvm_irqchip_commit_routes: Assertion `ret == 0' failed
typically it happens during guest boot for large enough guest
Reproduced with: ./qemu --enable-kvm -m 8G -smp 64 -machine pc \ `for b in {1..2}; do echo -n "-device pci-bridge,id=pci$b,chassis_nr=$b "; for i in {0..31}; do touch /tmp/vblk$b$i; echo -n "-drive file=/tmp/vblk$b$i,if=none,id=drive$b$i,format=raw -device virtio-blk-pci,drive=drive$b$i,bus=pci$b "; done; done`
While crash at boot time is bad, the same might happen at hotplug time which is unacceptable. So instead calling kvm_add_routing_entry() unconditionally, check first that number of routes won't exceed KVM_CAP_IRQ_ROUTING. This way virtio device insteads killin qemu, will gracefully fail to initialize device as expected with following warnings on console: virtio-blk failed to set guest notifier (-28), ensure -accel kvm is set. virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-ID: <20240408110956.451558-1-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
93019696 | 13-Mar-2024 |
Philippe Mathieu-Daudé <philmd@linaro.org> |
accel/tcg/plugin: Remove CONFIG_SOFTMMU_GATE definition
The CONFIG_SOFTMMU_GATE definition was never used, remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas H
accel/tcg/plugin: Remove CONFIG_SOFTMMU_GATE definition
The CONFIG_SOFTMMU_GATE definition was never used, remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20240313213339.82071-2-philmd@linaro.org>
show more ...
|
dafa0ecc | 28-Mar-2024 |
Richard Henderson <richard.henderson@linaro.org> |
accel/tcg: Use CPUState.get_pc in cpu_io_recompile
Using log_pc produces the pc at the beginning of TB, not the actual pc installed by cpu_restore_state_from_tb, which could be any of the guest inst
accel/tcg: Use CPUState.get_pc in cpu_io_recompile
Using log_pc produces the pc at the beginning of TB, not the actual pc installed by cpu_restore_state_from_tb, which could be any of the guest instructions within TB.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
94956d7b | 29-Jan-2024 |
Philippe Mathieu-Daudé <philmd@linaro.org> |
bulk: Call in place single use cpu_env()
Avoid CPUArchState local variable when cpu_env() is used once.
Mechanical patch using the following Coccinelle spatch script:
@@ type CPUArchState; iden
bulk: Call in place single use cpu_env()
Avoid CPUArchState local variable when cpu_env() is used once.
Mechanical patch using the following Coccinelle spatch script:
@@ type CPUArchState; identifier env; expression cs; @@ { - CPUArchState *env = cpu_env(cs); ... when != env - env + cpu_env(cs) ... when != env }
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240129164514.73104-5-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
show more ...
|
f28b958c | 10-Nov-2023 |
Philippe Mathieu-Daudé <philmd@linaro.org> |
hw/xen: Extract 'xen_igd.h' from 'xen_pt.h'
"hw/xen/xen_pt.h" requires "hw/xen/xen_native.h" which is target specific. It also declares IGD methods, which are not target specific.
Target-agnostic c
hw/xen: Extract 'xen_igd.h' from 'xen_pt.h'
"hw/xen/xen_pt.h" requires "hw/xen/xen_native.h" which is target specific. It also declares IGD methods, which are not target specific.
Target-agnostic code can use IGD methods. To allow that, extract these methos into a new "hw/xen/xen_igd.h" header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20231114143816.71079-18-philmd@linaro.org>
show more ...
|
3077be25 | 05-Mar-2024 |
Pierrick Bouvier <pierrick.bouvier@linaro.org> |
plugins: cleanup codepath for previous inline operation
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Pierrick Bouvie
plugins: cleanup codepath for previous inline operation
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Message-Id: <20240304130036.124418-13-pierrick.bouvier@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20240305121005.3528075-26-alex.bennee@linaro.org>
show more ...
|
0bcebaba | 05-Mar-2024 |
Pierrick Bouvier <pierrick.bouvier@linaro.org> |
plugins: add inline operation per vcpu
Extends API with three new functions: qemu_plugin_register_vcpu_{tb, insn, mem}_exec_inline_per_vcpu().
Those functions takes a qemu_plugin_u64 as input.
Thi
plugins: add inline operation per vcpu
Extends API with three new functions: qemu_plugin_register_vcpu_{tb, insn, mem}_exec_inline_per_vcpu().
Those functions takes a qemu_plugin_u64 as input.
This allows to have a thread-safe and type-safe version of inline operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Message-Id: <20240304130036.124418-5-pierrick.bouvier@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20240305121005.3528075-18-alex.bennee@linaro.org>
show more ...
|
62f92b8d | 05-Mar-2024 |
Pierrick Bouvier <pierrick.bouvier@linaro.org> |
plugins: implement inline operation relative to cpu_index
Instead of working on a fixed memory location, allow to address it based on cpu_index, an element size and a given offset. Result address: p
plugins: implement inline operation relative to cpu_index
Instead of working on a fixed memory location, allow to address it based on cpu_index, an element size and a given offset. Result address: ptr + offset + cpu_index * element_size.
With this, we can target a member in a struct array from a base pointer.
Current semantic is not modified, thus inline operation still targets always the same memory location.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Message-Id: <20240304130036.124418-4-pierrick.bouvier@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20240305121005.3528075-17-alex.bennee@linaro.org>
show more ...
|
49fa457c | 01-Mar-2024 |
Richard Henderson <richard.henderson@linaro.org> |
accel/tcg: Add TLB_CHECK_ALIGNED
This creates a per-page method for checking of alignment.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderso
accel/tcg: Add TLB_CHECK_ALIGNED
This creates a per-page method for checking of alignment.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240301204110.656742-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
a0ff4a87 | 01-Mar-2024 |
Richard Henderson <richard.henderson@linaro.org> |
accel/tcg: Add tlb_fill_flags to CPUTLBEntryFull
Allow the target to set tlb flags to apply to all of the comparators. Remove MemTxAttrs.byte_swap, as the bit is not relevant to memory transactions
accel/tcg: Add tlb_fill_flags to CPUTLBEntryFull
Allow the target to set tlb flags to apply to all of the comparators. Remove MemTxAttrs.byte_swap, as the bit is not relevant to memory transactions, only the page mapping. Adjust target/sparc to set TLB_BSWAP directly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240301204110.656742-4-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
33402cea | 02-Jan-2024 |
Richard Henderson <richard.henderson@linaro.org> |
accel/tcg: Disconnect TargetPageDataNode from page size
Dynamically size the node for the runtime target page size.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard He
accel/tcg: Disconnect TargetPageDataNode from page size
Dynamically size the node for the runtime target page size.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Helge Deller <deller@gmx.de> Message-Id: <20240102015808.132373-29-richard.henderson@linaro.org>
show more ...
|