565f4768 | 29-Feb-2024 |
Isaku Yamahata <isaku.yamahata@intel.com> |
kvm/tdx: Ignore memory conversion to shared of unassigned region
TDX requires vMMIO region to be shared. For KVM, MMIO region is the region which kvm memslot isn't assigned to (except in-kernel emu
kvm/tdx: Ignore memory conversion to shared of unassigned region
TDX requires vMMIO region to be shared. For KVM, MMIO region is the region which kvm memslot isn't assigned to (except in-kernel emulation). qemu has the memory region for vMMIO at each device level.
While OVMF issues MapGPA(to-shared) conservatively on 32bit PCI MMIO region, qemu doesn't find corresponding vMMIO region because it's before PCI device allocation and memory_region_find() finds the device region, not PCI bus region. It's safe to ignore MapGPA(to-shared) because when guest accesses those region they use GPA with shared bit set for vMMIO. Ignore memory conversion request of non-assigned region to shared and return success. Otherwise OVMF is confused and panics there.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-35-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
c5d9425e | 29-Feb-2024 |
Isaku Yamahata <isaku.yamahata@intel.com> |
kvm/tdx: Don't complain when converting vMMIO region to shared
Because vMMIO region needs to be shared region, guest TD may explicitly convert such region from private to shared. Don't complain suc
kvm/tdx: Don't complain when converting vMMIO region to shared
Because vMMIO region needs to be shared region, guest TD may explicitly convert such region from private to shared. Don't complain such conversion.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240229063726.610065-34-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
c15e5684 | 20-Mar-2024 |
Chao Peng <chao.p.peng@linux.intel.com> |
kvm: handle KVM_EXIT_MEMORY_FAULT
Upon an KVM_EXIT_MEMORY_FAULT exit, userspace needs to do the memory conversion on the RAMBlock to turn the memory into desired attribute, switching between private
kvm: handle KVM_EXIT_MEMORY_FAULT
Upon an KVM_EXIT_MEMORY_FAULT exit, userspace needs to do the memory conversion on the RAMBlock to turn the memory into desired attribute, switching between private and shared.
Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when KVM_EXIT_MEMORY_FAULT happens.
Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has guest_memfd memory backend.
Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is added.
When page is converted from shared to private, the original shared memory can be discarded via ram_block_discard_range(). Note, shared memory can be discarded only when it's not back'ed by hugetlb because hugetlb is supposed to be pre-allocated and no need for discarding.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240320083945.991426-13-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
bd3bcf69 | 20-Mar-2024 |
Xiaoyao Li <xiaoyao.li@intel.com> |
kvm/memory: Make memory type private by default if it has guest memfd backend
KVM side leaves the memory to shared by default, which may incur the overhead of paging conversion on the first visit of
kvm/memory: Make memory type private by default if it has guest memfd backend
KVM side leaves the memory to shared by default, which may incur the overhead of paging conversion on the first visit of each page. Because the expectation is that page is likely to private for the VMs that require private memory (has guest memfd).
Explicitly set the memory to private when memory region has valid guest memfd backend.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240320083945.991426-16-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
ce5a9832 | 20-Mar-2024 |
Chao Peng <chao.p.peng@linux.intel.com> |
kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.
With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that backend'ed both
kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.
With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that backend'ed both by hva-based shared memory and guest memfd based private memory.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240320083945.991426-10-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
15f7a80c | 20-Mar-2024 |
Xiaoyao Li <xiaoyao.li@intel.com> |
RAMBlock: Add support of KVM private guest memfd
Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock.
RAMBlock: Add support of KVM private guest memfd
Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock.
Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to create private guest_memfd during RAMBlock setup.
Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd is more flexible and extensible than simply relying on the VM type because in the future we may have the case that not all the memory of a VM need guest memfd. As a benefit, it also avoid getting MachineState in memory subsystem.
Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of confidential guests, such as TDX VM. How and when to set it for memory backends will be implemented in the following patches.
Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has KVM guest_memfd allocated.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-ID: <20240320083945.991426-7-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
0811baed | 20-Mar-2024 |
Xiaoyao Li <xiaoyao.li@intel.com> |
kvm: Introduce support for memory_attributes
Introduce the helper functions to set the attributes of a range of memory to private or shared.
This is necessary to notify KVM the private/shared attri
kvm: Introduce support for memory_attributes
Introduce the helper functions to set the attributes of a range of memory to private or shared.
This is necessary to notify KVM the private/shared attribute of each gpa range. KVM needs the information to decide the GPA needs to be mapped at hva-based shared memory or guest_memfd based private memory.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240320083945.991426-11-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
a99c0c66 | 18-Mar-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: remove kvm_arch_cpu_check_are_resettable
Board reset requires writing a fresh CPU state. As far as KVM is concerned, the only thing that blocks reset is that CPU state is encrypted; therefore,
KVM: remove kvm_arch_cpu_check_are_resettable
Board reset requires writing a fresh CPU state. As far as KVM is concerned, the only thing that blocks reset is that CPU state is encrypted; therefore, kvm_cpus_are_resettable() can simply check if that is the case.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
5c3131c3 | 18-Mar-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
KVM: track whether guest state is encrypted
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the guest state is encrypted, in which case they do nothing. For the new API using VM typ
KVM: track whether guest state is encrypted
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the guest state is encrypted, in which case they do nothing. For the new API using VM types, instead, the ioctls will fail which is a safer and more robust approach.
The new API will be the only one available for SEV-SNP and TDX, but it is also usable for SEV and SEV-ES. In preparation for that, require architecture-specific KVM code to communicate the point at which guest state is protected (which must be after kvm_cpu_synchronize_post_init(), though that might change in the future in order to suppor migration). From that point, skip reading registers so that cpu->vcpu_dirty is never true: if it ever becomes true, kvm_arch_put_registers() will fail miserably.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
2cb81af0 | 17-Oct-2023 |
Paolo Bonzini <pbonzini@redhat.com> |
kvm: unify listeners for PIO address space
Since we now assume that ioeventfds are present, kvm_io_listener is always registered. Merge it with kvm_coalesced_pio_listener in a single listener. Sin
kvm: unify listeners for PIO address space
Since we now assume that ioeventfds are present, kvm_io_listener is always registered. Merge it with kvm_coalesced_pio_listener in a single listener. Since PIO space does not have KVM memslots attached to it, the priority is irrelevant.
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
5d9ec1f4 | 17-Oct-2023 |
Paolo Bonzini <pbonzini@redhat.com> |
kvm: assume that many ioeventfds can be created
NR_IOBUS_DEVS was increased to 200 in Linux 2.6.34. By Linux 3.5 it had increased to 1000 and later ioeventfds were changed to not count against the
kvm: assume that many ioeventfds can be created
NR_IOBUS_DEVS was increased to 200 in Linux 2.6.34. By Linux 3.5 it had increased to 1000 and later ioeventfds were changed to not count against the limit. But the earlier limit of 200 would already be enough for kvm_check_many_ioeventfds() to be true, so remove the check.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
a788260b | 17-Oct-2023 |
Paolo Bonzini <pbonzini@redhat.com> |
kvm: require KVM_IRQFD for kernel irqchip
KVM_IRQFD was introduced in Linux 2.6.32, and since then it has always been available on architectures that support an in-kernel interrupt controller. We ca
kvm: require KVM_IRQFD for kernel irqchip
KVM_IRQFD was introduced in Linux 2.6.32, and since then it has always been available on architectures that support an in-kernel interrupt controller. We can require it unconditionally.
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|