#
c4fd4c5b |
| 10-Jul-2024 |
dv <dv@openbsd.org> |
Split vmd into mi/md parts.
Makes as much of the core of vmd mi, pushing x86-isms into separate compilation units. Adds build logic for arm64, but no emulation yet. (You can build vmd, but it won't have a vmm device to connect to.)
Some more cleanup probably needed around interrupt controller abstraction, but that can come as we implement more than the i8259.
ok mlarkin@
|
#
a246f7a0 |
| 20-Feb-2024 |
dv <dv@openbsd.org> |
Utilize separate threads for RX and TX in vmd(8)'s vionet.
This commit adds multithreading to allow both virtqueues to be processed in parallel along with additional synchronization primitives to protect device configuration state. Allowing RX and TX to operate independently reduces overall network latency for guests and helps alleviate the TX side dominating cpu time.
Tested with help from phessler@, kn@, and mlarkin@. ok mlarkin@.
|
#
4d307b04 |
| 30-Jan-2024 |
dv <dv@openbsd.org> |
Rewrite vmd(8)'s vionet to be zero-copy.
Similar to the rewrite of the virtio block device to use zero-copy semantics, this rewrites how the virtio network device works with the virtqueue ring buffers to minimize data copying. For guests that don't use the built-in DNS and mac filtering capabilities, data can now be transferred to/from the virtqueue and the tap(4) directly without temporary buffers.
A lot of the virtio semantics are cleaned up as well, including proper error states.
Tested with help by mbuhl@, friehm@, mlarkin@, and others.
"go for it," mlarkin@
|
#
08d0da61 |
| 26-Sep-2023 |
dv <dv@openbsd.org> |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
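As a rough illustration of the vm/<name>/<device><index> format described above, the sketch below builds such a prefix with snprintf(3). The function name demo_log_prefix and the bare "vm/<name>" form for messages without a device are assumptions made for this example, not vmd's actual logging code.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a "vm/<name>/<device><index>" prefix.
 * Passing dev == NULL yields just "vm/<name>" (an assumption here).
 * Returns 0 on success, -1 on error or if the prefix was truncated.
 */
int
demo_log_prefix(char *buf, size_t len, const char *vm_name,
    const char *dev, unsigned int idx)
{
	int n;

	if (dev == NULL)
		n = snprintf(buf, len, "vm/%s", vm_name);
	else
		n = snprintf(buf, len, "vm/%s/%s%u", vm_name, dev, idx);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with a vm named "openbsd", the device "vionet", and index 0, this fills the buffer with "vm/openbsd/vionet0".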
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
|
#
20e554f8 |
| 14-Sep-2023 |
dv <dv@openbsd.org> |
vmd(8)/vioblk: use zero-copy approach & vectored io.
The original version of the virtio block device dynamically allocated buffers to hold intermediate data when reading or writing to the underlying disk fd(s). Since vioblk drivers may chain multiple segments together, this leads to overly complex logic and one read(2)/write(2) call per data segment.
Additionally, the virtio block logic in vmd didn't handle segments that weren't block aligned (e.g. 512 bytes). If a guest provided unaligned segments, garbage would be read or written.
Since virtio descriptors mimic iovec structures, this changes vmd's device emulation to use that model. (This is how other hypervisors emulate virtio devices.) This allows for zero-copy semantics using iovecs, reducing memcpy and multiple read/write syscalls per io transaction.
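The idea of mapping descriptors onto iovec structures can be sketched as follows. This is a minimal illustration, not vmd's code: struct demo_desc is a simplified stand-in for a virtio descriptor, "guest memory" is a plain byte array, and a pipe stands in for the disk fd. Only readv(2) and struct iovec are real interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical, simplified descriptor: offset into guest memory plus length. */
struct demo_desc {
	uint64_t addr;
	uint32_t len;
};

/*
 * Point each iovec directly at guest memory; no intermediate buffer.
 * Returns 0 on success, -1 if a segment falls outside guest memory.
 */
int
demo_desc_to_iov(const struct demo_desc *d, int n, uint8_t *mem,
    size_t memsz, struct iovec *iov)
{
	int i;

	for (i = 0; i < n; i++) {
		if (d[i].addr > memsz || d[i].len > memsz - d[i].addr)
			return -1;
		iov[i].iov_base = mem + d[i].addr;
		iov[i].iov_len = d[i].len;
	}
	return 0;
}

/*
 * Usage: fill two non-adjacent guest segments with a single readv(2).
 * Returns 0 if both segments received the expected bytes.
 */
int
demo_vectored_read(void)
{
	uint8_t mem[64] = { 0 };
	struct demo_desc d[2] = { { 0, 4 }, { 16, 4 } };
	struct iovec iov[2];
	int fds[2], ret = -1;

	if (pipe(fds) == -1)
		return -1;
	if (write(fds[1], "abcdefgh", 8) == 8 &&
	    demo_desc_to_iov(d, 2, mem, sizeof(mem), iov) == 0 &&
	    readv(fds[0], iov, 2) == 8 &&
	    memcmp(mem, "abcd", 4) == 0 &&
	    memcmp(mem + 16, "efgh", 4) == 0)
		ret = 0;
	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

One syscall services the whole chain, and the data lands in guest memory without a memcpy through a temporary buffer.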
Testing by phessler@ and mlarkin@. OK mlarkin@.
|
#
4d22b0bd |
| 06-Sep-2023 |
dv <dv@openbsd.org> |
vmd(8): clean up struct ioinfo.
In prep for fixing some vioblk device issues, simplify the ioinfo struct by dropping members that aren't needed.
ok mlarkin@
|
#
2272e586 |
| 13-Jul-2023 |
dv <dv@openbsd.org> |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This removes a TAILQ struct member that vmd never used and that was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to use the latest logic in bgpd(8).
ok mlarkin@
|
#
3481ecdf |
| 27-Apr-2023 |
dv <dv@openbsd.org> |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
73a98491 |
| 25-Apr-2023 |
dv <dv@openbsd.org> |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
0bd10b9f |
| 23-Dec-2022 |
dv <dv@openbsd.org> |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
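To make the pfn/address distinction concrete: in the legacy virtio-pci interface the driver writes a 32-bit page frame number, and the device multiplies it by the 4096-byte page size to find the queue in guest physical memory. The sketch below shows that conversion; the names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Legacy virtio-pci queues are addressed in 4096-byte pages. */
#define DEMO_VIRTIO_PAGE_SHIFT	12

/* Convert a driver-supplied queue pfn to a guest physical address. */
uint64_t
demo_queue_pfn_to_gpa(uint32_t pfn)
{
	/* Widen before shifting so large pfns are not truncated to 32 bits. */
	return (uint64_t)pfn << DEMO_VIRTIO_PAGE_SHIFT;
}
```

Treating the 32-bit value as a page number lets the queue be placed above 4 GiB, which would be impossible if it were a linear address.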
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
ead1b146 |
| 04-May-2022 |
dv <dv@openbsd.org> |
vmctl(8)/vmd(8): convert disk sizes from MB to bytes
Continue converting other parts to storing data in bytes instead of MB. In this case, the logic for disk sizes was being scaled.
This fixes issues reported by Martin Vahlensieck where vmctl could no longer create disks larger than 7 MiB after previous commits to change storing memory sizes as bytes.
While this keeps the vm memory limit check in vmctl's size parser, it skips the limit check for disks. The error messages adjust accordingly and this removes the double error message logging.
Update comments and function types accordingly.
ok mlarkin@
|
#
39d68386 |
| 16-Jul-2021 |
dv <dv@openbsd.org> |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
6c31e103 |
| 21-Jun-2021 |
dv <dv@openbsd.org> |
vmd(8): support variable length vionet rx descriptor chains
The original implementation of the virtio network device assumed a driver would only provide a 2-descriptor chain for receiving packets. The virtio spec allows for variable length chains and drivers, in practice, construct them when they use a sufficiently large MTU.
This change lets the device use variable length chains provided by the driver, thus allowing for drivers to set an MTU up to the underlying host-side tap(4)'s limit of TUNMRU (16384).
Size limitations are now enforced on both the tx and rx sides, dropping anything violating the underlying tap(4) min and max limits.
More work is needed to increase the read(2) buffer in use by vmd to prevent packet truncation.
OK mlarkin@
|
#
6eb4c859 |
| 16-Jun-2021 |
dv <dv@openbsd.org> |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
#
68388e5f |
| 21-Apr-2021 |
dv <dv@openbsd.org> |
Fix packet size checks and remove bad casts.
Because dhcpsz was an uninitialized ssize_t, it was possible that a garbage "packet" would be queued on the receiving end of the virtio network device.
Change the type to size_t and add proper checks based on it being greater than zero. Remove the cast of ssize_t to uint64_t that also caused garbage sizes when dhcpsz was uninitialized and set at runtime to something < 0.
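The casting pitfall described above can be shown in a few lines. This is an illustrative sketch with hypothetical names, not the vmd code: the first function reproduces the bad cast, the second shows the kind of check that an unsigned size makes possible.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* The pitfall: a negative ssize_t cast to uint64_t becomes a huge length. */
uint64_t
demo_bad_len(ssize_t n)
{
	return (uint64_t)n;
}

/* Hypothetical check: queue a packet only when 0 < sz <= max. */
int
demo_should_queue(size_t sz, size_t max)
{
	return sz > 0 && sz <= max;
}
```

With a signed size of -1, the cast yields UINT64_MAX, so any "is there a packet?" test based on the cast value passes when it should not.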
|
#
97f33f1d |
| 29-Mar-2021 |
dv <dv@openbsd.org> |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. The previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
4090dcce |
| 07-Jan-2021 |
tracey <tracey@openbsd.org> |
bump VM shutdown event timeout ok mlarkin@ stsp@ florian@
VMs with additional package daemons were not given enough time to shut down gracefully.
|
#
548054a9 |
| 11-Dec-2019 |
pd <pd@openbsd.org> |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from the event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and, if the vm is paused, that call would block, which in turn blocks the libevent thread. This would mean that we would not be able to process any IMSGs received from vmm (parent process), including a message to unpause.
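The blocking pause described above, waiting on a condition instead of sleeping for a fixed second, can be sketched with POSIX condition variables. This is a minimal illustration under assumptions: struct demo_vm, its fields, and both function names are hypothetical, and the real vcpu loop is reduced to a thread that simply waits for a pause request.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_vm {
	pthread_mutex_t mtx;
	pthread_cond_t request_cond;	/* signaled when a pause is requested */
	pthread_cond_t paused_cond;	/* signaled once the vcpu has paused */
	bool pause_requested;
	bool paused;
};

/* Stand-in for a vcpu thread: wait for a request, then report "paused". */
static void *
demo_vcpu_thread(void *arg)
{
	struct demo_vm *vm = arg;

	pthread_mutex_lock(&vm->mtx);
	while (!vm->pause_requested)
		pthread_cond_wait(&vm->request_cond, &vm->mtx);
	vm->paused = true;
	pthread_cond_signal(&vm->paused_cond);
	pthread_mutex_unlock(&vm->mtx);
	return NULL;
}

/* Request a pause and block until the vcpu thread confirms it. */
int
demo_pause_vm(void)
{
	struct demo_vm vm;
	pthread_t t;
	int ret = -1;

	vm.pause_requested = vm.paused = false;
	pthread_mutex_init(&vm.mtx, NULL);
	pthread_cond_init(&vm.request_cond, NULL);
	pthread_cond_init(&vm.paused_cond, NULL);

	if (pthread_create(&t, NULL, demo_vcpu_thread, &vm) == 0) {
		pthread_mutex_lock(&vm.mtx);
		vm.pause_requested = true;
		pthread_cond_signal(&vm.request_cond);
		/* Loop guards against spurious wakeups. */
		while (!vm.paused)
			pthread_cond_wait(&vm.paused_cond, &vm.mtx);
		pthread_mutex_unlock(&vm.mtx);
		pthread_join(t, NULL);
		ret = 0;
	}

	pthread_cond_destroy(&vm.paused_cond);
	pthread_cond_destroy(&vm.request_cond);
	pthread_mutex_destroy(&vm.mtx);
	return ret;
}
```

The caller returns only after the paused flag is set under the mutex, so there is no window in which the vm is reported as paused while the vcpu is still running.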
ok mlarkin@
|
#
cc104512 |
| 06-Dec-2018 |
claudio <claudio@openbsd.org> |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
#
62df93ee |
| 26-Nov-2018 |
reyk <reyk@openbsd.org> |
Move the {qcow2,raw} create functions from vmctl into vmd/vio{qcow2,raw}.c
This way they are in the appropriate place and code can be shared with vmd.
Ok ori@ mlarkin@ ccardenas@
|
#
4d2a1fb2 |
| 19-Oct-2018 |
reyk <reyk@openbsd.org> |
Add support to create and convert disk images from existing images
The -i option to vmctl create (e.g. vmctl create output.qcow2 -i input.img) lets you create a new image from an input file and convert it if it is a different format. This allows creating qcow2 images from raw images, raw from qcow2, or even qcow2 from qcow2 and raw from raw to re-optimize the disk.
This re-uses Ori's vioqcow2.c from vmd by reaching into it and compiling it in. The API has been adjusted to be used from both vmctl and vmd accordingly.
OK mlarkin@
|
#
73613953 |
| 08-Oct-2018 |
reyk <reyk@openbsd.org> |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
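The lookup order described above, derived image first and then the base, can be sketched with a toy in-memory model. Everything here is a simplifying assumption for illustration: struct demo_image, the one-byte "clusters", and the presence array bear no relation to the real qcow2 on-disk layout or to vmd's vioqcow2.c.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NCLUSTERS	4

/* Hypothetical, heavily simplified image: one byte of data per cluster. */
struct demo_image {
	const struct demo_image *base;		/* NULL for a base image */
	uint8_t present[DEMO_NCLUSTERS];	/* 1 if this image holds the cluster */
	uint8_t data[DEMO_NCLUSTERS];
};

/*
 * Start at the derived image and fall back along the base chain.
 * Returns 0 and stores the byte in *out, or -1 if idx is out of range.
 * Clusters present in no image read as zero.
 */
int
demo_read_cluster(const struct demo_image *img, size_t idx, uint8_t *out)
{
	if (idx >= DEMO_NCLUSTERS)
		return -1;
	for (; img != NULL; img = img->base) {
		if (img->present[idx]) {
			*out = img->data[idx];
			return 0;
		}
	}
	*out = 0;
	return 0;
}
```

The model also shows why modifying the base is unsafe: any cluster the derived image has not overridden is read straight from the base, so changing it there silently changes the derived image's contents.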
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
f6c09be3 |
| 28-Sep-2018 |
reyk <reyk@openbsd.org> |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
50bebf2c |
| 19-Sep-2018 |
ccardenas <ccardenas@openbsd.org> |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
f224f92a |
| 09-Sep-2018 |
ccardenas <ccardenas@openbsd.org> |
Add initial qcow2 image support.
Users are able to declare disk images as 'raw' or 'qcow2' using either vmctl or vm.conf. The default disk image format is 'raw' if not specified.
Examples of using disk format:
    vmctl start bsd -Lc -r cd64.iso -d qcow2:current.qc2
or
    vmctl start bsd -Lc -r cd64.iso -d raw:current.raw
is equivalent to
    vmctl start bsd -Lc -r cd64.iso -d current.raw

in vm.conf
    vm "current" {
        disable
        memory 2G
        disk "/home/user/vmm/current.qc2" format "qcow2"
        interface { switch "external" }
    }
or
    vm "current" {
        disable
        memory 2G
        disk "/home/user/vmm/current.raw" format "raw"
        interface { switch "external" }
    }
is equivalent to
    vm "current" {
        disable
        memory 2G
        disk "/home/user/vmm/current.raw"
        interface { switch "external" }
    }
Tested by many.
Big Thanks to Ori Bernstein.
|