00580786 | 12-Mar-2024 |
Philippe Mathieu-Daudé <philmd@linaro.org> |
qapi: Inline and remove QERR_MIGRATION_ACTIVE definition
Address the comment added in commit 4629ed1e98 ("qerror: Finally unused, clean up"), from 2015:
  /*
   * These macros will go away, please don't use
   * in new code, and do not add new ones!
   */
Mechanical transformation using sed, manually removing the definition in include/qapi/qmp/qerror.h.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-10-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
[Straightforward conflict with commit aeaafb1e59f (migration: export migration_is_running) resolved]
|
45d19d93 | 12-Mar-2024 |
Philippe Mathieu-Daudé <philmd@linaro.org> |
qapi: Correct error message for 'vcpu_dirty_limit' parameter
QERR_INVALID_PARAMETER_VALUE is defined as:
  #define QERR_INVALID_PARAMETER_VALUE \
      "Parameter '%s' expects %s"
The current error is formatted as:
"Parameter 'vcpu_dirty_limit' expects is invalid, it must greater then 1 MB/s"
Replace by:
"Parameter 'vcpu_dirty_limit' must be greater than 1 MB/s"
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240312141343.3168265-9-armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
[New error message corrected, commit message updated accordingly]
|
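The format-string mismatch described in commit 45d19d93 can be reproduced with a small sketch. The macro text is the one quoted in the commit message; the helper name format_expects() is hypothetical and exists only for this illustration, it is not a QEMU function.

```c
#include <stdio.h>

/* Format string as quoted in the commit message above. */
#define QERR_INVALID_PARAMETER_VALUE "Parameter '%s' expects %s"

/*
 * Hypothetical helper (not part of QEMU): expands the macro with the
 * given parameter name and description, writing into buf.
 * Returns the value returned by snprintf().
 */
static int format_expects(char *buf, size_t size,
                          const char *name, const char *expected)
{
    return snprintf(buf, size, QERR_INVALID_PARAMETER_VALUE, name, expected);
}
```

Passing a full sentence such as "is invalid, it must greater then 1 MB/s" as the second argument yields the doubled wording quoted in the commit message, because the format string already supplies the word "expects".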
2cc637f1 | 17-Apr-2024 |
Li Zhijian <lizhijian@fujitsu.com> |
migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
bdrv_activate_all() should not be called from coroutine context. Move it to the QEMU thread colo_process_incoming_thread(), protected by bql_lock.
The backtrace is as follows:
  #4 0x0000561af7948362 in bdrv_graph_rdlock_main_loop () at ../block/graph-lock.c:260
  #5 0x0000561af7907a68 in graph_lockable_auto_lock_mainloop (x=0x7fd29810be7b) at /patch/to/qemu/include/block/graph-lock.h:259
  #6 0x0000561af79167d1 in bdrv_activate_all (errp=0x7fd29810bed0) at ../block.c:6906
  #7 0x0000561af762b4af in colo_incoming_co () at ../migration/colo.c:935
  #8 0x0000561af7607e57 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:793
  #9 0x0000561af7adbeeb in coroutine_trampoline (i0=-106876144, i1=22042) at ../util/coroutine-ucontext.c:175
  #10 0x00007fd2a5cf21c0 in () at /lib64/libc.so.6
Cc: qemu-stable@nongnu.org
Cc: Fabiano Rosas <farosas@suse.de>
Closes: https://gitlab.com/qemu-project/qemu/-/issues/2277
Fixes: 2b3912f135 ("block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Tested-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240417025634.1014582-1-lizhijian@fujitsu.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
5ef7e26b | 01-Apr-2024 |
Yuan Liu <yuan1.liu@intel.com> |
migration/multifd: solve zero page causing multiple page faults
Implement recvbitmap tracking of received pages in multifd.
If a zero page appears for the first time in the recvbitmap, its data is not checked; the page is only set in the recvbitmap.
If the zero page has already appeared in the recvbitmap, there is no need to check the data; it is directly set to 0, because it is unlikely that a zero page will be migrated multiple times.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240401154110.2028453-2-yuan1.liu@intel.com
[peterx: touch up the comment, as the bitmap is used outside postcopy now]
Signed-off-by: Peter Xu <peterx@redhat.com>
|
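The zero-page handling described in commit 5ef7e26b can be sketched roughly as follows. This is a simplified illustration, not the QEMU implementation: struct recv_state, SKETCH_PAGE_SIZE and both function names are hypothetical, and the sketch assumes that destination memory which has never been written already reads as zero, which is why the first occurrence only records the page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for the sketch */

/* Hypothetical receive-side state: one bit per guest page. */
struct recv_state {
    uint8_t *ram;         /* destination RAM, page i starts at i * SKETCH_PAGE_SIZE */
    uint8_t *recvbitmap;  /* bit i set => page i has been received before */
};

/* Set the bit for a page and report whether it was already set. */
static bool recv_test_and_set(struct recv_state *s, size_t page)
{
    uint8_t mask = (uint8_t)(1u << (page % 8));
    bool was_set = (s->recvbitmap[page / 8] & mask) != 0;

    s->recvbitmap[page / 8] |= mask;
    return was_set;
}

/*
 * Handle an incoming zero page.
 * Returns true if the page memory was written, false if it was left alone.
 */
static bool handle_zero_page(struct recv_state *s, size_t page)
{
    if (!recv_test_and_set(s, page)) {
        /* First appearance: only record it, do not touch the memory. */
        return false;
    }
    /* Seen before: the page may hold stale data, so clear it directly. */
    memset(s->ram + page * SKETCH_PAGE_SIZE, 0, SKETCH_PAGE_SIZE);
    return true;
}
```

Skipping the write on the first occurrence is what avoids touching (and faulting in) memory that is still untouched on the destination.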
dd031677 | 29-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Add Error** argument to add_bitmaps_to_list()
This allows reporting more precise errors in the migration handler dirty_bitmap_save_setup().
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Link: https://lore.kernel.org/r/20240329105627.311227-1-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
030b56b2 | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Modify ram_init_bitmaps() to report dirty tracking errors
The .save_setup() handler now has an Error** argument that we can use to propagate errors reported by the .log_global_start() handler. Do that for the RAM. The caller qemu_savevm_state_setup() will store the error under the migration stream for later detection in the migration sequence.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320064911.545001-15-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
7bee8ba8 | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Add Error** argument to xbzrle_init()
Since the return value (-ENOMEM) is not exploited, follow the recommendations of qapi/error.h and change it to a bool.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320064911.545001-14-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
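The "Error** argument plus bool return" shape that commit 7bee8ba8 and the neighbouring commits move towards can be illustrated with a minimal sketch. SketchError, sketch_error_set() and sketch_cache_init() are stand-ins invented for this example; they are not the real qapi/error.h API and not the real xbzrle_init().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for an error object; NOT QEMU's Error type. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Hypothetical setter: stores a message if the caller asked for one. */
static void sketch_error_set(SketchError **errp, const char *msg)
{
    if (!errp) {
        return;                     /* caller is not interested in details */
    }
    *errp = malloc(sizeof(**errp));
    if (*errp) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

/*
 * Hypothetical init routine in the new style: returns true on success,
 * false on failure, and describes the failure through errp instead of
 * returning a negative errno value that callers ignore.
 */
static bool sketch_cache_init(size_t cache_size, SketchError **errp)
{
    if (cache_size == 0) {
        sketch_error_set(errp, "cache size must be non-zero");
        return false;
    }
    return true;
}
```

A caller then tests the boolean result and, on failure, reads or forwards the error object rather than decoding an integer code.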
16ecd25a | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Add Error** argument to ram_state_init()
Since the return value is not exploited, follow the recommendations of qapi/error.h and change it to a bool.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320064911.545001-13-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
639ec3fb | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
memory: Add Error** argument to the global_dirty_log routines
Now that the log_global*() handlers take an Error** parameter and return a bool, do the same for memory_global_dirty_log_start() and memory_global_dirty_log_stop(). The error is reported in the callers for now and it will be propagated in the call stack in the next changes.
Note a functional change in ram_init_bitmaps(): if the dirty pages logger fails to start, there is no need to synchronize the dirty pages bitmaps. colo_incoming_start_dirty_log() could be modified in a similar way.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hyman Huang <yong.huang@smartx.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-12-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
92c20b2f | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Introduce ram_bitmaps_destroy()
We will use it in ram_init_bitmaps() to clear the allocated bitmaps when support for error reporting is added to memory_global_dirty_log_start().
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320064911.545001-11-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
e4fa064d | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Add Error** argument to .load_setup() handler
This will be useful to report errors at a higher level, mostly in VFIO today.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-9-clg@redhat.com
[peterx: drop comment for ERRP_GUARD, per Markus]
Signed-off-by: Peter Xu <peterx@redhat.com>
|
01c3ac68 | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Add Error** argument to .save_setup() handler
The purpose is to record a potential error in the migration stream if qemu_savevm_state_setup() fails. Most of the current .save_setup() handlers can be modified to use the Error argument instead of managing their own and calling error_report() locally.
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-8-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
057a2009 | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Add Error** argument to qemu_savevm_state_setup()
This prepares the ground for the upcoming changes, which add an Error** argument to the .save_setup() handler. Callers of qemu_savevm_state_setup() now handle the error and fail earlier, setting the migration state from MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
In qemu_savevm_state(), move the cleanup to preserve the error reported by .save_setup() handlers.
Since the previous behavior was to ignore errors at this step of migration, this change should be examined closely to check that cleanups are still correctly done.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-7-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
6138d43a | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Add Error** argument to vmstate_save()
This will prepare ground for future changes adding an Error** argument to qemu_savevm_state_setup().
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-6-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
76936bbc | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Always report an error in ram_save_setup()
This will prepare ground for future changes adding an Error** argument to the save_setup() handler. We need to make sure that on failure, ram_save_setup() sets a new error.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-5-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
150da48c | 20-Mar-2024 |
Cédric Le Goater <clg@redhat.com> |
migration: Always report an error in block_save_setup()
This will prepare ground for future changes adding an Error** argument to the save_setup() handler. We need to make sure that on failure, block_save_setup() always sets a new error.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-4-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
857f504c | 08-Apr-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
colo: move stubs out of stubs/
Since the colo stubs are needed exactly when the build options are not enabled, move them together with the code they stub.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240408155330.522792-16-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
7afbdada | 05-Apr-2024 |
Wei Wang <wei.w.wang@intel.com> |
migration/postcopy: ensure preempt channel is ready before loading states
Before loading the guest states, ensure that the preempt channel is ready to use, as some of the states (e.g. via virtio_load) might trigger page faults that will be handled through the preempt channel. So yield to the main thread if the channel create event hasn't been dispatched yet.
Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: 9358982744 ("migration: Send requested page directly in rp-return thread")
Originally-by: Lei Wang <lei4.wang@intel.com>
Link: https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@intel.com/T/
Signed-off-by: Lei Wang <lei4.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240405034056.23933-1-wei.w.wang@intel.com
[peterx: add a todo section, add Fixes and copy stable for 8.0+]
Signed-off-by: Peter Xu <peterx@redhat.com>
|
d0ad271a | 28-Mar-2024 |
Avihai Horon <avihaih@nvidia.com> |
migration/postcopy: Ensure postcopy_start() sets errp if it fails
There are several places where postcopy_start() fails without setting errp. This can cause a null pointer dereference because, in case of error, the caller of postcopy_start() copies/prints the error set in errp.
Fix it by setting errp in all of postcopy_start() error paths.
Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: 908927db28ea ("migration: Update error description whenever migration fails")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
30158d88 | 28-Mar-2024 |
Avihai Horon <avihaih@nvidia.com> |
migration: Set migration error in migration_completion()
After commit 9425ef3f990a ("migration: Use migrate_has_error() in close_return_path_on_source()"), close_return_path_on_source() assumes that migration error is set if an error occurs during migration.
This may not be true if migration fails in migration_completion(). For example, if qemu_savevm_state_complete_precopy() fails, the migration error will not be set.
This, in turn, will cause a migration hang, similar to the bug that was fixed by commit 22b04245f0d5 ("migration: Join the return path thread before releasing to_dst_file"), as shutdown() will not be issued for the return-path channel.
Fix it by ensuring migration error is set in case of error in migration_completion().
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: 9425ef3f990a ("migration: Use migrate_has_error() in close_return_path_on_source()")
Acked-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
8fa1a21c | 21-Mar-2024 |
Fabiano Rosas <farosas@suse.de> |
migration/multifd: Fix clearing of mapped-ram zero pages
When the zero page detection is done in the multifd threads, we need to iterate the second part of the pages->offset array and clear the file bitmap for each zero page. The piece of code we merged to do that is wrong.
The reason this has passed all the tests is that the bitmap is already initialized with zeroes, so clearing the bits only really has an effect during live migration, when a page goes from having data to having no data.
Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the multifd thread.")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240321201242.6009-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
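The iteration that commit 8fa1a21c describes, walking the second part of the offset array and clearing the file bitmap for each zero page, might look roughly like the sketch below. The struct layout is an assumption made for illustration: entries before normal_num are taken to be data pages and the remaining entries zero pages, and offsets are treated as plain page indices. None of the names are QEMU's.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a multifd pages descriptor. */
struct sketch_pages {
    size_t num;              /* total entries in offset[] */
    size_t normal_num;       /* assumption: offset[0..normal_num) are data pages */
    const uint64_t *offset;  /* assumption: page indices, not byte offsets */
};

/* Clear one bit in a byte-addressed bitmap. */
static void sketch_clear_bit(uint8_t *bitmap, uint64_t bit)
{
    bitmap[bit / 8] &= (uint8_t)~(1u << (bit % 8));
}

/*
 * Clear the file bitmap bit of every zero page, i.e. every entry in the
 * second part of the offset array. Returns the number of bits cleared.
 */
static size_t sketch_clear_zero_pages(const struct sketch_pages *pages,
                                      uint8_t *file_bitmap)
{
    size_t cleared = 0;

    for (size_t i = pages->normal_num; i < pages->num; i++) {
        sketch_clear_bit(file_bitmap, pages->offset[i]);
        cleared++;
    }
    return cleared;
}
```

Starting the loop at normal_num rather than 0 is the point of the sketch: only the zero-page entries should lose their bit, while data pages keep theirs.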
910c1647 | 20-Mar-2024 |
Peter Xu <peterx@redhat.com> |
migration/postcopy: Fix high frequency sync
With the current code base I can observe an extremely high sync count during precopy, as long as one enables postcopy-ram=on before switchover to postcopy.
To provide some context of when QEMU decides to do a full sync: it checks must_precopy (which implies "data must be sent during precopy phase"), and as long as it is lower than the threshold size we calculated (out of bandwidth and expected downtime) QEMU will kick off the slow/exact sync.
However, when postcopy is enabled (even if still in the precopy phase), RAM reports all pages as can_postcopy and reports must_precopy==0. Then "must_precopy <= threshold_size" almost always triggers and enforces a slow sync for every call to migration_iteration_run() when postcopy is enabled, even if it is not used. That is insane.
It turns out this was a regression introduced by the previous refactoring in 8.0, as reported by Nina [1]:
(a) c8df4a7aef ("migration: Split save_live_pending() into state_pending_*")
Then a workaround patch was applied at the end of the release (8.0-rc4) to fix it:
(b) 28ef5339c3 ("migration: fix ram_state_pending_exact()")
However, that "workaround" was overlooked during the cleanup in this 9.0 release, in this commit:
(c) b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Then the issue was re-exposed as reported by Nina [1].
The problem with (b) is that it only fixed the case for RAM, rather than for all the other iterators. Here a slow sync should only be required if all dirty data (precopy+postcopy) is less than the threshold_size that QEMU calculated. It is even debatable whether a sync is needed once switched to postcopy. Currently ram_state_pending_exact() will mostly be a no-op if switched to postcopy, and that logic seems to apply to all the other iterators too, as syncing the dirty bitmap during postcopy doesn't make much sense. However, let's leave such a change for later, as we're in the rc phase.
So rather than reusing commit (b), this patch provides the complete fix for all iterators. While at it, clean up the surrounding lines a little.
[1] https://gitlab.com/qemu-project/qemu/-/issues/1565
Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()")
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240320214453.584374-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
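The difference between the two sync conditions discussed in commit 910c1647 can be made concrete with a small sketch. Both function names are hypothetical, and the comparison operators are taken from the wording of the commit message ("must_precopy <= threshold_size" for the problematic check, "less than the threshold_size" for the intended one); the actual QEMU code may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Problematic check as described above: only must_precopy is compared,
 * so it fires whenever RAM reports everything as can_postcopy.
 */
static bool sketch_exact_sync_old(uint64_t must_precopy,
                                  uint64_t can_postcopy,
                                  uint64_t threshold_size)
{
    (void)can_postcopy;  /* ignored by the old check */
    return must_precopy <= threshold_size;
}

/*
 * Intended check as described above: a slow/exact sync is only needed
 * when all dirty data (precopy + postcopy) is below the threshold.
 */
static bool sketch_exact_sync_fixed(uint64_t must_precopy,
                                    uint64_t can_postcopy,
                                    uint64_t threshold_size)
{
    return must_precopy + can_postcopy < threshold_size;
}
```

With must_precopy == 0 and a large can_postcopy value, the old check requests a slow sync on every iteration, while the fixed one does not until the total amount of dirty data actually drops below the threshold.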
bd4480b0 | 19-Mar-2024 |
Fabiano Rosas <farosas@suse.de> |
migration: Revert mapped-ram multifd support to fd: URI
This reverts commit decdc76772c453ff1444612e910caa0d45cd8eac in full and also the relevant migration-tests from 7a09f092834641b7a793d50a3a261073bbb404a6.
After the addition of the new QAPI-based migration address API in 8.2 we've been converting an "fd:" URI into a SocketAddress, missing the fact that the "fd:" syntax could also be used for a plain file instead of a socket. This is a problem because the SocketAddress is part of the API, so we're effectively asking users to create a "socket" channel to pass in a plain file.
The easiest way to fix this situation is to deprecate the usage of both SocketAddress and "fd:" when used with a plain file for migration. Since this has been possible since 8.2, we can wait until 9.1 to deprecate it.
For 9.0, however, we should avoid adding further support to migration to a plain file using the old "fd:" syntax or the new SocketAddress API, and instead require the usage of either the old-style "file:" URI or the FileMigrationArgs::filename field of the new API with the "/dev/fdset/NN" syntax, both of which are already supported.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240319210941.1907-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
9adfb308 | 15-Mar-2024 |
Fabiano Rosas <farosas@suse.de> |
migration/multifd: Duplicate the fd for the outgoing_args
We currently store the file descriptor used during the main outgoing channel creation to use it again when creating the multifd channels.
Since this fd is used for the first iochannel, there is a risk that the QIOChannel gets freed and the fd closed while outgoing_args.fd still refers to it. This could lead to an fd-reuse bug.
Duplicate the outgoing_args fd to avoid this issue.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240315032040.7974-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|
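The idea behind commit 9adfb308, keeping a private duplicate of a file descriptor so that closing the original cannot invalidate the stored copy, can be sketched with POSIX dup(). The struct and function names are hypothetical and only mirror the outgoing_args.fd field mentioned in the commit message.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <unistd.h>

/* Hypothetical stand-in for the saved outgoing arguments. */
struct sketch_outgoing_args {
    int fd;
};

/*
 * Store a duplicate of fd rather than fd itself. The caller keeps
 * ownership of fd and may close it at any time; args->fd stays valid
 * until it is closed separately. Returns 0 on success, -1 on failure.
 */
static int sketch_save_outgoing_fd(struct sketch_outgoing_args *args, int fd)
{
    int copy = dup(fd);

    if (copy < 0) {
        return -1;
    }
    args->fd = copy;
    return 0;
}
```

Because the stored descriptor is a distinct number referring to the same open file, a later close() of the original, followed by the kernel reusing that number for something unrelated, no longer affects the saved copy.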
73f6f9a1 | 15-Mar-2024 |
Fabiano Rosas <farosas@suse.de> |
migration/multifd: Ensure we're not given a socket for file migration
When doing migration using the fd: URI, QEMU will fetch the file descriptor passed in via the monitor at fd_start_outgoing|incoming_migration(), which means the checks at migration_channels_and_transport_compatible() happen too soon and we don't know at that point whether the FD refers to a plain file or a socket.
For this reason, we've been allowing a migration channel of type SOCKET_ADDRESS_TYPE_FD to pass the initial verifications in scenarios where the socket migration is not supported, such as with fd + multifd.
The commit decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI") was supposed to add a second check prior to starting migration to make sure a socket fd is not passed instead of a file fd, but failed to do so.
Add the missing verification and update the comment explaining this situation, which is currently incorrect.
Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240315032040.7974-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
|