io_uring.c - OpenGrok history log for /qemu/block/io

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: v8.2.1, v8.1.5, v7.2.9
# 75e79f5a	23-Jan-2024	Fiona Ebner <f.ebner@proxmox.com>	block/io_uring: improve error message when init fails The man page for io_uring_queue_init states: > io_uring_queue_init(3) returns 0 on success and -errno on failure. and the man page for io_urin block/io_uring: improve error message when init fails The man page for io_uring_queue_init states: > io_uring_queue_init(3) returns 0 on success and -errno on failure. and the man page for io_uring_setup (which is one of the functions where the return value of io_uring_queue_init() can come from) states: > On error, a negative error code is returned. The caller should not > rely on errno variable. Tested using 'sysctl kernel.io_uring_disabled=2'. Output before this change: > failed to init linux io_uring ring Output after this change: > failed to init linux io_uring ring: Operation not permitted Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240123135044.204985-1-f.ebner@proxmox.com> show more ...
Revision tags: v8.1.4, v7.2.8, v8.2.0
# 592d0bc0	13-Dec-2023	Paolo Bonzini <pbonzini@redhat.com>	remove unnecessary casts from uintptr_t uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems, is an unsigned type and there is no need to further cast to __u64 which is another u remove unnecessary casts from uintptr_t uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems, is an unsigned type and there is no need to further cast to __u64 which is another unsigned integer type; widening casts from unsigned integers zero-extend the value. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> show more ...
Revision tags: v8.2.0-rc4, v8.2.0-rc3, v8.2.0-rc2, v8.2.0-rc1, v7.2.7, v8.1.3, v8.2.0-rc0, v8.1.2, v8.1.1, v7.2.6, v8.0.5, v8.1.0, v8.1.0-rc4, v8.1.0-rc3, v7.2.5, v8.0.4, v8.1.0-rc2, v8.1.0-rc1, v8.1.0-rc0, v8.0.3, v7.2.4, v8.0.2, v8.0.1, v7.2.3, v7.2.2, v8.0.0, v8.0.0-rc4, v8.0.0-rc3, v7.2.1, v8.0.0-rc2, v8.0.0-rc1, v8.0.0-rc0, v7.2.0, v7.2.0-rc4, v7.2.0-rc3, v7.2.0-rc2, v7.2.0-rc1, v7.2.0-rc0, v7.1.0, v7.1.0-rc4, v7.1.0-rc3, v7.1.0-rc2, v7.1.0-rc1, v7.1.0-rc0, v7.0.0, v7.0.0-rc4, v7.0.0-rc3, v7.0.0-rc2, v7.0.0-rc1, v7.0.0-rc0, v6.1.1, v6.2.0, v6.2.0-rc4, v6.2.0-rc3, v6.2.0-rc2, v6.2.0-rc1, v6.2.0-rc0, v6.0.1, v6.1.0, v6.1.0-rc4, v6.1.0-rc3, v6.1.0-rc2, v6.1.0-rc1, v6.1.0-rc0
# 3cbc17ee	12-Jul-2021	Paolo Bonzini <pbonzini@redhat.com>	io_uring: move LuringState typedef to block/aio.h The LuringState typedef is defined twice, in include/block/raw-aio.h and block/io_uring.c. Move it in include/block/aio.h, which is included everyw io_uring: move LuringState typedef to block/aio.h The LuringState typedef is defined twice, in include/block/raw-aio.h and block/io_uring.c. Move it in include/block/aio.h, which is included everywhere the typedef is needed, since include/block/aio.h already has to define the forward reference to the struct. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> show more ...
# 84d61e5f	13-Sep-2023	Stefan Hajnoczi <stefanha@redhat.com>	virtio: use defer_call() in virtio_irqfd_notify() virtio-blk and virtio-scsi invoke virtio_irqfd_notify() to send Used Buffer Notifications from an IOThread. This involves an eventfd write(2) syscal virtio: use defer_call() in virtio_irqfd_notify() virtio-blk and virtio-scsi invoke virtio_irqfd_notify() to send Used Buffer Notifications from an IOThread. This involves an eventfd write(2) syscall. Calling this repeatedly when completing multiple I/O requests in a row is wasteful. Use the defer_call() API to batch together virtio_irqfd_notify() calls made during thread pool (aio=threads), Linux AIO (aio=native), and io_uring (aio=io_uring) completion processing. Behavior is unchanged for emulated devices that do not use defer_call_begin()/defer_call_end() since defer_call() immediately invokes the callback when called outside a defer_call_begin()/defer_call_end() region. fio rw=randread bs=4k iodepth=64 numjobs=8 IOPS increases by ~9% with a single IOThread and 8 vCPUs. iodepth=1 decreases by ~1% but this could be noise. Detailed performance data and configuration specifics are available here: https://gitlab.com/stefanha/virt-playbooks/-/tree/blk_io_plug-irqfd This duplicates the BH that virtio-blk uses for batching. The next commit will remove it. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230913200045.1024233-4-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 433fcea4	13-Sep-2023	Stefan Hajnoczi <stefanha@redhat.com>	util/defer-call: move defer_call() to util/ The networking subsystem may wish to use defer_call(), so move the code to util/ where it can be reused. As a reminder of what defer_call() does: This A util/defer-call: move defer_call() to util/ The networking subsystem may wish to use defer_call(), so move the code to util/ where it can be reused. As a reminder of what defer_call() does: This API defers a function call within a defer_call_begin()/defer_call_end() section, allowing multiple calls to batch up. This is a performance optimization that is used in the block layer to submit several I/O requests at once instead of individually: defer_call_begin(); <-- start of section ... defer_call(my_func, my_obj); <-- deferred my_func(my_obj) call defer_call(my_func, my_obj); <-- another defer_call(my_func, my_obj); <-- another ... defer_call_end(); <-- end of section, my_func(my_obj) is called once Suggested-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230913200045.1024233-3-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# ccee48aa	13-Sep-2023	Stefan Hajnoczi <stefanha@redhat.com>	block: rename blk_io_plug_call() API to defer_call() Prepare to move the blk_io_plug_call() API out of the block layer so that other subsystems call use this deferred call mechanism. Rename it to de block: rename blk_io_plug_call() API to defer_call() Prepare to move the blk_io_plug_call() API out of the block layer so that other subsystems call use this deferred call mechanism. Rename it to defer_call() but leave the code in block/plug.c. The next commit will move the code out of the block layer. Suggested-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230913200045.1024233-2-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 84d61e5f	13-Sep-2023	Stefan Hajnoczi <stefanha@redhat.com>	virtio: use defer_call() in virtio_irqfd_notify() virtio-blk and virtio-scsi invoke virtio_irqfd_notify() to send Used Buffer Notifications from an IOThread. This involves an eventfd write(2) syscal virtio: use defer_call() in virtio_irqfd_notify() virtio-blk and virtio-scsi invoke virtio_irqfd_notify() to send Used Buffer Notifications from an IOThread. This involves an eventfd write(2) syscall. Calling this repeatedly when completing multiple I/O requests in a row is wasteful. Use the defer_call() API to batch together virtio_irqfd_notify() calls made during thread pool (aio=threads), Linux AIO (aio=native), and io_uring (aio=io_uring) completion processing. Behavior is unchanged for emulated devices that do not use defer_call_begin()/defer_call_end() since defer_call() immediately invokes the callback when called outside a defer_call_begin()/defer_call_end() region. fio rw=randread bs=4k iodepth=64 numjobs=8 IOPS increases by ~9% with a single IOThread and 8 vCPUs. iodepth=1 decreases by ~1% but this could be noise. Detailed performance data and configuration specifics are available here: https://gitlab.com/stefanha/virt-playbooks/-/tree/blk_io_plug-irqfd This duplicates the BH that virtio-blk uses for batching. The next commit will remove it. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230913200045.1024233-4-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 433fcea4	13-Sep-2023	Stefan Hajnoczi <stefanha@redhat.com>	util/defer-call: move defer_call() to util/ The networking subsystem may wish to use defer_call(), so move the code to util/ where it can be reused. As a reminder of what defer_call() does: This A util/defer-call: move defer_call() to util/ The networking subsystem may wish to use defer_call(), so move the code to util/ where it can be reused. As a reminder of what defer_call() does: This API defers a function call within a defer_call_begin()/defer_call_end() section, allowing multiple calls to batch up. This is a performance optimization that is used in the block layer to submit several I/O requests at once instead of individually: defer_call_begin(); <-- start of section ... defer_call(my_func, my_obj); <-- deferred my_func(my_obj) call defer_call(my_func, my_obj); <-- another defer_call(my_func, my_obj); <-- another ... defer_call_end(); <-- end of section, my_func(my_obj) is called once Suggested-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230913200045.1024233-3-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# ccee48aa	13-Sep-2023	Stefan Hajnoczi <stefanha@redhat.com>	block: rename blk_io_plug_call() API to defer_call() Prepare to move the blk_io_plug_call() API out of the block layer so that other subsystems call use this deferred call mechanism. Rename it to de block: rename blk_io_plug_call() API to defer_call() Prepare to move the blk_io_plug_call() API out of the block layer so that other subsystems call use this deferred call mechanism. Rename it to defer_call() but leave the code in block/plug.c. The next commit will move the code out of the block layer. Suggested-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230913200045.1024233-2-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 6a6da231	30-May-2023	Stefan Hajnoczi <stefanha@redhat.com>	block/io_uring: convert to blk_io_plug_call() API Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission block/io_uring: convert to blk_io_plug_call() API Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
# 60f782b6	16-May-2023	Stefan Hajnoczi <stefanha@redhat.com>	aio: remove aio_disable_external() API All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handle aio: remove aio_disable_external() API All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 4751d09a	08-May-2023	Sam Li <faithilikerun@gmail.com>	block: introduce zone append write for zoned devices A zone append command is a write operation that specifies the first logical block of a zone as the write position. When writing to a zoned block block: introduce zone append write for zoned devices A zone append command is a write operation that specifies the first logical block of a zone as the write position. When writing to a zoned block device using zone append, the byte offset of the call may point at any position within the zone to which the data is being appended. Upon completion the device will respond with the position where the data has been written in the zone. Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508051510.177850-3-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
# a75e4e43	03-Feb-2023	Emanuele Giuseppe Esposito <eesposit@redhat.com>	io_uring: use LuringState from the running thread Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LuringState. In order to prevent mistakes from th io_uring: use LuringState from the running thread Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LuringState. In order to prevent mistakes from the caller side, avoid passing LuringState in luring_io_{plug/unplug} and luring_co_submit, and document the functions to make clear that they work in the current thread's AioContext. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230203131731.851116-3-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# a75e4e43	03-Feb-2023	Emanuele Giuseppe Esposito <eesposit@redhat.com>	io_uring: use LuringState from the running thread Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LuringState. In order to prevent mistakes from th io_uring: use LuringState from the running thread Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LuringState. In order to prevent mistakes from the caller side, avoid passing LuringState in luring_io_{plug/unplug} and luring_co_submit, and document the functions to make clear that they work in the current thread's AioContext. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230203131731.851116-3-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# a75e4e43	03-Feb-2023	Emanuele Giuseppe Esposito <eesposit@redhat.com>	io_uring: use LuringState from the running thread Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LuringState. In order to prevent mistakes from th io_uring: use LuringState from the running thread Remove usage of aio_context_acquire by always submitting asynchronous AIO to the current thread's LuringState. In order to prevent mistakes from the caller side, avoid passing LuringState in luring_io_{plug/unplug} and luring_co_submit, and document the functions to make clear that they work in the current thread's AioContext. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230203131731.851116-3-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 7845e731	24-Sep-2022	Sam Li <faithilikerun@gmail.com>	block/io_uring: revert "Use io_uring_register_ring_fd() to skip fd operations" Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1193 The commit "Use io_uring_register_ring_fd() to skip fd op block/io_uring: revert "Use io_uring_register_ring_fd() to skip fd operations" Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1193 The commit "Use io_uring_register_ring_fd() to skip fd operations" broke when booting a guest with iothread and io_uring. That is because the io_uring_register_ring_fd() call is made from the main thread instead of IOThread where io_uring_submit() is called. It can not be guaranteed to register the ring fd in the correct thread or unregister the same ring fd if the IOThread is disabled. This optimization is not critical so we will revert previous commit. This reverts commit e2848bc574fe2715c694bf8fe9a1ba7f78a1125a and 77e3f038af1764983087e3551a0fde9951952c4d. Cc: qemu-stable@nongnu.org Signed-off-by: Sam Li <faithilikerun@gmail.com> Message-Id: <20220924144815.5591-1-faithilikerun@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Dario Faggioli <dfaggioli@suse.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 7845e731	24-Sep-2022	Sam Li <faithilikerun@gmail.com>	block/io_uring: revert "Use io_uring_register_ring_fd() to skip fd operations" Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1193 The commit "Use io_uring_register_ring_fd() to skip fd op block/io_uring: revert "Use io_uring_register_ring_fd() to skip fd operations" Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1193 The commit "Use io_uring_register_ring_fd() to skip fd operations" broke when booting a guest with iothread and io_uring. That is because the io_uring_register_ring_fd() call is made from the main thread instead of IOThread where io_uring_submit() is called. It can not be guaranteed to register the ring fd in the correct thread or unregister the same ring fd if the IOThread is disabled. This optimization is not critical so we will revert previous commit. This reverts commit e2848bc574fe2715c694bf8fe9a1ba7f78a1125a and 77e3f038af1764983087e3551a0fde9951952c4d. Cc: qemu-stable@nongnu.org Signed-off-by: Sam Li <faithilikerun@gmail.com> Message-Id: <20220924144815.5591-1-faithilikerun@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Dario Faggioli <dfaggioli@suse.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 77e3f038	21-Jul-2022	Jinhao Fan <fanjinhao21s@ict.ac.cn>	block/io_uring: add missing include file The commit "Use io_uring_register_ring_fd() to skip fd operations" uses warn_report but did not include the header file "qemu/error-report.h". This causes "e block/io_uring: add missing include file The commit "Use io_uring_register_ring_fd() to skip fd operations" uses warn_report but did not include the header file "qemu/error-report.h". This causes "error: implicit declaration of function ‘warn_report’". Include this header file. Fixes: e2848bc574 ("Use io_uring_register_ring_fd() to skip fd operations") Signed-off-by: Jinhao Fan <fanjinhao21s@ict.ac.cn> Message-Id: <20220721065645.577404-1-fanjinhao21s@ict.ac.cn> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# 77e3f038	21-Jul-2022	Jinhao Fan <fanjinhao21s@ict.ac.cn>	block/io_uring: add missing include file The commit "Use io_uring_register_ring_fd() to skip fd operations" uses warn_report but did not include the header file "qemu/error-report.h". This causes "e block/io_uring: add missing include file The commit "Use io_uring_register_ring_fd() to skip fd operations" uses warn_report but did not include the header file "qemu/error-report.h". This causes "error: implicit declaration of function ‘warn_report’". Include this header file. Fixes: e2848bc574 ("Use io_uring_register_ring_fd() to skip fd operations") Signed-off-by: Jinhao Fan <fanjinhao21s@ict.ac.cn> Message-Id: <20220721065645.577404-1-fanjinhao21s@ict.ac.cn> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> show more ...
# be6a166f	06-Jul-2022	Stefan Hajnoczi <stefanha@redhat.com>	block/io_uring: clarify that short reads can happen Jens Axboe has confirmed that short reads are rare but can happen: https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/# block/io_uring: clarify that short reads can happen Jens Axboe has confirmed that short reads are rare but can happen: https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/#m729963dc577d709b709c191922e98ec79d7eef54 The luring_resubmit_short_read() comment claimed they were only due to a specific io_uring bug that was fixed in Linux commit 9d93a3f5a0c ("io_uring: punt short reads to async context"), which is wrong. Dominique Martinet found that a btrfs bug also causes short reads. There may be more kernel code paths that result in short reads. Let's consider short reads fair game. Cc: Dominique Martinet <dominique.martinet@atmark-techno.com> Based-on: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20220706080341.1206476-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
# c06fc7ce	30-Jun-2022	Dominique Martinet <dominique.martinet@atmark-techno.com>	io_uring: fix short read slow path sqeq.off here is the offset to read within the disk image, so obviously not 'nread' (the amount we just read), but as the author meant to write its current value i io_uring: fix short read slow path sqeq.off here is the offset to read within the disk image, so obviously not 'nread' (the amount we just read), but as the author meant to write its current value incremented by the amount we just read. Normally recent versions of linux will not issue short reads, but it can happen so we should fix this. This lead to weird image corruptions when short read happened Fixes: 6663a0a33764 ("block/io_uring: implements interfaces for io_uring") Link: https://lkml.kernel.org/r/YrrFGO4A1jS0GI0G@atmark-techno.com Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com> Message-Id: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
# be6a166f	06-Jul-2022	Stefan Hajnoczi <stefanha@redhat.com>	block/io_uring: clarify that short reads can happen Jens Axboe has confirmed that short reads are rare but can happen: https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/# block/io_uring: clarify that short reads can happen Jens Axboe has confirmed that short reads are rare but can happen: https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/#m729963dc577d709b709c191922e98ec79d7eef54 The luring_resubmit_short_read() comment claimed they were only due to a specific io_uring bug that was fixed in Linux commit 9d93a3f5a0c ("io_uring: punt short reads to async context"), which is wrong. Dominique Martinet found that a btrfs bug also causes short reads. There may be more kernel code paths that result in short reads. Let's consider short reads fair game. Cc: Dominique Martinet <dominique.martinet@atmark-techno.com> Based-on: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20220706080341.1206476-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
# c06fc7ce	30-Jun-2022	Dominique Martinet <dominique.martinet@atmark-techno.com>	io_uring: fix short read slow path sqeq.off here is the offset to read within the disk image, so obviously not 'nread' (the amount we just read), but as the author meant to write its current value i io_uring: fix short read slow path sqeq.off here is the offset to read within the disk image, so obviously not 'nread' (the amount we just read), but as the author meant to write its current value incremented by the amount we just read. Normally recent versions of linux will not issue short reads, but it can happen so we should fix this. This lead to weird image corruptions when short read happened Fixes: 6663a0a33764 ("block/io_uring: implements interfaces for io_uring") Link: https://lkml.kernel.org/r/YrrFGO4A1jS0GI0G@atmark-techno.com Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com> Message-Id: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
# be6a166f	06-Jul-2022	Stefan Hajnoczi <stefanha@redhat.com>	block/io_uring: clarify that short reads can happen Jens Axboe has confirmed that short reads are rare but can happen: https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/# block/io_uring: clarify that short reads can happen Jens Axboe has confirmed that short reads are rare but can happen: https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/#m729963dc577d709b709c191922e98ec79d7eef54 The luring_resubmit_short_read() comment claimed they were only due to a specific io_uring bug that was fixed in Linux commit 9d93a3f5a0c ("io_uring: punt short reads to async context"), which is wrong. Dominique Martinet found that a btrfs bug also causes short reads. There may be more kernel code paths that result in short reads. Let's consider short reads fair game. Cc: Dominique Martinet <dominique.martinet@atmark-techno.com> Based-on: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20220706080341.1206476-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
# c06fc7ce	30-Jun-2022	Dominique Martinet <dominique.martinet@atmark-techno.com>	io_uring: fix short read slow path sqeq.off here is the offset to read within the disk image, so obviously not 'nread' (the amount we just read), but as the author meant to write its current value i io_uring: fix short read slow path sqeq.off here is the offset to read within the disk image, so obviously not 'nread' (the amount we just read), but as the author meant to write its current value incremented by the amount we just read. Normally recent versions of linux will not issue short reads, but it can happen so we should fix this. This lead to weird image corruptions when short read happened Fixes: 6663a0a33764 ("block/io_uring: implements interfaces for io_uring") Link: https://lkml.kernel.org/r/YrrFGO4A1jS0GI0G@atmark-techno.com Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com> Message-Id: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> show more ...
12