#
752863bd |
| 17-Apr-2024 |
Christoph Hellwig <hch@lst.de> |
block: propagate partition scanning errors to the BLKRRPART ioctl
Commit 4601b4b130de ("block: reopen the device in blkdev_reread_part") lost the propagation of I/O errors from the low-level read of
block: propagate partition scanning errors to the BLKRRPART ioctl
Commit 4601b4b130de ("block: reopen the device in blkdev_reread_part") lost the propagation of I/O errors from the low-level read of the partition table to the user space caller of the BLKRRPART.
Apparently some user space relies on, so restore the propagation. This isn't exactly pretty as other block device open calls explicitly do not are about these errors, so add a new BLK_OPEN_STRICT_SCAN to opt into the error propagation.
Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part") Reported-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20240417144743.2277601-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
9617cd6f |
| 06-Apr-2024 |
Yu Kuai <yukuai3@huawei.com> |
block: fix module reference leakage from bdev_open_by_dev error path
At the time bdev_may_open() is called, module reference is grabbed already, hence module reference should be released if bdev_may
block: fix module reference leakage from bdev_open_by_dev error path
At the time bdev_may_open() is called, module reference is grabbed already, hence module reference should be released if bdev_may_open() failed.
This problem is found by code review.
Fixes: ed5cc702d311 ("block: Add config option to not allow writing to mounted devices") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240406090930.2252838-22-yukuai1@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
22650a99 |
| 26-Mar-2024 |
Christian Brauner <brauner@kernel.org> |
fs,block: yield devices early
Currently a device is only really released once the umount returns to userspace due to how file closing works. That ultimately could cause an old umount assumption to b
fs,block: yield devices early
Currently a device is only really released once the umount returns to userspace due to how file closing works. That ultimately could cause an old umount assumption to be violated that concurrent umount and mount don't fail. So an exclusively held device with a temporary holder should be yielded before the filesystem is gone. Add a helper that allows callers to do that. This also allows us to remove the two holder ops that Linus wasn't excited about.
Link: https://lore.kernel.org/r/20240326-vfs-bdev-end_holder-v1-1-20af85202918@kernel.org Fixes: f3a608827d1f ("bdev: open block device as files") # mainline only Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
3ff56e28 |
| 23-Mar-2024 |
Christian Brauner <brauner@kernel.org> |
block: count BLK_OPEN_RESTRICT_WRITES openers
The original changes in v6.8 do allow for a block device to be reopened with BLK_OPEN_RESTRICT_WRITES provided the same holder is used as per bdev_may_o
block: count BLK_OPEN_RESTRICT_WRITES openers
The original changes in v6.8 do allow for a block device to be reopened with BLK_OPEN_RESTRICT_WRITES provided the same holder is used as per bdev_may_open(). I think this has a bug.
The first opener @f1 of that block device will set bdev->bd_writers to -1. The second opener @f2 using the same holder will pass the check in bdev_may_open() that bdev->bd_writers must not be greater than zero.
The first opener @f1 now closes the block device and in bdev_release() will end up calling bdev_yield_write_access() which calls bdev_writes_blocked() and sets bdev->bd_writers to 0 again.
Now @f2 holds a file to that block device which was opened with exclusive write access but bdev->bd_writers has been reset to 0.
So now @f3 comes along and succeeds in opening the block device with BLK_OPEN_WRITE betraying @f2's request to have exclusive write access.
This isn't a practical issue yet because afaict there's no codepath inside the kernel that reopenes the same block device with BLK_OPEN_RESTRICT_WRITES but it will be if there is.
Fix this by counting the number of BLK_OPEN_RESTRICT_WRITES openers. So we only allow writes again once all BLK_OPEN_RESTRICT_WRITES openers are done.
Link: https://lore.kernel.org/r/20240323-abtauchen-klauen-c2953810082d@brauner Fixes: ed5cc702d311 ("block: Add config option to not allow writing to mounted devices") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
ddd65e19 |
| 23-Mar-2024 |
Christian Brauner <brauner@kernel.org> |
block: handle BLK_OPEN_RESTRICT_WRITES correctly
Last kernel release we introduce CONFIG_BLK_DEV_WRITE_MOUNTED. By default this option is set. When it is set the long-standing behavior of being able
block: handle BLK_OPEN_RESTRICT_WRITES correctly
Last kernel release we introduce CONFIG_BLK_DEV_WRITE_MOUNTED. By default this option is set. When it is set the long-standing behavior of being able to write to mounted block devices is enabled.
But in order to guard against unintended corruption by writing to the block device buffer cache CONFIG_BLK_DEV_WRITE_MOUNTED can be turned off. In that case it isn't possible to write to mounted block devices anymore.
A filesystem may open its block devices with BLK_OPEN_RESTRICT_WRITES which disallows concurrent BLK_OPEN_WRITE access. When we still had the bdev handle around we could recognize BLK_OPEN_RESTRICT_WRITES because the mode was passed around. Since we managed to get rid of the bdev handle we changed that logic to recognize BLK_OPEN_RESTRICT_WRITES based on whether the file was opened writable and writes to that block device are blocked. That logic doesn't work because we do allow BLK_OPEN_RESTRICT_WRITES to be specified without BLK_OPEN_WRITE.
Fix the detection logic and use an FMODE_* bit. We could've also abused O_EXCL as an indicator that BLK_OPEN_RESTRICT_WRITES has been requested. For userspace open paths O_EXCL will never be retained but for internal opens where we open files that are never installed into a file descriptor table this is fine. But it would be a gamble that this doesn't cause bugs. Note that BLK_OPEN_RESTRICT_WRITES is an internal only flag that cannot directly be raised by userspace. It is implicitly raised during mounting.
Passes xftests and blktests with CONFIG_BLK_DEV_WRITE_MOUNTED set and unset.
Link: https://lore.kernel.org/r/ZfyyEwu9Uq5Pgb94@casper.infradead.org Link: https://lore.kernel.org/r/20240323-zielbereich-mittragen-6fdf14876c3e@brauner Fixes: 321de651fa56 ("block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access") Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
59a55a63 |
| 14-Mar-2024 |
Christian Brauner <brauner@kernel.org> |
fs,block: get holder during claim
Now that we open block devices as files we need to deal with the realities that closing is a deferred operation. An operation on the block device such as e.g., free
fs,block: get holder during claim
Now that we open block devices as files we need to deal with the realities that closing is a deferred operation. An operation on the block device such as e.g., freeze, thaw, or removal that runs concurrently with umount, tries to acquire a stable reference on the holder. The holder might already be gone though. Make that reliable by grabbing a passive reference to the holder during bdev_open() and releasing it during bdev_release().
Fixes: f3a608827d1f ("bdev: open block device as files") # mainline only Reported-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/r/ZfEQQ9jZZVes0WCZ@infradead.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@infradead.org> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reported-by: https://lore.kernel.org/r/CAHj4cs8tbDwKRwfS1=DmooP73ysM__xAb2PQc6XsAmWR+VuYmg@mail.gmail.com Link: https://lore.kernel.org/r/20240315-freibad-annehmbar-ca68c375af91@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
ab838b3f |
| 23-Jan-2024 |
Christian Brauner <brauner@kernel.org> |
block: remove bdev_handle completely
We just need to use the holder to indicate whether a block device open was exclusive or not. We did use to do that before but had to give that up once we switche
block: remove bdev_handle completely
We just need to use the holder to indicate whether a block device open was exclusive or not. We did use to do that before but had to give that up once we switched to struct bdev_handle. Before struct bdev_handle we only stashed stuff in file->private_data if this was an exclusive open but after struct bdev_handle we always set file->private_data to a struct bdev_handle and so we had to use bdev_handle->mode or bdev_handle->holder. Now that we don't use struct bdev_handle anymore we can revert back to the old behavior.
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-32-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
321de651 |
| 23-Jan-2024 |
Christian Brauner <brauner@kernel.org> |
block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access
Make it possible to detected a block device that was opened with restricted write access based only on BLK_OPEN_WRITE and bde
block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access
Make it possible to detected a block device that was opened with restricted write access based only on BLK_OPEN_WRITE and bdev->bd_writers < 0 so we won't have to claim another FMODE_* flag.
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-31-adbd023e19cc@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
7c09a4ed |
| 23-Jan-2024 |
Christian Brauner <brauner@kernel.org> |
bdev: remove bdev pointer from struct bdev_handle
We can always go directly via:
* I_BDEV(bdev_file->f_inode) * I_BDEV(bdev_file->f_mapping->host)
So keeping struct bdev in struct bdev_handle is r
bdev: remove bdev pointer from struct bdev_handle
We can always go directly via:
* I_BDEV(bdev_file->f_inode) * I_BDEV(bdev_file->f_mapping->host)
So keeping struct bdev in struct bdev_handle is redundant.
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-30-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
a56aefca |
| 23-Jan-2024 |
Christian Brauner <brauner@kernel.org> |
bdev: make struct bdev_handle private to the block layer
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-29-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-b
bdev: make struct bdev_handle private to the block layer
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-29-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
b1211a25 |
| 23-Jan-2024 |
Christian Brauner <brauner@kernel.org> |
bdev: make bdev_{release, open_by_dev}() private to block layer
Move both of them to the private block header. There's no caller in the tree anymore that uses them directly.
Link: https://lore.kern
bdev: make bdev_{release, open_by_dev}() private to block layer
Move both of them to the private block header. There's no caller in the tree anymore that uses them directly.
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-28-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
e97d06a4 |
| 23-Jan-2024 |
Christian Brauner <brauner@kernel.org> |
bdev: remove bdev_open_by_path()
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-27-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz>
bdev: remove bdev_open_by_path()
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-27-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
f3a60882 |
| 08-Feb-2024 |
Christian Brauner <brauner@kernel.org> |
bdev: open block device as files
Add two new helpers to allow opening block devices as files. This is not the final infrastructure. This still opens the block device before opening a struct a file.
bdev: open block device as files
Add two new helpers to allow opening block devices as files. This is not the final infrastructure. This still opens the block device before opening a struct a file. Until we have removed all references to struct bdev_handle we can't switch the order:
* Introduce blk_to_file_flags() to translate from block specific to flags usable to pen a new file. * Introduce bdev_file_open_by_{dev,path}(). * Introduce temporary sb_bdev_handle() helper to retrieve a struct bdev_handle from a block device file and update places that directly reference struct bdev_handle to rely on it. * Don't count block device openes against the number of open files. A bdev_file_open_by_{dev,path}() file is never installed into any file descriptor table.
One idea that came to mind was to use kernel_tmpfile_open() which would require us to pass a path and it would then call do_dentry_open() going through the regular fops->open::blkdev_open() path. But then we're back to the problem of routing block specific flags such as BLK_OPEN_RESTRICT_WRITES through the open path and would have to waste FMODE_* flags every time we add a new one. With this we can avoid using a flag bit and we have more leeway in how we open block devices from bdev_open_by_{dev,path}().
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-1-adbd023e19cc@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
82c6515d |
| 24-Feb-2024 |
Chengming Zhou <zhouchengming@bytedance.com> |
bdev: remove SLAB_MEM_SPREAD flag usage
The SLAB_MEM_SPREAD flag is already a no-op as of 6.8-rc1, remove its usage so we can delete it from slab. No functional change.
Signed-off-by: Chengming Zho
bdev: remove SLAB_MEM_SPREAD flag usage
The SLAB_MEM_SPREAD flag is already a no-op as of 6.8-rc1, remove its usage so we can delete it from slab. No functional change.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Link: https://lore.kernel.org/r/20240224134646.829105-1-chengming.zhou@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
8ff363ad |
| 24-Dec-2023 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
block: Fix a memory leak in bdev_open_by_dev()
If we early exit here, 'handle' needs to be freed, or some memory leaks.
Fixes: ed5cc702d311 ("block: Add config option to not allow writing to mounte
block: Fix a memory leak in bdev_open_by_dev()
If we early exit here, 'handle' needs to be freed, or some memory leaks.
Fixes: ed5cc702d311 ("block: Add config option to not allow writing to mounted devices") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/8eaec334781e695810aaa383b55de00ca4ab1352.1703439383.git.christophe.jaillet@wanadoo.fr Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
1898efcd |
| 25-Oct-2023 |
Christoph Hellwig <hch@lst.de> |
block: update the stable_writes flag in bdev_add
Propagate the per-queue stable_write flags into each bdev inode in bdev_add. This makes sure devices that require stable writes have it set for I/O o
block: update the stable_writes flag in bdev_add
Propagate the per-queue stable_write flags into each bdev inode in bdev_add. This makes sure devices that require stable writes have it set for I/O on the block device node as well.
Note that this doesn't cover the case of a flag changing on a live device yet. We should handle that as well, but I plan to cover it as part of a more general rework of how changing runtime paramters on block devices works.
Fixes: 1cb039f3dc16 ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag") Reported-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231025141020.192413-3-hch@lst.de Tested-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
ed5cc702 |
| 01-Nov-2023 |
Jan Kara <jack@suse.cz> |
block: Add config option to not allow writing to mounted devices
Writing to mounted devices is dangerous and can lead to filesystem corruption as well as crashes. Furthermore syzbot comes with more
block: Add config option to not allow writing to mounted devices
Writing to mounted devices is dangerous and can lead to filesystem corruption as well as crashes. Furthermore syzbot comes with more and more involved examples how to corrupt block device under a mounted filesystem leading to kernel crashes and reports we can do nothing about. Add tracking of writers to each block device and a kernel cmdline argument which controls whether other writeable opens to block devices open with BLK_OPEN_RESTRICT_WRITES flag are allowed. We will make filesystems use this flag for used devices.
Note that this effectively only prevents modification of the particular block device's page cache by other writers. The actual device content can still be modified by other means - e.g. by issuing direct scsi commands, by doing writes through devices lower in the storage stack (e.g. in case loop devices, DM, or MD are involved) etc. But blocking direct modifications of the block device page cache is enough to give filesystems a chance to perform data validation when loading data from the underlying storage and thus prevent kernel crashes.
Syzbot can use this cmdline argument option to avoid uninteresting crashes. Also users whose userspace setup does not need writing to mounted block devices can set this option for hardening.
Link: https://lore.kernel.org/all/60788e5d-5c7c-1142-e554-c21d709acfd9@linaro.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231101174325.10596-3-jack@suse.cz Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
cd34758c |
| 01-Nov-2023 |
Jan Kara <jack@suse.cz> |
block: Remove blkdev_get_by_*() functions
blkdev_get_by_*() and blkdev_put() functions are now unused. Remove them.
Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> L
block: Remove blkdev_get_by_*() functions
blkdev_get_by_*() and blkdev_put() functions are now unused. Remove them.
Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231101174325.10596-2-jack@suse.cz Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
49ef8832 |
| 27-Sep-2023 |
Christian Brauner <brauner@kernel.org> |
bdev: implement freeze and thaw holder operations
The old method of implementing block device freeze and thaw operations required us to rely on get_active_super() to walk the list of all superblocks
bdev: implement freeze and thaw holder operations
The old method of implementing block device freeze and thaw operations required us to rely on get_active_super() to walk the list of all superblocks on the system to find any superblock that might use the block device. This is wasteful and not very pleasant overall.
Now that we can finally go straight from block device to owning superblock things become way simpler.
Link: https://lore.kernel.org/r/20231024-vfs-super-freeze-v2-5-599c19f4faac@kernel.org Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
fbcb8f39 |
| 24-Oct-2023 |
Christian Brauner <brauner@kernel.org> |
bdev: surface the error from sync_blockdev()
When freeze_super() is called, sync_filesystem() will be called which calls sync_blockdev() and already surfaces any errors. Do the same for block device
bdev: surface the error from sync_blockdev()
When freeze_super() is called, sync_filesystem() will be called which calls sync_blockdev() and already surfaces any errors. Do the same for block devices that aren't owned by a superblock and also for filesystems that don't call sync_blockdev() internally but implicitly rely on bdev_freeze() to do it.
Link: https://lore.kernel.org/r/20231024-vfs-super-freeze-v2-3-599c19f4faac@kernel.org Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
982c3b30 |
| 24-Oct-2023 |
Christian Brauner <brauner@kernel.org> |
bdev: rename freeze and thaw helpers
We have bdev_mark_dead() etc and we're going to move block device freezing to holder ops in the next patch. Make the naming consistent:
* freeze_bdev() -> bdev_
bdev: rename freeze and thaw helpers
We have bdev_mark_dead() etc and we're going to move block device freezing to holder ops in the next patch. Make the naming consistent:
* freeze_bdev() -> bdev_freeze() * thaw_bdev() -> bdev_thaw()
Also document the return code.
Link: https://lore.kernel.org/r/20231024-vfs-super-freeze-v2-2-599c19f4faac@kernel.org Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
6e57236e |
| 17-Oct-2023 |
Christoph Hellwig <hch@lst.de> |
block: move bdev_mark_dead out of disk_check_media_change
disk_check_media_change is mostly called from ->open where it makes little sense to mark the file system on the device as dead, as we are ju
block: move bdev_mark_dead out of disk_check_media_change
disk_check_media_change is mostly called from ->open where it makes little sense to mark the file system on the device as dead, as we are just opening it. So instead of calling bdev_mark_dead from disk_check_media_change move it into the few callers that are not in an open instance. This avoid calling into bdev_mark_dead and thus taking s_umount with open_mutex held.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231017184823.1383356-4-hch@lst.de Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
fd146410 |
| 18-Oct-2023 |
Jan Kara <jack@suse.cz> |
fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock
The implementation of bdev holder operations such as fs_bdev_mark_dead() and fs_bdev_sync() grab sb->s_umount semaphore under bdev->bd_hold
fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock
The implementation of bdev holder operations such as fs_bdev_mark_dead() and fs_bdev_sync() grab sb->s_umount semaphore under bdev->bd_holder_lock. This is problematic because it leads to disk->open_mutex -> sb->s_umount lock ordering which is counterintuitive (usually we grab higher level (e.g. filesystem) locks first and lower level (e.g. block layer) locks later) and indeed makes lockdep complain about possible locking cycles whenever we open a block device while holding sb->s_umount semaphore. Implement a function bdev_super_lock_shared() which safely transitions from holding bdev->bd_holder_lock to holding sb->s_umount on alive superblock without introducing the problematic lock dependency. We use this function fs_bdev_sync() and fs_bdev_mark_dead().
Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231018152924.3858-1-jack@suse.cz Link: https://lore.kernel.org/r/20231017184823.1383356-1-hch@lst.de Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
841dd789 |
| 27-Sep-2023 |
Jan Kara <jack@suse.cz> |
block: Use bdev_open_by_dev() in blkdev_open()
Convert blkdev_open() to use bdev_open_by_dev(). To be able to propagate handle from blkdev_open() to blkdev_release() we need to stop using existence
block: Use bdev_open_by_dev() in blkdev_open()
Convert blkdev_open() to use bdev_open_by_dev(). To be able to propagate handle from blkdev_open() to blkdev_release() we need to stop using existence of file->private_data to determine exclusive block device opens. Use bdev_handle->mode for this purpose since file->f_flags isn't usable for this (O_EXCL is cleared from the flags during open).
Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230927093442.25915-2-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
e719b4d1 |
| 27-Sep-2023 |
Jan Kara <jack@suse.cz> |
block: Provide bdev_open_* functions
Create struct bdev_handle that contains all parameters that need to be passed to blkdev_put() and provide bdev_open_* functions that return this structure instea
block: Provide bdev_open_* functions
Create struct bdev_handle that contains all parameters that need to be passed to blkdev_put() and provide bdev_open_* functions that return this structure instead of plain bdev pointer. This will eventually allow us to pass one more argument to blkdev_put() (renamed to bdev_release()) without too much hassle.
Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230927093442.25915-1-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|