#
d9e5d310 |
| 12-Dec-2023 |
Amir Goldstein <amir73il@gmail.com> |
fsnotify: optionally pass access range in file permission hooks
In preparation for pre-content permission events with file access range, move fsnotify_file_perm() hook out of security_file_permission() and into the callers.
Callers that have the access range information call the new hook fsnotify_file_area_perm() with the access range.
Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231212094440.250945-6-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
705bcfcb |
| 12-Dec-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: use splice_copy_file_range() inline helper
generic_copy_file_range() is just a wrapper around splice_file_range(), which caps the maximum copy length.
The only caller of splice_file_range(), namely __ceph_copy_file_range(), is already prepared to cope with a short copy.
Move the length capping into splice_file_range() and replace the exported symbol generic_copy_file_range() with a simple inline helper.
Suggested-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/linux-fsdevel/20231204083849.GC32438@lst.de/ Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231212094440.250945-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
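The length capping and the "short copy" handling mentioned above can be pictured with a small user-space model. This is only a sketch: `toy_capped_copy()`, `toy_copy_all()` and the `TOY_MAX_CHUNK` value are hypothetical names chosen for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define TOY_MAX_CHUNK 4096	/* hypothetical cap, not the kernel's limit */

/* Copies at most TOY_MAX_CHUNK bytes per call, like a helper that caps len. */
static ssize_t toy_capped_copy(char *dst, const char *src, size_t len)
{
	assert(dst && src);
	if (len > TOY_MAX_CHUNK)
		len = TOY_MAX_CHUNK;
	memcpy(dst, src, len);
	return (ssize_t)len;
}

/* A caller that tolerates short copies simply loops until it is done. */
static ssize_t toy_copy_all(char *dst, const char *src, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = toy_capped_copy(dst + done, src + done, len - done);

		if (n <= 0)
			return done ? (ssize_t)done : n;
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

With the cap inside the helper, every caller has to be written like `toy_copy_all()`, which is the property the commit message says the ceph caller already has.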
|
#
0f292086 |
| 12-Dec-2023 |
Amir Goldstein <amir73il@gmail.com> |
splice: return type ssize_t from all helpers
Not sure why some splice helpers return long, maybe historic reasons. Change them all to return ssize_t to conform to the splice methods and to the rest of the helpers.
Suggested-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20231208-horchen-helium-d3ec1535ede5@brauner/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231212094440.250945-2-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
73065126 |
| 30-Nov-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: use do_splice_direct() for nfsd/ksmbd server-side-copy
nfsd/ksmbd call vfs_copy_file_range() with flag COPY_FILE_SPLICE to perform kernel copy between two files on any two filesystems.
Splicing the input file while holding file_start_write() on the output file, which is on a different sb, poses a risk of fanotify-related deadlocks.
We only need to call splice_file_range() from within the context of ->copy_file_range() filesystem methods with file_start_write() held.
To avoid the possible deadlocks, always use do_splice_direct() instead of splice_file_range() for the kernel copy fallback in vfs_copy_file_range() without holding file_start_write().
Reported-and-tested-by: Bert Karwatzki <spasswolf@web.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231130141624.3338942-4-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
da40448c |
| 30-Nov-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: move file_start_write() into direct_splice_actor()
The callers of do_splice_direct() hold file_start_write() on the output file.
This may cause file permission hooks to be called indirectly on an overlayfs lower layer, which is on the same filesystem as the output file, and could lead to a deadlock with fanotify permission events.
To fix this potential deadlock, move file_start_write() from the callers into the direct_splice_actor(), so file_start_write() will not be held while splicing from the input file.
Suggested-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20231128214258.GA2398475@perftesting/ Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231130141624.3338942-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
488e8f68 |
| 30-Nov-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: fork splice_file_range() from do_splice_direct()
In preparation for calling do_splice_direct() without file_start_write() held, create a new helper splice_file_range(), to be called from the context of ->copy_file_range() methods instead of do_splice_direct().
Currently, the only difference is that splice_file_range() does not take flags argument and that it asserts that file_start_write() is held, but we factor out a common helper do_splice_direct_actor() that will be used later.
Use the new helper from __ceph_copy_file_range(), which was incorrectly passing the copy flags argument to do_splice_direct() as splice flags. The value of copy flags in ceph is always 0, so it is a semantic bug fix.
Move the declaration of both helpers to linux/splice.h.
Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231130141624.3338942-2-amir73il@gmail.com Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
3d5cd491 |
| 22-Nov-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: create file_write_started() helper
Convenience wrapper for sb_write_started(file_inode(inode)->i_sb), which has a single occurrence in the code right now.
Document the false negatives of those helpers, which makes them unusable to assert that sb_start_write() is not held.
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231122122715.2561213-16-amir73il@gmail.com Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
6ae65439 |
| 22-Nov-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: move kiocb_start_write() into vfs_iocb_iter_write()
In vfs code, sb_start_write() is usually called after the permission hook in rw_verify_area(). vfs_iocb_iter_write() is an exception to this rule, where kiocb_start_write() is called by its callers.
Move kiocb_start_write() from the callers into vfs_iocb_iter_write() after the rw_verify_area() checks, to make them "start-write-safe".
The semantics of vfs_iocb_iter_write() are changed so that the caller is responsible for calling kiocb_end_write() on completion only if an async iocb was queued. The completion handlers of both callers were adapted to this semantic change.
This is needed for fanotify "pre content" events.
Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231122122715.2561213-14-amir73il@gmail.com Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
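The ownership rule described in this message (the helper starts the write, and the caller ends it only when the I/O was queued asynchronously) can be modelled in user space. This is a toy sketch under stated assumptions: every `toy_*` name is hypothetical, `TOY_EIOCBQUEUED` stands in for the kernel-internal "queued" return code, and the counter replaces real freeze protection.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_EIOCBQUEUED 529	/* stand-in for the kernel-internal code */

struct toy_kiocb {
	int write_started;	/* models the freeze-protection reference */
};

static void toy_start_write(struct toy_kiocb *k) { k->write_started++; }
static void toy_end_write(struct toy_kiocb *k)   { k->write_started--; }

typedef long (*toy_write_fn)(struct toy_kiocb *);

/* Helper owns start_write; it also ends the write unless I/O was queued. */
static long toy_iocb_iter_write(struct toy_kiocb *k, toy_write_fn do_write)
{
	long ret;

	assert(k != NULL && do_write != NULL);
	/* Permission checks would run here, before the write is started. */
	toy_start_write(k);
	ret = do_write(k);
	if (ret != -TOY_EIOCBQUEUED)
		toy_end_write(k);	/* completed synchronously */
	return ret;
}

/* Caller's completion handler: ends the write for the queued case only. */
static void toy_complete(struct toy_kiocb *k) { toy_end_write(k); }

static long toy_sync_write(struct toy_kiocb *k)  { (void)k; return 42; }
static long toy_async_write(struct toy_kiocb *k) { (void)k; return -TOY_EIOCBQUEUED; }
```

In this model a synchronous write leaves the counter balanced on return, while a queued write leaves it held until `toy_complete()` runs.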
|
#
b8e1425b |
| 16-Jul-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: move permission hook out of do_iter_read()
We recently moved fsnotify hook, rw_verify_area() and other checks from do_iter_write() out to its two callers.
For consistency, do the same thing for do_iter_read() - move the rw_verify_area() checks and fsnotify hook to the callers vfs_iter_read() and vfs_readv().
This aligns those vfs helpers with the pattern used in vfs_read() and vfs_iocb_iter_read() and the vfs write helpers, where all the checks are in the vfs helpers and the do_* or call_* helpers do the work.
This is needed for fanotify "pre content" events.
Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231122122715.2561213-13-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
1c8aa833 |
| 16-Jul-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: move permission hook out of do_iter_write()
In many of the vfs helpers, the rw_verify_area() checks are called before taking sb_start_write(), making them "start-write-safe". do_iter_write() is an exception to this rule.
do_iter_write() has two callers - vfs_iter_write() and vfs_writev(). Move rw_verify_area() and other checks from do_iter_write() out to its callers to make them "start-write-safe".
Also move the fsnotify_modify() hook to align with the similar pattern used in vfs_write() and other vfs helpers.
This is needed for fanotify "pre content" events.
Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231122122715.2561213-12-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
269aed70 |
| 22-Nov-2023 |
Amir Goldstein <amir73il@gmail.com> |
fs: move file_start_write() into vfs_iter_write()
All the callers of vfs_iter_write() call file_start_write() just before calling vfs_iter_write() except for target_core_file's fd_do_rw().
Move file_start_write() from the callers into vfs_iter_write(). fd_do_rw() calls vfs_iter_write() with a non-regular file, so file_start_write() is a no-op.
This is needed for fanotify "pre content" events.
Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20231122122715.2561213-11-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
89cbd4c0 |
| 11-Aug-2023 |
Yang Li <yang.lee@linux.alibaba.com> |
fs: Fix one kernel-doc comment
Fix one kernel-doc comment to silence the warning:
fs/read_write.c:88: warning: Function parameter or member 'maxsize' not described in 'generic_file_llseek_size'
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Message-Id: <20230811014359.4960-1-yang.lee@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
2cb1e089 |
| 22-May-2023 |
David Howells <dhowells@redhat.com> |
splice: Use filemap_splice_read() instead of generic_file_splice_read()
Replace pointers to generic_file_splice_read() with calls to filemap_splice_read().
Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Al Viro <viro@zeniv.linux.org.uk> cc: David Hildenbrand <david@redhat.com> cc: John Hubbard <jhubbard@nvidia.com> cc: linux-mm@kvack.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20230522135018.2742245-29-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
95e49cf8 |
| 29-Mar-2023 |
Jens Axboe <axboe@kernel.dk> |
iov_iter: add iter_iov_addr() and iter_iov_len() helpers
These just return the address and length of the current iovec segment in the iterator. Convert existing iov_iter_iovec() users to use them instead of getting a copy of the current vec.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
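What "address and length of the current iovec segment" means can be shown with a user-space model built on `struct iovec`. This is a sketch only: `struct toy_iter` and the `toy_iter_*` helpers are hypothetical stand-ins, and the assumption that the iterator tracks an offset into its current segment is part of the model rather than something this message spells out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Toy stand-in for an iovec-backed iterator; only the fields needed here. */
struct toy_iter {
	const struct iovec *iov;	/* current segment */
	size_t iov_offset;		/* bytes already consumed from it */
};

/* Address of the first unconsumed byte in the current segment. */
static inline void *toy_iter_addr(const struct toy_iter *it)
{
	return (char *)it->iov->iov_base + it->iov_offset;
}

/* Number of bytes still available in the current segment. */
static inline size_t toy_iter_len(const struct toy_iter *it)
{
	assert(it->iov_offset <= it->iov->iov_len);
	return it->iov->iov_len - it->iov_offset;
}
```

Reading the two values through accessors like these avoids building a temporary copy of the segment just to look at its base and length.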
|
#
de4eda9d |
| 16-Sep-2022 |
Al Viro <viro@zeniv.linux.org.uk> |
use less confusing names for iov_iter direction initializers
READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way.
Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
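The naming point can be illustrated with a toy iterator whose direction says what the buffer is (a destination or a source of data), not what the caller does with it. This is a sketch: the `toy_*` names and the numeric enum values are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Direction names mirror the commit; the values here are illustrative. */
enum toy_dir { TOY_ITER_DEST = 0, TOY_ITER_SOURCE = 1 };

struct toy_iter {
	enum toy_dir dir;
	char *buf;
	size_t len;
};

/* Data may only be copied INTO an iterator that is a destination. */
static int toy_copy_to_iter(struct toy_iter *it, const void *src, size_t n)
{
	assert(it != NULL && src != NULL);
	if (it->dir != TOY_ITER_DEST || n > it->len)
		return -1;
	memcpy(it->buf, src, n);
	return 0;
}

/* Data may only be copied OUT OF an iterator that is a source. */
static int toy_copy_from_iter(struct toy_iter *it, void *dst, size_t n)
{
	assert(it != NULL && dst != NULL);
	if (it->dir != TOY_ITER_SOURCE || n > it->len)
		return -1;
	memcpy(dst, it->buf, n);
	return 0;
}
```

A read(2)-style path would build a `TOY_ITER_DEST` iterator over the user buffer, and a write(2)-style path would build a `TOY_ITER_SOURCE` one.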
|
#
10bc8e4a |
| 17-Nov-2022 |
Amir Goldstein <amir73il@gmail.com> |
vfs: fix copy_file_range() averts filesystem freeze protection
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range().
To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to generic_copy_file_range() was added in nfsd and ksmbd code, but that call is missing sb_start_write(), fsnotify hooks and more.
Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that will take care of the fallback, but that code would be subtle and we got vfs_copy_file_range() logic wrong too many times already.
Instead, add a flag to explicitly request vfs_copy_file_range() to perform only generic_copy_file_range() and let nfsd and ksmbd use this flag only in the fallback path.
This choice keeps the logic changes to a minimum in the non-nfsd/ksmbd code paths to reduce the risk of further regressions.
Fixes: 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") Tested-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
06bbaa6d |
| 26-Sep-2022 |
Al Viro <viro@zeniv.linux.org.uk> |
[coredump] don't use __kernel_write() on kmap_local_page()
Passing the kmap_local_page() result to __kernel_write() is unsafe - a random ->write_iter() might (and the 9p one does) get unhappy when passed ITER_KVEC with a pointer that came from kmap_local_page().
Fix by providing a variant of __kernel_write() that takes an iov_iter from caller (__kernel_write() becomes a trivial wrapper) and adding dump_emit_page() that parallels dump_emit(), except that instead of __kernel_write() it uses __kernel_write_iter() with ITER_BVEC source.
Fixes: 3159ed57792b "fs/coredump: use kmap_local_page()" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
3e20a751 |
| 22-May-2022 |
Al Viro <viro@zeniv.linux.org.uk> |
switch new_sync_{read,write}() to ITER_UBUF
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
80175539 |
| 23-Jun-2022 |
Stefan Roesch <shr@fb.com> |
fs: add a FMODE_BUF_WASYNC flags for f_mode
This introduces the flag FMODE_BUF_WASYNC. If devices support async buffered writes, this flag can be set. It also modifies the check in generic_write_checks to take async buffered writes into consideration.
Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20220623175157.1715274-8-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
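One way to read "take async buffered writes into consideration" is as a predicate over the request flags and the file mode. The sketch below assumes that a non-blocking write is accepted either for direct I/O or when the file has opted in to async buffered writes; that logic, the `toy_*` name and the `TOY_*` bit values are assumptions of this example and are not taken from the commit text.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bit values for this sketch; the kernel's values differ. */
#define TOY_IOCB_NOWAIT		(1u << 0)
#define TOY_IOCB_DIRECT		(1u << 1)
#define TOY_FMODE_BUF_WASYNC	(1u << 0)

/*
 * Assumed rule: a NOWAIT write needs either direct I/O or a file that
 * advertises async buffered write support; otherwise it is rejected.
 */
static int toy_nowait_write_check(unsigned int ki_flags, unsigned int f_mode)
{
	bool nowait = ki_flags & TOY_IOCB_NOWAIT;
	bool direct = ki_flags & TOY_IOCB_DIRECT;
	bool buf_wasync = f_mode & TOY_FMODE_BUF_WASYNC;

	if (nowait && !direct && !buf_wasync)
		return -EINVAL;
	return 0;
}
```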
|
#
bdeb77bc |
| 17-Jul-2022 |
Andrei Vagin <avagin@gmail.com> |
fs: sendfile handles O_NONBLOCK of out_fd
sendfile has to return EAGAIN if out_fd is nonblocking and the write into it would block.
Here is a small reproducer for the problem:
    #define _GNU_SOURCE /* See feature_test_macros(7) */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <errno.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <sys/sendfile.h>

    #define FILE_SIZE (1UL << 30)
    int main(int argc, char **argv) {
            int p[2], fd;

            if (pipe2(p, O_NONBLOCK))
                    return 1;

            fd = open(argv[1], O_RDWR | O_TMPFILE, 0666);
            if (fd < 0)
                    return 1;
            ftruncate(fd, FILE_SIZE);

            if (sendfile(p[1], fd, 0, FILE_SIZE) == -1) {
                    fprintf(stderr, "FAIL\n");
            }
            if (sendfile(p[1], fd, 0, FILE_SIZE) != -1 || errno != EAGAIN) {
                    fprintf(stderr, "FAIL\n");
            }
            return 0;
    }
It worked before b964bf53e540, it is stuck after b964bf53e540, and it works again with this fix.
This regression occurred because do_splice_direct() calls pipe_write that handles O_NONBLOCK. Here is a trace log from the reproducer:
 1)                |  __x64_sys_sendfile64() {
 1)                |    do_sendfile() {
 1)                |      __fdget()
 1)                |      rw_verify_area()
 1)                |      __fdget()
 1)                |      rw_verify_area()
 1)                |      do_splice_direct() {
 1)                |        rw_verify_area()
 1)                |        splice_direct_to_actor() {
 1)                |          do_splice_to() {
 1)                |            rw_verify_area()
 1)                |            generic_file_splice_read()
 1)    + 74.153 us |          }
 1)                |          direct_splice_actor() {
 1)                |            iter_file_splice_write() {
 1)                |              __kmalloc()
 1)       0.148 us |              pipe_lock();
 1)       0.153 us |              splice_from_pipe_next.part.0();
 1)       0.162 us |              page_cache_pipe_buf_confirm();
 ... 16 times
 1)       0.159 us |              page_cache_pipe_buf_confirm();
 1)                |              vfs_iter_write() {
 1)                |                do_iter_write() {
 1)                |                  rw_verify_area()
 1)                |                  do_iter_readv_writev() {
 1)                |                    pipe_write() {
 1)                |                      mutex_lock()
 1)       0.153 us |                      mutex_unlock();
 1)       1.368 us |                    }
 1)       1.686 us |                  }
 1)       5.798 us |                }
 1)       6.084 us |              }
 1)       0.174 us |              kfree();
 1)       0.152 us |              pipe_unlock();
 1)    + 14.461 us |            }
 1)    + 14.783 us |          }
 1)       0.164 us |          page_cache_pipe_buf_release();
 ... 16 times
 1)       0.161 us |          page_cache_pipe_buf_release();
 1)                |          touch_atime()
 1)    + 95.854 us |        }
 1)    + 99.784 us |      }
 1)   ! 107.393 us |    }
 1)   ! 107.699 us |  }
Link: https://lkml.kernel.org/r/20220415005015.525191-1-avagin@gmail.com Fixes: b964bf53e540 ("teach sendfile(2) to handle send-to-pipe directly") Signed-off-by: Andrei Vagin <avagin@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
868941b1 |
| 29-Jun-2022 |
Jason A. Donenfeld <Jason@zx2c4.com> |
fs: remove no_llseek
Now that all callers of ->llseek are going through vfs_llseek(), we don't gain anything by keeping no_llseek around. Nothing actually calls it and setting ->llseek to no_llseek is completely equivalent to leaving it NULL.
Longer term (== by the end of merge window) we want to remove all such initializations. To simplify the merge window this commit does *not* touch initializers - it only defines no_llseek as NULL (and simplifies the tests on file opening).
At -rc1 we'll need to do a mechanical removal of no_llseek -
    git grep -l -w no_llseek | grep -v porting.rst | while read i; do
            sed -i '/\<no_llseek\>/d' $i
    done
would do it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
4e3299ea |
| 29-Jun-2022 |
Jason A. Donenfeld <Jason@zx2c4.com> |
fs: do not compare against ->llseek
Now vfs_llseek() can simply check for FMODE_LSEEK; if it's set, we know that ->llseek() won't be NULL and if it's not we should just fail with -ESPIPE.
A couple of other places where we used to check for special values of ->llseek() (somewhat inconsistently) switched to checking FMODE_LSEEK.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
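The -ESPIPE result mentioned above is visible from user space as `lseek(2)` failing with `errno == ESPIPE` on a file description that cannot seek, such as a pipe. A minimal sketch follows; `seek_error()` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if fd is seekable, otherwise the errno reported by lseek(2). */
static int seek_error(int fd)
{
	assert(fd >= 0);
	if (lseek(fd, 0, SEEK_CUR) == (off_t)-1)
		return errno;
	return 0;
}
```

Calling `seek_error()` on either end of a pipe created with `pipe(2)` is expected to return `ESPIPE`.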
|
#
868f9f2f |
| 30-Jun-2022 |
Amir Goldstein <amir73il@gmail.com> |
vfs: fix copy_file_range() regression in cross-fs copies
A regression has been reported by Nicolas Boichat, found while using the copy_file_range syscall to copy a tracefs file.
Before commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices") the kernel would return -EXDEV to userspace when trying to copy a file across different filesystems. After this commit, the syscall doesn't fail anymore and instead returns zero (zero bytes copied), as this file's content is generated on-the-fly and thus reports a size of zero.
Another regression has been reported by He Zhe - the assertion of WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when copying from a sysfs file whose read operation may return -EOPNOTSUPP.
Since we do not have test coverage for copy_file_range() between any two types of filesystems, the best way to avoid this sort of issue in the future is for the kernel to be more picky about filesystems that are allowed to do copy_file_range().
This patch restores some cross-filesystem copy restrictions that existed prior to commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices"), namely, cross-sb copy is not allowed for filesystems that do not implement ->copy_file_range().
Filesystems that do implement ->copy_file_range() have full control of the result - if this method returns an error, the error is returned to the user. Before this change this was only true for fs that did not implement the ->remap_file_range() operation (i.e. nfsv3).
Filesystems that do not implement ->copy_file_range() still fall-back to the generic_copy_file_range() implementation when the copy is within the same sb. This helps the kernel maintain a more consistent story about which filesystems support copy_file_range().
nfsd and ksmbd servers are modified to fall-back to the generic_copy_file_range() implementation in case vfs_copy_file_range() fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of server-side-copy.
Fall-back to generic_copy_file_range() is not implemented for the smb operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct change of behavior.
Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices") Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/ Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/ Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/ Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/ Reported-by: Nicolas Boichat <drinkcat@chromium.org> Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Luis Henriques <lhenriques@suse.de> Fixes: 64bf5ff58dff ("vfs: no fallback for ->copy_file_range") Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/ Reported-by: He Zhe <zhe.he@windriver.com> Tested-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
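From user space, the restriction restored by this commit means `copy_file_range(2)` can again fail with `EXDEV` for cross-filesystem copies, so a caller that wants the copy to succeed anyway needs its own fallback. The following is a minimal sketch, not code from the patch: `copy_range()` and `copy_by_read_write()` are hypothetical helpers, and also falling back on `EOPNOTSUPP`, `ENOSYS` and `EINVAL` is a choice made for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes copy_file_range() in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain read/write loop, used when the kernel declines the in-kernel copy. */
static ssize_t copy_by_read_write(int in, int out, size_t len)
{
	char buf[4096];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t n = read(in, buf, want);

		if (n < 0)
			return -1;
		if (n == 0)
			break;		/* end of input */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(out, buf + off, (size_t)(n - off));

			if (w < 0)
				return -1;
			off += w;
		}
		done += (size_t)n;
	}
	return (ssize_t)done;
}

/* Try copy_file_range(2) first; fall back only if nothing was copied yet. */
static ssize_t copy_range(int in, int out, size_t len)
{
	size_t done = 0;

	assert(in >= 0 && out >= 0);
	while (done < len) {
		ssize_t n = copy_file_range(in, NULL, out, NULL, len - done, 0);

		if (n < 0) {
			int refused = errno == EXDEV || errno == EOPNOTSUPP ||
				      errno == ENOSYS || errno == EINVAL;

			if (refused && done == 0)
				return copy_by_read_write(in, out, len);
			return -1;
		}
		if (n == 0)
			break;		/* end of input */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

Both helpers use the current file offsets of the descriptors, so the caller positions `in` and `out` before calling `copy_range()`.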
|
#
59c10c52 |
| 05-Apr-2022 |
Guo Ren <guoren@linux.alibaba.com> |
riscv: compat: syscall: Add compat_sys_call_table implementation
Implement compat sys_call_table and some system call functions: truncate64, ftruncate64, fallocate, pread64, pwrite64, sync_file_range, readahead, fadvise64_64 which need argument translation.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20220405071314.3225832-12-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
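The "argument translation" these compat calls need comes from a 32-bit task passing each 64-bit value (a file offset or length) as two 32-bit halves. A sketch of the recombination step is below; `toy_glue_u64()` is a hypothetical name, and which register carries which half is an assumption of this example rather than something stated in the commit.

```c
#include <stdint.h>

/* Rebuild a 64-bit value from the two 32-bit halves passed by a compat task. */
static inline uint64_t toy_glue_u64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

A compat wrapper for a call such as pread64 would apply this to the two offset halves and then hand the 64-bit result to the native implementation.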
|
#
f6f7a25a |
| 12-Aug-2021 |
Omar Sandoval <osandov@fb.com> |
fs: export variant of generic_write_checks without iov_iter
Encoded I/O in Btrfs needs to check a write with a given logical size without an iov_iter that matches that size (because the iov_iter we have is for the compressed data). So, factor out the parts of generic_write_checks() that don't need an iov_iter into a new generic_write_checks_count() function and export that.
Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|