#
dc6e0ae5 |
| 25-Apr-2024 |
Kemeng Shi <shikemeng@huaweicloud.com> |
mm: remove stale comment __folio_mark_dirty
__folio_mark_dirty() no longer marks the inode dirty. Remove the stale comment about it.
Link: https://lkml.kernel.org/r/20240425131724.36778-5-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Howard Cochran <hcochran@kernelspring.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
3b3e412e |
| 25-Apr-2024 |
Kemeng Shi <shikemeng@huaweicloud.com> |
mm: call __wb_calc_thresh instead of wb_calc_thresh in wb_over_bg_thresh
Call __wb_calc_thresh to calculate wb bg_thresh of gdtc in wb_over_bg_thresh to remove unnecessary wrap in wb_calc_thresh.
Link: https://lkml.kernel.org/r/20240425131724.36778-4-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Howard Cochran <hcochran@kernelspring.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fabd2e42 |
| 25-Apr-2024 |
Kemeng Shi <shikemeng@huaweicloud.com> |
mm: correct calculation of wb's bg_thresh in cgroup domain
wb_calc_thresh() is calculating wb's share of bg_thresh in the global domain. However in case of cgroup writeback this is not the right thing to do. Consider the following domain hierarchy:
                global domain (> 20G)
                /                 \
          cgroup1 (10G)     cgroup2 (10G)
                |                 |
bdi            wb1               wb2
and assume wb1 and wb2 have the same bandwidth and the background threshold is set at 10%. The bg_thresh of cgroup1 and cgroup2 is going to be 1G. Now because wb_calc_thresh(mdtc->wb, mdtc->bg_thresh) calculates per-wb threshold in the global domain as (wb bandwidth) / (domain bandwidth) it returns bg_thresh for wb1 as 0.5G although it has nobody to compete against in cgroup1.
Fix the problem by calculating wb's share of bg_thresh in the cgroup domain.
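The arithmetic in the example above can be sketched in userspace C. This is only an illustration of the proportional-share reasoning in the changelog, not the kernel's implementation; `wb_share()`, `CGROUP_BG_THRESH_MB`, and the bandwidth figures are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* bg_thresh of cgroup1: 10% of 10G, expressed in MB (hypothetical constant). */
#define CGROUP_BG_THRESH_MB (10 * 1024 / 10)

/*
 * Hypothetical helper: a wb's share of a domain threshold, proportional to
 * (wb bandwidth) / (domain bandwidth), as the changelog describes it.
 */
static uint64_t wb_share(uint64_t thresh_mb, uint64_t wb_bw, uint64_t domain_bw)
{
	assert(domain_bw != 0);	/* a domain without bandwidth has no shares */
	return thresh_mb * wb_bw / domain_bw;
}
```

With equal bandwidths of 100 (arbitrary units), `wb_share(CGROUP_BG_THRESH_MB, 100, 200)` models the global-domain calculation, where wb1 is compared against wb1 plus wb2, and yields 512 MB (0.5G). `wb_share(CGROUP_BG_THRESH_MB, 100, 100)` models the cgroup-domain calculation, where wb1 has no competitor, and yields the full 1024 MB.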
Test as follows:
  /* make it easier to observe the issue */
  echo 300000 > /proc/sys/vm/dirty_expire_centisecs
  echo 100 > /proc/sys/vm/dirty_writeback_centisecs

  /* run fio in wb1 */
  cd /sys/fs/cgroup
  echo "+memory +io" > cgroup.subtree_control
  mkdir group1
  cd group1
  echo 10G > memory.high
  echo 10G > memory.max
  echo $$ > cgroup.procs
  mkfs.ext4 -F /dev/vdb
  mount /dev/vdb /bdi1/
  fio -name test -filename=/bdi1/file -size=600M -ioengine=libaio -bs=4K \
      -iodepth=1 -rw=write -direct=0 --time_based -runtime=600 -invalidate=0

  /* run fio in wb2 with a new shell */
  cd /sys/fs/cgroup
  mkdir group2
  cd group2
  echo 10G > memory.high
  echo 10G > memory.max
  echo $$ > cgroup.procs
  mkfs.ext4 -F /dev/vdc
  mount /dev/vdc /bdi2/
  fio -name test -filename=/bdi2/file -size=600M -ioengine=libaio -bs=4K \
      -iodepth=1 -rw=write -direct=0 --time_based -runtime=600 -invalidate=0
Before the fix, the written pages of wb1 and wb2 reported by tools/writeback/wb_monitor.py keep growing. After the fix, few written pages accumulate. There is no obvious change in the fio result.
[jack@suse.cz: changelog rewording] Link: https://lkml.kernel.org/r/20240425131724.36778-3-shikemeng@huaweicloud.com Fixes: 74d369443325 ("writeback: Fix performance regression in wb_over_bg_thresh()") Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Howard Cochran <hcochran@kernelspring.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
13fc4412 |
| 25-Apr-2024 |
Kemeng Shi <shikemeng@huaweicloud.com> |
mm: enable __wb_calc_thresh to calculate dirty background threshold
Patch series "Fix and cleanups to page-writeback", v2.
This series contains some random cleanups and a fix to correct calculation of wb's bg_thresh in cgroup domain. More details can be found in the respective patches.
This patch (of 4):
Originally, __wb_calc_thresh always calculated wb's share of the dirty throttling threshold. By getting the thresh of wb_domain from the caller, __wb_calc_thresh can be used for both the dirty throttling threshold and the dirty background threshold.
This is a preparation to correct threshold calculation of wb in cgroup.
Link: https://lkml.kernel.org/r/20240425131724.36778-1-shikemeng@huaweicloud.com Link: https://lkml.kernel.org/r/20240425131724.36778-2-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Howard Cochran <hcochran@kernelspring.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
826881a7 |
| 23-Apr-2024 |
Kemeng Shi <shikemeng@huaweicloud.com> |
writeback: rename nr_reclaimable to nr_dirty in balance_dirty_pages
Commit 8d92890bd6b85 ("mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead") removed NR_UNSTABLE_NFS, so nr_reclaimable now only contains dirty pages. Rename nr_reclaimable to nr_dirty accordingly.
Link: https://lkml.kernel.org/r/20240423034643.141219-6-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Brian Foster <bfoster@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
4b5bbc39 |
| 23-Apr-2024 |
Kemeng Shi <shikemeng@huaweicloud.com> |
writeback: support retrieving per group debug writeback stats of bdi
Add /sys/kernel/debug/bdi/xxx/wb_stats to show per group writeback stats of bdi.
Following domain hierarchy is tested:
                global domain (320G)
                /                 \
        cgroup domain1(10G)  cgroup domain2(10G)
                |                 |
bdi            wb1               wb2

/* per wb writeback info of bdi is collected */
cat wb_stats
WbCgIno:                    1
WbWriteback:                0 kB
WbReclaimable:              0 kB
WbDirtyThresh:              0 kB
WbDirtied:                  0 kB
WbWritten:                  0 kB
WbWriteBandwidth:      102400 kBps
b_dirty:                    0
b_io:                       0
b_more_io:                  0
b_dirty_time:               0
state:                      1

WbCgIno:                 4091
WbWriteback:             1792 kB
WbReclaimable:         820512 kB
WbDirtyThresh:        6004692 kB
WbDirtied:            1820448 kB
WbWritten:             999488 kB
WbWriteBandwidth:      169020 kBps
b_dirty:                    0
b_io:                       0
b_more_io:                  1
b_dirty_time:               0
state:                      5

WbCgIno:                 4131
WbWriteback:             1120 kB
WbReclaimable:         820064 kB
WbDirtyThresh:        6004728 kB
WbDirtied:            1822688 kB
WbWritten:            1002400 kB
WbWriteBandwidth:      153520 kBps
b_dirty:                    0
b_io:                       0
b_more_io:                  1
b_dirty_time:               0
state:                      5
[shikemeng@huaweicloud.com: fix build problems] Link: https://lkml.kernel.org/r/20240423034643.141219-4-shikemeng@huaweicloud.com Link: https://lkml.kernel.org/r/20240423034643.141219-3-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Brian Foster <bfoster@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
3d84d897 |
| 16-Apr-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
doc: improve the description of __folio_mark_dirty
Patch series "Improve buffer head documentation", v3.
Turn buffer head documentation into its own document, and make many general improvements to the docs. Obviously there is much more that could be done. Tested with make htmldocs.
This patch (of 8):
I've learned why it's safe to call __folio_mark_dirty() from mark_buffer_dirty() without holding the folio lock, so update the description to explain why.
Link: https://lkml.kernel.org/r/20240416031754.4076917-1-willy@infradead.org Link: https://lkml.kernel.org/r/20240416031754.4076917-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
5a550a0c |
| 18-Mar-2024 |
David Howells <dhowells@redhat.com> |
mm: Export writeback_iter()
Export writeback_iter() so that it can be used by netfslib as a module.
Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Christoph Hellwig <hch@lst.de> cc: linux-mm@kvack.org
|
#
7998df0b |
| 28-Mar-2024 |
Joel Granados <j.granados@samsung.com> |
memory: remove the now superfluous sentinel element from ctl_table array
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels), which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information: https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/).
Remove sentinel from all files under mm/ that register a sysctl table.
Link: https://lkml.kernel.org/r/20240328-jag-sysctl_remset_misc-v1-1-47c1463b3af2@samsung.com Signed-off-by: Joel Granados <j.granados@samsung.com> Reviewed-by: Muchun Song <muchun.song@linux.dev> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
c44ed5b7 |
| 15-Feb-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
writeback: remove a use of write_cache_pages() from do_writepages()
Use the new writeback_iter() directly instead of indirecting through a callback.
[hch@lst.de: ported to the while based iter style] Link: https://lkml.kernel.org/r/20240215063649.2164017-15-hch@lst.de Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
cdc150b5 |
| 15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
writeback: add a writeback iterator
Refactor the code left in write_cache_pages into an iterator that the file system can call to get the next folio for a writeback operation:
	struct folio *folio = NULL;

	while ((folio = writeback_iter(mapping, wbc, folio, &error))) {
		error = <do per-folio writeback>;
	}
The twist here is that the error value is passed by reference, so that the iterator can restore it when breaking out of the loop.
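The pass-by-reference error convention can be tried out in a small userspace mock. Everything here is hypothetical (`struct item`, `struct iter_state`, `mock_iter()`, `write_all()`); the mock only imitates the calling pattern described above and simplifies the policy to "remember the first error, keep iterating, restore it at the end", which is not necessarily what `writeback_iter()` does in every writeback mode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct item {			/* stands in for struct folio */
	int id;
	int fail;		/* non-zero: the per-item write fails */
};

struct iter_state {		/* stands in for mapping + wbc */
	struct item *items;
	size_t count;
	size_t next;
	int saved_error;
};

/* Mock iterator: returns the next item, or NULL once the walk is finished. */
static struct item *mock_iter(struct iter_state *st, struct item *prev, int *error)
{
	assert(st && error);
	if (!prev) {
		/* First call: reset the cursor and the remembered error. */
		st->next = 0;
		st->saved_error = 0;
	} else if (*error && !st->saved_error) {
		/* Remember the first error reported for the previous item. */
		st->saved_error = *error;
	}
	if (st->next < st->count)
		return &st->items[st->next++];
	/* End of the walk: hand the remembered error back through *error. */
	*error = st->saved_error;
	return NULL;
}

/* Caller side: same loop shape as in the changelog. */
static int write_all(struct iter_state *st)
{
	struct item *it = NULL;
	int error = 0;

	while ((it = mock_iter(st, it, &error)))
		error = it->fail ? -EIO : 0;	/* per-item "writeback" */
	return error;
}
```

Because the loop body overwrites `error` on every pass, a later success would hide an earlier failure if the iterator did not write the saved value back through the pointer before returning NULL.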
Handling of the magic AOP_WRITEPAGE_ACTIVATE value stays outside the iterator and is just kept in the write_cache_pages legacy wrapper, in preparation for eventually killing it off.
Heavily based on a for_each* based iterator from Matthew Wilcox.
Link: https://lkml.kernel.org/r/20240215063649.2164017-14-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a2cbc136 |
| 15-Feb-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
writeback: move the folio_prepare_writeback loop out of write_cache_pages()
Move the loop for should-we-write-this-folio to writeback_get_folio.
[hch@lst.de: fold loop into existing helper instead of a separate one per Jan] Link: https://lkml.kernel.org/r/20240215063649.2164017-13-hch@lst.de Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e6d0ab87 |
| 15-Feb-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
writeback: use the folio_batch queue iterator
Instead of keeping our own local iterator variable, use the one just added to folio_batch.
Link: https://lkml.kernel.org/r/20240215063649.2164017-12-hch@lst.de Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
807d1fe3 |
| 15-Feb-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
writeback: simplify the loops in write_cache_pages()
Collapse the two nested loops into one. This is needed as a step towards turning this into an iterator.
Note that this drops the "index <= end" check in the previous outer loop and just relies on filemap_get_folios_tag() to return 0 entries when index > end. This actually has a subtle implication when end == -1, because then the returned index will be -1 as well, and thus if there is a page present at index -1, we could be looping indefinitely. But as the comment in filemap_get_folios_tag documents this as already broken anyway, we should not worry about it here either. The fix for that would probably be a change to the filemap_get_folios_tag() calling convention.
[hch@lst.de: update the commit log per Jan] Link: https://lkml.kernel.org/r/20240215063649.2164017-10-hch@lst.de Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
751e0d55 |
| 15-Feb-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
writeback: factor writeback_get_batch() out of write_cache_pages()
This simple helper will be the basis of the writeback iterator. To make this work, we need to remember the current index and end positions in writeback_control.
[hch@lst.de: heavily rebased, add helpers to get the tag and end index, don't keep the end index in struct writeback_control] Link: https://lkml.kernel.org/r/20240215063649.2164017-9-hch@lst.de Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
b1793929 |
| 15-Feb-2024 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
writeback: factor folio_prepare_writeback() out of write_cache_pages()
Reduce write_cache_pages() by about 30 lines; much of it is commentary, but it all bundles nicely into an obvious function.
[hch@lst.de: rename should_writeback_folio to folio_prepare_writeback per Jan] Link: https://lkml.kernel.org/r/20240215063649.2164017-8-hch@lst.de Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
f946e0d2 |
| 15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
writeback: rework the loop termination condition in write_cache_pages
Rework the way we deal with the cleanup after the writepage call.
First handle the magic AOP_WRITEPAGE_ACTIVATE separately from real error returns to get it out of the way of the actual error handling path.
Then split the handling into integrity vs non-integrity branches first, and return early using a goto for the non-integrity early loop condition to remove the need for the done and done_index local variables, and for assigning the error to ret when we can just return error directly.
Link: https://lkml.kernel.org/r/20240215063649.2164017-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
5d899d43 |
| 15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
writeback: only update ->writeback_index for range_cyclic writeback
mapping->writeback_index is only [1] used as the starting point for range_cyclic writeback, so there is no point in updating it for other types of writeback.
[1] except for btrfs_defrag_file which does really odd things with mapping->writeback_index. But btrfs doesn't use write_cache_pages at all, so this isn't relevant here.
Link: https://lkml.kernel.org/r/20240215063649.2164017-6-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
98103258 |
| 15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
writeback: also update wbc->nr_to_write on writeback failure
When exiting write_cache_pages early due to a non-integrity write failure, wbc->nr_to_write currently doesn't account for the folio we just failed to write. This doesn't matter because the callers always ignore the value on a failure, but moving the update to common code will allow the code to be simplified, so do it.
Link: https://lkml.kernel.org/r/20240215063649.2164017-5-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a02829f0 |
| 15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
writeback: fix done_index when hitting the wbc->nr_to_write
When write_cache_pages finishes writing out a folio, it fails to update done_index to account for the number of pages in the folio just written. That means when range_cyclic writeback is restarted, it will be restarted at this folio instead of after it as it should. Fix that by updating done_index before breaking out of the loop.
Link: https://lkml.kernel.org/r/20240215063649.2164017-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
6768907e |
| 15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
writeback: don't call mapping_set_error on AOP_WRITEPAGE_ACTIVATE
Patch series "convert write_cache_pages() to an iterator", v8.
This is an evolution of the series Matthew Wilcox originally sent in June 2023, which has changed quite a bit since and now has a while based iterator.
This patch (of 14):
mapping_set_error should only be called on 0 returns (which it ignores) or a negative error code.
writepage_cb ends up being able to call mapping_set_error on the magic AOP_WRITEPAGE_ACTIVATE return value from ->writepage, which means success but the caller needs to unlock the page. Ignore that and just call mapping_set_error on negative errors.
(no fixes tag as this goes back more than 20 years over various renames and refactors so I've given up chasing down the original introduction)
Link: https://lkml.kernel.org/r/20240215063649.2164017-1-hch@lst.de Link: https://lkml.kernel.org/r/20240215063649.2164017-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Brian Foster <bfoster@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
f814bdda |
| 23-Jan-2024 |
Jan Kara <jack@suse.cz> |
blk-wbt: Fix detection of dirty-throttled tasks
The detection of dirty-throttled tasks in blk-wbt has been subtly broken since its beginning in 2016. Namely, if we are doing cgroup writeback and the throttled task is not in the root cgroup, balance_dirty_pages() will set dirty_sleep for the non-root bdi_writeback structure. However, blk-wbt checks dirty_sleep only in the root cgroup bdi_writeback structure. Thus detection of recently throttled tasks is not working in this case (we noticed this when we switched to cgroup v2 and suddenly writeback was slow).
Since blk-wbt has no easy way to get to proper bdi_writeback and furthermore its intention has always been to work on the whole device rather than on individual cgroups, just move the dirty_sleep timestamp from bdi_writeback to backing_dev_info. That fixes the checking for recently throttled task and saves memory for everybody as a bonus.
CC: stable@vger.kernel.org Fixes: b57d74aff9ab ("writeback: track if we're sleeping on progress in balance_dirty_pages()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20240123175826.21452-1-jack@suse.cz [axboe: fixup indentation errors] Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
9319b647 |
| 18-Jan-2024 |
Zach O'Keefe <zokeefe@google.com> |
mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again
(struct dirty_throttle_control *)->thresh is an unsigned long, but is passed as the u32 divisor argument to div_u64(). On architectures where unsigned long is 64 bits, the argument will be implicitly truncated.
Use div64_u64() instead of div_u64() so that the value used in the "is this a safe division" check is the same as the divisor.
Also, remove redundant cast of the numerator to u64, as that should happen implicitly.
This would be difficult to exploit in memcg domain, given the ratio-based arithmetic domain_dirty_limits() uses, but is much easier in global writeback domain with a BDI_CAP_STRICTLIMIT-backing device, using e.g. vm.dirty_bytes=(1<<32)*PAGE_SIZE so that dtc->thresh == (1<<32).
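The truncation can be reproduced in userspace. The helpers below are hypothetical stand-ins that only mirror the argument widths of the kernel's div_u64() (u32 divisor) and div64_u64() (u64 divisor); they are a sketch of the failure mode, not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in with div64_u64()'s argument widths: u64 dividend, u64 divisor. */
static uint64_t div64_u64_sketch(uint64_t dividend, uint64_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/*
 * What a u32 divisor parameter (as in div_u64()) receives when handed a
 * 64-bit threshold: only the low 32 bits survive the implicit conversion.
 */
static uint32_t as_u32_divisor(uint64_t thresh)
{
	return (uint32_t)thresh;
}
```

For `thresh = 1ULL << 32`, a check such as `thresh != 0` passes, yet `as_u32_divisor(thresh)` is 0, so the division that follows the check would divide by zero. Dividing with the full-width divisor, as in `div64_u64_sketch(1ULL << 40, 1ULL << 32)`, uses the same value that was checked and returns 256.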
Link: https://lkml.kernel.org/r/20240118181954.1415197-1-zokeefe@google.com Fixes: f6789593d5ce ("mm/page-writeback.c: fix divide by zero in bdi_dirty_limits()") Signed-off-by: Zach O'Keefe <zokeefe@google.com> Cc: Maxim Patlasov <MPatlasov@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
fa151a39 |
| 19-Dec-2023 |
Jingbo Xu <jefflexu@linux.alibaba.com> |
mm: fix arithmetic for max_prop_frac when setting max_ratio
Since bdi->max_ratio is now in parts per million, fix the wrong arithmetic for max_prop_frac when setting max_ratio. Otherwise the miscalculated max_prop_frac will affect the incrementing of the writeout completion count when max_ratio is not 100%.
Link: https://lkml.kernel.org/r/20231219142508.86265-3-jefflexu@linux.alibaba.com Fixes: efc3e6ad53ea ("mm: split off __bdi_set_max_ratio() function") Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Stefan Roesch <shr@devkernel.io> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
e0646b75 |
| 19-Dec-2023 |
Jingbo Xu <jefflexu@linux.alibaba.com> |
mm: fix arithmetic for bdi min_ratio
Since bdi->min_ratio is now in parts per million, fix the wrong arithmetic. Otherwise setting a reasonable min_ratio fails with -EINVAL, because it tries to set min_ratio to (min_ratio * BDI_RATIO_SCALE) in percentage units, which always exceeds 100%.
    # cat /sys/class/bdi/253\:0/min_ratio
    0
    # cat /sys/class/bdi/253\:0/max_ratio
    100
    # echo 1 > /sys/class/bdi/253\:0/min_ratio
    -bash: echo: write error: Invalid argument
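The unit mismatch behind the failing `echo 1` can be sketched as follows. This is a guess at the shape of the problem based only on the changelog: `RATIO_SCALE` is assumed to have the value of BDI_RATIO_SCALE (taken here as 10000, i.e. percent to parts per million), and `check_percent()` and `check_ppm()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <errno.h>

#define RATIO_SCALE 10000u	/* assumed value of BDI_RATIO_SCALE */

/* Range check written for a value expressed in percent. */
static int check_percent(unsigned int percent)
{
	return percent > 100 ? -EINVAL : 0;
}

/* Range check written for a value expressed in parts per million. */
static int check_ppm(unsigned int ppm)
{
	return ppm > 100 * RATIO_SCALE ? -EINVAL : 0;
}
```

A request for 1% scaled to parts per million is `1 * RATIO_SCALE`, which is 10000. Passing that to a check that still expects percent rejects it with -EINVAL, whereas a check in the same unit as the value accepts it.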
Link: https://lkml.kernel.org/r/20231219142508.86265-2-jefflexu@linux.alibaba.com Fixes: 8021fb3232f2 ("mm: split off __bdi_set_min_ratio() function") Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Stefan Roesch <shr@devkernel.io> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|