History log of /dragonfly/sys/vfs/hammer2/hammer2_flush.c (Results 1 – 25 of 137)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# a13468b0 21-Dec-2023 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove obsolete comments on chain statistics

Remove comments for removed code from
b3659de2a6ee73b51bf3edb4babfb4653134813f in 2014.


# 34fb48c2 20-Oct-2023 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Multitude of SMP contention fixes, work on flush

* Change the hammer2_io RBTREE to a hash table with per-entry locks.
This reduces contention in the hammer2 block I/O subsystem which
u

hammer2 - Multitude of SMP contention fixes, work on flush

* Change the hammer2_io RBTREE to a hash table with per-entry locks.
This reduces contention in the hammer2 block I/O subsystem which
used to be protected by a single lock.

* Change the hammer2_inode RBTREE to a hash table with per-entry locks.
This reduces contention in the hammer2 inode cache which used to be
protected by a single lock.

* Replace the hammer2_chain LRU cache with a per-inode cluster cache,
which caches the last cluster-related chain. These caches are designed
to hold a deep chain with 0 refs (and thus its parent recursion) to
avoid having to reconstitute and recheck the chains on every VOP. For
example when doing sequential I/O on a file.

Probably needs more work.

* Use the new trigger_syncer_start() and trigger_syncer_end() API
to fix flush waits when the frontend is be asked to do large bulk
modifying operations (such as file creation).

The old code still worked but could sometimes cause processes to pause
for up to 30 seconds when the flush wait raced the syncer. The flush
wait wound up waiting for the next filesystem sync.

show more ...


# 73da1719 19-Jul-2023 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Use HAMMER2_VOLUME_BYTES for volume header size

Both HAMMER2_PBUFSIZE and HAMMER2_VOLUME_BYTES are 64KiB,
but HAMMER2_VOLUME_BYTES should be used for volume header size
when explici

sys/vfs/hammer2: Use HAMMER2_VOLUME_BYTES for volume header size

Both HAMMER2_PBUFSIZE and HAMMER2_VOLUME_BYTES are 64KiB,
but HAMMER2_VOLUME_BYTES should be used for volume header size
when explicitly getting / reading a volume header block.
It's been mix of these two.

show more ...


# 523dfb54 25-Jan-2023 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove -Wunused-but-set-variable local variables

Warned on Linux user space.


Revision tags: v6.4.0, v6.4.0rc1, v6.5.0, v6.2.2, v6.2.1, v6.2.0, v6.3.0, v6.0.1
# b70cecb7 04-Jul-2021 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: HAMMER2_CHAIN_BMAP* should be HAMMER2_CHAIN_BLKMAP*

The freemap code uses "bmap" and "BMAP" for bitmap,
so these two macros should be "BLKMAP" as the comment implies.


# 55e28d18 21-Jun-2021 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove if0'd HAMMER2_TRANS_PREFLUSH

This no longer exists since 2085215738c03d949e60de63843cb91e84836eb9 in 2016.


# fb431ac4 21-Jun-2021 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove unused FLUSH_DEBUG

No longer used since 8138a154be31c3db1d8bd046ca7b003a6c79c01c in 2014.


# 556042ea 13-Jun-2021 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove unused hammer2_flush_info::debug

No longer used since 850d3f60f0a03e7a3f08357489acac749d6224ca in 2017.
Remove the entire "hammer2_debug & 0x200" code around this as well.


# c12cfc4a 11-Jun-2021 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove unused (add used) header includes


Revision tags: v6.0.0, v6.0.0rc1, v6.1.0
# 0b738157 25-Dec-2020 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Add initial multi-volumes support for HAMMER2

This commit adds initial multi-volumes support for HAMMER2. Maximum
supported volumes is 64. The feature and implementation is similar

sys/vfs/hammer2: Add initial multi-volumes support for HAMMER2

This commit adds initial multi-volumes support for HAMMER2. Maximum
supported volumes is 64. The feature and implementation is similar to
multi-volumes support in HAMMER1.

1. ondisk changes
=================
This commit bumps volume header version from 1 to 2, and adds four new
volume header fields using reserved fields in version 1. Other ondisk
structures are unchanged.
* "volu_id" - volume id from 0 to 63, where 0 represents root volume.
* "nvolumes" - number of volumes. All volumes have same the same value.
* "total_size" - sum of "volu_size" in volumes. All volumes have the
same value.
* "volu_loff[HAMMER2_MAX_VOLUMES]" - A 512 bytes table which contains
start offset of max 64 volumes within "total_size". All volumes have
the same value.

Version 1 volume header has 0 for above fields, so HAMMER2 internally
treats "nvolumes" as 1, and "total_size" as "volu_size" to be able to
handle version 1 and 2 transparently.

All volumes have 4 headers, but only root volume ones are relevant.
Non-root volume headers have their own unique "volu_id" and "volu_size",
but other fields are unimportant and never used. Non-root volume headers
have sroot blockset[i] whose type is HAMMER2_BREF_TYPE_INVALID. Non-root
volume headers don't have boot/aux area, so freemap area start from
offset 0. Non-root volume headers are readonly and never updated after
creation. This means non-root volumes are just extra storage to extend
fs size and internally make up a single virtual volume whose size is
"total_size".

It currently doesn't automatically upgrade an existing version 1 fs to
version 2. Only newly created fs becomes version 2 for now.

2. volumes layout
=================
Basically similar to HAMMER1. A first block device argument provided for
newfs_hammer2(8) becomes the root volume, and if specified remaining
devices extend "total_size" as non-root volumes. All volumes except for
the last one have 1GiB (freemap level1) aligned "volu_size".

This means each volume's start offset within "total_size" is also 1GiB
(freemap level1) aligned. The start offsets of volumes are stored in
volu_loff[HAMMER2_MAX_VOLUMES]. Each volu_loff[n] (0 <= n < nvolumes)
represents start offset of volume n within "total_size". Unused volumes
have -1 for volu_loff[n].
e.g. If a fs consists of 1 volume, volu_loff[0] has 0 and rests have -1.
e.g. If a fs consists of 3 volumes, x GiB root volume, y GiB volume,
and z GiB volume, volu_loff[0] has 0, volu_loff[1] has x, volu_loff[2]
has x+y, and rests have -1.

Low level I/O function in HAMMER2 uses this linear offsets table to
determine a device vnode to use and relative offset within the device
vnode, for a given blockref's "data_off". This is different from HAMMER1
where logical offset had embedded volume id bits (i.e. there were holes
in logical address space). HAMMER2 needs this table to support multi-
volumes without changing current logical offset mechanism.

Unless all volumes are specified and mountable, mount_hammer2(8) fails
like it failed in HAMMER1. This also applies to other userspace commands
which require volumes specification, except for fstyp(8).

3. userspace commands
=====================
Basically same as or similar to HAMMER1.
* newfs_hammer2(8) takes a list of block device paths as argv[].
* mount_hammer2(8) takes block device paths or names in "a:b:c:..."
format.
* hammer2(8) takes block device paths or names in "a:b:c:..." format for
directives which require volumes specification. This commit also adds
"volume-list" directive and an ioctl command HAMMER2IOC_VOLUME_LIST,
which are similar to the one in HAMMER1.
* fsck_hammer2(8) takes device paths or names in "a:b:c:..." format.
* fstyp(8) takes device paths in "path1:path2:path3:..." format.

4. limitations
==============
* hammer2(8) "info" directive ignores multi-volumes block devices.
* hammer2(8) "growfs" directive doesn't support multi-volumes fs.
* fstyp(8) is unable to find PFS label via -l option if the PFS inode or
its parent indirect blocks are located beyond root volume.
* hammer2(8) doesn't support "volume-add" and "volume-del" directives
which existed in HAMMER1, and there is currently no plan to support.

show more ...


# 1eb19191 14-Oct-2020 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Cleanup invalid blockref type switch cases


Revision tags: v5.8.3, v5.8.2
# a84aaf37 06-Sep-2020 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove unused HAMMER2_CHAIN_EMBEDDED

Appeared in 512beabd66d77a7cca8f73bf69030ffa90f0b9e3 in 2013,
but no longer used, only appears in KASSERT.


# eb227e4d 01-Sep-2020 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Remove Debugger() call for debugging in normal paths


Revision tags: v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3
# d0e99d5d 30-Jan-2020 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Fix inode & chain limits, improve flush pipeline.

* Reorganize VFS_MODIFYING() to avoid certain deadlock conditions and
adjust hammer2 to unconditionally stall in VFS_MODIFYING() when di

hammer2 - Fix inode & chain limits, improve flush pipeline.

* Reorganize VFS_MODIFYING() to avoid certain deadlock conditions and
adjust hammer2 to unconditionally stall in VFS_MODIFYING() when dirty
limits are exceeded.

Make sure VFS_MODIFYING() is called in all appropriate filesystem-
modifying paths.

This ensures that inode and chain structure allocation limits are
adhered to.

* Fix hammer2's wakeup code for the dirty inode count hystereis. This
fixes a situation where stalls due to excessive dirty inodes were waiting
a full second before resuming operation based on the dirty count
hysteresis.

The hysteresis now works as intended:

(1) Trigger a sync when the dirty count reache 50% N.
(2) Stall the frontend when the dirty count reaches 100% N.
(3) Resume the frontend when the diirty count drops to 66% N.

* Fix trigger_syncer() to guarantee that the syncer will flush the
filesystem ASAP when called. If the filesystem is already in a flush,
it will be flushed again.

Previously if the filesystem was already in a flush it would wait one
second before flushing again, which significantly reduces performance
under conditions where the dirty chain limit or the dirty inode limit is
constantly being hit (e.g. chown -R, etc).

Reported-by: tuxillo

show more ...


Revision tags: v5.6.2
# 628176c9 04-Aug-2019 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer2: Drop obsolete comments on chain

"dbtree", "dbq", "core_entry", "domodify" are all removed in around 2014.


Revision tags: v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3, v5.4.2, v5.4.1
# 5071e670 06-Dec-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - stabilization

* Adjust the chain lock a bit to bump lockcnt prior to acquiring
a non-blocking shared lock, instead of afterwords, which cleans
up a live-loop case in the chain unlock c

hammer2 - stabilization

* Adjust the chain lock a bit to bump lockcnt prior to acquiring
a non-blocking shared lock, instead of afterwords, which cleans
up a live-loop case in the chain unlock code.

* Cleanup misc debugging. Add some inode debugging code (default
disabled).

* Add some crash-dump (or live) debug utilities for tracking down
chains, dio's, and inodes. kgdb's macros are too slow for iterating
a million chains.

show more ...


# d2a41023 05-Dec-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 7/N

* Increase default caps for dirty chain and dirty inode counts. The
new SYNCQ semantics allow this number to be arbitrarily large, but
it is still a good

hammer2 - refactor filesystem sync 7/N

* Increase default caps for dirty chain and dirty inode counts. The
new SYNCQ semantics allow this number to be arbitrarily large, but
it is still a good idea not to allow it to get out of control.

NOTE: One advantage of higher caps is that it gives the frontend more
time to delete temporary files.

* Get rid of the old syncer speedup / write moderation mechanisms.
Replace with a new VFS_MODIFYING() hook that allows the filesystem
to implement moderation prior to any vnode locks being held.

Remove hammer2_pfs_memory_wait() calls from VOP bodies, implement
the hammer2_pfs_memory_wait() call via VFS_MODIFYING() instead.

* Move the moderation wakeup for the inode count to the syncer, and
change the parameter to use pmp->sideq_count.

show more ...


# d0755e6d 02-Dec-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 6/N

* Dependency tracking. Add modest cross-dependency grouping. This code
does not track dependencies in a graph. Instead, it simply groups
dependent inode

hammer2 - refactor filesystem sync 6/N

* Dependency tracking. Add modest cross-dependency grouping. This code
does not track dependencies in a graph. Instead, it simply groups
dependent inodes together. This means that dependency groups can get
rather large when, for example, lots of files are being created or
deleted in the same directory.

* We retain the excellent dynamic inode reordering code for the syncq.
When the frontend blocks on an inode that is in the syncq, the inode
will be reordered to the front of the queue to reduce the frontend
stall time as much as possible.

* Remove the COPYQ transaction flag and related sequencing.

* Fix flush sequencing for pmp->iroot. We must flush iroot's chains with
HAMMER2_XOP_FSSYNC last. When iroot is dirty, the out-of-order flush
of iroot that occurs before the final stage must be run without FSSYNC
set, otherwise iroot's pmp->pfs_iroot_blocksets[] will not be consistent
because the remaining inodes in the syncq haven't been flushed yet.

* Fix a broken syncer speedup conditional.

show more ...


Revision tags: v5.4.0, v5.5.0, v5.4.0rc1
# 6f445d15 13-Nov-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 4/N

* Save synchronized iroot blockmaps for snapshot code, and use them
in the snapshot code.

* Improve dependency handling and syncq/sideq flagging for
depen

hammer2 - refactor filesystem sync 4/N

* Save synchronized iroot blockmaps for snapshot code, and use them
in the snapshot code.

* Improve dependency handling and syncq/sideq flagging for
dependencies. Also improve the hammer2_inode_t reordering
code that allows the frontend to continue operating on dirty
inodes simultaneous with a filesystem sync.

* Move inode deletion into the filesystem sync code (in addition to
creation), for the same reason.

* Fix lost ref counts in the snapshot code which were causing umount
panics.

* Stabilization pass on volume flush code. Since flushes stop at
inode boundaries, we must properly flush the superroot before
flushing the volume header. That is, the flush sequence is:

- flush inodes for PFS (flushes inode content)
- flush PFS root inode (flushes through to inodes)
- flush superroot inode (flushes through to PFS root)
- flush volume header (flushes voulume header to superroot)

Theoretically this allows the filesystem asynchronously write data
and inodes flushed by the kernel's buffer cache and vnode code
concurrent with a filesystem flush without messing up filesystem
consistency, because these asynchronously flushed inodes are not
included (or have already been flushed) in the filesystem flush that
is already underway.

* Filesystem consistency still not perfect (using snapshot-debug
directive to test during heavy filesystem modification loads,
directory entries are sometimes desynchronized from their inodes).

show more ...


# 3e8408db 10-Nov-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 3/N

* Attempt to guarantee filesystem consistency at all sync points.

* Pretty severely hacked, and at the moment this can result in
syncs which never end if th

hammer2 - refactor filesystem sync 3/N

* Attempt to guarantee filesystem consistency at all sync points.

* Pretty severely hacked, and at the moment this can result in
syncs which never end if the filesystem is busy.

show more ...


# ecfe89b8 09-Nov-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 2/N

* Flesh out the flush partitioning code, fixing a number of issues.

* Refactor hammer2_inode_lock() and add hammer2_inode_lock4() to
interlock against flush

hammer2 - refactor filesystem sync 2/N

* Flesh out the flush partitioning code, fixing a number of issues.

* Refactor hammer2_inode_lock() and add hammer2_inode_lock4() to
interlock against flushes. This is handled by blocking inode locks
against SYNCQ, and reordering the inode to the front of the SYNCQ list
in order to unblock as quickly as possible as the filesystem sync
progresses. The result should be relatively few frontend stalls
during a filesystem sync.

* Disable resource caps for the moment, because synchronous
operations to prevent resource limits from blowing out break
the current inode_lock*() code and allow vnode deadlocks to
occur.

* To avoid deadlocks, the filesystem sync currently must clear SYNCQ
before locking the inode & vnode, and if it cannot lock a vnode it
must continue on with the next inode and then restart. Retried
vnodes introduce a short delay to give the frontend time to work
the blocking operation.

This is necessary because the kernel locks vnodes before entering the
H2 frontend, and we cannot safely unlock/relock them to work around
this. Nor do we necessarily even have full knowledge on which vnodes
the current thread has locked.

* Does not yet guarantee complete filesystem consistency on-crash.

show more ...


# 5afbe9d8 05-Nov-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 1/N

* Change H2 to allow concurrent filesystem sync and modifying
frontend operations.

* FLUSH transactions no longer block modifying frontend
transactions.

hammer2 - refactor filesystem sync 1/N

* Change H2 to allow concurrent filesystem sync and modifying
frontend operations.

* FLUSH transactions no longer block modifying frontend
transactions.

* Change filesystem sync operation to put all modified
inodes on the pmp->syncq (which is also combined with
any inodes on pmp->sideq), and then iterating the
syncq to flush each inode.

After this is done, stage 2 will flush the meta-data tree
leading to each inode.

This code will also handle delayed inode creation and
destruction ops, which require modifications to the meta-data
tree governing the inodes themselves (so we don't want the
frontend to do it and interfere with the flush).

* Modifying operations against inodes already queued for a
filesystem sync that is in progress will now reorder the
inode to the front of the filesystem sync in progress and
wait for the sync on that inode to complete before proceeding.

This is handled by blocking in the exclusive inode lock code.

* hammer2_inode_get() does not need to pass 'dip' any more
because regular inodes are inserted under the iroot (PFS root
inode) and no longer inserted hierarchically.

* Separate out hammer2_inode_create() into hammer2_inode_create_pfs()
and hammer2_inode_create_normal(). These two forms are now distinct
enough that the code is a mess if we try to leave them combined.

show more ...


# c4421f07 29-Jul-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Remote xop implementation part 1

* Normalize naming conventions for XOP functions.

* Change the XOP callback API to remove the hammer2_thread argument.
Pass the clindex and scratch buff

hammer2 - Remote xop implementation part 1

* Normalize naming conventions for XOP functions.

* Change the XOP callback API to remove the hammer2_thread argument.
Pass the clindex and scratch buffer in directly.

* Change the XOP API to pass in a function descriptor instead of a
function pointer, create prototypes for DMSG send/receive XOPs which
will be used for XOP components which are DMSG based and not
local-storage based.

* Adjust comments.

show more ...


Revision tags: v5.2.2
# 257c2728 24-May-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Fix kmalloc pool blowout on low-memory machines

* Fix a kmalloc pool blown that can occur on low-memory machines due
to too many disconnected hammer2_inode structures building up.

* Was

hammer2 - Fix kmalloc pool blowout on low-memory machines

* Fix a kmalloc pool blown that can occur on low-memory machines due
to too many disconnected hammer2_inode structures building up.

* Was previously fixed for things like rm -rf and bulk renames,
but not for setattr (aka chown/chmod -R ops).

Reported-by: gjs278

show more ...


Revision tags: v5.2.1, v5.2.0, v5.3.0, v5.2.0rc
# 65c894ff 17-Mar-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer - Fix bugs, fix serious snapshot bug, flush adjustments

* Make sure we only flush the volume header for a general sync request
and not for a fsync() on /.

* Fix more lock order reversals w

hammer - Fix bugs, fix serious snapshot bug, flush adjustments

* Make sure we only flush the volume header for a general sync request
and not for a fsync() on /.

* Fix more lock order reversals when translating directory entries
to inodes.

* Separate out spmp elements into their own list to make umount ordering
easier.

* Flush in three stages.

(1) flush dirty filesystem inodes
(2) flush PFS meta-data topology up to the filesystem inodes.
(3) flush the volume root and its meta-data up to the PFS inodes.

This is staging for later sync concurrency improvements.

* Fix a bug where creating enough snapshots (more than 4 total PFSs)
causes some PFSs to lose an important flag in their blockref, which
causes flushes to stop working properly on that PFS.

show more ...


123456