History log of /dragonfly/sys/vfs/hammer2/hammer2_synchro.c (Results 1 – 19 of 19)
Revision Date Author Comments
# 022bb0a9 21-Feb-2022 Tomohiro Kusumi <tkusumi@netbsd.org>

sys/vfs/hammer2: Mostly trailing whitespace cleanups


Revision tags: v6.2.1, v6.2.0, v6.3.0, v6.0.1, v6.0.0, v6.0.0rc1, v6.1.0, v5.8.3, v5.8.2, v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3, v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1
# ecfe89b8 09-Nov-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 2/N

* Flesh out the flush partitioning code, fixing a number of issues.

* Refactor hammer2_inode_lock() and add hammer2_inode_lock4() to
interlock against flushes. This is handled by blocking inode locks
against SYNCQ, and reordering the inode to the front of the SYNCQ list
in order to unblock as quickly as possible as the filesystem sync
progresses. The result should be relatively few frontend stalls
during a filesystem sync.

* Disable resource caps for the moment, because synchronous
operations to prevent resource limits from blowing out break
the current inode_lock*() code and allow vnode deadlocks to
occur.

* To avoid deadlocks, the filesystem sync currently must clear SYNCQ
before locking the inode & vnode, and if it cannot lock a vnode it
must continue on with the next inode and then restart. Retried
vnodes introduce a short delay to give the frontend time to work
the blocking operation.

This is necessary because the kernel locks vnodes before entering the
H2 frontend, and we cannot safely unlock/relock them to work around
this. Nor do we necessarily even have full knowledge on which vnodes
the current thread has locked.

* Does not yet guarantee complete filesystem consistency on-crash.
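
The skip-and-restart rule described above can be sketched in userland C. This is a minimal illustration under stated assumptions, not the hammer2 code: `struct sketch_inode`, `sketch_sync_pass()`, `sketch_sync_all()`, and the `vnode_busy` counter are all hypothetical, the counter simulates a vnode lock held by the frontend for a few attempts, and requeueing a skipped inode by setting its flag again is an assumption about how the restart finds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode queued for a filesystem sync. */
struct sketch_inode {
    bool on_syncq;    /* models the SYNCQ flag */
    int  vnode_busy;  /* lock attempts that will still fail (simulated) */
    bool flushed;
};

/* Models a non-blocking vnode lock attempt. */
static bool sketch_vnode_trylock(struct sketch_inode *ip)
{
    if (ip->vnode_busy > 0) {
        ip->vnode_busy--;
        return false;
    }
    return true;
}

/*
 * One pass over the queue. SYNCQ is cleared before the lock attempt; an
 * inode whose vnode cannot be locked is skipped and left for a later pass
 * (assumption: it is simply flagged again). Returns the number of inodes
 * that still need a retry.
 */
static size_t sketch_sync_pass(struct sketch_inode *q, size_t n)
{
    size_t retries = 0;

    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct sketch_inode *ip = &q[i];

        if (!ip->on_syncq)
            continue;
        ip->on_syncq = false;           /* clear SYNCQ before locking */
        if (!sketch_vnode_trylock(ip)) {
            ip->on_syncq = true;        /* skip it, continue with the next */
            retries++;
            continue;
        }
        ip->flushed = true;             /* real code would flush the inode */
    }
    return retries;
}

/* Restart until nothing is left; returns the number of passes used. */
static int sketch_sync_all(struct sketch_inode *q, size_t n)
{
    int passes = 1;

    while (sketch_sync_pass(q, n) != 0)
        passes++;                       /* real code would delay briefly here */
    return passes;
}
```

The sync thread never blocks on a vnode lock in this model, which is the property the commit relies on to avoid deadlocking against a frontend thread that already holds the vnode.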



# 5afbe9d8 05-Nov-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - refactor filesystem sync 1/N

* Change H2 to allow concurrent filesystem sync and modifying
frontend operations.

* FLUSH transactions no longer block modifying frontend
transactions.

* Change filesystem sync operation to put all modified
inodes on the pmp->syncq (which is also combined with
any inodes on pmp->sideq), and then iterate the
syncq to flush each inode.

After this is done, stage 2 will flush the meta-data tree
leading to each inode.

This code will also handle delayed inode creation and
destruction ops, which require modifications to the meta-data
tree governing the inodes themselves (so we don't want the
frontend to do it and interfere with the flush).

* Modifying operations against inodes already queued for a
filesystem sync that is in progress will now reorder the
inode to the front of the filesystem sync in progress and
wait for the sync on that inode to complete before proceeding.

This is handled by blocking in the exclusive inode lock code.

* hammer2_inode_get() does not need to pass 'dip' any more
because regular inodes are inserted under the iroot (PFS root
inode) and no longer inserted hierarchically.

* Separate out hammer2_inode_create() into hammer2_inode_create_pfs()
and hammer2_inode_create_normal(). These two forms are now distinct
enough that the code is a mess if we try to leave them combined.
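
The reordering of an already-queued inode to the front of the sync in progress can be pictured with a small model. The sketch below is an assumption-laden illustration: the queue is a plain array of inode numbers rather than the kernel's list, and `sketch_syncq_move_front()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of a sync queue as an array of inode numbers, where
 * index 0 is the next inode the sync will flush. Moves inum to the head
 * and returns 1 if it was queued, or returns 0 if it was not found.
 */
static int sketch_syncq_move_front(unsigned long *q, size_t n,
                                   unsigned long inum)
{
    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (q[i] != inum)
            continue;
        /* Shift q[0..i-1] up by one slot, then place inum at the head. */
        memmove(&q[1], &q[0], i * sizeof(q[0]));
        q[0] = inum;
        return 1;
    }
    return 0;   /* not queued: the frontend has nothing to wait for */
}
```

With a queue of {10, 20, 30, 40}, a modifying operation against inode 30 would leave {30, 10, 20, 40}, so the blocked frontend thread waits for one flush rather than three.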



# c4421f07 29-Jul-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Remote xop implementation part 1

* Normalize naming conventions for XOP functions.

* Change the XOP callback API to remove the hammer2_thread argument.
Pass the clindex and scratch buffer in directly.

* Change the XOP API to pass in a function descriptor instead of a
function pointer, create prototypes for DMSG send/receive XOPs which
will be used for XOP components which are DMSG based and not
local-storage based.

* Adjust comments.



Revision tags: v5.2.2
# fda30e02 25-May-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Fix focus vs modify race

* Fixes rare panics when e.g. removing large numbers (as in hundreds of
millions) of directory entries.

The XOP collection code holds the collected cluster but cannot
safely lock it without risking a deadlock against backend operations
or dead backends. The hold on the chain prevents its destruction,
but does not prevent another thread from locking it and issuing
hammer2_chain_modify().

* Fix bugs due to the unsafe nature of an unlocked chain's content,
especially chain->data and chain->dio, by adding an interlock between
frontend access to the data and backend hammer2_chain_modify() calls.
Held but unlocked chains are used by the XOP API to pass chains back to
the frontend.

* Remove the automatic (because it is unsafe) dio synchronization
in hammer2_xop_collect() and instead implement an API that the
frontend can use to safely access the data. The API is
hammer2_xop_gdata()/hammer2_xop_pdata().

* Remove the unsafe hammer2_cluster_rdata(). Use gdata/pdata for this
too.

* Rewire hammer2_inode_get() to pass-in a hammer2_xop_head instead
of a hammer2_cluster so it can use the gdata/pdata API too.



# 257c2728 24-May-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Fix kmalloc pool blowout on low-memory machines

* Fix a kmalloc pool blowout that can occur on low-memory machines due
to too many disconnected hammer2_inode structures building up.

* Was previously fixed for things like rm -rf and bulk renames,
but not for setattr (aka chown/chmod -R ops).

Reported-by: gjs278



Revision tags: v5.2.1, v5.2.0, v5.3.0, v5.2.0rc
# 68b321c1 16-Mar-2018 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - More involved refactoring of chain_repparent, cleanup

* Remove unused locking flags (remove the NOLOCK and NOUNLOCK
features).

* Add HAMMER2_RESOLVE_NONBLOCK to hammer2_chain_lock() for use
only by hammer2_chain_getparent() and hammer2_chain_repparent().

* Refactor hammer2_chain_getparent() and hammer2_chain_repparent().
Add a hot-path that uses HAMMER2_RESOLVE_NONBLOCK. If this fails
we now do a much more involved tracking operation via 'reptrack'
to deal with races against indirect block deletions.

* Cleanup the copyright messages.

* Fix an issue where a sync could be held-up indefinitely by
ongoing overlapping modifying operations.

* Install a proper initial inode count when creating a snapshot.

* Fix a deadlock in checkdirempty(). A chain lock was winding
up being ordered incorrectly.



Revision tags: v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1
# 65cacacf 07-Sep-2017 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Implement error processing and free reserve enforcement

* newfs_hammer2 calculates the correct amount of reserved space. We
have to reserve 4MB per 1GB, not 4MB per 2GB, due to a snafu. This
is still only 0.4% of the storage.

* Flesh out HAMMER2_ERROR_* codes and make most hammer2 functions return
a proper error code.

* Add error handling to nearly all code that can dirty a chain, in
particular to handle ENOSPC issues. Any dirty buffers that cannot be
flushed will incur a write error (which in DragonFly typically causes
the buffer to be retried later). Any dirty chain that cannot be
flushed will remain in the topology and can be completed in a later
flush if space has been freed up.

We try to avoid allowing the filesystem to get into this situation in
the first place, but if it does, it should be possible to flush these
asynchronous modifying chains and buffers once space is freed up via
bulkfree.

* Relax class match requirements in the freemap allocator when the freemap
gets close to full. This will allow e.g. inodes to be allocated out of
DATA bitmaps and vice versa, and so forth. This fixes edge conditions
where there is enough free space available but it has all been earmarked
for the wrong data class.

* Try to fix a bug in live_count tracking when destroying an indirect
block chain or inode chain that has not yet been blockmapped due to
a drop. This situation only occurs when chains cannot be flushed due
to I/O errors or disk full conditions, and are then later destroyed
(e.g. such as when the governing file is removed).

This should fix a live_count assertion that can occur under these
circumstances. See hammer2_chain_lastdrop().

* Enforce the free reserve requirement for all modifying VOP calls.
Root users can nominally fill the file system to 97.5%, non-root
users to 95%. At 90%, write()s will enforce bawrite() versus bdwrite()
to try to avoid buffer cache flushes from actually running the
filesystem out of space.

This is needed because we do not actually know how much disk space is
going to be needed at write() time. Deduplication and compression
occur later, at buffer-flush time.

* Do NOT flush the volume header when a vfs sync is unable to completely
flush a device due to errors. This ensures that the underlying media
does not become corrupt.

* Fix an issue where bref.check.freemap.bigmask was not being properly
reset to -1 when bulkfree is able to free an element. This bug
prevented the allocator from recognizing that free space was available
in that bitmap.

* Modify bulkfree operation to use the live topology when flushing and
snapshot operations fail due to errors, allowing bulkfree to run.

* Nominal bulkfree operations now run on the snapshot without a
transaction (more testing is needed). This theoretically should allow
bulkfree to run concurrent with just about any operation including
flushes.

* Add a freespace tracking heuristic to reduce the overhead that modifying
VOP calls incur in checking the free reserve requirement.

* hammer2 show dumps additional info for freemap nodes.
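
The reserve arithmetic and the fill thresholds quoted in this commit can be worked through in a few lines of C. This is only a sketch: the function and macro names are hypothetical, the percentages are taken from the message above, and treating each limit as a strict "below the limit" test is an assumption rather than something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fill limits quoted in the commit message, in tenths of a percent. */
#define SKETCH_LIMIT_ROOT     975u   /* 97.5%: root may fill to here */
#define SKETCH_LIMIT_USER     950u   /* 95%: non-root may fill to here */
#define SKETCH_LIMIT_BAWRITE  900u   /* 90%: prefer bawrite() from here */

/* Fill level in tenths of a percent. */
static unsigned sketch_fill_permille(uint64_t used, uint64_t total)
{
    assert(total != 0 && used <= total);
    assert(used <= UINT64_MAX / 1000);   /* keep used * 1000 in range */
    return (unsigned)(used * 1000 / total);
}

/* Assumption: a modifying VOP is refused once the limit is reached. */
static bool sketch_modify_allowed(uint64_t used, uint64_t total, bool is_root)
{
    unsigned limit = is_root ? SKETCH_LIMIT_ROOT : SKETCH_LIMIT_USER;

    return sketch_fill_permille(used, total) < limit;
}

/* At 90% and above, write() would pick bawrite() over bdwrite(). */
static bool sketch_use_bawrite(uint64_t used, uint64_t total)
{
    return sketch_fill_permille(used, total) >= SKETCH_LIMIT_BAWRITE;
}

/* Reserve of 4MB per 1GB of storage, returned in MB (4/1024 is about 0.4%). */
static uint64_t sketch_reserve_mb(uint64_t storage_gb)
{
    return storage_gb * 4;
}
```

For a 100GB volume this gives a 400MB reserve. At 96% full, the model still admits root's modifying operations but refuses non-root ones, and write() has been using bawrite() since 90%.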



# c8c0a18a 31-Aug-2017 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - error handling 2/N (chain_lookup/chain_next)

* Implement error handling for hammer2_chain_lookup() and
hammer2_chain_next().

* Shim use cases for this commit. Ultimately the intent is to
convert the entire error path to HAMMER2_ERROR_* codes.



# 2b8f3e7e 31-Aug-2017 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Embed cache_index heuristic in chain structure

* Embed the cache_index heuristic in the hammer2_chain structure and get
rid of all the code that passed it in to various API functions. This
substantially cleans-up the API.

* Adjust comments for upcoming error handling work.



# 59eb0066 19-Aug-2017 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Fix snapshots and multi-label mounts

* Allow the same /dev/blah@DIFFERENTLABEL to be specified in a mount
command, so multiple labels from the same device can be mounted.

* Devfs can throw different vnodes for the same device. When matching
up hammer2_dev, check devvp->v_rdev for a match as well.

* Fix a number of bugs in the snapshot code that left a hammer2_inode
structure hanging and caused a panic.

* Fix races in admin thread flag messaging that could lead to 60-second
delays during umount.



# 9dca9515 18-Aug-2017 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Add kernel-thread-based async bulk free

* Add an async bulk-free feature and a kernel thread to allow the hammer2
vfs itself to run bulk frees.

* Revamp the hammer2 thread management code a bit to support the new use,
and clean-up the API.



Revision tags: v4.8.1
# da0cdd33 25-Jul-2017 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Initial HARDLINK -> DIRENT replacement code

* Initial removal of the vestiges of the old embedded inode code. Inodes
were moved to the root directory long ago but directories still contain
dummy OBJTYPE_HARDLINK inodes instead of real directory entries to point
to the moved inodes. These inodes ate 1024 bytes of disk space for each
directory entry.

* Remove the dummy OBJTYPE_HARDLINK inodes and replace with new
BREF_TYPE_DIRENT blockrefs. These blockrefs represent directory
entries, and the entire dirent will fit in the blockref (requiring
no data ref) if the filename is <= 64 bytes.

* This new DIRENT mechanic significantly improves performance and reduces
storage overhead vs the previous mechanic, for obvious reasons.

Directory entries are now 128 bytes instead of 1024 bytes, and since they
are collected together in indirect blocks or (if <= 4 entries) simply
placed in the 4 blockrefs embedded in the directory inode, the related
I/O tends to be fairly optimal.

Only directory entries whose filenames are > 64 bytes long require an
additional data block reference. For now, due to other constraints,
we use the minimum H2 allocation size of 1KB for these, so certainly
space is wasted. But in real life there aren't actually a whole lot
of filenames that are that long so it should be fine.
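
The per-entry storage cost described above can be written out as a small calculation. The sizes come from the commit message; the macro and function names are hypothetical, and the sketch ignores indirect-block overhead.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BLOCKREF_BYTES     128   /* size of a DIRENT blockref */
#define SKETCH_INLINE_NAME_MAX     64   /* names up to this fit in the bref */
#define SKETCH_MIN_ALLOC_BYTES   1024   /* minimum H2 allocation (1KB) */
#define SKETCH_OLD_DIRENT_BYTES  1024   /* old dummy OBJTYPE_HARDLINK inode */

/* Bytes consumed by one directory entry under the DIRENT scheme. */
static size_t sketch_dirent_bytes(size_t name_len)
{
    assert(name_len > 0);
    if (name_len <= SKETCH_INLINE_NAME_MAX)
        return SKETCH_BLOCKREF_BYTES;           /* name lives in the blockref */
    /* Longer names need a separate data block of the minimum size. */
    return SKETCH_BLOCKREF_BYTES + SKETCH_MIN_ALLOC_BYTES;
}
```

A typical short filename therefore costs 128 bytes instead of 1024, an 8x reduction, while a 65-byte name costs 1152 bytes in this model, which is the wasted space the message acknowledges.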



Revision tags: v4.8.0, v4.6.2
# 0d66a712 17-Mar-2017 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Fix cluster synchronization bug (2+ nodes)

* Fix a bug where hammer2_cluster_check() can end up in an infinite
loop.

* Look for pfs_names[] in all PFSs associated with a cluster.

* Add missing xop retirement in pfs-delete.

* Skip the directory empty check in pfs-delete.

* Preliminary code to deallocate an element of a live PFS



Revision tags: v4.9.0, v4.8.0rc, v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0
# 7fece146 09-Jul-2016 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Add feature to allow sector overwrite, fix meta-data check code

* If a file is set to use no check code (hammer2 setcheck none <file>),
data overwrites will reuse the same sector as long as it does not violate
the most recent snapshot.

This allows the program to relax copy-on-write requirements for certain
files, for example files which might be mmap()'d SHARED+RW and then
modified constantly where the programmer has determined that the
possibility of corruption is ok.

* Implement pfs_lsnap_tid in the PFS root inode meta-data. This records the
last snapshot TID so the chain code can determine if an overwrite is
allowed.

* Remove attr_tid and dirent_tid from the inode meta-data for now.

* Only BREF_TYPE_DATA brefs inherit the inode check mode. Meta-data brefs
such as indirect blocks, or directory entries, will only use the check
code type specified in the parent inode if it is not NONE. Otherwise
they will use the default check code.

This fixes a bug where meta-data brefs could wind up being unchecked. We
want all meta-data to always be checked (at least for now).
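
The two rules in this commit, check-code selection and the overwrite test, can be sketched as follows. Everything here is hypothetical naming: the enum values are not the real hammer2 constants, and comparing a block's transaction id against the last snapshot TID with a strict "newer than" test is an assumption about how "does not violate the most recent snapshot" is decided.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check-code identifiers; the real hammer2 constants differ. */
enum sketch_check {
    SKETCH_CHECK_NONE,
    SKETCH_CHECK_DEFAULT,
    SKETCH_CHECK_OTHER
};

/* Pick the check code for a new bref given the parent inode's setting. */
static enum sketch_check
sketch_pick_check(bool is_data_bref, enum sketch_check inode_check)
{
    assert(inode_check <= SKETCH_CHECK_OTHER);
    if (is_data_bref)
        return inode_check;             /* data inherits, even NONE */
    if (inode_check != SKETCH_CHECK_NONE)
        return inode_check;             /* meta-data follows a real setting */
    return SKETCH_CHECK_DEFAULT;        /* meta-data is never left unchecked */
}

/*
 * Assumed overwrite test: the sector may be reused in place only when the
 * file has no check code and the block was last written after the most
 * recent snapshot (so no snapshot still references the old contents).
 */
static bool
sketch_overwrite_ok(enum sketch_check check,
                    unsigned long long block_tid,
                    unsigned long long lsnap_tid)
{
    return check == SKETCH_CHECK_NONE && block_tid > lsnap_tid;
}
```

The second branch of `sketch_pick_check()` is the bug fix: without it, a file set to `none` would pass that setting on to its indirect blocks and directory entries.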



# 660d007e 09-Jun-2016 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - Revamp worker thread signaling

* Revamp how worker thread signaling works. Get rid of a number of race
conditions and use atomic ops. We no longer need thr->lk.

* Make hammer2_cluster_enable's scaling factor work with cluster_write()
as well as cluster_read().
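
Lock-free flag signaling of the kind this commit moves to can be illustrated with C11 atomics. This is a userland sketch, not the kernel code: the flag names, `struct sketch_thread`, and both functions are hypothetical, and the real implementation uses the kernel's own atomic primitives and wakeups.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical request bits a controller can post to a worker thread. */
#define SKETCH_THR_STOP    0x1u
#define SKETCH_THR_FREEZE  0x2u

struct sketch_thread {
    atomic_uint flags;      /* pending requests; no separate lock needed */
};

/* Post one or more requests. Concurrent posters cannot lose each other's bits. */
static void sketch_thr_signal(struct sketch_thread *thr, unsigned bits)
{
    assert(bits != 0);
    atomic_fetch_or(&thr->flags, bits);
    /* real code would also wake the sleeping worker here */
}

/* Worker side: atomically take all pending requests and clear them. */
static unsigned sketch_thr_collect(struct sketch_thread *thr)
{
    return atomic_exchange(&thr->flags, 0u);
}
```

Because posting is a single atomic OR and collecting is a single atomic exchange, a request posted between the worker's check and its sleep is never silently dropped, which is the class of race a read-modify-write under a separate lock is prone to.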



Revision tags: v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc
# b02c0ae6 19-Nov-2015 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - stabilization pass on slave sync (2)

* Augment xop_scanall to allow flags to be passed in.

* Implement HMNT2_LOCAL (-o local) flag for debugging cluster elements.

* Fix missing data panic by resolving chain data in the slave sync scan.

* Fix PFS installation ioctl to add the new PFS to the cluster as
appropriate.

* Numerous cleanups and fixes to the slave sync code which was previously
ripped up by the XOPs work. Note that the slave sync code still has tons
of issues and races.



# 3cbe226b 18-Nov-2015 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - stabilization pass on slave sync

* Add a temporary hack which avoids a NULL pointer panic. This goes a long
way to stabilizing the backend slave sync threads.


# 3f01ebaa 30-Aug-2015 Matthew Dillon <dillon@apollo.backplane.com>

hammer2 - live dedup, cleanup

* First attempt at a live dedup. The H2 strategy code now caches
{data_off, crc} info to track recently accessed data blocks. The
cache is checked in the strategy_write code after device-level
block encoding. If we get a cache hit, the disk block is compared
against the write data and reused if it matches.

* This 'live' dedup should catch most typical 'cp' or 'cpdup' style
commands. There will also be a bulk dedup capable of catching
everything.

* Note that 'df' output might be a bit confusing because the 'Used'
field represents the topology and does not take into account dedups.
'Avail' is calculated from the actual freemap. To make things look
right the total disk size is adjusted upward so it matches
Used+Avail. This mechanism will likely change.

Here is an example with one copy of /usr/src and 13 copies of /usr/src.
The first copy eats around 872MB, and a 'du' will show each copy eating
about the same. But because of dedup each subsequent copy actually only
eats around 160MB as you can see from the 'Avail' field:

test40# df -h /mnt
Filesystem                             Size   Used  Avail Capacity
/dev/serno/WD-WX51A82J2299.s1f@LOCAL    99G   934M    99G     1%
Filesystem                             Size   Used  Avail Capacity
/dev/serno/WD-WX51A82J2299.s1f@LOCAL   106G   8.5G    97G     8%

* Rename hammer2_bulkscan.c to hammer2_bulkfree.c since that is
basically all it does.

* Move the synchronization code to its own file, hammer2_synchro.c.
(note: This code is currently in rip-up mode and will not operate
properly).
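
The live dedup lookup described in the first bullet of this commit (cache {data_off, crc}, then compare the on-disk block on a hit) can be sketched in userland C. All names are hypothetical, the "media" is an in-memory array, the checksum is a toy stand-in for the real crc, and the direct-mapped cache layout is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_DEDUP_SLOTS  64
#define SKETCH_BLOCK_BYTES  16   /* tiny blocks keep the example short */
#define SKETCH_MEDIA_BLOCKS  8

struct sketch_dedup_ent {
    uint64_t data_off;   /* offset of a recently seen block; 0 = empty */
    uint32_t crc;
};

static struct sketch_dedup_ent sketch_cache[SKETCH_DEDUP_SLOTS];
/* Fake disk: data_off N refers to sketch_media[N - 1]. */
static uint8_t sketch_media[SKETCH_MEDIA_BLOCKS][SKETCH_BLOCK_BYTES];

/* Toy checksum standing in for the real crc. */
static uint32_t sketch_crc(const uint8_t *buf)
{
    uint32_t c = 0;

    for (size_t i = 0; i < SKETCH_BLOCK_BYTES; i++)
        c = c * 31 + buf[i];
    return c;
}

/* Remember a block that was just read or written. */
static void sketch_dedup_record(uint64_t data_off)
{
    assert(data_off >= 1 && data_off <= SKETCH_MEDIA_BLOCKS);
    uint32_t crc = sketch_crc(sketch_media[data_off - 1]);
    struct sketch_dedup_ent *e = &sketch_cache[crc % SKETCH_DEDUP_SLOTS];

    e->data_off = data_off;
    e->crc = crc;
}

/* Returns the offset of an identical existing block, or 0 if none is known. */
static uint64_t sketch_dedup_lookup(const uint8_t *buf)
{
    uint32_t crc = sketch_crc(buf);
    const struct sketch_dedup_ent *e = &sketch_cache[crc % SKETCH_DEDUP_SLOTS];

    if (e->data_off == 0 || e->crc != crc)
        return 0;
    /* A crc match is only a hint: compare against the stored block. */
    if (memcmp(sketch_media[e->data_off - 1], buf, SKETCH_BLOCK_BYTES) != 0)
        return 0;
    return e->data_off;
}
```

The byte comparison after a cache hit is what makes the reuse safe: a checksum collision alone never causes two different blocks to share storage.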
