History log of /dragonfly/sys/vfs/hammer/hammer_cursor.h (Results 1 – 25 of 41)
Revision Date Author Comments
Revision tags: v6.2.1, v6.2.0, v6.3.0, v6.0.1, v6.0.0, v6.0.0rc1, v6.1.0, v5.8.3, v5.8.2, v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3, v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1, v5.2.2, v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc, v4.6.1
# 22a0040d 28-Aug-2016 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Typedef struct on declaration

Most structs in hammer are typedef'd.
Some structs define the typedef on declaration, but others don't.


# 562d34c2 28-Aug-2016 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Use typedef'd for struct hammer_buffer*

The hammer code is a mix of plain struct and typedef'd usage.
Use the typedef'd form because the majority of the code uses it.


# 877580d2 28-Aug-2016 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Use typedef'd for struct hammer_record*

The hammer code is a mix of plain struct and typedef'd usage.
Use the typedef'd form because the majority of the code uses it.


# e1067862 28-Aug-2016 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Use typedef'd for struct hammer_inode*

The hammer code is a mix of plain struct and typedef'd usage.
Use the typedef'd form because the majority of the code uses it.


# 053f997d 28-Aug-2016 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Use typedef'd for struct hammer_btree_leaf_elm*

The hammer code is a mix of plain struct and typedef'd usage.
Use the typedef'd form because the majority of the code uses it.


# 513f50d5 28-Aug-2016 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Use typedef'd for union hammer_data_ondisk*

The hammer code is a mix of plain struct and typedef'd usage.
Use the typedef'd form because the majority of the code uses it.


Revision tags: v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3
# cf977f11 01-Mar-2016 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Don't use HAMMER_CURSOR_GET_LEAF

HAMMER_CURSOR_GET_LEAF is not used by the current implementation
of hammer_btree_extract(), since this function causes
cursor->leaf to point to the node element in question regardless
of the flag value. So just don't use this flag when calling
hammer_btree_extract() when we know that loading the node element
(leaf) is the default behavior.

The cursor flags are already complicated enough, so simplifying
btree extract callers by passing either 0 or DATA, but not LEAF,
makes things a bit clearer.



Revision tags: v4.4.2, v4.4.1, v4.4.0
# 60ef395a 24-Nov-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Remove forward declaration of struct hammer_cmirror


Revision tags: v4.5.0, v4.4.0rc
# 964cb30d 05-Sep-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Add ifndef/define/endif for headers

Some headers are missing this, so add it to those.


# f176517c 28-Aug-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

hammer: Remove cluster topology related comments

that were written in the early stages of hammer development
but do not reflect the actual implementation today,
such as super-cluster, etc.


Revision tags: v4.2.4, v4.3.1, v4.2.3
# 745703c7 07-Jul-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

hammer: Remove trailing whitespaces

- (Non-functional commits could make it difficult to git-blame
the history if there are too many of those)


Revision tags: v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2, v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0, v3.4.3, v3.4.2, v3.4.0, v3.4.1, v3.4.0rc, v3.5.0, v3.2.2, v3.2.1, v3.2.0, v3.3.0, v3.0.3, v3.0.2, v3.0.1, v3.1.0, v3.0.0
# 86d7f5d3 26-Nov-2011 John Marino <draco@marino.st>

Initial import of binutils 2.22 on the new vendor branch

Future versions of binutils will also reside on this branch rather
than continuing to create new binutils branches for each new version.


Revision tags: v2.12.0, v2.13.0, v2.10.1, v2.11.0, v2.10.0, v2.9.1, v2.8.2, v2.8.1, v2.8.0, v2.9.0, v2.6.3, v2.7.3, v2.6.2, v2.7.2, v2.7.1, v2.6.1, v2.7.0, v2.6.0, v2.5.1, v2.4.1, v2.5.0, v2.4.0, v2.3.2, v2.3.1
# 3214ade6 06-May-2009 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Refactor merged search function to try to avoid missed entries

Refactor the merged B-Tree + In-Memory search function to try to avoid races
where an in-memory record is flushed to the media during a search, causing
the search to miss the record.

Add another flag to hammer_record_t to indicate that the record was deleted
because it was committed to the media (versus simply being deleted).

Better-separate HAMMER_RECF_DELETED_FE and HAMMER_RECF_DELETED_BE. These
flags indicate whether the frontend or backend deleted an in-memory record.
The backend ignores frontend deletions that occur after the record has been
associated with a flush group.

Remove some console Warnings that are no longer applicable.



Revision tags: v2.2.1, v2.2.0, v2.3.0, v2.1.1, v2.0.1
# 5c8d05e2 06-Aug-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 2.1:01 - Stability

* Fix a bug in the B-Tree code. Recursive deletions are done prior to
marking a node as actually being empty, but setup for the deletion
(by calling hammer_cursor_deleted_element()) must still occur prior
to the recursion so cursor indexes are properly adjusted for the
possible removal. If the recursion is not successful we can just leave
the cursors post-adjusted since the subtree has an empty leaf anyway.

* Rename HAMMER_CURSOR_DELBTREE to HAMMER_CURSOR_RETEST so its function
is more apparent.

* Properly set the HAMMER_CURSOR_RETEST flag when relocking a cursor
that has tracked a ripout, so the cursor's new current element is
re-tested by any iteration using the cursor.

* Remove code that allowed a SETUP record to be converted to a FLUSH
record if the target inode is already in the correct flush group.
The problem is that target inode has already setup its sync state
for the backend and the nlinks count will not be correct if we
add another directory ADD/DEL record to the flush. While strictly
a temporary nlinks mismatch (the next flush would correct it), a
crash occurring here would result in inconsistent nlink counts on
the media.

* Reference and release buffers instead of directly calling low level
hammer_io_deallocate(), and generally reference and release buffers
around reclamations in the buffer/io invalidation code to avoid
races. In particular, the buffer must be referenced during a
call to hammer_io_clear_modify().

* Fix a buffer leak in hammer_del_buffers() which is not only bad unto
itself, but can also cause reblocking assertions on the presence of
buffer aliases later on.

* Return ENOTDIR if rmdir attempts to remove a non-directory.

Reported-by: Francois Tigeot <ftigeot@wolfpond.org> (rmdir)
Reported-by: YONETANI Tomokazu <qhwt+dfly@les.ath.cx> (multiple)



# 4c038e17 10-Jul-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 60J/Many: Mirroring

Finish implementing the core mirroring algorithm. The last bit was to add
support for no-history deletions on the master. The same support also covers
masters which have pruned records away prior to the mirroring operation.
As with the work done previously, the algorithm is 100% queue-less and
has no age limitations. You could wait a month, and then do a mirroring
update from master to slave, and the algorithm will efficiently handle it.

The basic issue that this commit tackles is what to do when records are
physically deleted from the master. When this occurs the mirror master
cannot provide a list of records to delete to its slaves.

The solution is to use the mirror TID propagation to physically identify
swaths of the B-Tree in which a deletion MAY have taken place. The
mirroring code uses this information to generate PASS and SKIP mrecords.

A PASS identifies a record (sans its data payload) that remains within
the identified swath and should already exist on the target. The
mirroring target does a simultaneous iteration of the same swath on the
target B-Tree and deletes records not identified by the master.

A SKIP is the heart of the algorithm's efficiency. The same mirror TID
stored in the B-Tree can also identify large swaths of the B-Tree for which
*NO* deletions have taken place (which will be most of the B-Tree). One
SKIP Record can identify an arbitrarily large swath. The target uses
the SKIP record to skip that swath on the target. No scan takes place.
SKIP records can be generated from any internal node of the B-Tree and cover
that node's entire sub-tree.

This also provides us with the feature where the retention policy can be
completely different between a master and a mirror, or between mirrors.
When the slave identifies a record that must be deleted through the above
algorithm it only needs to mark it as historically deleted, it does not
have to physically delete the record.



# adf01747 07-Jul-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 60E/Many: Mirroring, bug fixes

* Work on the mirror_tid propagation code. The code now retries on
EDEADLK so propagation is guaranteed to reach the root.

* Get most of the mirror_write code working.

* Add PFS support for NFS exports. Change fid_reserved to fid_ext and use
it to store the localization parameter that selects the PFS. This isn't
well tested yet.

* BUGFIX: Fix a bug in vol0_last_tid updates. Flush sequences might
not always update the field, creating issues with mirroring and snapshots.

* BUGFIX: Properly update the volume header CRC.

* CLEANUP: Fix some obj_id's that were u_int64_t's. They should be int64_t's.

* CLEANUP: #if 0 out unused code, remove other bits of unused code.



# b3bad96f 05-Jul-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 60D/Many: Mirroring, bug fixes

* Add more mirroring infrastructure.

* Fix support for unix domain sockets.

Reported-by: Gergo Szakal <bastyaelvtars@gmail.com>,
Rumko <rumcic@gmail.com>


# c82af904 26-Jun-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 59A/Many: Mirroring related work (and one bug fix).

* BUG FIX: Fix a bug in directory hashkey generation. The iterator could
sometimes conflict with a key already on-disk and interfere with a pending
deletion. The chance of this occurring was minuscule but not 0. Now fixed.

The fix also revamps the directory iterator code, moving it all to one
place and removing it from two other places.

* PRUNING CHANGE: The pruning code no longer shifts the create_tid and
delete_tid of adjacent records to fill gaps. This means that historical
queries must either use snapshot softlinks or use a fine-grained
transaction id greater than the most recent snapshot softlink.

Fine-grained historical access still works up to the first snapshot
softlink.

* Clean up the cursor code responsible for acquiring the parent node.

* Add the core mirror ioctl read/write infrastructure. This work is still
in progress.

- ioctl commands
- pseudofs enhancements, including st_dev munging.
- mount options
- transaction id and object id conflictless allocation
- initial mirror_tid recursion up the B-Tree (not finished)
- B-Tree mirror scan optimizations to skip sub-hierarchies that do not
need to be scanned (requires mirror_tid recursion to be 100% working).



# bf3b416b 14-Jun-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 55: Performance tuning and bug fixes - MEDIA STRUCTURES CHANGED!

* BUG-FIX: Fix a race in hammer_rel_mem_record() which could result in a
machine lockup. The code could block at an inappropriate time with both
the record and a dependency inode pointer left unprotected.

* BUG-FIX: The direct-write code could assert on (*error != 0) due to an
incorrect conditional in the in-memory record scanning code.

* Inode data and directory entry data has been given its own zone as a
stop-gap until the low level allocator can be rewritten.

* Increase the directory object-id cache from 128 entries to 1024 entries.

* General cleanup.

* Introduce a separate reblocking domain for directories: 'hammer reblock-dirs'.



# 47637bff 07-Jun-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 53A/Many: Read and write performance enhancements, etc.

* Add hammer_io_direct_read(). For full-block reads this code allows
a high-level frontend buffer cache buffer associated with the
regular file vnode to directly access the underlying storage,
instead of loading that storage via a hammer_buffer and bcopy()ing it.

* Add a write bypass, allowing the frontend to bypass the flusher and
write full-blocks directly to the underlying storage, greatly improving
frontend write performance. Caveat: See note at bottom.

The write bypass is implemented by adding a feature whereby the frontend
can soft-reserve unused disk space on the physical media without having
to interact (much) with on-disk meta-data structures. This allows the
frontend to flush high-level buffer cache buffers directly to disk
and release the buffer for reuse by the system, resulting in very high
write performance.

To properly associate the reserved space with the filesystem so it can be
accessed in later reads, an in-memory hammer_record is created referencing
it. This record is queued to the backend flusher for final disposition.
The backend disposes of the record by inserting the appropriate B-Tree
element and marking the storage as allocated. At that point the storage
becomes official.

* Clean up numerous procedures to support the above new features. In
particular, do a major cleanup of the cached truncation offset code
(this is the code which allows HAMMER to implement wholly asynchronous
truncate()/ftruncate() support).

Also clean up the flusher triggering code, removing numerous hacks that
had been in place to deal with the lack of a direct-write mechanism.

* Start working on statistics gathering to track record and B-Tree
operations.

* CAVEAT: The backend flusher creates a significant cpu burden when flushing
a large number of in-memory data records. Even though the data itself
has already been written to disk, there is currently a great deal of
overhead involved in manipulating the B-Tree to insert the new records.
Overall write performance will only be modestly improved until these
code paths are optimized.



# 2f85fa4d 18-May-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 46/Many: Performance pass, media changes, bug fixes.

* Add a localization field to the B-Tree element which has sorting priority
over the object id.

Use the localization field to separate inode entries from file data. This
allows the reblocker to cluster inode information together and greatly
improves directory/stat performance.

* Enhance the reblocker to reblock internal B-Tree nodes as well as leaves.

* Enhance the reblocker by adding 'reblock-inodes' in addition to
'reblock-data' and 'reblock-btree', allowing individual types of
meta-data to be independently reblocked.

* Fix a bug in hammer_bread(). The buffer's zoneX_offset field was
sometimes not being properly masked, resulting in unnecessary blockmap
lookups. Also add hammer_clrxlate_buffer() to clear the translation
cache for a hammer_buffer.

* Fix numerous issues with hmp->sync_lock.

* Fix a buffer exhaustion issue in the pruner and reblocker due to not
counting I/O's in progress as being dirty.

* Enhance the symlink implementation. Take advantage of the extra 24 bytes
of space in the inode data to directly store symlinks <= 24 bytes.

* Use cluster_read() to gang read I/O's into 64KB chunks. Rely on
localization and the reblocker and pruner to make doing the larger
I/O's worthwhile.

These changes reduce ls -lR overhead on 43383 files (half created with cpdup,
half totally randomly created with blogbench). Overhead went from 35 seconds
after reblocking, before the changes, to 5 seconds after reblocking,
after the changes.



# 11ad5ade 12-May-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 43/Many: Remove records from the media format, plus other stuff

* Get rid of hammer_record_ondisk. As HAMMER has evolved the need for
a separate record structure has devolved into trivialities. Originally
the idea was to have B-Tree nodes referencing records and data. The
B-Tree elements were originally intended to be throw-away and the on-media
records were originally intended to be the official representation of
the data and contained additional meta-information such as the obj_id
of a directory entry and a few additional fields related to the inode.

But once the UNDO code went in and it became obvious that the B-Tree needed
to be tracked (undo-wise) along with everything else, the need for an
official representation of the record as a separate media structure
essentially disappeared.

Move the directory-record meta-data into the directory-entry data and move
the inode-record meta-data into the inode-record data. As a single
exception move the atime field to the B-Tree element itself (it replaces
what used to be the record offset), in order to continue to allow atime
updates to occur without requiring record rewrites. With these changes
records are no longer needed at all, so remove the on-media record structure
and all the related code.

* The removal of the on-media record structure also greatly improves
performance.

* B-Tree elements are now the official on-media record.

* Fix a race in the extraction of the root of the B-Tree.

* Clean up the in-memory record handling API. Instead of having to
construct B-Tree leaf elements we can simply embed one in the in-memory
record structure (struct hammer_record), and in the inode.



# 4e17f465 03-May-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 40D/Many: Inode/link-count sequencer cleanup pass.

* Move the vfsync from the frontend to the backend. This allows the
frontend to passively move inodes to the backend without having to
actually start the flush, greatly improving performance.

* Use an inode lock to deal with directory entry syncing races between
the frontend and the backend. It isn't optimal but it's ok for now.

* Massively optimize the backend code by initializing a single cursor
for an inode and passing the cursor to procedures, instead of having
each procedure initialize its own cursor.

* Fix a sequencing issue with the backend. While building the flush
state for an inode another process could get in and initiate its own
flush, screwing up the flush group and creating confusion.
(hmp->flusher_lock)

* Don't lose track of HAMMER_FLUSH_SIGNAL flush requests. If we get
such a request but have to flag a reflush, also flag that the reflush
is to be signaled (done immediately when the current flush is done).

* Remove shared inode locks from hammer_vnops.c. Their original purpose
no longer exists.

* Simplify the arguments passed to numerous procedures (hammer_ip_first(),
etc).



# 1f07f686 02-May-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 40A/Many: Inode/link-count sequencer.

* Remove the hammer_depend structure and build the dependencies directly
into the hammer_record structure.

* Attempt to implement layout rules to ensure connectivity is maintained.
This means, for example, that before HAMMER can flush a newly created
file it will make sure the file has namespace connectivity to the
directory it was created in, recursively to the root.

NOTE: 40A destabilizes the filesystem a bit, it's going to take a few
passes to get everything working properly. There are numerous issues
with this commit.



# b84de5af 24-Apr-2008 Matthew Dillon <dillon@dragonflybsd.org>

HAMMER 38A/Many: Undo/Synchronization and crash recovery

* Separate all frontend operations from all backend media synchronization.
The frontend VNOPs make all changes in-memory and in the frontend
buffer cache. The backend buffer cache used to manage meta-data is
not touched.

- In-memory inode contains two copies of critical meta-data structures
- In-memory record tree distinguishes between records undergoing
synchronization and records not undergoing synchronization.
- Frontend buffer cache buffers are tracked to determine which ones
to synchronize and which ones not to.
- Deletions are cached in-memory. Any number of file truncations
simply caches the lowest truncation offset and on-media records
beyond that point are ignored. Record deletions are cached as
a negative entry in the in-memory record tree until the backend
can execute the operation on the media.
- Frontend operations continue to have full, direct read access to
the media.

* Backend synchronization to the disk media is able to take place
simultaneously with frontend operations on the same inodes. This
will need some tuning but it basically works.

* In-memory records are no longer removed from the B-Tree when deleted.
They are marked for deletion and removed when the last reference goes
away.

* An inode whose last reference is being released is handed over to the
backend flusher for its final disposition.

* There are some bad hacks and debugging tests in this commit. In particular
when the backend needs to do a truncation it special-cases any
negative entries it finds in the in-memory record tree. Also, if a
rename operation hits a deadlock it currently breaks atomicy.

* The transaction API has been simplified. The frontend no longer allocates
transaction ids. Instead the backend does a full flush with a single
transaction id (since that is the granularity the crash recovery code will
have anyway).


