#
646c880e |
| 29-Jan-2023 |
Tomohiro Kusumi <tkusumi@netbsd.org> |
sbin/*hammer2: Remove redundant "CFLAGS+=-DXXH_NAMESPACE=h2_"
XXH_NAMESPACE as h2_ has been defined in sys/vfs/hammer2/xxhash/xxhash.h since 30ce27d45c7ee6e8a8cb7da80171265c31a230c3.
|
Revision tags: v6.4.0, v6.4.0rc1, v6.5.0 |
|
#
2d60b848 |
| 04-Jun-2022 |
Tomohiro Kusumi <tkusumi@netbsd.org> |
usr.sbin/makefs: Add HAMMER2 support
This commit adds HAMMER2 image creation support for makefs(8). It runs newfs_hammer2(8) and then sys/vfs/hammer2 logic in userspace to create HAMMER2 image from a given directory.
This commit splits newfs_hammer2(8) into newfs and mkfs parts, similarly to newfs_msdos(8), so that makefs(8) can use the newfs functionality. The entire sys/vfs/hammer2 (with the exception of unneeded hammer2_{bulkfree,ccms,iocom,ioctl,msgops,synchro}.[hc] and reusable hammer2_disk.h) is copied to usr.sbin/makefs with the modifications below. It intends to keep the diff against sys/vfs/hammer2 to a minimum.
* Header includes are modified so that it compiles in userspace.
* VFS and other kernel functions are usually implemented as simple stub functions in hammer2_compat.h and hammer2_buf.c, but some are commented out.
* Kernel functions such as kprintf, kmalloc, kstrdup, etc. are implemented using corresponding libc functions.
* Lock primitives are basically NOP, and they (should) never block as makefs(8) is a single-threaded program.
* struct vnode and struct buf (the ones defined locally in makefs(8), not sys/sys/*) have new struct members only used by HAMMER2 to emulate VFS behavior required by HAMMER2.
* Since makefs(8) is write-only, VOP_{NRESOLVE,NCREATE,NMKDIR,NLINK,NSYMLINK,WRITE,STRATEGY} are implemented, but other VOPs just return EOPNOTSUPP.
* VOP_{INACTIVE,RECLAIM} may be implemented and used in future to better emulate VFS behavior to address current limitations.
* VOP_WRITE is modified to directly call the VOP_STRATEGY function.
* The XOP kernel thread is modified to act as a regular function called from VOPs, along with simplified admin code.
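The libc-backed replacements for kernel allocation helpers mentioned above could look roughly like the following. This is a minimal sketch and not the contents of hammer2_compat.h: the `compat_` names, the `COMPAT_M_ZERO` flag value, and the abort-on-failure policy are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag; the value is a placeholder, not taken from the kernel. */
#define COMPAT_M_ZERO 0x0100

struct malloc_type;	/* opaque in this sketch; the type argument is ignored */

/* kmalloc-style allocation backed by malloc(3)/calloc(3). */
static void *
compat_kmalloc(size_t size, struct malloc_type *type, int flags)
{
	void *p;

	(void)type;
	if (size == 0)
		size = 1;	/* avoid an implementation-defined NULL return */
	p = (flags & COMPAT_M_ZERO) ? calloc(1, size) : malloc(size);
	assert(p != NULL);	/* assumption: treat allocation failure as fatal */
	return p;
}

/* kfree-style release backed by free(3). */
static void
compat_kfree(void *ptr, struct malloc_type *type)
{
	(void)type;
	free(ptr);
}

/* kstrdup-style copy built on the allocator above. */
static char *
compat_kstrdup(const char *str, struct malloc_type *type)
{
	size_t len = strlen(str) + 1;
	char *copy = compat_kmalloc(len, type, 0);

	memcpy(copy, str, len);
	return copy;
}
```

Keeping the kernel-style argument lists lets copied call sites stay untouched, which fits the stated goal of a small diff against sys/vfs/hammer2.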
It currently has the following limitations.
* Multi-volumes is unsupported, simply due to makefs(8) only taking 1 image file path.
* Not necessarily a limitation, but it only supports populating 1 PFS, which is "DATA" by default. Other PFSes if any won't have anything under the root PFS inode.
* The makefs(8) process gets killed by OOM for a directory with an *extremely* large number of files, depending on available memory. This is due to the way it currently tries to flush all chains in a single VFS_SYNC. Supporting multiple VFS_SYNC calls by checking available memory along the way gives a chance to free unused vnodes/inodes and chains. This may be implemented in future. This limitation is specific to HAMMER2, as all other makefs(8) filesystems are not CoW, meaning they allow in-place write based object creation from the top directory to the bottom, whereas HAMMER2 flushes chains in bottom-up direction.
|
Revision tags: v6.2.2, v6.2.1, v6.2.0, v6.3.0, v6.0.1, v6.0.0, v6.0.0rc1, v6.1.0 |
|
#
0b738157 |
| 25-Dec-2020 |
Tomohiro Kusumi <tkusumi@netbsd.org> |
sys/vfs/hammer2: Add initial multi-volumes support for HAMMER2
This commit adds initial multi-volumes support for HAMMER2. Maximum supported volumes is 64. The feature and implementation is similar to multi-volumes support in HAMMER1.
1. ondisk changes
=================
This commit bumps volume header version from 1 to 2, and adds four new volume header fields using reserved fields in version 1. Other ondisk structures are unchanged.
* "volu_id" - volume id from 0 to 63, where 0 represents root volume.
* "nvolumes" - number of volumes. All volumes have the same value.
* "total_size" - sum of "volu_size" in volumes. All volumes have the same value.
* "volu_loff[HAMMER2_MAX_VOLUMES]" - A 512-byte table which contains the start offsets of up to 64 volumes within "total_size". All volumes have the same value.
Version 1 volume header has 0 for above fields, so HAMMER2 internally treats "nvolumes" as 1, and "total_size" as "volu_size" to be able to handle version 1 and 2 transparently.
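The version 1 fallback described above can be sketched as a small normalization step. The struct below is a hypothetical, cut-down view of the relevant header fields (the field widths and the `effective_layout` helper are assumptions), not the real ondisk definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the volume header fields discussed above. */
struct h2_volhdr_view {
	uint32_t version;
	uint8_t  volu_id;
	uint8_t  nvolumes;	/* 0 in a version 1 header */
	uint64_t volu_size;
	uint64_t total_size;	/* 0 in a version 1 header */
};

/*
 * Report the volume count and total size to use internally, so that
 * callers can handle version 1 and version 2 headers the same way.
 */
static void
effective_layout(const struct h2_volhdr_view *hdr,
    int *nvolumes, uint64_t *total_size)
{
	if (hdr->nvolumes == 0) {
		/* Version 1: a single volume spanning volu_size. */
		*nvolumes = 1;
		*total_size = hdr->volu_size;
	} else {
		*nvolumes = hdr->nvolumes;
		*total_size = hdr->total_size;
	}
}
```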
All volumes have 4 headers, but only the root volume ones are relevant. Non-root volume headers have their own unique "volu_id" and "volu_size", but other fields are unimportant and never used. Non-root volume headers have sroot blockset[i] whose type is HAMMER2_BREF_TYPE_INVALID. Non-root volumes don't have a boot/aux area, so the freemap area starts from offset 0. Non-root volume headers are read-only and never updated after creation. This means non-root volumes are just extra storage to extend fs size and internally make up a single virtual volume whose size is "total_size".
It currently doesn't automatically upgrade an existing version 1 fs to version 2. Only newly created fs becomes version 2 for now.
2. volumes layout
=================
Basically similar to HAMMER1. The first block device argument provided to newfs_hammer2(8) becomes the root volume, and if specified, the remaining devices extend "total_size" as non-root volumes. All volumes except for the last one have 1GiB (freemap level1) aligned "volu_size".
This means each volume's start offset within "total_size" is also 1GiB (freemap level1) aligned. The start offsets of volumes are stored in volu_loff[HAMMER2_MAX_VOLUMES]. Each volu_loff[n] (0 <= n < nvolumes) represents the start offset of volume n within "total_size". Unused volumes have -1 for volu_loff[n].
e.g. If a fs consists of 1 volume, volu_loff[0] has 0 and the rest have -1.
e.g. If a fs consists of 3 volumes, x GiB root volume, y GiB volume, and z GiB volume, volu_loff[0] has 0, volu_loff[1] has x, volu_loff[2] has x+y, and the rest have -1.
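The layout rule above can be illustrated with a short sketch that fills an offsets table from a list of device sizes. It assumes that every volume except the last has its size rounded down to a 1GiB boundary. The helper name and constants are hypothetical, and the real newfs_hammer2(8) logic may differ.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VOLUMES	64			/* mirrors HAMMER2_MAX_VOLUMES */
#define LEVEL1_SIZE	((uint64_t)1 << 30)	/* 1GiB freemap level1 */
#define LOFF_UNUSED	((uint64_t)-1)		/* -1 marks an unused slot */

/*
 * Fill volu_loff[] from raw device sizes and return the resulting
 * total size. Assumption: all but the last volume are truncated to a
 * 1GiB multiple, so every start offset is 1GiB aligned.
 */
static uint64_t
build_volu_loff(const uint64_t *dev_size, int nvolumes,
    uint64_t volu_loff[MAX_VOLUMES])
{
	uint64_t total = 0;
	int i;

	assert(nvolumes >= 1 && nvolumes <= MAX_VOLUMES);
	for (i = 0; i < MAX_VOLUMES; i++)
		volu_loff[i] = LOFF_UNUSED;
	for (i = 0; i < nvolumes; i++) {
		uint64_t size = dev_size[i];

		if (i != nvolumes - 1)
			size &= ~(LEVEL1_SIZE - 1);
		volu_loff[i] = total;
		total += size;
	}
	return total;
}
```

With devices slightly larger than 2GiB and 3GiB followed by a 700MiB device, this yields start offsets of 0, 2GiB and 5GiB, matching the x, x+y pattern in the example above.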
The low-level I/O function in HAMMER2 uses this linear offsets table to determine a device vnode to use and the relative offset within the device vnode, for a given blockref's "data_off". This is different from HAMMER1 where the logical offset had embedded volume id bits (i.e. there were holes in the logical address space). HAMMER2 needs this table to support multi-volumes without changing the current logical offset mechanism.
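A lookup over the linear offsets table might look like the sketch below. It takes a plain byte offset within "total_size" and leaves out any decoding of the blockref's "data_off". The `volinfo` struct and `lookup_volume` name are hypothetical, and returning a volume index stands in for selecting a device vnode.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VOLUMES	64			/* mirrors HAMMER2_MAX_VOLUMES */
#define LOFF_UNUSED	((uint64_t)-1)		/* -1 marks an unused slot */

/* Hypothetical in-memory view of the multi-volume layout. */
struct volinfo {
	int      nvolumes;
	uint64_t total_size;
	uint64_t volu_loff[MAX_VOLUMES];
};

/*
 * Find the volume holding byte offset off and store the offset
 * relative to that volume's start in *rel. Returns the volume index,
 * or -1 if off lies outside total_size.
 */
static int
lookup_volume(const struct volinfo *vi, uint64_t off, uint64_t *rel)
{
	int i;

	assert(vi->nvolumes >= 1 && vi->nvolumes <= MAX_VOLUMES);
	if (off >= vi->total_size)
		return -1;
	/* Scan from the last volume down; start offsets are ascending. */
	for (i = vi->nvolumes - 1; i >= 0; i--) {
		if (vi->volu_loff[i] != LOFF_UNUSED &&
		    off >= vi->volu_loff[i]) {
			*rel = off - vi->volu_loff[i];
			return i;
		}
	}
	return -1;
}
```

Because the table is linear, the logical address space has no holes, and a scan over at most 64 entries is enough to resolve any offset.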
Unless all volumes are specified and mountable, mount_hammer2(8) fails like it failed in HAMMER1. This also applies to other userspace commands which require volumes specification, except for fstyp(8).
3. userspace commands
=====================
Basically same as or similar to HAMMER1.
* newfs_hammer2(8) takes a list of block device paths as argv[].
* mount_hammer2(8) takes block device paths or names in "a:b:c:..." format.
* hammer2(8) takes block device paths or names in "a:b:c:..." format for directives which require volumes specification. This commit also adds the "volume-list" directive and an ioctl command HAMMER2IOC_VOLUME_LIST, which are similar to the ones in HAMMER1.
* fsck_hammer2(8) takes device paths or names in "a:b:c:..." format.
* fstyp(8) takes device paths in "path1:path2:path3:..." format.
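Splitting an "a:b:c:..." volumes specification can be sketched as follows. This is an illustrative helper rather than the parser used by the commands above. The `split_volumes` name is hypothetical, and rejecting empty components is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VOLUMES 64	/* mirrors HAMMER2_MAX_VOLUMES */

/*
 * Split spec in place on ':' and store a pointer to each component in
 * paths[]. Returns the number of components, or -1 if a component is
 * empty or more than max components are present.
 */
static int
split_volumes(char *spec, char *paths[], int max)
{
	char *p = spec;
	int n = 0;

	for (;;) {
		char *colon = strchr(p, ':');

		if (colon != NULL)
			*colon = '\0';
		if (*p == '\0' || n == max)
			return -1;
		paths[n++] = p;
		if (colon == NULL)
			break;
		p = colon + 1;
	}
	return n;
}
```

A caller would pass a writable copy of the argument, since the string is modified in place, and would then open each returned path as one volume.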
4. limitations
==============
* hammer2(8) "info" directive ignores multi-volumes block devices.
* hammer2(8) "growfs" directive doesn't support multi-volumes fs.
* fstyp(8) is unable to find the PFS label via the -l option if the PFS inode or its parent indirect blocks are located beyond the root volume.
* hammer2(8) doesn't support the "volume-add" and "volume-del" directives which existed in HAMMER1, and there is currently no plan to support them.
|
Revision tags: v5.8.3, v5.8.2, v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3 |
|
#
a8607002 |
| 29-Sep-2019 |
Tomohiro Kusumi <kusumi.tomohiro@gmail.com> |
sbin/hammer2: Add sbin/hammer2/hammer2_subs.h
Separate a header for subs.c from <hammer2.h>.
This lets other HAMMER2 binaries drop unneeded dependencies required to use sbin/hammer2/subs.c (various unneeded OpenSSL header includes via <dmsg.h> via <hammer2.h>, global variables via <hammer2.h>).
This doesn't affect existing files which include <hammer2.h>.
|
#
5914303a |
| 12-Sep-2019 |
Tomohiro Kusumi <kusumi.tomohiro@gmail.com> |
sbin/newfs_hammer2: Use sbin/hammer2/subs.c
This is same as how sbin/newfs_hammer makes use of common code in sbin/hammer.
|
#
44acc24d |
| 16-Aug-2019 |
Tomohiro Kusumi <kusumi.tomohiro@gmail.com> |
sbin/*_hammer2: Fix/cleanup Makefile
|
#
4336fc8d |
| 11-Aug-2019 |
Tomohiro Kusumi <kusumi.tomohiro@gmail.com> |
sbin/newfs_hammer2: Drop -Isbin/hammer2 in Makefile
Unlike newfs_hammer(8), newfs_hammer2(8) just creates root inodes, therefore it doesn't (need to) rely on hammer2(8) for ondisk initialization nor does it have such functionality.
|
Revision tags: v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3, v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1, v5.2.2, v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc, v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0 |
|
#
7d565a4f |
| 08-Jun-2016 |
Matthew Dillon <dillon@apollo.backplane.com> |
hammer2 - Add xxhash to H2 and throw in debug stuff for performance testing.
* Add the xxhash. This is a high-speed non-cryptographic hash code algorithm. Sam pointed me at the site, the code is available on github and is BSD licensed:
git://github.com/Cyan4973/xxHash.git
This hash has good distribution and is very fast.
* Change HAMMER2 to default to using xxhash64 instead of iscsi_crc32(). xxhash can process data at several GBytes/sec, whereas even the multi-table iscsi_crc32() can only do around 500 MBytes/sec, which is too slow for today's modern storage subsystems (NVMe can nominally do 1.5-2.5 GBytes/sec, and high-end cards can do 5 GBytes/sec).
* There are four major paths that eat tons of CPU in H2:
- The XIO path does a ton of allocation/deallocation and synchronous messaging. This has not yet been fixed.
- The check code (when it was iscsi_crc32()) slowed everything down. This is fixed, the default check code is now xxhash64.
- The check code was being called over and over again for the same cached buffer due to the hammer2_chain_t structure being thrown away.
Currently a hack involving a mask stored in the underlying DIO is being used to indicate that the check code was previously valid. This is strictly temporary. The actual mask will have to be stored in the device buffer cache buffer and a second one in the chain structure.
The chain structure must be made persistent as well (not yet done).
- The DEDUP code was also calling iscsi_crc32() redundantly (at least for reads).
The read path has been fixed. The write path is doable but requires more coding (not yet fixed).
- The logical file cluster_read() in the kernel was not doing any read-ahead due to H2 not implementing BMAP, creating long synchronous latencies.
The kernel code for cluster_read() and cluster_readcb() has been fixed to do read-ahead whether a logical BMAP is implemented or not. H2 will now pipeline reads.
Suggested-by: Samuel J. Greear <sjg@thesjg.com> (xxhash)
|
Revision tags: v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2, v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0, v3.4.3, v3.4.2, v3.4.0, v3.4.1, v3.4.0rc, v3.5.0, v3.2.2 |
|
#
3a5aa68f |
| 25-Oct-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
hammer2 - Messaging layer separation work part 3
* Move more hammer2 vfs message code into kern_dmsg.c, renaming and implementing callbacks as needed.
* Move hammer2_icrc.c (the iscsi crc support) to libkern/icrc32.c
|
Revision tags: v3.2.1, v3.2.0, v3.3.0, v3.0.3, v3.0.2, v3.0.1 |
|
#
b33a7e92 |
| 09-Feb-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
hammer2 - Initial newfs_hammer2 implementation
* This is a rough start for the newfs_hammer2 implementation.
* Creates the volume header, super-root inode, and named root inode.
|