Revision tags: v6.2.1, v6.2.0, v6.3.0, v6.0.1, v6.0.0, v6.0.0rc1, v6.1.0

# e9dbfea1 | 21-Mar-2021 | Matthew Dillon <dillon@apollo.backplane.com>

kernel - Add kmalloc_obj subsystem step 1
* Implement per-zone memory management to kmalloc() in the form of kmalloc_obj() and friends. Currently the subsystem uses the same malloc_type structure but is otherwise distinct from the normal kmalloc(), so to avoid programming mistakes the *_obj() subsystem post-pends '_obj' to malloc_type pointers passed into it.
This mechanism will eventually replace objcache. This mechanism is designed to greatly reduce fragmentation issues on systems with long uptimes.
Eventually the feature will be better integrated and I will be able to remove the _obj stuff.
* This is an object allocator, so each zone must be dedicated to one type of object with a fixed size; all allocations out of the zone return that object (a simplified sketch of the fixed-size-zone idea follows this entry).
The allocator is not quite type-stable yet, but will be once existential locks are integrated into the freeing mechanism.
* Implement a mini-slab allocator for management. Since the zones are single-object, similar to objcache, the fixed-size mini-slabs are a lot easier to optimize and much simpler in construction than the main kernel slab allocator.
Uses a per-zone/per-cpu active/alternate slab with an ultra-optimized allocation path, and a per-zone partial/full/empty list.
Also has a globaldata-based per-cpu cache of free slabs. The mini-slab allocator frees slabs back to the same cpu they were originally allocated from in order to retain memory locality over time.
* Implement a passive cleanup poller. This currently polls kmalloc zones very slowly looking for excess full slabs to release back to the global slab cache or the system (if the global slab cache is full).
This code will ultimately also handle existential type-stable freeing.
* Fragmentation is greatly reduced due to the distinct zones. Slabs are dedicated to the zone and do not share allocation space with other zones. Also, when a zone is destroyed, all of its memory is cleanly disposed of and there will be no left-over fragmentation.
* Initially use the new interface for the following. These zones tend to or can become quite big:
    vnodes
    namecache (but not related strings)
    hammer2 chains
    hammer2 inodes
    tmpfs nodes
    tmpfs dirents (but not related strings)
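
The fixed-size, per-zone design described in this entry is easier to see in miniature. The following is a simplified userspace sketch of the idea only, not the DragonFly kmalloc_obj() API or its mini-slab code: each zone owns dedicated slabs, every allocation returns one fixed-size object, and destroying the zone releases whole slabs, which is why cross-zone fragmentation cannot build up. All names here are made up and error handling is omitted.

    /*
     * Hypothetical sketch of a fixed-size object zone (NOT the kernel API).
     * Each zone owns its own slabs; every allocation is one object of
     * zone->objsize, so destroying the zone frees whole slabs and leaves
     * no holes behind in shared malloc space.
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define SLAB_BYTES 16384        /* one slab backs objects of one zone only */

    struct slab {
        struct slab *next;
        size_t       nfree;
        void        *freelist;      /* free list threaded through free objects */
    };

    struct objzone {
        size_t       objsize;       /* fixed object size for this zone */
        struct slab *slabs;         /* all slabs owned by the zone */
    };

    static struct slab *
    slab_create(size_t objsize)
    {
        struct slab *sl = malloc(sizeof(*sl) + SLAB_BYTES);
        char *base = (char *)(sl + 1);
        size_t n = SLAB_BYTES / objsize;

        sl->next = NULL;
        sl->nfree = n;
        sl->freelist = NULL;
        /* thread a free list through the slab's objects */
        for (size_t i = 0; i < n; i++) {
            void *obj = base + i * objsize;
            *(void **)obj = sl->freelist;
            sl->freelist = obj;
        }
        return sl;
    }

    static void *
    zone_alloc(struct objzone *zone)
    {
        struct slab *sl;

        for (sl = zone->slabs; sl != NULL; sl = sl->next) {
            if (sl->nfree > 0) {
                void *obj = sl->freelist;
                sl->freelist = *(void **)obj;
                sl->nfree--;
                return obj;
            }
        }
        /* no free object anywhere: grow the zone by one dedicated slab */
        sl = slab_create(zone->objsize);
        sl->next = zone->slabs;
        zone->slabs = sl;
        return zone_alloc(zone);
    }

    static void
    zone_destroy(struct objzone *zone)
    {
        /* the whole zone goes away at once: no leftover fragmentation */
        struct slab *sl, *next;

        for (sl = zone->slabs; sl != NULL; sl = next) {
            next = sl->next;
            free(sl);
        }
        zone->slabs = NULL;
    }

    int
    main(void)
    {
        struct objzone vnode_zone = { .objsize = 128, .slabs = NULL };
        void *v1 = zone_alloc(&vnode_zone);
        void *v2 = zone_alloc(&vnode_zone);

        printf("allocated %p and %p from a 128-byte object zone\n", v1, v2);
        zone_destroy(&vnode_zone);
        return 0;
    }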

Revision tags: v5.8.3, v5.8.2, v5.8.1

# 307bf766 | 23-Apr-2020 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Change paging behavior, fix two directory-entry races
* Change the paging behavior for vfs.tmpfs.bufcache_mode.
These changes try to reduce unnecessary tmpfs flushes to swap when the pageout daemon is able to locate sufficient clean VM pages. The pageout daemon can still page tmpfs data to swap via its normal operation, but tmpfs itself will not force write()s to pipeline to swap unless memory pressure is severe.
0 - tmpfs write()s are pipelined to swap via the buffer cache only if the VM system is below the minimum free page count (this is the new default).
1 - tmpfs write()s are pipelined to swap via the buffer cache when the VM system is paging.
2 - Same as (1), but more aggressive about releasing buffer cache buffers.
3 - tmpfs write()s are always pipelined to swap via the buffer cache, regardless of memory pressure.
(A small userspace example of reading and adjusting this sysctl follows this entry.)
* Fix tmpfs file creation, hard-linking, and rename to ensure that the new file is not created in a deleted directory. We must lock the directory node around existing tests and add checks that were missing.
Also remove a few unnecessary recursive locks.
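
Since the modes above are selected through an ordinary sysctl node, they can be inspected and changed from userspace. Below is a small sketch that assumes vfs.tmpfs.bufcache_mode is an integer node (as the 0-3 values suggest); setting it requires root, and 'sysctl vfs.tmpfs.bufcache_mode' from the shell performs the same query.

    /* Read and optionally set vfs.tmpfs.bufcache_mode (assumes an int node). */
    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char **argv)
    {
        int mode;
        size_t len = sizeof(mode);

        if (sysctlbyname("vfs.tmpfs.bufcache_mode", &mode, &len, NULL, 0) < 0) {
            perror("sysctlbyname(read)");
            return 1;
        }
        printf("current bufcache_mode: %d\n", mode);

        if (argc > 1) {
            int newmode = atoi(argv[1]);    /* expected to be 0..3 */

            if (sysctlbyname("vfs.tmpfs.bufcache_mode", NULL, NULL,
                             &newmode, sizeof(newmode)) < 0) {
                perror("sysctlbyname(write)");  /* needs root */
                return 1;
            }
            printf("bufcache_mode set to %d\n", newmode);
        }
        return 0;
    }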

# f354e0e6 | 28-Mar-2020 | Sascha Wildner <saw@online.de>

kernel: Remove <sys/mutex.h> from all files that don't need it.
98% of these were remains from porting from FreeBSD which could have been removed after converting to lockmgr(), etc.
While here, do the same for <sys/mutex2.h>.

# 4d22d8ee | 04-Mar-2020 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Fix minor deadlock, refactor tn_links
* Fix a minor deadlock. tmpfs_alloc_vp() can rarely race a vnode and leave a dangling lock, causing a later umount to deadlock.
* Refactor tn_links to use atomic ops, mainly to clean up an almost impossible race that can happen at umount time (a minimal illustration of the lockless-counter pattern follows this entry).
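
The tn_links change is the usual move from a lock-protected counter to atomic operations. The snippet below is only a minimal C11 illustration of that pattern, not the tmpfs code; the kernel uses its own atomic primitives rather than <stdatomic.h>, and the names here are invented.

    /* Minimal C11 illustration of an atomically maintained link count. */
    #include <stdatomic.h>
    #include <stdio.h>

    struct node {
        atomic_int links;       /* stands in for tn_links */
    };

    static void
    node_ref_link(struct node *n)
    {
        atomic_fetch_add_explicit(&n->links, 1, memory_order_relaxed);
    }

    static int
    node_drop_link(struct node *n)
    {
        /* returns the remaining count; 0 means the last link went away */
        return atomic_fetch_sub_explicit(&n->links, 1, memory_order_acq_rel) - 1;
    }

    int
    main(void)
    {
        struct node n;

        atomic_init(&n.links, 1);
        node_ref_link(&n);                              /* e.g. a new hard link */
        printf("remaining: %d\n", node_drop_link(&n));
        printf("remaining: %d\n", node_drop_link(&n));  /* hits 0: node can go away */
        return 0;
    }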

Revision tags: v5.8.0

# a44ecf5c | 22-Feb-2020 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Cleanup, refactor tmpfs_alloc_vp()
* Refactor tmpfs_alloc_vp() to handle races without needing a weird intermediate TMPFS_VNODE_ALLOCATING state. This also removes the related ALLOCATING/WAIT code, which had a totally broken tsleep() call in it.
* Properly zero fields in tmpfs_alloc_node().
* Clean up some comments.

Revision tags: v5.9.0, v5.8.0rc1

# 00369c4a | 14-Feb-2020 | Matthew Dillon <dillon@apollo.backplane.com>

kernel - Rejigger mount code to add vfs_flags in struct vfsops
* Rejigger the mount code so we can add a vfs_flags field to vfsops, which mount_init() has visibility to.
* Allows nullfs to flag that its mounts do not need a syncer thread. Previously nullfs would destroy the syncer thread after the fact.
* Improves dsynth performance (it does lots of nullfs mounts).

# 9cd86db5 | 13-Feb-2020 | Matthew Dillon <dillon@apollo.backplane.com>

kernel - Improve tmpfs support
* When a file in tmpfs is truncated to a size that is not on a block boundary, or extended (but not written) to a size that is not on a block boundary, the nvextendbuf() and nvtruncbuf() functions must modify the contents of the straddling buffer and bdwrite().
However, a bdwrite() for a tmpfs buffer will result in a dirty buffer cache buffer and likely force it to be cycled out to swap relatively soon under a modest load. This is not desirable if there is no memory pressure present to force it out.
Tmpfs almost always uses buwrite() in order to leave the buffer 'clean' (the underlying VM pages are dirtied instead), to prevent unnecessary paging of tmpfs data to swap when the buffer gets recycled or the vnode cycles out.
* Add support for calling buwrite() in these functions by changing the 'trivial' boolean into a flags variable.
* Tmpfs now passes the appropriate flag, preventing the undesirable behavior.

# 4eb0bb82 | 13-Feb-2020 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Improve write clustering
* Set up bmap and max iosize parameters so the kernel's clustering code can actually cluster 16KB tmpfs blocks together into 64KB blocks.
* In low-memory situations the pageout daemon will flush tmpfs pages via the VM page queues. This ultimately runs through the tmpfs_vop_write() UIO_NOCOPY path which was previously using cluster_awrite(). However, because other nearby buffers are probably not present (buwrite()'s can allow buffers to be dismissed early), there is nothing for cluster_awrite() to latch onto to improve write granularity beyond 16KB.
Go back to using cluster_write() when SYNC and DIRECT are not specified. This allows the clustering code to collect buffers and flush them in larger chunks.
* Reduces low-memory tmpfs paging I/O overheads by 4x and generally increases paging throughput to SSD-based swap by 2x-4x. Tmpfs is now able to issue a lot more 64KB I/Os when under memory pressure.

Revision tags: v5.6.3

# 89984f3d | 14-Sep-2019 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Close rare vnode recycle race
* Keep the node lock held when clearing tn_vnode in tmpfs_reclaim() to protect against a use-after-free race on tn_vnode with another thread (a generic illustration of this pattern follows this entry).
* Keep the node locked across the node type check and vnode ref in tmpfs_unmount() to protect against asynchronous reclamation races.
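
The fix in the first bullet is the standard discipline of clearing a shared pointer only while its lock is held, so a concurrent reader observes either the old pointer (and can take a reference before dropping the lock) or NULL, but never a pointer that is being torn down. The sketch below is a generic pthread illustration of that pattern with invented names, not the tmpfs code.

    /* Generic illustration (not tmpfs code): clear a shared pointer only
     * while its lock is held, so readers see the old pointer or NULL,
     * never a half-torn-down object. */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct vnode_stub { int dummy; };

    struct node_stub {
        pthread_mutex_t    lock;
        struct vnode_stub *vp;          /* stands in for tn_vnode */
    };

    static void
    node_reclaim(struct node_stub *node)
    {
        pthread_mutex_lock(&node->lock);
        struct vnode_stub *vp = node->vp;
        node->vp = NULL;                /* cleared while the lock is held */
        pthread_mutex_unlock(&node->lock);
        free(vp);
    }

    static struct vnode_stub *
    node_get_vnode(struct node_stub *node)
    {
        pthread_mutex_lock(&node->lock);
        struct vnode_stub *vp = node->vp;
        /* real code would take a reference here, before dropping the lock */
        pthread_mutex_unlock(&node->lock);
        return vp;                      /* NULL: the vnode is already gone */
    }

    int
    main(void)
    {
        struct node_stub node;

        pthread_mutex_init(&node.lock, NULL);
        node.vp = calloc(1, sizeof(struct vnode_stub));

        printf("before reclaim: %p\n", (void *)node_get_vnode(&node));
        node_reclaim(&node);
        printf("after reclaim:  %p\n", (void *)node_get_vnode(&node));
        return 0;
    }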

Revision tags: v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3, v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1

# d89a0e31 | 25-Aug-2018 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Fix rare deadlock
* Fix a deadlock which can occur between umount and tmpfs, and possibly in other very rare situations.
* tmpfs holds the directory node locked when resolving a directory entry. This results in a lock order reversal between the directory's tmpfs_node lock and the vnode being locked.
Fixed by using a NOWAIT/UNLOCK/SLEEPFAIL/RETRY sequence (a generic illustration of this retry pattern follows this entry).
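
The NOWAIT/UNLOCK/SLEEPFAIL/RETRY sequence is the usual way to break a lock-order reversal: try the second lock without blocking, and if that fails, drop the first lock, wait for the second lock to become available, and restart the whole lookup. The sketch below is a generic pthread illustration of the pattern with invented names, not the tmpfs code.

    /* Generic sketch of the NOWAIT/UNLOCK/SLEEPFAIL/RETRY pattern using
     * pthreads. Lock order here is dir_lock -> vnode_lock; when the vnode
     * lock cannot be taken without blocking, back out completely and retry. */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t dir_lock   = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t vnode_lock = PTHREAD_MUTEX_INITIALIZER;

    static void
    lookup_with_retry(void)
    {
        for (;;) {
            pthread_mutex_lock(&dir_lock);          /* first lock, blocking */

            if (pthread_mutex_trylock(&vnode_lock) == 0) {
                /* got both locks in order: do the work */
                printf("resolved entry with both locks held\n");
                pthread_mutex_unlock(&vnode_lock);
                pthread_mutex_unlock(&dir_lock);
                return;
            }

            /* NOWAIT failed: unlock the directory to avoid the reversal ... */
            pthread_mutex_unlock(&dir_lock);

            /* ... wait until the vnode lock is obtainable (SLEEPFAIL) ... */
            pthread_mutex_lock(&vnode_lock);
            pthread_mutex_unlock(&vnode_lock);

            /* ... and RETRY the whole sequence from the top */
        }
    }

    int
    main(void)
    {
        lookup_with_retry();
        return 0;
    }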

Revision tags: v5.2.2, v5.2.1

# 513b5023 | 12-May-2018 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Fix tmpfs_fid, fix NFS exports
* Fix the tmpfs_fid structure; its 64-bit elements made it incompatible with the system fid mapping.
This fixes NFS exports of a tmpfs filesystem.
* Fix tmpfs_fhtovp(): the inode number can exceed tmp->tm_nodes_max, so do not error out in that case.

Revision tags: v5.2.0, v5.3.0, v5.2.0rc

# 51a529db | 19-Mar-2018 | Matthew Dillon <dillon@apollo.backplane.com>

kernel - Implement QUICKHALT shortcut for unmounting during shutdown
* Add the MNTK_QUICKHALT flag which allows the system to just unlink but otherwise ignore certain mount types during a halt or reboot. For now we flag tmpfs, devfs, and procfs.
* The main impetus for this is to reduce the messing around we do with devfs during a shutdown. Devfs has its fingers, and its vnodes, pretty much sunk throughout the system (e.g. /dev/null, system console, vty's, root mount, and so on and so forth). There's no real need to attempt to unwind all of that mess nicely.

# b2394163 | 12-Dec-2017 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - Fix arbitrary maximum file size limitation
* tmpfs's maxfilesize was limited to the original tmpfs mount storage limit, prior to argument overrides. In addition, the -f argument override would not increase the maxfilesize limit beyond the original storage limit anyway.
* Remove this limit calculation entirely. Instead the limit is based on the storage limit which can be optioned at mount time.
* Fixes expectations when tmpfs is used to hold just a few (or even just one) file.
Reported-by: kerma

Revision tags: v5.0.2, v5.0.1

# 25a86e44 | 29-Oct-2017 | Markus Pfeiffer <markus.pfeiffer@morphism.de>

kernel: Rename struct tmpfs_args to tmpfs_mount_info
This makes the names of vfs argument structures slightly more uniform.
Since they were not installed before, this should not break any userland software.

# f3c171e4 | 29-Oct-2017 | Markus Pfeiffer <markus.pfeiffer@morphism.de>

kernel: Rename tmpfs_args.h to tmpfs_mount.h
This is slightly more consistent with the other VFS.

Revision tags: v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc

# 87f62b1c | 08-Jan-2017 | Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix improper mplock in mount path
* VFS_MOUNT() was being called before MNTK_ALL_MPSAFE could be set by the filesystem, causing the operation to run with the MP token held.
* Add VFCF_MPSAFE to the vfsconf flags and specify it for MPSAFE filesystems in their VFS_SET() specification. This flag causes MNTK_ALL_MPSAFE to be set in mount->mnt_kern_flags prior to the VFS_MOUNT() call. Set this flag for devfs, procfs, tmpfs, nullfs, hammer, and hammer2.
* Primarily effects synth or other bulk-builds which do a lot of mounting.

Revision tags: v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0

# ef560bee | 24-May-2016 | Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/kern: Don't implement .vfs_sync unless sync is supported
The only reason filesystems with no need for syncing (e.g. no backing storage) implement .vfs_sync is that they need a sync that returns 0 on unmount.
If unmount instead accepts a sync return value of EOPNOTSUPP from filesystems that do not support sync, those filesystems no longer have to implement .vfs_sync with vfs_stdsync() just to pass dounmount().
The drawback is the case where a sync routine (other than vfs_stdnosync) returns EOPNOTSUPP for real errors. The existing filesystems in DragonFly don't do this (and shouldn't, either).
Also see https://bugs.dragonflybsd.org/issues/2912.
# grep "\.vfs_sync" sys/vfs sys/gnu/vfs -rI | grep vfs_stdsync
sys/vfs/udf/udf_vfsops.c: .vfs_sync = vfs_stdsync,
sys/vfs/portal/portal_vfsops.c: .vfs_sync = vfs_stdsync
sys/vfs/devfs/devfs_vfsops.c: .vfs_sync = vfs_stdsync,
sys/vfs/isofs/cd9660/cd9660_vfsops.c: .vfs_sync = vfs_stdsync,
sys/vfs/autofs/autofs_vfsops.c: .vfs_sync = vfs_stdsync, /* for unmount(2) */
sys/vfs/tmpfs/tmpfs_vfsops.c: .vfs_sync = vfs_stdsync,
sys/vfs/dirfs/dirfs_vfsops.c: .vfs_sync = vfs_stdsync,
sys/vfs/ntfs/ntfs_vfsops.c: .vfs_sync = vfs_stdsync,
sys/vfs/procfs/procfs_vfsops.c: .vfs_sync = vfs_stdsync
sys/vfs/hpfs/hpfs_vfsops.c: .vfs_sync = vfs_stdsync,
sys/vfs/nullfs/null_vfsops.c: .vfs_sync = vfs_stdsync,

Revision tags: v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc

# 5b09d16c | 12-May-2015 | Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/tmpfs: Rename ROOTINO to TMPFS_ROOTINO
- Rename it so that utility programs can distinguish it from UFS's ROOTINO when they need to include filesystem headers, possibly in the future.

# de8da6a3 | 11-May-2015 | Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/tmpfs: Bring in a macro from UFS
- Bring in the ROOTINO macro from UFS, whose root inode number also starts from 2.

# 4b3494c0 | 06-May-2015 | Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/tmpfs: Fix typo
- Flush all vnodes on unmount(2).

# d9d4a3f4 | 06-May-2015 | Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/tmpfs: Fix assertion
- It's always expected to be 2.

# 4dd27cb8 | 04-May-2015 | Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/tmpfs: Remove trailing whitespace
- Lines changed by 66fa44e7 have trailing whitespace for some reason.

Revision tags: v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2, v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc

# 99ebfb7c | 06-May-2014 | Sascha Wildner <saw@online.de>

kernel: Fix some boolean_t vs. int confusion.
When boolean_t is defined to be _Bool instead of int (not part of this commit), this is what gcc is sad about.
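
For context, the following small program (an illustration only, not code from the commit) shows the kind of confusion the entry refers to: with boolean_t typedef'd to int, a 'boolean' can hold values other than 0 and 1 and comparisons against TRUE silently misbehave, while _Bool normalizes the stored value, and gcc additionally warns about int/_Bool mismatches.

    /* Illustration of boolean_t-vs-int confusion (not code from the commit). */
    #include <stdio.h>

    #define TRUE  1
    #define FALSE 0

    typedef int   boolean_int;      /* old-style boolean_t */
    typedef _Bool boolean_bool;     /* boolean_t as _Bool  */

    int
    main(void)
    {
        int flags = 0x10;

        boolean_int  bi = flags & 0x10;     /* bi == 16 */
        boolean_bool bb = flags & 0x10;     /* bb == 1: conversion normalizes */

        printf("int-boolean   == TRUE? %s\n", (bi == TRUE) ? "yes" : "no");  /* no  */
        printf("_Bool-boolean == TRUE? %s\n", (bb == TRUE) ? "yes" : "no");  /* yes */
        return 0;
    }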

Revision tags: v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.7.0

# ff837cd5 | 23-Oct-2013 | Matthew Dillon <dillon@apollo.backplane.com>

tmpfs - remove most mp->mnt_token cases, kqueue filterops are MPSAFE
* tmpfs's kqueue filterops are MPSAFE, set appropriate flag.
* tmpfs's vnops frontend universally obtained the tmpfs mnt_token, but most of tmpfs's underlying code was already sub-locked by node.
Remove most mnt_token use cases and make the portions that were not safe, safe. This was primarily the directory lookup and scanning code and code to create, delete, and rename files.
* Should greatly improve tmpfs concurrency.

# ee173d09 | 20-Oct-2013 | Sascha Wildner <saw@online.de>

kernel - Rewrite vnode ref-counting code to improve performance
* Rewrite the vnode ref-counting code and modify operation to not immediately VOP_INACTIVE a vnode when its refs drops to 0. By doing so we avoid cycling vnodes through exclusive locks when temporarily accessing them (such as in a path lookup). Shared locks can be used throughout.
* Track active/inactive vnodes a bit differently, keep track of the number of vnodes that are still active but have zero refs, and rewrite the vnode freeing code to use the new statistics to deactivate cached vnodes.