/dragonfly/sys/dev/drm/include/linux/ |
H A D | mutex.h | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
* Seriously refactor lockmgr() so we can use atomic_fetchadd_*() for shared locks and reduce unnecessary atomic ops and atomic op loops.
The main win here is being able to use atomic_fetchadd_*() when acquiring and releasing shared locks. A simple fstat() loop (which uses an LK_SHARED lockmgr lock on the vnode) improves from 191ns to around 110ns per loop with 32 concurrent threads (on a 16-core/32-thread Xeon).
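As a rough illustration of why a fetch-add helps, the following sketch takes and drops a shared reference with a single atomic add each, with no compare-and-set retry loop. It is not the kernel code: it uses C11 <stdatomic.h> in place of the kernel's atomic_fetchadd_*() primitives, the names sk_try_shared() and sk_release_shared() and the SK_* bit values are hypothetical, and it ignores the request flags entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the real LKC_* definitions. */
#define SK_SCOUNT (UINT64_C(1) << 32)      /* one shared reference (high 32 bits) */
#define SK_XMASK  UINT64_C(0x0fffffff)     /* exclusive count (low bits) */

/*
 * Try to take a shared reference with one fetch-add.
 * Returns false (after backing the add out) if an exclusive holder exists;
 * a real lock would block and retry at that point.
 */
static inline bool
sk_try_shared(_Atomic uint64_t *count)
{
	uint64_t old = atomic_fetch_add(count, SK_SCOUNT);

	if ((old & SK_XMASK) == 0)
		return true;                   /* no exclusive holder: granted */
	atomic_fetch_sub(count, SK_SCOUNT);    /* undo our speculative add */
	return false;
}

/* Drop a shared reference, again with a single atomic operation. */
static inline void
sk_release_shared(_Atomic uint64_t *count)
{
	uint64_t old = atomic_fetch_sub(count, SK_SCOUNT);

	assert(old >= SK_SCOUNT);              /* caller must hold a shared ref */
	(void)old;
}
```

In the uncontended-by-writers case each thread performs exactly one atomic read-modify-write per acquire or release, which is the property the timing above benefits from.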
* To accomplish this, the 32-bit lk_count field becomes 64 bits. The shared count is separated into the high 32 bits, allowing it to be manipulated both for blocking shared requests and as the shared lock count. The low count bits are used for exclusive locks. Control bits are adjusted to manage lockmgr features.
  LKC_SHARED   Indicates that the shared lock count is active, else the exclusive lock count. Can predispose the lock when the related count is 0 (the flag does not have to be cleared, for example).
  LKC_UPREQ    Queued upgrade request. Automatically granted by the releasing entity (UPREQ -> ~SHARED|1).
  LKC_EXREQ    Queued exclusive request (only when the lock is held shared). Automatically granted by the releasing entity (EXREQ -> ~SHARED|1).
  LKC_EXREQ2   Aggregated exclusive request. When EXREQ cannot be obtained because the lock is held exclusively or EXREQ is already queued, EXREQ2 is flagged for wakeup/retries.
  LKC_CANCEL   Cancel API support.
  LKC_SMASK    Shared lock count mask (LKC_SCOUNT increments).
  LKC_XMASK    Exclusive lock count mask (+1 increments).
The 'no lock' condition occurs when LKC_XMASK is 0 and LKC_SMASK is 0, regardless of the state of LKC_SHARED.
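The layout described above can be sketched as a set of masks over a 64-bit word. The bit positions below are assumptions chosen only to match the description (shared count in the high 32 bits, exclusive count in the low bits, a control flag in between); the real LKC_* values are defined in sys/sys/lock.h and may differ. The SK_* names and helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for illustration only. */
#define SK_XMASK   UINT64_C(0x000000000fffffff)  /* exclusive count, +1 increments */
#define SK_SHARED  UINT64_C(0x0000000040000000)  /* shared count is the active one */
#define SK_SSHIFT  32
#define SK_SCOUNT  (UINT64_C(1) << SK_SSHIFT)    /* one shared reference */
#define SK_SMASK   UINT64_C(0xffffffff00000000)  /* shared count, high 32 bits */

/* The counts and the control flag must occupy disjoint bits. */
static_assert((SK_XMASK & SK_SMASK) == 0, "count fields overlap");
static_assert((SK_SHARED & (SK_XMASK | SK_SMASK)) == 0, "flag overlaps a count");

static inline uint32_t
sk_shared_count(uint64_t c)
{
	return (uint32_t)((c & SK_SMASK) >> SK_SSHIFT);
}

static inline uint32_t
sk_excl_count(uint64_t c)
{
	return (uint32_t)(c & SK_XMASK);
}

/* 'No lock': both counts are zero, regardless of the SHARED flag. */
static inline bool
sk_unlocked(uint64_t c)
{
	return (c & (SK_XMASK | SK_SMASK)) == 0;
}
```

Note that sk_unlocked() deliberately ignores SK_SHARED, mirroring the statement that the flag may be left set while the related count is 0.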
* Lockmgr still supports exclusive priority over shared locks. The semantics have slightly changed. The priority mechanism only applies to the EXREQ holder. Once an exclusive lock is obtained, any blocking shared or exclusive locks will have equal priority until the exclusive lock is released. Once released, shared locks can squeeze in, but then the next pending exclusive lock will assert its priority over any new shared locks when it wakes up and loops.
This isn't quite what I wanted, but it seems to work quite well. I had to make a trade-off in the EXREQ lock-grant mechanism to improve performance.
* In addition, we use atomic_fcmpset_long() instead of atomic_cmpset_long() to reduce cache line flip flopping at least a little.
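A minimal sketch of the fcmpset-style retry loop, assuming that atomic_fcmpset_long() writes the observed value back into the caller's 'old' variable on failure. C11's atomic_compare_exchange_weak() has that behavior and stands in for it here. The function sk_try_excl() and the SK_* values are hypothetical, and the sketch only handles the fully unlocked case.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the real LKC_* definitions. */
#define SK_XMASK   UINT64_C(0x000000000fffffff)
#define SK_SHARED  UINT64_C(0x0000000040000000)
#define SK_SMASK   UINT64_C(0xffffffff00000000)

/*
 * Try to take the lock exclusively when nobody holds it (~SHARED|1).
 * On a failed compare-exchange 'old' is refreshed with the current value,
 * so the loop does not issue a separate reload of the contended word.
 */
static inline bool
sk_try_excl(_Atomic uint64_t *count)
{
	uint64_t old = atomic_load(count);

	for (;;) {
		if (old & (SK_XMASK | SK_SMASK))
			return false;          /* held shared or exclusive */
		if (atomic_compare_exchange_weak(count, &old,
		    (old & ~SK_SHARED) | 1))
			return true;           /* exclusive count is now 1 */
		/* 'old' now holds the value another CPU stored; re-test it. */
	}
}
```

With a plain cmpset-style primitive the loop would have to re-read the word after every failure, which is an extra access to a cache line that other CPUs are actively modifying.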
* Remove lockcount() and lockcountnb(), which tried to count lock refs. Replace with lockinuse(), which simply tells the caller whether the lock is referenced or not.
* Expand some of the copyright notices (years and authors) for major rewrites. Really there are a lot more and I have to pay more attention to adjustments.
|
H A D | spinlock.h | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
/dragonfly/sys/dev/disk/dm/ |
H A D | dm_table.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
/dragonfly/sys/sys/ |
H A D | buf2.h | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
H A D | lock.h | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
/dragonfly/sys/vfs/tmpfs/ |
H A D | tmpfs_subr.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
/dragonfly/sys/kern/ |
H A D | vfs_lock.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
H A D | kern_lock.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
H A D | kern_shutdown.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
H A D | vfs_bio.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
H A D | vfs_subr.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
/dragonfly/sys/vfs/ufs/ |
H A D | ffs_softdep.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
* Seriously refactor lockmgr() so we can use atomic_fetchadd_*() for shared locks and reduce unnecessary atomic ops and atomic op loops.
The main win here is being able to use atomic_fetchadd_*() when acquiring and releasing shared locks. A simple fstat() loop (which utilizes a LK_SHARED lockmgr lock on the vnode) improves from 191ns to around 110ns per loop with 32 concurrent threads (on a 16-core/ 32-thread xeon).
* To accomplish this, the 32-bit lk_count field becomes 64-bits. The shared count is separated into the high 32-bits, allowing it to be manipulated for both blocking shared requests and the shared lock count field. The low count bits are used for exclusive locks. Control bits are adjusted to manage lockmgr features.
    LKC_SHARED   Indicates shared lock count is active, else excl lock count. Can predispose the lock when the related count is 0 (does not have to be cleared, for example).
    LKC_UPREQ    Queued upgrade request. Automatically granted by releasing entity (UPREQ -> ~SHARED|1).
    LKC_EXREQ    Queued exclusive request (only when lock held shared). Automatically granted by releasing entity (EXREQ -> ~SHARED|1).
    LKC_EXREQ2   Aggregated exclusive request. When EXREQ cannot be obtained due to the lock being held exclusively or EXREQ already being queued, EXREQ2 is flagged for wakeup/retries.
    LKC_CANCEL   Cancel API support.
    LKC_SMASK    Shared lock count mask (LKC_SCOUNT increments).
    LKC_XMASK    Exclusive lock count mask (+1 increments).
The 'no lock' condition occurs when LKC_XMASK is 0 and LKC_SMASK is 0, regardless of the state of LKC_SHARED.
* Lockmgr still supports exclusive priority over shared locks. The semantics have slightly changed. The priority mechanism only applies to the EXREQ holder. Once an exclusive lock is obtained, any blocking shared or exclusive locks will have equal priority until the exclusive lock is released. Once released, shared locks can squeeze in, but then the next pending exclusive lock will assert its priority over any new shared locks when it wakes up and loops.
This isn't quite what I wanted, but it seems to work quite well. I had to make a trade-off in the EXREQ lock-grant mechanism to improve performance.
* In addition, we use atomic_fcmpset_long() instead of atomic_cmpset_long() to reduce cache line flip flopping at least a little.
* Remove lockcount() and lockcountnb(), which tried to count lock refs. Replace with lockinuse(), which simply tells the caller whether the lock is referenced or not.
* Expand some of the copyright notices (years and authors) for major rewrites. Really there are a lot more and I have to pay more attention to adjustments.
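The count layout described in this commit message can be sketched in userland C. This is an illustration under stated assumptions, not the lockmgr implementation: it uses C11 `<stdatomic.h>` in place of the kernel's atomic_fetchadd_*() and atomic_fcmpset_long(), the `SK_*` bit values and all `sk_*` function names are hypothetical, and the request-grant handoff (UPREQ/EXREQ/EXREQ2), cancel support, blocking, and recursion are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for this sketch only. */
#define SK_XMASK  UINT64_C(0x0000000003FFFFFF) /* exclusive count, +1 steps  */
#define SK_UPREQ  UINT64_C(0x0000000010000000) /* queued upgrade request     */
#define SK_SHARED UINT64_C(0x0000000020000000) /* shared count is the active */
#define SK_EXREQ  UINT64_C(0x0000000040000000) /* queued exclusive request   */
#define SK_SCOUNT UINT64_C(0x0000000100000000) /* one shared reference       */
#define SK_SMASK  UINT64_C(0xFFFFFFFF00000000) /* shared count, high 32 bits */

struct sk_lock {
    _Atomic uint64_t count;
};

/* Start unlocked but predisposed to shared use: both counts are zero. */
static void
sk_init(struct sk_lock *lk)
{
    atomic_init(&lk->count, SK_SHARED);
}

/* In the spirit of lockinuse(): referenced or not, ignoring SK_SHARED. */
static bool
sk_inuse(struct sk_lock *lk)
{
    uint64_t c = atomic_load_explicit(&lk->count, memory_order_relaxed);
    return (c & (SK_XMASK | SK_SMASK)) != 0;
}

/* Shared fast path: one unconditional fetch-add, backed out on conflict. */
static bool
sk_try_shared(struct sk_lock *lk)
{
    uint64_t old = atomic_fetch_add_explicit(&lk->count, SK_SCOUNT,
                                             memory_order_acquire);
    if ((old & SK_SHARED) != 0 && (old & (SK_EXREQ | SK_UPREQ)) == 0)
        return true;
    /* Not in shared mode, or a writer is queued: undo the bump. */
    atomic_fetch_sub_explicit(&lk->count, SK_SCOUNT, memory_order_relaxed);
    return false;
}

static void
sk_rel_shared(struct sk_lock *lk)
{
    atomic_fetch_sub_explicit(&lk->count, SK_SCOUNT, memory_order_release);
}

/* Exclusive acquire needs both counts at zero, so it uses a CAS loop. */
static bool
sk_try_excl(struct sk_lock *lk)
{
    uint64_t old = atomic_load_explicit(&lk->count, memory_order_relaxed);

    while ((old & (SK_XMASK | SK_SMASK)) == 0) {
        if (atomic_compare_exchange_weak_explicit(&lk->count, &old,
                (old & ~SK_SHARED) | 1,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;
}

/* Drop one exclusive reference; re-arm SK_SHARED when the count hits 0. */
static void
sk_rel_excl(struct sk_lock *lk)
{
    uint64_t old = atomic_load_explicit(&lk->count, memory_order_relaxed);
    uint64_t new;

    do {
        new = old - 1;
        if ((new & SK_XMASK) == 0)
            new |= SK_SHARED;
    } while (!atomic_compare_exchange_weak_explicit(&lk->count, &old, new,
                 memory_order_release, memory_order_relaxed));
}
```

In this sketch a failed `sk_try_shared()` briefly raises the shared count before backing out, so `sk_inuse()` can momentarily report true on an idle lock that is not in shared mode. The commit message notes that the high bits are manipulated for blocking shared requests as well as for held shared locks, but how the real code accounts for that is not reproduced here.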
|
/dragonfly/sys/vfs/nfs/ |
H A D | nfs_subs.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|
H A D | nfs_vnops.c | 3b6a19b2 Tue Oct 24 01:39:16 GMT 2017 Matthew Dillon <dillon@apollo.backplane.com> kernel - Refactor lockmgr()
|