#
08c2d5b4 |
| 09-Apr-2022 |
riastradh <riastradh@NetBSD.org> |
Introduce membar_acquire/release. Deprecate membar_enter/exit.
The names membar_enter/exit were unclear, and the documentation of membar_enter has disagreed with the implementations on sparc, powerpc, and even x86(!) for the entire time it has been in NetBSD.
The terms `acquire' and `release' are ubiquitous in the literature today, and have been adopted in the C and C++ standards to mean load-before-load/store and load/store-before-store, respectively, which are exactly the orderings required by acquiring and releasing a mutex, as well as other useful applications like decrementing a reference count and then freeing the underlying object if it went to zero.
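The reference-count case mentioned above can be sketched in portable C11 as follows. This is an illustration only, not code from the commit: struct obj, obj_create, obj_hold, and obj_rele are hypothetical names, and the C11 memory_order_release decrement followed by an acquire fence is assumed here to stand in for the kernel pattern of membar_release(), then an atomic decrement, then membar_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object, for illustration only. */
struct obj {
	atomic_uint refcnt;
	int payload;
};

static struct obj *
obj_create(int payload)
{
	struct obj *o = malloc(sizeof(*o));

	if (o == NULL)
		return NULL;
	atomic_init(&o->refcnt, 1);	/* the creator holds one reference */
	o->payload = payload;
	return o;
}

static void
obj_hold(struct obj *o)
{
	/* Taking a reference needs atomicity but no ordering. */
	atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/* Returns true if this call dropped the last reference and freed o. */
static bool
obj_rele(struct obj *o)
{
	/*
	 * Release (load/store-before-store): everything this thread
	 * did to *o is ordered before the decrement.
	 */
	if (atomic_fetch_sub_explicit(&o->refcnt, 1,
	    memory_order_release) != 1)
		return false;

	/*
	 * Acquire (load-before-load/store): observing the count hit
	 * zero is ordered before the free, so the free cannot race
	 * with other threads' earlier uses of *o.
	 */
	atomic_thread_fence(memory_order_acquire);
	assert(atomic_load_explicit(&o->refcnt, memory_order_relaxed) == 0);
	free(o);
	return true;
}
```

Only the thread that sees the count go to zero pays for the acquire fence; every other release is just the release-ordered decrement.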
Originally I proposed changing one word in the documentation for membar_enter to make it load-before-load/store instead of store-before-load/store, i.e., to make it an acquire barrier. I proposed this on the grounds that
(a) all implementations guarantee load-before-load/store, (b) some implementations fail to guarantee store-before-load/store, and (c) all uses in-tree assume load-before-load/store.
I verified parts (a) and (b) (except, for (a), powerpc didn't even guarantee load-before-load/store -- isync isn't necessarily enough; need lwsync in general -- but it _almost_ did, and it certainly didn't guarantee store-before-load/store).
Part (c) might not be correct, however: under the mistaken assumption that atomic-r/m/w then membar-w/rw is equivalent to atomic-r/m/w then membar-r/rw, I only audited the cases of membar_enter that _aren't_ immediately after an atomic-r/m/w. All of those cases assume load-before-load/store. But my assumption was wrong -- there are cases of atomic-r/m/w then membar-w/rw that would be broken by changing to atomic-r/m/w then membar-r/rw:
https://mail-index.netbsd.org/tech-kern/2022/03/29/msg028044.html
Furthermore, the name membar_enter has been adopted in other places like OpenBSD where it actually does follow the documentation and guarantee store-before-load/store, even if that order is not useful. So the name membar_enter currently lives in a bad place where it means either of two things -- r/rw or w/rw.
With this change, we deprecate membar_enter/exit, introduce membar_acquire/release as better names for the useful pair (r/rw and rw/w), and make sure the implementation of membar_enter guarantees both what was documented _and_ what was implemented, making it an alias for membar_sync.
While here, rework all of the membar_* definitions and aliases. The new logic follows a rule to make it easier to audit:
membar_X is defined as an alias for membar_Y iff membar_X is guaranteed by membar_Y.
The `no stronger than' relation is (the transitive closure of):
- membar_consumer (r/r) is guaranteed by membar_acquire (r/rw)
- membar_producer (w/w) is guaranteed by membar_release (rw/w)
- membar_acquire (r/rw) is guaranteed by membar_sync (rw/rw)
- membar_release (rw/w) is guaranteed by membar_sync (rw/rw)
And, for the deprecated membars:
- membar_enter (whether r/rw, w/rw, or rw/rw) is guaranteed by membar_sync (rw/rw)
- membar_exit (rw/w) is guaranteed by membar_release (rw/w)
(membar_exit is identical to membar_release, but the name is deprecated.)
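One way to audit the aliasing rule is to model each barrier as the set of orderings it enforces and to read `guaranteed by' as a subset test. The sketch below does that in C; the bit names, the ORD_* macros, and guaranteed_by are invented for this illustration and do not appear in the tree.

```c
#include <assert.h>
#include <stdbool.h>

/* The four basic orderings, one bit each (hypothetical encoding). */
enum {
	LL = 1 << 0,	/* load-before-load */
	LS = 1 << 1,	/* load-before-store */
	SL = 1 << 2,	/* store-before-load */
	SS = 1 << 3	/* store-before-store */
};

#define	ORD_CONSUMER	(LL)			/* r/r */
#define	ORD_PRODUCER	(SS)			/* w/w */
#define	ORD_ACQUIRE	(LL | LS)		/* r/rw */
#define	ORD_RELEASE	(LS | SS)		/* rw/w */
#define	ORD_SYNC	(LL | LS | SL | SS)	/* rw/rw */
#define	ORD_ENTER_DOC	(SL | SS)		/* w/rw, as documented */
#define	ORD_ENTER_IMPL	(LL | LS)		/* r/rw, as implemented */

/* x is guaranteed by y iff every ordering in x is also in y. */
static bool
guaranteed_by(unsigned x, unsigned y)
{
	return (x & ~y) == 0;
}

/* Check the relation listed in the commit message. */
static void
check_alias_rules(void)
{
	assert(guaranteed_by(ORD_CONSUMER, ORD_ACQUIRE));
	assert(guaranteed_by(ORD_PRODUCER, ORD_RELEASE));
	assert(guaranteed_by(ORD_ACQUIRE, ORD_SYNC));
	assert(guaranteed_by(ORD_RELEASE, ORD_SYNC));
	/* Both readings of membar_enter are covered only by sync. */
	assert(guaranteed_by(ORD_ENTER_DOC | ORD_ENTER_IMPL, ORD_SYNC));
}
```

Under this model the documented membar_enter (w/rw) is not a subset of membar_acquire (r/rw), which is why membar_enter can only safely alias membar_sync.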
Finally, while here, annotate some of the instructions with their semantics. For powerpc, leave an essay with citations on the unfortunate but -- as far as I can tell -- necessary decision to use lwsync, not isync, for membar_acquire and membar_consumer.
Also add membar(3) and atomic(3) man page links.
|
#
e5affa2d |
| 09-Apr-2022 |
riastradh <riastradh@NetBSD.org> |
sparc: Fix membar_sync with LDSTUB.
membar_sync is required to be a full sequential consistency barrier, equivalent to MEMBAR #StoreStore|LoadStore|StoreLoad|LoadLoad on sparcv9. LDSTUB and SWAP are the only pre-v9 instructions that do this and SWAP doesn't exist on all v7 hardware, so use LDSTUB.
Note: I'm having a hard time nailing down a reference for the ordering implied by LDSTUB and SWAP. I'm _pretty sure_ SWAP has to imply store-load ordering since the SPARCv8 manual recommends it for Dekker's algorithm (which notoriously requires store-load ordering), and the formal memory model treats LDSTUB and SWAP the same for ordering. But the v8 and v9 manuals aren't clear.
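To make the store-load requirement concrete, here is a minimal sketch of the entry step of a Dekker-style handshake in portable C11 with POSIX threads. It is not the sparc code from this commit: the names flag0, flag1, and run_dekker_round are hypothetical, and atomic_thread_fence(memory_order_seq_cst) is assumed to play the role that membar_sync (LDSTUB on pre-v9 sparc) plays in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int flag0, flag1;
static int saw0, saw1;	/* what each thread read of the other's flag */

static void *
thread0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&flag0, 1, memory_order_relaxed);
	/* Store-before-load: the step that needs a full barrier. */
	atomic_thread_fence(memory_order_seq_cst);
	saw0 = atomic_load_explicit(&flag1, memory_order_relaxed);
	return NULL;
}

static void *
thread1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&flag1, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	saw1 = atomic_load_explicit(&flag0, memory_order_relaxed);
	return NULL;
}

/*
 * Run one round; returns 1 if at least one thread saw the other's
 * flag.  With the full fences in place, both threads reading 0 is
 * forbidden by the C11 memory model.
 */
static int
run_dekker_round(void)
{
	pthread_t t0, t1;
	int error;

	atomic_store(&flag0, 0);
	atomic_store(&flag1, 0);
	saw0 = saw1 = 0;

	error = pthread_create(&t0, NULL, thread0, NULL);
	assert(error == 0);
	error = pthread_create(&t1, NULL, thread1, NULL);
	assert(error == 0);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);

	return saw0 || saw1;
}
```

If the fences are weakened to acquire or release, the model allows both threads to read 0, and both could then enter the critical section. That is the failure a barrier without store-load ordering permits.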
GCC issues STBAR and LDSTUB, but (a) I don't see why STBAR is necessary here, (b) STBAR doesn't exist on v7 so it'd be a pain to use, and (c) from what I've heard (although again it's hard to nail down authoritative references here) all actual SPARC hardware is TSO or SC anyway so STBAR is a noop in all the silicon anyway.
Either way, certainly this is better than what we had before, which was nothing implying ordering at all, just a store!
|