109467b48Spatrick========= 209467b48SpatrickMemorySSA 309467b48Spatrick========= 409467b48Spatrick 509467b48Spatrick.. contents:: 609467b48Spatrick :local: 709467b48Spatrick 809467b48SpatrickIntroduction 909467b48Spatrick============ 1009467b48Spatrick 1109467b48Spatrick``MemorySSA`` is an analysis that allows us to cheaply reason about the 1209467b48Spatrickinteractions between various memory operations. Its goal is to replace 1309467b48Spatrick``MemoryDependenceAnalysis`` for most (if not all) use-cases. This is because, 1409467b48Spatrickunless you're very careful, use of ``MemoryDependenceAnalysis`` can easily 1509467b48Spatrickresult in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA`` doesn't 1609467b48Spatrickhave as many arbitrary limits as ``MemoryDependenceAnalysis``, so you should get 17097a140dSpatrickbetter results, too. One common use of ``MemorySSA`` is to quickly find out 18097a140dSpatrickthat something definitely cannot happen (for example, reason that a hoist 19097a140dSpatrickout of a loop can't happen). 2009467b48Spatrick 2109467b48SpatrickAt a high level, one of the goals of ``MemorySSA`` is to provide an SSA based 2209467b48Spatrickform for memory, complete with def-use and use-def chains, which 2309467b48Spatrickenables users to quickly find may-def and may-uses of memory operations. 2409467b48SpatrickIt can also be thought of as a way to cheaply give versions to the complete 25097a140dSpatrickstate of memory, and associate memory operations with those versions. 2609467b48Spatrick 2709467b48SpatrickThis document goes over how ``MemorySSA`` is structured, and some basic 2809467b48Spatrickintuition on how ``MemorySSA`` works. 2909467b48Spatrick 3009467b48SpatrickA paper on MemorySSA (with notes about how it's implemented in GCC) `can be 3109467b48Spatrickfound here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. Though, it's 32097a140dSpatrickrelatively out-of-date; the paper references multiple memory partitions, but GCC 3309467b48Spatrickeventually swapped to just using one, like we now have in LLVM. Like 3409467b48SpatrickGCC's, LLVM's MemorySSA is intraprocedural. 3509467b48Spatrick 3609467b48Spatrick 3709467b48SpatrickMemorySSA Structure 3809467b48Spatrick=================== 3909467b48Spatrick 4009467b48SpatrickMemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a 4109467b48Spatrickstructure that maps ``Instruction``\ s to ``MemoryAccess``\ es, which are 4209467b48Spatrick``MemorySSA``'s parallel to LLVM ``Instruction``\ s. 4309467b48Spatrick 4409467b48SpatrickEach ``MemoryAccess`` can be one of three types: 4509467b48Spatrick 46097a140dSpatrick- ``MemoryDef`` 4709467b48Spatrick- ``MemoryPhi`` 4809467b48Spatrick- ``MemoryUse`` 49097a140dSpatrick 50097a140dSpatrick``MemoryDef``\ s are operations which may either modify memory, or which 51097a140dSpatrickintroduce some kind of ordering constraints. Examples of ``MemoryDef``\ s 52097a140dSpatrickinclude ``store``\ s, function calls, ``load``\ s with ``acquire`` (or higher) 53097a140dSpatrickordering, volatile operations, memory fences, etc. A ``MemoryDef`` 54097a140dSpatrickalways introduces a new version of the entire memory and is linked with a single 55097a140dSpatrick``MemoryDef/MemoryPhi`` which is the version of memory that the new 56097a140dSpatrickversion is based on. This implies that there is a *single* 57097a140dSpatrick``Def`` chain that connects all the ``Def``\ s, either directly 5873471bf0Spatrickor indirectly. For example in: 59097a140dSpatrick 60097a140dSpatrick.. code-block:: llvm 61097a140dSpatrick 62097a140dSpatrick b = MemoryDef(a) 63097a140dSpatrick c = MemoryDef(b) 64097a140dSpatrick d = MemoryDef(c) 65097a140dSpatrick 66097a140dSpatrick``d`` is connected directly with ``c`` and indirectly with ``b``. 67097a140dSpatrickThis means that ``d`` potentially clobbers (see below) ``c`` *or* 68097a140dSpatrick``b`` *or* both. This in turn implies that without the use of `The walker`_, 69097a140dSpatrickinitially every ``MemoryDef`` clobbers every other ``MemoryDef``. 7009467b48Spatrick 7109467b48Spatrick``MemoryPhi``\ s are ``PhiNode``\ s, but for memory operations. If at any 7209467b48Spatrickpoint we have two (or more) ``MemoryDef``\ s that could flow into a 7309467b48Spatrick``BasicBlock``, the block's top ``MemoryAccess`` will be a 7409467b48Spatrick``MemoryPhi``. As in LLVM IR, ``MemoryPhi``\ s don't correspond to any 7509467b48Spatrickconcrete operation. As such, ``BasicBlock``\ s are mapped to ``MemoryPhi``\ s 7609467b48Spatrickinside ``MemorySSA``, whereas ``Instruction``\ s are mapped to ``MemoryUse``\ s 7709467b48Spatrickand ``MemoryDef``\ s. 7809467b48Spatrick 7909467b48SpatrickNote also that in SSA, Phi nodes merge must-reach definitions (that is, 8009467b48Spatrickdefinitions that *must* be new versions of variables). In MemorySSA, PHI nodes 8109467b48Spatrickmerge may-reach definitions (that is, until disambiguated, the versions that 8209467b48Spatrickreach a phi node may or may not clobber a given variable). 8309467b48Spatrick 8409467b48Spatrick``MemoryUse``\ s are operations which use but don't modify memory. An example of 8509467b48Spatricka ``MemoryUse`` is a ``load``, or a ``readonly`` function call. 8609467b48Spatrick 8709467b48SpatrickEvery function that exists has a special ``MemoryDef`` called ``liveOnEntry``. 8809467b48SpatrickIt dominates every ``MemoryAccess`` in the function that ``MemorySSA`` is being 8909467b48Spatrickrun on, and implies that we've hit the top of the function. It's the only 9009467b48Spatrick``MemoryDef`` that maps to no ``Instruction`` in LLVM IR. Use of 9109467b48Spatrick``liveOnEntry`` implies that the memory being used is either undefined or 9209467b48Spatrickdefined before the function begins. 9309467b48Spatrick 9409467b48SpatrickAn example of all of this overlaid on LLVM IR (obtained by running ``opt 9509467b48Spatrick-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is below. When 96097a140dSpatrickviewing this example, it may be helpful to view it in terms of clobbers. 97097a140dSpatrickThe operands of a given ``MemoryAccess`` are all (potential) clobbers of said 98097a140dSpatrick``MemoryAccess``, and the value produced by a ``MemoryAccess`` can act as a clobber 99097a140dSpatrickfor other ``MemoryAccess``\ es. 100097a140dSpatrick 101097a140dSpatrickIf a ``MemoryAccess`` is a *clobber* of another, it means that these two 102097a140dSpatrick``MemoryAccess``\ es may access the same memory. For example, ``x = MemoryDef(y)`` 103097a140dSpatrickmeans that ``x`` potentially modifies memory that ``y`` modifies/constrains 104097a140dSpatrick(or has modified / constrained). 105097a140dSpatrickIn the same manner, ``a = MemoryPhi({BB1,b},{BB2,c})`` means that 106097a140dSpatrickanyone that uses ``a`` is accessing memory potentially modified / constrained 107097a140dSpatrickby either ``b`` or ``c`` (or both). And finally, ``MemoryUse(x)`` means 108097a140dSpatrickthat this use accesses memory that ``x`` has modified / constrained 109097a140dSpatrick(as an example, think that if ``x = MemoryDef(...)`` 110097a140dSpatrickand ``MemoryUse(x)`` are in the same loop, the use can't 111097a140dSpatrickbe hoisted outside alone). 112097a140dSpatrick 113097a140dSpatrickAnother useful way of looking at it is in terms of memory versions. 114097a140dSpatrickIn that view, operands of a given ``MemoryAccess`` are the version 115097a140dSpatrickof the entire memory before the operation, and if the access produces 116097a140dSpatricka value (i.e. ``MemoryDef/MemoryPhi``), 117097a140dSpatrickthe value is the new version of the memory after the operation. 11809467b48Spatrick 11909467b48Spatrick.. code-block:: llvm 12009467b48Spatrick 12109467b48Spatrick define void @foo() { 12209467b48Spatrick entry: 12309467b48Spatrick %p1 = alloca i8 12409467b48Spatrick %p2 = alloca i8 12509467b48Spatrick %p3 = alloca i8 12609467b48Spatrick ; 1 = MemoryDef(liveOnEntry) 12709467b48Spatrick store i8 0, i8* %p3 12809467b48Spatrick br label %while.cond 12909467b48Spatrick 13009467b48Spatrick while.cond: 131097a140dSpatrick ; 6 = MemoryPhi({entry,1},{if.end,4}) 13209467b48Spatrick br i1 undef, label %if.then, label %if.else 13309467b48Spatrick 13409467b48Spatrick if.then: 13509467b48Spatrick ; 2 = MemoryDef(6) 13609467b48Spatrick store i8 0, i8* %p1 13709467b48Spatrick br label %if.end 13809467b48Spatrick 13909467b48Spatrick if.else: 14009467b48Spatrick ; 3 = MemoryDef(6) 14109467b48Spatrick store i8 1, i8* %p2 14209467b48Spatrick br label %if.end 14309467b48Spatrick 14409467b48Spatrick if.end: 14509467b48Spatrick ; 5 = MemoryPhi({if.then,2},{if.else,3}) 14609467b48Spatrick ; MemoryUse(5) 14709467b48Spatrick %1 = load i8, i8* %p1 14809467b48Spatrick ; 4 = MemoryDef(5) 14909467b48Spatrick store i8 2, i8* %p2 15009467b48Spatrick ; MemoryUse(1) 15109467b48Spatrick %2 = load i8, i8* %p3 15209467b48Spatrick br label %while.cond 15309467b48Spatrick } 15409467b48Spatrick 15509467b48SpatrickThe ``MemorySSA`` IR is shown in comments that precede the instructions they map 15609467b48Spatrickto (if such an instruction exists). For example, ``1 = MemoryDef(liveOnEntry)`` 15709467b48Spatrickis a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes the LLVM 15809467b48Spatrickinstruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer to this 15909467b48Spatrickparticular ``MemoryDef`` as ``1`` (much like how one can refer to ``load i8, i8* 16009467b48Spatrick%p1`` in LLVM with ``%1``). Again, ``MemoryPhi``\ s don't correspond to any LLVM 16109467b48SpatrickInstruction, so the line directly below a ``MemoryPhi`` isn't special. 16209467b48Spatrick 16309467b48SpatrickGoing from the top down: 16409467b48Spatrick 16509467b48Spatrick- ``6 = MemoryPhi({entry,1},{if.end,4})`` notes that, when entering 16609467b48Spatrick ``while.cond``, the reaching definition for it is either ``1`` or ``4``. This 16709467b48Spatrick ``MemoryPhi`` is referred to in the textual IR by the number ``6``. 16809467b48Spatrick- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition, 16909467b48Spatrick and its reaching definition before it is ``6``, or the ``MemoryPhi`` after 170*d415bd75Srobert ``while.cond``. (See the `Use and Def optimization`_ and `Precision`_ 17109467b48Spatrick sections below for why this ``MemoryDef`` isn't linked to a separate, 17209467b48Spatrick disambiguated ``MemoryPhi``.) 17309467b48Spatrick- ``3 = MemoryDef(6)`` notes that ``store i8 0, i8* %p2`` is a definition; its 17409467b48Spatrick reaching definition is also ``6``. 17509467b48Spatrick- ``5 = MemoryPhi({if.then,2},{if.else,3})`` notes that the clobber before 17609467b48Spatrick this block could either be ``2`` or ``3``. 17709467b48Spatrick- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory, and that 17809467b48Spatrick it's clobbered by ``5``. 17973471bf0Spatrick- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; its 18009467b48Spatrick reaching definition is ``5``. 18109467b48Spatrick- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of memory, 18209467b48Spatrick and the last thing that could clobber this use is above ``while.cond`` (e.g. 183097a140dSpatrick the store to ``%p3``). In memory versioning parlance, it really only depends on 184097a140dSpatrick the memory version 1, and is unaffected by the new memory versions generated since 18509467b48Spatrick then. 18609467b48Spatrick 18709467b48SpatrickAs an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's not 18809467b48Spatrickmeant to interact with LLVM IR. 18909467b48Spatrick 19009467b48SpatrickDesign of MemorySSA 19109467b48Spatrick=================== 19209467b48Spatrick 19309467b48Spatrick``MemorySSA`` is an analysis that can be built for any arbitrary function. When 19409467b48Spatrickit's built, it does a pass over the function's IR in order to build up its 19509467b48Spatrickmapping of ``MemoryAccess``\ es. You can then query ``MemorySSA`` for things 19609467b48Spatricklike the dominance relation between ``MemoryAccess``\ es, and get the 19709467b48Spatrick``MemoryAccess`` for any given ``Instruction`` . 19809467b48Spatrick 19909467b48SpatrickWhen ``MemorySSA`` is done building, it also hands you a ``MemorySSAWalker`` 20009467b48Spatrickthat you can use (see below). 20109467b48Spatrick 20209467b48Spatrick 20309467b48SpatrickThe walker 20409467b48Spatrick---------- 20509467b48Spatrick 20609467b48SpatrickA structure that helps ``MemorySSA`` do its job is the ``MemorySSAWalker``, or 20709467b48Spatrickthe walker, for short. The goal of the walker is to provide answers to clobber 20809467b48Spatrickqueries beyond what's represented directly by ``MemoryAccess``\ es. For example, 20909467b48Spatrickgiven: 21009467b48Spatrick 21109467b48Spatrick.. code-block:: llvm 21209467b48Spatrick 21309467b48Spatrick define void @foo() { 21409467b48Spatrick %a = alloca i8 21509467b48Spatrick %b = alloca i8 21609467b48Spatrick 21709467b48Spatrick ; 1 = MemoryDef(liveOnEntry) 21809467b48Spatrick store i8 0, i8* %a 21909467b48Spatrick ; 2 = MemoryDef(1) 22009467b48Spatrick store i8 0, i8* %b 22109467b48Spatrick } 22209467b48Spatrick 22309467b48SpatrickThe store to ``%a`` is clearly not a clobber for the store to ``%b``. It would 22409467b48Spatrickbe the walker's goal to figure this out, and return ``liveOnEntry`` when queried 22509467b48Spatrickfor the clobber of ``MemoryAccess`` ``2``. 22609467b48Spatrick 22709467b48SpatrickBy default, ``MemorySSA`` provides a walker that can optimize ``MemoryDef``\ s 22809467b48Spatrickand ``MemoryUse``\ s by consulting whatever alias analysis stack you happen to 22909467b48Spatrickbe using. Walkers were built to be flexible, though, so it's entirely reasonable 23009467b48Spatrick(and expected) to create more specialized walkers (e.g. one that specifically 23109467b48Spatrickqueries ``GlobalsAA``, one that always stops at ``MemoryPhi`` nodes, etc). 23209467b48Spatrick 23373471bf0SpatrickDefault walker APIs 23473471bf0Spatrick^^^^^^^^^^^^^^^^^^^ 23573471bf0Spatrick 23673471bf0SpatrickThere are two main APIs used to retrieve the clobbering access using the walker: 23773471bf0Spatrick 23873471bf0Spatrick- ``MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA);`` return the 23973471bf0Spatrick clobbering memory access for ``MA``, caching all intermediate results 24073471bf0Spatrick computed along the way as part of each access queried. 24173471bf0Spatrick 24273471bf0Spatrick- ``MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA, const MemoryLocation &Loc);`` 24373471bf0Spatrick returns the access clobbering memory location ``Loc``, starting at ``MA``. 24473471bf0Spatrick Because this API does not request the clobbering access of a specific memory 24573471bf0Spatrick access, there are no results that can be cached. 24609467b48Spatrick 24709467b48SpatrickLocating clobbers yourself 24809467b48Spatrick^^^^^^^^^^^^^^^^^^^^^^^^^^ 24909467b48Spatrick 25009467b48SpatrickIf you choose to make your own walker, you can find the clobber for a 25109467b48Spatrick``MemoryAccess`` by walking every ``MemoryDef`` that dominates said 25209467b48Spatrick``MemoryAccess``. The structure of ``MemoryDef``\ s makes this relatively simple; 25309467b48Spatrickthey ultimately form a linked list of every clobber that dominates the 25409467b48Spatrick``MemoryAccess`` that you're trying to optimize. In other words, the 25509467b48Spatrick``definingAccess`` of a ``MemoryDef`` is always the nearest dominating 25609467b48Spatrick``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``. 25709467b48Spatrick 25809467b48Spatrick 259*d415bd75SrobertUse and Def optimization 260*d415bd75Srobert------------------------ 26109467b48Spatrick 262*d415bd75Srobert``MemoryUse``\ s keep a single operand, which is their defining or optimized 263*d415bd75Srobertaccess. 264*d415bd75SrobertTraditionally ``MemorySSA`` optimized ``MemoryUse``\ s at build-time, up to a 265*d415bd75Srobertgiven threshold. 266*d415bd75SrobertSpecifically, the operand of every ``MemoryUse`` was optimized to point to the 26709467b48Spatrickactual clobber of said ``MemoryUse``. This can be seen in the above example; the 26809467b48Spatricksecond ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a 26909467b48Spatrick``MemoryDef`` from the entry block. This is done to make walking, 27009467b48Spatrickvalue numbering, etc, faster and easier. 271*d415bd75SrobertAs of `this revision <https://reviews.llvm.org/D121381>`_, the default was 272*d415bd75Srobertchanged to not optimize uses at build time, in order to provide the option to 273*d415bd75Srobertreduce compile-time if the walking is not necessary in a pass. Most users call 274*d415bd75Srobertthe new API ``ensureOptimizedUses()`` to keep the previous behavior and do a 275*d415bd75Srobertone-time optimization of ``MemoryUse``\ s, if this was not done before. 276*d415bd75SrobertNew pass users are recommended to call ``ensureOptimizedUses()``. 27709467b48Spatrick 278*d415bd75SrobertInitially it was not possible to optimize ``MemoryDef``\ s in the same way, as we 279*d415bd75Srobertrestricted ``MemorySSA`` to one operand per access. 280*d415bd75SrobertThis was changed and ``MemoryDef``\ s now keep two operands. 281*d415bd75SrobertThe first one, the defining access, is 282*d415bd75Srobertalways the previous ``MemoryDef`` or ``MemoryPhi`` in the same basic block, or 283*d415bd75Srobertthe last one in a dominating predecessor if the current block doesn't have any 284*d415bd75Srobertother accesses writing to memory. This is needed for walking Def chains. 285*d415bd75SrobertThe second operand is the optimized access, if there was a previous call on the 286*d415bd75Srobertwalker's ``getClobberingMemoryAccess(MA)``. This API will cache information 287*d415bd75Srobertas part of ``MA``. 288*d415bd75SrobertOptimizing all ``MemoryDef``\ s has quadratic time complexity and is not done 289*d415bd75Srobertby default. 29009467b48Spatrick 291*d415bd75SrobertA walk of the uses for any MemoryDef can find the accesses that were optimized 292*d415bd75Srobertto it. 293*d415bd75SrobertA code snippet for such a walk looks like this: 294*d415bd75Srobert 295*d415bd75Srobert.. code-block:: c++ 296*d415bd75Srobert 297*d415bd75Srobert MemoryDef *Def; // find who's optimized or defining for this MemoryDef 298*d415bd75Srobert for (auto& U : Def->uses()) { 299*d415bd75Srobert MemoryAccess *MA = cast<MemoryAccess>(Use.getUser()); 300*d415bd75Srobert if (auto *DefUser = cast_of_null<MemoryDef>MA) 301*d415bd75Srobert if (DefUser->isOptimized() && DefUser->getOptimized() == Def) { 302*d415bd75Srobert // User who is optimized to Def 303*d415bd75Srobert } else { 304*d415bd75Srobert // User who's defining access is Def; optimized to something else or not optimized. 305*d415bd75Srobert } 306*d415bd75Srobert } 307*d415bd75Srobert 308*d415bd75SrobertWhen ``MemoryUse``\ s are optimized, for a given store, you can find all loads 309*d415bd75Srobertclobbered by that store by walking the immediate and transitive uses of 310*d415bd75Srobertthe store. 311*d415bd75Srobert 312*d415bd75Srobert.. code-block:: c++ 313*d415bd75Srobert 314*d415bd75Srobert checkUses(MemoryAccess *Def) { // Def can be a MemoryDef or a MemoryPhi. 315*d415bd75Srobert for (auto& U : Def->uses()) { 316*d415bd75Srobert MemoryAccess *MA = cast<MemoryAccess>(Use.getUser()); 317*d415bd75Srobert if (auto *MU = cast_of_null<MemoryUse>MA) { 318*d415bd75Srobert // Process MemoryUse as needed. 319*d415bd75Srobert } 320*d415bd75Srobert else { 321*d415bd75Srobert // Process MemoryDef or MemoryPhi as needed. 322*d415bd75Srobert 323*d415bd75Srobert // As a user can come up twice, as an optimized access and defining 324*d415bd75Srobert // access, keep a visited list. 325*d415bd75Srobert 326*d415bd75Srobert // Check transitive uses as needed 327*d415bd75Srobert checkUses (MA); // use a worklist for an iterative algorithm 328*d415bd75Srobert } 329*d415bd75Srobert } 330*d415bd75Srobert } 331*d415bd75Srobert 332*d415bd75SrobertAn example of similar traversals can be found in the DeadStoreElimination pass. 33309467b48Spatrick 33409467b48SpatrickInvalidation and updating 33509467b48Spatrick------------------------- 33609467b48Spatrick 33709467b48SpatrickBecause ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever 33809467b48Spatrickthe IR is updated. "Update", in this case, includes the addition, deletion, and 33909467b48Spatrickmotion of ``Instructions``. The update API is being made on an as-needed basis. 340*d415bd75SrobertIf you'd like examples, ``GVNHoist`` and ``LICM`` are users of ``MemorySSA``\ s 341*d415bd75Srobertupdate API. 342*d415bd75SrobertNote that adding new ``MemoryDef``\ s (by calling ``insertDef``) can be a 343*d415bd75Sroberttime-consuming update, if the new access triggers many ``MemoryPhi`` insertions and 344*d415bd75Srobertrenaming (optimization invalidation) of many ``MemoryAccesses``\ es. 34509467b48Spatrick 34609467b48Spatrick 34709467b48SpatrickPhi placement 34809467b48Spatrick^^^^^^^^^^^^^ 34909467b48Spatrick 35009467b48Spatrick``MemorySSA`` only places ``MemoryPhi``\ s where they're actually 35109467b48Spatrickneeded. That is, it is a pruned SSA form, like LLVM's SSA form. For 35209467b48Spatrickexample, consider: 35309467b48Spatrick 35409467b48Spatrick.. code-block:: llvm 35509467b48Spatrick 35609467b48Spatrick define void @foo() { 35709467b48Spatrick entry: 35809467b48Spatrick %p1 = alloca i8 35909467b48Spatrick %p2 = alloca i8 36009467b48Spatrick %p3 = alloca i8 36109467b48Spatrick ; 1 = MemoryDef(liveOnEntry) 36209467b48Spatrick store i8 0, i8* %p3 36309467b48Spatrick br label %while.cond 36409467b48Spatrick 36509467b48Spatrick while.cond: 36609467b48Spatrick ; 3 = MemoryPhi({%0,1},{if.end,2}) 36709467b48Spatrick br i1 undef, label %if.then, label %if.else 36809467b48Spatrick 36909467b48Spatrick if.then: 37009467b48Spatrick br label %if.end 37109467b48Spatrick 37209467b48Spatrick if.else: 37309467b48Spatrick br label %if.end 37409467b48Spatrick 37509467b48Spatrick if.end: 37609467b48Spatrick ; MemoryUse(1) 37709467b48Spatrick %1 = load i8, i8* %p1 37809467b48Spatrick ; 2 = MemoryDef(3) 37909467b48Spatrick store i8 2, i8* %p2 38009467b48Spatrick ; MemoryUse(1) 38109467b48Spatrick %2 = load i8, i8* %p3 38209467b48Spatrick br label %while.cond 38309467b48Spatrick } 38409467b48Spatrick 38509467b48SpatrickBecause we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi`` 38609467b48Spatrickfor ``if.end`` would be pointless, so we don't place one. So, if you need to 38709467b48Spatrickplace a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll need to also create 38809467b48Spatricka ``MemoryPhi`` for ``if.end``. 38909467b48Spatrick 39009467b48SpatrickIf it turns out that this is a large burden, we can just place ``MemoryPhi``\ s 39109467b48Spatrickeverywhere. Because we have Walkers that are capable of optimizing above said 39209467b48Spatrickphis, doing so shouldn't prohibit optimizations. 39309467b48Spatrick 39409467b48Spatrick 39509467b48SpatrickNon-Goals 39609467b48Spatrick--------- 39709467b48Spatrick 39809467b48Spatrick``MemorySSA`` is meant to reason about the relation between memory 39909467b48Spatrickoperations, and enable quicker querying. 40009467b48SpatrickIt isn't meant to be the single source of truth for all potential memory-related 40109467b48Spatrickoptimizations. Specifically, care must be taken when trying to use ``MemorySSA`` 40209467b48Spatrickto reason about atomic or volatile operations, as in: 40309467b48Spatrick 40409467b48Spatrick.. code-block:: llvm 40509467b48Spatrick 40609467b48Spatrick define i8 @foo(i8* %a) { 40709467b48Spatrick entry: 40809467b48Spatrick br i1 undef, label %if.then, label %if.end 40909467b48Spatrick 41009467b48Spatrick if.then: 41109467b48Spatrick ; 1 = MemoryDef(liveOnEntry) 41209467b48Spatrick %0 = load volatile i8, i8* %a 41309467b48Spatrick br label %if.end 41409467b48Spatrick 41509467b48Spatrick if.end: 41609467b48Spatrick %av = phi i8 [0, %entry], [%0, %if.then] 41709467b48Spatrick ret i8 %av 41809467b48Spatrick } 41909467b48Spatrick 42009467b48SpatrickGoing solely by ``MemorySSA``'s analysis, hoisting the ``load`` to ``entry`` may 42109467b48Spatrickseem legal. Because it's a volatile load, though, it's not. 42209467b48Spatrick 42309467b48Spatrick 42409467b48SpatrickDesign tradeoffs 42509467b48Spatrick---------------- 42609467b48Spatrick 42709467b48SpatrickPrecision 42809467b48Spatrick^^^^^^^^^ 42909467b48Spatrick 43009467b48Spatrick``MemorySSA`` in LLVM deliberately trades off precision for speed. 43109467b48SpatrickLet us think about memory variables as if they were disjoint partitions of the 432097a140dSpatrickmemory (that is, if you have one variable, as above, it represents the entire 433097a140dSpatrickmemory, and if you have multiple variables, each one represents some 434097a140dSpatrickdisjoint portion of the memory) 43509467b48Spatrick 43609467b48SpatrickFirst, because alias analysis results conflict with each other, and 43709467b48Spatrickeach result may be what an analysis wants (IE 43809467b48SpatrickTBAA may say no-alias, and something else may say must-alias), it is 439097a140dSpatricknot possible to partition the memory the way every optimization wants. 44009467b48SpatrickSecond, some alias analysis results are not transitive (IE A noalias B, 44109467b48Spatrickand B noalias C, does not mean A noalias C), so it is not possible to 44209467b48Spatrickcome up with a precise partitioning in all cases without variables to 44309467b48Spatrickrepresent every pair of possible aliases. Thus, partitioning 44409467b48Spatrickprecisely may require introducing at least N^2 new virtual variables, 44509467b48Spatrickphi nodes, etc. 44609467b48Spatrick 44709467b48SpatrickEach of these variables may be clobbered at multiple def sites. 44809467b48Spatrick 44909467b48SpatrickTo give an example, if you were to split up struct fields into 45009467b48Spatrickindividual variables, all aliasing operations that may-def multiple struct 45109467b48Spatrickfields, will may-def more than one of them. This is pretty common (calls, 45209467b48Spatrickcopies, field stores, etc). 45309467b48Spatrick 45409467b48SpatrickExperience with SSA forms for memory in other compilers has shown that 45509467b48Spatrickit is simply not possible to do this precisely, and in fact, doing it 45609467b48Spatrickprecisely is not worth it, because now all the optimizations have to 45709467b48Spatrickwalk tons and tons of virtual variables and phi nodes. 45809467b48Spatrick 45909467b48SpatrickSo we partition. At the point at which you partition, again, 46009467b48Spatrickexperience has shown us there is no point in partitioning to more than 46109467b48Spatrickone variable. It simply generates more IR, and optimizations still 46209467b48Spatrickhave to query something to disambiguate further anyway. 46309467b48Spatrick 46409467b48SpatrickAs a result, LLVM partitions to one variable. 46509467b48Spatrick 466*d415bd75SrobertPrecision in practice 467*d415bd75Srobert^^^^^^^^^^^^^^^^^^^^^ 46809467b48Spatrick 469*d415bd75SrobertIn practice, there are implementation details in LLVM that also affect the 470*d415bd75Srobertresults' precision provided by ``MemorySSA``. For example, AliasAnalysis has various 471*d415bd75Srobertcaps, or restrictions on looking through phis which can affect what ``MemorySSA`` 472*d415bd75Srobertcan infer. Changes made by different passes may make MemorySSA either "overly 473*d415bd75Srobertoptimized" (it can provide a more acccurate result than if it were recomputed 474*d415bd75Srobertfrom scratch), or "under optimized" (it could infer more if it were recomputed). 475*d415bd75SrobertThis can lead to challenges to reproduced results in isolation with a single pass 476*d415bd75Srobertwhen the result relies on the state aquired by ``MemorySSA`` due to being updated by 477*d415bd75Srobertmultiple subsequent passes. 478*d415bd75SrobertPasses that use and update ``MemorySSA`` should do so through the APIs provided by the 479*d415bd75Srobert``MemorySSAUpdater``, or through calls on the Walker. 480*d415bd75SrobertDirect optimizations to ``MemorySSA`` are not permitted. 481*d415bd75SrobertThere is currently a single, narrowly scoped exception where DSE (DeadStoreElimination) 482*d415bd75Srobertupdates an optimized access of a store, after a traversal that guarantees the 483*d415bd75Srobertoptimization is correct. This is solely allowed due to the traversals and inferences 484*d415bd75Srobertbeing beyond what ``MemorySSA`` does and them being "free" (i.e. DSE does them anyway). 485*d415bd75SrobertThis exception is set under a flag ("-dse-optimize-memoryssa") and can be disabled to 486*d415bd75Sroberthelp reproduce optimizations in isolation. 487*d415bd75Srobert 48873471bf0Spatrick 48973471bf0SpatrickLLVM Developers Meeting presentations 49073471bf0Spatrick------------------------------------- 49173471bf0Spatrick 49273471bf0Spatrick- `2016 LLVM Developers' Meeting: G. Burgess - MemorySSA in Five Minutes <https://www.youtube.com/watch?v=bdxWmryoHak>`_. 49373471bf0Spatrick- `2020 LLVM Developers' Meeting: S. Baziotis & S. Moll - Finding Your Way Around the LLVM Dependence Analysis Zoo <https://www.youtube.com/watch?v=1e5y6WDbXCQ>`_ 494