xref: /openbsd/gnu/llvm/llvm/docs/MemorySSA.rst (revision d415bd75)
109467b48Spatrick=========
209467b48SpatrickMemorySSA
309467b48Spatrick=========
409467b48Spatrick
509467b48Spatrick.. contents::
609467b48Spatrick   :local:
709467b48Spatrick
809467b48SpatrickIntroduction
909467b48Spatrick============
1009467b48Spatrick
1109467b48Spatrick``MemorySSA`` is an analysis that allows us to cheaply reason about the
1209467b48Spatrickinteractions between various memory operations. Its goal is to replace
1309467b48Spatrick``MemoryDependenceAnalysis`` for most (if not all) use-cases. This is because,
1409467b48Spatrickunless you're very careful, use of ``MemoryDependenceAnalysis`` can easily
1509467b48Spatrickresult in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA`` doesn't
1609467b48Spatrickhave as many arbitrary limits as ``MemoryDependenceAnalysis``, so you should get
17097a140dSpatrickbetter results, too. One common use of ``MemorySSA`` is to quickly find out
18097a140dSpatrickthat something definitely cannot happen (for example, reason that a hoist
19097a140dSpatrickout of a loop can't happen).
2009467b48Spatrick
2109467b48SpatrickAt a high level, one of the goals of ``MemorySSA`` is to provide an SSA based
2209467b48Spatrickform for memory, complete with def-use and use-def chains, which
2309467b48Spatrickenables users to quickly find may-def and may-uses of memory operations.
2409467b48SpatrickIt can also be thought of as a way to cheaply give versions to the complete
25097a140dSpatrickstate of memory, and associate memory operations with those versions.
2609467b48Spatrick
2709467b48SpatrickThis document goes over how ``MemorySSA`` is structured, and some basic
2809467b48Spatrickintuition on how ``MemorySSA`` works.
2909467b48Spatrick
3009467b48SpatrickA paper on MemorySSA (with notes about how it's implemented in GCC) `can be
3109467b48Spatrickfound here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. Though, it's
32097a140dSpatrickrelatively out-of-date; the paper references multiple memory partitions, but GCC
3309467b48Spatrickeventually swapped to just using one, like we now have in LLVM.  Like
3409467b48SpatrickGCC's, LLVM's MemorySSA is intraprocedural.
3509467b48Spatrick
3609467b48Spatrick
3709467b48SpatrickMemorySSA Structure
3809467b48Spatrick===================
3909467b48Spatrick
4009467b48SpatrickMemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a
4109467b48Spatrickstructure that maps ``Instruction``\ s to ``MemoryAccess``\ es, which are
4209467b48Spatrick``MemorySSA``'s parallel to LLVM ``Instruction``\ s.
4309467b48Spatrick
4409467b48SpatrickEach ``MemoryAccess`` can be one of three types:
4509467b48Spatrick
46097a140dSpatrick- ``MemoryDef``
4709467b48Spatrick- ``MemoryPhi``
4809467b48Spatrick- ``MemoryUse``
49097a140dSpatrick
50097a140dSpatrick``MemoryDef``\ s are operations which may either modify memory, or which
51097a140dSpatrickintroduce some kind of ordering constraints. Examples of ``MemoryDef``\ s
52097a140dSpatrickinclude ``store``\ s, function calls, ``load``\ s with ``acquire`` (or higher)
53097a140dSpatrickordering, volatile operations, memory fences, etc. A ``MemoryDef``
54097a140dSpatrickalways introduces a new version of the entire memory and is linked with a single
55097a140dSpatrick``MemoryDef/MemoryPhi`` which is the version of memory that the new
56097a140dSpatrickversion is based on. This implies that there is a *single*
57097a140dSpatrick``Def`` chain that connects all the ``Def``\ s, either directly
5873471bf0Spatrickor indirectly. For example in:
59097a140dSpatrick
60097a140dSpatrick.. code-block:: llvm
61097a140dSpatrick
62097a140dSpatrick  b = MemoryDef(a)
63097a140dSpatrick  c = MemoryDef(b)
64097a140dSpatrick  d = MemoryDef(c)
65097a140dSpatrick
66097a140dSpatrick``d`` is connected directly with ``c`` and indirectly with ``b``.
67097a140dSpatrickThis means that ``d`` potentially clobbers (see below) ``c`` *or*
68097a140dSpatrick``b`` *or* both. This in turn implies that without the use of `The walker`_,
69097a140dSpatrickinitially every ``MemoryDef`` clobbers every other ``MemoryDef``.
7009467b48Spatrick
7109467b48Spatrick``MemoryPhi``\ s are ``PhiNode``\ s, but for memory operations. If at any
7209467b48Spatrickpoint we have two (or more) ``MemoryDef``\ s that could flow into a
7309467b48Spatrick``BasicBlock``, the block's top ``MemoryAccess`` will be a
7409467b48Spatrick``MemoryPhi``. As in LLVM IR, ``MemoryPhi``\ s don't correspond to any
7509467b48Spatrickconcrete operation. As such, ``BasicBlock``\ s are mapped to ``MemoryPhi``\ s
7609467b48Spatrickinside ``MemorySSA``, whereas ``Instruction``\ s are mapped to ``MemoryUse``\ s
7709467b48Spatrickand ``MemoryDef``\ s.
7809467b48Spatrick
7909467b48SpatrickNote also that in SSA, Phi nodes merge must-reach definitions (that is,
8009467b48Spatrickdefinitions that *must* be new versions of variables). In MemorySSA, PHI nodes
8109467b48Spatrickmerge may-reach definitions (that is, until disambiguated, the versions that
8209467b48Spatrickreach a phi node may or may not clobber a given variable).
8309467b48Spatrick
8409467b48Spatrick``MemoryUse``\ s are operations which use but don't modify memory. An example of
8509467b48Spatricka ``MemoryUse`` is a ``load``, or a ``readonly`` function call.
8609467b48Spatrick
8709467b48SpatrickEvery function that exists has a special ``MemoryDef`` called ``liveOnEntry``.
8809467b48SpatrickIt dominates every ``MemoryAccess`` in the function that ``MemorySSA`` is being
8909467b48Spatrickrun on, and implies that we've hit the top of the function. It's the only
9009467b48Spatrick``MemoryDef`` that maps to no ``Instruction`` in LLVM IR. Use of
9109467b48Spatrick``liveOnEntry`` implies that the memory being used is either undefined or
9209467b48Spatrickdefined before the function begins.
9309467b48Spatrick
9409467b48SpatrickAn example of all of this overlaid on LLVM IR (obtained by running ``opt
9509467b48Spatrick-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is below. When
96097a140dSpatrickviewing this example, it may be helpful to view it in terms of clobbers.
97097a140dSpatrickThe operands of a given ``MemoryAccess`` are all (potential) clobbers of said
98097a140dSpatrick``MemoryAccess``, and the value produced by a ``MemoryAccess`` can act as a clobber
99097a140dSpatrickfor other ``MemoryAccess``\ es.
100097a140dSpatrick
101097a140dSpatrickIf a ``MemoryAccess`` is a *clobber* of another, it means that these two
102097a140dSpatrick``MemoryAccess``\ es may access the same memory. For example, ``x = MemoryDef(y)``
103097a140dSpatrickmeans that ``x`` potentially modifies memory that ``y`` modifies/constrains
104097a140dSpatrick(or has modified / constrained).
105097a140dSpatrickIn the same manner, ``a = MemoryPhi({BB1,b},{BB2,c})`` means that
106097a140dSpatrickanyone that uses ``a`` is accessing memory potentially modified / constrained
107097a140dSpatrickby either ``b`` or ``c`` (or both).  And finally, ``MemoryUse(x)`` means
108097a140dSpatrickthat this use accesses memory that ``x`` has modified / constrained
109097a140dSpatrick(as an example, think that if ``x = MemoryDef(...)``
110097a140dSpatrickand ``MemoryUse(x)`` are in the same loop, the use can't
111097a140dSpatrickbe hoisted outside alone).
112097a140dSpatrick
113097a140dSpatrickAnother useful way of looking at it is in terms of memory versions.
114097a140dSpatrickIn that view, operands of a given ``MemoryAccess`` are the version
115097a140dSpatrickof the entire memory before the operation, and if the access produces
116097a140dSpatricka value (i.e. ``MemoryDef/MemoryPhi``),
117097a140dSpatrickthe value is the new version of the memory after the operation.
11809467b48Spatrick
11909467b48Spatrick.. code-block:: llvm
12009467b48Spatrick
12109467b48Spatrick  define void @foo() {
12209467b48Spatrick  entry:
12309467b48Spatrick    %p1 = alloca i8
12409467b48Spatrick    %p2 = alloca i8
12509467b48Spatrick    %p3 = alloca i8
12609467b48Spatrick    ; 1 = MemoryDef(liveOnEntry)
12709467b48Spatrick    store i8 0, i8* %p3
12809467b48Spatrick    br label %while.cond
12909467b48Spatrick
13009467b48Spatrick  while.cond:
131097a140dSpatrick    ; 6 = MemoryPhi({entry,1},{if.end,4})
13209467b48Spatrick    br i1 undef, label %if.then, label %if.else
13309467b48Spatrick
13409467b48Spatrick  if.then:
13509467b48Spatrick    ; 2 = MemoryDef(6)
13609467b48Spatrick    store i8 0, i8* %p1
13709467b48Spatrick    br label %if.end
13809467b48Spatrick
13909467b48Spatrick  if.else:
14009467b48Spatrick    ; 3 = MemoryDef(6)
14109467b48Spatrick    store i8 1, i8* %p2
14209467b48Spatrick    br label %if.end
14309467b48Spatrick
14409467b48Spatrick  if.end:
14509467b48Spatrick    ; 5 = MemoryPhi({if.then,2},{if.else,3})
14609467b48Spatrick    ; MemoryUse(5)
14709467b48Spatrick    %1 = load i8, i8* %p1
14809467b48Spatrick    ; 4 = MemoryDef(5)
14909467b48Spatrick    store i8 2, i8* %p2
15009467b48Spatrick    ; MemoryUse(1)
15109467b48Spatrick    %2 = load i8, i8* %p3
15209467b48Spatrick    br label %while.cond
15309467b48Spatrick  }
15409467b48Spatrick
15509467b48SpatrickThe ``MemorySSA`` IR is shown in comments that precede the instructions they map
15609467b48Spatrickto (if such an instruction exists). For example, ``1 = MemoryDef(liveOnEntry)``
15709467b48Spatrickis a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes the LLVM
15809467b48Spatrickinstruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer to this
15909467b48Spatrickparticular ``MemoryDef`` as ``1`` (much like how one can refer to ``load i8, i8*
16009467b48Spatrick%p1`` in LLVM with ``%1``). Again, ``MemoryPhi``\ s don't correspond to any LLVM
16109467b48SpatrickInstruction, so the line directly below a ``MemoryPhi`` isn't special.
16209467b48Spatrick
16309467b48SpatrickGoing from the top down:
16409467b48Spatrick
16509467b48Spatrick- ``6 = MemoryPhi({entry,1},{if.end,4})`` notes that, when entering
16609467b48Spatrick  ``while.cond``, the reaching definition for it is either ``1`` or ``4``. This
16709467b48Spatrick  ``MemoryPhi`` is referred to in the textual IR by the number ``6``.
16809467b48Spatrick- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition,
16909467b48Spatrick  and its reaching definition before it is ``6``, or the ``MemoryPhi`` after
170*d415bd75Srobert  ``while.cond``. (See the `Use and Def optimization`_ and `Precision`_
17109467b48Spatrick  sections below for why this ``MemoryDef`` isn't linked to a separate,
17209467b48Spatrick  disambiguated ``MemoryPhi``.)
17309467b48Spatrick- ``3 = MemoryDef(6)`` notes that ``store i8 0, i8* %p2`` is a definition; its
17409467b48Spatrick  reaching definition is also ``6``.
17509467b48Spatrick- ``5 = MemoryPhi({if.then,2},{if.else,3})`` notes that the clobber before
17609467b48Spatrick  this block could either be ``2`` or ``3``.
17709467b48Spatrick- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory, and that
17809467b48Spatrick  it's clobbered by ``5``.
17973471bf0Spatrick- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; its
18009467b48Spatrick  reaching definition is ``5``.
18109467b48Spatrick- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of memory,
18209467b48Spatrick  and the last thing that could clobber this use is above ``while.cond`` (e.g.
183097a140dSpatrick  the store to ``%p3``). In memory versioning parlance, it really only depends on
184097a140dSpatrick  the memory version 1, and is unaffected by the new memory versions generated since
18509467b48Spatrick  then.
18609467b48Spatrick
18709467b48SpatrickAs an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's not
18809467b48Spatrickmeant to interact with LLVM IR.
18909467b48Spatrick
19009467b48SpatrickDesign of MemorySSA
19109467b48Spatrick===================
19209467b48Spatrick
19309467b48Spatrick``MemorySSA`` is an analysis that can be built for any arbitrary function. When
19409467b48Spatrickit's built, it does a pass over the function's IR in order to build up its
19509467b48Spatrickmapping of ``MemoryAccess``\ es. You can then query ``MemorySSA`` for things
19609467b48Spatricklike the dominance relation between ``MemoryAccess``\ es, and get the
19709467b48Spatrick``MemoryAccess`` for any given ``Instruction`` .
19809467b48Spatrick
19909467b48SpatrickWhen ``MemorySSA`` is done building, it also hands you a ``MemorySSAWalker``
20009467b48Spatrickthat you can use (see below).
20109467b48Spatrick
20209467b48Spatrick
20309467b48SpatrickThe walker
20409467b48Spatrick----------
20509467b48Spatrick
20609467b48SpatrickA structure that helps ``MemorySSA`` do its job is the ``MemorySSAWalker``, or
20709467b48Spatrickthe walker, for short. The goal of the walker is to provide answers to clobber
20809467b48Spatrickqueries beyond what's represented directly by ``MemoryAccess``\ es. For example,
20909467b48Spatrickgiven:
21009467b48Spatrick
21109467b48Spatrick.. code-block:: llvm
21209467b48Spatrick
21309467b48Spatrick  define void @foo() {
21409467b48Spatrick    %a = alloca i8
21509467b48Spatrick    %b = alloca i8
21609467b48Spatrick
21709467b48Spatrick    ; 1 = MemoryDef(liveOnEntry)
21809467b48Spatrick    store i8 0, i8* %a
21909467b48Spatrick    ; 2 = MemoryDef(1)
22009467b48Spatrick    store i8 0, i8* %b
22109467b48Spatrick  }
22209467b48Spatrick
22309467b48SpatrickThe store to ``%a`` is clearly not a clobber for the store to ``%b``. It would
22409467b48Spatrickbe the walker's goal to figure this out, and return ``liveOnEntry`` when queried
22509467b48Spatrickfor the clobber of ``MemoryAccess`` ``2``.
22609467b48Spatrick
22709467b48SpatrickBy default, ``MemorySSA`` provides a walker that can optimize ``MemoryDef``\ s
22809467b48Spatrickand ``MemoryUse``\ s by consulting whatever alias analysis stack you happen to
22909467b48Spatrickbe using. Walkers were built to be flexible, though, so it's entirely reasonable
23009467b48Spatrick(and expected) to create more specialized walkers (e.g. one that specifically
23109467b48Spatrickqueries ``GlobalsAA``, one that always stops at ``MemoryPhi`` nodes, etc).
23209467b48Spatrick
23373471bf0SpatrickDefault walker APIs
23473471bf0Spatrick^^^^^^^^^^^^^^^^^^^
23573471bf0Spatrick
23673471bf0SpatrickThere are two main APIs used to retrieve the clobbering access using the walker:
23773471bf0Spatrick
23873471bf0Spatrick-  ``MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA);`` return the
23973471bf0Spatrick   clobbering memory access for ``MA``, caching all intermediate results
24073471bf0Spatrick   computed along the way as part of each access queried.
24173471bf0Spatrick
24273471bf0Spatrick-  ``MemoryAccess *getClobberingMemoryAccess(MemoryAccess *MA, const MemoryLocation &Loc);``
24373471bf0Spatrick   returns the access clobbering memory location ``Loc``, starting at ``MA``.
24473471bf0Spatrick   Because this API does not request the clobbering access of a specific memory
24573471bf0Spatrick   access, there are no results that can be cached.
24609467b48Spatrick
24709467b48SpatrickLocating clobbers yourself
24809467b48Spatrick^^^^^^^^^^^^^^^^^^^^^^^^^^
24909467b48Spatrick
25009467b48SpatrickIf you choose to make your own walker, you can find the clobber for a
25109467b48Spatrick``MemoryAccess`` by walking every ``MemoryDef`` that dominates said
25209467b48Spatrick``MemoryAccess``. The structure of ``MemoryDef``\ s makes this relatively simple;
25309467b48Spatrickthey ultimately form a linked list of every clobber that dominates the
25409467b48Spatrick``MemoryAccess`` that you're trying to optimize. In other words, the
25509467b48Spatrick``definingAccess`` of a ``MemoryDef`` is always the nearest dominating
25609467b48Spatrick``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``.
25709467b48Spatrick
25809467b48Spatrick
259*d415bd75SrobertUse and Def optimization
260*d415bd75Srobert------------------------
26109467b48Spatrick
262*d415bd75Srobert``MemoryUse``\ s keep a single operand, which is their defining or optimized
263*d415bd75Srobertaccess.
264*d415bd75SrobertTraditionally ``MemorySSA`` optimized ``MemoryUse``\ s at build-time, up to a
265*d415bd75Srobertgiven threshold.
266*d415bd75SrobertSpecifically, the operand of every ``MemoryUse`` was optimized to point to the
26709467b48Spatrickactual clobber of said ``MemoryUse``. This can be seen in the above example; the
26809467b48Spatricksecond ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a
26909467b48Spatrick``MemoryDef`` from the entry block.  This is done to make walking,
27009467b48Spatrickvalue numbering, etc, faster and easier.
271*d415bd75SrobertAs of `this revision <https://reviews.llvm.org/D121381>`_, the default was
272*d415bd75Srobertchanged to not optimize uses at build time, in order to provide the option to
273*d415bd75Srobertreduce compile-time if the walking is not necessary in a pass. Most users call
274*d415bd75Srobertthe new API ``ensureOptimizedUses()`` to keep the previous behavior and do a
275*d415bd75Srobertone-time optimization of ``MemoryUse``\ s, if this was not done before.
276*d415bd75SrobertNew pass users are recommended to call ``ensureOptimizedUses()``.
27709467b48Spatrick
278*d415bd75SrobertInitially it was not possible to optimize ``MemoryDef``\ s in the same way, as we
279*d415bd75Srobertrestricted ``MemorySSA`` to one operand per access.
280*d415bd75SrobertThis was changed and ``MemoryDef``\ s now keep two operands.
281*d415bd75SrobertThe first one, the defining access, is
282*d415bd75Srobertalways the previous ``MemoryDef`` or ``MemoryPhi`` in the same basic block, or
283*d415bd75Srobertthe last one in a dominating predecessor if the current block doesn't have any
284*d415bd75Srobertother accesses writing to memory. This is needed for walking Def chains.
285*d415bd75SrobertThe second operand is the optimized access, if there was a previous call on the
286*d415bd75Srobertwalker's ``getClobberingMemoryAccess(MA)``. This API will cache information
287*d415bd75Srobertas part of ``MA``.
288*d415bd75SrobertOptimizing all ``MemoryDef``\ s has quadratic time complexity and is not done
289*d415bd75Srobertby default.
29009467b48Spatrick
291*d415bd75SrobertA walk of the uses for any MemoryDef can find the accesses that were optimized
292*d415bd75Srobertto it.
293*d415bd75SrobertA code snippet for such a walk looks like this:
294*d415bd75Srobert
295*d415bd75Srobert.. code-block:: c++
296*d415bd75Srobert
297*d415bd75Srobert  MemoryDef *Def;  // find who's optimized or defining for this MemoryDef
298*d415bd75Srobert  for (auto& U : Def->uses()) {
299*d415bd75Srobert    MemoryAccess *MA = cast<MemoryAccess>(Use.getUser());
300*d415bd75Srobert    if (auto *DefUser = cast_of_null<MemoryDef>MA)
301*d415bd75Srobert      if (DefUser->isOptimized() && DefUser->getOptimized() == Def) {
302*d415bd75Srobert        // User who is optimized to Def
303*d415bd75Srobert      } else {
304*d415bd75Srobert        // User who's defining access is Def; optimized to something else or not optimized.
305*d415bd75Srobert      }
306*d415bd75Srobert  }
307*d415bd75Srobert
308*d415bd75SrobertWhen ``MemoryUse``\ s are optimized, for a given store,  you can find all loads
309*d415bd75Srobertclobbered by that store by walking the immediate and transitive uses of
310*d415bd75Srobertthe store.
311*d415bd75Srobert
312*d415bd75Srobert.. code-block:: c++
313*d415bd75Srobert
314*d415bd75Srobert  checkUses(MemoryAccess *Def) { // Def can be a MemoryDef or a MemoryPhi.
315*d415bd75Srobert    for (auto& U : Def->uses()) {
316*d415bd75Srobert      MemoryAccess *MA = cast<MemoryAccess>(Use.getUser());
317*d415bd75Srobert      if (auto *MU = cast_of_null<MemoryUse>MA) {
318*d415bd75Srobert        // Process MemoryUse as needed.
319*d415bd75Srobert      }
320*d415bd75Srobert      else {
321*d415bd75Srobert        // Process MemoryDef or MemoryPhi as needed.
322*d415bd75Srobert
323*d415bd75Srobert        // As a user can come up twice, as an optimized access and defining
324*d415bd75Srobert        // access, keep a visited list.
325*d415bd75Srobert
326*d415bd75Srobert        // Check transitive uses as needed
327*d415bd75Srobert        checkUses (MA); // use a worklist for an iterative algorithm
328*d415bd75Srobert      }
329*d415bd75Srobert    }
330*d415bd75Srobert  }
331*d415bd75Srobert
332*d415bd75SrobertAn example of similar traversals can be found in the DeadStoreElimination pass.
33309467b48Spatrick
33409467b48SpatrickInvalidation and updating
33509467b48Spatrick-------------------------
33609467b48Spatrick
33709467b48SpatrickBecause ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever
33809467b48Spatrickthe IR is updated. "Update", in this case, includes the addition, deletion, and
33909467b48Spatrickmotion of ``Instructions``. The update API is being made on an as-needed basis.
340*d415bd75SrobertIf you'd like examples, ``GVNHoist`` and ``LICM`` are users of ``MemorySSA``\ s
341*d415bd75Srobertupdate API.
342*d415bd75SrobertNote that adding new ``MemoryDef``\ s (by calling ``insertDef``) can be a
343*d415bd75Sroberttime-consuming update, if the new access triggers many ``MemoryPhi`` insertions and
344*d415bd75Srobertrenaming (optimization invalidation) of many ``MemoryAccesses``\ es.
34509467b48Spatrick
34609467b48Spatrick
34709467b48SpatrickPhi placement
34809467b48Spatrick^^^^^^^^^^^^^
34909467b48Spatrick
35009467b48Spatrick``MemorySSA`` only places ``MemoryPhi``\ s where they're actually
35109467b48Spatrickneeded. That is, it is a pruned SSA form, like LLVM's SSA form.  For
35209467b48Spatrickexample, consider:
35309467b48Spatrick
35409467b48Spatrick.. code-block:: llvm
35509467b48Spatrick
35609467b48Spatrick  define void @foo() {
35709467b48Spatrick  entry:
35809467b48Spatrick    %p1 = alloca i8
35909467b48Spatrick    %p2 = alloca i8
36009467b48Spatrick    %p3 = alloca i8
36109467b48Spatrick    ; 1 = MemoryDef(liveOnEntry)
36209467b48Spatrick    store i8 0, i8* %p3
36309467b48Spatrick    br label %while.cond
36409467b48Spatrick
36509467b48Spatrick  while.cond:
36609467b48Spatrick    ; 3 = MemoryPhi({%0,1},{if.end,2})
36709467b48Spatrick    br i1 undef, label %if.then, label %if.else
36809467b48Spatrick
36909467b48Spatrick  if.then:
37009467b48Spatrick    br label %if.end
37109467b48Spatrick
37209467b48Spatrick  if.else:
37309467b48Spatrick    br label %if.end
37409467b48Spatrick
37509467b48Spatrick  if.end:
37609467b48Spatrick    ; MemoryUse(1)
37709467b48Spatrick    %1 = load i8, i8* %p1
37809467b48Spatrick    ; 2 = MemoryDef(3)
37909467b48Spatrick    store i8 2, i8* %p2
38009467b48Spatrick    ; MemoryUse(1)
38109467b48Spatrick    %2 = load i8, i8* %p3
38209467b48Spatrick    br label %while.cond
38309467b48Spatrick  }
38409467b48Spatrick
38509467b48SpatrickBecause we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi``
38609467b48Spatrickfor ``if.end`` would be pointless, so we don't place one. So, if you need to
38709467b48Spatrickplace a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll need to also create
38809467b48Spatricka ``MemoryPhi`` for ``if.end``.
38909467b48Spatrick
39009467b48SpatrickIf it turns out that this is a large burden, we can just place ``MemoryPhi``\ s
39109467b48Spatrickeverywhere. Because we have Walkers that are capable of optimizing above said
39209467b48Spatrickphis, doing so shouldn't prohibit optimizations.
39309467b48Spatrick
39409467b48Spatrick
39509467b48SpatrickNon-Goals
39609467b48Spatrick---------
39709467b48Spatrick
39809467b48Spatrick``MemorySSA`` is meant to reason about the relation between memory
39909467b48Spatrickoperations, and enable quicker querying.
40009467b48SpatrickIt isn't meant to be the single source of truth for all potential memory-related
40109467b48Spatrickoptimizations. Specifically, care must be taken when trying to use ``MemorySSA``
40209467b48Spatrickto reason about atomic or volatile operations, as in:
40309467b48Spatrick
40409467b48Spatrick.. code-block:: llvm
40509467b48Spatrick
40609467b48Spatrick  define i8 @foo(i8* %a) {
40709467b48Spatrick  entry:
40809467b48Spatrick    br i1 undef, label %if.then, label %if.end
40909467b48Spatrick
41009467b48Spatrick  if.then:
41109467b48Spatrick    ; 1 = MemoryDef(liveOnEntry)
41209467b48Spatrick    %0 = load volatile i8, i8* %a
41309467b48Spatrick    br label %if.end
41409467b48Spatrick
41509467b48Spatrick  if.end:
41609467b48Spatrick    %av = phi i8 [0, %entry], [%0, %if.then]
41709467b48Spatrick    ret i8 %av
41809467b48Spatrick  }
41909467b48Spatrick
42009467b48SpatrickGoing solely by ``MemorySSA``'s analysis, hoisting the ``load`` to ``entry`` may
42109467b48Spatrickseem legal. Because it's a volatile load, though, it's not.
42209467b48Spatrick
42309467b48Spatrick
42409467b48SpatrickDesign tradeoffs
42509467b48Spatrick----------------
42609467b48Spatrick
42709467b48SpatrickPrecision
42809467b48Spatrick^^^^^^^^^
42909467b48Spatrick
43009467b48Spatrick``MemorySSA`` in LLVM deliberately trades off precision for speed.
43109467b48SpatrickLet us think about memory variables as if they were disjoint partitions of the
432097a140dSpatrickmemory (that is, if you have one variable, as above, it represents the entire
433097a140dSpatrickmemory, and if you have multiple variables, each one represents some
434097a140dSpatrickdisjoint portion of the memory)
43509467b48Spatrick
43609467b48SpatrickFirst, because alias analysis results conflict with each other, and
43709467b48Spatrickeach result may be what an analysis wants (IE
43809467b48SpatrickTBAA may say no-alias, and something else may say must-alias), it is
439097a140dSpatricknot possible to partition the memory the way every optimization wants.
44009467b48SpatrickSecond, some alias analysis results are not transitive (IE A noalias B,
44109467b48Spatrickand B noalias C, does not mean A noalias C), so it is not possible to
44209467b48Spatrickcome up with a precise partitioning in all cases without variables to
44309467b48Spatrickrepresent every pair of possible aliases.  Thus, partitioning
44409467b48Spatrickprecisely may require introducing at least N^2 new virtual variables,
44509467b48Spatrickphi nodes, etc.
44609467b48Spatrick
44709467b48SpatrickEach of these variables may be clobbered at multiple def sites.
44809467b48Spatrick
44909467b48SpatrickTo give an example, if you were to split up struct fields into
45009467b48Spatrickindividual variables, all aliasing operations that may-def multiple struct
45109467b48Spatrickfields, will may-def more than one of them.  This is pretty common (calls,
45209467b48Spatrickcopies, field stores, etc).
45309467b48Spatrick
45409467b48SpatrickExperience with SSA forms for memory in other compilers has shown that
45509467b48Spatrickit is simply not possible to do this precisely, and in fact, doing it
45609467b48Spatrickprecisely is not worth it, because now all the optimizations have to
45709467b48Spatrickwalk tons and tons of virtual variables and phi nodes.
45809467b48Spatrick
45909467b48SpatrickSo we partition.  At the point at which you partition, again,
46009467b48Spatrickexperience has shown us there is no point in partitioning to more than
46109467b48Spatrickone variable.  It simply generates more IR, and optimizations still
46209467b48Spatrickhave to query something to disambiguate further anyway.
46309467b48Spatrick
46409467b48SpatrickAs a result, LLVM partitions to one variable.
46509467b48Spatrick
466*d415bd75SrobertPrecision in practice
467*d415bd75Srobert^^^^^^^^^^^^^^^^^^^^^
46809467b48Spatrick
469*d415bd75SrobertIn practice, there are implementation details in LLVM that also affect the
470*d415bd75Srobertresults' precision provided by ``MemorySSA``. For example, AliasAnalysis has various
471*d415bd75Srobertcaps, or restrictions on looking through phis which can affect what ``MemorySSA``
472*d415bd75Srobertcan infer. Changes made by different passes may make MemorySSA either "overly
473*d415bd75Srobertoptimized" (it can provide a more acccurate result than if it were recomputed
474*d415bd75Srobertfrom scratch), or "under optimized" (it could infer more if it were recomputed).
475*d415bd75SrobertThis can lead to challenges to reproduced results in isolation with a single pass
476*d415bd75Srobertwhen the result relies on the state aquired by ``MemorySSA`` due to being updated by
477*d415bd75Srobertmultiple subsequent passes.
478*d415bd75SrobertPasses that use and update ``MemorySSA`` should do so through the APIs provided by the
479*d415bd75Srobert``MemorySSAUpdater``, or through calls on the Walker.
480*d415bd75SrobertDirect optimizations to ``MemorySSA`` are not permitted.
481*d415bd75SrobertThere is currently a single, narrowly scoped exception where DSE (DeadStoreElimination)
482*d415bd75Srobertupdates an optimized access of a store, after a traversal that guarantees the
483*d415bd75Srobertoptimization is correct. This is solely allowed due to the traversals and inferences
484*d415bd75Srobertbeing beyond what ``MemorySSA`` does and them being "free" (i.e. DSE does them anyway).
485*d415bd75SrobertThis exception is set under a flag ("-dse-optimize-memoryssa") and can be disabled to
486*d415bd75Sroberthelp reproduce optimizations in isolation.
487*d415bd75Srobert
48873471bf0Spatrick
48973471bf0SpatrickLLVM Developers Meeting presentations
49073471bf0Spatrick-------------------------------------
49173471bf0Spatrick
49273471bf0Spatrick- `2016 LLVM Developers' Meeting: G. Burgess - MemorySSA in Five Minutes <https://www.youtube.com/watch?v=bdxWmryoHak>`_.
49373471bf0Spatrick- `2020 LLVM Developers' Meeting: S. Baziotis & S. Moll - Finding Your Way Around the LLVM Dependence Analysis Zoo <https://www.youtube.com/watch?v=1e5y6WDbXCQ>`_
494