1*d415bd75Srobert.. _cycle-terminology:
2*d415bd75Srobert
3*d415bd75Srobert======================
4*d415bd75SrobertLLVM Cycle Terminology
5*d415bd75Srobert======================
6*d415bd75Srobert
7*d415bd75Srobert.. contents::
8*d415bd75Srobert   :local:
9*d415bd75Srobert
10*d415bd75Srobert.. _cycle-definition:
11*d415bd75Srobert
12*d415bd75SrobertCycles
13*d415bd75Srobert======
14*d415bd75Srobert
15*d415bd75SrobertCycles are a generalization of LLVM :ref:`loops <loop-terminology>`,
16*d415bd75Srobertdefined recursively as follows [HavlakCycles]_:
17*d415bd75Srobert
18*d415bd75Srobert1. In a directed graph G that is a function CFG or a subgraph of it, a *cycle*
19*d415bd75Srobert   is a maximal strongly connected region with at least one internal edge.
20*d415bd75Srobert   (Informational note --- The requirement for at least one internal edge
21*d415bd75Srobert   ensures that a single basic block is a cycle only if there is an edge
22*d415bd75Srobert   that goes back to the same basic block.)
23*d415bd75Srobert2. A basic block in a cycle that can be reached from the entry of
24*d415bd75Srobert   the function along a path that does not visit any other basic block
25*d415bd75Srobert   in the cycle is called an *entry* of the cycle.
26*d415bd75Srobert   A cycle can have multiple entries.
27*d415bd75Srobert3. For a given depth-first search starting from the entry of the function, the
28*d415bd75Srobert   first node of a cycle to be visited is called the *header* of this cycle
29*d415bd75Srobert   with respect to this particular DFS. The header is always an entry node.
30*d415bd75Srobert4. In any depth-first search starting from the entry, the set of cycles
31*d415bd75Srobert   found in the CFG is the same. These are the *top-level cycles*
32*d415bd75Srobert   that do not themselves have a parent.
33*d415bd75Srobert5. The *child cycles* (or simply cycles) nested inside a cycle C with
34*d415bd75Srobert   header H are the cycles in the subgraph induced on the set of nodes (C - H).
35*d415bd75Srobert   C is said to be the *parent* of these cycles.
36*d415bd75Srobert
37*d415bd75SrobertThus, cycles form an implementation-defined forest where each cycle C is
38*d415bd75Srobertthe parent of any child cycles nested inside C. The tree closely
39*d415bd75Srobertfollows the nesting of loops in the same function. The unique entry of
40*d415bd75Sroberta reducible cycle (an LLVM loop) L dominates all its other nodes, and
41*d415bd75Srobertis always chosen as the header of some cycle C regardless of the DFS
42*d415bd75Sroberttree used. This cycle C is a superset of the loop L. For an
43*d415bd75Srobertirreducible cycle, no one entry dominates the nodes of the cycle. One
44*d415bd75Srobertof the entries is chosen as header of the cycle, in an
45*d415bd75Srobertimplementation-defined way.
46*d415bd75Srobert
47*d415bd75Srobert.. _cycle-irreducible:
48*d415bd75Srobert
49*d415bd75SrobertA cycle is *irreducible* if it has multiple entries and it is
50*d415bd75Srobert*reducible* otherwise.
51*d415bd75Srobert
52*d415bd75Srobert.. _cycle-parent-block:
53*d415bd75Srobert
54*d415bd75SrobertA cycle C is said to be the *parent* of a basic block B if B occurs in
55*d415bd75SrobertC but not in any child cycle of C. Then B is also said to be a *child*
56*d415bd75Srobertof cycle C.
57*d415bd75Srobert
58*d415bd75Srobert.. _cycle-toplevel-block:
59*d415bd75Srobert
60*d415bd75SrobertA block B is said to be a *top-level block* if it is not the child of
61*d415bd75Srobertany cycle.
62*d415bd75Srobert
63*d415bd75Srobert.. _cycle-sibling:
64*d415bd75Srobert
65*d415bd75SrobertA basic block or cycle X is a *sibling* of another basic block or
66*d415bd75Srobertcycle Y if they both have no parent or both have the same parent.
67*d415bd75Srobert
68*d415bd75SrobertInformational notes:
69*d415bd75Srobert
70*d415bd75Srobert- Non-header entry blocks of a cycle can be contained in child cycles.
71*d415bd75Srobert- If the CFG is reducible, the cycles are exactly the natural loops and
72*d415bd75Srobert  every cycle has exactly one entry block.
73*d415bd75Srobert- Cycles are well-nested (by definition).
74*d415bd75Srobert- The entry blocks of a cycle are siblings in the dominator tree.
75*d415bd75Srobert
76*d415bd75Srobert.. [HavlakCycles] Paul Havlak, "Nesting of reducible and irreducible
77*d415bd75Srobert                  loops." ACM Transactions on Programming Languages
78*d415bd75Srobert                  and Systems (TOPLAS) 19.4 (1997): 557-567.
79*d415bd75Srobert
80*d415bd75Srobert.. _cycle-examples:
81*d415bd75Srobert
82*d415bd75SrobertExamples of Cycles
83*d415bd75Srobert==================
84*d415bd75Srobert
85*d415bd75SrobertIrreducible cycle enclosing natural loops
86*d415bd75Srobert-----------------------------------------
87*d415bd75Srobert
88*d415bd75Srobert.. Graphviz source; the indented blocks below form a comment.
89*d415bd75Srobert
90*d415bd75Srobert  ///     |   |
91*d415bd75Srobert  ///   />A] [B<\
92*d415bd75Srobert  ///   |  \ /  |
93*d415bd75Srobert  ///   ^---C---^
94*d415bd75Srobert  ///       |
95*d415bd75Srobert
96*d415bd75Srobert  strict digraph {
97*d415bd75Srobert    { rank=same; A B}
98*d415bd75Srobert    Entry -> A
99*d415bd75Srobert    Entry -> B
100*d415bd75Srobert    A -> A
101*d415bd75Srobert    A -> C
102*d415bd75Srobert    B -> B
103*d415bd75Srobert    B -> C
104*d415bd75Srobert    C -> A
105*d415bd75Srobert    C -> B
106*d415bd75Srobert    C -> Exit
107*d415bd75Srobert  }
108*d415bd75Srobert
109*d415bd75Srobert.. image:: cycle-1.png
110*d415bd75Srobert
111*d415bd75SrobertThe self-loops of ``A`` and ``B`` give rise to two single-block
112*d415bd75Srobertnatural loops. A possible hierarchy of cycles is::
113*d415bd75Srobert
114*d415bd75Srobert    cycle: {A, B, C} entries: {A, B} header: A
115*d415bd75Srobert    - cycle: {B, C}  entries: {B, C} header: C
116*d415bd75Srobert      - cycle: {B}   entries: {B}    header: B
117*d415bd75Srobert
118*d415bd75SrobertThis hierarchy arises when DFS visits the blocks in the order ``A``,
119*d415bd75Srobert``C``, ``B`` (in preorder).
120*d415bd75Srobert
121*d415bd75SrobertIrreducible union of two natural loops
122*d415bd75Srobert--------------------------------------
123*d415bd75Srobert
124*d415bd75Srobert.. Graphviz source; the indented blocks below form a comment.
125*d415bd75Srobert
126*d415bd75Srobert  ///     |   |
127*d415bd75Srobert  ///     A<->B
128*d415bd75Srobert  ///     ^   ^
129*d415bd75Srobert  ///     |   |
130*d415bd75Srobert  ///     v   v
131*d415bd75Srobert  ///     C   D
132*d415bd75Srobert  ///     |   |
133*d415bd75Srobert
134*d415bd75Srobert  strict digraph {
135*d415bd75Srobert    { rank=same; A B}
136*d415bd75Srobert    { rank=same; C D}
137*d415bd75Srobert    Entry -> A
138*d415bd75Srobert    Entry -> B
139*d415bd75Srobert    A -> B
140*d415bd75Srobert    B -> A
141*d415bd75Srobert    A -> C
142*d415bd75Srobert    C -> A
143*d415bd75Srobert    B -> D
144*d415bd75Srobert    D -> B
145*d415bd75Srobert    C -> Exit
146*d415bd75Srobert    D -> Exit
147*d415bd75Srobert  }
148*d415bd75Srobert
149*d415bd75Srobert.. image:: cycle-2.png
150*d415bd75Srobert
151*d415bd75SrobertThere are two natural loops: ``{A, C}`` and ``{B, D}``. A possible
152*d415bd75Sroberthierarchy of cycles is::
153*d415bd75Srobert
154*d415bd75Srobert    cycle: {A, B, C, D} entries: {A, B} header: A
155*d415bd75Srobert    - cycle: {B, D}     entries: {B}    header: B
156*d415bd75Srobert
157*d415bd75SrobertIrreducible cycle without natural loops
158*d415bd75Srobert---------------------------------------
159*d415bd75Srobert
160*d415bd75Srobert.. Graphviz source; the indented blocks below form a comment.
161*d415bd75Srobert
162*d415bd75Srobert  ///     |   |
163*d415bd75Srobert  ///   />A   B<\
164*d415bd75Srobert  ///   | |\ /| |
165*d415bd75Srobert  ///   | | x | |
166*d415bd75Srobert  ///   | |/ \| |
167*d415bd75Srobert  ///   ^-C   D-^
168*d415bd75Srobert  ///     |   |
169*d415bd75Srobert  ///
170*d415bd75Srobert
171*d415bd75Srobert  strict digraph {
172*d415bd75Srobert    { rank=same; A B}
173*d415bd75Srobert    { rank=same; C D}
174*d415bd75Srobert    Entry -> A
175*d415bd75Srobert    Entry -> B
176*d415bd75Srobert    A -> C
177*d415bd75Srobert    A -> D
178*d415bd75Srobert    B -> C
179*d415bd75Srobert    B -> D
180*d415bd75Srobert    C -> A
181*d415bd75Srobert    D -> B
182*d415bd75Srobert    C -> Exit
183*d415bd75Srobert    D -> Exit
184*d415bd75Srobert  }
185*d415bd75Srobert
186*d415bd75Srobert.. image:: cycle-3.png
187*d415bd75Srobert
188*d415bd75SrobertThis graph does not contain any natural loops --- the nodes ``A``,
189*d415bd75Srobert``B``, ``C`` and ``D`` are siblings in the dominator tree. A possible
190*d415bd75Sroberthierarchy of cycles is::
191*d415bd75Srobert
192*d415bd75Srobert    cycle: {A, B, C, D} entries: {A, B} header: A
193*d415bd75Srobert    - cycle: {B, D}     entries: {B, D} header: D
194*d415bd75Srobert
195*d415bd75Srobert.. _cycle-closed-path:
196*d415bd75Srobert
197*d415bd75SrobertClosed Paths and Cycles
198*d415bd75Srobert=======================
199*d415bd75Srobert
200*d415bd75SrobertA *closed path* in a CFG is a connected sequence of nodes and edges in
201*d415bd75Srobertthe CFG whose start and end nodes are the same, and whose remaining
202*d415bd75Srobert(inner) nodes are distinct.
203*d415bd75Srobert
204*d415bd75SrobertAn *entry* to a closed path ``P`` is a node on ``P`` that is reachable
205*d415bd75Srobertfrom the function entry without passing through any other node on ``P``.
206*d415bd75Srobert
207*d415bd75Srobert1. If a node D dominates one or more nodes in a closed path P and P
208*d415bd75Srobert   does not contain D, then D dominates every node in P.
209*d415bd75Srobert
210*d415bd75Srobert   **Proof:** Let U be a node in P that is dominated by D. If there
211*d415bd75Srobert   was a node V in P not dominated by D, then U would be reachable
212*d415bd75Srobert   from the function entry node via V without passing through D, which
213*d415bd75Srobert   contradicts the fact that D dominates U.
214*d415bd75Srobert
215*d415bd75Srobert2. If a node D dominates one or more nodes in a closed path P and P
216*d415bd75Srobert   does not contain D, then there exists a cycle C that contains P but
217*d415bd75Srobert   not D.
218*d415bd75Srobert
219*d415bd75Srobert   **Proof:** From the above property, D dominates all the nodes in P.
220*d415bd75Srobert   For any nesting of cycles discovered by the implementation-defined
221*d415bd75Srobert   DFS, consider the smallest cycle C which contains P. For the sake
222*d415bd75Srobert   of contradiction, assume that D is in C. Then the header H of C
223*d415bd75Srobert   cannot be in P, since the header of a cycle cannot be dominated by
224*d415bd75Srobert   any other node in the cycle. Thus, P is in the set (C-H), and there
225*d415bd75Srobert   must be a smaller cycle C' in C which also contains P, but that
226*d415bd75Srobert   contradicts how we chose C.
227*d415bd75Srobert
228*d415bd75Srobert3. If a closed path P contains nodes U1 and U2 but not their
229*d415bd75Srobert   dominators D1 and D2 respectively, then there exists a cycle C that
230*d415bd75Srobert   contains U1 and U2 but neither of D1 and D2.
231*d415bd75Srobert
232*d415bd75Srobert   **Proof:** From the above properties, each D1 and D2 separately
233*d415bd75Srobert   dominate every node in P. There exists a cycle C1 (respectively,
234*d415bd75Srobert   C2) that contains P but not D1 (respectively, D2). Either C1 and C2
235*d415bd75Srobert   are the same cycle, or one of them is nested inside the other.
236*d415bd75Srobert   Hence there is always a cycle that contains U1 and U2 but neither
237*d415bd75Srobert   of D1 and D2.
238*d415bd75Srobert
239*d415bd75Srobert.. _cycle-closed-path-header:
240*d415bd75Srobert
241*d415bd75Srobert4. In any cycle hierarchy, the header ``H`` of the smallest cycle
242*d415bd75Srobert   ``C`` containing a closed path ``P`` itself lies on ``P``.
243*d415bd75Srobert
244*d415bd75Srobert   **Proof:** If ``H`` is not in ``P``, then there is a smaller cycle
245*d415bd75Srobert   ``C'`` in the set ``C - H`` containing ``P``, thus contradicting
246*d415bd75Srobert   the claim that ``C`` is the smallest such cycle.
247*d415bd75Srobert
248*d415bd75Srobert.. _cycle-reducible-headers:
249*d415bd75Srobert
250*d415bd75SrobertReducible Cycle Headers
251*d415bd75Srobert=======================
252*d415bd75Srobert
253*d415bd75SrobertAlthough the cycle hierarchy depends on the DFS chosen, reducible
254*d415bd75Srobertcycles satisfy the following invariant:
255*d415bd75Srobert
256*d415bd75Srobert  If a reducible cycle ``C`` with header ``H`` is discovered in any
257*d415bd75Srobert  DFS, then there exists a cycle ``C'`` in every DFS with header
258*d415bd75Srobert  ``H``, that contains ``C``.
259*d415bd75Srobert
260*d415bd75Srobert**Proof:** For a closed path ``P`` in ``C`` that passes through ``H``,
261*d415bd75Srobertevery cycle hierarchy has a smallest cycle ``C'`` containing ``P`` and
262*d415bd75Srobertwhose header is in ``P``. Since ``H`` is the only entry to ``P``,
263*d415bd75Srobert``H`` must be the header of ``C'``. Since headers uniquely define
264*d415bd75Srobertcycles, ``C'`` contains every such closed path ``P``, and hence ``C'``
265*d415bd75Srobertcontains ``C``.
266