@c Copyright (C) 2006-2020 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This section describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.
Most of the interfaces and data structures are declared in
@file{cfgloop.h}.  Loop structures are analyzed, and this information is
discarded or updated at the discretion of the individual passes.  Still,
most of the generic CFG manipulation routines are aware of loop
structures and try to keep them up-to-date.  By this means an increasing
part of the compilation pipeline is set up to maintain loop structures
across passes, allowing meta information to be attached to individual
loops for consumption by later passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information and the structure of the
induction variables in the loops is used to determine whether the
latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.
You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery; note,
however, that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query membership of blocks in loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of a loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
the CFG@.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned an
index (the @code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{larray} vector in the loops structure.  The indices do not have to
be contiguous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee on
the relative order of a loop and its subloops in the numbering.  The
index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The @code{number_of_loops} function returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.
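For instance, iteration over all loops from innermost to outermost might
look like the following sketch.  This is only an illustration, assuming
loop structures have already been initialized; the exact iterator flag
names, such as @code{LI_FROM_INNERMOST}, should be checked against
@file{cfgloop.h}:

@smallexample
struct loop *loop;

FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
  @{
    /* Process LOOP here.  Removing LOOP itself during the
       traversal is allowed, but newly created loops are not
       visited.  */
  @}
@end smallexample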
The @code{flags} argument of the macro is used to determine the
direction of traversal and the set of loops visited.  Each loop is
guaranteed to be visited exactly once, regardless of the changes to the
loop tree, and the loops may be removed during the traversal.  Newly
created loops are never traversed; if they need to be visited, this
must be done separately after their creation.

Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to @code{NULL}.
Most of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.
The properties can be queried using @code{loops_state_satisfies_p}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.  When loop structures are set
up to be preserved across passes, this function reduces the information
that must be kept up-to-date to a minimum (only
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} is set).

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loops structures if
@code{current_loops} is set.

At the moment, loop structure is preserved from the start of GIMPLE
loop optimizations until the end of RTL loop optimizations.  During
this time a loop can be tracked by its @code{struct loop} and number.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.
The most useful fields of the loop structure (that are kept up-to-date
at all times) are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{loop_depth}: The depth of the loop in the loops tree, i.e.,
the number of super-loops of the loop.
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
loop in depth-first search order in the reversed CFG, ordered by the
dominance relation, and in breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is used.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits the exit edge of a loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}: This function creates a copy of a loop, and
a branch before them that selects one of them depending on the
prescribed condition.
This is useful for optimizations that need to
verify some assumptions at runtime (one of the copies of the loop is
usually left unchanged, while the other one is transformed in some
way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of the SSA form during loop transformations
simpler.  Without LCSSA, operations like loop unrolling may force
creation of PHI nodes arbitrarily far from the loop, while in LCSSA, the
SSA form can be updated locally.  However, since we only keep real
operands in LCSSA, we cannot use this advantage (we could have local
updating of real operands, but it is not much more efficient than using
generic SSA form updating for it as well; the amount of changes to SSA
is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use it
to express values of polynomial induction variables, but it is possible
to extend it).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use the scalar evolutions analysis,
@code{scev_initialize} must be used.  To stop using SCEV,
@code{scev_finalize} should be used.
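A typical use of these interfaces might look like the following sketch.
It is only an illustration, assuming a @code{loop} and an SSA
@code{name} of interest are at hand, and it uses only the functions
described in this section:

@smallexample
scev_initialize ();

/* Analyze the evolution of NAME in LOOP; the result may still
   reference other SSA names defined inside the loop.  */
tree chrec = analyze_scalar_evolution (loop, name);

/* Resolve such references across the whole loop nest.  */
chrec = instantiate_parameters (loop, chrec);

/* ... use CHREC; after transforming code, call scev_reset ().  */

scev_finalize ();
@end smallexample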
SCEV analysis caches its results in
order to save time and memory.  This cache, however, is invalidated by
most of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV, however,
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.
A polynomial chrec
has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).

In many cases, one wants to restrict the attention just to affine
induction variables.  In this case, the extra expressive power of SCEV
is not useful, and may complicate the optimizations.  In this case, the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simple and only allows
analysis of affine induction variables, and only in one loop at once.
The interface is declared in @file{cfgloop.h}.  Before analyzing
induction variables in a loop L, the @code{iv_analysis_loop_init}
function must be called on L.
After the analysis (possibly calling @code{iv_analysis_loop_init} for
several loops) is finished, @code{iv_analysis_done} should be called.
The following functions can be used to access the results of the
analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.
However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} 0, @code{mult} 1
and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy to determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated calls to number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a
single-exit loop (i.e., the @code{number_of_latch_executions} increased
by one).

On GIMPLE, the constraint flags below affect the semantics of some APIs
of the number of iterations analyzer:

@itemize
@item @code{LOOP_C_INFINITE}: If this constraint flag is set, the loop
is known to be infinite.  APIs like @code{number_of_iterations_exit} can
return false directly without doing any analysis.
@item @code{LOOP_C_FINITE}: If this constraint flag is set, the loop is
known to be finite; in other words, the loop's number of iterations can
be computed with @code{assumptions} being true.
@end itemize

Generally, the constraint flags are set/cleared by consumers, which are
loop optimizers.  It's also the consumers' responsibility to set/clear
constraints correctly.  Failing to do that might result in hard to track
down bugs in scev/niter consumers.  One typical use case is the
vectorizer: it drives the number of iterations analyzer by setting
@code{LOOP_C_FINITE} and vectorizes a possibly infinite loop by
versioning the loop with the analysis result.  In return, constraints
set by consumers can also help the number of iterations analyzer in
subsequent optimizers.  For example, @code{niter} of a loop versioned
under @code{assumptions} is valid unconditionally.

Other constraints may be added in the future; for example, a constraint
indicating that a loop's latch must roll, and thus @code{may_be_zero}
would be false unconditionally.

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that can potentially
interfere with the data references in the analyzed loop, the analysis
stops while scanning the loop body for data references, and inserts a
single @code{chrec_dont_know} in the dependence relation array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.
This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled: ARRAY_REF,
INDIRECT_REF and COMPONENT_REF@.  The data structure for the data
reference is @code{data_reference}, where @code{data_reference_p} is the
name of a pointer to the data reference structure.  The structure
contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base object
of the data reference and its access functions.  These access functions
represent the evolution of the data reference in the loop relative to
its base, in keeping with the classical meaning of the data reference
access function for the support of arrays.  For example, for a reference
@code{a.b[i][j]}, the base object is @code{a.b} and the access functions,
one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.
For
example:

@smallexample
for1 i
   for2 j
    *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we cannot
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.

@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and there could potentially exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references; in the
latter case the description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.

@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.