@c Copyright (c) 2006, 2007, 2008 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and on RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
* Lambda::                      Linear loop transformations framework.
* Omega::                       A solver for linear programming problems.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are declared
in @file{cfgloop.h}.  At the moment, loop structures are analyzed and
this information is updated only by the optimization passes that deal
with loops, but some efforts are being made to make it available
throughout most of the optimization passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information and the structure of the
induction variables in the loops is used to determine whether the
latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.  You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery; note,
however, that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query the membership of blocks in loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of the loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
the CFG@.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned an
index (the @code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{larray} vector in the loops structure.  The indices do not have to
be contiguous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee on
the relative order of a loop and its subloops in the numbering.  The
index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The function @code{number_of_loops} returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the macro is
used to determine the direction of traversal and the set of loops
visited.  Each loop is guaranteed to be visited exactly once, regardless
of the changes to the loop tree, and the loops may be removed during the
traversal.  Newly created loops are never traversed; if they need to be
visited, this must be done separately after their creation.  The
@code{FOR_EACH_LOOP} macro allocates temporary variables.  If a
@code{FOR_EACH_LOOP} loop is exited using @code{break} or @code{goto},
these variables would not be released; the @code{FOR_EACH_LOOP_BREAK}
macro must be used instead.
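
For illustration, the following sketch walks all loops from the
innermost outward and stops early at the first loop satisfying some
predicate.  It assumes the iterator type @code{loop_iterator} and the
traversal flag @code{LI_FROM_INNERMOST} declared in @file{cfgloop.h};
@code{interesting_loop_p} and @code{transform_loop} are hypothetical
functions standing in for a real pass:

@smallexample
loop_iterator li;
struct loop *loop;

FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
  @{
    if (interesting_loop_p (loop))
      @{
        transform_loop (loop);
        /* Do not leave the traversal with break or goto; this
           releases the temporary variables of the iterator.  */
        FOR_EACH_LOOP_BREAK (li);
      @}
  @}
@end smallexample
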
Each basic block contains the reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to @code{NULL}.
Most of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.
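
For example, a pass that wants preheaders, simple latches and recorded
exits for the whole duration of its work might be structured as follows
(a minimal sketch; the flag combination is illustrative):

@smallexample
loop_optimizer_init (LOOPS_HAVE_PREHEADERS
                     | LOOPS_HAVE_SIMPLE_LATCHES
                     | LOOPS_HAVE_RECORDED_EXITS);

/* @dots{} work with the loops; current_loops is now set @dots{} */

loop_optimizer_finalize ();
@end smallexample
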
The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loops structures if
@code{current_loops} is set.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (that are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
loop in depth-first search order in reversed CFG, ordered by dominance
relation, and breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is used.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize
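
Putting some of these together, the following sketch processes the body
of the innermost loop containing a basic block @code{bb}, provided the
loop has a single (recorded) exit; @code{process_block} is a
hypothetical function:

@smallexample
struct loop *loop = bb->loop_father;
edge exit = single_exit (loop);

if (exit)
  @{
    basic_block *body = get_loop_body_in_dom_order (loop);
    unsigned i;

    for (i = 0; i < loop->num_nodes; i++)
      process_block (body[i]);
    free (body);
  @}
@end smallexample
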
@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits an exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment; see the sketch after this list.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before them that selects one of
them depending on the prescribed condition.  This is useful for
optimizations that need to verify some assumptions at run time (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize
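
The following sketch creates an integer induction variable iterating
from 0 with step 1.  The iterator type for the increment position is the
statement iterator of the release in use (@code{block_stmt_iterator} in
older releases, @code{gimple_stmt_iterator} after the switch to tuples),
so treat the exact signatures as an assumption to check against
@file{tree-ssa-loop-manip.c}:

@smallexample
block_stmt_iterator incr_pos;
bool insert_after;
tree var_before, var_after;

/* Find the canonical position for the increment of the new iv.  */
standard_iv_increment_position (loop, &incr_pos, &insert_after);

/* Create the iv; var_before is its value at the beginning of an
   iteration, var_after the value after the increment.  */
create_iv (build_int_cst (integer_type_node, 0),
           build_int_cst (integer_type_node, 1),
           NULL_TREE, loop, &incr_pos, insert_after,
           &var_before, &var_after);
@end smallexample
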
@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' (LCSSA).  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.
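
For illustration, consider a value defined inside a loop and used behind
it; in LCSSA, a new PHI node created at the loop exit closes the use
(the SSA names are illustrative):

@smallexample
/* Without LCSSA: x_1 is used outside of the loop defining it.  */
while (@dots{})
  @{
    x_1 = @dots{};
  @}
use (x_1);

/* With LCSSA: a PHI node at the loop exit closes the use.  */
while (@dots{})
  @{
    x_1 = @dots{};
  @}
x_2 = phi (x_1);
use (x_2);
@end smallexample
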
There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than using generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use
them to express values of polynomial induction variables, but the
representation could be extended).  The interfaces to SCEV analysis are
declared in @file{tree-scalar-evolution.h}.  To use the scalar
evolutions analysis, @code{scev_initialize} must be called first; to
stop using SCEV, call @code{scev_finalize}.  SCEV analysis caches
results in order to save time and memory.  This cache, however, is
invalidated by most of the loop transformations, including removal of
code.  If such a transformation is performed, @code{scev_reset} must be
called to clean the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV, however,
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with the whole nest of loops
at once.  It will try replacing all SSA names by their SCEV in all
loops, including the super-loops of the current loop, thus providing
complete information about the behavior of the variable in the loop
nest.  @code{resolve_mixers} is useful if you work with only one loop at
a time, and if you possibly need to create code based on the value of
the induction variable.  It will only resolve the SSA names defined in
the current loop, leaving the SSA names defined outside unchanged, even
if their evolution in the outer loops is known.
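
A typical query, sketched below under the assumption that loop
structures are already initialized, analyzes the evolution of an SSA
name @code{name} with respect to loop @code{loop} and instantiates the
result (@code{chrec_dont_know}, declared in @file{tree-chrec.h}, denotes
an unanalyzable value):

@smallexample
tree chrec;

scev_initialize ();

chrec = analyze_scalar_evolution (loop, name);
chrec = instantiate_parameters (loop, chrec);

if (chrec != chrec_dont_know)
  @{
    /* chrec is a tree expression, typically a POLYNOMIAL_CHREC
       for an induction variable.  */
    @dots{}
  @}

scev_finalize ();
@end smallexample
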
The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).

In many cases, one wants to restrict the attention just to affine
induction variables.  The extra expressive power of SCEV is then not
useful, and may complicate the optimizations.  In this situation, the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler; it only allows
analysis of affine induction variables, and only one loop at a time.
The interface is declared in @file{cfgloop.h}.  Before analyzing
induction variables in a loop L, the @code{iv_analysis_loop_init}
function must be called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} 0, @code{mult} 1
and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.
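
The intended usage pattern, sketched below, analyzes a register
@code{reg} used in insn @code{insn} inside loop @code{loop}; treat the
exact argument order as an assumption to verify against
@file{loop-iv.c}:

@smallexample
struct rtx_iv iv;

iv_analysis_loop_init (loop);

if (iv_analyze (insn, reg, &iv))
  @{
    /* iv describes an affine induction variable; get_iv_value
       computes its value in the given iteration.  */
    rtx first = get_iv_value (&iv, const0_rtx);
    @dots{}
  @}

iv_analysis_done ();
@end smallexample
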
@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations, i.e., the number of executions of
the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated queries for the number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a
single-exit loop (i.e., the @code{number_of_latch_executions} increased
by one).
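
On GIMPLE, a query for the number of iterations before a known exit edge
@code{exit} of @code{loop} is taken might look as follows (a sketch; the
last argument, controlling warnings, is an assumption to check against
@file{tree-ssa-loop-niter.c}):

@smallexample
struct tree_niter_desc niter;

if (number_of_iterations_exit (loop, exit, &niter, false)
    && integer_nonzerop (niter.assumptions)
    && integer_zerop (niter.may_be_zero))
  @{
    /* niter.niter is an expression for the number of executions
       of the latch before the exit is taken.  */
    @dots{}
  @}
@end smallexample
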
@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references thus occur in the array in
the same order as they syntactically occur in the program.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled: @code{ARRAY_REF},
@code{INDIRECT_REF} and @code{COMPONENT_REF}.  The data structure for
the data reference is @code{data_reference}, where
@code{data_reference_p} is the name of a pointer to the data reference
structure.  The structure contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base
object of the data reference and its access functions.  These access
functions represent the evolution of the data reference in the loop
relative to its base, in keeping with the classical meaning of the data
reference access function for the support of arrays.  For example, for a
reference @code{a.b[i][j]}, the base object is @code{a.b} and the access
functions, one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the
first location accessed by the data reference in the loop and about the
access function used to represent evolution relative to this location.
This data is used to support pointers, and is not used for arrays (for
which we have base objects).  Pointer accesses are represented as a
one-dimensional access that starts from the first location accessed in
the loop.  For example:

@smallexample
for1 i
   for2 j
    *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and potentially there could exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references; in this
last case, the description of the dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a subscript
is the tuple composed of the access functions for a given dimension.
For example, given @code{A[f1][f2][f3]} and @code{B[g1][g2][g3]}, there
are three subscripts: @code{(f1, g1), (f2, g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints with a
maximum verbosity the details of a data dependence relations array,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.
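
A client of the analysis, sketched here with the @code{VEC} macros used
at the time of writing (check the current vector API in
@file{tree-data-ref.h}), might look as follows; @code{give_up} and
@code{examine} are hypothetical placeholders for the handling of the
three possible outcomes of @code{are_dependent}:

@smallexample
VEC (data_reference_p, heap) *datarefs
  = VEC_alloc (data_reference_p, heap, 10);
VEC (ddr_p, heap) *ddrs = VEC_alloc (ddr_p, heap, 10);
ddr_p ddr;
unsigned i;

compute_data_dependences_for_loop (loop, true, &datarefs, &ddrs);

for (i = 0; VEC_iterate (ddr_p, ddrs, i, ddr); i++)
  @{
    if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
      continue;        /* Independence was proved.  */
    else if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
      give_up ();      /* The analysis could not decide.  */
    else
      examine (ddr);   /* A dependence; inspect the subscripts
                          and distance/direction vectors.  */
  @}

free_dependence_relations (ddrs);
free_data_refs (datarefs);
@end smallexample
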
@node Lambda
@section Linear loop transformations framework
@cindex Linear loop transformations framework

Lambda is a framework that allows transformations of loops using
non-singular matrix-based transformations of the iteration space and
loop bounds.  This allows compositions of skewing, scaling, interchange,
and reversal transformations.  These transformations are often used to
improve cache behavior or remove inner loop dependencies to allow
parallelization and vectorization to take place.

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return @code{NULL}.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.); Lambda simply applies whatever
matrix it is given.  The framework could be extended to make legal
matrices out of any non-singular matrix, but this is not currently
implemented.  Legality of a matrix for a given loopnest can be verified
using @code{lambda_transform_legal_p}.

Given a transformed loopnest, conversion back into GCC IR is done by
@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
loops so that they match the transformed loopnest.


@node Omega
@section Omega, a solver for linear programming problems
@cindex Omega, a solver for linear programming problems

The data dependence analysis contains several solvers triggered
sequentially from the less complex ones to the more sophisticated ones.
To ensure the consistency of the results of these solvers, a data
dependence check pass has been implemented based on two different
solvers.  The second method that has been integrated into GCC is based
on the Omega dependence solver, written in the 1990s by William Pugh
and David Wonnacott.  Data dependence tests can be formulated using a
subset of Presburger arithmetic that can be translated to linear
constraint systems.  These linear constraint systems can then be
solved using the Omega solver.

The Omega solver uses the Fourier-Motzkin algorithm for variable
elimination: a linear constraint system containing @code{n} variables
is reduced to a linear constraint system with @code{n-1} variables.
The Omega solver can also be used for solving other problems that can
be expressed in the form of a system of linear equalities and
inequalities.  The Omega solver is known to have an exponential worst
case, also known as the ``omega nightmare'' in the literature, but in
practice, the omega test is known to be efficient for the common data
dependence tests.

The interface used by the Omega solver for describing the linear
programming problems is declared in @file{omega.h}, and the solver
entry point is @code{omega_solve_problem}.
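
As an illustration of how a data dependence test becomes a linear
constraint system, consider asking whether @code{a[2*i]} and
@code{a[2*j + 1]} can refer to the same element for iterations within
the loop bounds (the encoding shown is conceptual, not the literal
@file{omega.h} interface):

@smallexample
0 <= i < n
0 <= j < n
2*i = 2*j + 1
@end smallexample

The equality has no integer solution, since its left-hand side is even
and its right-hand side is odd, so the solver proves that the two
accesses are independent.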