1@c Copyright (c) 2004, 2005 Free Software Foundation, Inc. 2@c Free Software Foundation, Inc. 3@c This is part of the GCC manual. 4@c For copying conditions, see the file gcc.texi. 5 6@c --------------------------------------------------------------------- 7@c Tree SSA 8@c --------------------------------------------------------------------- 9 10@node Tree SSA 11@chapter Analysis and Optimization of GIMPLE Trees 12@cindex Tree SSA 13@cindex Optimization infrastructure for GIMPLE 14 15GCC uses three main intermediate languages to represent the program 16during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a 17language-independent representation generated by each front end. It 18is used to serve as an interface between the parser and optimizer. 19GENERIC is a common representation that is able to represent programs 20written in all the languages supported by GCC@. 21 22GIMPLE and RTL are used to optimize the program. GIMPLE is used for 23target and language independent optimizations (e.g., inlining, 24constant propagation, tail call elimination, redundancy elimination, 25etc). Much like GENERIC, GIMPLE is a language independent, tree based 26representation. However, it differs from GENERIC in that the GIMPLE 27grammar is more restrictive: expressions contain no more than 3 28operands (except function calls), it has no control flow structures 29and expressions with side-effects are only allowed on the right hand 30side of assignments. See the chapter describing GENERIC and GIMPLE 31for more details. 32 33This chapter describes the data structures and functions used in the 34GIMPLE optimizers (also known as ``tree optimizers'' or ``middle 35end''). In particular, it focuses on all the macros, data structures, 36functions and programming constructs needed to implement optimization 37passes for GIMPLE@. 38 39@menu 40* GENERIC:: A high-level language-independent representation. 41* GIMPLE:: A lower-level factored tree representation. 42* Annotations:: Attributes for statements and variables. 43* Statement Operands:: Variables referenced by GIMPLE statements. 44* SSA:: Static Single Assignment representation. 45* Alias analysis:: Representing aliased loads and stores. 46@end menu 47 48@node GENERIC 49@section GENERIC 50@cindex GENERIC 51 52The purpose of GENERIC is simply to provide a language-independent way of 53representing an entire function in trees. To this end, it was necessary to 54add a few new tree codes to the back end, but most everything was already 55there. If you can express it with the codes in @code{gcc/tree.def}, it's 56GENERIC@. 57 58Early on, there was a great deal of debate about how to think about 59statements in a tree IL@. In GENERIC, a statement is defined as any 60expression whose value, if any, is ignored. A statement will always 61have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a 62non-statement expression may also have side effects. A 63@code{CALL_EXPR}, for instance. 64 65It would be possible for some local optimizations to work on the 66GENERIC form of a function; indeed, the adapted tree inliner works 67fine on GENERIC, but the current compiler performs inlining after 68lowering to GIMPLE (a restricted form described in the next section). 69Indeed, currently the frontends perform this lowering before handing 70off to @code{tree_rest_of_compilation}, but this seems inelegant. 71 72If necessary, a front end can use some language-dependent tree codes 73in its GENERIC representation, so long as it provides a hook for 74converting them to GIMPLE and doesn't expect them to work with any 75(hypothetical) optimizers that run before the conversion to GIMPLE@. 76The intermediate representation used while parsing C and C++ looks 77very little like GENERIC, but the C and C++ gimplifier hooks are 78perfectly happy to take it as input and spit out GIMPLE@. 79 80@node GIMPLE 81@section GIMPLE 82@cindex GIMPLE 83 84GIMPLE is a simplified subset of GENERIC for use in optimization. The 85particular subset chosen (and the name) was heavily influenced by the 86SIMPLE IL used by the McCAT compiler project at McGill University, 87though we have made some different choices. For one thing, SIMPLE 88doesn't support @code{goto}; a production compiler can't afford that 89kind of restriction. 90 91GIMPLE retains much of the structure of the parse trees: lexical 92scopes are represented as containers, rather than markers. However, 93expressions are broken down into a 3-address form, using temporary 94variables to hold intermediate values. Also, control structures are 95lowered to gotos. 96 97In GIMPLE no container node is ever used for its value; if a 98@code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a 99temporary within the controlled blocks, and that temporary is used in 100place of the container. 101 102The compiler pass which lowers GENERIC to GIMPLE is referred to as the 103@samp{gimplifier}. The gimplifier works recursively, replacing complex 104statements with sequences of simple statements. 105 106@c Currently, the only way to 107@c tell whether or not an expression is in GIMPLE form is by recursively 108@c examining it; in the future there will probably be a flag to help avoid 109@c redundant work. FIXME FIXME 110 111@menu 112* Interfaces:: 113* Temporaries:: 114* GIMPLE Expressions:: 115* Statements:: 116* GIMPLE Example:: 117* Rough GIMPLE Grammar:: 118@end menu 119 120@node Interfaces 121@subsection Interfaces 122@cindex gimplification 123 124The tree representation of a function is stored in 125@code{DECL_SAVED_TREE}. It is lowered to GIMPLE by a call to 126@code{gimplify_function_tree}. 127 128If a front end wants to include language-specific tree codes in the tree 129representation which it provides to the back end, it must provide a 130definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to 131convert the front end trees to GIMPLE@. Usually such a hook will involve 132much of the same code for expanding front end trees to RTL@. This function 133can return fully lowered GIMPLE, or it can return GENERIC trees and let the 134main gimplifier lower them the rest of the way; this is often simpler. 135GIMPLE that is not fully lowered is known as ``high GIMPLE'' and 136consists of the IL before the pass @code{pass_lower_cf}. High GIMPLE 137still contains lexical scopes and nested expressions, while low GIMPLE 138exposes all of the implicit jumps for control expressions like 139@code{COND_EXPR}. 140 141The C and C++ front ends currently convert directly from front end 142trees to GIMPLE, and hand that off to the back end rather than first 143converting to GENERIC@. Their gimplifier hooks know about all the 144@code{_STMT} nodes and how to convert them to GENERIC forms. There 145was some work done on a genericization pass which would run first, but 146the existence of @code{STMT_EXPR} meant that in order to convert all 147of the C statements into GENERIC equivalents would involve walking the 148entire tree anyway, so it was simpler to lower all the way. This 149might change in the future if someone writes an optimization pass 150which would work better with higher-level trees, but currently the 151optimizers all expect GIMPLE@. 152 153A front end which wants to use the tree optimizers (and already has 154some sort of whole-function tree representation) only needs to provide 155a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call 156@code{gimplify_function_tree} to lower to GIMPLE, and then hand off to 157@code{tree_rest_of_compilation} to compile and output the function. 158 159You can tell the compiler to dump a C-like representation of the GIMPLE 160form with the flag @option{-fdump-tree-gimple}. 161 162@node Temporaries 163@subsection Temporaries 164@cindex Temporaries 165 166When gimplification encounters a subexpression which is too complex, it 167creates a new temporary variable to hold the value of the subexpression, 168and adds a new statement to initialize it before the current statement. 169These special temporaries are known as @samp{expression temporaries}, and are 170allocated using @code{get_formal_tmp_var}. The compiler tries to 171always evaluate identical expressions into the same temporary, to simplify 172elimination of redundant calculations. 173 174We can only use expression temporaries when we know that it will not be 175reevaluated before its value is used, and that it will not be otherwise 176modified@footnote{These restrictions are derived from those in Morgan 4.8.}. 177Other temporaries can be allocated using 178@code{get_initialized_tmp_var} or @code{create_tmp_var}. 179 180Currently, an expression like @code{a = b + 5} is not reduced any 181further. We tried converting it to something like 182@smallexample 183 T1 = b + 5; 184 a = T1; 185@end smallexample 186but this bloated the representation for minimal benefit. However, a 187variable which must live in memory cannot appear in an expression; its 188value is explicitly loaded into a temporary first. Similarly, storing 189the value of an expression to a memory variable goes through a 190temporary. 191 192@node GIMPLE Expressions 193@subsection Expressions 194@cindex GIMPLE Expressions 195 196In general, expressions in GIMPLE consist of an operation and the 197appropriate number of simple operands; these operands must either be a 198GIMPLE rvalue (@code{is_gimple_val}), i.e.@: a constant or a register 199variable. More complex operands are factored out into temporaries, so 200that 201@smallexample 202 a = b + c + d 203@end smallexample 204becomes 205@smallexample 206 T1 = b + c; 207 a = T1 + d; 208@end smallexample 209 210The same rule holds for arguments to a @code{CALL_EXPR}. 211 212The target of an assignment is usually a variable, but can also be an 213@code{INDIRECT_REF} or a compound lvalue as described below. 214 215@menu 216* Compound Expressions:: 217* Compound Lvalues:: 218* Conditional Expressions:: 219* Logical Operators:: 220@end menu 221 222@node Compound Expressions 223@subsubsection Compound Expressions 224@cindex Compound Expressions 225 226The left-hand side of a C comma expression is simply moved into a separate 227statement. 228 229@node Compound Lvalues 230@subsubsection Compound Lvalues 231@cindex Compound Lvalues 232 233Currently compound lvalues involving array and structure field references 234are not broken down; an expression like @code{a.b[2] = 42} is not reduced 235any further (though complex array subscripts are). This restriction is a 236workaround for limitations in later optimizers; if we were to convert this 237to 238 239@smallexample 240 T1 = &a.b; 241 T1[2] = 42; 242@end smallexample 243 244alias analysis would not remember that the reference to @code{T1[2]} came 245by way of @code{a.b}, so it would think that the assignment could alias 246another member of @code{a}; this broke @code{struct-alias-1.c}. Future 247optimizer improvements may make this limitation unnecessary. 248 249@node Conditional Expressions 250@subsubsection Conditional Expressions 251@cindex Conditional Expressions 252 253A C @code{?:} expression is converted into an @code{if} statement with 254each branch assigning to the same temporary. So, 255 256@smallexample 257 a = b ? c : d; 258@end smallexample 259becomes 260@smallexample 261 if (b) 262 T1 = c; 263 else 264 T1 = d; 265 a = T1; 266@end smallexample 267 268Tree level if-conversion pass re-introduces @code{?:} expression, if appropriate. 269It is used to vectorize loops with conditions using vector conditional operations. 270 271Note that in GIMPLE, @code{if} statements are also represented using 272@code{COND_EXPR}, as described below. 273 274@node Logical Operators 275@subsubsection Logical Operators 276@cindex Logical Operators 277 278Except when they appear in the condition operand of a @code{COND_EXPR}, 279logical `and' and `or' operators are simplified as follows: 280@code{a = b && c} becomes 281 282@smallexample 283 T1 = (bool)b; 284 if (T1) 285 T1 = (bool)c; 286 a = T1; 287@end smallexample 288 289Note that @code{T1} in this example cannot be an expression temporary, 290because it has two different assignments. 291 292@node Statements 293@subsection Statements 294@cindex Statements 295 296Most statements will be assignment statements, represented by 297@code{MODIFY_EXPR}. A @code{CALL_EXPR} whose value is ignored can 298also be a statement. No other C expressions can appear at statement level; 299a reference to a volatile object is converted into a @code{MODIFY_EXPR}. 300In GIMPLE form, type of @code{MODIFY_EXPR} is not meaningful. Instead, use type 301of LHS or RHS@. 302 303There are also several varieties of complex statements. 304 305@menu 306* Blocks:: 307* Statement Sequences:: 308* Empty Statements:: 309* Loops:: 310* Selection Statements:: 311* Jumps:: 312* Cleanups:: 313* GIMPLE Exception Handling:: 314@end menu 315 316@node Blocks 317@subsubsection Blocks 318@cindex Blocks 319 320Block scopes and the variables they declare in GENERIC and GIMPLE are 321expressed using the @code{BIND_EXPR} code, which in previous versions of 322GCC was primarily used for the C statement-expression extension. 323 324Variables in a block are collected into @code{BIND_EXPR_VARS} in 325declaration order. Any runtime initialization is moved out of 326@code{DECL_INITIAL} and into a statement in the controlled block. When 327gimplifying from C or C++, this initialization replaces the 328@code{DECL_STMT}. 329 330Variable-length arrays (VLAs) complicate this process, as their size often 331refers to variables initialized earlier in the block. To handle this, we 332currently split the block at that point, and move the VLA into a new, inner 333@code{BIND_EXPR}. This strategy may change in the future. 334 335@code{DECL_SAVED_TREE} for a GIMPLE function will always be a 336@code{BIND_EXPR} which contains declarations for the temporary variables 337used in the function. 338 339A C++ program will usually contain more @code{BIND_EXPR}s than there are 340syntactic blocks in the source code, since several C++ constructs have 341implicit scopes associated with them. On the other hand, although the C++ 342front end uses pseudo-scopes to handle cleanups for objects with 343destructors, these don't translate into the GIMPLE form; multiple 344declarations at the same level use the same @code{BIND_EXPR}. 345 346@node Statement Sequences 347@subsubsection Statement Sequences 348@cindex Statement Sequences 349 350Multiple statements at the same nesting level are collected into a 351@code{STATEMENT_LIST}. Statement lists are modified and traversed 352using the interface in @samp{tree-iterator.h}. 353 354@node Empty Statements 355@subsubsection Empty Statements 356@cindex Empty Statements 357 358Whenever possible, statements with no effect are discarded. But if they 359are nested within another construct which cannot be discarded for some 360reason, they are instead replaced with an empty statement, generated by 361@code{build_empty_stmt}. Initially, all empty statements were shared, 362after the pattern of the Java front end, but this caused a lot of trouble in 363practice. 364 365An empty statement is represented as @code{(void)0}. 366 367@node Loops 368@subsubsection Loops 369@cindex Loops 370 371At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but 372now they are lowered to explicit gotos. 373 374@node Selection Statements 375@subsubsection Selection Statements 376@cindex Selection Statements 377 378A simple selection statement, such as the C @code{if} statement, is 379expressed in GIMPLE using a void @code{COND_EXPR}. If only one branch is 380used, the other is filled with an empty statement. 381 382Normally, the condition expression is reduced to a simple comparison. If 383it is a shortcut (@code{&&} or @code{||}) expression, however, we try to 384break up the @code{if} into multiple @code{if}s so that the implied shortcut 385is taken directly, much like the transformation done by @code{do_jump} in 386the RTL expander. 387 388A @code{SWITCH_EXPR} in GIMPLE contains the condition and a 389@code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values 390and corresponding @code{LABEL_DECL}s to jump to. The body of the 391@code{switch} is moved after the @code{SWITCH_EXPR}. 392 393@node Jumps 394@subsubsection Jumps 395@cindex Jumps 396 397Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}. 398 399The operand of a @code{GOTO_EXPR} must be either a label or a variable 400containing the address to jump to. 401 402The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE}, 403@code{RESULT_DECL}, or a @code{MODIFY_EXPR} which sets the return value. It 404would be nice to move the @code{MODIFY_EXPR} into a separate statement, but the 405special return semantics in @code{expand_return} make that difficult. It may 406still happen in the future, perhaps by moving most of that logic into 407@code{expand_assignment}. 408 409@node Cleanups 410@subsubsection Cleanups 411@cindex Cleanups 412 413Destructors for local C++ objects and similar dynamic cleanups are 414represented in GIMPLE by a @code{TRY_FINALLY_EXPR}. 415@code{TRY_FINALLY_EXPR} has two operands, both of which are a sequence 416of statements to execute. The first sequence is executed. When it 417completes the second sequence is executed. 418 419The first sequence may complete in the following ways: 420 421@enumerate 422 423@item Execute the last statement in the sequence and fall off the 424end. 425 426@item Execute a goto statement (@code{GOTO_EXPR}) to an ordinary 427label outside the sequence. 428 429@item Execute a return statement (@code{RETURN_EXPR}). 430 431@item Throw an exception. This is currently not explicitly represented in 432GIMPLE. 433 434@end enumerate 435 436The second sequence is not executed if the first sequence completes by 437calling @code{setjmp} or @code{exit} or any other function that does 438not return. The second sequence is also not executed if the first 439sequence completes via a non-local goto or a computed goto (in general 440the compiler does not know whether such a goto statement exits the 441first sequence or not, so we assume that it doesn't). 442 443After the second sequence is executed, if it completes normally by 444falling off the end, execution continues wherever the first sequence 445would have continued, by falling off the end, or doing a goto, etc. 446 447@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup 448needs to appear on every edge out of the controlled block; this 449reduces the freedom to move code across these edges. Therefore, the 450EH lowering pass which runs before most of the optimization passes 451eliminates these expressions by explicitly adding the cleanup to each 452edge. Rethrowing the exception is represented using @code{RESX_EXPR}. 453 454 455@node GIMPLE Exception Handling 456@subsubsection Exception Handling 457@cindex GIMPLE Exception Handling 458 459Other exception handling constructs are represented using 460@code{TRY_CATCH_EXPR}. @code{TRY_CATCH_EXPR} has two operands. The 461first operand is a sequence of statements to execute. If executing 462these statements does not throw an exception, then the second operand 463is ignored. Otherwise, if an exception is thrown, then the second 464operand of the @code{TRY_CATCH_EXPR} is checked. The second operand 465may have the following forms: 466 467@enumerate 468 469@item A sequence of statements to execute. When an exception occurs, 470these statements are executed, and then the exception is rethrown. 471 472@item A sequence of @code{CATCH_EXPR} expressions. Each @code{CATCH_EXPR} 473has a list of applicable exception types and handler code. If the 474thrown exception matches one of the caught types, the associated 475handler code is executed. If the handler code falls off the bottom, 476execution continues after the original @code{TRY_CATCH_EXPR}. 477 478@item An @code{EH_FILTER_EXPR} expression. This has a list of 479permitted exception types, and code to handle a match failure. If the 480thrown exception does not match one of the allowed types, the 481associated match failure code is executed. If the thrown exception 482does match, it continues unwinding the stack looking for the next 483handler. 484 485@end enumerate 486 487Currently throwing an exception is not directly represented in GIMPLE, 488since it is implemented by calling a function. At some point in the future 489we will want to add some way to express that the call will throw an 490exception of a known type. 491 492Just before running the optimizers, the compiler lowers the high-level 493EH constructs above into a set of @samp{goto}s, magic labels, and EH 494regions. Continuing to unwind at the end of a cleanup is represented 495with a @code{RESX_EXPR}. 496 497@node GIMPLE Example 498@subsection GIMPLE Example 499@cindex GIMPLE Example 500 501@smallexample 502struct A @{ A(); ~A(); @}; 503 504int i; 505int g(); 506void f() 507@{ 508 A a; 509 int j = (--i, i ? 0 : 1); 510 511 for (int x = 42; x > 0; --x) 512 @{ 513 i += g()*4 + 32; 514 @} 515@} 516@end smallexample 517 518becomes 519 520@smallexample 521void f() 522@{ 523 int i.0; 524 int T.1; 525 int iftmp.2; 526 int T.3; 527 int T.4; 528 int T.5; 529 int T.6; 530 531 @{ 532 struct A a; 533 int j; 534 535 __comp_ctor (&a); 536 try 537 @{ 538 i.0 = i; 539 T.1 = i.0 - 1; 540 i = T.1; 541 i.0 = i; 542 if (i.0 == 0) 543 iftmp.2 = 1; 544 else 545 iftmp.2 = 0; 546 j = iftmp.2; 547 @{ 548 int x; 549 550 x = 42; 551 goto test; 552 loop:; 553 554 T.3 = g (); 555 T.4 = T.3 * 4; 556 i.0 = i; 557 T.5 = T.4 + i.0; 558 T.6 = T.5 + 32; 559 i = T.6; 560 x = x - 1; 561 562 test:; 563 if (x > 0) 564 goto loop; 565 else 566 goto break_; 567 break_:; 568 @} 569 @} 570 finally 571 @{ 572 __comp_dtor (&a); 573 @} 574 @} 575@} 576@end smallexample 577 578@node Rough GIMPLE Grammar 579@subsection Rough GIMPLE Grammar 580@cindex Rough GIMPLE Grammar 581 582@smallexample 583 function : FUNCTION_DECL 584 DECL_SAVED_TREE -> compound-stmt 585 586 compound-stmt: STATEMENT_LIST 587 members -> stmt 588 589 stmt : block 590 | if-stmt 591 | switch-stmt 592 | goto-stmt 593 | return-stmt 594 | resx-stmt 595 | label-stmt 596 | try-stmt 597 | modify-stmt 598 | call-stmt 599 600 block : BIND_EXPR 601 BIND_EXPR_VARS -> chain of DECLs 602 BIND_EXPR_BLOCK -> BLOCK 603 BIND_EXPR_BODY -> compound-stmt 604 605 if-stmt : COND_EXPR 606 op0 -> condition 607 op1 -> compound-stmt 608 op2 -> compound-stmt 609 610 switch-stmt : SWITCH_EXPR 611 op0 -> val 612 op1 -> NULL 613 op2 -> TREE_VEC of CASE_LABEL_EXPRs 614 The CASE_LABEL_EXPRs are sorted by CASE_LOW, 615 and default is last. 616 617 goto-stmt : GOTO_EXPR 618 op0 -> LABEL_DECL | val 619 620 return-stmt : RETURN_EXPR 621 op0 -> return-value 622 623 return-value : NULL 624 | RESULT_DECL 625 | MODIFY_EXPR 626 op0 -> RESULT_DECL 627 op1 -> lhs 628 629 resx-stmt : RESX_EXPR 630 631 label-stmt : LABEL_EXPR 632 op0 -> LABEL_DECL 633 634 try-stmt : TRY_CATCH_EXPR 635 op0 -> compound-stmt 636 op1 -> handler 637 | TRY_FINALLY_EXPR 638 op0 -> compound-stmt 639 op1 -> compound-stmt 640 641 handler : catch-seq 642 | EH_FILTER_EXPR 643 | compound-stmt 644 645 catch-seq : STATEMENT_LIST 646 members -> CATCH_EXPR 647 648 modify-stmt : MODIFY_EXPR 649 op0 -> lhs 650 op1 -> rhs 651 652 call-stmt : CALL_EXPR 653 op0 -> val | OBJ_TYPE_REF 654 op1 -> call-arg-list 655 656 call-arg-list: TREE_LIST 657 members -> lhs | CONST 658 659 addr-expr-arg: ID 660 | compref 661 662 addressable : addr-expr-arg 663 | indirectref 664 665 with-size-arg: addressable 666 | call-stmt 667 668 indirectref : INDIRECT_REF 669 op0 -> val 670 671 lhs : addressable 672 | bitfieldref 673 | WITH_SIZE_EXPR 674 op0 -> with-size-arg 675 op1 -> val 676 677 min-lval : ID 678 | indirectref 679 680 bitfieldref : BIT_FIELD_REF 681 op0 -> inner-compref 682 op1 -> CONST 683 op2 -> var 684 685 compref : inner-compref 686 | TARGET_MEM_REF 687 op0 -> ID 688 op1 -> val 689 op2 -> val 690 op3 -> CONST 691 op4 -> CONST 692 | REALPART_EXPR 693 op0 -> inner-compref 694 | IMAGPART_EXPR 695 op0 -> inner-compref 696 697 inner-compref: min-lval 698 | COMPONENT_REF 699 op0 -> inner-compref 700 op1 -> FIELD_DECL 701 op2 -> val 702 | ARRAY_REF 703 op0 -> inner-compref 704 op1 -> val 705 op2 -> val 706 op3 -> val 707 | ARRAY_RANGE_REF 708 op0 -> inner-compref 709 op1 -> val 710 op2 -> val 711 op3 -> val 712 | VIEW_CONVERT_EXPR 713 op0 -> inner-compref 714 715 condition : val 716 | RELOP 717 op0 -> val 718 op1 -> val 719 720 val : ID 721 | CONST 722 723 rhs : lhs 724 | CONST 725 | call-stmt 726 | ADDR_EXPR 727 op0 -> addr-expr-arg 728 | UNOP 729 op0 -> val 730 | BINOP 731 op0 -> val 732 op1 -> val 733 | RELOP 734 op0 -> val 735 op1 -> val 736 | COND_EXPR 737 op0 -> condition 738 op1 -> val 739 op2 -> val 740@end smallexample 741 742@node Annotations 743@section Annotations 744@cindex annotations 745 746The optimizers need to associate attributes with statements and 747variables during the optimization process. For instance, we need to 748know what basic block a statement belongs to or whether a variable 749has aliases. All these attributes are stored in data structures 750called annotations which are then linked to the field @code{ann} in 751@code{struct tree_common}. 752 753Presently, we define annotations for statements (@code{stmt_ann_t}), 754variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}). 755Annotations are defined and documented in @file{tree-flow.h}. 756 757 758@node Statement Operands 759@section Statement Operands 760@cindex operands 761@cindex virtual operands 762@cindex real operands 763@findex update_stmt 764 765Almost every GIMPLE statement will contain a reference to a variable 766or memory location. Since statements come in different shapes and 767sizes, their operands are going to be located at various spots inside 768the statement's tree. To facilitate access to the statement's 769operands, they are organized into lists associated inside each 770statement's annotation. Each element in an operand list is a pointer 771to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node. 772This provides a very convenient way of examining and replacing 773operands. 774 775Data flow analysis and optimization is done on all tree nodes 776representing variables. Any node for which @code{SSA_VAR_P} returns 777nonzero is considered when scanning statement operands. However, not 778all @code{SSA_VAR_P} variables are processed in the same way. For the 779purposes of optimization, we need to distinguish between references to 780local scalar variables and references to globals, statics, structures, 781arrays, aliased variables, etc. The reason is simple, the compiler 782can gather complete data flow information for a local scalar. On the 783other hand, a global variable may be modified by a function call, it 784may not be possible to keep track of all the elements of an array or 785the fields of a structure, etc. 786 787The operand scanner gathers two kinds of operands: @dfn{real} and 788@dfn{virtual}. An operand for which @code{is_gimple_reg} returns true 789is considered real, otherwise it is a virtual operand. We also 790distinguish between uses and definitions. An operand is used if its 791value is loaded by the statement (e.g., the operand at the RHS of an 792assignment). If the statement assigns a new value to the operand, the 793operand is considered a definition (e.g., the operand at the LHS of 794an assignment). 795 796Virtual and real operands also have very different data flow 797properties. Real operands are unambiguous references to the 798full object that they represent. For instance, given 799 800@smallexample 801@{ 802 int a, b; 803 a = b 804@} 805@end smallexample 806 807Since @code{a} and @code{b} are non-aliased locals, the statement 808@code{a = b} will have one real definition and one real use because 809variable @code{b} is completely modified with the contents of 810variable @code{a}. Real definition are also known as @dfn{killing 811definitions}. Similarly, the use of @code{a} reads all its bits. 812 813In contrast, virtual operands are used with variables that can have 814a partial or ambiguous reference. This includes structures, arrays, 815globals, and aliased variables. In these cases, we have two types of 816definitions. For globals, structures, and arrays, we can determine from 817a statement whether a variable of these types has a killing definition. 818If the variable does, then the statement is marked as having a 819@dfn{must definition} of that variable. However, if a statement is only 820defining a part of the variable (i.e.@: a field in a structure), or if we 821know that a statement might define the variable but we cannot say for sure, 822then we mark that statement as having a @dfn{may definition}. For 823instance, given 824 825@smallexample 826@{ 827 int a, b, *p; 828 829 if (...) 830 p = &a; 831 else 832 p = &b; 833 *p = 5; 834 return *p; 835@} 836@end smallexample 837 838The assignment @code{*p = 5} may be a definition of @code{a} or 839@code{b}. If we cannot determine statically where @code{p} is 840pointing to at the time of the store operation, we create virtual 841definitions to mark that statement as a potential definition site for 842@code{a} and @code{b}. Memory loads are similarly marked with virtual 843use operands. Virtual operands are shown in tree dumps right before 844the statement that contains them. To request a tree dump with virtual 845operands, use the @option{-vops} option to @option{-fdump-tree}: 846 847@smallexample 848@{ 849 int a, b, *p; 850 851 if (...) 852 p = &a; 853 else 854 p = &b; 855 # a = V_MAY_DEF <a> 856 # b = V_MAY_DEF <b> 857 *p = 5; 858 859 # VUSE <a> 860 # VUSE <b> 861 return *p; 862@} 863@end smallexample 864 865Notice that @code{V_MAY_DEF} operands have two copies of the referenced 866variable. This indicates that this is not a killing definition of 867that variable. In this case we refer to it as a @dfn{may definition} 868or @dfn{aliased store}. The presence of the second copy of the 869variable in the @code{V_MAY_DEF} operand will become important when the 870function is converted into SSA form. This will be used to link all 871the non-killing definitions to prevent optimizations from making 872incorrect assumptions about them. 873 874Operands are updated as soon as the statement is finished via a call 875to @code{update_stmt}. If statement elements are changed via 876@code{SET_USE} or @code{SET_DEF}, then no further action is required 877(i.e., those macros take care of updating the statement). If changes 878are made by manipulating the statement's tree directly, then a call 879must be made to @code{update_stmt} when complete. Calling one of the 880@code{bsi_insert} routines or @code{bsi_replace} performs an implicit 881call to @code{update_stmt}. 882 883@subsection Operand Iterators And Access Routines 884@cindex Operand Iterators 885@cindex Operand Access Routines 886 887Operands are collected by @file{tree-ssa-operands.c}. They are stored 888inside each statement's annotation and can be accessed through either the 889operand iterators or an access routine. 890 891The following access routines are available for examining operands: 892 893@enumerate 894@item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return 895NULL unless there is exactly one operand matching the specified flags. If 896there is exactly one operand, the operand is returned as either a @code{tree}, 897@code{def_operand_p}, or @code{use_operand_p}. 898 899@smallexample 900tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags); 901use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES); 902def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS); 903@end smallexample 904 905@item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no 906operands matching the specified flags. 907 908@smallexample 909if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS)) 910 return; 911@end smallexample 912 913@item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands 914matching 'flags'. This actually executes a loop to perform the count, so 915only use this if it is really needed. 916 917@smallexample 918int count = NUM_SSA_OPERANDS (stmt, flags) 919@end smallexample 920@end enumerate 921 922 923If you wish to iterate over some or all operands, use the 924@code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator. For example, to print 925all the operands for a statement: 926 927@smallexample 928void 929print_ops (tree stmt) 930@{ 931 ssa_op_iter; 932 tree var; 933 934 FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS) 935 print_generic_expr (stderr, var, TDF_SLIM); 936@} 937@end smallexample 938 939 940How to choose the appropriate iterator: 941 942@enumerate 943@item Determine whether you are need to see the operand pointers, or just the 944 trees, and choose the appropriate macro: 945 946@smallexample 947Need Macro: 948---- ------- 949use_operand_p FOR_EACH_SSA_USE_OPERAND 950def_operand_p FOR_EACH_SSA_DEF_OPERAND 951tree FOR_EACH_SSA_TREE_OPERAND 952@end smallexample 953 954@item You need to declare a variable of the type you are interested 955 in, and an ssa_op_iter structure which serves as the loop 956 controlling variable. 957 958@item Determine which operands you wish to use, and specify the flags of 959 those you are interested in. They are documented in 960 @file{tree-ssa-operands.h}: 961 962@smallexample 963#define SSA_OP_USE 0x01 /* @r{Real USE operands.} */ 964#define SSA_OP_DEF 0x02 /* @r{Real DEF operands.} */ 965#define SSA_OP_VUSE 0x04 /* @r{VUSE operands.} */ 966#define SSA_OP_VMAYUSE 0x08 /* @r{USE portion of V_MAY_DEFS.} */ 967#define SSA_OP_VMAYDEF 0x10 /* @r{DEF portion of V_MAY_DEFS.} */ 968#define SSA_OP_VMUSTDEF 0x20 /* @r{V_MUST_DEF definitions.} */ 969 970/* @r{These are commonly grouped operand flags.} */ 971#define SSA_OP_VIRTUAL_USES (SSA_OP_VUSE | SSA_OP_VMAYUSE) 972#define SSA_OP_VIRTUAL_DEFS (SSA_OP_VMAYDEF | SSA_OP_VMUSTDEF) 973#define SSA_OP_ALL_USES (SSA_OP_VIRTUAL_USES | SSA_OP_USE) 974#define SSA_OP_ALL_DEFS (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF) 975#define SSA_OP_ALL_OPERANDS (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS) 976@end smallexample 977@end enumerate 978 979So if you want to look at the use pointers for all the @code{USE} and 980@code{VUSE} operands, you would do something like: 981 982@smallexample 983 use_operand_p use_p; 984 ssa_op_iter iter; 985 986 FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE)) 987 @{ 988 process_use_ptr (use_p); 989 @} 990@end smallexample 991 992The @code{TREE} macro is basically the same as the @code{USE} and 993@code{DEF} macros, only with the use or def dereferenced via 994@code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}. Since we 995aren't using operand pointers, use and defs flags can be mixed. 996 997@smallexample 998 tree var; 999 ssa_op_iter iter; 1000 1001 FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE | SSA_OP_VMUSTDEF) 1002 @{ 1003 print_generic_expr (stderr, var, TDF_SLIM); 1004 @} 1005@end smallexample 1006 1007@code{V_MAY_DEF}s are broken into two flags, one for the 1008@code{DEF} portion (@code{SSA_OP_VMAYDEF}) and one for the USE portion 1009(@code{SSA_OP_VMAYUSE}). If all you want to look at are the 1010@code{V_MAY_DEF}s together, there is a fourth iterator macro for this, 1011which returns both a def_operand_p and a use_operand_p for each 1012@code{V_MAY_DEF} in the statement. Note that you don't need any flags for 1013this one. 1014 1015@smallexample 1016 use_operand_p use_p; 1017 def_operand_p def_p; 1018 ssa_op_iter iter; 1019 1020 FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter) 1021 @{ 1022 my_code; 1023 @} 1024@end smallexample 1025 1026@code{V_MUST_DEF}s are broken into two flags, one for the 1027@code{DEF} portion (@code{SSA_OP_VMUSTDEF}) and one for the kill portion 1028(@code{SSA_OP_VMUSTKILL}). If all you want to look at are the 1029@code{V_MUST_DEF}s together, there is a fourth iterator macro for this, 1030which returns both a def_operand_p and a use_operand_p for each 1031@code{V_MUST_DEF} in the statement. Note that you don't need any flags for 1032this one. 1033 1034@smallexample 1035 use_operand_p kill_p; 1036 def_operand_p def_p; 1037 ssa_op_iter iter; 1038 1039 FOR_EACH_SSA_MUSTDEF_OPERAND (def_p, kill_p, stmt, iter) 1040 @{ 1041 my_code; 1042 @} 1043@end smallexample 1044 1045 1046There are many examples in the code as well, as well as the 1047documentation in @file{tree-ssa-operands.h}. 1048 1049There are also a couple of variants on the stmt iterators regarding PHI 1050nodes. 1051 1052@code{FOR_EACH_PHI_ARG} Works exactly like 1053@code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments 1054instead of statement operands. 1055 1056@smallexample 1057/* Look at every virtual PHI use. */ 1058FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES) 1059@{ 1060 my_code; 1061@} 1062 1063/* Look at every real PHI use. */ 1064FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES) 1065 my_code; 1066 1067/* Look at every every PHI use. */ 1068FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES) 1069 my_code; 1070@end smallexample 1071 1072@code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like 1073@code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on 1074either a statement or a @code{PHI} node. These should be used when it is 1075appropriate but they are not quite as efficient as the individual 1076@code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines. 1077 1078@smallexample 1079FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags) 1080 @{ 1081 my_code; 1082 @} 1083 1084FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags) 1085 @{ 1086 my_code; 1087 @} 1088@end smallexample 1089 1090@subsection Immediate Uses 1091@cindex Immediate Uses 1092 1093Immediate use information is now always available. Using the immediate use 1094iterators, you may examine every use of any @code{SSA_NAME}. For instance, 1095to change each use of @code{ssa_var} to @code{ssa_var2} and call fold_stmt on 1096each stmt after that is done: 1097 1098@smallexample 1099 use_operand_p imm_use_p; 1100 imm_use_iterator iterator; 1101 tree ssa_var, stmt; 1102 1103 1104 FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var) 1105 @{ 1106 FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator) 1107 SET_USE (imm_use_p, ssa_var_2); 1108 fold_stmt (stmt); 1109 @} 1110@end smallexample 1111 1112There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is 1113used when the immediate uses are not changed, i.e., you are looking at the 1114uses, but not setting them. 1115 1116If they do get changed, then care must be taken that things are not changed 1117under the iterators, so use the @code{FOR_EACH_IMM_USE_STMT} and 1118@code{FOR_EACH_IMM_USE_ON_STMT} iterators. They attempt to preserve the 1119sanity of the use list by moving all the uses for a statement into 1120a controlled position, and then iterating over those uses. Then the 1121optimization can manipulate the stmt when all the uses have been 1122processed. This is a little slower than the FAST version since it adds a 1123placeholder element and must sort through the list a bit for each statement. 1124This placeholder element must be also be removed if the loop is 1125terminated early. The macro @code{BREAK_FROM_IMM_USE_SAFE} is provided 1126to do this : 1127 1128@smallexample 1129 FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var) 1130 @{ 1131 if (stmt == last_stmt) 1132 BREAK_FROM_SAFE_IMM_USE (iter); 1133 1134 FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator) 1135 SET_USE (imm_use_p, ssa_var_2); 1136 fold_stmt (stmt); 1137 @} 1138@end smallexample 1139 1140There are checks in @code{verify_ssa} which verify that the immediate use list 1141is up to date, as well as checking that an optimization didn't break from the 1142loop without using this macro. It is safe to simply 'break'; from a 1143@code{FOR_EACH_IMM_USE_FAST} traverse. 1144 1145Some useful functions and macros: 1146@enumerate 1147@item @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of 1148@code{ssa_var}. 1149@item @code{has_single_use (ssa_var)} : Returns true if there is only a 1150single use of @code{ssa_var}. 1151@item @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} : 1152Returns true if there is only a single use of @code{ssa_var}, and also returns 1153the use pointer and statement it occurs in in the second and third parameters. 1154@item @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of 1155@code{ssa_var}. It is better not to use this if possible since it simply 1156utilizes a loop to count the uses. 1157@item @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI} 1158node, return the index number for the use. An assert is triggered if the use 1159isn't located in a @code{PHI} node. 1160@item @code{USE_STMT (use_p)} : Return the statement a use occurs in. 1161@end enumerate 1162 1163Note that uses are not put into an immediate use list until their statement is 1164actually inserted into the instruction stream via a @code{bsi_*} routine. 1165 1166It is also still possible to utilize lazy updating of statements, but this 1167should be used only when absolutely required. Both alias analysis and the 1168dominator optimizations currently do this. 1169 1170When lazy updating is being used, the immediate use information is out of date 1171and cannot be used reliably. Lazy updating is achieved by simply marking 1172statements modified via calls to @code{mark_stmt_modified} instead of 1173@code{update_stmt}. When lazy updating is no longer required, all the 1174modified statements must have @code{update_stmt} called in order to bring them 1175up to date. This must be done before the optimization is finished, or 1176@code{verify_ssa} will trigger an abort. 1177 1178This is done with a simple loop over the instruction stream: 1179@smallexample 1180 block_stmt_iterator bsi; 1181 basic_block bb; 1182 FOR_EACH_BB (bb) 1183 @{ 1184 for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi)) 1185 update_stmt_if_modified (bsi_stmt (bsi)); 1186 @} 1187@end smallexample 1188 1189@node SSA 1190@section Static Single Assignment 1191@cindex SSA 1192@cindex static single assignment 1193 1194Most of the tree optimizers rely on the data flow information provided 1195by the Static Single Assignment (SSA) form. We implement the SSA form 1196as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and 1197K. Zadeck. Efficiently Computing Static Single Assignment Form and the 1198Control Dependence Graph. ACM Transactions on Programming Languages 1199and Systems, 13(4):451-490, October 1991}. 1200 1201The SSA form is based on the premise that program variables are 1202assigned in exactly one location in the program. Multiple assignments 1203to the same variable create new versions of that variable. Naturally, 1204actual programs are seldom in SSA form initially because variables 1205tend to be assigned multiple times. The compiler modifies the program 1206representation so that every time a variable is assigned in the code, 1207a new version of the variable is created. Different versions of the 1208same variable are distinguished by subscripting the variable name with 1209its version number. Variables used in the right-hand side of 1210expressions are renamed so that their version number matches that of 1211the most recent assignment. 1212 1213We represent variable versions using @code{SSA_NAME} nodes. The 1214renaming process in @file{tree-ssa.c} wraps every real and 1215virtual operand with an @code{SSA_NAME} node which contains 1216the version number and the statement that created the 1217@code{SSA_NAME}. Only definitions and virtual definitions may 1218create new @code{SSA_NAME} nodes. 1219 1220Sometimes, flow of control makes it impossible to determine what is the 1221most recent version of a variable. In these cases, the compiler 1222inserts an artificial definition for that variable called 1223@dfn{PHI function} or @dfn{PHI node}. This new definition merges 1224all the incoming versions of the variable to create a new name 1225for it. For instance, 1226 1227@smallexample 1228if (...) 1229 a_1 = 5; 1230else if (...) 1231 a_2 = 2; 1232else 1233 a_3 = 13; 1234 1235# a_4 = PHI <a_1, a_2, a_3> 1236return a_4; 1237@end smallexample 1238 1239Since it is not possible to determine which of the three branches 1240will be taken at runtime, we don't know which of @code{a_1}, 1241@code{a_2} or @code{a_3} to use at the return statement. So, the 1242SSA renamer creates a new version @code{a_4} which is assigned 1243the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}. 1244Hence, PHI nodes mean ``one of these operands. I don't know 1245which''. 1246 1247The following macros can be used to examine PHI nodes 1248 1249@defmac PHI_RESULT (@var{phi}) 1250Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e., 1251@var{phi}'s LHS)@. 1252@end defmac 1253 1254@defmac PHI_NUM_ARGS (@var{phi}) 1255Returns the number of arguments in @var{phi}. This number is exactly 1256the number of incoming edges to the basic block holding @var{phi}@. 1257@end defmac 1258 1259@defmac PHI_ARG_ELT (@var{phi}, @var{i}) 1260Returns a tuple representing the @var{i}th argument of @var{phi}@. 1261Each element of this tuple contains an @code{SSA_NAME} @var{var} and 1262the incoming edge through which @var{var} flows. 1263@end defmac 1264 1265@defmac PHI_ARG_EDGE (@var{phi}, @var{i}) 1266Returns the incoming edge for the @var{i}th argument of @var{phi}. 1267@end defmac 1268 1269@defmac PHI_ARG_DEF (@var{phi}, @var{i}) 1270Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}. 1271@end defmac 1272 1273 1274@subsection Preserving the SSA form 1275@findex update_ssa 1276@cindex preserving SSA form 1277Some optimization passes make changes to the function that 1278invalidate the SSA property. This can happen when a pass has 1279added new symbols or changed the program so that variables that 1280were previously aliased aren't anymore. Whenever something like this 1281happens, the affected symbols must be renamed into SSA form again. 1282Transformations that emit new code or replicate existing statements 1283will also need to update the SSA form@. 1284 1285Since GCC implements two different SSA forms for register and virtual 1286variables, keeping the SSA form up to date depends on whether you are 1287updating register or virtual names. In both cases, the general idea 1288behind incremental SSA updates is similar: when new SSA names are 1289created, they typically are meant to replace other existing names in 1290the program@. 1291 1292For instance, given the following code: 1293 1294@smallexample 1295 1 L0: 1296 2 x_1 = PHI (0, x_5) 1297 3 if (x_1 < 10) 1298 4 if (x_1 > 7) 1299 5 y_2 = 0 1300 6 else 1301 7 y_3 = x_1 + x_7 1302 8 endif 1303 9 x_5 = x_1 + 1 1304 10 goto L0; 1305 11 endif 1306@end smallexample 1307 1308Suppose that we insert new names @code{x_10} and @code{x_11} (lines 1309@code{4} and @code{8})@. 1310 1311@smallexample 1312 1 L0: 1313 2 x_1 = PHI (0, x_5) 1314 3 if (x_1 < 10) 1315 4 x_10 = ... 1316 5 if (x_1 > 7) 1317 6 y_2 = 0 1318 7 else 1319 8 x_11 = ... 1320 9 y_3 = x_1 + x_7 1321 10 endif 1322 11 x_5 = x_1 + 1 1323 12 goto L0; 1324 13 endif 1325@end smallexample 1326 1327We want to replace all the uses of @code{x_1} with the new definitions 1328of @code{x_10} and @code{x_11}. Note that the only uses that should 1329be replaced are those at lines @code{5}, @code{9} and @code{11}. 1330Also, the use of @code{x_7} at line @code{9} should @emph{not} be 1331replaced (this is why we cannot just mark symbol @code{x} for 1332renaming)@. 1333 1334Additionally, we may need to insert a PHI node at line @code{11} 1335because that is a merge point for @code{x_10} and @code{x_11}. So the 1336use of @code{x_1} at line @code{11} will be replaced with the new PHI 1337node. The insertion of PHI nodes is optional. They are not strictly 1338necessary to preserve the SSA form, and depending on what the caller 1339inserted, they may not even be useful for the optimizers@. 1340 1341Updating the SSA form is a two step process. First, the pass has to 1342identify which names need to be updated and/or which symbols need to 1343be renamed into SSA form for the first time. When new names are 1344introduced to replace existing names in the program, the mapping 1345between the old and the new names are registered by calling 1346@code{register_new_name_mapping} (note that if your pass creates new 1347code by duplicating basic blocks, the call to @code{tree_duplicate_bb} 1348will set up the necessary mappings automatically). On the other hand, 1349if your pass exposes a new symbol that should be put in SSA form for 1350the first time, the new symbol should be registered with 1351@code{mark_sym_for_renaming}. 1352 1353After the replacement mappings have been registered and new symbols 1354marked for renaming, a call to @code{update_ssa} makes the registered 1355changes. This can be done with an explicit call or by creating 1356@code{TODO} flags in the @code{tree_opt_pass} structure for your pass. 1357There are several @code{TODO} flags that control the behavior of 1358@code{update_ssa}: 1359 1360@itemize @bullet 1361@item @code{TODO_update_ssa}. Update the SSA form inserting PHI nodes 1362 for newly exposed symbols and virtual names marked for updating. 1363 When updating real names, only insert PHI nodes for a real name 1364 @code{O_j} in blocks reached by all the new and old definitions for 1365 @code{O_j}. If the iterated dominance frontier for @code{O_j} 1366 is not pruned, we may end up inserting PHI nodes in blocks that 1367 have one or more edges with no incoming definition for 1368 @code{O_j}. This would lead to uninitialized warnings for 1369 @code{O_j}'s symbol@. 1370 1371@item @code{TODO_update_ssa_no_phi}. Update the SSA form without 1372 inserting any new PHI nodes at all. This is used by passes that 1373 have either inserted all the PHI nodes themselves or passes that 1374 need only to patch use-def and def-def chains for virtuals 1375 (e.g., DCE)@. 1376 1377 1378@item @code{TODO_update_ssa_full_phi}. Insert PHI nodes everywhere 1379 they are needed. No pruning of the IDF is done. This is used 1380 by passes that need the PHI nodes for @code{O_j} even if it 1381 means that some arguments will come from the default definition 1382 of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@. 1383 1384 WARNING: If you need to use this flag, chances are that your 1385 pass may be doing something wrong. Inserting PHI nodes for an 1386 old name where not all edges carry a new replacement may lead to 1387 silent codegen errors or spurious uninitialized warnings@. 1388 1389@item @code{TODO_update_ssa_only_virtuals}. Passes that update the 1390 SSA form on their own may want to delegate the updating of 1391 virtual names to the generic updater. Since FUD chains are 1392 easier to maintain, this simplifies the work they need to do. 1393 NOTE: If this flag is used, any OLD->NEW mappings for real names 1394 are explicitly destroyed and only the symbols marked for 1395 renaming are processed@. 1396@end itemize 1397 1398@subsection Preserving the virtual SSA form 1399@cindex preserving virtual SSA form 1400 1401The virtual SSA form is harder to preserve than the non-virtual SSA form 1402mainly because the set of virtual operands for a statement may change at 1403what some would consider unexpected times. In general, any time you 1404have modified a statement that has virtual operands, you should verify 1405whether the list of virtual operands has changed, and if so, mark the 1406newly exposed symbols by calling @code{mark_new_vars_to_rename}. 1407 1408There is one additional caveat to preserving virtual SSA form. When the 1409entire set of virtual operands may be eliminated due to better 1410disambiguation, a bare SMT will be added to the list of virtual 1411operands, to signify the non-visible aliases that the are still being 1412referenced. If the set of bare SMT's may change, 1413@code{TODO_update_smt_usage} should be added to the todo flags. 1414 1415With the current pruning code, this can only occur when constants are 1416propagated into array references that were previously non-constant, or 1417address expressions are propagated into their uses. 1418 1419@subsection Examining @code{SSA_NAME} nodes 1420@cindex examining SSA_NAMEs 1421 1422The following macros can be used to examine @code{SSA_NAME} nodes 1423 1424@defmac SSA_NAME_DEF_STMT (@var{var}) 1425Returns the statement @var{s} that creates the @code{SSA_NAME} 1426@var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT 1427(@var{s})} returns @code{true}), it means that the first reference to 1428this variable is a USE or a VUSE@. 1429@end defmac 1430 1431@defmac SSA_NAME_VERSION (@var{var}) 1432Returns the version number of the @code{SSA_NAME} object @var{var}. 1433@end defmac 1434 1435 1436@subsection Walking use-def chains 1437 1438@deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data}) 1439 1440Walks use-def chains starting at the @code{SSA_NAME} node @var{var}. 1441Calls function @var{fn} at each reaching definition found. Function 1442@var{FN} takes three arguments: @var{var}, its defining statement 1443(@var{def_stmt}) and a generic pointer to whatever state information 1444that @var{fn} may want to maintain (@var{data}). Function @var{fn} is 1445able to stop the walk by returning @code{true}, otherwise in order to 1446continue the walk, @var{fn} should return @code{false}. 1447 1448Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are 1449slightly different. For each argument @var{arg} of the PHI node, this 1450function will: 1451 1452@enumerate 1453@item Walk the use-def chains for @var{arg}. 1454@item Call @code{FN (@var{arg}, @var{phi}, @var{data})}. 1455@end enumerate 1456 1457Note how the first argument to @var{fn} is no longer the original 1458variable @var{var}, but the PHI argument currently being examined. 1459If @var{fn} wants to get at @var{var}, it should call 1460@code{PHI_RESULT} (@var{phi}). 1461@end deftypefn 1462 1463@subsection Walking the dominator tree 1464 1465@deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb}) 1466 1467This function walks the dominator tree for the current CFG calling a 1468set of callback functions defined in @var{struct dom_walk_data} in 1469@file{domwalk.h}. The call back functions you need to define give you 1470hooks to execute custom code at various points during traversal: 1471 1472@enumerate 1473@item Once to initialize any local data needed while processing 1474 @var{bb} and its children. This local data is pushed into an 1475 internal stack which is automatically pushed and popped as the 1476 walker traverses the dominator tree. 1477 1478@item Once before traversing all the statements in the @var{bb}. 1479 1480@item Once for every statement inside @var{bb}. 1481 1482@item Once after traversing all the statements and before recursing 1483 into @var{bb}'s dominator children. 1484 1485@item It then recurses into all the dominator children of @var{bb}. 1486 1487@item After recursing into all the dominator children of @var{bb} it 1488 can, optionally, traverse every statement in @var{bb} again 1489 (i.e., repeating steps 2 and 3). 1490 1491@item Once after walking the statements in @var{bb} and @var{bb}'s 1492 dominator children. At this stage, the block local data stack 1493 is popped. 1494@end enumerate 1495@end deftypefn 1496 1497@node Alias analysis 1498@section Alias analysis 1499@cindex alias 1500@cindex flow-sensitive alias analysis 1501@cindex flow-insensitive alias analysis 1502 1503Alias analysis proceeds in 4 main phases: 1504 1505@enumerate 1506@item Structural alias analysis. 1507 1508This phase walks the types for structure variables, and determines which 1509of the fields can overlap using offset and size of each field. For each 1510field, a ``subvariable'' called a ``Structure field tag'' (SFT)@ is 1511created, which represents that field as a separate variable. All 1512accesses that could possibly overlap with a given field will have 1513virtual operands for the SFT of that field. 1514 1515@smallexample 1516struct foo 1517@{ 1518 int a; 1519 int b; 1520@} 1521struct foo temp; 1522int bar (void) 1523@{ 1524 int tmp1, tmp2, tmp3; 1525 SFT.0_2 = V_MUST_DEF <SFT.0_1> 1526 temp.a = 5; 1527 SFT.1_4 = V_MUST_DEF <SFT.1_3> 1528 temp.b = 6; 1529 1530 VUSE <SFT.1_4> 1531 tmp1_5 = temp.b; 1532 VUSE <SFT.0_2> 1533 tmp2_6 = temp.a; 1534 1535 tmp3_7 = tmp1_5 + tmp2_6; 1536 return tmp3_7; 1537@} 1538@end smallexample 1539 1540If you copy the symbol tag for a variable for some reason, you probably 1541also want to copy the subvariables for that variable. 1542 1543@item Points-to and escape analysis. 1544 1545This phase walks the use-def chains in the SSA web looking for 1546three things: 1547 1548 @itemize @bullet 1549 @item Assignments of the form @code{P_i = &VAR} 1550 @item Assignments of the form P_i = malloc() 1551 @item Pointers and ADDR_EXPR that escape the current function. 1552 @end itemize 1553 1554The concept of `escaping' is the same one used in the Java world. 1555When a pointer or an ADDR_EXPR escapes, it means that it has been 1556exposed outside of the current function. So, assignment to 1557global variables, function arguments and returning a pointer are 1558all escape sites. 1559 1560This is where we are currently limited. Since not everything is 1561renamed into SSA, we lose track of escape properties when a 1562pointer is stashed inside a field in a structure, for instance. 1563In those cases, we are assuming that the pointer does escape. 1564 1565We use escape analysis to determine whether a variable is 1566call-clobbered. Simply put, if an ADDR_EXPR escapes, then the 1567variable is call-clobbered. If a pointer P_i escapes, then all 1568the variables pointed-to by P_i (and its memory tag) also escape. 1569 1570@item Compute flow-sensitive aliases 1571 1572We have two classes of memory tags. Memory tags associated with 1573the pointed-to data type of the pointers in the program. These 1574tags are called ``symbol memory tag'' (SMT)@. The other class are 1575those associated with SSA_NAMEs, called ``name memory tag'' (NMT)@. 1576The basic idea is that when adding operands for an INDIRECT_REF 1577*P_i, we will first check whether P_i has a name tag, if it does 1578we use it, because that will have more precise aliasing 1579information. Otherwise, we use the standard symbol tag. 1580 1581In this phase, we go through all the pointers we found in 1582points-to analysis and create alias sets for the name memory tags 1583associated with each pointer P_i. If P_i escapes, we mark 1584call-clobbered the variables it points to and its tag. 1585 1586 1587@item Compute flow-insensitive aliases 1588 1589This pass will compare the alias set of every symbol memory tag and 1590every addressable variable found in the program. Given a symbol 1591memory tag SMT and an addressable variable V@. If the alias sets 1592of SMT and V conflict (as computed by may_alias_p), then V is 1593marked as an alias tag and added to the alias set of SMT@. 1594@end enumerate 1595 1596For instance, consider the following function: 1597 1598@smallexample 1599foo (int i) 1600@{ 1601 int *p, *q, a, b; 1602 1603 if (i > 10) 1604 p = &a; 1605 else 1606 q = &b; 1607 1608 *p = 3; 1609 *q = 5; 1610 a = b + 2; 1611 return *p; 1612@} 1613@end smallexample 1614 1615After aliasing analysis has finished, the symbol memory tag for 1616pointer @code{p} will have two aliases, namely variables @code{a} and 1617@code{b}. 1618Every time pointer @code{p} is dereferenced, we want to mark the 1619operation as a potential reference to @code{a} and @code{b}. 1620 1621@smallexample 1622foo (int i) 1623@{ 1624 int *p, a, b; 1625 1626 if (i_2 > 10) 1627 p_4 = &a; 1628 else 1629 p_6 = &b; 1630 # p_1 = PHI <p_4(1), p_6(2)>; 1631 1632 # a_7 = V_MAY_DEF <a_3>; 1633 # b_8 = V_MAY_DEF <b_5>; 1634 *p_1 = 3; 1635 1636 # a_9 = V_MAY_DEF <a_7> 1637 # VUSE <b_8> 1638 a_9 = b_8 + 2; 1639 1640 # VUSE <a_9>; 1641 # VUSE <b_8>; 1642 return *p_1; 1643@} 1644@end smallexample 1645 1646In certain cases, the list of may aliases for a pointer may grow 1647too large. This may cause an explosion in the number of virtual 1648operands inserted in the code. Resulting in increased memory 1649consumption and compilation time. 1650 1651When the number of virtual operands needed to represent aliased 1652loads and stores grows too large (configurable with @option{--param 1653max-aliased-vops}), alias sets are grouped to avoid severe 1654compile-time slow downs and memory consumption. The alias 1655grouping heuristic proceeds as follows: 1656 1657@enumerate 1658@item Sort the list of pointers in decreasing number of contributed 1659virtual operands. 1660 1661@item Take the first pointer from the list and reverse the role 1662of the memory tag and its aliases. Usually, whenever an 1663aliased variable Vi is found to alias with a memory tag 1664T, we add Vi to the may-aliases set for T@. Meaning that 1665after alias analysis, we will have: 1666 1667@smallexample 1668may-aliases(T) = @{ V1, V2, V3, ..., Vn @} 1669@end smallexample 1670 1671This means that every statement that references T, will get 1672@code{n} virtual operands for each of the Vi tags. But, when 1673alias grouping is enabled, we make T an alias tag and add it 1674to the alias set of all the Vi variables: 1675 1676@smallexample 1677may-aliases(V1) = @{ T @} 1678may-aliases(V2) = @{ T @} 1679... 1680may-aliases(Vn) = @{ T @} 1681@end smallexample 1682 1683This has two effects: (a) statements referencing T will only get 1684a single virtual operand, and, (b) all the variables Vi will now 1685appear to alias each other. So, we lose alias precision to 1686improve compile time. But, in theory, a program with such a high 1687level of aliasing should not be very optimizable in the first 1688place. 1689 1690@item Since variables may be in the alias set of more than one 1691memory tag, the grouping done in step (2) needs to be extended 1692to all the memory tags that have a non-empty intersection with 1693the may-aliases set of tag T@. For instance, if we originally 1694had these may-aliases sets: 1695 1696@smallexample 1697may-aliases(T) = @{ V1, V2, V3 @} 1698may-aliases(R) = @{ V2, V4 @} 1699@end smallexample 1700 1701In step (2) we would have reverted the aliases for T as: 1702 1703@smallexample 1704may-aliases(V1) = @{ T @} 1705may-aliases(V2) = @{ T @} 1706may-aliases(V3) = @{ T @} 1707@end smallexample 1708 1709But note that now V2 is no longer aliased with R@. We could 1710add R to may-aliases(V2), but we are in the process of 1711grouping aliases to reduce virtual operands so what we do is 1712add V4 to the grouping to obtain: 1713 1714@smallexample 1715may-aliases(V1) = @{ T @} 1716may-aliases(V2) = @{ T @} 1717may-aliases(V3) = @{ T @} 1718may-aliases(V4) = @{ T @} 1719@end smallexample 1720 1721@item If the total number of virtual operands due to aliasing is 1722still above the threshold set by max-alias-vops, go back to (2). 1723@end enumerate 1724