1@c -*-texinfo-*- 2@c Copyright (C) 2001-2020 Free Software Foundation, Inc. 3@c This is part of the GCC manual. 4@c For copying conditions, see the file gcc.texi. 5 6@c --------------------------------------------------------------------- 7@c Control Flow Graph 8@c --------------------------------------------------------------------- 9 10@node Control Flow 11@chapter Control Flow Graph 12@cindex CFG, Control Flow Graph 13@findex basic-block.h 14 15A control flow graph (CFG) is a data structure built on top of the 16intermediate code representation (the RTL or @code{GIMPLE} instruction 17stream) abstracting the control flow behavior of a function that is 18being compiled. The CFG is a directed graph where the vertices 19represent basic blocks and edges represent possible transfer of 20control flow from one basic block to another. The data structures 21used to represent the control flow graph are defined in 22@file{basic-block.h}. 23 24In GCC, the representation of control flow is maintained throughout 25the compilation process, from constructing the CFG early in 26@code{pass_build_cfg} to @code{pass_free_cfg} (see @file{passes.def}). 27The CFG takes various different modes and may undergo extensive 28manipulations, but the graph is always valid between its construction 29and its release. This way, transfer of information such as data flow, 30a measured profile, or the loop tree, can be propagated through the 31passes pipeline, and even from @code{GIMPLE} to @code{RTL}. 32 33Often the CFG may be better viewed as integral part of instruction 34chain, than structure built on the top of it. Updating the compiler's 35intermediate representation for instructions cannot be easily done 36without proper maintenance of the CFG simultaneously. 37 38@menu 39* Basic Blocks:: The definition and representation of basic blocks. 40* Edges:: Types of edges and their representation. 41* Profile information:: Representation of frequencies and probabilities. 42* Maintaining the CFG:: Keeping the control flow graph and up to date. 43* Liveness information:: Using and maintaining liveness information. 44@end menu 45 46 47@node Basic Blocks 48@section Basic Blocks 49 50@cindex basic block 51@findex basic_block 52A basic block is a straight-line sequence of code with only one entry 53point and only one exit. In GCC, basic blocks are represented using 54the @code{basic_block} data type. 55 56@findex ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR 57Special basic blocks represent possible entry and exit points of a 58function. These blocks are called @code{ENTRY_BLOCK_PTR} and 59@code{EXIT_BLOCK_PTR}. These blocks do not contain any code. 60 61@findex BASIC_BLOCK 62The @code{BASIC_BLOCK} array contains all basic blocks in an 63unspecified order. Each @code{basic_block} structure has a field 64that holds a unique integer identifier @code{index} that is the 65index of the block in the @code{BASIC_BLOCK} array. 66The total number of basic blocks in the function is 67@code{n_basic_blocks}. Both the basic block indices and 68the total number of basic blocks may vary during the compilation 69process, as passes reorder, create, duplicate, and destroy basic 70blocks. The index for any block should never be greater than 71@code{last_basic_block}. The indices 0 and 1 are special codes 72reserved for @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}, the 73indices of @code{ENTRY_BLOCK_PTR} and @code{EXIT_BLOCK_PTR}. 74 75@findex next_bb, prev_bb, FOR_EACH_BB, FOR_ALL_BB 76Two pointer members of the @code{basic_block} structure are the 77pointers @code{next_bb} and @code{prev_bb}. These are used to keep 78doubly linked chain of basic blocks in the same order as the 79underlying instruction stream. The chain of basic blocks is updated 80transparently by the provided API for manipulating the CFG@. The macro 81@code{FOR_EACH_BB} can be used to visit all the basic blocks in 82lexicographical order, except @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}. 83The macro @code{FOR_ALL_BB} also visits all basic blocks in 84lexicographical order, including @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}. 85 86@findex post_order_compute, inverted_post_order_compute, walk_dominator_tree 87The functions @code{post_order_compute} and @code{inverted_post_order_compute} 88can be used to compute topological orders of the CFG. The orders are 89stored as vectors of basic block indices. The @code{BASIC_BLOCK} array 90can be used to iterate each basic block by index. 91Dominator traversals are also possible using 92@code{walk_dominator_tree}. Given two basic blocks A and B, block A 93dominates block B if A is @emph{always} executed before B@. 94 95Each @code{basic_block} also contains pointers to the first 96instruction (the @dfn{head}) and the last instruction (the @dfn{tail}) 97or @dfn{end} of the instruction stream contained in a basic block. In 98fact, since the @code{basic_block} data type is used to represent 99blocks in both major intermediate representations of GCC (@code{GIMPLE} 100and RTL), there are pointers to the head and end of a basic block for 101both representations, stored in intermediate representation specific 102data in the @code{il} field of @code{struct basic_block_def}. 103 104@findex CODE_LABEL 105@findex NOTE_INSN_BASIC_BLOCK 106For RTL, these pointers are @code{BB_HEAD} and @code{BB_END}. 107 108@cindex insn notes, notes 109@findex NOTE_INSN_BASIC_BLOCK 110In the RTL representation of a function, the instruction stream 111contains not only the ``real'' instructions, but also @dfn{notes} 112or @dfn{insn notes} (to distinguish them from @dfn{reg notes}). 113Any function that moves or duplicates the basic blocks needs 114to take care of updating of these notes. Many of these notes expect 115that the instruction stream consists of linear regions, so updating 116can sometimes be tedious. All types of insn notes are defined 117in @file{insn-notes.def}. 118 119In the RTL function representation, the instructions contained in a 120basic block always follow a @code{NOTE_INSN_BASIC_BLOCK}, but zero 121or more @code{CODE_LABEL} nodes can precede the block note. 122A basic block ends with a control flow instruction or with the last 123instruction before the next @code{CODE_LABEL} or 124@code{NOTE_INSN_BASIC_BLOCK}. 125By definition, a @code{CODE_LABEL} cannot appear in the middle of 126the instruction stream of a basic block. 127 128@findex can_fallthru 129@cindex table jump 130In addition to notes, the jump table vectors are also represented as 131``pseudo-instructions'' inside the insn stream. These vectors never 132appear in the basic block and should always be placed just after the 133table jump instructions referencing them. After removing the 134table-jump it is often difficult to eliminate the code computing the 135address and referencing the vector, so cleaning up these vectors is 136postponed until after liveness analysis. Thus the jump table vectors 137may appear in the insn stream unreferenced and without any purpose. 138Before any edge is made @dfn{fall-thru}, the existence of such 139construct in the way needs to be checked by calling 140@code{can_fallthru} function. 141 142@cindex GIMPLE statement iterators 143For the @code{GIMPLE} representation, the PHI nodes and statements 144contained in a basic block are in a @code{gimple_seq} pointed to by 145the basic block intermediate language specific pointers. 146Abstract containers and iterators are used to access the PHI nodes 147and statements in a basic blocks. These iterators are called 148@dfn{GIMPLE statement iterators} (GSIs). Grep for @code{^gsi} 149in the various @file{gimple-*} and @file{tree-*} files. 150There is a @code{gimple_stmt_iterator} type for iterating over 151all kinds of statement, and a @code{gphi_iterator} subclass for 152iterating over PHI nodes. 153The following snippet will pretty-print all PHI nodes the statements 154of the current function in the GIMPLE representation. 155 156@smallexample 157basic_block bb; 158 159FOR_EACH_BB (bb) 160 @{ 161 gphi_iterator pi; 162 gimple_stmt_iterator si; 163 164 for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi)) 165 @{ 166 gphi *phi = pi.phi (); 167 print_gimple_stmt (dump_file, phi, 0, TDF_SLIM); 168 @} 169 for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si)) 170 @{ 171 gimple stmt = gsi_stmt (si); 172 print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); 173 @} 174 @} 175@end smallexample 176 177 178@node Edges 179@section Edges 180 181@cindex edge in the flow graph 182@findex edge 183Edges represent possible control flow transfers from the end of some 184basic block A to the head of another basic block B@. We say that A is 185a predecessor of B, and B is a successor of A@. Edges are represented 186in GCC with the @code{edge} data type. Each @code{edge} acts as a 187link between two basic blocks: The @code{src} member of an edge 188points to the predecessor basic block of the @code{dest} basic block. 189The members @code{preds} and @code{succs} of the @code{basic_block} data 190type point to type-safe vectors of edges to the predecessors and 191successors of the block. 192 193@cindex edge iterators 194When walking the edges in an edge vector, @dfn{edge iterators} should 195be used. Edge iterators are constructed using the 196@code{edge_iterator} data structure and several methods are available 197to operate on them: 198 199@ftable @code 200@item ei_start 201This function initializes an @code{edge_iterator} that points to the 202first edge in a vector of edges. 203 204@item ei_last 205This function initializes an @code{edge_iterator} that points to the 206last edge in a vector of edges. 207 208@item ei_end_p 209This predicate is @code{true} if an @code{edge_iterator} represents 210the last edge in an edge vector. 211 212@item ei_one_before_end_p 213This predicate is @code{true} if an @code{edge_iterator} represents 214the second last edge in an edge vector. 215 216@item ei_next 217This function takes a pointer to an @code{edge_iterator} and makes it 218point to the next edge in the sequence. 219 220@item ei_prev 221This function takes a pointer to an @code{edge_iterator} and makes it 222point to the previous edge in the sequence. 223 224@item ei_edge 225This function returns the @code{edge} currently pointed to by an 226@code{edge_iterator}. 227 228@item ei_safe_safe 229This function returns the @code{edge} currently pointed to by an 230@code{edge_iterator}, but returns @code{NULL} if the iterator is 231pointing at the end of the sequence. This function has been provided 232for existing code makes the assumption that a @code{NULL} edge 233indicates the end of the sequence. 234 235@end ftable 236 237The convenience macro @code{FOR_EACH_EDGE} can be used to visit all of 238the edges in a sequence of predecessor or successor edges. It must 239not be used when an element might be removed during the traversal, 240otherwise elements will be missed. Here is an example of how to use 241the macro: 242 243@smallexample 244edge e; 245edge_iterator ei; 246 247FOR_EACH_EDGE (e, ei, bb->succs) 248 @{ 249 if (e->flags & EDGE_FALLTHRU) 250 break; 251 @} 252@end smallexample 253 254@findex fall-thru 255There are various reasons why control flow may transfer from one block 256to another. One possibility is that some instruction, for example a 257@code{CODE_LABEL}, in a linearized instruction stream just always 258starts a new basic block. In this case a @dfn{fall-thru} edge links 259the basic block to the first following basic block. But there are 260several other reasons why edges may be created. The @code{flags} 261field of the @code{edge} data type is used to store information 262about the type of edge we are dealing with. Each edge is of one of 263the following types: 264 265@table @emph 266@item jump 267No type flags are set for edges corresponding to jump instructions. 268These edges are used for unconditional or conditional jumps and in 269RTL also for table jumps. They are the easiest to manipulate as they 270may be freely redirected when the flow graph is not in SSA form. 271 272@item fall-thru 273@findex EDGE_FALLTHRU, force_nonfallthru 274Fall-thru edges are present in case where the basic block may continue 275execution to the following one without branching. These edges have 276the @code{EDGE_FALLTHRU} flag set. Unlike other types of edges, these 277edges must come into the basic block immediately following in the 278instruction stream. The function @code{force_nonfallthru} is 279available to insert an unconditional jump in the case that redirection 280is needed. Note that this may require creation of a new basic block. 281 282@item exception handling 283@cindex exception handling 284@findex EDGE_ABNORMAL, EDGE_EH 285Exception handling edges represent possible control transfers from a 286trapping instruction to an exception handler. The definition of 287``trapping'' varies. In C++, only function calls can throw, but for 288Ada exceptions like division by zero or segmentation fault are 289defined and thus each instruction possibly throwing this kind of 290exception needs to be handled as control flow instruction. Exception 291edges have the @code{EDGE_ABNORMAL} and @code{EDGE_EH} flags set. 292 293@findex purge_dead_edges 294When updating the instruction stream it is easy to change possibly 295trapping instruction to non-trapping, by simply removing the exception 296edge. The opposite conversion is difficult, but should not happen 297anyway. The edges can be eliminated via @code{purge_dead_edges} call. 298 299@findex REG_EH_REGION, EDGE_ABNORMAL_CALL 300In the RTL representation, the destination of an exception edge is 301specified by @code{REG_EH_REGION} note attached to the insn. 302In case of a trapping call the @code{EDGE_ABNORMAL_CALL} flag is set 303too. In the @code{GIMPLE} representation, this extra flag is not set. 304 305@findex may_trap_p, tree_could_trap_p 306In the RTL representation, the predicate @code{may_trap_p} may be used 307to check whether instruction still may trap or not. For the tree 308representation, the @code{tree_could_trap_p} predicate is available, 309but this predicate only checks for possible memory traps, as in 310dereferencing an invalid pointer location. 311 312 313@item sibling calls 314@cindex sibling call 315@findex EDGE_ABNORMAL, EDGE_SIBCALL 316Sibling calls or tail calls terminate the function in a non-standard 317way and thus an edge to the exit must be present. 318@code{EDGE_SIBCALL} and @code{EDGE_ABNORMAL} are set in such case. 319These edges only exist in the RTL representation. 320 321@item computed jumps 322@cindex computed jump 323@findex EDGE_ABNORMAL 324Computed jumps contain edges to all labels in the function referenced 325from the code. All those edges have @code{EDGE_ABNORMAL} flag set. 326The edges used to represent computed jumps often cause compile time 327performance problems, since functions consisting of many taken labels 328and many computed jumps may have @emph{very} dense flow graphs, so 329these edges need to be handled with special care. During the earlier 330stages of the compilation process, GCC tries to avoid such dense flow 331graphs by factoring computed jumps. For example, given the following 332series of jumps, 333 334@smallexample 335 goto *x; 336 [ @dots{} ] 337 338 goto *x; 339 [ @dots{} ] 340 341 goto *x; 342 [ @dots{} ] 343@end smallexample 344 345@noindent 346factoring the computed jumps results in the following code sequence 347which has a much simpler flow graph: 348 349@smallexample 350 goto y; 351 [ @dots{} ] 352 353 goto y; 354 [ @dots{} ] 355 356 goto y; 357 [ @dots{} ] 358 359y: 360 goto *x; 361@end smallexample 362 363@findex pass_duplicate_computed_gotos 364However, the classic problem with this transformation is that it has a 365runtime cost in there resulting code: An extra jump. Therefore, the 366computed jumps are un-factored in the later passes of the compiler 367(in the pass called @code{pass_duplicate_computed_gotos}). 368Be aware of that when you work on passes in that area. There have 369been numerous examples already where the compile time for code with 370unfactored computed jumps caused some serious headaches. 371 372@item nonlocal goto handlers 373@cindex nonlocal goto handler 374@findex EDGE_ABNORMAL, EDGE_ABNORMAL_CALL 375GCC allows nested functions to return into caller using a @code{goto} 376to a label passed to as an argument to the callee. The labels passed 377to nested functions contain special code to cleanup after function 378call. Such sections of code are referred to as ``nonlocal goto 379receivers''. If a function contains such nonlocal goto receivers, an 380edge from the call to the label is created with the 381@code{EDGE_ABNORMAL} and @code{EDGE_ABNORMAL_CALL} flags set. 382 383@item function entry points 384@cindex function entry point, alternate function entry point 385@findex LABEL_ALTERNATE_NAME 386By definition, execution of function starts at basic block 0, so there 387is always an edge from the @code{ENTRY_BLOCK_PTR} to basic block 0. 388There is no @code{GIMPLE} representation for alternate entry points at 389this moment. In RTL, alternate entry points are specified by 390@code{CODE_LABEL} with @code{LABEL_ALTERNATE_NAME} defined. This 391feature is currently used for multiple entry point prologues and is 392limited to post-reload passes only. This can be used by back-ends to 393emit alternate prologues for functions called from different contexts. 394In future full support for multiple entry functions defined by Fortran 39590 needs to be implemented. 396 397@item function exits 398In the pre-reload representation a function terminates after the last 399instruction in the insn chain and no explicit return instructions are 400used. This corresponds to the fall-thru edge into exit block. After 401reload, optimal RTL epilogues are used that use explicit (conditional) 402return instructions that are represented by edges with no flags set. 403 404@end table 405 406 407@node Profile information 408@section Profile information 409 410@cindex profile representation 411In many cases a compiler must make a choice whether to trade speed in 412one part of code for speed in another, or to trade code size for code 413speed. In such cases it is useful to know information about how often 414some given block will be executed. That is the purpose for 415maintaining profile within the flow graph. 416GCC can handle profile information obtained through @dfn{profile 417feedback}, but it can also estimate branch probabilities based on 418statics and heuristics. 419 420@cindex profile feedback 421The feedback based profile is produced by compiling the program with 422instrumentation, executing it on a train run and reading the numbers 423of executions of basic blocks and edges back to the compiler while 424re-compiling the program to produce the final executable. This method 425provides very accurate information about where a program spends most 426of its time on the train run. Whether it matches the average run of 427course depends on the choice of train data set, but several studies 428have shown that the behavior of a program usually changes just 429marginally over different data sets. 430 431@cindex Static profile estimation 432@cindex branch prediction 433@findex predict.def 434When profile feedback is not available, the compiler may be asked to 435attempt to predict the behavior of each branch in the program using a 436set of heuristics (see @file{predict.def} for details) and compute 437estimated frequencies of each basic block by propagating the 438probabilities over the graph. 439 440@findex frequency, count, BB_FREQ_BASE 441Each @code{basic_block} contains two integer fields to represent 442profile information: @code{frequency} and @code{count}. The 443@code{frequency} is an estimation how often is basic block executed 444within a function. It is represented as an integer scaled in the 445range from 0 to @code{BB_FREQ_BASE}. The most frequently executed 446basic block in function is initially set to @code{BB_FREQ_BASE} and 447the rest of frequencies are scaled accordingly. During optimization, 448the frequency of the most frequent basic block can both decrease (for 449instance by loop unrolling) or grow (for instance by cross-jumping 450optimization), so scaling sometimes has to be performed multiple 451times. 452 453@findex gcov_type 454The @code{count} contains hard-counted numbers of execution measured 455during training runs and is nonzero only when profile feedback is 456available. This value is represented as the host's widest integer 457(typically a 64 bit integer) of the special type @code{gcov_type}. 458 459Most optimization passes can use only the frequency information of a 460basic block, but a few passes may want to know hard execution counts. 461The frequencies should always match the counts after scaling, however 462during updating of the profile information numerical error may 463accumulate into quite large errors. 464 465@findex REG_BR_PROB_BASE, EDGE_FREQUENCY 466Each edge also contains a branch probability field: an integer in the 467range from 0 to @code{REG_BR_PROB_BASE}. It represents probability of 468passing control from the end of the @code{src} basic block to the 469@code{dest} basic block, i.e.@: the probability that control will flow 470along this edge. The @code{EDGE_FREQUENCY} macro is available to 471compute how frequently a given edge is taken. There is a @code{count} 472field for each edge as well, representing same information as for a 473basic block. 474 475The basic block frequencies are not represented in the instruction 476stream, but in the RTL representation the edge frequencies are 477represented for conditional jumps (via the @code{REG_BR_PROB} 478macro) since they are used when instructions are output to the 479assembly file and the flow graph is no longer maintained. 480 481@cindex reverse probability 482The probability that control flow arrives via a given edge to its 483destination basic block is called @dfn{reverse probability} and is not 484directly represented, but it may be easily computed from frequencies 485of basic blocks. 486 487@findex redirect_edge_and_branch 488Updating profile information is a delicate task that can unfortunately 489not be easily integrated with the CFG manipulation API@. Many of the 490functions and hooks to modify the CFG, such as 491@code{redirect_edge_and_branch}, do not have enough information to 492easily update the profile, so updating it is in the majority of cases 493left up to the caller. It is difficult to uncover bugs in the profile 494updating code, because they manifest themselves only by producing 495worse code, and checking profile consistency is not possible because 496of numeric error accumulation. Hence special attention needs to be 497given to this issue in each pass that modifies the CFG@. 498 499@findex REG_BR_PROB_BASE, BB_FREQ_BASE, count 500It is important to point out that @code{REG_BR_PROB_BASE} and 501@code{BB_FREQ_BASE} are both set low enough to be possible to compute 502second power of any frequency or probability in the flow graph, it is 503not possible to even square the @code{count} field, as modern CPUs are 504fast enough to execute $2^32$ operations quickly. 505 506 507@node Maintaining the CFG 508@section Maintaining the CFG 509@findex cfghooks.h 510 511An important task of each compiler pass is to keep both the control 512flow graph and all profile information up-to-date. Reconstruction of 513the control flow graph after each pass is not an option, since it may be 514very expensive and lost profile information cannot be reconstructed at 515all. 516 517GCC has two major intermediate representations, and both use the 518@code{basic_block} and @code{edge} data types to represent control 519flow. Both representations share as much of the CFG maintenance code 520as possible. For each representation, a set of @dfn{hooks} is defined 521so that each representation can provide its own implementation of CFG 522manipulation routines when necessary. These hooks are defined in 523@file{cfghooks.h}. There are hooks for almost all common CFG 524manipulations, including block splitting and merging, edge redirection 525and creating and deleting basic blocks. These hooks should provide 526everything you need to maintain and manipulate the CFG in both the RTL 527and @code{GIMPLE} representation. 528 529At the moment, the basic block boundaries are maintained transparently 530when modifying instructions, so there rarely is a need to move them 531manually (such as in case someone wants to output instruction outside 532basic block explicitly). 533 534@findex BLOCK_FOR_INSN, gimple_bb 535In the RTL representation, each instruction has a 536@code{BLOCK_FOR_INSN} value that represents pointer to the basic block 537that contains the instruction. In the @code{GIMPLE} representation, the 538function @code{gimple_bb} returns a pointer to the basic block 539containing the queried statement. 540 541@cindex GIMPLE statement iterators 542When changes need to be applied to a function in its @code{GIMPLE} 543representation, @dfn{GIMPLE statement iterators} should be used. These 544iterators provide an integrated abstraction of the flow graph and the 545instruction stream. Block statement iterators are constructed using 546the @code{gimple_stmt_iterator} data structure and several modifiers are 547available, including the following: 548 549@ftable @code 550@item gsi_start 551This function initializes a @code{gimple_stmt_iterator} that points to 552the first non-empty statement in a basic block. 553 554@item gsi_last 555This function initializes a @code{gimple_stmt_iterator} that points to 556the last statement in a basic block. 557 558@item gsi_end_p 559This predicate is @code{true} if a @code{gimple_stmt_iterator} 560represents the end of a basic block. 561 562@item gsi_next 563This function takes a @code{gimple_stmt_iterator} and makes it point to 564its successor. 565 566@item gsi_prev 567This function takes a @code{gimple_stmt_iterator} and makes it point to 568its predecessor. 569 570@item gsi_insert_after 571This function inserts a statement after the @code{gimple_stmt_iterator} 572passed in. The final parameter determines whether the statement 573iterator is updated to point to the newly inserted statement, or left 574pointing to the original statement. 575 576@item gsi_insert_before 577This function inserts a statement before the @code{gimple_stmt_iterator} 578passed in. The final parameter determines whether the statement 579iterator is updated to point to the newly inserted statement, or left 580pointing to the original statement. 581 582@item gsi_remove 583This function removes the @code{gimple_stmt_iterator} passed in and 584rechains the remaining statements in a basic block, if any. 585@end ftable 586 587@findex BB_HEAD, BB_END 588In the RTL representation, the macros @code{BB_HEAD} and @code{BB_END} 589may be used to get the head and end @code{rtx} of a basic block. No 590abstract iterators are defined for traversing the insn chain, but you 591can just use @code{NEXT_INSN} and @code{PREV_INSN} instead. @xref{Insns}. 592 593@findex purge_dead_edges 594Usually a code manipulating pass simplifies the instruction stream and 595the flow of control, possibly eliminating some edges. This may for 596example happen when a conditional jump is replaced with an 597unconditional jump. Updating of edges 598is not transparent and each optimization pass is required to do so 599manually. However only few cases occur in practice. The pass may 600call @code{purge_dead_edges} on a given basic block to remove 601superfluous edges, if any. 602 603@findex redirect_edge_and_branch, redirect_jump 604Another common scenario is redirection of branch instructions, but 605this is best modeled as redirection of edges in the control flow graph 606and thus use of @code{redirect_edge_and_branch} is preferred over more 607low level functions, such as @code{redirect_jump} that operate on RTL 608chain only. The CFG hooks defined in @file{cfghooks.h} should provide 609the complete API required for manipulating and maintaining the CFG@. 610 611@findex split_block 612It is also possible that a pass has to insert control flow instruction 613into the middle of a basic block, thus creating an entry point in the 614middle of the basic block, which is impossible by definition: The 615block must be split to make sure it only has one entry point, i.e.@: the 616head of the basic block. The CFG hook @code{split_block} may be used 617when an instruction in the middle of a basic block has to become the 618target of a jump or branch instruction. 619 620@findex insert_insn_on_edge 621@findex commit_edge_insertions 622@findex gsi_insert_on_edge 623@findex gsi_commit_edge_inserts 624@cindex edge splitting 625For a global optimizer, a common operation is to split edges in the 626flow graph and insert instructions on them. In the RTL 627representation, this can be easily done using the 628@code{insert_insn_on_edge} function that emits an instruction 629``on the edge'', caching it for a later @code{commit_edge_insertions} 630call that will take care of moving the inserted instructions off the 631edge into the instruction stream contained in a basic block. This 632includes the creation of new basic blocks where needed. In the 633@code{GIMPLE} representation, the equivalent functions are 634@code{gsi_insert_on_edge} which inserts a block statement 635iterator on an edge, and @code{gsi_commit_edge_inserts} which flushes 636the instruction to actual instruction stream. 637 638@findex verify_flow_info 639@cindex CFG verification 640While debugging the optimization pass, the @code{verify_flow_info} 641function may be useful to find bugs in the control flow graph updating 642code. 643 644 645@node Liveness information 646@section Liveness information 647@cindex Liveness representation 648Liveness information is useful to determine whether some register is 649``live'' at given point of program, i.e.@: that it contains a value that 650may be used at a later point in the program. This information is 651used, for instance, during register allocation, as the pseudo 652registers only need to be assigned to a unique hard register or to a 653stack slot if they are live. The hard registers and stack slots may 654be freely reused for other values when a register is dead. 655 656Liveness information is available in the back end starting with 657@code{pass_df_initialize} and ending with @code{pass_df_finish}. Three 658flavors of live analysis are available: With @code{LR}, it is possible 659to determine at any point @code{P} in the function if the register may be 660used on some path from @code{P} to the end of the function. With 661@code{UR}, it is possible to determine if there is a path from the 662beginning of the function to @code{P} that defines the variable. 663@code{LIVE} is the intersection of the @code{LR} and @code{UR} and a 664variable is live at @code{P} if there is both an assignment that reaches 665it from the beginning of the function and a use that can be reached on 666some path from @code{P} to the end of the function. 667 668In general @code{LIVE} is the most useful of the three. The macros 669@code{DF_[LR,UR,LIVE]_[IN,OUT]} can be used to access this information. 670The macros take a basic block number and return a bitmap that is indexed 671by the register number. This information is only guaranteed to be up to 672date after calls are made to @code{df_analyze}. See the file 673@code{df-core.c} for details on using the dataflow. 674 675 676@findex REG_DEAD, REG_UNUSED 677The liveness information is stored partly in the RTL instruction stream 678and partly in the flow graph. Local information is stored in the 679instruction stream: Each instruction may contain @code{REG_DEAD} notes 680representing that the value of a given register is no longer needed, or 681@code{REG_UNUSED} notes representing that the value computed by the 682instruction is never used. The second is useful for instructions 683computing multiple values at once. 684 685