110d565efSmrg@c markers: BUG TODO
210d565efSmrg
3*ec02198aSmrg@c Copyright (C) 1988-2020 Free Software Foundation, Inc.
410d565efSmrg@c This is part of the GCC manual.
510d565efSmrg@c For copying conditions, see the file gcc.texi.
610d565efSmrg
710d565efSmrg@node Passes
810d565efSmrg@chapter Passes and Files of the Compiler
910d565efSmrg@cindex passes and files of the compiler
1010d565efSmrg@cindex files and passes of the compiler
1110d565efSmrg@cindex compiler passes and files
1210d565efSmrg@cindex pass dumps
1310d565efSmrg
1410d565efSmrgThis chapter is dedicated to giving an overview of the optimization and
1510d565efSmrgcode generation passes of the compiler.  In the process, it describes
1610d565efSmrgsome of the language front end interface, though this description is no
1710d565efSmrgwhere near complete.
1810d565efSmrg
1910d565efSmrg@menu
2010d565efSmrg* Parsing pass::         The language front end turns text into bits.
2110d565efSmrg* Gimplification pass::  The bits are turned into something we can optimize.
2210d565efSmrg* Pass manager::         Sequencing the optimization passes.
23*ec02198aSmrg* IPA passes::           Inter-procedural optimizations.
2410d565efSmrg* Tree SSA passes::      Optimizations on a high-level representation.
2510d565efSmrg* RTL passes::           Optimizations on a low-level representation.
2610d565efSmrg* Optimization info::    Dumping optimization information from passes.
2710d565efSmrg@end menu
2810d565efSmrg
2910d565efSmrg@node Parsing pass
3010d565efSmrg@section Parsing pass
3110d565efSmrg@cindex GENERIC
3210d565efSmrg@findex lang_hooks.parse_file
3310d565efSmrgThe language front end is invoked only once, via
3410d565efSmrg@code{lang_hooks.parse_file}, to parse the entire input.  The language
3510d565efSmrgfront end may use any intermediate language representation deemed
3610d565efSmrgappropriate.  The C front end uses GENERIC trees (@pxref{GENERIC}), plus
3710d565efSmrga double handful of language specific tree codes defined in
3810d565efSmrg@file{c-common.def}.  The Fortran front end uses a completely different
3910d565efSmrgprivate representation.
4010d565efSmrg
4110d565efSmrg@cindex GIMPLE
4210d565efSmrg@cindex gimplification
4310d565efSmrg@cindex gimplifier
4410d565efSmrg@cindex language-independent intermediate representation
4510d565efSmrg@cindex intermediate representation lowering
4610d565efSmrg@cindex lowering, language-dependent intermediate representation
4710d565efSmrgAt some point the front end must translate the representation used in the
4810d565efSmrgfront end to a representation understood by the language-independent
4910d565efSmrgportions of the compiler.  Current practice takes one of two forms.
5010d565efSmrgThe C front end manually invokes the gimplifier (@pxref{GIMPLE}) on each function,
5110d565efSmrgand uses the gimplifier callbacks to convert the language-specific tree
5210d565efSmrgnodes directly to GIMPLE before passing the function off to be compiled.
5310d565efSmrgThe Fortran front end converts from a private representation to GENERIC,
5410d565efSmrgwhich is later lowered to GIMPLE when the function is compiled.  Which
5510d565efSmrgroute to choose probably depends on how well GENERIC (plus extensions)
5610d565efSmrgcan be made to match up with the source language and necessary parsing
5710d565efSmrgdata structures.
5810d565efSmrg
5910d565efSmrgBUG: Gimplification must occur before nested function lowering,
6010d565efSmrgand nested function lowering must be done by the front end before
6110d565efSmrgpassing the data off to cgraph.
6210d565efSmrg
6310d565efSmrgTODO: Cgraph should control nested function lowering.  It would
6410d565efSmrgonly be invoked when it is certain that the outer-most function
6510d565efSmrgis used.
6610d565efSmrg
6710d565efSmrgTODO: Cgraph needs a gimplify_function callback.  It should be
6810d565efSmrginvoked when (1) it is certain that the function is used, (2)
6910d565efSmrgwarning flags specified by the user require some amount of
7010d565efSmrgcompilation in order to honor, (3) the language indicates that
7110d565efSmrgsemantic analysis is not complete until gimplification occurs.
7210d565efSmrgHum@dots{} this sounds overly complicated.  Perhaps we should just
7310d565efSmrghave the front end gimplify always; in most cases it's only one
7410d565efSmrgfunction call.
7510d565efSmrg
7610d565efSmrgThe front end needs to pass all function definitions and top level
7710d565efSmrgdeclarations off to the middle-end so that they can be compiled and
7810d565efSmrgemitted to the object file.  For a simple procedural language, it is
7910d565efSmrgusually most convenient to do this as each top level declaration or
8010d565efSmrgdefinition is seen.  There is also a distinction to be made between
8110d565efSmrggenerating functional code and generating complete debug information.
8210d565efSmrgThe only thing that is absolutely required for functional code is that
8310d565efSmrgfunction and data @emph{definitions} be passed to the middle-end.  For
8410d565efSmrgcomplete debug information, function, data and type declarations
8510d565efSmrgshould all be passed as well.
8610d565efSmrg
8710d565efSmrg@findex rest_of_decl_compilation
8810d565efSmrg@findex rest_of_type_compilation
8910d565efSmrg@findex cgraph_finalize_function
9010d565efSmrgIn any case, the front end needs each complete top-level function or
9110d565efSmrgdata declaration, and each data definition should be passed to
9210d565efSmrg@code{rest_of_decl_compilation}.  Each complete type definition should
9310d565efSmrgbe passed to @code{rest_of_type_compilation}.  Each function definition
9410d565efSmrgshould be passed to @code{cgraph_finalize_function}.
9510d565efSmrg
9610d565efSmrgTODO: I know rest_of_compilation currently has all sorts of
9710d565efSmrgRTL generation semantics.  I plan to move all code generation
9810d565efSmrgbits (both Tree and RTL) to compile_function.  Should we hide
9910d565efSmrgcgraph from the front ends and move back to rest_of_compilation
10010d565efSmrgas the official interface?  Possibly we should rename all three
10110d565efSmrginterfaces such that the names match in some meaningful way and
10210d565efSmrgthat is more descriptive than "rest_of".
10310d565efSmrg
10410d565efSmrgThe middle-end will, at its option, emit the function and data
10510d565efSmrgdefinitions immediately or queue them for later processing.
10610d565efSmrg
10710d565efSmrg@node Gimplification pass
10810d565efSmrg@section Gimplification pass
10910d565efSmrg
11010d565efSmrg@cindex gimplification
11110d565efSmrg@cindex GIMPLE
11210d565efSmrg@dfn{Gimplification} is a whimsical term for the process of converting
11310d565efSmrgthe intermediate representation of a function into the GIMPLE language
11410d565efSmrg(@pxref{GIMPLE}).  The term stuck, and so words like ``gimplification'',
11510d565efSmrg``gimplify'', ``gimplifier'' and the like are sprinkled throughout this
11610d565efSmrgsection of code.
11710d565efSmrg
11810d565efSmrgWhile a front end may certainly choose to generate GIMPLE directly if
11910d565efSmrgit chooses, this can be a moderately complex process unless the
12010d565efSmrgintermediate language used by the front end is already fairly simple.
12110d565efSmrgUsually it is easier to generate GENERIC trees plus extensions
12210d565efSmrgand let the language-independent gimplifier do most of the work.
12310d565efSmrg
12410d565efSmrg@findex gimplify_function_tree
12510d565efSmrg@findex gimplify_expr
12610d565efSmrg@findex lang_hooks.gimplify_expr
12710d565efSmrgThe main entry point to this pass is @code{gimplify_function_tree}
12810d565efSmrglocated in @file{gimplify.c}.  From here we process the entire
12910d565efSmrgfunction gimplifying each statement in turn.  The main workhorse
13010d565efSmrgfor this pass is @code{gimplify_expr}.  Approximately everything
13110d565efSmrgpasses through here at least once, and it is from here that we
13210d565efSmrginvoke the @code{lang_hooks.gimplify_expr} callback.
13310d565efSmrg
13410d565efSmrgThe callback should examine the expression in question and return
13510d565efSmrg@code{GS_UNHANDLED} if the expression is not a language specific
13610d565efSmrgconstruct that requires attention.  Otherwise it should alter the
13710d565efSmrgexpression in some way to such that forward progress is made toward
13810d565efSmrgproducing valid GIMPLE@.  If the callback is certain that the
13910d565efSmrgtransformation is complete and the expression is valid GIMPLE, it
14010d565efSmrgshould return @code{GS_ALL_DONE}.  Otherwise it should return
14110d565efSmrg@code{GS_OK}, which will cause the expression to be processed again.
14210d565efSmrgIf the callback encounters an error during the transformation (because
14310d565efSmrgthe front end is relying on the gimplification process to finish
14410d565efSmrgsemantic checks), it should return @code{GS_ERROR}.
14510d565efSmrg
14610d565efSmrg@node Pass manager
14710d565efSmrg@section Pass manager
14810d565efSmrg
14910d565efSmrgThe pass manager is located in @file{passes.c}, @file{tree-optimize.c}
15010d565efSmrgand @file{tree-pass.h}.
15110d565efSmrgIt processes passes as described in @file{passes.def}.
15210d565efSmrgIts job is to run all of the individual passes in the correct order,
15310d565efSmrgand take care of standard bookkeeping that applies to every pass.
15410d565efSmrg
15510d565efSmrgThe theory of operation is that each pass defines a structure that
15610d565efSmrgrepresents everything we need to know about that pass---when it
15710d565efSmrgshould be run, how it should be run, what intermediate language
15810d565efSmrgform or on-the-side data structures it needs.  We register the pass
15910d565efSmrgto be run in some particular order, and the pass manager arranges
16010d565efSmrgfor everything to happen in the correct order.
16110d565efSmrg
16210d565efSmrgThe actuality doesn't completely live up to the theory at present.
16310d565efSmrgCommand-line switches and @code{timevar_id_t} enumerations must still
16410d565efSmrgbe defined elsewhere.  The pass manager validates constraints but does
16510d565efSmrgnot attempt to (re-)generate data structures or lower intermediate
16610d565efSmrglanguage form based on the requirements of the next pass.  Nevertheless,
16710d565efSmrgwhat is present is useful, and a far sight better than nothing at all.
16810d565efSmrg
16910d565efSmrgEach pass should have a unique name.
17010d565efSmrgEach pass may have its own dump file (for GCC debugging purposes).
17110d565efSmrgPasses with a name starting with a star do not dump anything.
17210d565efSmrgSometimes passes are supposed to share a dump file / option name.
17310d565efSmrgTo still give these unique names, you can use a prefix that is delimited
17410d565efSmrgby a space from the part that is used for the dump file / option name.
17510d565efSmrgE.g. When the pass name is "ud dce", the name used for dump file/options
17610d565efSmrgis "dce".
17710d565efSmrg
17810d565efSmrgTODO: describe the global variables set up by the pass manager,
17910d565efSmrgand a brief description of how a new pass should use it.
18010d565efSmrgI need to look at what info RTL passes use first@enddots{}
18110d565efSmrg
182*ec02198aSmrg@node IPA passes
183*ec02198aSmrg@section Inter-procedural optimization passes
184*ec02198aSmrg@cindex IPA passes
185*ec02198aSmrg@cindex inter-procedural optimization passes
186*ec02198aSmrg
187*ec02198aSmrgThe inter-procedural optimization (IPA) passes use call graph
188*ec02198aSmrginformation to perform transformations across function boundaries.
189*ec02198aSmrgIPA is a critical part of link-time optimization (LTO) and
190*ec02198aSmrgwhole-program (WHOPR) optimization, and these passes are structured
191*ec02198aSmrgwith the needs of LTO and WHOPR in mind by dividing their operations
192*ec02198aSmrginto stages.  For detailed discussion of the LTO/WHOPR IPA pass stages
193*ec02198aSmrgand interfaces, see @ref{IPA}.
194*ec02198aSmrg
195*ec02198aSmrgThe following briefly describes the inter-procedural optimization (IPA)
196*ec02198aSmrgpasses, which are split into small IPA passes, regular IPA passes,
197*ec02198aSmrgand late IPA passes, according to the LTO/WHOPR processing model.
198*ec02198aSmrg
199*ec02198aSmrg@menu
200*ec02198aSmrg* Small IPA passes::
201*ec02198aSmrg* Regular IPA passes::
202*ec02198aSmrg* Late IPA passes::
203*ec02198aSmrg@end menu
204*ec02198aSmrg
205*ec02198aSmrg@node Small IPA passes
206*ec02198aSmrg@subsection Small IPA passes
207*ec02198aSmrg@cindex small IPA passes
208*ec02198aSmrgA small IPA pass is a pass derived from @code{simple_ipa_opt_pass}.
209*ec02198aSmrgAs described in @ref{IPA}, it does everything at once and
210*ec02198aSmrgdefines only the @emph{Execute} stage.  During this
211*ec02198aSmrgstage it accesses and modifies the function bodies.
212*ec02198aSmrgNo @code{generate_summary}, @code{read_summary}, or @code{write_summary}
213*ec02198aSmrghooks are defined.
214*ec02198aSmrg
215*ec02198aSmrg@itemize @bullet
216*ec02198aSmrg@item IPA free lang data
217*ec02198aSmrg
218*ec02198aSmrgThis pass frees resources that are used by the front end but are
219*ec02198aSmrgnot needed once it is done.  It is located in @file{tree.c} and is described by
220*ec02198aSmrg@code{pass_ipa_free_lang_data}.
221*ec02198aSmrg
222*ec02198aSmrg@item IPA function and variable visibility
223*ec02198aSmrg
224*ec02198aSmrgThis is a local function pass handling visibilities of all symbols.  This
225*ec02198aSmrghappens before LTO streaming, so @option{-fwhole-program} should be ignored
226*ec02198aSmrgat this level.  It is located in @file{ipa-visibility.c} and is described by
227*ec02198aSmrg@code{pass_ipa_function_and_variable_visibility}.
228*ec02198aSmrg
229*ec02198aSmrg@item IPA remove symbols
230*ec02198aSmrg
231*ec02198aSmrgThis pass performs reachability analysis and reclaims all unreachable nodes.
232*ec02198aSmrgIt is located in @file{passes.c} and is described by
233*ec02198aSmrg@code{pass_ipa_remove_symbols}.
234*ec02198aSmrg
235*ec02198aSmrg@item IPA OpenACC
236*ec02198aSmrg
237*ec02198aSmrgThis is a pass group for OpenACC processing.  It is located in
238*ec02198aSmrg@file{tree-ssa-loop.c} and is described by @code{pass_ipa_oacc}.
239*ec02198aSmrg
240*ec02198aSmrg@item IPA points-to analysis
241*ec02198aSmrg
242*ec02198aSmrgThis is a tree-based points-to analysis pass. The idea behind this analyzer
243*ec02198aSmrgis to generate set constraints from the program, then solve the resulting
244*ec02198aSmrgconstraints in order to generate the points-to sets.  It is located in
245*ec02198aSmrg@file{tree-ssa-structalias.c} and is described by @code{pass_ipa_pta}.
246*ec02198aSmrg
247*ec02198aSmrg@item IPA OpenACC kernels
248*ec02198aSmrg
249*ec02198aSmrgThis is a pass group for processing OpenACC kernels regions.  It is a
250*ec02198aSmrgsubpass of the IPA OpenACC pass group that runs on offloaded functions
251*ec02198aSmrgcontaining OpenACC kernels loops.  It is located in
252*ec02198aSmrg@file{tree-ssa-loop.c} and is described by
253*ec02198aSmrg@code{pass_ipa_oacc_kernels}.
254*ec02198aSmrg
255*ec02198aSmrg@item Target clone
256*ec02198aSmrg
257*ec02198aSmrgThis is a pass for parsing functions with multiple target attributes.
258*ec02198aSmrgIt is located in @file{multiple_target.c} and is described by
259*ec02198aSmrg@code{pass_target_clone}.
260*ec02198aSmrg
261*ec02198aSmrg@item IPA auto profile
262*ec02198aSmrg
263*ec02198aSmrgThis pass uses AutoFDO profiling data to annotate the control flow graph.
264*ec02198aSmrgIt is located in @file{auto-profile.c} and is described by
265*ec02198aSmrg@code{pass_ipa_auto_profile}.
266*ec02198aSmrg
267*ec02198aSmrg@item IPA tree profile
268*ec02198aSmrg
269*ec02198aSmrgThis pass does profiling for all functions in the call graph.
270*ec02198aSmrgIt calculates branch
271*ec02198aSmrgprobabilities and basic block execution counts. It is located
272*ec02198aSmrgin @file{tree-profile.c} and is described by @code{pass_ipa_tree_profile}.
273*ec02198aSmrg
274*ec02198aSmrg@item IPA free function summary
275*ec02198aSmrg
276*ec02198aSmrgThis pass is a small IPA pass when argument @code{small_p} is true.
277*ec02198aSmrgIt releases inline function summaries and call summaries.
278*ec02198aSmrgIt is located in @file{ipa-fnsummary.c} and is described by
279*ec02198aSmrg@code{pass_ipa_free_free_fn_summary}.
280*ec02198aSmrg
281*ec02198aSmrg@item IPA increase alignment
282*ec02198aSmrg
283*ec02198aSmrgThis pass increases the alignment of global arrays to improve
284*ec02198aSmrgvectorization. It is located in @file{tree-vectorizer.c}
285*ec02198aSmrgand is described by @code{pass_ipa_increase_alignment}.
286*ec02198aSmrg
287*ec02198aSmrg@item IPA transactional memory
288*ec02198aSmrg
289*ec02198aSmrgThis pass is for transactional memory support.
290*ec02198aSmrgIt is located in @file{trans-mem.c} and is described by
291*ec02198aSmrg@code{pass_ipa_tm}.
292*ec02198aSmrg
293*ec02198aSmrg@item IPA lower emulated TLS
294*ec02198aSmrg
295*ec02198aSmrgThis pass lowers thread-local storage (TLS) operations
296*ec02198aSmrgto emulation functions provided by libgcc.
297*ec02198aSmrgIt is located in @file{tree-emutls.c} and is described by
298*ec02198aSmrg@code{pass_ipa_lower_emutls}.
299*ec02198aSmrg
300*ec02198aSmrg@end itemize
301*ec02198aSmrg
302*ec02198aSmrg@node Regular IPA passes
303*ec02198aSmrg@subsection Regular IPA passes
304*ec02198aSmrg@cindex regular IPA passes
305*ec02198aSmrg
306*ec02198aSmrgA regular IPA pass is a pass derived from @code{ipa_opt_pass_d} that
307*ec02198aSmrgis executed in WHOPR compilation. Regular IPA passes may have summary
308*ec02198aSmrghooks implemented in any of the LGEN, WPA or LTRANS stages (@pxref{IPA}).
309*ec02198aSmrg
310*ec02198aSmrg@itemize @bullet
311*ec02198aSmrg@item IPA whole program visibility
312*ec02198aSmrg
313*ec02198aSmrgThis pass performs various optimizations involving symbol visibility
314*ec02198aSmrgwith @option{-fwhole-program}, including symbol privatization,
315*ec02198aSmrgdiscovering local functions, and dismantling comdat groups.  It is
316*ec02198aSmrglocated in @file{ipa-visibility.c} and is described by
317*ec02198aSmrg@code{pass_ipa_whole_program_visibility}.
318*ec02198aSmrg
319*ec02198aSmrg@item IPA profile
320*ec02198aSmrg
321*ec02198aSmrgThe IPA profile pass propagates profiling frequencies across the call
322*ec02198aSmrggraph.  It is located in @file{ipa-profile.c} and is described by
323*ec02198aSmrg@code{pass_ipa_profile}.
324*ec02198aSmrg
325*ec02198aSmrg@item IPA identical code folding
326*ec02198aSmrg
327*ec02198aSmrgThis is the inter-procedural identical code folding pass.
328*ec02198aSmrgThe goal of this transformation is to discover functions
329*ec02198aSmrgand read-only variables that have exactly the same semantics.  It is
330*ec02198aSmrglocated in @file{ipa-icf.c} and is described by @code{pass_ipa_icf}.
331*ec02198aSmrg
332*ec02198aSmrg@item IPA devirtualization
333*ec02198aSmrg
334*ec02198aSmrgThis pass performs speculative devirtualization based on the type
335*ec02198aSmrginheritance graph.  When a polymorphic call has only one likely target
336*ec02198aSmrgin the unit, it is turned into a speculative call. It is located in
337*ec02198aSmrg@file{ipa-devirt.c} and is described by @code{pass_ipa_devirt}.
338*ec02198aSmrg
339*ec02198aSmrg@item IPA constant propagation
340*ec02198aSmrg
341*ec02198aSmrgThe goal of this pass is to discover functions that are always invoked
342*ec02198aSmrgwith some arguments with the same known constant values and to modify
343*ec02198aSmrgthe functions accordingly.  It can also do partial specialization and
344*ec02198aSmrgtype-based devirtualization.  It is located in @file{ipa-cp.c} and is
345*ec02198aSmrgdescribed by @code{pass_ipa_cp}.
346*ec02198aSmrg
347*ec02198aSmrg@item IPA scalar replacement of aggregates
348*ec02198aSmrg
349*ec02198aSmrgThis pass can replace an aggregate parameter with a set of other parameters
350*ec02198aSmrgrepresenting part of the original, turning those passed by reference
351*ec02198aSmrginto new ones which pass the value directly.  It also removes unused
352*ec02198aSmrgfunction return values and unused function parameters.  This pass is
353*ec02198aSmrglocated in @file{ipa-sra.c} and is described by @code{pass_ipa_sra}.
354*ec02198aSmrg
355*ec02198aSmrg@item IPA constructor/destructor merge
356*ec02198aSmrg
357*ec02198aSmrgThis pass merges multiple constructors and destructors for static
358*ec02198aSmrgobjects into single functions.  It's only run at LTO time unless the
359*ec02198aSmrgtarget doesn't support constructors and destructors natively.  The
360*ec02198aSmrgpass is located in @file{ipa.c} and is described by
361*ec02198aSmrg@code{pass_ipa_cdtor_merge}.
362*ec02198aSmrg
363*ec02198aSmrg@item IPA HSA
364*ec02198aSmrg
365*ec02198aSmrgThis pass is part of the GCC support for HSA (Heterogeneous System
366*ec02198aSmrgArchitecture) accelerators.  It is responsible for creation of HSA
367*ec02198aSmrgclones and emitting HSAIL instructions for them.  It is located in
368*ec02198aSmrg@file{ipa-hsa.c} and is described by @code{pass_ipa_hsa}.
369*ec02198aSmrg
370*ec02198aSmrg@item IPA function summary
371*ec02198aSmrg
372*ec02198aSmrgThis pass provides function analysis for inter-procedural passes.
373*ec02198aSmrgIt collects estimates of function body size, execution time, and frame
374*ec02198aSmrgsize for each function.  It also estimates information about function
375*ec02198aSmrgcalls: call statement size, time and how often the parameters change
376*ec02198aSmrgfor each call.  It is located in @file{ipa-fnsummary.c} and is
377*ec02198aSmrgdescribed by @code{pass_ipa_fn_summary}.
378*ec02198aSmrg
379*ec02198aSmrg@item IPA inline
380*ec02198aSmrg
381*ec02198aSmrgThe IPA inline pass handles function inlining with whole-program
382*ec02198aSmrgknowledge. Small functions that are candidates for inlining are
383*ec02198aSmrgordered in increasing badness, bounded by unit growth parameters.
384*ec02198aSmrgUnreachable functions are removed from the call graph.  Functions called
385*ec02198aSmrgonce and not exported from the unit are inlined.  This pass is located in
386*ec02198aSmrg@file{ipa-inline.c} and is described by @code{pass_ipa_inline}.
387*ec02198aSmrg
388*ec02198aSmrg@item IPA pure/const analysis
389*ec02198aSmrg
390*ec02198aSmrgThis pass marks functions as being either const (@code{TREE_READONLY}) or
391*ec02198aSmrgpure (@code{DECL_PURE_P}).  The per-function information is produced
392*ec02198aSmrgby @code{pure_const_generate_summary}, then the global information is computed
393*ec02198aSmrgby performing a transitive closure over the call graph.   It is located in
394*ec02198aSmrg@file{ipa-pure-const.c} and is described by @code{pass_ipa_pure_const}.
395*ec02198aSmrg
396*ec02198aSmrg@item IPA free function summary
397*ec02198aSmrg
398*ec02198aSmrgThis pass is a regular IPA pass when argument @code{small_p} is false.
399*ec02198aSmrgIt releases inline function summaries and call summaries.
400*ec02198aSmrgIt is located in @file{ipa-fnsummary.c} and is described by
401*ec02198aSmrg@code{pass_ipa_free_fn_summary}.
402*ec02198aSmrg
403*ec02198aSmrg@item IPA reference
404*ec02198aSmrg
405*ec02198aSmrgThis pass gathers information about how variables whose scope is
406*ec02198aSmrgconfined to the compilation unit are used.  It is located in
407*ec02198aSmrg@file{ipa-reference.c} and is described by @code{pass_ipa_reference}.
408*ec02198aSmrg
409*ec02198aSmrg@item IPA single use
410*ec02198aSmrg
411*ec02198aSmrgThis pass checks whether variables are used by a single function.
412*ec02198aSmrgIt is located in @file{ipa.c} and is described by
413*ec02198aSmrg@code{pass_ipa_single_use}.
414*ec02198aSmrg
415*ec02198aSmrg@item IPA comdats
416*ec02198aSmrg
417*ec02198aSmrgThis pass looks for static symbols that are used exclusively
418*ec02198aSmrgwithin one comdat group, and moves them into that comdat group. It is
419*ec02198aSmrglocated in @file{ipa-comdats.c} and is described by
420*ec02198aSmrg@code{pass_ipa_comdats}.
421*ec02198aSmrg
422*ec02198aSmrg@end itemize
423*ec02198aSmrg
424*ec02198aSmrg@node Late IPA passes
425*ec02198aSmrg@subsection Late IPA passes
426*ec02198aSmrg@cindex late IPA passes
427*ec02198aSmrg
428*ec02198aSmrgLate IPA passes are simple IPA passes executed after
429*ec02198aSmrgthe regular passes.  In WHOPR mode the passes are executed after
430*ec02198aSmrgpartitioning and thus see just parts of the compiled unit.
431*ec02198aSmrg
432*ec02198aSmrg@itemize @bullet
433*ec02198aSmrg@item Materialize all clones
434*ec02198aSmrg
435*ec02198aSmrgOnce all functions from compilation unit are in memory, produce all clones
436*ec02198aSmrgand update all calls.  It is located in @file{ipa.c} and is described by
437*ec02198aSmrg@code{pass_materialize_all_clones}.
438*ec02198aSmrg
439*ec02198aSmrg@item IPA points-to analysis
440*ec02198aSmrg
441*ec02198aSmrgPoints-to analysis; this is the same as the points-to-analysis pass
442*ec02198aSmrgrun with the small IPA passes (@pxref{Small IPA passes}).
443*ec02198aSmrg
444*ec02198aSmrg@item OpenMP simd clone
445*ec02198aSmrg
446*ec02198aSmrgThis is the OpenMP constructs' SIMD clone pass.  It creates the appropriate
447*ec02198aSmrgSIMD clones for functions tagged as elemental SIMD functions.
448*ec02198aSmrgIt is located in @file{omp-simd-clone.c} and is described by
449*ec02198aSmrg@code{pass_omp_simd_clone}.
450*ec02198aSmrg
451*ec02198aSmrg@end itemize
452*ec02198aSmrg
45310d565efSmrg@node Tree SSA passes
45410d565efSmrg@section Tree SSA passes
45510d565efSmrg
45610d565efSmrgThe following briefly describes the Tree optimization passes that are
45710d565efSmrgrun after gimplification and what source files they are located in.
45810d565efSmrg
45910d565efSmrg@itemize @bullet
46010d565efSmrg@item Remove useless statements
46110d565efSmrg
46210d565efSmrgThis pass is an extremely simple sweep across the gimple code in which
46310d565efSmrgwe identify obviously dead code and remove it.  Here we do things like
46410d565efSmrgsimplify @code{if} statements with constant conditions, remove
46510d565efSmrgexception handling constructs surrounding code that obviously cannot
46610d565efSmrgthrow, remove lexical bindings that contain no variables, and other
46710d565efSmrgassorted simplistic cleanups.  The idea is to get rid of the obvious
46810d565efSmrgstuff quickly rather than wait until later when it's more work to get
46910d565efSmrgrid of it.  This pass is located in @file{tree-cfg.c} and described by
47010d565efSmrg@code{pass_remove_useless_stmts}.
47110d565efSmrg
47210d565efSmrg@item OpenMP lowering
47310d565efSmrg
47410d565efSmrgIf OpenMP generation (@option{-fopenmp}) is enabled, this pass lowers
47510d565efSmrgOpenMP constructs into GIMPLE.
47610d565efSmrg
47710d565efSmrgLowering of OpenMP constructs involves creating replacement
47810d565efSmrgexpressions for local variables that have been mapped using data
47910d565efSmrgsharing clauses, exposing the control flow of most synchronization
48010d565efSmrgdirectives and adding region markers to facilitate the creation of the
48110d565efSmrgcontrol flow graph.  The pass is located in @file{omp-low.c} and is
48210d565efSmrgdescribed by @code{pass_lower_omp}.
48310d565efSmrg
48410d565efSmrg@item OpenMP expansion
48510d565efSmrg
48610d565efSmrgIf OpenMP generation (@option{-fopenmp}) is enabled, this pass expands
48710d565efSmrgparallel regions into their own functions to be invoked by the thread
48810d565efSmrglibrary.  The pass is located in @file{omp-low.c} and is described by
48910d565efSmrg@code{pass_expand_omp}.
49010d565efSmrg
49110d565efSmrg@item Lower control flow
49210d565efSmrg
49310d565efSmrgThis pass flattens @code{if} statements (@code{COND_EXPR})
49410d565efSmrgand moves lexical bindings (@code{BIND_EXPR}) out of line.  After
49510d565efSmrgthis pass, all @code{if} statements will have exactly two @code{goto}
49610d565efSmrgstatements in its @code{then} and @code{else} arms.  Lexical binding
49710d565efSmrginformation for each statement will be found in @code{TREE_BLOCK} rather
49810d565efSmrgthan being inferred from its position under a @code{BIND_EXPR}.  This
49910d565efSmrgpass is found in @file{gimple-low.c} and is described by
50010d565efSmrg@code{pass_lower_cf}.
50110d565efSmrg
50210d565efSmrg@item Lower exception handling control flow
50310d565efSmrg
50410d565efSmrgThis pass decomposes high-level exception handling constructs
50510d565efSmrg(@code{TRY_FINALLY_EXPR} and @code{TRY_CATCH_EXPR}) into a form
50610d565efSmrgthat explicitly represents the control flow involved.  After this
50710d565efSmrgpass, @code{lookup_stmt_eh_region} will return a non-negative
50810d565efSmrgnumber for any statement that may have EH control flow semantics;
50910d565efSmrgexamine @code{tree_can_throw_internal} or @code{tree_can_throw_external}
51010d565efSmrgfor exact semantics.  Exact control flow may be extracted from
51110d565efSmrg@code{foreach_reachable_handler}.  The EH region nesting tree is defined
51210d565efSmrgin @file{except.h} and built in @file{except.c}.  The lowering pass
51310d565efSmrgitself is in @file{tree-eh.c} and is described by @code{pass_lower_eh}.
51410d565efSmrg
51510d565efSmrg@item Build the control flow graph
51610d565efSmrg
51710d565efSmrgThis pass decomposes a function into basic blocks and creates all of
51810d565efSmrgthe edges that connect them.  It is located in @file{tree-cfg.c} and
51910d565efSmrgis described by @code{pass_build_cfg}.
52010d565efSmrg
52110d565efSmrg@item Find all referenced variables
52210d565efSmrg
52310d565efSmrgThis pass walks the entire function and collects an array of all
52410d565efSmrgvariables referenced in the function, @code{referenced_vars}.  The
52510d565efSmrgindex at which a variable is found in the array is used as a UID
52610d565efSmrgfor the variable within this function.  This data is needed by the
52710d565efSmrgSSA rewriting routines.  The pass is located in @file{tree-dfa.c}
52810d565efSmrgand is described by @code{pass_referenced_vars}.
52910d565efSmrg
53010d565efSmrg@item Enter static single assignment form
53110d565efSmrg
53210d565efSmrgThis pass rewrites the function such that it is in SSA form.  After
53310d565efSmrgthis pass, all @code{is_gimple_reg} variables will be referenced by
53410d565efSmrg@code{SSA_NAME}, and all occurrences of other variables will be
53510d565efSmrgannotated with @code{VDEFS} and @code{VUSES}; PHI nodes will have
53610d565efSmrgbeen inserted as necessary for each basic block.  This pass is
53710d565efSmrglocated in @file{tree-ssa.c} and is described by @code{pass_build_ssa}.
53810d565efSmrg
53910d565efSmrg@item Warn for uninitialized variables
54010d565efSmrg
54110d565efSmrgThis pass scans the function for uses of @code{SSA_NAME}s that
54210d565efSmrgare fed by default definition.  For non-parameter variables, such
54310d565efSmrguses are uninitialized.  The pass is run twice, before and after
54410d565efSmrgoptimization (if turned on).  In the first pass we only warn for uses that are
54510d565efSmrgpositively uninitialized; in the second pass we warn for uses that
54610d565efSmrgare possibly uninitialized.  The pass is located in @file{tree-ssa.c}
54710d565efSmrgand is defined by @code{pass_early_warn_uninitialized} and
54810d565efSmrg@code{pass_late_warn_uninitialized}.
54910d565efSmrg
55010d565efSmrg@item Dead code elimination
55110d565efSmrg
55210d565efSmrgThis pass scans the function for statements without side effects whose
55310d565efSmrgresult is unused.  It does not do memory life analysis, so any value
55410d565efSmrgthat is stored in memory is considered used.  The pass is run multiple
55510d565efSmrgtimes throughout the optimization process.  It is located in
55610d565efSmrg@file{tree-ssa-dce.c} and is described by @code{pass_dce}.
55710d565efSmrg
55810d565efSmrg@item Dominator optimizations
55910d565efSmrg
56010d565efSmrgThis pass performs trivial dominator-based copy and constant propagation,
56110d565efSmrgexpression simplification, and jump threading.  It is run multiple times
56210d565efSmrgthroughout the optimization process.  It is located in @file{tree-ssa-dom.c}
56310d565efSmrgand is described by @code{pass_dominator}.
56410d565efSmrg
56510d565efSmrg@item Forward propagation of single-use variables
56610d565efSmrg
56710d565efSmrgThis pass attempts to remove redundant computation by substituting
56810d565efSmrgvariables that are used once into the expression that uses them and
56910d565efSmrgseeing if the result can be simplified.  It is located in
57010d565efSmrg@file{tree-ssa-forwprop.c} and is described by @code{pass_forwprop}.
57110d565efSmrg
57210d565efSmrg@item Copy Renaming
57310d565efSmrg
57410d565efSmrgThis pass attempts to change the name of compiler temporaries involved in
57510d565efSmrgcopy operations such that SSA->normal can coalesce the copy away.  When compiler
57610d565efSmrgtemporaries are copies of user variables, it also renames the compiler
57710d565efSmrgtemporary to the user variable resulting in better use of user symbols.  It is
57810d565efSmrglocated in @file{tree-ssa-copyrename.c} and is described by
57910d565efSmrg@code{pass_copyrename}.
58010d565efSmrg
58110d565efSmrg@item PHI node optimizations
58210d565efSmrg
58310d565efSmrgThis pass recognizes forms of PHI inputs that can be represented as
58410d565efSmrgconditional expressions and rewrites them into straight line code.
58510d565efSmrgIt is located in @file{tree-ssa-phiopt.c} and is described by
58610d565efSmrg@code{pass_phiopt}.
58710d565efSmrg
58810d565efSmrg@item May-alias optimization
58910d565efSmrg
59010d565efSmrgThis pass performs a flow sensitive SSA-based points-to analysis.
59110d565efSmrgThe resulting may-alias, must-alias, and escape analysis information
59210d565efSmrgis used to promote variables from in-memory addressable objects to
59310d565efSmrgnon-aliased variables that can be renamed into SSA form.  We also
59410d565efSmrgupdate the @code{VDEF}/@code{VUSE} memory tags for non-renameable
59510d565efSmrgaggregates so that we get fewer false kills.  The pass is located
59610d565efSmrgin @file{tree-ssa-alias.c} and is described by @code{pass_may_alias}.
59710d565efSmrg
59810d565efSmrgInterprocedural points-to information is located in
59910d565efSmrg@file{tree-ssa-structalias.c} and described by @code{pass_ipa_pta}.
60010d565efSmrg
60110d565efSmrg@item Profiling
60210d565efSmrg
60310d565efSmrgThis pass instruments the function in order to collect runtime block
60410d565efSmrgand value profiling data.  Such data may be fed back into the compiler
60510d565efSmrgon a subsequent run so as to allow optimization based on expected
60610d565efSmrgexecution frequencies.  The pass is located in @file{tree-profile.c} and
60710d565efSmrgis described by @code{pass_ipa_tree_profile}.
60810d565efSmrg
60910d565efSmrg@item Static profile estimation
61010d565efSmrg
61110d565efSmrgThis pass implements series of heuristics to guess propababilities
61210d565efSmrgof branches.  The resulting predictions are turned into edge profile
61310d565efSmrgby propagating branches across the control flow graphs.
61410d565efSmrgThe pass is located in @file{tree-profile.c} and is described by
61510d565efSmrg@code{pass_profile}.
61610d565efSmrg
61710d565efSmrg@item Lower complex arithmetic
61810d565efSmrg
61910d565efSmrgThis pass rewrites complex arithmetic operations into their component
62010d565efSmrgscalar arithmetic operations.  The pass is located in @file{tree-complex.c}
62110d565efSmrgand is described by @code{pass_lower_complex}.
62210d565efSmrg
62310d565efSmrg@item Scalar replacement of aggregates
62410d565efSmrg
62510d565efSmrgThis pass rewrites suitable non-aliased local aggregate variables into
62610d565efSmrga set of scalar variables.  The resulting scalar variables are
62710d565efSmrgrewritten into SSA form, which allows subsequent optimization passes
62810d565efSmrgto do a significantly better job with them.  The pass is located in
62910d565efSmrg@file{tree-sra.c} and is described by @code{pass_sra}.
63010d565efSmrg
63110d565efSmrg@item Dead store elimination
63210d565efSmrg
63310d565efSmrgThis pass eliminates stores to memory that are subsequently overwritten
63410d565efSmrgby another store, without any intervening loads.  The pass is located
63510d565efSmrgin @file{tree-ssa-dse.c} and is described by @code{pass_dse}.
63610d565efSmrg
63710d565efSmrg@item Tail recursion elimination
63810d565efSmrg
63910d565efSmrgThis pass transforms tail recursion into a loop.  It is located in
64010d565efSmrg@file{tree-tailcall.c} and is described by @code{pass_tail_recursion}.
64110d565efSmrg
64210d565efSmrg@item Forward store motion
64310d565efSmrg
64410d565efSmrgThis pass sinks stores and assignments down the flowgraph closer to their
64510d565efSmrguse point.  The pass is located in @file{tree-ssa-sink.c} and is
64610d565efSmrgdescribed by @code{pass_sink_code}.
64710d565efSmrg
64810d565efSmrg@item Partial redundancy elimination
64910d565efSmrg
65010d565efSmrgThis pass eliminates partially redundant computations, as well as
65110d565efSmrgperforming load motion.  The pass is located in @file{tree-ssa-pre.c}
65210d565efSmrgand is described by @code{pass_pre}.
65310d565efSmrg
65410d565efSmrgJust before partial redundancy elimination, if
65510d565efSmrg@option{-funsafe-math-optimizations} is on, GCC tries to convert
65610d565efSmrgdivisions to multiplications by the reciprocal.  The pass is located
65710d565efSmrgin @file{tree-ssa-math-opts.c} and is described by
65810d565efSmrg@code{pass_cse_reciprocal}.
65910d565efSmrg
66010d565efSmrg@item Full redundancy elimination
66110d565efSmrg
66210d565efSmrgThis is a simpler form of PRE that only eliminates redundancies that
66310d565efSmrgoccur on all paths.  It is located in @file{tree-ssa-pre.c} and
66410d565efSmrgdescribed by @code{pass_fre}.
66510d565efSmrg
66610d565efSmrg@item Loop optimization
66710d565efSmrg
66810d565efSmrgThe main driver of the pass is placed in @file{tree-ssa-loop.c}
66910d565efSmrgand described by @code{pass_loop}.
67010d565efSmrg
67110d565efSmrgThe optimizations performed by this pass are:
67210d565efSmrg
67310d565efSmrgLoop invariant motion.  This pass moves only invariants that
67410d565efSmrgwould be hard to handle on RTL level (function calls, operations that expand to
67510d565efSmrgnontrivial sequences of insns).  With @option{-funswitch-loops} it also moves
67610d565efSmrgoperands of conditions that are invariant out of the loop, so that we can use
67710d565efSmrgjust trivial invariantness analysis in loop unswitching.  The pass also includes
67810d565efSmrgstore motion.  The pass is implemented in @file{tree-ssa-loop-im.c}.
67910d565efSmrg
68010d565efSmrgCanonical induction variable creation.  This pass creates a simple counter
68110d565efSmrgfor number of iterations of the loop and replaces the exit condition of the
68210d565efSmrgloop using it, in case when a complicated analysis is necessary to determine
68310d565efSmrgthe number of iterations.  Later optimizations then may determine the number
68410d565efSmrgeasily.  The pass is implemented in @file{tree-ssa-loop-ivcanon.c}.
68510d565efSmrg
68610d565efSmrgInduction variable optimizations.  This pass performs standard induction
68710d565efSmrgvariable optimizations, including strength reduction, induction variable
68810d565efSmrgmerging and induction variable elimination.  The pass is implemented in
68910d565efSmrg@file{tree-ssa-loop-ivopts.c}.
69010d565efSmrg
69110d565efSmrgLoop unswitching.  This pass moves the conditional jumps that are invariant
69210d565efSmrgout of the loops.  To achieve this, a duplicate of the loop is created for
69310d565efSmrgeach possible outcome of conditional jump(s).  The pass is implemented in
69410d565efSmrg@file{tree-ssa-loop-unswitch.c}.
69510d565efSmrg
69610d565efSmrgLoop splitting.  If a loop contains a conditional statement that is
69710d565efSmrgalways true for one part of the iteration space and false for the other
69810d565efSmrgthis pass splits the loop into two, one dealing with one side the other
69910d565efSmrgonly with the other, thereby removing one inner-loop conditional.  The
70010d565efSmrgpass is implemented in @file{tree-ssa-loop-split.c}.
70110d565efSmrg
70210d565efSmrgThe optimizations also use various utility functions contained in
70310d565efSmrg@file{tree-ssa-loop-manip.c}, @file{cfgloop.c}, @file{cfgloopanal.c} and
70410d565efSmrg@file{cfgloopmanip.c}.
70510d565efSmrg
70610d565efSmrgVectorization.  This pass transforms loops to operate on vector types
70710d565efSmrginstead of scalar types.  Data parallelism across loop iterations is exploited
70810d565efSmrgto group data elements from consecutive iterations into a vector and operate
70910d565efSmrgon them in parallel.  Depending on available target support the loop is
71010d565efSmrgconceptually unrolled by a factor @code{VF} (vectorization factor), which is
71110d565efSmrgthe number of elements operated upon in parallel in each iteration, and the
71210d565efSmrg@code{VF} copies of each scalar operation are fused to form a vector operation.
71310d565efSmrgAdditional loop transformations such as peeling and versioning may take place
71410d565efSmrgto align the number of iterations, and to align the memory accesses in the
71510d565efSmrgloop.
71610d565efSmrgThe pass is implemented in @file{tree-vectorizer.c} (the main driver),
71710d565efSmrg@file{tree-vect-loop.c} and @file{tree-vect-loop-manip.c} (loop specific parts
71810d565efSmrgand general loop utilities), @file{tree-vect-slp} (loop-aware SLP
71910d565efSmrgfunctionality), @file{tree-vect-stmts.c} and @file{tree-vect-data-refs.c}.
72010d565efSmrgAnalysis of data references is in @file{tree-data-ref.c}.
72110d565efSmrg
72210d565efSmrgSLP Vectorization.  This pass performs vectorization of straight-line code. The
72310d565efSmrgpass is implemented in @file{tree-vectorizer.c} (the main driver),
72410d565efSmrg@file{tree-vect-slp.c}, @file{tree-vect-stmts.c} and
72510d565efSmrg@file{tree-vect-data-refs.c}.
72610d565efSmrg
72710d565efSmrgAutoparallelization.  This pass splits the loop iteration space to run
72810d565efSmrginto several threads.  The pass is implemented in @file{tree-parloops.c}.
72910d565efSmrg
73010d565efSmrgGraphite is a loop transformation framework based on the polyhedral
73110d565efSmrgmodel.  Graphite stands for Gimple Represented as Polyhedra.  The
73210d565efSmrginternals of this infrastructure are documented in
73310d565efSmrg@w{@uref{http://gcc.gnu.org/wiki/Graphite}}.  The passes working on
73410d565efSmrgthis representation are implemented in the various @file{graphite-*}
73510d565efSmrgfiles.
73610d565efSmrg
73710d565efSmrg@item Tree level if-conversion for vectorizer
73810d565efSmrg
73910d565efSmrgThis pass applies if-conversion to simple loops to help vectorizer.
74010d565efSmrgWe identify if convertible loops, if-convert statements and merge
74110d565efSmrgbasic blocks in one big block.  The idea is to present loop in such
74210d565efSmrgform so that vectorizer can have one to one mapping between statements
74310d565efSmrgand available vector operations.  This pass is located in
74410d565efSmrg@file{tree-if-conv.c} and is described by @code{pass_if_conversion}.
74510d565efSmrg
74610d565efSmrg@item Conditional constant propagation
74710d565efSmrg
74810d565efSmrgThis pass relaxes a lattice of values in order to identify those
74910d565efSmrgthat must be constant even in the presence of conditional branches.
75010d565efSmrgThe pass is located in @file{tree-ssa-ccp.c} and is described
75110d565efSmrgby @code{pass_ccp}.
75210d565efSmrg
75310d565efSmrgA related pass that works on memory loads and stores, and not just
75410d565efSmrgregister values, is located in @file{tree-ssa-ccp.c} and described by
75510d565efSmrg@code{pass_store_ccp}.
75610d565efSmrg
75710d565efSmrg@item Conditional copy propagation
75810d565efSmrg
75910d565efSmrgThis is similar to constant propagation but the lattice of values is
76010d565efSmrgthe ``copy-of'' relation.  It eliminates redundant copies from the
76110d565efSmrgcode.  The pass is located in @file{tree-ssa-copy.c} and described by
76210d565efSmrg@code{pass_copy_prop}.
76310d565efSmrg
76410d565efSmrgA related pass that works on memory copies, and not just register
76510d565efSmrgcopies, is located in @file{tree-ssa-copy.c} and described by
76610d565efSmrg@code{pass_store_copy_prop}.
76710d565efSmrg
76810d565efSmrg@item Value range propagation
76910d565efSmrg
77010d565efSmrgThis transformation is similar to constant propagation but
77110d565efSmrginstead of propagating single constant values, it propagates
77210d565efSmrgknown value ranges.  The implementation is based on Patterson's
77310d565efSmrgrange propagation algorithm (Accurate Static Branch Prediction by
77410d565efSmrgValue Range Propagation, J. R. C. Patterson, PLDI '95).  In
77510d565efSmrgcontrast to Patterson's algorithm, this implementation does not
77610d565efSmrgpropagate branch probabilities nor it uses more than a single
77710d565efSmrgrange per SSA name. This means that the current implementation
77810d565efSmrgcannot be used for branch prediction (though adapting it would
77910d565efSmrgnot be difficult).  The pass is located in @file{tree-vrp.c} and is
78010d565efSmrgdescribed by @code{pass_vrp}.
78110d565efSmrg
78210d565efSmrg@item Folding built-in functions
78310d565efSmrg
78410d565efSmrgThis pass simplifies built-in functions, as applicable, with constant
78510d565efSmrgarguments or with inferable string lengths.  It is located in
78610d565efSmrg@file{tree-ssa-ccp.c} and is described by @code{pass_fold_builtins}.
78710d565efSmrg
78810d565efSmrg@item Split critical edges
78910d565efSmrg
79010d565efSmrgThis pass identifies critical edges and inserts empty basic blocks
79110d565efSmrgsuch that the edge is no longer critical.  The pass is located in
79210d565efSmrg@file{tree-cfg.c} and is described by @code{pass_split_crit_edges}.
79310d565efSmrg
79410d565efSmrg@item Control dependence dead code elimination
79510d565efSmrg
79610d565efSmrgThis pass is a stronger form of dead code elimination that can
79710d565efSmrgeliminate unnecessary control flow statements.   It is located
79810d565efSmrgin @file{tree-ssa-dce.c} and is described by @code{pass_cd_dce}.
79910d565efSmrg
80010d565efSmrg@item Tail call elimination
80110d565efSmrg
80210d565efSmrgThis pass identifies function calls that may be rewritten into
80310d565efSmrgjumps.  No code transformation is actually applied here, but the
80410d565efSmrgdata and control flow problem is solved.  The code transformation
80510d565efSmrgrequires target support, and so is delayed until RTL@.  In the
80610d565efSmrgmeantime @code{CALL_EXPR_TAILCALL} is set indicating the possibility.
80710d565efSmrgThe pass is located in @file{tree-tailcall.c} and is described by
80810d565efSmrg@code{pass_tail_calls}.  The RTL transformation is handled by
80910d565efSmrg@code{fixup_tail_calls} in @file{calls.c}.
81010d565efSmrg
81110d565efSmrg@item Warn for function return without value
81210d565efSmrg
81310d565efSmrgFor non-void functions, this pass locates return statements that do
81410d565efSmrgnot specify a value and issues a warning.  Such a statement may have
81510d565efSmrgbeen injected by falling off the end of the function.  This pass is
81610d565efSmrgrun last so that we have as much time as possible to prove that the
81710d565efSmrgstatement is not reachable.  It is located in @file{tree-cfg.c} and
81810d565efSmrgis described by @code{pass_warn_function_return}.
81910d565efSmrg
82010d565efSmrg@item Leave static single assignment form
82110d565efSmrg
82210d565efSmrgThis pass rewrites the function such that it is in normal form.  At
82310d565efSmrgthe same time, we eliminate as many single-use temporaries as possible,
82410d565efSmrgso the intermediate language is no longer GIMPLE, but GENERIC@.  The
82510d565efSmrgpass is located in @file{tree-outof-ssa.c} and is described by
82610d565efSmrg@code{pass_del_ssa}.
82710d565efSmrg
82810d565efSmrg@item Merge PHI nodes that feed into one another
82910d565efSmrg
83010d565efSmrgThis is part of the CFG cleanup passes.  It attempts to join PHI nodes
83110d565efSmrgfrom a forwarder CFG block into another block with PHI nodes.  The
83210d565efSmrgpass is located in @file{tree-cfgcleanup.c} and is described by
83310d565efSmrg@code{pass_merge_phi}.
83410d565efSmrg
83510d565efSmrg@item Return value optimization
83610d565efSmrg
83710d565efSmrgIf a function always returns the same local variable, and that local
83810d565efSmrgvariable is an aggregate type, then the variable is replaced with the
83910d565efSmrgreturn value for the function (i.e., the function's DECL_RESULT).  This
84010d565efSmrgis equivalent to the C++ named return value optimization applied to
84110d565efSmrgGIMPLE@.  The pass is located in @file{tree-nrv.c} and is described by
84210d565efSmrg@code{pass_nrv}.
84310d565efSmrg
84410d565efSmrg@item Return slot optimization
84510d565efSmrg
84610d565efSmrgIf a function returns a memory object and is called as @code{var =
84710d565efSmrgfoo()}, this pass tries to change the call so that the address of
84810d565efSmrg@code{var} is sent to the caller to avoid an extra memory copy.  This
84910d565efSmrgpass is located in @code{tree-nrv.c} and is described by
85010d565efSmrg@code{pass_return_slot}.
85110d565efSmrg
85210d565efSmrg@item Optimize calls to @code{__builtin_object_size}
85310d565efSmrg
85410d565efSmrgThis is a propagation pass similar to CCP that tries to remove calls
85510d565efSmrgto @code{__builtin_object_size} when the size of the object can be
85610d565efSmrgcomputed at compile-time.  This pass is located in
85710d565efSmrg@file{tree-object-size.c} and is described by
85810d565efSmrg@code{pass_object_sizes}.
85910d565efSmrg
86010d565efSmrg@item Loop invariant motion
86110d565efSmrg
86210d565efSmrgThis pass removes expensive loop-invariant computations out of loops.
86310d565efSmrgThe pass is located in @file{tree-ssa-loop.c} and described by
86410d565efSmrg@code{pass_lim}.
86510d565efSmrg
86610d565efSmrg@item Loop nest optimizations
86710d565efSmrg
86810d565efSmrgThis is a family of loop transformations that works on loop nests.  It
86910d565efSmrgincludes loop interchange, scaling, skewing and reversal and they are
87010d565efSmrgall geared to the optimization of data locality in array traversals
87110d565efSmrgand the removal of dependencies that hamper optimizations such as loop
87210d565efSmrgparallelization and vectorization.  The pass is located in
87310d565efSmrg@file{tree-loop-linear.c} and described by
87410d565efSmrg@code{pass_linear_transform}.
87510d565efSmrg
87610d565efSmrg@item Removal of empty loops
87710d565efSmrg
87810d565efSmrgThis pass removes loops with no code in them.  The pass is located in
87910d565efSmrg@file{tree-ssa-loop-ivcanon.c} and described by
88010d565efSmrg@code{pass_empty_loop}.
88110d565efSmrg
88210d565efSmrg@item Unrolling of small loops
88310d565efSmrg
88410d565efSmrgThis pass completely unrolls loops with few iterations.  The pass
88510d565efSmrgis located in @file{tree-ssa-loop-ivcanon.c} and described by
88610d565efSmrg@code{pass_complete_unroll}.
88710d565efSmrg
88810d565efSmrg@item Predictive commoning
88910d565efSmrg
89010d565efSmrgThis pass makes the code reuse the computations from the previous
89110d565efSmrgiterations of the loops, especially loads and stores to memory.
89210d565efSmrgIt does so by storing the values of these computations to a bank
89310d565efSmrgof temporary variables that are rotated at the end of loop.  To avoid
89410d565efSmrgthe need for this rotation, the loop is then unrolled and the copies
89510d565efSmrgof the loop body are rewritten to use the appropriate version of
89610d565efSmrgthe temporary variable.  This pass is located in @file{tree-predcom.c}
89710d565efSmrgand described by @code{pass_predcom}.
89810d565efSmrg
89910d565efSmrg@item Array prefetching
90010d565efSmrg
90110d565efSmrgThis pass issues prefetch instructions for array references inside
90210d565efSmrgloops.  The pass is located in @file{tree-ssa-loop-prefetch.c} and
90310d565efSmrgdescribed by @code{pass_loop_prefetch}.
90410d565efSmrg
90510d565efSmrg@item Reassociation
90610d565efSmrg
90710d565efSmrgThis pass rewrites arithmetic expressions to enable optimizations that
90810d565efSmrgoperate on them, like redundancy elimination and vectorization.  The
90910d565efSmrgpass is located in @file{tree-ssa-reassoc.c} and described by
91010d565efSmrg@code{pass_reassoc}.
91110d565efSmrg
91210d565efSmrg@item Optimization of @code{stdarg} functions
91310d565efSmrg
91410d565efSmrgThis pass tries to avoid the saving of register arguments into the
91510d565efSmrgstack on entry to @code{stdarg} functions.  If the function doesn't
91610d565efSmrguse any @code{va_start} macros, no registers need to be saved.  If
91710d565efSmrg@code{va_start} macros are used, the @code{va_list} variables don't
91810d565efSmrgescape the function, it is only necessary to save registers that will
91910d565efSmrgbe used in @code{va_arg} macros.  For instance, if @code{va_arg} is
92010d565efSmrgonly used with integral types in the function, floating point
92110d565efSmrgregisters don't need to be saved.  This pass is located in
92210d565efSmrg@code{tree-stdarg.c} and described by @code{pass_stdarg}.
92310d565efSmrg
92410d565efSmrg@end itemize
92510d565efSmrg
92610d565efSmrg@node RTL passes
92710d565efSmrg@section RTL passes
92810d565efSmrg
92910d565efSmrgThe following briefly describes the RTL generation and optimization
93010d565efSmrgpasses that are run after the Tree optimization passes.
93110d565efSmrg
93210d565efSmrg@itemize @bullet
93310d565efSmrg@item RTL generation
93410d565efSmrg
93510d565efSmrg@c Avoiding overfull is tricky here.
93610d565efSmrgThe source files for RTL generation include
93710d565efSmrg@file{stmt.c},
93810d565efSmrg@file{calls.c},
93910d565efSmrg@file{expr.c},
94010d565efSmrg@file{explow.c},
94110d565efSmrg@file{expmed.c},
94210d565efSmrg@file{function.c},
94310d565efSmrg@file{optabs.c}
94410d565efSmrgand @file{emit-rtl.c}.
94510d565efSmrgAlso, the file
94610d565efSmrg@file{insn-emit.c}, generated from the machine description by the
94710d565efSmrgprogram @code{genemit}, is used in this pass.  The header file
94810d565efSmrg@file{expr.h} is used for communication within this pass.
94910d565efSmrg
95010d565efSmrg@findex genflags
95110d565efSmrg@findex gencodes
95210d565efSmrgThe header files @file{insn-flags.h} and @file{insn-codes.h},
95310d565efSmrggenerated from the machine description by the programs @code{genflags}
95410d565efSmrgand @code{gencodes}, tell this pass which standard names are available
95510d565efSmrgfor use and which patterns correspond to them.
95610d565efSmrg
95710d565efSmrg@item Generation of exception landing pads
95810d565efSmrg
95910d565efSmrgThis pass generates the glue that handles communication between the
96010d565efSmrgexception handling library routines and the exception handlers within
96110d565efSmrgthe function.  Entry points in the function that are invoked by the
96210d565efSmrgexception handling library are called @dfn{landing pads}.  The code
96310d565efSmrgfor this pass is located in @file{except.c}.
96410d565efSmrg
96510d565efSmrg@item Control flow graph cleanup
96610d565efSmrg
96710d565efSmrgThis pass removes unreachable code, simplifies jumps to next, jumps to
96810d565efSmrgjump, jumps across jumps, etc.  The pass is run multiple times.
96910d565efSmrgFor historical reasons, it is occasionally referred to as the ``jump
97010d565efSmrgoptimization pass''.  The bulk of the code for this pass is in
97110d565efSmrg@file{cfgcleanup.c}, and there are support routines in @file{cfgrtl.c}
97210d565efSmrgand @file{jump.c}.
97310d565efSmrg
97410d565efSmrg@item Forward propagation of single-def values
97510d565efSmrg
97610d565efSmrgThis pass attempts to remove redundant computation by substituting
97710d565efSmrgvariables that come from a single definition, and
97810d565efSmrgseeing if the result can be simplified.  It performs copy propagation
97910d565efSmrgand addressing mode selection.  The pass is run twice, with values
98010d565efSmrgbeing propagated into loops only on the second run.  The code is
98110d565efSmrglocated in @file{fwprop.c}.
98210d565efSmrg
98310d565efSmrg@item Common subexpression elimination
98410d565efSmrg
98510d565efSmrgThis pass removes redundant computation within basic blocks, and
98610d565efSmrgoptimizes addressing modes based on cost.  The pass is run twice.
98710d565efSmrgThe code for this pass is located in @file{cse.c}.
98810d565efSmrg
98910d565efSmrg@item Global common subexpression elimination
99010d565efSmrg
99110d565efSmrgThis pass performs two
99210d565efSmrgdifferent types of GCSE  depending on whether you are optimizing for
99310d565efSmrgsize or not (LCM based GCSE tends to increase code size for a gain in
99410d565efSmrgspeed, while Morel-Renvoise based GCSE does not).
99510d565efSmrgWhen optimizing for size, GCSE is done using Morel-Renvoise Partial
99610d565efSmrgRedundancy Elimination, with the exception that it does not try to move
99710d565efSmrginvariants out of loops---that is left to  the loop optimization pass.
99810d565efSmrgIf MR PRE GCSE is done, code hoisting (aka unification) is also done, as
99910d565efSmrgwell as load motion.
100010d565efSmrgIf you are optimizing for speed, LCM (lazy code motion) based GCSE is
100110d565efSmrgdone.  LCM is based on the work of Knoop, Ruthing, and Steffen.  LCM
100210d565efSmrgbased GCSE also does loop invariant code motion.  We also perform load
100310d565efSmrgand store motion when optimizing for speed.
100410d565efSmrgRegardless of which type of GCSE is used, the GCSE pass also performs
100510d565efSmrgglobal constant and  copy propagation.
100610d565efSmrgThe source file for this pass is @file{gcse.c}, and the LCM routines
100710d565efSmrgare in @file{lcm.c}.
100810d565efSmrg
100910d565efSmrg@item Loop optimization
101010d565efSmrg
101110d565efSmrgThis pass performs several loop related optimizations.
101210d565efSmrgThe source files @file{cfgloopanal.c} and @file{cfgloopmanip.c} contain
101310d565efSmrggeneric loop analysis and manipulation code.  Initialization and finalization
101410d565efSmrgof loop structures is handled by @file{loop-init.c}.
101510d565efSmrgA loop invariant motion pass is implemented in @file{loop-invariant.c}.
101610d565efSmrgBasic block level optimizations---unrolling, and peeling loops---
101710d565efSmrgare implemented in @file{loop-unroll.c}.
101810d565efSmrgReplacing of the exit condition of loops by special machine-dependent
101910d565efSmrginstructions is handled by @file{loop-doloop.c}.
102010d565efSmrg
102110d565efSmrg@item Jump bypassing
102210d565efSmrg
102310d565efSmrgThis pass is an aggressive form of GCSE that transforms the control
102410d565efSmrgflow graph of a function by propagating constants into conditional
102510d565efSmrgbranch instructions.  The source file for this pass is @file{gcse.c}.
102610d565efSmrg
102710d565efSmrg@item If conversion
102810d565efSmrg
102910d565efSmrgThis pass attempts to replace conditional branches and surrounding
103010d565efSmrgassignments with arithmetic, boolean value producing comparison
103110d565efSmrginstructions, and conditional move instructions.  In the very last
103210d565efSmrginvocation after reload/LRA, it will generate predicated instructions
103310d565efSmrgwhen supported by the target.  The code is located in @file{ifcvt.c}.
103410d565efSmrg
103510d565efSmrg@item Web construction
103610d565efSmrg
103710d565efSmrgThis pass splits independent uses of each pseudo-register.  This can
103810d565efSmrgimprove effect of the other transformation, such as CSE or register
103910d565efSmrgallocation.  The code for this pass is located in @file{web.c}.
104010d565efSmrg
104110d565efSmrg@item Instruction combination
104210d565efSmrg
104310d565efSmrgThis pass attempts to combine groups of two or three instructions that
104410d565efSmrgare related by data flow into single instructions.  It combines the
104510d565efSmrgRTL expressions for the instructions by substitution, simplifies the
104610d565efSmrgresult using algebra, and then attempts to match the result against
104710d565efSmrgthe machine description.  The code is located in @file{combine.c}.
104810d565efSmrg
104910d565efSmrg@item Mode switching optimization
105010d565efSmrg
105110d565efSmrgThis pass looks for instructions that require the processor to be in a
105210d565efSmrgspecific ``mode'' and minimizes the number of mode changes required to
105310d565efSmrgsatisfy all users.  What these modes are, and what they apply to are
105410d565efSmrgcompletely target-specific.  The code for this pass is located in
105510d565efSmrg@file{mode-switching.c}.
105610d565efSmrg
105710d565efSmrg@cindex modulo scheduling
105810d565efSmrg@cindex sms, swing, software pipelining
105910d565efSmrg@item Modulo scheduling
106010d565efSmrg
106110d565efSmrgThis pass looks at innermost loops and reorders their instructions
106210d565efSmrgby overlapping different iterations.  Modulo scheduling is performed
106310d565efSmrgimmediately before instruction scheduling.  The code for this pass is
106410d565efSmrglocated in @file{modulo-sched.c}.
106510d565efSmrg
106610d565efSmrg@item Instruction scheduling
106710d565efSmrg
106810d565efSmrgThis pass looks for instructions whose output will not be available by
106910d565efSmrgthe time that it is used in subsequent instructions.  Memory loads and
107010d565efSmrgfloating point instructions often have this behavior on RISC machines.
107110d565efSmrgIt re-orders instructions within a basic block to try to separate the
107210d565efSmrgdefinition and use of items that otherwise would cause pipeline
107310d565efSmrgstalls.  This pass is performed twice, before and after register
107410d565efSmrgallocation.  The code for this pass is located in @file{haifa-sched.c},
107510d565efSmrg@file{sched-deps.c}, @file{sched-ebb.c}, @file{sched-rgn.c} and
107610d565efSmrg@file{sched-vis.c}.
107710d565efSmrg
107810d565efSmrg@item Register allocation
107910d565efSmrg
108010d565efSmrgThese passes make sure that all occurrences of pseudo registers are
108110d565efSmrgeliminated, either by allocating them to a hard register, replacing
108210d565efSmrgthem by an equivalent expression (e.g.@: a constant) or by placing
108310d565efSmrgthem on the stack.  This is done in several subpasses:
108410d565efSmrg
108510d565efSmrg@itemize @bullet
108610d565efSmrg@item
108710d565efSmrgThe integrated register allocator (@acronym{IRA}).  It is called
108810d565efSmrgintegrated because coalescing, register live range splitting, and hard
108910d565efSmrgregister preferencing are done on-the-fly during coloring.  It also
109010d565efSmrghas better integration with the reload/LRA pass.  Pseudo-registers spilled
109110d565efSmrgby the allocator or the reload/LRA have still a chance to get
109210d565efSmrghard-registers if the reload/LRA evicts some pseudo-registers from
109310d565efSmrghard-registers.  The allocator helps to choose better pseudos for
109410d565efSmrgspilling based on their live ranges and to coalesce stack slots
109510d565efSmrgallocated for the spilled pseudo-registers.  IRA is a regional
109610d565efSmrgregister allocator which is transformed into Chaitin-Briggs allocator
109710d565efSmrgif there is one region.  By default, IRA chooses regions using
109810d565efSmrgregister pressure but the user can force it to use one region or
109910d565efSmrgregions corresponding to all loops.
110010d565efSmrg
110110d565efSmrgSource files of the allocator are @file{ira.c}, @file{ira-build.c},
110210d565efSmrg@file{ira-costs.c}, @file{ira-conflicts.c}, @file{ira-color.c},
110310d565efSmrg@file{ira-emit.c}, @file{ira-lives}, plus header files @file{ira.h}
110410d565efSmrgand @file{ira-int.h} used for the communication between the allocator
110510d565efSmrgand the rest of the compiler and between the IRA files.
110610d565efSmrg
110710d565efSmrg@cindex reloading
110810d565efSmrg@item
110910d565efSmrgReloading.  This pass renumbers pseudo registers with the hardware
111010d565efSmrgregisters numbers they were allocated.  Pseudo registers that did not
111110d565efSmrgget hard registers are replaced with stack slots.  Then it finds
111210d565efSmrginstructions that are invalid because a value has failed to end up in
111310d565efSmrga register, or has ended up in a register of the wrong kind.  It fixes
111410d565efSmrgup these instructions by reloading the problematical values
111510d565efSmrgtemporarily into registers.  Additional instructions are generated to
111610d565efSmrgdo the copying.
111710d565efSmrg
111810d565efSmrgThe reload pass also optionally eliminates the frame pointer and inserts
111910d565efSmrginstructions to save and restore call-clobbered registers around calls.
112010d565efSmrg
112110d565efSmrgSource files are @file{reload.c} and @file{reload1.c}, plus the header
112210d565efSmrg@file{reload.h} used for communication between them.
112310d565efSmrg
112410d565efSmrg@cindex Local Register Allocator (LRA)
112510d565efSmrg@item
112610d565efSmrgThis pass is a modern replacement of the reload pass.  Source files
112710d565efSmrgare @file{lra.c}, @file{lra-assign.c}, @file{lra-coalesce.c},
112810d565efSmrg@file{lra-constraints.c}, @file{lra-eliminations.c},
112910d565efSmrg@file{lra-lives.c}, @file{lra-remat.c}, @file{lra-spills.c}, the
113010d565efSmrgheader @file{lra-int.h} used for communication between them, and the
113110d565efSmrgheader @file{lra.h} used for communication between LRA and the rest of
113210d565efSmrgcompiler.
113310d565efSmrg
113410d565efSmrgUnlike the reload pass, intermediate LRA decisions are reflected in
113510d565efSmrgRTL as much as possible.  This reduces the number of target-dependent
113610d565efSmrgmacros and hooks, leaving instruction constraints as the primary
113710d565efSmrgsource of control.
113810d565efSmrg
113910d565efSmrgLRA is run on targets for which TARGET_LRA_P returns true.
114010d565efSmrg@end itemize
114110d565efSmrg
114210d565efSmrg@item Basic block reordering
114310d565efSmrg
114410d565efSmrgThis pass implements profile guided code positioning.  If profile
114510d565efSmrginformation is not available, various types of static analysis are
114610d565efSmrgperformed to make the predictions normally coming from the profile
114710d565efSmrgfeedback (IE execution frequency, branch probability, etc).  It is
114810d565efSmrgimplemented in the file @file{bb-reorder.c}, and the various
114910d565efSmrgprediction routines are in @file{predict.c}.
115010d565efSmrg
115110d565efSmrg@item Variable tracking
115210d565efSmrg
115310d565efSmrgThis pass computes where the variables are stored at each
115410d565efSmrgposition in code and generates notes describing the variable locations
115510d565efSmrgto RTL code.  The location lists are then generated according to these
115610d565efSmrgnotes to debug information if the debugging information format supports
115710d565efSmrglocation lists.  The code is located in @file{var-tracking.c}.
115810d565efSmrg
115910d565efSmrg@item Delayed branch scheduling
116010d565efSmrg
116110d565efSmrgThis optional pass attempts to find instructions that can go into the
116210d565efSmrgdelay slots of other instructions, usually jumps and calls.  The code
116310d565efSmrgfor this pass is located in @file{reorg.c}.
116410d565efSmrg
116510d565efSmrg@item Branch shortening
116610d565efSmrg
116710d565efSmrgOn many RISC machines, branch instructions have a limited range.
116810d565efSmrgThus, longer sequences of instructions must be used for long branches.
116910d565efSmrgIn this pass, the compiler figures out what how far each instruction
117010d565efSmrgwill be from each other instruction, and therefore whether the usual
117110d565efSmrginstructions, or the longer sequences, must be used for each branch.
117210d565efSmrgThe code for this pass is located in @file{final.c}.
117310d565efSmrg
117410d565efSmrg@item Register-to-stack conversion
117510d565efSmrg
117610d565efSmrgConversion from usage of some hard registers to usage of a register
117710d565efSmrgstack may be done at this point.  Currently, this is supported only
117810d565efSmrgfor the floating-point registers of the Intel 80387 coprocessor.  The
117910d565efSmrgcode for this pass is located in @file{reg-stack.c}.
118010d565efSmrg
118110d565efSmrg@item Final
118210d565efSmrg
118310d565efSmrgThis pass outputs the assembler code for the function.  The source files
118410d565efSmrgare @file{final.c} plus @file{insn-output.c}; the latter is generated
118510d565efSmrgautomatically from the machine description by the tool @file{genoutput}.
118610d565efSmrgThe header file @file{conditions.h} is used for communication between
118710d565efSmrgthese files.
118810d565efSmrg
118910d565efSmrg@item Debugging information output
119010d565efSmrg
119110d565efSmrgThis is run after final because it must output the stack slot offsets
119210d565efSmrgfor pseudo registers that did not get hard registers.  Source files
1193c7a68eb7Smrgare @file{dbxout.c} for DBX symbol table format, @file{dwarfout.c} for
1194c7a68eb7SmrgDWARF symbol table format, files @file{dwarf2out.c} and @file{dwarf2asm.c}
1195c7a68eb7Smrgfor DWARF2 symbol table format, and @file{vmsdbgout.c} for VMS debug
1196c7a68eb7Smrgsymbol table format.
119710d565efSmrg
119810d565efSmrg@end itemize
119910d565efSmrg
120010d565efSmrg@node Optimization info
120110d565efSmrg@section Optimization info
120210d565efSmrg@include optinfo.texi
1203