@c Copyright (C) 2002-2021 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@node Type Information
@chapter Memory Management and Type Information
@cindex GGC
@findex GTY

GCC uses some fairly sophisticated memory management techniques, which
involve determining information about GCC's data structures from GCC's
source code and using this information to perform garbage collection and
implement precompiled headers.

A full C++ parser would be too complicated for this task, so a limited
subset of C++ is interpreted and special markers are used to determine
what parts of the source to look at.  All @code{struct}, @code{union}
and @code{template} structure declarations that define data structures
that are allocated under control of the garbage collector must be
marked.  All global variables that hold pointers to garbage-collected
memory must also be marked.  Finally, all global variables that need
to be saved and restored by a precompiled header must be marked.  (The
precompiled header mechanism can only save static variables if they're
scalar.  Complex data structures must be allocated in garbage-collected
memory to be saved in a precompiled header.)

The full format of a marker is
@smallexample
GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
@end smallexample
@noindent
but in most cases no options are needed.  The outer double parentheses
are still necessary, though: @code{GTY(())}.  Markers can appear:

@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize

Here are some examples of marking simple data structures and globals.

@smallexample
struct GTY(()) @var{tag}
@{
  @var{fields}@dots{}
@};

typedef struct GTY(()) @var{tag}
@{
  @var{fields}@dots{}
@} *@var{typename};

static GTY(()) struct @var{tag} *@var{list};   /* @r{points to GC memory} */
static GTY(()) int @var{counter};        /* @r{save counter in a PCH} */
@end smallexample

The parser understands simple typedefs such as
@code{typedef struct @var{tag} *@var{name};} and
@code{typedef int @var{name};}.
These don't need to be marked.

Since @code{gengtype}'s understanding of C++ is limited, there are
several constructs and declarations that are not supported inside
classes/structures marked for automatic GC code generation.  The
following C++ constructs produce a @code{gengtype} error on
structures/classes marked for automatic GC code generation:

@itemize @bullet
@item
Type definitions inside classes/structures are not supported.
@item
Enumerations inside classes/structures are not supported.
@end itemize

If you have a class or structure using any of the above constructs,
you need to mark that class as @code{GTY ((user))} and provide your
own marking routines (see section @ref{User GC} for details).

It is always valid to include function definitions inside classes.
Those are always ignored by @code{gengtype}, as it only cares about
data members.

@menu
* GTY Options::         What goes inside a @code{GTY(())}.
* Inheritance and GTY:: Adding GTY to a class hierarchy.
* User GC::             Adding user-provided GC marking routines.
* GGC Roots::           Making global variables GGC roots.
* Files::               How the generated files work.
* Invoking the garbage collector::   How to invoke the garbage collector.
* Troubleshooting::     When something does not work as expected.
@end menu

@node GTY Options
@section The Inside of a @code{GTY(())}

Sometimes the C code is not enough to fully describe the type
structure.
Extra information can be provided with @code{GTY} options
and additional markers.  Some options take a parameter, which may be
either a string or a type name, depending on the parameter.  If an
option takes no parameter, it is acceptable either to omit the
parameter entirely, or to provide an empty string as a parameter.  For
example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
equivalent.

When the parameter is a string, often it is a fragment of C code.  Four
special escapes may be used in these strings, to refer to pieces of
the data structure being marked:

@cindex % in GTY option
@table @code
@item %h
The current structure.
@item %1
The structure that immediately contains the current structure.
@item %0
The outermost structure that contains the current structure.
@item %a
A partial expression of the form @code{[i1][i2]@dots{}} that indexes
the array item currently being marked.
@end table

For instance, suppose that you have a structure of the form
@smallexample
struct A @{
  @dots{}
@};
struct B @{
  struct A foo[12];
@};
@end smallexample
@noindent
and @code{b} is a variable of type @code{struct B}.  When marking
@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
would expand to @samp{[11]}.

As in ordinary C, adjacent strings will be concatenated; this is
helpful when you have a complicated expression.
@smallexample
@group
GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
                  " ? TYPE_NEXT_VARIANT (&%h.generic)"
                  " : TREE_CHAIN (&%h.generic)")))
@end group
@end smallexample

The available options are:

@table @code
@findex length
@item length ("@var{expression}")

There are two places the type machinery will need to be explicitly told
the length of an array of non-atomic objects.
The first case is when a
structure ends in a variable-length array, like this:
@smallexample
struct GTY(()) rtvec_def @{
  int num_elem;         /* @r{number of elements} */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
@};
@end smallexample

In this case, the @code{length} option is used to override the specified
array length (which should usually be @code{1}).  The parameter of the
option is a fragment of C code that calculates the length.

The second case is when a structure or a global variable contains a
pointer to an array, like this:
@smallexample
struct gimple_omp_for_iter * GTY((length ("%h.collapse"))) iter;
@end smallexample
In this case, @code{iter} has been allocated by writing something like
@smallexample
  x->iter = ggc_alloc_cleared_vec_gimple_omp_for_iter (collapse);
@end smallexample
and the @code{collapse} field provides the length of the array.

This second use of @code{length} also works on global variables, like:
@verbatim
static GTY((length("reg_known_value_size"))) rtx *reg_known_value;
@end verbatim

Note that the @code{length} option is only meant for use with arrays of
non-atomic objects, that is, objects that contain pointers pointing to
other GTY-managed objects.  For other GC-allocated arrays and strings
you should use @code{atomic}.

@findex skip
@item skip

If @code{skip} is applied to a field, the type machinery will ignore it.
This is somewhat dangerous; the only safe use is in a union when one
field really isn't ever used.

@findex for_user
@item for_user

Use this to mark types that need to be marked by user GC routines, but
are not referred to in a template argument.  So if you have some user GC
type @code{T1} and a non-user GC type @code{T2}, you can give @code{T2}
the @code{for_user} option so that the marking functions for @code{T1}
can call non-mangled functions to mark @code{T2}.

@findex desc
@findex tag
@findex default
@item desc ("@var{expression}")
@itemx tag ("@var{constant}")
@itemx default

The type machinery needs to be told which field of a @code{union} is
currently active.  This is done by giving each field a constant
@code{tag} value, and then specifying a discriminator using @code{desc}.
The value of the expression given by @code{desc} is compared against
each @code{tag} value, each of which should be different.  If no
@code{tag} is matched, the field marked with @code{default} is used if
there is one, otherwise no field in the union will be marked.

In the @code{desc} option, the ``current structure'' is the union that
it discriminates.  Use @code{%1} to mean the structure containing it.
There are no escapes available to the @code{tag} option, since it is a
constant.

For example,
@smallexample
struct GTY(()) tree_binding
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
  tree value;
@};
@end smallexample

In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied to a
@code{struct tree_binding *} is presumed to be 0 or 1.  If 1, the type
mechanism will treat the field @code{level} as being present and if 0,
will treat the field @code{scope} as being present.

The @code{desc} and @code{tag} options can also be used for inheritance
to denote which subclass an instance is.  See @ref{Inheritance and GTY}
for more information.

@findex cache
@item cache

When the @code{cache} option is applied to a global variable,
@code{gt_clear_cache} is called on that variable between the mark and
sweep phases of garbage collection.  The @code{gt_clear_cache} function
is free to mark blocks as used, or to clear pointers in the variable.

@findex deletable
@item deletable

@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable; it can just be set to @code{NULL} instead.  This is used
to keep a list of free structures around for re-use.

@findex maybe_undef
@item maybe_undef

When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this field points to is never defined, so long as
this field is always @code{NULL}.  This is used to avoid requiring
backends to define certain optional structures.  It doesn't work with
language frontends.

@findex nested_ptr
@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")

The type machinery expects all pointers to point to the start of an
object.  Sometimes for abstraction purposes it's convenient to have
a pointer which points inside an object.  So long as it's possible to
convert the original object to and from the pointer, such pointers
can still be used.  @var{type} is the type of the original object,
the @var{to expression} returns the pointer given the original object,
and the @var{from expression} returns the original object given
the pointer.  The pointer will be available using the @code{%h}
escape.

@findex chain_next
@findex chain_prev
@findex chain_circular
@item chain_next ("@var{expression}")
@itemx chain_prev ("@var{expression}")
@itemx chain_circular ("@var{expression}")

It's helpful for the type machinery to know if objects are often
chained together in long lists; this lets it generate code that uses
less stack space by iterating along the list instead of recursing down
it.  @code{chain_next} is an expression for the next item in the list,
@code{chain_prev} is an expression for the previous item.
For singly
linked lists, use only @code{chain_next}; for doubly linked lists, use
both.  The machinery requires that taking the next item of the
previous item gives the original item.  @code{chain_circular} is similar
to @code{chain_next}, but can be used for circular singly linked lists.

@findex reorder
@item reorder ("@var{function name}")

Some data structures depend on the relative ordering of pointers.  If
the precompiled header machinery needs to change that ordering, it
will call the function referenced by the @code{reorder} option, before
changing the pointers in the object that's pointed to by the field the
option applies to.  The function must take four arguments, with the
signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
The first parameter is a pointer to the structure that contains the
object being updated, or the object itself if there is no containing
structure.  The second parameter is a cookie that should be ignored.
The third parameter is a routine that, given a pointer, will update it
to its correct new value.  The fourth parameter is a cookie that must
be passed as the second argument to that routine.

PCH cannot handle data structures that depend on the absolute values
of pointers.  @code{reorder} functions can be expensive.  When
possible, it is better to depend on properties of the data, like an ID
number or the hash of a string instead.

@findex atomic
@item atomic

The @code{atomic} option can only be used with pointers.  It informs
the GC machinery that the memory that the pointer points to does not
contain any pointers, and hence it should be treated by the GC and PCH
machinery as an ``atomic'' block of memory that does not need to be
examined when scanning memory for pointers.
In particular, the
machinery will not scan that memory for pointers to mark them as
reachable (when marking pointers for GC) or to relocate them (when
writing a PCH file).

The @code{atomic} option differs from the @code{skip} option.
@code{atomic} keeps the memory under garbage collection, but makes the
GC ignore the contents of the memory.  @code{skip} is more drastic in
that it causes the pointer and the memory to be completely ignored by
the garbage collector.  So, memory marked as @code{atomic} is
automatically freed when no longer reachable, while memory marked as
@code{skip} is not.

The @code{atomic} option must be used with great care, because all
sorts of problems can occur if it is used incorrectly, that is, if the
memory the pointer points to does actually contain a pointer.

Here is an example of how to use it:
@smallexample
struct GTY(()) my_struct @{
  int number_of_elements;
  unsigned int * GTY ((atomic)) elements;
@};
@end smallexample
In this case, @code{elements} is a pointer under GC, and the memory it
points to needs to be allocated using the garbage collector, and will
be freed automatically by the garbage collector when it is no longer
referenced.  But the memory that the pointer points to is an array of
@code{unsigned int} elements, and the GC must not try to scan it to
find pointers to mark or relocate, which is why it is marked with the
@code{atomic} option.

Note that, currently, global variables cannot be marked with
@code{atomic}; only fields of a struct can.  This is a known
limitation.  It would be useful to be able to mark global pointers
with @code{atomic} to make the PCH machinery aware of them so that
they are saved and restored correctly to PCH files.

@findex special
@item special ("@var{name}")

The @code{special} option is used to mark types that have to be dealt
with by special case machinery.
The parameter is the name of the
special case.  See @file{gengtype.c} for further details.  Avoid
adding new special cases unless there is no other alternative.

@findex user
@item user

The @code{user} option indicates that the code to mark structure
fields is completely handled by user-provided routines.  See section
@ref{User GC} for details on what functions need to be provided.
@end table

@node Inheritance and GTY
@section Support for inheritance
@code{gengtype} has some support for simple class hierarchies.  You can
use this to have @code{gengtype} autogenerate marking routines, provided
the following conditions hold:

@itemize @bullet
@item
There is a concrete base class, with a discriminator expression
that can be used to identify which subclass an instance is.
@item
Only single inheritance is used.
@item
None of the classes within the hierarchy are templates.
@end itemize

If your class hierarchy does not fit this pattern, you must provide your
own marking routines instead (@pxref{User GC}).

The base class and its discriminator must be identified using the
@code{desc} option.  Each concrete subclass must use the @code{tag}
option to identify which value of the discriminator it corresponds to.

Every class in the hierarchy must have a @code{GTY(())} marker, as
@code{gengtype} will only attempt to parse classes that have such a
marker.@footnote{Classes lacking such a marker will not be identified as
being part of the hierarchy, and so the marking routines will not handle
them, leading to an assertion failure within the marking routines due to
an unknown tag value (assuming that assertions are enabled).}

@smallexample
class GTY((desc("%h.kind"), tag("0"))) example_base
@{
public:
    int kind;
    tree a;
@};

class GTY((tag("1"))) some_subclass : public example_base
@{
public:
    tree b;
@};

class GTY((tag("2"))) some_other_subclass : public example_base
@{
public:
    tree c;
@};
@end smallexample

The generated marking routines for the above will contain a
@code{switch} on @code{kind}, visiting all appropriate fields.  For
example, if @code{kind} is 2, it will cast to @code{some_other_subclass}
and visit fields @code{a} (inherited from @code{example_base}) and
@code{c}.

@node User GC
@section Support for user-provided GC marking routines
@cindex user gc
The garbage collector supports types for which no automatic marking
code is generated.  For these types, the user is required to provide
three functions: one to act as a marker for garbage collection, and
two functions to act as marker and pointer walker for precompiled
headers.

Given a structure @code{struct GTY((user)) my_struct}, the following functions
should be defined to mark @code{my_struct}:

@smallexample
void gt_ggc_mx (my_struct *p)
@{
  /* This marks field 'fld'.  */
  gt_ggc_mx (p->fld);
@}

void gt_pch_nx (my_struct *p)
@{
  /* This marks field 'fld'.  */
  gt_pch_nx (p->fld);
@}

void gt_pch_nx (my_struct *p, gt_pointer_operator op, void *cookie)
@{
  /* For every field 'fld', call the given pointer operator.  */
  op (&(p->fld), cookie);
@}
@end smallexample

In general, each marker @code{M} should call @code{M} for every
pointer field in the structure.  Fields that are not allocated in GC
or are not pointers must be ignored.

For embedded lists (e.g., structures with a @code{next} or @code{prev}
pointer), the marker must follow the chain and mark every element in
it.

Note that the rules for the pointer walker
@code{gt_pch_nx (my_struct *, gt_pointer_operator, void *)} are slightly
different.  In this case, the operation @code{op} must be applied to the
@emph{address} of every pointer field.

@subsection User-provided marking routines for template types
When a template type @code{TP} is marked with @code{GTY}, all
instances of that type are considered user-provided types.  This means
that the individual instances of @code{TP} do not need to be marked
with @code{GTY}.  The user needs to provide template functions to mark
all the fields of the type.

The following code snippets represent all the functions that need to
be provided.  Note that type @code{TP} may refer to more than one
type.  In these snippets, there is only one type @code{T}, but there
could be more.

@smallexample
template<typename T>
void gt_ggc_mx (TP<T> *tp)
@{
  extern void gt_ggc_mx (T&);

  /* This marks field 'fld' of type 'T'.  */
  gt_ggc_mx (tp->fld);
@}

template<typename T>
void gt_pch_nx (TP<T> *tp)
@{
  extern void gt_pch_nx (T&);

  /* This marks field 'fld' of type 'T'.  */
  gt_pch_nx (tp->fld);
@}

template<typename T>
void gt_pch_nx (TP<T *> *tp, gt_pointer_operator op, void *cookie)
@{
  /* For every field 'fld' of 'tp' with type 'T *', call the given
     pointer operator.  */
  op (&(tp->fld), cookie);
@}

template<typename T>
void gt_pch_nx (TP<T> *tp, gt_pointer_operator op, void *cookie)
@{
  extern void gt_pch_nx (T *, gt_pointer_operator, void *);

  /* For every field 'fld' of 'tp' with type 'T', call the pointer
     walker for all the fields of T.  */
  gt_pch_nx (&(tp->fld), op, cookie);
@}
@end smallexample

Support for user-defined types is currently limited.
The following
restrictions apply:

@enumerate
@item Type @code{TP} and all the argument types @code{T} must be
marked with @code{GTY}.

@item Type @code{TP} can only have type names in its argument list.

@item The pointer walker functions are different for @code{TP<T>} and
@code{TP<T *>}.  In the case of @code{TP<T>}, references to
@code{T} must be handled by calling @code{gt_pch_nx} (which
will, in turn, walk all the pointers inside fields of @code{T}).
In the case of @code{TP<T *>}, references to @code{T *} must be
handled by calling the @code{op} function on the address of the
pointer (see the code snippets above).
@end enumerate

@node GGC Roots
@section Marking Roots for the Garbage Collector
@cindex roots, marking
@cindex marking roots

In addition to keeping track of types, the type machinery also locates
the global variables (@dfn{roots}) that the garbage collector starts
at.  Roots must be declared using one of the following syntaxes:

@itemize @bullet
@item
@code{extern GTY(([@var{options}])) @var{type} @var{name};}
@item
@code{static GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
The syntax
@itemize @bullet
@item
@code{GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
is @emph{not} accepted.  There should be an @code{extern} declaration
of such a variable in a header somewhere---mark that, not the
definition.  Or, if the variable is only used in one file, make it
@code{static}.
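
As a hedged illustration of these rules, the declarations below sketch
the accepted forms.  The variable names @code{my_shared_root} and
@code{my_local_root} are hypothetical and exist only for this example;
@code{tree} stands in for any pointer type to garbage-collected memory.

@smallexample
/* In a header: the extern declaration carries the marker.  */
extern GTY(()) tree my_shared_root;

/* In one source file: the definition itself is left unmarked.  */
tree my_shared_root;

/* A root used in a single file is made static and marked directly.  */
static GTY(()) tree my_local_root;
@end smallexample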

@node Files
@section Source Files Containing Type Information
@cindex generated files
@cindex files, generated

Whenever you add @code{GTY} markers to a source file that previously
had none, or create a new source file containing @code{GTY} markers,
you need to do the following:

@enumerate
@item
You need to add the file to the list of source files the type
machinery scans.  There are four cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, add the filename to the
@code{GTFILES} variable in @file{Makefile.in}.

@item
For files that are part of one front end, add the filename to the
@code{gtfiles} variable defined in the appropriate
@file{config-lang.in}.
Headers should appear before non-headers in this list.

@item
For files that are part of some but not all front ends, add the
filename to the @code{gtfiles} variable of @emph{all} the front ends
that use it.
@end enumerate

@item
If the file is a header file, you'll need to check that it's included
in the right place to be visible to the generated files.  For a back-end
header file, this should be done automatically.  For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}.  For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{cp/parser.c} is called @file{gt-cp-parser.h}.  The
generated header file should be included after everything else in the
source file.  Don't forget to mention this file as a dependency in the
@file{Makefile}!

@end enumerate

For language frontends, there is another file that needs to be included
somewhere.  It will be called @file{gtype-@var{lang}.h}, where
@var{lang} is the name of the subdirectory the language is contained in.

Plugins can add additional root tables.  Run the @code{gengtype}
utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
@var{file-list} @var{plugin*.c}} with your plugin files
@var{plugin*.c} using @code{GTY} to generate the @file{pluginout.h} file.
The GCC build tree must be present in that mode.


@node Invoking the garbage collector
@section How to invoke the garbage collector
@cindex garbage collector, invocation
@findex ggc_collect

The GCC garbage collector GGC is only invoked explicitly.  In contrast
with many other garbage collectors, it is not implicitly invoked by
allocation routines when a lot of memory has been consumed.  So the
only way to have GGC reclaim storage is to call the @code{ggc_collect}
function explicitly.  This call is an expensive operation, as it may
have to scan the entire heap.  Beware that, unlike many other garbage
collectors, GGC does not follow local variables (on the GCC call stack)
during such an invocation: you should reference all your data from static
or external @code{GTY}-ed variables, and it is advised to call
@code{ggc_collect} with a shallow call stack.  The GGC is an exact mark
and sweep garbage collector (so it does not scan the call stack for
pointers).
In practice GCC passes don't often call @code{ggc_collect}
themselves, because it is called by the pass manager between passes.

At the time of the @code{ggc_collect} call all pointers in the GC-marked
structures must be valid or @code{NULL}.  In practice this means that
there should not be uninitialized pointer fields in the structures even
if your code never reads or writes those fields for a particular
instance.  One way to ensure this is to use cleared versions of
allocators unless all the fields are initialized manually immediately
after allocation.

@node Troubleshooting
@section Troubleshooting the garbage collector
@cindex garbage collector, troubleshooting

With the current garbage collector implementation, most issues should
show up as GCC compilation errors.  Some of the most commonly
encountered issues are described below.

@itemize @bullet
@item @code{gengtype} does not produce allocators for a @code{GTY}-marked type.
@code{gengtype} checks if there is at least one possible path from GC roots to
at least one instance of each type before outputting allocators.  If
there is no such path, the @code{GTY} markers will be ignored and no
allocators will be output.  Solve this by making sure that there exists
at least one such path.  If creating it is unfeasible or raises a ``code
smell'', consider whether you really must use GC for allocating such a type.

@item Link-time errors about undefined @code{gt_ggc_r_foo_bar} and
similarly-named symbols.  Check if your @file{foo_bar} source file has
@code{#include "gt-foo_bar.h"} as its very last line.

@end itemize
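
As a rough sketch of the layout that the last item asks for, a
hypothetical source file @file{foo_bar.c} might be organized as follows.
The file name and the variable @code{foo_bar_cache} are assumptions made
for illustration only, and the elided parts stand for the file's ordinary
includes and code.

@smallexample
/* foo_bar.c (hypothetical file).  */
@dots{}

/* A root defined in this file.  */
static GTY(()) tree foo_bar_cache;

@dots{}

/* Header generated by gengtype; it must come after everything else.  */
#include "gt-foo_bar.h"
@end smallexample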