@c Copyright (C) 2002, 2003, 2004
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@node Type Information
@chapter Memory Management and Type Information
@cindex GGC
@findex GTY

GCC uses some fairly sophisticated memory management techniques, which
involve determining information about GCC's data structures from GCC's
source code and using this information to perform garbage collection and
implement precompiled headers.

A full C parser would be too complicated for this task, so a limited
subset of C is interpreted and special markers are used to determine
what parts of the source to look at. All @code{struct} and
@code{union} declarations that define data structures that are
allocated under control of the garbage collector must be marked. All
global variables that hold pointers to garbage-collected memory must
also be marked. Finally, all global variables that need to be saved
and restored by a precompiled header must be marked. (The precompiled
header mechanism can only save static variables if they're scalar.
Complex data structures must be allocated in garbage-collected memory
to be saved in a precompiled header.)

The full format of a marker is
@smallexample
GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
@end smallexample
@noindent
but in most cases no options are needed. The outer double parentheses
are still necessary, though: @code{GTY(())}. Markers can appear:

@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize

Here are some examples of marking simple data structures and globals.

@smallexample
struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@};

typedef struct @var{tag} GTY(())
@{
  @var{fields}@dots{}
@} *@var{typename};

static GTY(()) struct @var{tag} *@var{list};   /* @r{points to GC memory} */
static GTY(()) int @var{counter};        /* @r{save counter in a PCH} */
@end smallexample

The parser understands simple typedefs such as
@code{typedef struct @var{tag} *@var{name};} and
@code{typedef int @var{name};}.
These don't need to be marked.

@menu
* GTY Options::         What goes inside a @code{GTY(())}.
* GGC Roots::           Making global variables GGC roots.
* Files::               How the generated files work.
@end menu

@node GTY Options
@section The Inside of a @code{GTY(())}

Sometimes the C code is not enough to fully describe the type
structure. Extra information can be provided with @code{GTY} options
and additional markers. Some options take a parameter, which may be
either a string or a type name, depending on the parameter. If an
option takes no parameter, it is acceptable either to omit the
parameter entirely, or to provide an empty string as a parameter. For
example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
equivalent.

When the parameter is a string, often it is a fragment of C code. Four
special escapes may be used in these strings, to refer to pieces of
the data structure being marked:

@cindex % in GTY option
@table @code
@item %h
The current structure.
@item %1
The structure that immediately contains the current structure.
@item %0
The outermost structure that contains the current structure.
@item %a
A partial expression of the form @code{[i1][i2]...} that indexes
the array item currently being marked.
@end table

For instance, suppose that you have a structure of the form
@smallexample
struct A @{
  ...
@};
struct B @{
  struct A foo[12];
@};
@end smallexample
@noindent
and @code{b} is a variable of type @code{struct B}. When marking
@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
would expand to @samp{[11]}.

As in ordinary C, adjacent strings will be concatenated; this is
helpful when you have a complicated expression.
@smallexample
@group
GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
                  " ? TYPE_NEXT_VARIANT (&%h.generic)"
                  " : TREE_CHAIN (&%h.generic)")))
@end group
@end smallexample

The available options are:

@table @code
@findex length
@item length ("@var{expression}")

There are two cases in which the type machinery needs to be told
explicitly the length of an array. The first case is when a structure
ends in a variable-length array, like this:
@smallexample
struct rtvec_def GTY(()) @{
  int num_elem;         /* @r{number of elements} */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
@};
@end smallexample

In this case, the @code{length} option is used to override the specified
array length (which should usually be @code{1}). The parameter of the
option is a fragment of C code that calculates the length.

The second case is when a structure or a global variable contains a
pointer to an array, like this:
@smallexample
tree *
  GTY ((length ("%h.regno_pointer_align_length"))) regno_decl;
@end smallexample
In this case, @code{regno_decl} has been allocated by writing something like
@smallexample
  x->regno_decl =
    ggc_alloc (x->regno_pointer_align_length * sizeof (tree));
@end smallexample
and the @code{length} provides the length of the field.

This second use of @code{length} also works on global variables, like:
@verbatim
  static GTY((length ("reg_base_value_size")))
    rtx *reg_base_value;
@end verbatim

@findex skip
@item skip

If @code{skip} is applied to a field, the type machinery will ignore it.
This is somewhat dangerous; the only safe use is in a union when one
field really isn't ever used.

@findex desc
@findex tag
@findex default
@item desc ("@var{expression}")
@itemx tag ("@var{constant}")
@itemx default

The type machinery needs to be told which field of a @code{union} is
currently active. This is done by giving each field a constant
@code{tag} value, and then specifying a discriminator using @code{desc}.
The value of the expression given by @code{desc} is compared against
each @code{tag} value, each of which should be different. If no
@code{tag} is matched, the field marked with @code{default} is used if
there is one; otherwise no field in the union will be marked.

In the @code{desc} option, the ``current structure'' is the union that
it discriminates. Use @code{%1} to mean the structure containing it.
There are no escapes available to the @code{tag} option, since it is a
constant.

For example,
@smallexample
struct tree_binding GTY(())
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
  tree value;
@};
@end smallexample

In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied to a
@code{struct tree_binding *} is presumed to be 0 or 1. If 1, the type
mechanism will treat the field @code{level} as being present and if 0,
will treat the field @code{scope} as being present.

@findex param_is
@findex use_param
@item param_is (@var{type})
@itemx use_param

Sometimes it's convenient to define some data structure to work on
generic pointers (that is, @code{PTR}) and then use it with a specific
type. @code{param_is} specifies the real type pointed to, and
@code{use_param} says where in the generic data structure that type
should be put.

For instance, to have a @code{htab_t} that points to trees, one would
write the definition of @code{htab_t} like this:
@smallexample
typedef struct GTY(()) @{
  @dots{}
  void ** GTY ((use_param, @dots{})) entries;
  @dots{}
@} htab_t;
@end smallexample
and then declare variables like this:
@smallexample
  static htab_t GTY ((param_is (union tree_node))) ict;
@end smallexample

@findex param@var{n}_is
@findex use_param@var{n}
@item param@var{n}_is (@var{type})
@itemx use_param@var{n}

In more complicated cases, the data structure might need to work on
several different types, which might not necessarily all be pointers.
For this, @code{param1_is} through @code{param9_is} may be used to
specify the real type of a field identified by @code{use_param1} through
@code{use_param9}.

@findex use_params
@item use_params

When a structure contains another structure that is parameterized,
there's no need to do anything special; the inner structure inherits the
parameters of the outer one. When a structure contains a pointer to a
parameterized structure, the type machinery won't automatically detect
this (it could, it just doesn't yet), so it's necessary to tell it that
the pointed-to structure should use the same parameters as the outer
structure. This is done by marking the pointer with the
@code{use_params} option.

@findex deletable
@item deletable

@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable; it can just be set to @code{NULL} instead. This is used
to keep a list of free structures around for re-use.

@findex if_marked
@item if_marked ("@var{expression}")

Suppose you want some kinds of object to be unique, and so you put them
in a hash table. If garbage collection marks the hash table, these
objects will never be freed, even if the last other reference to them
goes away. GGC has special handling to deal with this: if you use the
@code{if_marked} option on a global hash table, GGC will call the
routine whose name is the parameter to the option on each hash table
entry. If the routine returns nonzero, the hash table entry will
be marked as usual. If the routine returns zero, the hash table entry
will be deleted.

The routine @code{ggc_marked_p} can be used to determine if an element
has been marked already; in fact, the usual case is to use
@code{if_marked ("ggc_marked_p")}.

@findex maybe_undef
@item maybe_undef

When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this field points to is never defined, so long as
this field is always @code{NULL}. This is used to avoid requiring
backends to define certain optional structures. It doesn't work with
language frontends.

@findex nested_ptr
@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")

The type machinery expects all pointers to point to the start of an
object. Sometimes for abstraction purposes it's convenient to have
a pointer which points inside an object. So long as it's possible to
convert the original object to and from the pointer, such pointers
can still be used.
@var{type} is the type of the original object,
the @var{to expression} returns the pointer given the original object,
and the @var{from expression} returns the original object given
the pointer. The pointer will be available using the @code{%h}
escape.

@findex chain_next
@findex chain_prev
@item chain_next ("@var{expression}")
@itemx chain_prev ("@var{expression}")

It's helpful for the type machinery to know if objects are often
chained together in long lists; this lets it generate code that uses
less stack space by iterating along the list instead of recursing down
it. @code{chain_next} is an expression for the next item in the list,
and @code{chain_prev} is an expression for the previous item. For singly
linked lists, use only @code{chain_next}; for doubly linked lists, use
both. The machinery requires that taking the next item of the
previous item gives the original item.

@findex reorder
@item reorder ("@var{function name}")

Some data structures depend on the relative ordering of pointers. If
the precompiled header machinery needs to change that ordering, it
will call the function referenced by the @code{reorder} option, before
changing the pointers in the object that's pointed to by the field the
option applies to. The function must take four arguments, with the
signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
The first parameter is a pointer to the structure that contains the
object being updated, or the object itself if there is no containing
structure. The second parameter is a cookie that should be ignored.
The third parameter is a routine that, given a pointer, will update it
to its correct new value. The fourth parameter is a cookie that must
be passed as the second argument to that routine.

PCH cannot handle data structures that depend on the absolute values
of pointers. @code{reorder} functions can be expensive.
When
possible, it is better to depend on properties of the data, like an ID
number or the hash of a string instead.

@findex special
@item special ("@var{name}")

The @code{special} option is used to mark types that have to be dealt
with by special case machinery. The parameter is the name of the
special case. See @file{gengtype.c} for further details. Avoid
adding new special cases unless there is no other alternative.
@end table

@node GGC Roots
@section Marking Roots for the Garbage Collector
@cindex roots, marking
@cindex marking roots

In addition to keeping track of types, the type machinery also locates
the global variables (@dfn{roots}) that the garbage collector starts
at. Roots must be declared using one of the following syntaxes:

@itemize @bullet
@item
@code{extern GTY(([@var{options}])) @var{type} @var{name};}
@item
@code{static GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
The syntax
@itemize @bullet
@item
@code{GTY(([@var{options}])) @var{type} @var{name};}
@end itemize
@noindent
is @emph{not} accepted. There should be an @code{extern} declaration
of such a variable in a header somewhere---mark that, not the
definition. Or, if the variable is only used in one file, make it
@code{static}.

@node Files
@section Source Files Containing Type Information
@cindex generated files
@cindex files, generated

Whenever you add @code{GTY} markers to a source file that previously
had none, or create a new source file containing @code{GTY} markers,
there are three things you need to do:

@enumerate
@item
You need to add the file to the list of source files the type
machinery scans.
There are four cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, add the filename to the
@code{GTFILES} variable in @file{Makefile.in}.

@item
For files that are part of one front end, add the filename to the
@code{gtfiles} variable defined in the appropriate
@file{config-lang.in}. For C, the file is @file{c-config-lang.in}.

@item
For files that are part of some but not all front ends, add the
filename to the @code{gtfiles} variable of @emph{all} the front ends
that use it.
@end enumerate

@item
If the file is a header file, you'll need to check that it's included
in the right place to be visible to the generated files. For a back-end
header file, this should be done automatically. For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}. For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{cp/parser.c} is called @file{gt-cp-parser.h}. The
generated header file should be included after everything else in the
source file. Don't forget to mention this file as a dependency in the
@file{Makefile}!

@end enumerate

For language frontends, there is another file that needs to be included
somewhere.
It will be called @file{gtype-@var{lang}.h}, where
@var{lang} is the name of the subdirectory the language is contained in.
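
The naming rule for the generated @file{gt-@var{path}.h} header, as
described in this section, can be sketched in C. This is only an
illustration under stated assumptions: @code{gt_header_name} is a
hypothetical helper that is not part of GCC, and it applies the rule
exactly as worded here (prefix @file{gt-}, slashes replaced by
@verb{|-|}, and, judging from the @file{cp/parser.c} example, the
extension replaced by @file{.h}). The real @file{gengtype.c} may treat
some directories differently.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: write into OUT the name of the
   generated header for SOURCE, a path relative to the gcc directory.
   Returns 0 on success, or -1 if OUT_SIZE is too small.  */
int
gt_header_name (const char *source, char *out, size_t out_size)
{
  const char *slash, *dot;
  size_t stem_len, i, n;

  assert (source != NULL && out != NULL);

  /* Drop the extension, but only if the last dot belongs to the final
     path component rather than to a directory name.  */
  slash = strrchr (source, '/');
  dot = strrchr (source, '.');
  if (dot != NULL && (slash == NULL || dot > slash))
    stem_len = (size_t) (dot - source);
  else
    stem_len = strlen (source);

  /* Room for "gt-", the stem, ".h" and the terminating NUL.  */
  if (out_size < stem_len + 6)
    return -1;

  memcpy (out, "gt-", 3);
  n = 3;
  for (i = 0; i < stem_len; i++)
    out[n++] = (source[i] == '/') ? '-' : source[i];
  memcpy (out + n, ".h", 3);   /* Also copies the terminating NUL.  */
  return 0;
}
```
@end verbatim

With a sufficiently large buffer @code{buf}, calling
@code{gt_header_name ("cp/parser.c", buf, sizeof buf)} returns 0 and
stores @file{gt-cp-parser.h} in @code{buf}.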