=head1 NAME

perlclassguts - Internals of how C<feature 'class'> and class syntax work

=head1 DESCRIPTION

This document provides in-depth information about the way in which the perl
interpreter implements the C<feature 'class'> syntax and overall behaviour.
It is not intended as an end-user guide on how to use the feature. For that,
see L<perlclass>.

The reader is assumed to be generally familiar with the perl interpreter
internals overall. For a more general overview of these details, see also
L<perlguts>.

=head1 DATA STORAGE

=head2 Classes

A class is fundamentally a package, and exists in the symbol table as an HV
with an aux structure in exactly the same way as a non-class package. It is
distinguished from a non-class package by the fact that the
C<HvSTASH_IS_CLASS()> macro will return true on it.

Extra information relating to it being a class is stored in the
C<struct xpvhv_aux> structure attached to the stash, in the following fields:

    HV          *xhv_class_superclass;
    CV          *xhv_class_initfields_cv;
    AV          *xhv_class_adjust_blocks;
    PADNAMELIST *xhv_class_fields;
    PADOFFSET    xhv_class_next_fieldix;
    HV          *xhv_class_param_map;

=over 4

=item *

C<xhv_class_superclass> will be C<NULL> for a class with no superclass. It
will point directly to the stash of the parent class if one has been set with
the C<:isa()> class attribute.

=item *

C<xhv_class_initfields_cv> will contain a C<CV *> pointing to a function to be
invoked as part of the constructor of this class or any subclass thereof. This
CV is responsible for initializing all the fields defined by this class for a
new instance. This CV will be an anonymous real function - i.e. while it has no
name and no GV, it is I<not> a protosub and may be directly invoked.

=item *

C<xhv_class_adjust_blocks> may point to an AV containing CV pointers to each of
the C<ADJUST> blocks defined on the class.
If the class has a superclass, this
array will additionally contain duplicate pointers of the CVs of its parent
class. The AV is created lazily the first time an element is pushed to it; it
is valid for there not to be one, and this pointer will be C<NULL> in that
case.

The CVs are stored directly, not via RVs. Each CV will be an anonymous real
function.

=item *

C<xhv_class_fields> will point to a C<PADNAMELIST> containing C<PADNAME>s,
each being one defined field of the class. They are stored in order of
declaration. Note however, that the index into this array will not necessarily
be equal to the C<fieldix> of each field, because in the case of a subclass,
the array will begin at zero but the index of the first field in it will be
non-zero if its parent class contains any fields at all.

For more information on how individual fields are represented, see L</Fields>.

=item *

C<xhv_class_next_fieldix> gives the field index that will be assigned to the
next field to be added to the class. It is only useful at compile-time.

=item *

C<xhv_class_param_map> may point to an HV which maps field C<:param> attribute
names to the field index of the field with that name. This mapping is copied
from parent classes; each class will contain the sum total of all its parents
in addition to its own.

=back

=head2 Fields

A field is still fundamentally a lexical variable declared in a scope, and
exists in the C<PADNAMELIST> of its corresponding CV. Methods and other
method-like CVs can still capture them exactly as they can with regular
lexicals. A field is distinguished from other kinds of pad entry in that the
C<PadnameIsFIELD()> macro will return true on it.

Extra information relating to it being a field is stored in an additional
structure accessible via the C<PadnameFIELDINFO()> macro on the padname.
This
structure has the following fields:

    PADOFFSET  fieldix;
    HV        *fieldstash;
    OP        *defop;
    SV        *paramname;
    bool       def_if_undef;
    bool       def_if_false;

=over 4

=item *

C<fieldix> stores the "field index" of the field; that is, the index into the
instance field array where this field's value will be stored. Note that the
first index in the array is not specially reserved. The first field in a class
will start from field index 0.

=item *

C<fieldstash> stores a pointer to the stash of the class that defined this
field. This is necessary in case there are multiple classes defined within the
same scope; it is used to disambiguate the fields of each.

    {
        class C1; field $x;
        class C2; field $x;
    }

=item *

C<defop> may store a pointer to a defaulting expression optree for this field.
Defaulting expressions are optional; this field may be C<NULL>.

=item *

C<paramname> may point to a regular string SV containing the C<:param> name
attribute given to the field. If none, it will be C<NULL>.

=item *

One of C<def_if_undef> and C<def_if_false> will be true if the defaulting
expression was set using the C<//=> or C<||=> operators respectively.

=back

=head2 Methods

A method is still fundamentally a CV, and has the same basic representation as
one. It has an optree and a pad, and is stored via a GV in the stash of its
containing package. It is distinguished from a non-method CV by the fact that
the C<CvIsMETHOD()> macro will return true on it.

(Note: This macro should not be confused with the one that was previously
called C<CvMETHOD()>. That one does not relate to the class system, and was
renamed to C<CvNOWARN_AMBIGUOUS()> to avoid this confusion.)

There is currently no extra information that needs to be stored about a method
CV, so the structure does not add any new fields.
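
To recap how the class and field records relate, the link between
C<xhv_class_next_fieldix>, a position in C<xhv_class_fields> and a field's
C<fieldix> (see L</Classes> and L</Fields>) can be sketched with a small
standalone model. This is an illustrative toy in plain C and not code from the
perl source: the C<toy_class> structure and the C<toy_class_init()> and
C<toy_add_field()> functions are hypothetical names, and the rule that a
subclass continues counting from its parent's next index is an assumption
inferred from the descriptions above.

    #include <assert.h>
    #include <stddef.h>

    #define TOY_MAX_FIELDS 8

    /* Hypothetical stand-in for the class-related parts of the aux struct */
    struct toy_class {
        const struct toy_class *superclass; /* NULL if there is no parent */
        size_t next_fieldix;                /* index given to the next field */
        size_t nfields;                     /* fields declared by this class */
        size_t fieldix[TOY_MAX_FIELDS];     /* fieldix of each declared field */
    };

    /* Assumption: a subclass continues numbering after its parent's fields */
    void toy_class_init(struct toy_class *cls, const struct toy_class *super)
    {
        assert(cls != NULL);
        cls->superclass   = super;
        cls->next_fieldix = super ? super->next_fieldix : 0;
        cls->nfields      = 0;
    }

    /* Records a newly declared field and returns the fieldix it was given */
    size_t toy_add_field(struct toy_class *cls)
    {
        assert(cls != NULL && cls->nfields < TOY_MAX_FIELDS);
        cls->fieldix[cls->nfields++] = cls->next_fieldix;
        return cls->next_fieldix++;
    }

In this model, if a parent class declares two fields, the first field declared
by a subclass sits at position 0 of the subclass's own list but is given field
index 2, which is why the list position and C<fieldix> must not be confused.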

=head2 Instances

Object instances are represented by an entirely new SV type, whose base type
is C<SVt_PVOBJ>. This should still be blessed into its class stash and wrapped
in an RV in the usual manner for classical objects.

As these are their own unique container type, distinct from hashes or arrays,
the core C<builtin::reftype> function returns a new value when asked about
these. That value is C<"OBJECT">.

Internally, such an object is an array of SV pointers whose size is fixed at
creation time (because the number of fields in a class is known after
compilation). An object instance stores the max field index within it (for
basic error-checking on access), and a fixed-size array of SV pointers storing
the individual field values.

Fields of array and hash type directly store AV or HV pointers into the array;
they are not stored via an intervening RV.

=head1 API

The data structures described above are supported by the following API
functions.

=head2 Class Manipulation

=head3 class_setup_stash

    void class_setup_stash(HV *stash);

Called by the parser on encountering the C<class> keyword. It upgrades the
stash into being a class and prepares it for receiving class-specific items
like methods and fields.

=head3 class_seal_stash

    void class_seal_stash(HV *stash);

Called by the parser at the end of a C<class> block, or for unit classes its
containing scope. This function performs various finalisation activities that
are required before instances of the class can be constructed, but could not
have been done until all the information about the members of the class is
known.

Any additions to or modifications of the class under compilation must be
performed between these two function calls. Classes cannot be modified once
they have been sealed.
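
The ordering constraint between C<class_setup_stash()> and
C<class_seal_stash()> can be pictured as a small state machine. The following
is only a toy model under stated assumptions, not the interpreter's
implementation: C<toy_stash>, C<toy_setup()>, C<toy_add_member()> and
C<toy_seal()> are hypothetical names, and returning a boolean to signal a
rejected operation is a simplification chosen for this sketch.

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    enum toy_state { TOY_PLAIN_PACKAGE, TOY_CLASS_OPEN, TOY_CLASS_SEALED };

    struct toy_stash {
        enum toy_state state;
        int nmembers;   /* fields, methods and ADJUST blocks added so far */
    };

    /* Models class_setup_stash(): upgrade a plain package to an open class */
    bool toy_setup(struct toy_stash *st)
    {
        assert(st != NULL);
        if (st->state != TOY_PLAIN_PACKAGE)
            return false;
        st->state    = TOY_CLASS_OPEN;
        st->nmembers = 0;
        return true;
    }

    /* Members may only be added between setup and seal */
    bool toy_add_member(struct toy_stash *st)
    {
        assert(st != NULL);
        if (st->state != TOY_CLASS_OPEN)
            return false;
        st->nmembers++;
        return true;
    }

    /* Models class_seal_stash(): nothing may be modified afterwards */
    bool toy_seal(struct toy_stash *st)
    {
        assert(st != NULL);
        if (st->state != TOY_CLASS_OPEN)
            return false;
        st->state = TOY_CLASS_SEALED;
        return true;
    }

In this model, adding a member before setup or after sealing is rejected, which
mirrors the rule that all changes to a class must happen between the two calls.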

=head3 class_add_field

    void class_add_field(HV *stash, PADNAME *pn);

Called by F<pad.c> as part of defining a new field name in the current pad.
Note that this function does I<not> create the padname; that must already be
done by F<pad.c>. This API function simply informs the class that the new
field name has been created and is now available for it.

=head3 class_add_ADJUST

    void class_add_ADJUST(HV *stash, CV *cv);

Called by the parser once it has parsed and constructed a CV for a new
C<ADJUST> block. This gets added to the list stored by the class.

=head2 Field Manipulation

=head3 class_prepare_initfield_parse

    void class_prepare_initfield_parse();

Called by the parser just before parsing an initializing expression for a
field variable. This makes use of a suspended compcv to combine all the field
initializing expressions into the same CV.

=head3 class_set_field_defop

    void class_set_field_defop(PADNAME *pn, OPCODE defmode, OP *defop);

Called by the parser after it has parsed an initializing expression for the
field. Sets the defaulting expression and mode of application. C<defmode>
should either be zero, or one of C<OP_ORASSIGN> or C<OP_DORASSIGN> depending
on the defaulting mode.

=head3 padadd_FIELD

    #define padadd_FIELD

This flag constant tells the C<pad_add_name_*> family of functions that the
new name should be added as a field. There is no need to call
C<class_add_field()>; this will be done automatically.

=head2 Method Manipulation

=head3 class_prepare_method_parse

    void class_prepare_method_parse(CV *cv);

Called by the parser after C<start_subparse()> but immediately before doing
anything else. This prepares the C<PL_compcv> for parsing a method; arranging
for the C<CvIsMETHOD> test to be true, adding the C<$self> lexical, and any
other activities that may be required.
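
Why this function has to run before anything else touches the new pad can be
sketched with a toy pad model. As the section on C<$self> access below
explains, C<$self> is expected to land at pad index 1. Everything in this
sketch is hypothetical and merely illustrative (C<toy_cv>,
C<toy_start_subparse()>, C<toy_pad_add_name()> and
C<toy_prepare_method_parse()> are invented names), and where the real function
aborts on a mismatch, the model simply returns false.

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define TOY_PAD_SIZE   8
    #define TOY_PADIX_SELF 1    /* slot expected to hold $self */

    struct toy_cv {
        bool        is_method;              /* stands in for CvIsMETHOD() */
        size_t      padfill;                /* highest pad index in use */
        const char *padnames[TOY_PAD_SIZE]; /* slot 0 is never handed out */
    };

    /* Models a CV straight after start_subparse(): the pad is still empty */
    void toy_start_subparse(struct toy_cv *cv)
    {
        assert(cv != NULL);
        cv->is_method   = false;
        cv->padfill     = 0;
        cv->padnames[0] = NULL;
    }

    /* Appends a lexical name and returns the pad index it was given */
    size_t toy_pad_add_name(struct toy_cv *cv, const char *name)
    {
        assert(cv != NULL && cv->padfill + 1 < TOY_PAD_SIZE);
        cv->padnames[++cv->padfill] = name;
        return cv->padfill;
    }

    /* Models class_prepare_method_parse(): flag the CV and claim slot 1 */
    bool toy_prepare_method_parse(struct toy_cv *cv)
    {
        assert(cv != NULL);
        cv->is_method = true;
        return toy_pad_add_name(cv, "$self") == TOY_PADIX_SELF;
    }

In this model, calling C<toy_prepare_method_parse()> on a fresh pad succeeds,
whereas calling it after some other lexical has already been added reports
failure because C<$self> would no longer sit at index 1.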

=head3 class_wrap_method_body

    OP *class_wrap_method_body(OP *o);

Called by the parser at the end of parsing a method body into an optree but
just before wrapping it in the eventual CV. This function inserts extra ops
into the optree to make the method work correctly.

=head2 Object Instances

=head3 SVt_PVOBJ

    #define SVt_PVOBJ

An SV type constant used for comparison with the C<SvTYPE()> macro.

=head3 ObjectMAXFIELD

    SSize_t ObjectMAXFIELD(sv);

A function-like macro that obtains the maximum valid field index that can be
accessed from the C<ObjectFIELDS> array.

=head3 ObjectFIELDS

    SV **ObjectFIELDS(sv);

A function-like macro that obtains the fields array directly out of an object
instance. Fields can be accessed by their field index, from 0 up to the maximum
valid index given by C<ObjectMAXFIELD>.

=head1 OPCODES

=head2 OP_METHSTART

    newUNOP_AUX(OP_METHSTART, ...);

An C<OP_METHSTART> is an C<UNOP_AUX> which must be present at the start of a
method CV in order to make it work properly. This is inserted by
C<class_wrap_method_body()>, and even appears before any optree fragment
associated with signature argument checking or extraction.

This op is responsible for shifting the value of C<$self> out of the arguments
list and binding any field variables that the method requires access to into
the pad. The AUX vector will contain details of the field/pad index pairings
required.

This op also performs sanity checking on the invocant value. It checks that it
is definitely an object reference of a compatible class type. If not, an
exception is thrown.

If the C<op_private> field includes the C<OPpINITFIELDS> flag, this indicates
that the op begins the special C<xhv_class_initfields_cv> CV.
In this case it
should additionally take the second value from the arguments list, which
should be a plain HV pointer (I<directly>, not via RV), and bind it to the
second pad slot, where the generated optree will expect to find it.

=head2 OP_INITFIELD

An C<OP_INITFIELD> is only invoked as part of the C<xhv_class_initfields_cv>
CV during the construction phase of an instance. This is the time that the
individual SVs that make up the mutable fields of the instance (including AVs
and HVs) are actually assigned into the C<ObjectFIELDS> array. The
C<OPpINITFIELD_AV> and C<OPpINITFIELD_HV> private flags indicate whether it is
creating an AV or HV; if neither is set then an SV is created.

If the op has the C<OPf_STACKED> flag it expects to find an initializing value
on the stack. For SVs this is the topmost SV on the data stack. For AVs and
HVs it expects a marked list.

=head1 COMPILE-TIME BEHAVIOUR

=head2 C<ADJUST> Phasers

During compile-time, parsing of an C<ADJUST> phaser is handled in a
fundamentally different way to the existing perl phasers (C<BEGIN>, etc.).

Rather than taking the usual route, the tokenizer recognises that the
C<ADJUST> keyword introduces a phaser block. The parser then parses the body
of this block similarly to how it would parse an (anonymous) method body,
creating a CV that has no name GV. This is then inserted directly into the
class information by calling C<class_add_ADJUST>, entirely bypassing the
symbol table.

=head2 Attributes

During compilation, attributes of both classes and fields are handled in a
different way to existing perl attributes on subroutines and lexical
variables.

The parser still forms an C<OP_LIST> optree of C<OP_CONST> nodes, but these
are passed to the C<class_apply_attributes> or C<class_apply_field_attributes>
functions.
Rather than using a class lookup for a method in the class being
parsed, a fixed internal list of known attributes is used to find functions to
apply the attribute to the class or field. In future this may support
user-supplied extension attributes, though at present it only recognises ones
defined by the core itself.

=head2 Field Initializing Expressions

During compilation, the parser makes use of a suspended compcv when parsing
the defaulting expression for a field. All the expressions for all the fields
in the class share the same suspended compcv, which is then compiled up into
the same internal CV called by the constructor to initialize all the fields
provided by that class.

=head1 RUNTIME BEHAVIOUR

=head2 Constructor

The generated constructor for a class itself is an XSUB which performs three
tasks in order: it creates the instance SV itself, invokes the field
initializers, then invokes the ADJUST block CVs. The constructor for any class
is always the same basic shape, regardless of whether the class has a
superclass or not.

The field initializers are collected into a generated optree-based CV called
the field initializer CV. This is the CV which contains all the optree
fragments for the field initializing expressions. When invoked, the field
initializer CV might make a chained call to the superclass initializer if one
exists, before invoking all of the individual field initialization ops. The
field initializer CV is invoked with two items on the stack; being the
instance SV and a direct HV containing the constructor parameters. Note
carefully: this HV is passed I<directly>, not via an RV reference. This is
permitted because both the caller and the callee are directly generated code
and not arbitrary pure-perl subroutines.

The ADJUST block CVs are all collected into a single flat list, merging all of
the ones defined by the superclass as well.
They are all invoked in order,
after the field initializer CV.

=head2 C<$self> Access During Methods

When C<class_prepare_method_parse()> is called, it arranges that the pad of
the new CV body will begin with a lexical called C<$self>. Because the pad
should be freshly-created at this point, this will have the pad index of 1.
The function checks this and aborts if that is not true.

Because of this fact, code within the body of a method or method-like CV can
reliably use pad index 1 to obtain the invocant reference. The C<OP_INITFIELD>
opcode also relies on this fact.

In similar fashion, during the C<xhv_class_initfields_cv> the next pad slot is
relied on to store the constructor parameters HV, at pad index 2.

=head1 AUTHORS

Paul Evans

=cut