=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Packages
X<package> X<namespace> X<variable, global> X<global variable> X<global>

Perl provides a mechanism for alternative namespaces to protect
packages from stomping on each other's variables. In fact, there's
really no such thing as a global variable in Perl. The package
statement declares the compilation unit as being in the given
namespace. The scope of the package declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my() and
local() operators). Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that if unqualified,
default to the main package instead of the current one as described
below. A package statement affects only dynamic variables--including
those you've used local() on--but I<not> lexical variables created
with my(). Typically it would be the first declaration in a file
included by the C<do>, C<require>, or C<use> operators. You can
switch into a package in more than one place; it merely influences
which symbol table is used by the compiler for the rest of that
block. You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>. If the package name is null, the
C<main> package is assumed. That is, C<$::sail> is equivalent to
C<$main::sail>.

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros. It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what was going on.
Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.
X<::> X<'>

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>. This implies nothing about the order of
name lookups, however. There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down. For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>. C<INNER> refers to a totally
separate global package.

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table. All other symbols are kept in package
C<main>, including all punctuation variables, like $_. In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in ones. If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would be instead interpreted
as a pattern match, a substitution, or a transliteration.
X<variable, punctuation>

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
However, variables and functions named with a single C<_>, such as
$_ and C<sub _>, are still forced into the package C<main>. See also
L<perlvar/"Technical Note on the Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled.
(Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package. Qualify the signal handler
name if you wish to have a signal handler in a package.) For an
example, examine F<perl5db.pl> in the Perl library. It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug. At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from). See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variable names.

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables
X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias>

The symbol table for a package happens to be stored in the hash of that
name with two colons appended. The main symbol table's name is thus
C<%main::>, or C<%::> for short. Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation.

    local *main::foo = *main::bar;

You can use this to print out all the variables in a package, for
instance. The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>.
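
Whole-glob aliasing of this kind can be sketched as follows. This is
only an illustration: the variables and the subroutine named C<richard>
are hypothetical, and C<our> is used solely so that the example compiles
under C<use strict>.

```perl
    use strict;
    use warnings;

    # Hypothetical package variables and a sub that share one name.
    our $richard = 'scalar';
    our @richard = (1, 2, 3);
    sub richard { return 'sub' }

    # Declare the alias names so they are legal under "use strict".
    our ($dick, @dick);

    # Alias the whole typeglob: every slot of *richard is now
    # reachable through *dick as well.
    *dick = *richard;

    $dick = 'changed';      # writes through to $richard
    push @dick, 4;          # writes through to @richard

    print "$richard\n";             # changed
    print scalar(@richard), "\n";   # 4
    print dick(), "\n";             # sub
```

Because the globs themselves are aliased, no slot keeps a separate
identity: the scalar, the array and the subroutine are all shared.
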
If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

Which makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays. Tricky, eh?

There is one subtle difference between the following statements:

    *foo = *bar;
    *foo = \$bar;

C<*foo = *bar> makes the typeglobs themselves synonymous while
C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs
refer to the same scalar value. This means that the following code:

    $bar = 1;
    *foo = \$bar;       # Make $foo an alias for $bar

    {
        local $bar = 2; # Restrict changes to block
        print $foo;     # Prints '1'!
    }

Would print '1', because C<$foo> holds a reference to the I<original>
C<$bar>, the one that was stuffed away by C<local()> and which will be
restored when the block ends. Because variables are accessed through the
typeglob, you can use C<*foo = *bar> to create an alias which can be
localized. (But be aware that this means you can't have a separate
C<@foo> and C<@bar>, etc.)

What makes all of this important is that the Exporter module uses glob
aliasing as the import/export mechanism. Whether or not you can properly
localize a variable that has been exported from a module depends on how
it was exported:

    @EXPORT = qw($FOO); # Usual form, can't be localized
    @EXPORT = qw(*FOO); # Can be localized

You can work around the first case by using the fully qualified name
(C<$Package::FOO>) where you need a local value, or by overriding it
by saying C<*FOO = *Package::FOO> in your script.

The C<*x = \$y> mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing. It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob. This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.
X<constant> X<scalar, constant>

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time. A constant subroutine is one prototyped
to take no arguments and to return a constant expression. See
L<perlsub> for details on these. The C<use constant> pragma is a
convenient shorthand for these.

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from. This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo. See L<perlref>.

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy. You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

    {
        package Some_package;
        sub foo { ... }
    }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
        print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).

=head2 BEGIN, UNITCHECK, CHECK, INIT and END
X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END>

Five specially named code blocks are executed at the beginning and at
the end of a running Perl program. These are the C<BEGIN>,
C<UNITCHECK>, C<CHECK>, C<INIT>, and C<END> blocks.

These code blocks can be prefixed with C<sub> to give the appearance of a
subroutine (although this is not considered good style). One should note
that these code blocks don't really exist as named subroutines (despite
their appearance). The thing that gives this away is the fact that you can
have B<more than one> of these code blocks in a program, and they will get
B<all> executed at the appropriate moment. So you can't execute any of
these code blocks by name.

A C<BEGIN> code block is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file (or
string) is parsed. You may have multiple C<BEGIN> blocks within a file (or
eval'ed string); they will execute in order of definition.
Because a C<BEGIN>
code block executes immediately, it can pull in definitions of subroutines
and such from other files in time to be visible to the rest of the compile
and run time. Once a C<BEGIN> has run, it is immediately undefined and any
code it used is returned to Perl's memory pool.

An C<END> code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
exits, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).) You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO). C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.

Note that C<END> code blocks are B<not> executed at the end of a string
C<eval()>: if any C<END> code blocks are created in a string C<eval()>,
they will be executed just as any other C<END> code block of that package
in LIFO order just before the interpreter exits.

Inside an C<END> code block, C<$?> contains the value that the program is
going to pass to C<exit()>. You can modify C<$?> to change the exit
value of the program. Beware of changing C<$?> by accident (e.g. by
running something via C<system>).
X<$?>

C<UNITCHECK>, C<CHECK> and C<INIT> code blocks are useful to catch the
transition between the compilation phase and the execution phase of
the main program.

C<UNITCHECK> blocks are run just after the unit which defined them has
been compiled.
The main program file and each module it loads are
compilation units, as are string C<eval>s, code compiled using the
C<(?{ })> construct in a regex, calls to C<do FILE>, C<require FILE>,
and code after the C<-e> switch on the command line.

C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends
and before the run time begins, in LIFO order. C<CHECK> code blocks are used
in the Perl compiler suite to save the compiled state of the program.

C<INIT> blocks are run just before the Perl runtime begins execution, in
"first in, first out" (FIFO) order.

The C<CHECK> and C<INIT> code blocks will not be executed inside a string
eval(), if that eval() happens after the end of the main compilation
phase; that can be a problem in mod_perl and other persistent environments
which use C<eval STRING> to load code at runtime.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.

The B<begincheck> program makes it all clear, eventually:

    #!/usr/bin/perl

    # begincheck

    print "10. Ordinary code runs at runtime.\n";

    END { print "16. So this is the end of the tale.\n" }
    INIT { print " 7. INIT blocks run FIFO just before runtime.\n" }
    UNITCHECK {
        print " 4. And therefore before any CHECK blocks.\n"
    }
    CHECK { print " 6. So this is the sixth line.\n" }

    print "11. It runs in order, of course.\n";

    BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
    END { print "15. Read perlmod for the rest of the story.\n" }
    CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" }
    INIT { print " 8. Run this again, using Perl's -c switch.\n" }

    print "12. This is anti-obfuscated code.\n";

    END { print "14. END blocks run LIFO at quitting time.\n" }
    BEGIN { print " 2. So this line comes out second.\n" }
    UNITCHECK {
        print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n"
    }
    INIT { print " 9. You'll see the difference right away.\n" }

    print "13. It merely _looks_ like it should be confusing.\n";

    __END__

=head2 Perl Classes
X<class> X<@ISA>

There is no special class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods. Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which
must be a package global, not a lexical).

For more on this, see L<perltoot> and L<perlobj>.

=head2 Perl Modules
X<module>

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file. It is specifically
designed to be reusable by other modules or programs. It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it, or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything. Or it can do a little of both.
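
The class-style alternative, where nothing is exported and everything
is reached through method calls, might look like the minimal sketch
below. The package names C<Counter> and C<Counter::ByTwo> and their
methods are hypothetical, and both packages are kept in one file only
to make the example self-contained.

```perl
    use strict;
    use warnings;

    # Hypothetical base class: a package whose subs are used as methods.
    package Counter;

    sub new {
        my ($class, $start) = @_;
        my $self = { count => $start || 0 };
        return bless $self, $class;
    }

    sub inc   { my $self = shift; return ++$self->{count} }
    sub value { my $self = shift; return $self->{count} }

    # Hypothetical subclass: inherits new() and value() through @ISA,
    # which has to be a package variable rather than a my() lexical.
    package Counter::ByTwo;

    our @ISA = ('Counter');

    sub inc { my $self = shift; return $self->{count} += 2 }

    package main;

    my $c = Counter::ByTwo->new(10);
    $c->inc();
    print $c->value(), "\n";    # 12
```

Nothing is imported into C<main> here; the caller only needs the class
name and the objects it hands out.
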

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        use Exporter   ();
        our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = sprintf "%d.%03d", q$Revision: 1.1 $ =~ /(\d+)/g;

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    our @EXPORT_OK;

    # exported package globals go here
    our $Var1;
    our %Hashit;

    # non-exported package globals go here
    our @more;
    our $stuff;

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func;  it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # this one isn't exported, but could be called!
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)

    ## YOUR CODE GOES HERE

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications. See L<Exporter> and L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

All Perl module files have the extension F<.pm>. The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files. Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/". The second
case does not, and would have to be specified literally. The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)
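
The name translation performed by the bareword form can be imitated by
hand, as in the rough sketch below. It assumes the core module
File::Basename is installed; C<module_to_path> is a hypothetical helper
written for this illustration and is not part of Perl.

```perl
    use strict;
    use warnings;

    # Hypothetical helper: do by hand what the bareword form of
    # require does to a module name.
    sub module_to_path {
        my ($name) = @_;
        (my $path = $name) =~ s{::}{/}g;    # "::" becomes "/"
        return "$path.pm";                  # and ".pm" is appended
    }

    require File::Basename;                 # bareword form

    # %INC records each loaded file under its "/"-separated name.
    my $key    = module_to_path('File::Basename');
    my $status = exists $INC{$key} ? 'loaded' : 'missing';

    print "$key\n";       # File/Basename.pm
    print "$status\n";    # loaded

    # The string form names the file literally.  The file is already
    # recorded in %INC, so this call does not load it a second time.
    require "File/Basename.pm";
```
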

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled. This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file. This will not work if you use C<require>
instead of C<use>. With C<require> you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution. An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module. In that case, it's easy to use C<require> instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>. But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems. Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module. If so, these will be entirely transparent to the user of
the module. It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality.
For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head2 Making your module threadsafe
X<threadsafe> X<thread safe>
X<module, threadsafe> X<module, thread safe>
X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread>

Since 5.6.0, Perl has had support for a new type of threads called
interpreter threads (ithreads). These threads can be used explicitly
and implicitly.

Ithreads work by cloning the data tree so that no data is shared
between different threads. These threads can be used by using the C<threads>
module or by doing fork() on win32 (fake fork() support). When a
thread is cloned all Perl data is cloned, however non-Perl data cannot
be cloned automatically. Perl after 5.7.2 has support for the C<CLONE>
special subroutine. In C<CLONE> you can do whatever you need to do,
such as handling the cloning of non-Perl data, if necessary.
C<CLONE> will be called once as a class method for every package that has it
defined (or inherits it). It will be called in the context of the new thread,
so all modifications are made in the new area. Currently CLONE is called with
no parameters other than the invocant package name, but code should not assume
that this will remain unchanged, as it is likely that in future extra parameters
will be passed in to give more information about the state of cloning.

If you want to CLONE all objects you will need to keep track of them per
package. This is simply done using a hash and Scalar::Util::weaken().

Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine.
Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is
called just before cloning starts, and in the context of the parent
thread.
If it returns a true value, then no objects of that class will
be cloned; or rather, they will be copied as unblessed, undef values.
For example: if in the parent there are two references to a single blessed
hash, then in the child there will be two references to a single undefined
scalar value instead.
This provides a simple mechanism for making a module threadsafe; just add
C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will
now only be called once per object. Of course, if the child thread needs
to make use of the objects, then a more sophisticated approach is
needed.

Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other
than the invocant package name, although that may change. Similarly, to
allow for future expansion, the return value should be a single C<0> or
C<1> value.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perltoot> and L<perltooc> for an in-depth tutorial on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.