=head1 NAME

perlintro -- a brief introduction and overview of Perl

=head1 DESCRIPTION

This document is intended to give you a quick overview of the Perl
programming language, along with pointers to further documentation. It
is intended as a "bootstrap" guide for those who are new to the
language, and provides just enough information for you to be able to
read other people's Perl and understand roughly what it's doing, or
write your own simple scripts.

This introductory document does not aim to be complete. It does not
even aim to be entirely accurate. In some cases perfection has been
sacrificed in the interest of getting the general idea across. You are
I<strongly> advised to follow this introduction with more information
from the full Perl manual, the table of contents to which can be found
in L<perltoc>.

Throughout this document you'll see references to other parts of the
Perl documentation. You can read that documentation using the C<perldoc>
command or whatever method you're using to read this document.

=head2 What is Perl?

Perl is a general-purpose programming language originally developed for
text manipulation and now used for a wide range of tasks including
system administration, web development, network programming, GUI
development, and more.

The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal). Its major
features are that it's easy to use, supports both procedural and
object-oriented (OO) programming, has powerful built-in support for text
processing, and has one of the world's most impressive collections of
third-party modules.

Different definitions of Perl are given in L<perl>, L<perlfaq1> and
no doubt other places. From this we can determine that Perl is different
things to different people, but that lots of people think it's at least
worth writing about.

=head2 Running Perl programs

To run a Perl program from the Unix command line:

    perl progname.pl

Alternatively, put this as the first line of your script:

    #!/usr/bin/env perl

... and run the script as C</path/to/script.pl>. Of course, it'll need
to be executable first, so C<chmod 755 script.pl> (under Unix).

For more information, including instructions for other platforms such as
Windows and Mac OS, read L<perlrun>.

=head2 Basic syntax overview

A Perl script or program consists of one or more statements. These
statements are simply written in the script in a straightforward
fashion. There is no need to have a C<main()> function or anything of
that kind.

Perl statements end in a semicolon:

    print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

    # This is a comment

Whitespace is irrelevant:

    print
        "Hello, world"
        ;

... except inside quoted strings:

    # this would print with a linebreak in the middle
    print "Hello
    world";

Double quotes or single quotes may be used around literal strings:

    print "Hello, world";
    print 'Hello, world';

However, only double quotes "interpolate" variables and special
characters such as newlines (C<\n>):

    print "Hello, $name\n";     # works fine
    print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

    print 42;

You can use parentheses for functions' arguments or omit them
according to your personal taste. They are only required
occasionally to clarify issues of precedence.

    print("Hello, world\n");
    print "Hello, world\n";

More detailed information about Perl syntax can be found in L<perlsyn>.

=head2 Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

=over 4

=item Scalars

A scalar represents a single value:

    my $animal = "camel";
    my $answer = 42;

Scalar values can be strings, integers or floating point numbers, and Perl
will automatically convert between them as required. There is no need
to pre-declare your variable types.

Scalar values can be used in various ways:

    print $animal;
    print "The animal is $animal\n";
    print "The square of $answer is ", $answer * $answer, "\n";

There are a number of "magic" scalars with names that look like
punctuation or line noise. These special variables are used for all
kinds of purposes, and are documented in L<perlvar>. The only one you
need to know about for now is C<$_> which is the "default variable".
It's used as the default argument to a number of functions in Perl, and
it's set implicitly by certain looping constructs.

    print;          # prints contents of $_ by default

=item Arrays

An array represents a list of values:

    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed. Here's how you get at elements in an array:

    print $animals[0];              # prints "camel"
    print $animals[1];              # prints "llama"

The special variable C<$#array> tells you the index of the last element
of an array:

    print $mixed[$#mixed];          # last element, prints 1.23

You might be tempted to use C<$#array + 1> to tell you how many items there
are in an array. Don't bother. As it happens, using C<@array> where Perl
expects to find a scalar value ("in scalar context") will give you the number
of elements in the array:

    if (@animals < 5) { ... }

The elements we're getting from the array start with a C<$> because
we're getting just a single value out of the array -- you ask for a scalar,
you get a scalar.

To get multiple values from an array:

    @animals[0,1];                  # gives ("camel", "llama");
    @animals[0..2];                 # gives ("camel", "llama", "owl");
    @animals[1..$#animals];         # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

    my @sorted    = sort @animals;
    my @backwards = reverse @numbers;

There are a couple of special arrays too, such as C<@ARGV> (the command
line arguments to your script) and C<@_> (the arguments passed to a
subroutine). These are documented in L<perlvar>.

=item Hashes

A hash represents a set of key/value pairs:

    my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the C<< => >> operator to lay them out more
nicely:

    my %fruit_color = (
        apple  => "red",
        banana => "yellow",
    );

To get at hash elements:

    $fruit_color{"apple"};           # gives "red"

You can get at lists of keys and values with C<keys()> and
C<values()>.

    my @fruits = keys %fruit_color;
    my @colors = values %fruit_color;

Hashes have no particular internal order, though you can sort the keys
and loop through them.

Just like special scalars and arrays, there are also special hashes.
The best known of these is C<%ENV> which contains environment
variables. Read all about it (and other special variables) in
L<perlvar>.

=back

Scalars, arrays and hashes are documented more fully in L<perldata>.

More complex data types can be constructed using references, which allow
you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data
type. So by storing a reference as the value of an array or hash
element, you can easily create lists and hashes within lists and
hashes. The following example shows a two-level hash of hashes
structure using anonymous hash references.

    my $variables = {
        scalar  =>  {
                     description => "single item",
                     sigil => '$',
                    },
        array   =>  {
                     description => "ordered list of items",
                     sigil => '@',
                    },
        hash    =>  {
                     description => "key/value pairs",
                     sigil => '%',
                    },
    };

    print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

Exhaustive information on the topic of references can be found in
L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>.

=head2 Variable scoping

Throughout the previous section all the examples have used the syntax:

    my $var = "value";

The C<my> is actually not required; you could just use:

    $var = "value";

However, the above usage will create global variables throughout your
program, which is bad programming practice. C<my> creates lexically
scoped variables instead. The variables are scoped to the block
(i.e. a bunch of statements surrounded by curly-braces) in which they
are defined.

    my $x = "foo";
    my $some_condition = 1;
    if ($some_condition) {
        my $y = "bar";
        print $x;           # prints "foo"
        print $y;           # prints "bar"
    }
    print $x;               # prints "foo"
    print $y;               # prints nothing; $y has fallen out of scope

Using C<my> in combination with a C<use strict;> at the top of
your Perl scripts means that the interpreter will pick up certain common
programming errors. For instance, in the example above, the final
C<print $y> would cause a compile-time error and prevent you from
running the program. (The names C<$a> and C<$b> are best avoided for
your own variables: they are reserved for C<sort> and are exempt from
this check.) Using C<strict> is highly recommended.

=head2 Conditional and looping constructs

Perl has most of the usual conditional and looping constructs except for
case/switch (but if you really want it, there is a Switch module on CPAN,
which was bundled with Perl 5.8 but later dropped from the core
distribution. See the section on modules, below, for more
information about modules and CPAN).

The conditions can be any Perl expression.
See the list of operators in
the next section for information on comparison and boolean logic operators,
which are commonly used in conditional statements.

=over 4

=item if

    if ( condition ) {
        ...
    } elsif ( other condition ) {
        ...
    } else {
        ...
    }

There's also a negated version of it:

    unless ( condition ) {
        ...
    }

This is provided as a more readable version of C<if (!I<condition>)>.

Note that the braces are required in Perl, even if you've only got one
line in the block. However, there is a clever way of making your one-line
conditional blocks more English-like:

    # the traditional way
    if ($zippy) {
        print "Yow!";
    }

    # the Perlish post-condition way
    print "Yow!" if $zippy;
    print "We have no bananas" unless $bananas;

=item while

    while ( condition ) {
        ...
    }

There's also a negated version, for the same reason we have C<unless>:

    until ( condition ) {
        ...
    }

You can also use C<while> in a post-condition:

    print "LA LA LA\n" while 1;          # loops forever

=item for

Exactly like C:

    for (my $i = 0; $i <= $max; $i++) {
        ...
    }

The C-style for loop is rarely needed in Perl since Perl provides
the more friendly list-scanning C<foreach> loop.

=item foreach

    foreach (@array) {
        print "This element is $_\n";
    }

    # you don't have to use the default $_ either...
    foreach my $key (keys %hash) {
        print "The value of $key is $hash{$key}\n";
    }

=back

For more detail on looping constructs (and some that weren't mentioned in
this overview) see L<perlsyn>.

=head2 Builtin operators and functions

Perl comes with a wide selection of builtin functions. Some of the ones
we've already seen include C<print>, C<sort> and C<reverse>.
A list of
them is given at the start of L<perlfunc> and you can easily read
about any given function by using C<perldoc -f I<functionname>>.

Perl operators are documented in full in L<perlop>, but here are a few
of the most common ones:

=over 4

=item Arithmetic

    +   addition
    -   subtraction
    *   multiplication
    /   division

=item Numeric comparison

    ==  equality
    !=  inequality
    <   less than
    >   greater than
    <=  less than or equal
    >=  greater than or equal

=item String comparison

    eq  equality
    ne  inequality
    lt  less than
    gt  greater than
    le  less than or equal
    ge  greater than or equal

(Why do we have separate numeric and string comparisons? Because we don't
have special variable types, and Perl needs to know whether to sort
numerically (where 99 is less than 100) or alphabetically (where 100 comes
before 99).)

=item Boolean logic

    &&  and
    ||  or
    !   not

(C<and>, C<or> and C<not> aren't just in the above table as descriptions
of the operators -- they're also supported as operators in their own
right. They're more readable than the C-style operators, but have
different precedence to C<&&> and friends. Check L<perlop> for more
detail.)

=item Miscellaneous

    =   assignment
    .   string concatenation
    x   string multiplication
    ..  range operator (creates a list of numbers)

=back

Many operators can be combined with a C<=> as follows:

    $a += 1;        # same as $a = $a + 1
    $a -= 1;        # same as $a = $a - 1
    $a .= "\n";     # same as $a = $a . "\n";

=head2 Files and I/O

You can open a file for input or output using the C<open()> function.
It's documented in extravagant detail in L<perlfunc> and L<perlopentut>,
but in short:

    open(my $in,  "<",  "input.txt")  or die "Can't open input.txt: $!";
    open(my $out, ">",  "output.txt") or die "Can't open output.txt: $!";
    open(my $log, ">>", "my.log")     or die "Can't open my.log: $!";

You can read from an open filehandle using the C<< <> >> operator. In
scalar context it reads a single line from the filehandle, and in list
context it reads the whole file in, assigning each line to an element of
the list:

    my $line  = <$in>;
    my @lines = <$in>;

Reading in the whole file at one time is called slurping. It can
be useful but it may be a memory hog. Most text file processing
can be done a line at a time with Perl's looping constructs.

The C<< <> >> operator is most often seen in a C<while> loop:

    while (<$in>) {     # assigns each line in turn to $_
        print "Just read in this line: $_";
    }

We've already seen how to print to standard output using C<print()>.
However, C<print()> can also take an optional first argument specifying
which filehandle to print to:

    print STDERR "This is your final warning.\n";
    print $out $record;
    print $log $logmessage;

When you're done with your filehandles, you should C<close()> them
(though to be honest, Perl will clean up after you if you forget):

    close $in or die "Can't close input.txt: $!";

=head2 Regular expressions

Perl's regular expression support is both broad and deep, and is the
subject of lengthy documentation in L<perlrequick>, L<perlretut>, and
elsewhere. However, in short:

=over 4

=item Simple matching

    if (/foo/)       { ... }  # true if $_ contains "foo"
    if ($a =~ /foo/) { ... }  # true if $a contains "foo"

The C<//> matching operator is documented in L<perlop>.
It operates on
C<$_> by default, or can be bound to another variable using the C<=~>
binding operator (also documented in L<perlop>).

=item Simple substitution

    s/foo/bar/;               # replaces foo with bar in $_
    $a =~ s/foo/bar/;         # replaces foo with bar in $a
    $a =~ s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar in $a

The C<s///> substitution operator is documented in L<perlop>.

=item More complex regular expressions

You don't just have to match on fixed strings. In fact, you can match
on just about anything you could dream of by using more complex regular
expressions. These are documented at great length in L<perlre>, but for
the meantime, here's a quick cheat sheet:

    .                   a single character
    \s                  a whitespace character (space, tab, newline)
    \S                  non-whitespace character
    \d                  a digit (0-9)
    \D                  a non-digit
    \w                  a word character (a-z, A-Z, 0-9, _)
    \W                  a non-word character
    [aeiou]             matches a single character in the given set
    [^aeiou]            matches a single character outside the given set
    (foo|bar|baz)       matches any of the alternatives specified

    ^                   start of string
    $                   end of string

Quantifiers can be used to specify how many of the previous thing you
want to match on, where "thing" means either a literal character, one
of the metacharacters listed above, or a group of characters or
metacharacters in parentheses.

    *                   zero or more of the previous thing
    +                   one or more of the previous thing
    ?                   zero or one of the previous thing
    {3}                 matches exactly 3 of the previous thing
    {3,6}               matches between 3 and 6 of the previous thing
    {3,}                matches 3 or more of the previous thing

Some brief examples:

    /^\d+/              string starts with one or more digits
    /^$/                nothing in the string (start and end are adjacent)
    /(\d\s){3}/         three digits, each followed by a whitespace
                        character (e.g. "3 4 5 ")
    /(a.)+/             matches a string in which every odd-numbered letter
                        is a (e.g. "abacadaf")

    # This loop reads from STDIN, and prints non-blank lines:
    while (<>) {
        next if /^$/;
        print;
    }

=item Parentheses for capturing

As well as grouping, parentheses serve a second purpose. They can be
used to capture the results of parts of the regexp match for later use.
The results end up in C<$1>, C<$2> and so on.

    # a cheap and nasty way to break an email address up into parts

    if ($email =~ /([^@]+)@(.+)/) {
        print "Username is $1\n";
        print "Hostname is $2\n";
    }

=item Other regexp features

Perl regexps also support backreferences, lookaheads, and all kinds of
other complex details. Read all about them in L<perlrequick>,
L<perlretut>, and L<perlre>.

=back

=head2 Writing subroutines

Writing subroutines is easy:

    sub logger {
        my $logmessage = shift;
        open my $logfile, ">>", "my.log" or die "Could not open my.log: $!";
        print $logfile $logmessage;
    }

What's that C<shift>? Well, the arguments to a subroutine are available
to us as a special array called C<@_> (see L<perlvar> for more on that).
The default argument to the C<shift> function just happens to be C<@_>.
So C<my $logmessage = shift;> shifts the first item off the list of
arguments and assigns it to C<$logmessage>.
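
To see C<shift> and C<@_> working together, here is a minimal,
self-contained sketch. The subroutine name C<first_and_rest> is
hypothetical, chosen only for this illustration. It takes the first
argument off C<@_> with C<shift> and then counts what is left by using
C<@_> in scalar context.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: returns the first argument
# and the number of arguments still left in @_ after the shift.
sub first_and_rest {
    my $first     = shift;      # shift works on @_ by default in a sub
    my $remaining = @_;         # @_ in scalar context: arguments left
    return ($first, $remaining);
}

my ($first, $remaining) = first_and_rest("camel", "llama", "owl");
print "First argument: $first\n";           # prints "First argument: camel"
print "Arguments remaining: $remaining\n";  # prints "Arguments remaining: 2"
```

Because C<shift> removes the element it returns, C<@_> holds only the
remaining two arguments by the time it is counted.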

We can manipulate C<@_> in other ways too:

    my ($logmessage, $priority) = @_;       # common
    my $logmessage = $_[0];                 # uncommon, and ugly

Subroutines can also return values:

    sub square {
        my $num = shift;
        my $result = $num * $num;
        return $result;
    }

For more information on writing subroutines, see L<perlsub>.

=head2 OO Perl

OO Perl is relatively simple and is implemented using references which
know what sort of object they are based on Perl's concept of packages.
However, OO Perl is largely beyond the scope of this document.
Read L<perlboot>, L<perltoot>, L<perltooc> and L<perlobj>.

As a beginning Perl programmer, your most common use of OO Perl will be
in using third-party modules, which are documented below.

=head2 Using Perl modules

Perl modules provide a range of features to help you avoid reinventing
the wheel, and can be downloaded from CPAN ( http://www.cpan.org/ ). A
number of popular modules are included with the Perl distribution
itself.

Categories of modules range from text manipulation to network protocols
to database integration to graphics. A categorized list of modules is
also available from CPAN.

To learn how to install modules you download from CPAN, read
L<perlmodinstall>.

To learn how to use a particular module, use C<perldoc I<Module::Name>>.
Typically you will want to C<use I<Module::Name>>, which will then give
you access to exported functions or an OO interface to the module.

L<perlfaq> contains questions and answers related to many common
tasks, and often provides suggestions for good CPAN modules to use.

L<perlmod> describes Perl modules in general. L<perlmodlib> lists the
modules which came with your Perl installation.

If you feel the urge to write Perl modules, L<perlnewmod> will give you
good advice.

=head1 AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>