=head1 NAME

perlintro -- a brief introduction and overview of Perl

=head1 DESCRIPTION

This document is intended to give you a quick overview of the Perl
programming language, along with pointers to further documentation. It
is intended as a "bootstrap" guide for those who are new to the
language, and provides just enough information for you to be able to
read other people's Perl and understand roughly what it's doing, or
write your own simple scripts.

This introductory document does not aim to be complete. It does not
even aim to be entirely accurate. In some cases perfection has been
sacrificed in the goal of getting the general idea across. You are
I<strongly> advised to follow this introduction with more information
from the full Perl manual, the table of contents to which can be found
in L<perltoc>.

Throughout this document you'll see references to other parts of the
Perl documentation. You can read that documentation using the C<perldoc>
command or whatever method you're using to read this document.

=head2 What is Perl?

Perl is a general-purpose programming language originally developed for
text manipulation and now used for a wide range of tasks including
system administration, web development, network programming, GUI
development, and more.

The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal). Its major
features are that it's easy to use, supports both procedural and
object-oriented (OO) programming, has powerful built-in support for text
processing, and has one of the world's most impressive collections of
third-party modules.

Different definitions of Perl are given in L<perl>, L<perlfaq1> and
no doubt other places. From this we can determine that Perl is different
things to different people, but that lots of people think it's at least
worth writing about.

=head2 Running Perl programs

To run a Perl program from the Unix command line:

    perl progname.pl

Alternatively, put this as the first line of your script:

    #!/usr/bin/env perl

... and run the script as C</path/to/script.pl>. Of course, it'll need
to be executable first, so C<chmod 755 script.pl> (under Unix).

(This start line assumes you have the B<env> program. You can also give
the path to your perl executable directly, as in C<#!/usr/bin/perl>.)

For more information, including instructions for other platforms such as
Windows and Mac OS, read L<perlrun>.

=head2 Safety net

Perl by default is very forgiving. To make it more robust, it is
recommended to start every program with the following lines:

    #!/usr/bin/perl
    use strict;
    use warnings;

The two additional lines ask perl to catch various common
problems in your code. They check different things so you need both. A
potential problem caught by C<use strict;> will cause your code to stop
immediately when it is encountered, while C<use warnings;> will merely
give a warning (like the command-line switch B<-w>) and let your code run.
To read more about them check their respective manual pages at L<strict>
and L<warnings>.

=head2 Basic syntax overview

A Perl script or program consists of one or more statements. These
statements are simply written in the script in a straightforward
fashion. There is no need to have a C<main()> function or anything of
that kind.

Perl statements end in a semi-colon:

    print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

    # This is a comment

Whitespace is irrelevant:

    print
        "Hello, world"
        ;

...
except inside quoted strings:

    # this would print with a linebreak in the middle
    print "Hello
    world";

Double quotes or single quotes may be used around literal strings:

    print "Hello, world";
    print 'Hello, world';

However, only double quotes "interpolate" variables and special
characters such as newlines (C<\n>):

    print "Hello, $name\n";     # works fine
    print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

    print 42;

You can use parentheses for functions' arguments or omit them
according to your personal taste. They are only required
occasionally to clarify issues of precedence.

    print("Hello, world\n");
    print "Hello, world\n";

More detailed information about Perl syntax can be found in L<perlsyn>.

=head2 Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

=over 4

=item Scalars

A scalar represents a single value:

    my $animal = "camel";
    my $answer = 42;

Scalar values can be strings, integers or floating point numbers, and Perl
will automatically convert between them as required. There is no need
to pre-declare your variable types, but you have to declare them using
the C<my> keyword the first time you use them. (This is one of the
requirements of C<use strict;>.)

Scalar values can be used in various ways:

    print $animal;
    print "The animal is $animal\n";
    print "The square of $answer is ", $answer * $answer, "\n";

There are a number of "magic" scalars with names that look like
punctuation or line noise. These special variables are used for all
kinds of purposes, and are documented in L<perlvar>. The only one you
need to know about for now is C<$_> which is the "default variable".
It's used as the default argument to a number of functions in Perl, and
it's set implicitly by certain looping constructs.

    print;          # prints contents of $_ by default

=item Arrays

An array represents a list of values:

    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed. Here's how you get at elements in an array:

    print $animals[0];              # prints "camel"
    print $animals[1];              # prints "llama"

The special variable C<$#array> tells you the index of the last element
of an array:

    print $mixed[$#mixed];          # last element, prints 1.23

You might be tempted to use C<$#array + 1> to tell you how many items there
are in an array. Don't bother. As it happens, using C<@array> where Perl
expects to find a scalar value ("in scalar context") will give you the number
of elements in the array:

    if (@animals < 5) { ... }

The elements we're getting from the array start with a C<$> because
we're getting just a single value out of the array -- you ask for a scalar,
you get a scalar.

To get multiple values from an array:

    @animals[0,1];                  # gives ("camel", "llama");
    @animals[0..2];                 # gives ("camel", "llama", "owl");
    @animals[1..$#animals];         # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

    my @sorted    = sort @animals;
    my @backwards = reverse @numbers;

There are a couple of special arrays too, such as C<@ARGV> (the command
line arguments to your script) and C<@_> (the arguments passed to a
subroutine). These are documented in L<perlvar>.
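
The array operations above can be combined in a short script. What
follows is only a minimal sketch that reuses the C<@animals> array from
the examples; the names C<$count>, C<$last> and C<@rest> are chosen here
purely for illustration.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @animals = ("camel", "llama", "owl");

    # An array in scalar context gives its number of elements
    my $count = @animals;

    # $#animals is the index of the last element
    my $last = $animals[$#animals];

    # An array slice gives every element except the first
    my @rest = @animals[1 .. $#animals];

    print "There are $count animals\n";     # There are 3 animals
    print "The last one is $last\n";        # The last one is owl
    print "The rest are: @rest\n";          # The rest are: llama owl

An array interpolated into a double-quoted string, as in the last line,
has its elements joined with a space by default.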

=item Hashes

A hash represents a set of key/value pairs:

    my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the C<< => >> operator to lay them out more
nicely:

    my %fruit_color = (
        apple  => "red",
        banana => "yellow",
    );

To get at hash elements:

    $fruit_color{"apple"};           # gives "red"

You can get at lists of keys and values with C<keys()> and
C<values()>.

    my @fruits = keys %fruit_color;
    my @colors = values %fruit_color;

Hashes have no particular internal order, though you can sort the keys
and loop through them.

Just like special scalars and arrays, there are also special hashes.
The most well known of these is C<%ENV> which contains environment
variables. Read all about it (and other special variables) in
L<perlvar>.

=back

Scalars, arrays and hashes are documented more fully in L<perldata>.

More complex data types can be constructed using references, which allow
you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data
type. So by storing a reference as the value of an array or hash
element, you can easily create lists and hashes within lists and
hashes. The following example shows a two-level hash-of-hashes
structure using anonymous hash references.

    my $variables = {
        scalar => {
            description => "single item",
            sigil       => '$',
        },
        array => {
            description => "ordered list of items",
            sigil       => '@',
        },
        hash => {
            description => "key/value pairs",
            sigil       => '%',
        },
    };

    print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

Exhaustive information on the topic of references can be found in
L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>.
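
As a rough sketch of how such a nested structure might be walked, the
following loops over a compact copy of the C<$variables> hash reference
shown above. The loop variable C<$type> and the use of C<sort> to get a
predictable order are choices made for this illustration only.

    use strict;
    use warnings;

    my $variables = {
        scalar => { description => "single item",           sigil => '$' },
        array  => { description => "ordered list of items", sigil => '@' },
        hash   => { description => "key/value pairs",       sigil => '%' },
    };

    # %$variables dereferences the outer hash reference; sorting the
    # keys gives a predictable order, since hashes have none of their own
    foreach my $type (sort keys %$variables) {
        my $sigil       = $variables->{$type}->{sigil};
        my $description = $variables->{$type}->{description};
        print "$sigil $type ($description)\n";
    }

Each C<< -> >> step follows one reference: the first reaches the inner
hash for C<$type>, the second reaches a value inside it.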

=head2 Variable scoping

Throughout the previous section all the examples have used the syntax:

    my $var = "value";

The C<my> is actually not required; you could just use:

    $var = "value";

However, the above usage will create global variables throughout your
program, which is bad programming practice. C<my> creates lexically
scoped variables instead. The variables are scoped to the block
(i.e. a bunch of statements surrounded by curly-braces) in which they
are defined.

    my $x = "foo";
    my $some_condition = 1;
    if ($some_condition) {
        my $y = "bar";
        print $x;           # prints "foo"
        print $y;           # prints "bar"
    }
    print $x;               # prints "foo"
    print $y;               # prints nothing; $y has fallen out of scope

Using C<my> in combination with a C<use strict;> at the top of
your Perl scripts means that the interpreter will pick up certain common
programming errors. For instance, in the example above, the final
C<print $y> would cause a compile-time error and prevent you from
running the program. Using C<strict> is highly recommended.

=head2 Conditional and looping constructs

Perl has most of the usual conditional and looping constructs except for
case/switch (but if you really want it, there is a Switch module in Perl
5.8 and newer, and on CPAN. See the section on modules, below, for more
information about modules and CPAN).

The conditions can be any Perl expression. See the list of operators in
the next section for information on comparison and boolean logic operators,
which are commonly used in conditional statements.

=over 4

=item if

    if ( condition ) {
        ...
    } elsif ( other condition ) {
        ...
    } else {
        ...
    }

There's also a negated version of it:

    unless ( condition ) {
        ...
    }

This is provided as a more readable version of C<if (!I<condition>)>.

Note that the braces are required in Perl, even if you've only got one
line in the block. However, there is a clever way of making your one-line
conditional blocks more English-like:

    # the traditional way
    if ($zippy) {
        print "Yow!";
    }

    # the Perlish post-condition way
    print "Yow!" if $zippy;
    print "We have no bananas" unless $bananas;

=item while

    while ( condition ) {
        ...
    }

There's also a negated version, for the same reason we have C<unless>:

    until ( condition ) {
        ...
    }

You can also use C<while> in a post-condition:

    print "LA LA LA\n" while 1;          # loops forever

=item for

Exactly like C:

    for (my $i = 0; $i <= $max; $i++) {
        ...
    }

The C style for loop is rarely needed in Perl since Perl provides
the more friendly list scanning C<foreach> loop.

=item foreach

    foreach (@array) {
        print "This element is $_\n";
    }

    print $list[$_] foreach 0 .. $max;

    # you don't have to use the default $_ either...
    foreach my $key (keys %hash) {
        print "The value of $key is $hash{$key}\n";
    }

=back

For more detail on looping constructs (and some that weren't mentioned in
this overview) see L<perlsyn>.

=head2 Builtin operators and functions

Perl comes with a wide selection of builtin functions. Some of the ones
we've already seen include C<print>, C<sort> and C<reverse>. A list of
them is given at the start of L<perlfunc> and you can easily read
about any given function by using C<perldoc -f I<functionname>>.
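
To illustrate how the looping constructs and the builtin functions
mentioned so far fit together, here is a small sketch. The C<%stock>
hash and its contents are invented for this example and are not part of
any real program.

    use strict;
    use warnings;

    # Hypothetical data, invented for this example
    my %stock = (apples => 12, bananas => 0, cherries => 7);

    # foreach with a named loop variable, over the sorted keys
    foreach my $fruit (sort keys %stock) {
        # post-condition form: skip fruits that are out of stock
        next unless $stock{$fruit};
        print "We have $stock{$fruit} $fruit\n";
    }

    # C-style for loop, counting down from 3 to 1
    for (my $i = 3; $i > 0; $i--) {
        print "$i\n";
    }

    # foreach in post-condition form, using the default variable $_
    print "$_\n" foreach reverse sort keys %stock;

The C<next> statement, which skips to the next iteration of the loop, is
one of the constructs described in L<perlsyn>.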

Perl operators are documented in full in L<perlop>, but here are a few
of the most common ones:

=over 4

=item Arithmetic

    +   addition
    -   subtraction
    *   multiplication
    /   division

=item Numeric comparison

    ==  equality
    !=  inequality
    <   less than
    >   greater than
    <=  less than or equal
    >=  greater than or equal

=item String comparison

    eq  equality
    ne  inequality
    lt  less than
    gt  greater than
    le  less than or equal
    ge  greater than or equal

(Why do we have separate numeric and string comparisons? Because we don't
have special variable types, and Perl needs to know whether to sort
numerically (where 99 is less than 100) or alphabetically (where 100 comes
before 99).)

=item Boolean logic

    &&  and
    ||  or
    !   not

(C<and>, C<or> and C<not> aren't just in the above table as descriptions
of the operators -- they're also supported as operators in their own
right. They're more readable than the C-style operators, but have
different precedence to C<&&> and friends. Check L<perlop> for more
detail.)

=item Miscellaneous

    =   assignment
    .   string concatenation
    x   string multiplication
    ..  range operator (creates a list of numbers)

=back

Many operators can be combined with a C<=> as follows:

    $a += 1;        # same as $a = $a + 1
    $a -= 1;        # same as $a = $a - 1
    $a .= "\n";     # same as $a = $a . "\n";

=head2 Files and I/O

You can open a file for input or output using the C<open()> function.
It's documented in extravagant detail in L<perlfunc> and L<perlopentut>,
but in short:

    open(my $in,  "<",  "input.txt")  or die "Can't open input.txt: $!";
    open(my $out, ">",  "output.txt") or die "Can't open output.txt: $!";
    open(my $log, ">>", "my.log")     or die "Can't open my.log: $!";

You can read from an open filehandle using the C<< <> >> operator.
In scalar context it reads a single line from the filehandle, and in list
context it reads the whole file in, assigning each line to an element of
the list:

    my $line  = <$in>;
    my @lines = <$in>;

Reading in the whole file at one time is called slurping. It can
be useful but it may be a memory hog. Most text file processing
can be done a line at a time with Perl's looping constructs.

The C<< <> >> operator is most often seen in a C<while> loop:

    while (<$in>) {     # assigns each line in turn to $_
        print "Just read in this line: $_";
    }

We've already seen how to print to standard output using C<print()>.
However, C<print()> can also take an optional first argument specifying
which filehandle to print to:

    print STDERR "This is your final warning.\n";
    print $out $record;
    print $log $logmessage;

When you're done with your filehandles, you should C<close()> them
(though to be honest, Perl will clean up after you if you forget):

    close $in or die "$in: $!";

=head2 Regular expressions

Perl's regular expression support is both broad and deep, and is the
subject of lengthy documentation in L<perlrequick>, L<perlretut>, and
elsewhere. However, in short:

=over 4

=item Simple matching

    if (/foo/)       { ... }  # true if $_ contains "foo"
    if ($a =~ /foo/) { ... }  # true if $a contains "foo"

The C<//> matching operator is documented in L<perlop>. It operates on
C<$_> by default, or can be bound to another variable using the C<=~>
binding operator (also documented in L<perlop>).

=item Simple substitution

    s/foo/bar/;               # replaces foo with bar in $_
    $a =~ s/foo/bar/;         # replaces foo with bar in $a
    $a =~ s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar in $a

The C<s///> substitution operator is documented in L<perlop>.
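
As a small illustration of both operators, the sketch below matches and
substitutes on a sample string. The variable C<$text> and its contents
are made up for this example, and copying the string before changing it
is simply one convenient way to keep the original intact.

    use strict;
    use warnings;

    my $text = "foo is not food, but foo is fine";

    # Matching: =~ binds the match to $text instead of $_
    print "Found foo\n" if $text =~ /foo/;

    # Copy $text into $once, then substitute in the copy;
    # without /g only the first occurrence is replaced
    (my $once = $text) =~ s/foo/bar/;
    print "$once\n";    # bar is not food, but foo is fine

    # With /g every occurrence is replaced, including the one in "food"
    (my $all = $text) =~ s/foo/bar/g;
    print "$all\n";     # bar is not bard, but bar is fine

Note that a plain pattern such as C<foo> matches anywhere in the string,
including inside longer words.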

=item More complex regular expressions

You don't just have to match on fixed strings. In fact, you can match
on just about anything you could dream of by using more complex regular
expressions. These are documented at great length in L<perlre>, but for
the meantime, here's a quick cheat sheet:

    .                   a single character
    \s                  a whitespace character (space, tab, newline, ...)
    \S                  non-whitespace character
    \d                  a digit (0-9)
    \D                  a non-digit
    \w                  a word character (a-z, A-Z, 0-9, _)
    \W                  a non-word character
    [aeiou]             matches a single character in the given set
    [^aeiou]            matches a single character outside the given set
    (foo|bar|baz)       matches any of the alternatives specified

    ^                   start of string
    $                   end of string

Quantifiers can be used to specify how many of the previous thing you
want to match on, where "thing" means either a literal character, one
of the metacharacters listed above, or a group of characters or
metacharacters in parentheses.

    *                   zero or more of the previous thing
    +                   one or more of the previous thing
    ?                   zero or one of the previous thing
    {3}                 matches exactly 3 of the previous thing
    {3,6}               matches between 3 and 6 of the previous thing
    {3,}                matches 3 or more of the previous thing

Some brief examples:

    /^\d+/              string starts with one or more digits
    /^$/                nothing in the string (start and end are adjacent)
    /(\d\s){3}/         three digits, each followed by a whitespace
                        character (eg "3 4 5 ")
    /(a.)+/             matches a string in which every odd-numbered letter
                        is a (eg "abacadaf")

    # This loop reads from STDIN, and prints non-blank lines:
    while (<>) {
        next if /^$/;
        print;
    }

=item Parentheses for capturing

As well as grouping, parentheses serve a second purpose. They can be
used to capture the results of parts of the regexp match for later use.
The results end up in C<$1>, C<$2> and so on.

    # a cheap and nasty way to break an email address up into parts

    if ($email =~ /([^@]+)@(.+)/) {
        print "Username is $1\n";
        print "Hostname is $2\n";
    }

=item Other regexp features

Perl regexps also support backreferences, lookaheads, and all kinds of
other complex details. Read all about them in L<perlrequick>,
L<perlretut>, and L<perlre>.

=back

=head2 Writing subroutines

Writing subroutines is easy:

    sub logger {
        my $logmessage = shift;
        open my $logfile, ">>", "my.log" or die "Could not open my.log: $!";
        print $logfile $logmessage;
    }

Now we can use the subroutine just as any other built-in function:

    logger("We have a logger subroutine!");

What's that C<shift>? Well, the arguments to a subroutine are available
to us as a special array called C<@_> (see L<perlvar> for more on that).
The default argument to the C<shift> function just happens to be C<@_>.
So C<my $logmessage = shift;> shifts the first item off the list of
arguments and assigns it to C<$logmessage>.

We can manipulate C<@_> in other ways too:

    my ($logmessage, $priority) = @_;       # common
    my $logmessage = $_[0];                 # uncommon, and ugly

Subroutines can also return values:

    sub square {
        my $num = shift;
        my $result = $num * $num;
        return $result;
    }

Then use it like:

    my $sq = square(8);

For more information on writing subroutines, see L<perlsub>.

=head2 OO Perl

OO Perl is relatively simple and is implemented using references which
know what sort of object they are based on Perl's concept of packages.
However, OO Perl is largely beyond the scope of this document.
Read L<perlboot>, L<perltoot>, L<perltooc> and L<perlobj>.

As a beginning Perl programmer, your most common use of OO Perl will be
in using third-party modules, which are documented below.
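
To make the idea of "references which know what sort of object they are"
a little more concrete, here is a bare-bones sketch. It is not a
recommended design, only an illustration: the package name C<Counter>
and its methods C<new> and C<increment> are hypothetical and exist
nowhere outside this example.

    use strict;
    use warnings;

    # Hypothetical class, invented for this sketch
    package Counter;

    sub new {
        my ($class, $start) = @_;
        my $self = { count => $start };     # an ordinary hash reference...
        return bless $self, $class;         # ...that now knows its package
    }

    sub increment {
        my $self = shift;
        $self->{count} += 1;
        return $self->{count};
    }

    package main;

    my $counter = Counter->new(5);
    $counter->increment;
    print "Count is now ", $counter->increment, "\n";   # Count is now 7
    print ref($counter), "\n";                          # Counter

A method is just a subroutine whose first argument is the object (or the
class name, in the case of C<new>), which is why each one starts by
taking that argument off C<@_>.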

=head2 Using Perl modules

Perl modules provide a range of features to help you avoid reinventing
the wheel, and can be downloaded from CPAN ( http://www.cpan.org/ ). A
number of popular modules are included with the Perl distribution
itself.

Categories of modules range from text manipulation to network protocols
to database integration to graphics. A categorized list of modules is
also available from CPAN.

To learn how to install modules you download from CPAN, read
L<perlmodinstall>.

To learn how to use a particular module, use C<perldoc I<Module::Name>>.
Typically you will want to C<use I<Module::Name>>, which will then give
you access to exported functions or an OO interface to the module.

L<perlfaq> contains questions and answers related to many common
tasks, and often provides suggestions for good CPAN modules to use.

L<perlmod> describes Perl modules in general. L<perlmodlib> lists the
modules which came with your Perl installation.

If you feel the urge to write Perl modules, L<perlnewmod> will give you
good advice.

=head1 AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>
