=head1 NAME

perlintro - a brief introduction and overview of Perl

=head1 DESCRIPTION

This document is intended to give you a quick overview of the Perl
programming language, along with pointers to further documentation.  It
is intended as a "bootstrap" guide for those who are new to the
language, and provides just enough information for you to be able to
read other people's Perl and understand roughly what it's doing, or
write your own simple scripts.

This introductory document does not aim to be complete.  It does not
even aim to be entirely accurate.  In some cases perfection has been
sacrificed for the sake of getting the general idea across.  You are
I<strongly> advised to follow this introduction with more information
from the full Perl manual, the table of contents to which can be found
in L<perltoc>.

Throughout this document you'll see references to other parts of the
Perl documentation.  You can read that documentation using the C<perldoc>
command or whatever method you're using to read this document.

Throughout Perl's documentation, you'll find numerous examples intended
to help explain the discussed features.  Please keep in mind that many
of them are code fragments rather than complete programs.

These examples often reflect the style and preference of the author of
that piece of the documentation, and may be briefer than a corresponding
line of code in a real program.  Except where otherwise noted, you
should assume that C<use strict> and C<use warnings> statements
appear earlier in the "program", and that any variables used have
already been declared, even if those declarations have been omitted
to make the example easier to read.

Do note that the examples have been written by many different authors over
a period of several decades.  Styles and techniques will therefore differ,
although some effort has been made to not vary styles too widely within
the same section.  Do not consider one style to be better than others:
"There's More Than One Way To Do It" is one of Perl's mottos.  After all,
in your journey as a programmer, you are likely to encounter different
styles.

=head2 What is Perl?

Perl is a general-purpose programming language originally developed for
text manipulation and now used for a wide range of tasks including
system administration, web development, network programming, GUI
development, and more.

The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal).  Its major
features are that it's easy to use, supports both procedural and
object-oriented (OO) programming, has powerful built-in support for text
processing, and has one of the world's most impressive collections of
third-party modules.

Different definitions of Perl are given in L<perl>, L<perlfaq1> and
no doubt other places.  From this we can determine that Perl is different
things to different people, but that lots of people think it's at least
worth writing about.

=head2 Running Perl programs

To run a Perl program from the Unix command line:

 perl progname.pl

Alternatively, put this as the first line of your script:

 #!/usr/bin/env perl

... and run the script as F</path/to/script.pl>.  Of course, it'll need
to be executable first, so C<chmod 755 script.pl> (under Unix).

(This start line assumes you have the B<env> program.  You can also put
the path to your perl executable directly, as in C<#!/usr/bin/perl>.)

For more information, including instructions for other platforms such as
Windows, read L<perlrun>.

=head2 Safety net

Perl by default is very forgiving.  To make it more robust, it is
recommended that you start every program with the following lines:

 #!/usr/bin/perl
 use strict;
 use warnings;

The two additional lines ask perl to catch various common
problems in your code.  They check different things so you need both.  A
potential problem caught by C<use strict;> will cause your code to stop
immediately when it is encountered, while C<use warnings;> will merely
give a warning (like the command-line switch B<-w>) and let your code run.
To read more about them, check their respective manual pages at L<strict>
and L<warnings>.

A C<L<use v5.35|perlfunc/use VERSION>> (or higher) declaration will
enable both C<strict> and C<warnings>:

  #!/usr/bin/perl
  use v5.35;

In addition to enabling the C<strict> and C<warnings> pragmata, this
declaration will also activate a
L<"feature bundle"|feature/FEATURE BUNDLES>, a collection of named
features that enable many of the more recent additions and changes to the
language, as well as occasionally removing older features that were found
to be design mistakes and are now discouraged.
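
As a rough illustration of how the two pragmata differ, here is a minimal
sketch (not one of the original examples; the variable names are made up).
It compiles a misspelled variable name inside a string C<eval> so that the
C<strict> error can be inspected instead of stopping the script, and it
collects a warning through a C<$SIG{__WARN__}> handler, which is documented
in L<perlvar>.

```perl
use strict;
use warnings;

# strict: an undeclared (here: misspelled) variable is a compile-time error.
# The string eval lets us look at the error instead of stopping the script.
my $total = 10;                               # the name we meant to use
my $ok = eval q{ my $sum = $totl + 1; 1 };    # $totl is a typo for $total
my $strict_error = $ok ? "" : $@;
print "strict said: $strict_error";

# warnings: using an undefined value only produces a warning, and the
# code keeps running.  We capture the warning text so we can show it.
my @captured;
local $SIG{__WARN__} = sub { push @captured, $_[0] };
my $undefined;
my $length = length("value: " . $undefined);
print "warnings said: $captured[0]";
print "the program kept running, length is $length\n";
```

In a real program you would not normally wrap code in a string C<eval>
like this; it is used here only to keep both outcomes visible in one run.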

=head2 Basic syntax overview

A Perl script or program consists of one or more statements.  These
statements are simply written in the script in a straightforward
fashion.  There is no need to have a C<main()> function or anything of
that kind.

Perl statements end in a semi-colon:

 print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

 # This is a comment

Whitespace is irrelevant:

 print
     "Hello, world"
     ;

... except inside quoted strings:

 # this would print with a linebreak in the middle
 print "Hello
 world";

Double quotes or single quotes may be used around literal strings:

 print "Hello, world";
 print 'Hello, world';

However, only double quotes "interpolate" variables and special
characters such as newlines (C<\n>):

 print "Hello, $name\n";     # works fine
 print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

 print 42;

You can use parentheses for functions' arguments or omit them
according to your personal taste.  They are only required
occasionally to clarify issues of precedence.

 print("Hello, world\n");
 print "Hello, world\n";

More detailed information about Perl syntax can be found in L<perlsyn>.

=head2 Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

=over 4

=item Scalars

A scalar represents a single value:

 my $animal = "camel";
 my $answer = 42;

Scalar values can be strings, integers or floating point numbers, and Perl
will automatically convert between them as required.  You have to declare
them using the C<my> keyword the first time you use them.  (This is one of the
requirements of C<use strict;>.)

Scalar values can be used in various ways:

 print $animal;
 print "The animal is $animal\n";
 print "The square of $answer is ", $answer * $answer, "\n";

Perl defines a number of special scalars with short names, often single
punctuation marks or digits. These variables are used for all
kinds of purposes, and are documented in L<perlvar>.  The only one you
need to know about for now is C<$_> which is the "default variable".
It's used as the default argument to a number of functions in Perl, and
it's set implicitly by certain looping constructs.

 print;          # prints contents of $_ by default

=item Arrays

An array represents a list of values:

 my @animals = ("camel", "llama", "owl");
 my @numbers = (23, 42, 69);
 my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed.  Here's how you get at elements in an array:

 print $animals[0];              # prints "camel"
 print $animals[1];              # prints "llama"

The special variable C<$#array> tells you the index of the last element
of an array:

 print $mixed[$#mixed];       # last element, prints 1.23

You might be tempted to use C<$#array + 1> to tell you how many items there
are in an array.  Don't bother.  As it happens, using C<@array> where Perl
expects to find a scalar value ("in scalar context") will give you the number
of elements in the array:

 if (@animals < 5) { ... }

The elements we're getting from the array start with a C<$> because
we're getting just a single value out of the array; you ask for a scalar,
you get a scalar.

To get multiple values from an array:

 @animals[0,1];                 # gives ("camel", "llama");
 @animals[0..2];                # gives ("camel", "llama", "owl");
 @animals[1..$#animals];        # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

 my @sorted    = sort @animals;
 my @backwards = reverse @numbers;

There are a couple of special arrays too, such as C<@ARGV> (the command
line arguments to your script) and C<@_> (the arguments passed to a
subroutine).  These are documented in L<perlvar>.

=item Hashes

A hash represents a set of key/value pairs:

 my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the C<< => >> operator to lay them out more
nicely:

 my %fruit_color = (
     apple  => "red",
     banana => "yellow",
 );

To get at hash elements:

 $fruit_color{"apple"};           # gives "red"

You can get at lists of keys and values with C<keys()> and
C<values()>.

 my @fruits = keys %fruit_color;
 my @colors = values %fruit_color;

Hashes have no particular internal order, though you can sort the keys
and loop through them.

Just like special scalars and arrays, there are also special hashes.
The most well known of these is C<%ENV> which contains environment
variables.  Read all about it (and other special variables) in
L<perlvar>.

=back

Scalars, arrays and hashes are documented more fully in L<perldata>.
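
The following sketch pulls a few of the points above into one runnable
script: an array used in scalar context, an array slice, and a loop over
sorted hash keys.  It reuses the C<@animals> and C<%fruit_color> data from
the examples; the other variable names are invented for illustration.

```perl
use strict;
use warnings;

my @animals = ("camel", "llama", "owl");

# An array in scalar context yields its number of elements.
my $count = @animals;
print "There are $count animals\n";

# A slice returns several elements at once.
my @rest = @animals[1 .. $#animals];
print "All but the first: @rest\n";

my %fruit_color = (
    apple  => "red",
    banana => "yellow",
);

# Hashes are unordered, so sort the keys for a predictable listing.
my @report;
foreach my $fruit (sort keys %fruit_color) {
    push @report, "$fruit is $fruit_color{$fruit}";
}
print "$_\n" foreach @report;
```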

More complex data types can be constructed using references, which allow
you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data
type.  So by storing a reference as the value of an array or hash
element, you can easily create lists and hashes within lists and
hashes.  The following example shows a two-level hash-of-hashes
structure using anonymous hash references.

 my $variables = {
     scalar  =>  {
                  description => "single item",
                  sigil => '$',
                 },
     array   =>  {
                  description => "ordered list of items",
                  sigil => '@',
                 },
     hash    =>  {
                  description => "key/value pairs",
                  sigil => '%',
                 },
 };

 print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

Exhaustive information on the topic of references can be found in
L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>.
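
As a small sketch of how such a structure might be walked (the loop and
the C<$matrix> example are additions for illustration), the script below
dereferences the outer hash with C<%$variables>, follows each inner
reference with C<< -> >>, and then shows the array-reference equivalent.

```perl
use strict;
use warnings;

my $variables = {
    scalar => { description => "single item",           sigil => '$' },
    array  => { description => "ordered list of items", sigil => '@' },
    hash   => { description => "key/value pairs",       sigil => '%' },
};

# %$variables dereferences the outer hash reference so keys() can see it.
my @lines;
foreach my $type (sort keys %$variables) {
    my $info = $variables->{$type};      # an inner hash reference
    push @lines, "$type ($info->{sigil}): $info->{description}";
}
print "$_\n" foreach @lines;

# Array references work the same way, using [ ] instead of { }.
my $matrix = [ [1, 2, 3], [4, 5, 6] ];
my $cell   = $matrix->[1][2];            # second row, third column
print "cell is $cell\n";
```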

=head2 Variable scoping

Throughout the previous section all the examples have used the syntax:

 my $var = "value";

The C<my> is actually not required; you could just use:

 $var = "value";

However, the above usage will create global variables throughout your
program, which is bad programming practice.  C<my> creates lexically
scoped variables instead.  The variables are scoped to the block
(i.e. a bunch of statements surrounded by curly-braces) in which they
are defined.

 my $x = "foo";
 my $some_condition = 1;
 if ($some_condition) {
     my $y = "bar";
     print $x;           # prints "foo"
     print $y;           # prints "bar"
 }
 print $x;               # prints "foo"
 print $y;               # prints nothing; $y has fallen out of scope

Using C<my> in combination with a C<use strict;> at the top of
your Perl scripts means that the interpreter will pick up certain common
programming errors.  For instance, in the example above, the final
C<print $y> would cause a compile-time error and prevent you from
running the program.  Using C<strict> is highly recommended.
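
A version of the same idea that runs cleanly under C<use strict;> might
look like the sketch below.  It shows an inner C<my> hiding an outer
variable of the same name, and a variable declared inside a loop body
being created afresh on each pass.  The variable names are illustrative
only.

```perl
use strict;
use warnings;

my $x = "outer";
my @seen;
{
    my $x = "inner";            # a new variable that hides the outer $x
    push @seen, $x;
    my $only_here = "temporary";
}
push @seen, $x;                 # the outer $x was never changed

# A variable declared inside a loop body is fresh on every iteration.
foreach my $n (1 .. 3) {
    my $double = $n * 2;
    push @seen, $double;
}
print "@seen\n";                # inner outer 2 4 6
```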

=head2 Conditional and looping constructs

Perl has most of the usual conditional and looping constructs.

The conditions can be any Perl expression.  See the list of operators in
the next section for information on comparison and boolean logic operators,
which are commonly used in conditional statements.

=over 4

=item if

 if ( condition ) {
     ...
 } elsif ( other condition ) {
     ...
 } else {
     ...
 }

There's also a negated version of it:

 unless ( condition ) {
     ...
 }

This is provided as a more readable version of C<if (!I<condition>)>.

Note that the braces are required in Perl, even if you've only got one
line in the block.  However, there is a clever way of making your one-line
conditional blocks more English-like:

 # the traditional way
 if ($zippy) {
     print "Yow!";
 }

 # the Perlish post-condition way
 print "Yow!" if $zippy;
 print "We have no bananas" unless $bananas;

=item while

 while ( condition ) {
     ...
 }

There's also a negated version, for the same reason we have C<unless>:

 until ( condition ) {
     ...
 }

You can also use C<while> in a post-condition:

 print "LA LA LA\n" while 1;          # loops forever

=item for

Exactly like C:

 for ($i = 0; $i <= $max; $i++) {
     ...
 }

The C style for loop is rarely needed in Perl since Perl provides
the more friendly list scanning C<foreach> loop.

=item foreach

 foreach (@array) {
     print "This element is $_\n";
 }

 print $list[$_] foreach 0 .. $max;

 # you don't have to use the default $_ either...
 foreach my $key (keys %hash) {
     print "The value of $key is $hash{$key}\n";
 }

The C<foreach> keyword is actually a synonym for the C<for>
keyword.  See C<L<perlsyn/"Foreach Loops">>.

=back

For more detail on looping constructs (and some that weren't mentioned in
this overview) see L<perlsyn>.
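
To see several of these constructs working together, here is a short
sketch with made-up data.  It combines C<foreach>, an
C<if>/C<elsif>/C<else> chain, a C<while> loop with a counter, and the
post-condition forms.

```perl
use strict;
use warnings;

my @numbers = (23, 42, 69, 7);
my @labels;

foreach my $n (@numbers) {
    if ($n > 50) {
        push @labels, "$n is large";
    } elsif ($n > 20) {
        push @labels, "$n is medium";
    } else {
        push @labels, "$n is small";
    }
}
print "$_\n" foreach @labels;            # post-condition foreach

# A while loop with a counter, followed by a post-condition unless.
my $countdown = 3;
my @ticks;
while ($countdown > 0) {
    push @ticks, $countdown;
    $countdown--;
}
print "Liftoff!\n" unless $countdown;    # runs because $countdown is now 0
```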

=head2 Builtin operators and functions

Perl comes with a wide selection of builtin functions.  Some of the ones
we've already seen include C<print>, C<sort> and C<reverse>.  A list of
them is given at the start of L<perlfunc> and you can easily read
about any given function by using C<perldoc -f I<functionname>>.

Perl operators are documented in full in L<perlop>, but here are a few
of the most common ones:

=over 4

=item Arithmetic

 +   addition
 -   subtraction
 *   multiplication
 /   division

=item Numeric comparison

 ==  equality
 !=  inequality
 <   less than
 >   greater than
 <=  less than or equal
 >=  greater than or equal

=item String comparison

 eq  equality
 ne  inequality
 lt  less than
 gt  greater than
 le  less than or equal
 ge  greater than or equal

(Why do we have separate numeric and string comparisons?  Because we don't
have special variable types, and Perl needs to know whether to sort
numerically (where 99 is less than 100) or alphabetically (where 100 comes
before 99).)

=item Boolean logic

 &&  and
 ||  or
 !   not

(C<and>, C<or> and C<not> aren't just in the above table as descriptions
of the operators.  They're also supported as operators in their own
right.  They're more readable than the C-style operators, but have
different precedence to C<&&> and friends.  Check L<perlop> for more
detail.)

=item Miscellaneous

 =   assignment
 .   string concatenation
 x   string multiplication (repeats strings)
 ..  range operator (creates a list of numbers or strings)

=back

Many operators can be combined with a C<=> as follows:

 $a += 1;        # same as $a = $a + 1
 $a -= 1;        # same as $a = $a - 1
 $a .= "\n";     # same as $a = $a . "\n";
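
The difference between numeric and string comparison is easiest to see by
running both on the same values.  The sketch below does that, and also
exercises C<x>, C<..> and C<.=>.  Note that it uses two operators not
listed in the tables above, the conditional operator C<?:> and the numeric
comparison operator C<< <=> >>; both are documented in L<perlop>.

```perl
use strict;
use warnings;

# The same two values compare differently as numbers and as strings.
my $numeric = (99 < 100)  ? "yes" : "no";   # 99 is numerically smaller
my $string  = (99 lt 100) ? "yes" : "no";   # but "99" sorts after "100"
print "99 <  100: $numeric\n";
print "99 lt 100: $string\n";

# sort() compares as strings unless told otherwise.
my @as_strings = sort(100, 99, 23);
my @as_numbers = sort { $a <=> $b } (100, 99, 23);
print "string order:  @as_strings\n";       # 100 23 99
print "numeric order: @as_numbers\n";       # 23 99 100

# Repetition, ranges and combined assignment.
my $rule   = "-" x 10;
my @digits = (1 .. 5);
my $line   = "total";
$line .= ": " . (2 + 3);
print "$rule\n@digits\n$line\n";
```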

=head2 Files and I/O

You can open a file for input or output using the C<open()> function.
It's documented in extravagant detail in L<perlfunc> and L<perlopentut>,
but in short:

 open(my $in,  "<",  "input.txt")  or die "Can't open input.txt: $!";
 open(my $out, ">",  "output.txt") or die "Can't open output.txt: $!";
 open(my $log, ">>", "my.log")     or die "Can't open my.log: $!";

You can read from an open filehandle using the C<< <> >> operator.  In
scalar context it reads a single line from the filehandle, and in list
context it reads the whole file in, assigning each line to an element of
the list:

 my $line  = <$in>;
 my @lines = <$in>;

Reading in the whole file at one time is called slurping.  It can
be useful but it may be a memory hog.  Most text file processing
can be done a line at a time with Perl's looping constructs.

The C<< <> >> operator is most often seen in a C<while> loop:

 while (<$in>) {     # assigns each line in turn to $_
     print "Just read in this line: $_";
 }

We've already seen how to print to standard output using C<print()>.
However, C<print()> can also take an optional first argument specifying
which filehandle to print to:

 print STDERR "This is your final warning.\n";
 print $out $record;
 print $log $logmessage;

When you're done with your filehandles, you should C<close()> them
(though to be honest, Perl will clean up after you if you forget):

 close $in or die "$in: $!";
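
Putting the pieces together, the following sketch writes a file, appends
to it, and reads it back a line at a time.  It is only an illustration: it
assumes the core L<File::Temp> module is available and uses its
C<tempfile()> function to get a scratch file, so that no real file is
touched.  It also uses C<chomp>, which removes a trailing newline (see
L<perlfunc>).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);    # core module; gives us a scratch file

# Create a temporary file so the example does not touch real data.
my ($tmp, $filename) = tempfile(UNLINK => 1);
close $tmp or die "Can't close $filename: $!";

# Write three lines, the middle one blank.
open(my $out, ">", $filename) or die "Can't open $filename for writing: $!";
print $out "first\n", "\n", "third\n";
close $out or die "Can't close $filename: $!";

# Append one more line.
open(my $log, ">>", $filename) or die "Can't open $filename for appending: $!";
print $log "appended\n";
close $log or die "Can't close $filename: $!";

# Read the file back one line at a time, skipping blank lines.
open(my $in, "<", $filename) or die "Can't open $filename for reading: $!";
my @kept;
while (my $line = <$in>) {
    chomp $line;                # remove the trailing newline
    next if $line eq "";
    push @kept, $line;
}
close $in or die "Can't close $filename: $!";

print "Kept: @kept\n";          # Kept: first third appended
```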

=head2 Regular expressions

Perl's regular expression support is both broad and deep, and is the
subject of lengthy documentation in L<perlrequick>, L<perlretut>, and
elsewhere.  However, in short:

=over 4

=item Simple matching

 if (/foo/)       { ... }  # true if $_ contains "foo"
 if ($a =~ /foo/) { ... }  # true if $a contains "foo"

The C<//> matching operator is documented in L<perlop>.  It operates on
C<$_> by default, or can be bound to another variable using the C<=~>
binding operator (also documented in L<perlop>).

=item Simple substitution

 s/foo/bar/;               # replaces foo with bar in $_
 $a =~ s/foo/bar/;         # replaces foo with bar in $a
 $a =~ s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar
                           # in $a

The C<s///> substitution operator is documented in L<perlop>.

=item More complex regular expressions

You don't just have to match on fixed strings.  In fact, you can match
on just about anything you could dream of by using more complex regular
expressions.  These are documented at great length in L<perlre>, but for
the meantime, here's a quick cheat sheet:

 .                   a single character
 \s                  a whitespace character (space, tab, newline,
                     ...)
 \S                  non-whitespace character
 \d                  a digit (0-9)
 \D                  a non-digit
 \w                  a word character (a-z, A-Z, 0-9, _)
 \W                  a non-word character
 [aeiou]             matches a single character in the given set
 [^aeiou]            matches a single character outside the given
                     set
 (foo|bar|baz)       matches any of the alternatives specified

 ^                   start of string
 $                   end of string

Quantifiers can be used to specify how many of the previous thing you
want to match on, where "thing" means either a literal character, one
of the metacharacters listed above, or a group of characters or
metacharacters in parentheses.

 *                   zero or more of the previous thing
 +                   one or more of the previous thing
 ?                   zero or one of the previous thing
 {3}                 matches exactly 3 of the previous thing
 {3,6}               matches between 3 and 6 of the previous thing
 {3,}                matches 3 or more of the previous thing

Some brief examples:

 /^\d+/              string starts with one or more digits
 /^$/                nothing in the string (start and end are
                     adjacent)
 /(\d\s){3}/         three digits, each followed by a whitespace
                     character (e.g. "3 4 5 ")
 /(a.)+/             matches a string in which every odd-numbered
                     letter is a (e.g. "abacadaf")

 # This loop reads from STDIN (or from files named on the command
 # line) and prints non-blank lines:
 while (<>) {
     next if /^$/;
     print;
 }

=item Parentheses for capturing

As well as grouping, parentheses serve a second purpose.  They can be
used to capture the results of parts of the regexp match for later use.
The results end up in C<$1>, C<$2> and so on.

 # a cheap and nasty way to break an email address up into parts

 if ($email =~ /([^@]+)@(.+)/) {
     print "Username is $1\n";
     print "Hostname is $2\n";
 }

=item Other regexp features

Perl regexps also support backreferences, lookaheads, and all kinds of
other complex details.  Read all about them in L<perlrequick>,
L<perlretut>, and L<perlre>.

=back
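
The sketch below applies matching, capturing and substitution to a few
invented log lines.  The pattern is deliberately simple and is not meant
as a real email address validator.

```perl
use strict;
use warnings;

my @lines = (
    "alice\@example.com logged in",
    "no address here",
    "bob\@example.org   logged   out",
);

my @users;
foreach my $line (@lines) {
    # Capture the parts before and after the @ of the first address.
    if ($line =~ /(\w+)\@([\w.]+)/) {
        push @users, "$1 at $2";
    }
}
print "$_\n" foreach @users;

# Squeeze runs of whitespace into single spaces with s///g.
my $text = $lines[2];
$text =~ s/\s+/ /g;
print "$text\n";
```

Only the first and third lines match, so C<@users> ends up with two
entries, and the substitution leaves single spaces between the words of
the third line.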

=head2 Writing subroutines

Writing subroutines is easy:

 sub logger {
    my $logmessage = shift;
    open my $logfile, ">>", "my.log" or die "Could not open my.log: $!";
    print $logfile $logmessage;
 }

Now we can use the subroutine just like any built-in function:

 logger("We have a logger subroutine!");

What's that C<shift>?  Well, the arguments to a subroutine are available
to us as a special array called C<@_> (see L<perlvar> for more on that).
The default argument to the C<shift> function just happens to be C<@_>.
So C<my $logmessage = shift;> shifts the first item off the list of
arguments and assigns it to C<$logmessage>.

We can manipulate C<@_> in other ways too:

 my ($logmessage, $priority) = @_;       # common
 my $logmessage = $_[0];                 # uncommon, and ugly

Subroutines can also return values:

 sub square {
     my $num = shift;
     my $result = $num * $num;
     return $result;
 }

Then use it like:

 $sq = square(8);

For more information on writing subroutines, see L<perlsub>.
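
As a further sketch, the two subroutines below (both invented for this
example) show the common idiom of unpacking C<@_> into named variables
with a default for a missing argument, and a subroutine that returns a
list rather than a single value.

```perl
use strict;
use warnings;

# Unpack named arguments from @_ and return a formatted string.
sub format_entry {
    my ($message, $priority) = @_;
    $priority = "info" unless defined $priority;   # default when omitted
    return "[$priority] $message";
}

# A subroutine can return a list as well as a single value.
sub min_and_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my $entry = format_entry("disk almost full", "warn");
my $plain = format_entry("started");
my ($min, $max) = min_and_max(23, 42, 7, 69);

print "$entry\n";               # [warn] disk almost full
print "$plain\n";               # [info] started
print "min $min, max $max\n";   # min 7, max 69
```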

=head2 OO Perl

OO Perl is relatively simple and is implemented using references which
know what sort of object they are based on Perl's concept of packages.
However, OO Perl is largely beyond the scope of this document.
Read L<perlootut> and L<perlobj>.

As a beginning Perl programmer, your most common use of OO Perl will be
in using third-party modules, which are documented below.
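
For the curious, here is a deliberately tiny sketch of what "references
which know what sort of object they are" means in practice.  The
C<Counter> class is hypothetical and exists only for this example;
L<perlootut> covers the subject properly.

```perl
use strict;
use warnings;

# A class is just a package; Counter is a made-up name for this sketch.
package Counter;

sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;     # the reference now "knows" its package
}

sub increment {
    my $self = shift;
    $self->{count}++;
    return $self;                   # returning $self allows chaining
}

sub count {
    my $self = shift;
    return $self->{count};
}

package main;

my $counter = Counter->new(start => 5);
$counter->increment->increment;
print "Count is ", $counter->count, "\n";       # Count is 7
print "Object is a ", ref($counter), "\n";      # Object is a Counter
```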

=head2 Using Perl modules

Perl modules provide a range of features to help you avoid reinventing
the wheel, and can be downloaded from CPAN ( L<http://www.cpan.org/> ).  A
number of popular modules are included with the Perl distribution
itself.

Categories of modules range from text manipulation to network protocols
to database integration to graphics.  A categorized list of modules is
also available from CPAN.

To learn how to install modules you download from CPAN, read
L<perlmodinstall>.

To learn how to use a particular module, use C<perldoc I<Module::Name>>.
Typically you will want to C<use I<Module::Name>>, which will then give
you access to exported functions or an OO interface to the module.
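
As an illustration, the sketch below loads two modules that ship with
Perl, L<List::Util> and L<File::Basename>.  The first is asked to export
three functions by name; the second is loaded with an empty import list,
so its function is called by its fully qualified name.  The data and the
path are made up.

```perl
use strict;
use warnings;
use List::Util qw(sum max first);   # import three functions by name
use File::Basename ();              # load without importing anything

my @numbers = (23, 42, 69);

my $total   = sum(@numbers);                    # 134
my $largest = max(@numbers);                    # 69
my $even    = first { $_ % 2 == 0 } @numbers;   # 42

# A function that was not imported can be called by its full name.
my $name = File::Basename::basename("/path/to/script.pl");

print "total $total, largest $largest, first even $even\n";
print "script name: $name\n";
```

Running C<perldoc List::Util> shows the full list of functions that module
can export.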

L<perlfaq> contains questions and answers related to many common
tasks, and often provides suggestions for good CPAN modules to use.

L<perlmod> describes Perl modules in general.  L<perlmodlib> lists the
modules which came with your Perl installation.

If you feel the urge to write Perl modules, L<perlnewmod> will give you
good advice.

=head1 AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>
715