=encoding utf8

=head1 NAME

perlopentut - simple recipes for opening files and pipes in Perl

=head1 DESCRIPTION

Whenever you do I/O on a file in Perl, you do so through what in Perl is
called a B<filehandle>.  A filehandle is an internal name for an external
file.  It is the job of the C<open> function to make the association
between the internal name and the external name, and it is the job
of the C<close> function to break that association.

For your convenience, Perl sets up a few special filehandles that are
already open when you run.  These include C<STDIN>, C<STDOUT>, C<STDERR>,
and C<ARGV>.  Since those are pre-opened, you can use them right away
without having to go to the trouble of opening them yourself:

    print STDERR "This is a debugging message.\n";

    print STDOUT "Please enter something: ";
    $response = <STDIN> // die "how come no input?";
    print STDOUT "Thank you!\n";

    while (<ARGV>) { ... }

As you see from those examples, C<STDOUT> and C<STDERR> are output
handles, and C<STDIN> and C<ARGV> are input handles.  They are
in all capital letters because they are reserved to Perl, much
like the C<@ARGV> array and the C<%ENV> hash are.  Their external
associations were set up by your shell.

You will need to open every other filehandle on your own. Although there
are many variants, the most common way to call Perl's open() function
is with three arguments and one return value:

C<    I<OK> = open(I<HANDLE>, I<MODE>, I<PATHNAME>)>

Where:

=over

=item I<OK>

will be some defined value if the open succeeds, but
C<undef> if it fails;

=item I<HANDLE>

should be an undefined scalar variable to be filled in by the
C<open> function if it succeeds;

=item I<MODE>

is the access mode and the encoding format to open the file with;

=item I<PATHNAME>

is the external name of the file you want opened.

=back

Most of the complexity of the C<open> function lies in the many
possible values that the I<MODE> parameter can take on.

One last thing before we show you how to open files: opening
files does not (usually) automatically lock them in Perl.  See
L<perlfaq5> for how to lock.
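
As a minimal sketch of the advisory locking that L<perlfaq5> discusses,
the following uses the core C<flock> function with constants from the
standard C<Fcntl> module.  The C<append_locked> helper and the file name
are made up for illustration, and the lock only protects you from other
programs that also call C<flock> on the same file.

    use strict;
    use warnings;
    use Fcntl qw(:flock);   # imports LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB

    # Hypothetical helper: append one line while holding an
    # exclusive advisory lock on the file.
    sub append_locked {
        my ($filename, $text) = @_;

        open(my $handle, ">> :encoding(UTF-8)", $filename)
            || die "$0: can't open $filename for appending: $!";
        flock($handle, LOCK_EX)
            || die "$0: can't lock $filename: $!";

        print {$handle} $text, "\n"
            or die "$0: can't write to $filename: $!";

        # Closing the handle flushes the buffer and releases the lock.
        close($handle)
            || die "$0: can't close $filename: $!";
        return 1;
    }

    # The path is an assumption; substitute a file of your own.
    append_locked("/tmp/perlopentut-demo.log", "one locked line");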

=head1 Opening Text Files

=head2 Opening Text Files for Reading

If you want to read from a text file, first open it in
read-only mode like this:

    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";
    my $handle   = undef;     # this will be filled in on success

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

As with the shell, in Perl the C<< "<" >> is used to open the file in
read-only mode.  If it succeeds, Perl allocates a brand new filehandle for
you and fills in your previously undefined C<$handle> argument with a
reference to that handle.

Now you may use functions like C<readline>, C<read>, C<getc>, and
C<sysread> on that handle.  Probably the most common input function
is the one that looks like an operator:

    $line = readline($handle);
    $line = <$handle>;          # same thing

Because the C<readline> function returns C<undef> at end of file or
upon error, you will sometimes see it used this way:

    $line = <$handle>;
    if (defined $line) {
        # do something with $line
    }
    else {
        # $line is not valid, so skip it
    }

You can also just quickly C<die> on an undefined value this way:

    $line = <$handle> // die "no input found";

However, if hitting EOF is an expected and normal event, you do not want to
exit simply because you have run out of input.  Instead, you probably just want
to exit an input loop.  You can then test to see if an actual error has caused
the loop to terminate, and act accordingly:

    while (<$handle>) {
        # do something with data in $_
    }
    if ($!) {
        die "unexpected error while reading from $filename: $!";
    }
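
Putting the pieces above together, here is a minimal sketch that reads a
whole file line by line.  The C<count_lines> helper is hypothetical.
Because C<$!> can still hold a stale value from some earlier system call,
this sketch uses the variant shown for C<readline> in L<perlfunc>: loop
until C<eof>, and treat an undefined return inside the loop as an error.

    use strict;
    use warnings;

    # Hypothetical helper: count the lines in a UTF-8 text file.
    sub count_lines {
        my ($filename) = @_;

        open(my $handle, "< :encoding(UTF-8)", $filename)
            || die "$0: can't open $filename for reading: $!";

        my $count = 0;
        while (!eof($handle)) {
            # We are not at EOF, so undef here should mean
            # that the read itself failed.
            defined(my $line = readline($handle))
                || die "$0: error while reading from $filename: $!";
            $count++;
        }

        close($handle) || die "$0: can't close $filename: $!";
        return $count;
    }

Calling C<count_lines($0)>, for instance, would count the lines of the
running script itself.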

B<A Note on Encodings>: Having to specify the text encoding every time
might seem a bit of a bother.  To set up a default encoding for C<open> so
that you don't have to supply it each time, you can use the C<open> pragma:

    use open qw< :encoding(UTF-8) >;

Once you've done that, you can safely omit the encoding part of the
open mode:

    open($handle, "<", $filename)
        || die "$0: can't open $filename for reading: $!";

But never use the bare C<< "<" >> without having set up a default encoding
first.  Otherwise, Perl cannot know which of the many, many, many possible
flavors of text file you have, and Perl will have no idea how to correctly
map the data in your file into actual characters it can work with.  Other
common encoding formats include C<"ASCII">, C<"ISO-8859-1">,
C<"ISO-8859-15">, C<"Windows-1252">, C<"MacRoman">, and even C<"UTF-16LE">.
See L<perlunitut> for more about encodings.
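
To illustrate why the encoding layer matters, this sketch copies a text
file from one encoding to another by decoding on input and encoding on
output.  The C<transcode> helper and its argument order are hypothetical;
the encoding names in the usage comment are ones listed above.

    use strict;
    use warnings;

    # Hypothetical helper: rewrite $from_file, which is encoded
    # in $from_enc, as $to_file encoded in $to_enc.
    sub transcode {
        my ($from_file, $from_enc, $to_file, $to_enc) = @_;

        open(my $in,  "< :encoding($from_enc)", $from_file)
            || die "$0: can't open $from_file for reading: $!";
        open(my $out, "> :encoding($to_enc)",   $to_file)
            || die "$0: can't open $to_file for writing: $!";

        # Each line is decoded into characters on input and
        # encoded back into bytes on output.
        while (my $line = <$in>) {
            print {$out} $line
                or die "$0: can't write to $to_file: $!";
        }

        close($in)  || die "$0: can't close $from_file: $!";
        close($out) || die "$0: can't close $to_file: $!";
        return 1;
    }

    # For example:
    #   transcode("old.txt", "ISO-8859-1", "new.txt", "UTF-8");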

=head2 Opening Text Files for Writing

When you want to write to a file, you first have to decide what to do about
any existing contents of that file.  You have two basic choices here: to
preserve or to clobber.

If you want to preserve any existing contents, then you want to open the file
in append mode.  As in the shell, in Perl you use C<<< ">>" >>> to open an
existing file in append mode.  C<<< ">>" >>> creates the file if it does not
already exist.

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

Now you can write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.

As noted above, if the file does not already exist, then the append-mode open
will create it for you.  But if the file does already exist, its contents are
safe from harm because you will be adding your new text past the end of the
old text.

On the other hand, sometimes you want to clobber whatever might already be
there.  To empty out a file before you start writing to it, you can open it
in write-only mode:

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!";

Here again Perl works just like the shell in that the C<< ">" >> clobbers
an existing file.

As with the append mode, when you open a file in write-only mode,
you can now write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.
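
The difference between the two write modes is easy to see in a small
experiment.  This sketch writes to the same file first in write-only
mode, then in append mode, then in write-only mode again.  The
C<write_lines> helper is hypothetical, and the scratch file comes from
the core C<File::Temp> module.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical helper: open $filename in $mode (">" or ">>")
    # and print each remaining argument as one line.
    sub write_lines {
        my ($mode, $filename, @lines) = @_;

        open(my $handle, "$mode :encoding(UTF-8)", $filename)
            || die "$0: can't open $filename with mode $mode: $!";
        print {$handle} "$_\n" for @lines;
        close($handle) || die "$0: can't close $filename: $!";
        return 1;
    }

    # A scratch file that is removed when the program exits.
    my (undef, $filename) = tempfile(UNLINK => 1);

    write_lines(">",  $filename, "first", "second"); # 2 lines
    write_lines(">>", $filename, "third");           # 3 lines
    write_lines(">",  $filename, "fresh start");     # 1 line again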

What about read-write mode?  You should probably pretend it doesn't exist,
because opening text files in read-write mode is unlikely to do what you
would like.  See L<perlfaq5> for details.

=head1 Opening Binary Files

If the file to be opened contains binary data instead of text characters,
then the C<MODE> argument to C<open> is a little different.  Instead of
specifying the encoding, you tell Perl that your data are in raw bytes.

    my $filename = "/some/path/to/a/binary/file/goes/here";
    my $encoding = ":raw :bytes";
    my $handle   = undef;     # this will be filled in on success

And then open as before, choosing C<<< "<" >>>, C<<< ">>" >>>, or
C<<< ">" >>> as needed:

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!";

Alternately, you can change to binary mode on an existing handle this way:

    binmode($handle)    || die "cannot binmode handle";

This is especially handy for the handles that Perl has already opened for you.

    binmode(STDIN)      || die "cannot binmode STDIN";
    binmode(STDOUT)     || die "cannot binmode STDOUT";

You can also pass C<binmode> an explicit encoding to change it on the fly.
This isn't exactly "binary" mode, but we still use C<binmode> to do it:

  binmode(STDIN,  ":encoding(MacRoman)") || die "cannot binmode STDIN";
  binmode(STDOUT, ":encoding(UTF-8)")    || die "cannot binmode STDOUT";

Once you have your binary file properly opened in the right mode, you can
use all the same Perl I/O functions as you used on text files.  However,
you may wish to use the fixed-size C<read> instead of the variable-sized
C<readline> for your input.
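
As a sketch of that last point, here is one way to pull fixed-size
chunks out of a binary file with C<read>.  The C<read_records> helper,
the record size, and the idea that the file is a sequence of equal-sized
records are all assumptions made up for this example.

    use strict;
    use warnings;

    # Hypothetical helper: return the contents of $filename as a
    # list of $size-byte records.  A short final record is
    # returned as-is.
    sub read_records {
        my ($filename, $size) = @_;

        open(my $handle, "< :raw", $filename)
            || die "$0: can't open $filename for reading: $!";

        my @records;
        while (1) {
            my $got = read($handle, my $buffer, $size);
            defined($got)
                || die "$0: error reading $filename: $!";
            last if $got == 0;      # end of file
            push @records, $buffer;
        }

        close($handle) || die "$0: can't close $filename: $!";
        return @records;
    }

Unlike C<readline>, which stops at each line terminator, C<read> here
hands back the requested number of bytes no matter what they contain.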

Here's an example of how to copy a binary file:

    my $BUFSIZ   = 64 * (2 ** 10);
    my $name_in  = "/some/input/file";
    my $name_out = "/some/output/file";

    my($in_fh, $out_fh, $buffer);

    open($in_fh,  "<", $name_in)
        || die "$0: cannot open $name_in for reading: $!";
    open($out_fh, ">", $name_out)
        || die "$0: cannot open $name_out for writing: $!";

    for my $fh ($in_fh, $out_fh)  {
        binmode($fh)               || die "binmode failed";
    }
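
    my $got;
    while ($got = read($in_fh, $buffer, $BUFSIZ)) {
        unless (print $out_fh $buffer) {
            die "couldn't write to $name_out: $!";
        }
    }
    # read returns 0 at end of file, but undef on error
    defined($got)       || die "couldn't read from $name_in: $!";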

    close($in_fh)       || die "couldn't close $name_in: $!";
    close($out_fh)      || die "couldn't close $name_out: $!";

=head1 Opening Pipes

To be announced.

=head1 Low-level File Opens via sysopen

To be announced.  Or deleted.

=head1 SEE ALSO

To be announced.

=head1 AUTHOR and COPYRIGHT

Copyright 2013 Tom Christiansen.

This documentation is free; you can redistribute it and/or modify it under
the same terms as Perl itself.
279