=head1 NAME

perlfaq5 - Files and Formats

=head1 DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.

=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
X<flush> X<buffer> X<unbuffer> X<autoflush>

(contributed by brian d foy)

You might like to read Mark Jason Dominus's "Suffering From Buffering"
at L<http://perl.plover.com/FAQs/Buffering.html> .

Perl normally buffers output so it doesn't make a system call for every
bit of output. By saving up output, it makes fewer expensive system calls.
For instance, in this little bit of code, you want to print a dot to the
screen for every line you process to watch the progress of your program.
Instead of seeing a dot for every line, Perl buffers the output and you
have a long wait before you see a row of 50 dots all at once:

    # long wait, then row of dots all at once
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

To get around this, you have to unbuffer the output filehandle, in this
case, C<STDOUT>. You can set the special variable C<$|> to a true value
(mnemonic: making your filehandles "piping hot"):

    $|++;

    # dot shown immediately
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

C<$|> is one of the per-filehandle special variables, so each
filehandle has its own copy of its value. If you want to merge
standard output and standard error for instance, you have to unbuffer
each (although STDERR might be unbuffered by default):

    {
        my $previous_default = select(STDOUT);  # save previous default
        $|++;                                   # autoflush STDOUT
        select(STDERR);
        $|++;                                   # autoflush STDERR, to be sure
        select($previous_default);              # restore previous default
    }

    # now should alternate . and +
    while( 1 ) {
        sleep 1;
        print STDOUT ".";
        print STDERR "+";
        print STDOUT "\n" unless ++$count % 25;
    }

Besides the C<$|> special variable, you can use C<binmode> to give
your filehandle a C<:unix> layer, which is unbuffered:

    binmode( STDOUT, ":unix" );

    while( 1 ) {
        sleep 1;
        print ".";
        print "\n" unless ++$count % 50;
    }

For more information on output layers, see the entries for C<binmode>
and L<open> in L<perlfunc>, and the L<PerlIO> module documentation.

If you are using L<IO::Handle> or one of its subclasses, you can
call the C<autoflush> method to change the settings of the
filehandle:

    use IO::Handle;
    open my( $io_fh ), ">", "output.txt"
        or die "Can't open output.txt: $!";
    $io_fh->autoflush(1);

The L<IO::Handle> objects also have a C<flush> method. You can flush
the buffer any time you want without turning on autoflush:

    $io_fh->flush;

=head2 How do I change, delete, or insert a line in a file, or append to the beginning of a file?
X<file, editing>

(contributed by brian d foy)

The basic idea of inserting, changing, or deleting a line from a text
file involves reading and printing the file to the point you want to
make the change, making the change, then reading and printing the rest
of the file. Perl doesn't provide random access to lines (especially
since the record input separator, C<$/>, is mutable), although modules
such as L<Tie::File> can fake it.

A Perl program to do these tasks takes the basic form of opening a
file, printing its lines, then closing the file:

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    while( <$in> ) {
        print $out $_;
    }

    close $out;

Within that basic form, add the parts that you need to insert, change,
or delete lines.
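
The basic form above can be wrapped up in a small helper, shown here as
a minimal sketch. The name C<rewrite_file> and its callback interface
are hypothetical, not part of any module. The final C<rename>, which
replaces the original file with the new one, is an assumption about
what you want and may behave differently across platforms and file
systems.

```perl
use strict;
use warnings;

# rewrite_file() is a hypothetical helper used for illustration only.
# $edit receives each line and its line number; it returns the text to
# write in place of that line, or undef to drop the line.
sub rewrite_file {
    my( $file, $edit ) = @_;

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    while( my $line = <$in> ) {
        my $new = $edit->( $line, $. );
        print {$out} $new if defined $new;
    }

    close $in;
    close $out or die "Can't close new file: $!";

    # Assumption: you want the new file to replace the original.
    rename "$file.new", $file or die "Can't rename $file.new: $!";
    return;
}
```

For example, C<< rewrite_file( $file, sub { $_[1] % 5 ? $_[0] : undef } ) >>
would drop every fifth line.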

To prepend lines to the beginning, print those lines before you enter
the loop that prints the existing lines.

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC

    while( <$in> ) {
        print $out $_;
    }

    close $out;

To change existing lines, insert the code to modify the lines inside
the C<while> loop. In this case, the code finds all lowercased
versions of "perl" and capitalizes them. This happens for every line, so
be sure that you're supposed to do that on every line!

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n";

    while( <$in> ) {
        s/\b(perl)\b/Perl/g;
        print $out $_;
    }

    close $out;

To change only a particular line, the input line number, C<$.>, is
useful. First read and print the lines up to the one you want to
change. Next, read the single line you want to change, change it, and
print it. After that, read the rest of the lines and print those:

    while( <$in> ) { # print the lines before the change
        print $out $_;
        last if $. == 4; # line number before change
    }

    my $line = <$in>;
    $line =~ s/\b(perl)\b/Perl/g;
    print $out $line;

    while( <$in> ) { # print the rest of the lines
        print $out $_;
    }

To skip lines, use the looping controls. The C<next> in this example
skips comment lines, and the C<last> stops all processing once it
encounters either C<__END__> or C<__DATA__>.

    while( <$in> ) {
        next if /^\s*#/;            # skip comment lines
        last if /^__(END|DATA)__$/; # stop at end of code marker
        print $out $_;
    }

Do the same sort of thing to delete a particular line by using C<next>
to skip the lines you don't want to show up in the output. This
example skips every fifth line:

    while( <$in> ) {
        next unless $. % 5;
        print $out $_;
    }

If, for some odd reason, you really want to see the whole file at once
rather than processing line-by-line, you can slurp it in (as long as
you can fit the whole thing in memory!):

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    my $content = do { local $/; <$in> }; # slurp!

        # do your magic here

    print $out $content;

Modules such as L<File::Slurp> and L<Tie::File> can help with that
too. If you can, however, avoid reading the entire file at once. Perl
won't give that memory back to the operating system until the process
finishes.

You can also use Perl one-liners to modify a file in-place. The
following changes all 'Fred' to 'Barney' in F<inFile.txt>, overwriting
the file with the new contents. With the C<-p> switch, Perl wraps a
C<while> loop around the code you specify with C<-e>, and C<-i> turns
on in-place editing. The current line is in C<$_>. With C<-p>, Perl
automatically prints the value of C<$_> at the end of the loop. See
L<perlrun> for more details.

    perl -pi -e 's/Fred/Barney/' inFile.txt

To make a backup of F<inFile.txt>, give C<-i> a file extension to add:

    perl -pi.bak -e 's/Fred/Barney/' inFile.txt

To change only the fifth line, you can add a test checking C<$.>, the
input line number, then only perform the operation when the test
passes:

    perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt

To add lines before a certain line, you can add a line (or lines!)
before Perl prints C<$_>:

    perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt

You can even add a line to the beginning of a file, since the current
line prints at the end of the loop:

    perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt

To insert a line after one already in the file, use the C<-n> switch.
It's just like C<-p> except that it doesn't print C<$_> at the end of
the loop, so you have to do that yourself. In this case, print C<$_>
first, then print the line that you want to add.

    perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt

To delete lines, only print the ones that you want.

    perl -ni -e 'print if /d/' inFile.txt

=head2 How do I count the number of lines in a file?
X<file, counting lines> X<lines> X<line>

(contributed by brian d foy)

Conceptually, the easiest way to count the lines in a file is to
simply read them and count them:

    my $count = 0;
    while( <$fh> ) { $count++; }

You don't really have to count them yourself, though, since Perl
already does that with the C<$.> variable, which is the current line
number from the last filehandle read:

    1 while( <$fh> );
    my $count = $.;

If you want to use C<$.>, you can reduce it to a simple one-liner,
like one of these:

    % perl -lne '} print $.; {'    file

    % perl -lne 'END { print $. }' file

Those can be rather inefficient though. If they aren't fast enough for
you, you might just read chunks of data and count the number of
newlines:

    my $lines = 0;
    my $buffer;
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ tr/\n// );
    }
    close $fh;

However, that doesn't work if the line ending isn't a newline.
You might change that C<tr///> to a C<s///> so you can count the number of
times the input record separator, C<$/>, shows up:

    my $lines = 0;
    my $buffer;
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ s|$/||g );
    }
    close $fh;

If you don't mind shelling out, the C<wc> command is usually the
fastest, even with the extra interprocess overhead. Ensure that you
have an untainted filename though:

    #!perl -T

    $ENV{PATH} = undef;

    my $lines;
    if( $filename =~ /^([0-9a-z_.]+)\z/ ) {
        $lines = `/usr/bin/wc -l $1`;
        chomp $lines;
    }

=head2 How do I delete the last N lines from a file?
X<lines> X<file>

(contributed by brian d foy)

The easiest conceptual solution is to count the lines in the
file then start at the beginning and print the number of lines
(minus the last N) to a new file.

Most often, the real question is how you can delete the last N lines
without making more than one pass over the file, or how to do it
without a lot of copying. The easy concept is the hard reality when
you might have millions of lines in your file.

One trick is to use L<File::ReadBackwards>, which starts at the end of
the file. That module provides an object that wraps the real filehandle
to make it easy for you to move around the file. Once you get to the
spot you need, you can get the actual filehandle and work with it as
normal.
In this case, you get the file position at the end of the last
line you want to keep and truncate the file to that point:

    use File::ReadBackwards;

    my $filename = 'test.txt';
    my $Lines_to_truncate = 2;

    my $bw = File::ReadBackwards->new( $filename )
        or die "Could not read backwards in [$filename]: $!";

    my $lines_from_end = 0;
    until( $bw->eof or $lines_from_end == $Lines_to_truncate ) {
        print "Got: ", $bw->readline;
        $lines_from_end++;
    }

    truncate( $filename, $bw->tell );

The L<File::ReadBackwards> module also has the advantage of setting
the input record separator to a regular expression.

You can also use the L<Tie::File> module which lets you access
the lines through a tied array. You can use normal array operations
to modify your file, including setting the last index and using
C<splice>.

=head2 How can I use Perl's C<-i> option from within a program?
X<-i> X<in-place>

C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
the behavior of C<< <> >>; see L<perlrun> for more details. By
modifying the appropriate variables directly, you can get the same
behavior within a larger program. For example:

    # ...
    {
        local($^I, @ARGV) = ('.orig', glob("*.c"));
        while (<>) {
            if ($. == 1) {
                print "This line should appear at the top of each file\n";
            }
            s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
            print;
            close ARGV if eof;              # Reset $.
        }
    }
    # $^I and @ARGV return to their old values here

This block modifies all the C<.c> files in the current directory,
leaving a backup of the original data from each file in a new
C<.c.orig> file.

=head2 How can I copy a file?
X<copy> X<file, copy> X<File::Copy>

(contributed by brian d foy)

Use the L<File::Copy> module.
It comes with Perl and can do a
true copy across file systems, and it does its magic in
a portable fashion.

    use File::Copy;

    copy( $original, $new_copy ) or die "Copy failed: $!";

If you can't use L<File::Copy>, you'll have to do the work yourself:
open the original file, open the destination file, then print
to the destination file as you read the original. You also have to
remember to copy the permissions, owner, and group to the new file.

=head2 How do I make a temporary file name?
X<file, temporary>

If you don't need to know the name of the file, you can use C<open()>
with C<undef> in place of the file name. In Perl 5.8 or later, the
C<open()> function creates an anonymous temporary file:

    open my $tmp, '+>', undef or die $!;

Otherwise, you can use the File::Temp module.

    use File::Temp qw/ tempfile tempdir /;

    my $dir = tempdir( CLEANUP => 1 );
    my( $fh, $filename ) = tempfile( DIR => $dir );

    # or if you don't need to know the filename

    my $fh = tempfile( DIR => $dir );

File::Temp has been a standard module since Perl 5.6.1. If you
don't have a modern enough Perl installed, use the C<new_tmpfile>
class method from the IO::File module to get a filehandle opened for
reading and writing. Use it if you don't need to know the file's name:

    use IO::File;
    my $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";

If you're committed to creating a temporary file by hand, use the
process ID and/or the current time-value. If you need to have many
temporary files in one process, use a counter:

    BEGIN {
        use Fcntl;
        my $temp_dir  = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP};
        my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time;

        sub temp_file {
            my $fh;
            my $count = 0;
            until( defined(fileno($fh)) || $count++ > 100 ) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                # O_EXCL is required for security reasons.
                sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT;
            }

            if( defined fileno($fh) ) {
                return ($fh, $base_name);
            }
            else {
                return ();
            }
        }
    }

=head2 How can I manipulate fixed-record-length files?
X<fixed-length> X<file, fixed-length records>

The most efficient way is using L<pack()|perlfunc/"pack"> and
L<unpack()|perlfunc/"unpack">. This is faster than using
L<substr()|perlfunc/"substr"> when taking many, many strings. It is
slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:

    # sample input line:
    #   15158 p5  T      0:00 perl /home/tchrist/scripts/now-what
    my $PS_T = 'A6 A4 A7 A5 A*';
    open my $ps, '-|', 'ps';
    print scalar <$ps>;
    my @fields = qw( pid tt stat time command );
    while (<$ps>) {
        my %process;
        @process{@fields} = unpack($PS_T, $_);
        for my $field ( @fields ) {
            print "$field: <$process{$field}>\n";
        }
        print 'line=', pack($PS_T, @process{@fields} ), "\n";
    }

We've used a hash slice in order to easily handle the fields of each row.
Storing the keys in an array makes it easy to operate on them as a
group or loop over them with C<for>. It also avoids polluting the program
with global variables and using symbolic references.

=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?
X<filehandle, local> X<filehandle, passing> X<filehandle, reference>

As of perl5.6, open() autovivifies file and directory handles
as references if you pass it an uninitialized scalar variable.
You can then pass these references just like any other scalar,
and use them in the place of named handles.

    open my    $fh, $file_name;

    open local $fh, $file_name;

    print $fh "Hello World!\n";

    process_file( $fh );

If you like, you can store these filehandles in an array or a hash.
If you access them directly, they aren't simple scalars and you
need to give C<print> a little help by placing the filehandle
reference in braces. Perl can only figure it out on its own when
the filehandle reference is a simple scalar.

    my @fhs = ( $fh1, $fh2, $fh3 );

    for( $i = 0; $i <= $#fhs; $i++ ) {
        print {$fhs[$i]} "just another Perl answer, \n";
    }

Before perl5.6, you had to deal with various typeglob idioms
which you may see in older code.

    open FILE, "> $filename";
    process_typeglob(   *FILE );
    process_reference( \*FILE );

    sub process_typeglob  { local *FH = shift; print FH  "Typeglob!" }
    sub process_reference { local $fh = shift; print $fh "Reference!" }

If you want to create many anonymous handles, you should
check out the Symbol or IO::Handle modules.

=head2 How can I use a filehandle indirectly?
X<filehandle, indirect>

An indirect filehandle is the use of something other than a symbol
in a place that a filehandle is expected. Here are ways
to get indirect filehandles:

    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or, you can use the C<new> method from one of the IO::* modules to
create an anonymous filehandle and store that in a scalar variable.

    use IO::Handle;                     # 5.004 or higher
    my $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead.
An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<< <FH> >> diamond operator will accept either a named filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    my $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write
the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print  FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles.
(They might also work with strings under some circumstances, but this
is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable
before using it. That is because only simple scalar variables, not
expressions or subscripts of hashes or arrays, can be used with
built-ins like C<print>, C<printf>, or the diamond operator. Using
something other than a simple scalar variable as a filehandle is
illegal and won't even compile:

    my @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    my $got = <$fd[0]>;                                 # WRONG
    print $fd[2] "What was that: $got";                 # WRONG

With C<print> and C<printf>, you get around this by using a block and
an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:

    my $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";

This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.

    $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not
related to whether they're strings, typeglobs, objects, or anything else.
It's the syntax of the fundamental operators. Playing the object
game doesn't help you at all here.

=head2 How can I set up a footer format to be used with write()?
X<footer>

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.

=head2 How can I write() into a string?
X<write, into a string>

(contributed by brian d foy)

If you want to C<write> into a string, you just have to C<open> a
filehandle to a string, which Perl has been able to do since Perl 5.8:

    open FH, '>', \my $string;
    write( FH );

Since you want to be a good programmer, you probably want to use a lexical
filehandle, even though formats are designed to work with bareword filehandles
since the default format names take the filehandle name. However, you can
control this with some Perl special per-filehandle variables: C<$^>, which
names the top-of-page format, and C<$~> which shows the line format.
You have
to change the default filehandle to set these variables:

    open my($fh), '>', \my $string;

    { # set per-filehandle variables
        my $old_fh = select( $fh );
        $~ = 'ANIMAL';
        $^ = 'ANIMAL_TOP';
        select( $old_fh );
    }

    format ANIMAL_TOP =
     ID  Type    Name
    .

    format ANIMAL =
    @##   @<<<    @<<<<<<<<<<<<<<
    $id,  $type,  $name
    .

Although write can work with lexical or package variables, whatever variables
you use have to be in scope in the format. That most likely means you'll want to
localize some package variables:

    {
        local( $id, $type, $name ) = qw( 12 cat Buster );
        write( $fh );
    }

    print $string;

There are also some tricks that you can play with C<formline> and the
accumulator variable C<$^A>, but you lose a lot of the value of formats
since C<formline> won't handle paging and so on. You end up reimplementing
formats when you use them.

=head2 How can I open a filehandle to a string?
X<string> X<open> X<IO::String> X<filehandle>

(contributed by Peter J. Holzer, hjp-usenet2@hjp.at)

Since Perl 5.8.0 a file handle referring to a string can be created by
calling open with a reference to that string instead of the filename.
This file handle can then be used to read from or write to the string:

    open(my $fh, '>', \$string) or die "Could not open string for writing";
    print $fh "foo\n";
    print $fh "bar\n";    # $string now contains "foo\nbar\n"

    open(my $fh, '<', \$string) or die "Could not open string for reading";
    my $x = <$fh>;    # $x now contains "foo\n"

With older versions of Perl, the L<IO::String> module provides similar
functionality.

=head2 How can I output my numbers with commas added?
X<number, commify>

(contributed by brian d foy and Benjamin Goldberg)

You can use L<Number::Format> to separate places in a number.
It handles locale information for those of you who want to insert
full stops instead (or anything else that they want to use,
really).

This subroutine will add commas to your number:

    sub commify {
        local $_  = shift;
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
        return $_;
    }

This regex from Benjamin Goldberg will add commas to numbers:

    s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;

It is easier to see with comments:

    s/(
        ^[-+]?             # beginning of number.
        \d+?               # first digits before first comma
        (?=                # followed by, (but not included in the match) :
            (?>(?:\d{3})+) # some positive multiple of three digits.
            (?!\d)         # an *exact* multiple, not x * 3 + 1 or whatever.
        )
        |                  # or:
        \G\d{3}            # after the last group, get three digits
        (?=\d)             # but they have to have more digits after them.
    )/$1,/xg;

=head2 How can I translate tildes (~) in a filename?
X<tilde> X<tilde expansion>

Use the E<lt>E<gt> (C<glob()>) operator, documented in L<perlfunc>.
Versions of Perl older than 5.6 require that you have a shell
installed that groks tildes. Later versions of Perl have this feature
built in. The L<File::KGlob> module (available from CPAN) gives more
portable glob functionality.

Within Perl, you may use this directly:

    $filename =~ s{
        ^ ~             # find a leading tilde
        (               # save this in $1
            [^/]        # a non-slash character
                *       # repeated 0 or more times (0 means me)
        )
    }{
        $1
            ? (getpwnam($1))[7]
            : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

=head2 How come when I open a file read-write it wipes it out?
X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating>

Because you're using something like this, which truncates the file
I<then> gives you read-write access:

    open my $fh, '+>', '/path/name'; # WRONG (almost always)

Whoops.
You should instead use this, which will fail if the file
doesn't exist:

    open my $fh, '+<', '/path/name'; # open for update

Using ">" always clobbers or creates. Using "<" never does
either. The "+" doesn't change this.

Here are examples of many kinds of file opens. Those using C<sysopen>
all assume that you've pulled in the constants from L<Fcntl>:

    use Fcntl;

To open file for reading:

    open my $fh, '<', $path                               or die $!;
    sysopen my $fh, $path, O_RDONLY                       or die $!;

To open file for writing, create new file if needed or else truncate old file:

    open my $fh, '>', $path                               or die $!;
    sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT       or die $!;
    sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!;

To open file for writing, create new file, file must not exist:

    sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT        or die $!;
    sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666  or die $!;

To open file for appending, create if necessary:

    open my $fh, '>>', $path                              or die $!;
    sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT      or die $!;
    sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!;

To open file for appending, file must exist:

    sysopen my $fh, $path, O_WRONLY|O_APPEND              or die $!;

To open file for update, file must exist:

    open my $fh, '+<', $path                              or die $!;
    sysopen my $fh, $path, O_RDWR                         or die $!;

To open file for update, create file if necessary:

    sysopen my $fh, $path, O_RDWR|O_CREAT                 or die $!;
    sysopen my $fh, $path, O_RDWR|O_CREAT, 0666           or die $!;

To open file for update, file must not exist:

    sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT          or die $!;
    sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666    or die $!;

To open a file without blocking, creating if necessary:

    sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT
        or die "can't open /foo/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't as exclusive as you might wish.

See also L<perlopentut>.

=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
X<argument list too long>

The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
csh(1) to do the actual glob expansion, but
csh can't handle more than 127 items and so gives the error message
C<Argument list too long>. People who installed tcsh as csh won't
have this problem, but their users may be surprised by it.

To get around this, either upgrade to Perl v5.6.0 or later, do the glob
yourself with readdir() and patterns, or use a module like L<File::Glob>,
one that doesn't use the shell to do globbing.

=head2 How can I open a file with a leading ">" or trailing blanks?
X<filename, special characters>

(contributed by Brian McCauley)

The special two-argument form of Perl's open() function ignores
trailing blanks in filenames and infers the mode from certain leading
characters (or a trailing "|"). In older versions of Perl this was the
only version of open() and so it is prevalent in old code and books.

Unless you have a particular reason to use the two-argument form you
should use the three-argument form of open() which does not treat any
characters in the filename as special.

    open my $fh, "<", "  file  ";  # filename is "  file  "
    open my $fh, ">", ">file";     # filename is ">file"

=head2 How can I reliably rename a file?
X<rename> X<mv> X<move> X<file, rename>

If your operating system supports a proper mv(1) utility or its
functional equivalent, this works:

    rename($old, $new) or system("mv", $old, $new);

It may be more portable to use the L<File::Copy> module instead.
You just copy the file to the new name (checking return
values), then delete the old one. This isn't really the same
semantically as a C<rename()>, which preserves meta-information like
permissions, timestamps, inode info, etc.

=head2 How can I lock a file?
X<lock> X<file, lock> X<flock>

Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
later), and lockf(3) if neither of the two previous system calls exists.
On some systems, it may even use a different form of native locking.
Here are some gotchas with Perl's flock():

=over 4

=item 1

Produces a fatal error if none of the three system calls (or their
close equivalent) exists.

=item 2

lockf(3) does not provide shared locking, and requires that the
filehandle be open for writing (or appending, or read/writing).

=item 3

Some versions of flock() can't lock files over a network (e.g. on NFS file
systems), so you'd need to force the use of fcntl(2) when you build Perl.
But even this is dubious at best. See the flock entry of L<perlfunc>
and the F<INSTALL> file in the source distribution for information on
building Perl to do this.

Two potentially non-obvious but traditional flock semantics are that
it waits indefinitely until the lock is granted, and that its locks are
I<merely advisory>. Such discretionary locks are more flexible, but
offer fewer guarantees. This means that files locked with flock() may
be modified by programs that do not also use flock().
Cars that stop
for red lights get on well with each other, but not with cars that don't
stop for red lights. See the perlport manpage, your port's specific
documentation, or your system-specific local manpages for details. It's
best to assume traditional behavior if you're writing portable programs.
(If you're not, you should as always feel perfectly free to write
for your own system's idiosyncrasies (sometimes called "features").
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)

For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).

=back

=head2 Why can't I just open(FH, "E<gt>file.lock")?
X<lock, lockfile race condition>

A common bit of code B<NOT TO USE> is this:

    sleep(3) while -e 'file.lock';    # PLEASE DO NOT USE
    open my $lock, '>', 'file.lock';  # THIS BROKEN CODE

This is a classic race condition: you take two steps to do something
which must be done in one. That's why computer hardware provides an
atomic test-and-set instruction. In theory, this "ought" to work:

    sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT
        or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also less than desirable.

=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
X<counter> X<file, counter>

Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity. It's better to pick a random number;
they're more realistic.

Anyway, this is what you can do if you can't help yourself.

    use Fcntl qw(:DEFAULT :flock);
    sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!";
    flock $fh, LOCK_EX or die "can't flock numfile: $!";
    my $num = <$fh> || 0;
    seek $fh, 0, 0 or die "can't rewind numfile: $!";
    truncate $fh, 0 or die "can't truncate numfile: $!";
    (print $fh $num+1, "\n") or die "can't write numfile: $!";
    close $fh or die "can't close numfile: $!";

Here's a much better web-page hit counter:

    $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?
X<append> X<file, append>

If you are on a system that correctly implements C<flock> and you use
the example appending code from "perldoc -f flock" everything will be
OK even if the OS you are on doesn't implement append mode correctly
(if such a system exists). So if you are happy to restrict yourself to
OSs that implement C<flock> (and that's not really much of a
restriction) then that is what you should do.

If you know you are only going to use a system that does correctly
implement appending (i.e. not Win32) then you can omit the C<seek>
from the code in the previous answer.

If you know you are only writing code to run on an OS and filesystem
that does implement append mode correctly (a local filesystem on a
modern Unix for example), and you keep the file in block-buffered mode
and you write less than one buffer-full of output between each manual
flushing of the buffer then each bufferload is almost guaranteed to be
written to the end of the file in one chunk without getting
intermingled with anyone else's output.
You can also use the
C<syswrite> function which is simply a wrapper around your system's
C<write(2)> system call.

There is still a small theoretical chance that a signal will interrupt
the system-level C<write()> operation before completion. There is also
a possibility that some STDIO implementations may call multiple system
level C<write()>s even if the buffer was empty to start. There may be
some systems where this probability is reduced to zero, and this is
not a concern when using C<:perlio> instead of your system's STDIO.

=head2 How do I randomly update a binary file?
X<file, binary patch>

If you're just trying to patch a binary, in many cases something as
simple as this works:

    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed-size records, then you might do something more
like this:

    my $RECSIZE = 220; # size of record, in bytes
    my $recno   = 37;  # which record to update
    open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!";
    seek $fh, $recno * $RECSIZE, 0;
    # parenthesize read() so that == compares its return value
    read($fh, my $record, $RECSIZE) == $RECSIZE
        or die "can't read record $recno: $!";
    # munge the record
    seek $fh, -$RECSIZE, 1;
    print $fh $record;
    close $fh;

Locking and error checking are left as an exercise for the reader.
Don't forget them or you'll be quite sorry.

=head2 How do I get a file's timestamp in perl?
X<timestamp> X<file, timestamp>

If you want to retrieve the time at which the file was last read,
written, or had its meta-data (owner, etc) changed, you use the B<-A>,
B<-M>, or B<-C> file test operations as documented in L<perlfunc>.
These retrieve the age of the file (measured against the start-time of
your program) in days as a floating point number. Some platforms may
not have all of these times. See L<perlport> for details.
To retrieve
the "raw" time in seconds since the epoch, you would call the stat
function, then use C<localtime()>, C<gmtime()>, or
C<POSIX::strftime()> to convert this into human-readable form.

Here's an example:

    my $write_secs = (stat($file))[9];
    printf "file %s updated at %s\n", $file,
        scalar localtime($write_secs);

If you prefer something more legible, use the File::stat module
(part of the standard distribution in version 5.004 and later):

    # error checking left as an exercise for reader.
    use File::stat;
    use Time::localtime;
    my $date_string = ctime(stat($file)->mtime);
    print "file $file updated at $date_string\n";

The POSIX::strftime() approach has the benefit of being,
in theory, independent of the current locale. See L<perllocale>
for details.

=head2 How do I set a file's timestamp in perl?
X<timestamp> X<file, timestamp>

You use the utime() function documented in L<perlfunc/utime>.
By way of example, here's a little program that copies the
read and write times from its first argument to all the rest
of them.

    if (@ARGV < 2) {
        die "usage: cptimes timestamp_file other_files ...\n";
    }
    my $timestamp = shift;
    my($atime, $mtime) = (stat($timestamp))[8,9];
    utime $atime, $mtime, @ARGV;

Error checking is, as usual, left as an exercise for the reader.

The perldoc for utime also has an example that has the same
effect as touch(1) on files that I<already exist>.

Certain file systems have a limited ability to store the times
on a file at the expected level of precision. For example, the
FAT and HPFS filesystems are unable to create dates on files with
a finer granularity than two seconds. This is a limitation of
the filesystems, not of utime().

=head2 How do I print to more than one file at once?
X<print, to multiple files>

To connect one filehandle to several output filehandles,
you can use the L<IO::Tee> or L<Tie::FileHandle::Multiplex> modules.

If you only have to do this once, you can print individually
to each filehandle.

    for my $fh ($fh1, $fh2, $fh3) { print $fh "whatever\n" }

=head2 How can I read in an entire file all at once?
X<slurp> X<file, slurping>

The customary Perl approach for processing all the lines in a file is to
do so one line at a time:

    open my $input, '<', $file or die "can't open $file: $!";
    while (<$input>) {
        chomp;
        # do something with $_
    }
    close $input or die "can't close $file: $!";

This is tremendously more efficient than reading the entire file into
memory as an array of lines and then processing it one element at a time,
which is often--if not almost always--the wrong approach. Whenever
you see someone do this:

    my @lines = <INPUT>;

You should think long and hard about why you need everything loaded at
once. It's just not a scalable solution.

If you "mmap" the file with the File::Map module from
CPAN, you can virtually load the entire file into a
string without actually storing it in memory:

    use File::Map qw(map_file);

    map_file my $string, $filename;

Once mapped, you can treat C<$string> as you would any other string.
Since you don't necessarily have to load the data, mmap-ing can be
very fast and may not increase your memory footprint.

You might also find it more
fun to use the standard L<Tie::File> module, or the L<DB_File> module's
C<$DB_RECNO> bindings, which allow you to tie an array to a file so that
accessing an element of the array actually accesses the corresponding
line in the file.
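
As a rough illustration of the tied-array approach, here is a minimal
sketch using the core L<Tie::File> module. The file name F<example.txt>
and its three sample lines are assumptions made up for this example; they
are not part of the original answer.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical file name, used only for illustration.
my $file = 'example.txt';

# Create a small sample file so the sketch is self-contained.
open my $out, '>', $file or die "can't write $file: $!";
print {$out} "first\nsecond\nthird\n";
close $out or die "can't close $file: $!";

# Tie the array to the file; each element is one line, without its newline.
tie my @lines, 'Tie::File', $file or die "can't tie $file: $!";

my $count  = scalar @lines;   # number of lines in the file
my $second = $lines[1];       # fetches the second line from the file
$lines[1]  = uc $second;      # assigning rewrites that line in the file

untie @lines;                 # release the file
```

Because the array is tied, the assignment changes the file on disk, so
treat such an array as the file itself rather than as a private copy.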

If you want to load the entire file, you can use the L<File::Slurp>
module to do it in one simple and efficient step:

    use File::Slurp;

    my $all_of_it = read_file($filename); # entire file in scalar
    my @all_lines = read_file($filename); # one line per element

Or you can read the entire file contents into a scalar like this:

    my $var;
    {
        local $/;
        open my $fh, '<', $file or die "can't open $file: $!";
        $var = <$fh>;
    }

That temporarily undefs your record separator, and will automatically
close the file at block exit. If the file is already open, just use this:

    my $var = do { local $/; <$fh> };

You can also use a localized C<@ARGV> to eliminate the C<open>:

    my $var = do { local( @ARGV, $/ ) = $file; <> };

For ordinary files you can also use the C<read> function.

    read( $fh, $var, -s $fh );

That third argument tests the byte size of the data on the C<$fh> filehandle
and reads that many bytes into the buffer C<$var>.

=head2 How can I read in a file by paragraphs?
X<file, reading by paragraphs>

Use the C<$/> variable (see L<perlvar> for details). You can either
set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.

Note that a blank line must have no blanks in it. Thus
S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.

=head2 How can I read a single character from a file? From the keyboard?
X<getc> X<file, reading one character at a time>

You can use the builtin C<getc()> function for most filehandles, but
it won't (easily) work on a terminal device. For STDIN, either use
the Term::ReadKey module from CPAN or use the sample code in
L<perlfunc/getc>.
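
For an ordinary file, reading one character at a time with C<getc> might
look like the following minimal sketch. The file name F<sample.txt> and
its contents are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical file, created here so the sketch is self-contained.
my $file = 'sample.txt';
open my $out, '>', $file or die "can't write $file: $!";
print {$out} "hi\n";
close $out or die "can't close $file: $!";

open my $in, '<', $file or die "can't read $file: $!";

# getc returns the next character, or undef at end of file.
my @chars;
while ( defined( my $char = getc($in) ) ) {
    push @chars, $char;
}
close $in or die "can't close $file: $!";

printf "read %d characters\n", scalar @chars;
```

The C<defined> test matters: a character such as C<"0"> is false in
boolean context, so testing the return value directly would end the loop
too early.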

If your system supports the portable operating system programming
interface (POSIX), you can use the following code, which you'll note
turns off echo processing as well.

    #!/usr/bin/perl -w
    use strict;
    $| = 1;
    for (1..4) {
        print "gimme: ";
        my $got = getone();
        print "--> $got\n";
    }
    exit;

    BEGIN {
        use POSIX qw(:termios_h);

        my ($term, $oterm, $echo, $noecho, $fd_stdin);

        $fd_stdin = fileno(STDIN);

        $term = POSIX::Termios->new();
        $term->getattr($fd_stdin);
        $oterm = $term->getlflag();

        $echo   = ECHO | ECHOK | ICANON;
        $noecho = $oterm & ~$echo;

        sub cbreak {
            $term->setlflag($noecho);
            $term->setcc(VTIME, 1);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub cooked {
            $term->setlflag($oterm);
            $term->setcc(VTIME, 0);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub getone {
            my $key = '';
            cbreak();
            sysread(STDIN, $key, 1);
            cooked();
            return $key;
        }
    }

    END { cooked() }

The Term::ReadKey module from CPAN may be easier to use. Recent versions
also include support for non-portable systems.

    use Term::ReadKey;
    open my $tty, '<', '/dev/tty';
    print "Gimme a char: ";
    ReadMode "raw";
    my $key = ReadKey 0, $tty;
    ReadMode "normal";
    printf "\nYou said %s, char number %03d\n",
        $key, ord $key;

=head2 How can I tell whether there's a character waiting on a filehandle?

The very first thing you should do is look into getting the Term::ReadKey
extension from CPAN. As we mentioned earlier, it now even has limited
support for non-portable (read: not open systems, closed, proprietary,
not POSIX, not Unix, etc.) systems.

You should also check out the Frequently Asked Questions list in
comp.unix.* for things like this: the answer is essentially the same.
It's very system-dependent. Here's one solution that works on BSD
systems:

    sub key_ready {
        my($rin, $nfd);
        vec($rin, fileno(STDIN), 1) = 1;
        return $nfd = select($rin,undef,undef,0);
    }

If you want to find out how many characters are waiting, there's
also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
comes with Perl tries to convert C include files to Perl code, which
can be C<require>d. FIONREAD ends up defined as a function in the
I<sys/ioctl.ph> file:

    require 'sys/ioctl.ph';

    $size = pack("L", 0);
    ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

If I<h2ph> wasn't installed or doesn't work for you, you can
I<grep> the include files by hand:

    % grep FIONREAD /usr/include/*/*
    /usr/include/asm/ioctls.h:#define FIONREAD 0x541B

Or write a small C program using the editor of champions:

    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.

    $FIONREAD = 0x4004667f; # XXX: opsys dependent

    $size = pack("L", 0);
    ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

FIONREAD requires a filehandle connected to a stream, meaning that sockets,
pipes, and tty devices work, but I<not> files.

=head2 How do I do a C<tail -f> in perl?
X<tail> X<IO::Handle> X<File::Tail> X<clearerr>

First try

    seek($gw_fh, 0, 1);

The statement C<seek($gw_fh, 0, 1)> doesn't change the current position,
but it does clear the end-of-file condition on the handle, so that the
next C<< <$gw_fh> >> makes Perl try again to read something.
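
One way to put that C<seek> call to work is a polling loop, sketched
below. The subroutine name C<follow>, its callback argument, and the
C<$max_polls> limit are hypothetical, introduced only so that the example
terminates; a real C<tail -f> would simply loop forever.

```perl
use strict;
use warnings;

# Hypothetical helper: read new lines from $fh as they appear, hand each
# one to $callback, and give up after $max_polls consecutive empty polls.
sub follow {
    my ( $fh, $callback, $max_polls ) = @_;

    my $idle = 0;
    while ( $idle < $max_polls ) {
        my $saw_line = 0;
        while ( defined( my $line = <$fh> ) ) {
            $callback->($line);
            $saw_line = 1;
        }
        $idle = $saw_line ? 0 : $idle + 1;

        sleep 1;            # wait for the writer to append more
        seek( $fh, 0, 1 );  # same position, but clears the EOF condition
    }
    return;
}
```

With an already opened handle, a call such as
C<follow( $gw_fh, sub { print $_[0] }, 60 )> would print new lines until a
minute passes with nothing appended. Note that this sketch may hand the
callback a partial line if the writer has not yet finished it.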

If that doesn't work (it relies on features of your stdio implementation),
then you need something more like this:

    for (;;) {
        for ($curpos = tell($gw_fh); <$gw_fh>; $curpos = tell($gw_fh)) {
            # search for some stuff and put it into files
        }
        # sleep for a while
        seek($gw_fh, $curpos, 0);  # seek to where we had been
    }

If this still doesn't work, look into the C<clearerr> method
from L<IO::Handle>, which resets the error and end-of-file states
on the handle.

There's also a L<File::Tail> module from CPAN.

=head2 How do I dup() a filehandle in Perl?
X<dup>

If you check L<perlfunc/open>, you'll see that several of the ways
to call open() should do the trick. For example:

    open my $log, '>>', '/foo/logfile';
    open STDERR, '>&', $log;

Or even with a literal numeric descriptor:

    my $fd = $ENV{MHCONTEXTFD};
    open $mhcontext, "<&=$fd";  # like fdopen(3S)

Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
an alias. That means if you close an aliased handle, all
aliases become inaccessible. This is not true with
a copied one.

Error checking, as always, has been left as an exercise for the reader.

=head2 How do I close a file descriptor by number?
X<file, closing file descriptors> X<POSIX> X<close>

If, for some reason, you have a file descriptor instead of a
filehandle (perhaps you used C<POSIX::open>), you can use the
C<close()> function from the L<POSIX> module:

    use POSIX ();

    POSIX::close( $fd );

This should rarely be necessary, as the Perl C<close()> function is to be
used for things that Perl opened itself, even if it was a dup of a
numeric descriptor as with C<MHCONTEXT> above.
But if you really have
to, you may be able to do this:

    require 'sys/syscall.ph';
    my $rc = syscall(SYS_close(), $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;  # syscall returns -1 on failure

Or, just use the fdopen(3S) feature of C<open()>:

    {
        open my $fh, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
        close $fh;
    }

=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
X<filename, DOS issues>

Whoops! You just put a tab and a formfeed into that filename!
Remember that within double quoted strings ("like\this"), the
backslash is an escape character. The full list of these is in
L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
have a file called "c:(tab)emp(formfeed)oo" or
"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.

Either single-quote your strings, or (preferably) use forward slashes.
Since all DOS and Windows versions since something like MS-DOS 2.0 or so
have treated C</> and C<\> the same in a path, you might as well use the
one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
awk, Tcl, Java, or Python, just to mention a few. POSIX paths
are more portable, too.

=head2 Why doesn't glob("*.*") get all the files?
X<glob>

Because even on non-Unix ports, Perl's glob function follows standard
Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
files. This makes glob() portable even to legacy systems. Your
port may include proprietary globbing functions as well. Check its
documentation for details.

=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?

This is elaborately and painstakingly described in the
F<file-dir-perms> article in the "Far More Than You Ever Wanted To
Know" collection in L<http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> .

The executive summary: learn how your filesystem works. The
permissions on a file say what can happen to the data in that file.
The permissions on a directory say what can happen to the list of
files in that directory. If you delete a file, you're removing its
name from the directory (so the operation depends on the permissions
of the directory, not of the file). If you try to write to the file,
the permissions of the file govern whether you're allowed to.

=head2 How do I select a random line from a file?
X<file, selecting a random line>

Short of loading the file into a database or pre-indexing the lines in
the file, there are a couple of things that you can do.

Here's a reservoir-sampling algorithm from the Camel Book:

    srand;
    rand($.) < 1 && ($line = $_) while <>;

This has a significant advantage in space over reading the whole file
in. You can find a proof of this method in I<The Art of Computer
Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.

You can use the L<File::Random> module which provides a function
for that algorithm:

    use File::Random qw/random_line/;
    my $line = random_line($filename);

Another way is to use the L<Tie::File> module, which treats the entire
file as an array. Simply access a random array element.

=head2 Why do I get weird spaces when I print an array of lines?

(contributed by brian d foy)

If you are seeing spaces between the elements of your array when
you print the array, you are probably interpolating the array in
double quotes:

    my @animals = qw(camel llama alpaca vicuna);
    print "animals are: @animals\n";

It's the double quotes, not the C<print>, doing this. Whenever you
interpolate an array in a double quote context, Perl joins the
elements with spaces (or whatever is in C<$">, which is a space by
default):

    animals are: camel llama alpaca vicuna

This is different from printing the array without the interpolation:

    my @animals = qw(camel llama alpaca vicuna);
    print "animals are: ", @animals, "\n";

Now the output doesn't have the spaces between the elements because
the elements of C<@animals> simply become part of the list to
C<print>:

    animals are: camelllamaalpacavicuna

You might notice this when each of the elements of C<@array> ends with
a newline. You expect to print one element per line, but notice that
every line after the first is indented:

    this is a line
     this is another line
     this is the third line

That extra space comes from the interpolation of the array. If you
don't want to put anything between your array elements, don't use the
array in double quotes. You can send it to print without them:

    print @lines;

=head2 How do I traverse a directory tree?

(contributed by brian d foy)

The L<File::Find> module, which comes with Perl, does all of the hard
work to traverse a directory structure. You simply
call the C<find> subroutine with a callback subroutine and the
directories you want to traverse:

    use File::Find;

    find( \&wanted, @directories );

    sub wanted {
        # full path in $File::Find::name
        # just filename in $_
        ... do whatever you want to do ...
    }

The L<File::Find::Closures> module, which you can download from CPAN,
provides many ready-to-use subroutines that you can use with L<File::Find>.

The L<File::Finder> module, which you can download from CPAN, can help you
create the callback subroutine using something closer to the syntax of
the C<find> command-line utility:

    use File::Find;
    use File::Finder;

    my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}');

    find( $deep_dirs->as_options, @places );

The L<File::Find::Rule> module, which you can download from CPAN, has
a similar interface, but does the traversal for you too:

    use File::Find::Rule;

    my @files = File::Find::Rule->file()
                                ->name( '*.pm' )
                                ->in( @INC );

=head2 How do I delete a directory tree?

(contributed by brian d foy)

If you have an empty directory, you can use Perl's built-in C<rmdir>.
If the directory is not empty (that is, it contains files or
subdirectories), you either have to empty it yourself (a lot of work)
or use a module to help you.

The L<File::Path> module, which comes with Perl, has a C<remove_tree>
subroutine which can take care of all of the hard work for you:

    use File::Path qw(remove_tree);

    remove_tree( @directories );

The L<File::Path> module also has a legacy interface to the older
C<rmtree> subroutine.

=head2 How do I copy an entire directory?

(contributed by Shlomi Fish)

To do the equivalent of C<cp -R> (i.e. copy an entire directory tree
recursively) in portable Perl, you'll either need to write something yourself
or find a good CPAN module such as L<File::Copy::Recursive>.

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public
domain. You are permitted and encouraged to use this code and any
derivatives thereof in your own programs for fun or for profit as you
see fit. A simple comment in the code giving credit to the FAQ would
be courteous but is not required.