=head1 NAME

perlfaq5 - Files and Formats

=head1 VERSION

version 5.20210520

=head1 DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.

=head2 How do I flush/unbuffer an output filehandle? Why must I do this?
X<flush> X<buffer> X<unbuffer> X<autoflush>

(contributed by brian d foy)

You might like to read Mark Jason Dominus's "Suffering From Buffering"
at L<http://perl.plover.com/FAQs/Buffering.html>.

Perl normally buffers output so it doesn't make a system call for every
bit of output. By saving up output, it makes fewer expensive system calls.
For instance, in this little bit of code, you want to print a dot to the
screen for every line you process to watch the progress of your program.
Instead of seeing a dot for every line, Perl buffers the output and you
have a long wait before you see a row of 50 dots all at once:

    # long wait, then row of dots all at once
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

To get around this, you have to unbuffer the output filehandle, in this
case, C<STDOUT>. You can set the special variable C<$|> to a true value
(mnemonic: making your filehandles "piping hot"):

    $|++;

    # dot shown immediately
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

C<$|> is one of the per-filehandle special variables, so each
filehandle has its own copy of its value. If you want to merge
standard output and standard error for instance, you have to unbuffer
each (although STDERR might be unbuffered by default):

    {
        my $previous_default = select(STDOUT);  # save previous default
        $|++;                                   # autoflush STDOUT
        select(STDERR);
        $|++;                                   # autoflush STDERR, to be sure
        select($previous_default);              # restore previous default
    }

    # now should alternate . and +
    while( 1 ) {
        sleep 1;
        print STDOUT ".";
        print STDERR "+";
        print STDOUT "\n" unless ++$count % 25;
    }

Besides the C<$|> special variable, you can use C<binmode> to give
your filehandle a C<:unix> layer, which is unbuffered:

    binmode( STDOUT, ":unix" );

    while( 1 ) {
        sleep 1;
        print ".";
        print "\n" unless ++$count % 50;
    }

For more information on output layers, see the entries for C<binmode>
and C<open> in L<perlfunc>, and the L<PerlIO> module documentation.

If you are using L<IO::Handle> or one of its subclasses, you can
call the C<autoflush> method to change the settings of the
filehandle:

    use IO::Handle;
    open my( $io_fh ), ">", "output.txt"
        or die "Can't open output.txt: $!";
    $io_fh->autoflush(1);

The L<IO::Handle> objects also have a C<flush> method. You can flush
the buffer any time you want without turning on autoflush:

    $io_fh->flush;

=head2 How do I change, delete, or insert a line in a file, or append to the beginning of a file?
X<file, editing>

(contributed by brian d foy)

The basic idea of inserting, changing, or deleting a line from a text
file involves reading and printing the file to the point you want to
make the change, making the change, then reading and printing the rest
of the file. Perl doesn't provide random access to lines (especially
since the record input separator, C<$/>, is mutable), although modules
such as L<Tie::File> can fake it.

A Perl program to do these tasks takes the basic form of opening a
file, printing its lines, then closing the file:

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    while( <$in> ) {
        print $out $_;
    }

    close $out;

Within that basic form, add the parts that you need to insert, change,
or delete lines.
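
The basic form leaves its result in F<$file.new>, so a complete edit
usually ends by moving that file over the original. The following is a
minimal sketch of the whole cycle, not a definitive recipe. The scratch
directory, the F<demo.txt> name, and its contents are assumptions made
only so that the example runs on its own:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical demo file inside a scratch directory (an assumption
# for illustration; substitute your own path).
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/demo.txt";

open my $seed, '>', $file or die "Can't create $file: $!";
print $seed "first\nsecond\n";
close $seed or die "Can't close $file: $!";

# The basic form: read the old file, write the new one.
open my $in,  '<', $file       or die "Can't read old file: $!";
open my $out, '>', "$file.new" or die "Can't write new file: $!";

while( <$in> ) {
    print $out $_;    # insert, change, or skip lines here
}

close $in;
close $out or die "Can't finish new file: $!";

# Replace the original only after the new file is safely closed.
rename "$file.new", $file or die "Can't replace $file: $!";
```

Checking the return value of C<close> on the output handle matters in
this sketch because buffered write errors (a full disk, for example)
may only show up at that point, and you don't want to C<rename> a
truncated file over a good one.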

To prepend lines to the beginning, print those lines before you enter
the loop that prints the existing lines.

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC

    while( <$in> ) {
        print $out $_;
    }

    close $out;

To change existing lines, insert the code to modify the lines inside
the C<while> loop. In this case, the code finds all lowercased
versions of "perl" and capitalizes them. This happens for every line, so
be sure that you're supposed to do that on every line!

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n";

    while( <$in> ) {
        s/\b(perl)\b/Perl/g;
        print $out $_;
    }

    close $out;

To change only a particular line, the input line number, C<$.>, is
useful. First read and print the lines up to the one you want to
change. Next, read the single line you want to change, change it, and
print it. After that, read the rest of the lines and print those:

    while( <$in> ) { # print the lines before the change
        print $out $_;
        last if $. == 4; # line number before change
    }

    my $line = <$in>;
    $line =~ s/\b(perl)\b/Perl/g;
    print $out $line;

    while( <$in> ) { # print the rest of the lines
        print $out $_;
    }

To skip lines, use the looping controls. The C<next> in this example
skips comment lines, and the C<last> stops all processing once it
encounters either C<__END__> or C<__DATA__>.

    while( <$in> ) {
        next if /^\s*#/;             # skip comment lines
        last if /^__(END|DATA)__$/;  # stop at end of code marker
        print $out $_;
    }

Do the same sort of thing to delete a particular line by using C<next>
to skip the lines you don't want to show up in the output. This
example skips every fifth line:

    while( <$in> ) {
        next unless $. % 5;
        print $out $_;
    }

If, for some odd reason, you really want to see the whole file at once
rather than processing line-by-line, you can slurp it in (as long as
you can fit the whole thing in memory!):

    open my $in,  '<',  $file      or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    my $content = do { local $/; <$in> }; # slurp!

        # do your magic here

    print $out $content;

Modules such as L<Path::Tiny> and L<Tie::File> can help with that
too. If you can, however, avoid reading the entire file at once. Perl
typically won't give that memory back to the operating system until the
process finishes.

You can also use Perl one-liners to modify a file in-place. The
following changes all 'Fred' to 'Barney' in F<inFile.txt>, overwriting
the file with the new contents. With the C<-p> switch, Perl wraps a
C<while> loop around the code you specify with C<-e>, and C<-i> turns
on in-place editing. The current line is in C<$_>. With C<-p>, Perl
automatically prints the value of C<$_> at the end of the loop. See
L<perlrun> for more details.

    perl -pi -e 's/Fred/Barney/' inFile.txt

To make a backup of C<inFile.txt>, give C<-i> a file extension to add:

    perl -pi.bak -e 's/Fred/Barney/' inFile.txt

To change only the fifth line, you can add a test checking C<$.>, the
input line number, then only perform the operation when the test
passes:

    perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt

To add lines before a certain line, you can add a line (or lines!)
before Perl prints C<$_>:

    perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt

You can even add a line to the beginning of a file, since the current
line prints at the end of the loop:

    perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt

To insert a line after one already in the file, use the C<-n> switch.
It's just like C<-p> except that it doesn't print C<$_> at the end of
the loop, so you have to do that yourself. In this case, print C<$_>
first, then print the line that you want to add.

    perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt

To delete lines, only print the ones that you want. This example keeps
only the lines that contain the letter C<d> and deletes the rest:

    perl -ni -e 'print if /d/' inFile.txt

=head2 How do I count the number of lines in a file?
X<file, counting lines> X<lines> X<line>

(contributed by brian d foy)

Conceptually, the easiest way to count the lines in a file is to
simply read them and count them:

    my $count = 0;
    while( <$fh> ) { $count++; }

You don't really have to count them yourself, though, since Perl
already does that with the C<$.> variable, which is the current line
number from the last filehandle read:

    1 while( <$fh> );
    my $count = $.;

If you want to use C<$.>, you can reduce it to a simple one-liner,
like one of these:

    % perl -lne '} print $.; {' file

    % perl -lne 'END { print $. }' file

Those can be rather inefficient though. If they aren't fast enough for
you, you might just read chunks of data and count the number of
newlines:

    my $lines = 0;
    my $buffer;
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ tr/\n// );
    }
    close $fh;

However, that doesn't work if the line ending isn't a newline.
You might change that C<tr///> to a C<s///> so you can count the number of
times the input record separator, C<$/>, shows up. Note that a
multi-character separator that happens to straddle two chunks is missed
by this simple version:

    my $lines = 0;
    my $buffer;
    my $separator = quotemeta $/;   # match the separator literally
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ s/$separator//g );
    }
    close $fh;

If you don't mind shelling out, the C<wc> command is usually the
fastest, even with the extra interprocess overhead. Ensure that you
have an untainted filename though:

    #!perl -T

    $ENV{PATH} = undef;

    my $lines;
    if( $filename =~ /^([0-9a-z_.]+)\z/ ) {
        $lines = `/usr/bin/wc -l $1`;   # count, followed by the filename
        chomp $lines;
    }

=head2 How do I delete the last N lines from a file?
X<lines> X<file>

(contributed by brian d foy)

The easiest conceptual solution is to count the lines in the
file then start at the beginning and print the number of lines
(minus the last N) to a new file.

Most often, the real question is how you can delete the last N lines
without making more than one pass over the file, or how to do it
without a lot of copying. The easy concept is the hard reality when
you might have millions of lines in your file.

One trick is to use L<File::ReadBackwards>, which starts at the end of
the file. That module provides an object that wraps the real filehandle
to make it easy for you to move around the file. Once you get to the
spot you need, you can get the actual filehandle and work with it as
normal.
In this case, you get the file position at the end of the last
line you want to keep and truncate the file to that point:

    use File::ReadBackwards;

    my $filename = 'test.txt';
    my $Lines_to_truncate = 2;

    my $bw = File::ReadBackwards->new( $filename )
        or die "Could not read backwards in [$filename]: $!";

    my $lines_from_end = 0;
    until( $bw->eof or $lines_from_end == $Lines_to_truncate ) {
        print "Got: ", $bw->readline;
        $lines_from_end++;
    }

    truncate( $filename, $bw->tell );

The L<File::ReadBackwards> module also has the advantage of letting you
set the input record separator to a regular expression.

You can also use the L<Tie::File> module which lets you access
the lines through a tied array. You can use normal array operations
to modify your file, including setting the last index and using
C<splice>.

=head2 How can I use Perl's C<-i> option from within a program?
X<-i> X<in-place>

C<-i> sets the value of Perl's C<$^I> variable, which in turn affects
the behavior of C<< <> >>; see L<perlrun> for more details. By
modifying the appropriate variables directly, you can get the same
behavior within a larger program. For example:

    # ...
    {
        local($^I, @ARGV) = ('.orig', glob("*.c"));
        while (<>) {
            if ($. == 1) {
                print "This line should appear at the top of each file\n";
            }
            s/\b(p)earl\b/${1}erl/i;        # Correct typos, preserving case
            print;
            close ARGV if eof;              # Reset $.
        }
    }
    # $^I and @ARGV return to their old values here

This block modifies all the C<.c> files in the current directory,
leaving a backup of the original data from each file in a new
C<.c.orig> file.

=head2 How can I copy a file?
X<copy> X<file, copy> X<File::Copy>

(contributed by brian d foy)

Use the L<File::Copy> module.
It comes with Perl and can do a true copy across file systems, and it
does its magic in a portable fashion.

    use File::Copy;

    copy( $original, $new_copy ) or die "Copy failed: $!";

If you can't use L<File::Copy>, you'll have to do the work yourself:
open the original file, open the destination file, then print
to the destination file as you read the original. You also have to
remember to copy the permissions, owner, and group to the new file.

=head2 How do I make a temporary file name?
X<file, temporary>

If you don't need to know the name of the file, you can use C<open()>
with C<undef> in place of the file name. In Perl 5.8 or later, the
C<open()> function creates an anonymous temporary file:

    open my $tmp, '+>', undef or die $!;

Otherwise, you can use the File::Temp module.

    use File::Temp qw/ tempfile tempdir /;

    my $dir = tempdir( CLEANUP => 1 );
    ($fh, $filename) = tempfile( DIR => $dir );

    # or if you don't need to know the filename

    my $fh = tempfile( DIR => $dir );

File::Temp has been a standard module since Perl 5.6.1. If you
don't have a modern enough Perl installed, use the C<new_tmpfile>
class method from the IO::File module to get a filehandle opened for
reading and writing. Use it if you don't need to know the file's name:

    use IO::File;
    my $fh = IO::File->new_tmpfile()
        or die "Unable to make new temporary file: $!";

If you're committed to creating a temporary file by hand, use the
process ID and/or the current time-value.
If you need to have many temporary files in one process, use a counter:

    BEGIN {
        use Fcntl;
        use File::Spec;
        my $temp_dir  = File::Spec->tmpdir();
        my $file_base = sprintf "%d-%d-0000", $$, time;
        my $base_name = File::Spec->catfile($temp_dir, $file_base);

        sub temp_file {
            my $fh;
            my $count = 0;
            until( defined(fileno($fh)) || $count++ > 100 ) {
                $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
                # O_EXCL is required for security reasons.
                sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT;
            }

            if( defined fileno($fh) ) {
                return ($fh, $base_name);
            }
            else {
                return ();
            }
        }
    }

=head2 How can I manipulate fixed-record-length files?
X<fixed-length> X<file, fixed-length records>

The most efficient way is using L<pack()|perlfunc/"pack"> and
L<unpack()|perlfunc/"unpack">. This is faster than using
L<substr()|perlfunc/"substr"> when taking many, many strings. It is
slower for just a few.

Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
Berkeley-style ps:

    # sample input line:
    #   15158 p5  T      0:00 perl /home/tchrist/scripts/now-what
    my $PS_T = 'A6 A4 A7 A5 A*';
    open my $ps, '-|', 'ps' or die "Can't run ps: $!";
    print scalar <$ps>;
    my @fields = qw( pid tt stat time command );
    while (<$ps>) {
        my %process;
        @process{@fields} = unpack($PS_T, $_);
        for my $field ( @fields ) {
            print "$field: <$process{$field}>\n";
        }
        print 'line=', pack($PS_T, @process{@fields} ), "\n";
    }

We've used a hash slice in order to easily handle the fields of each row.
Storing the keys in an array makes it easy to operate on them as a
group or loop over them with C<for>. It also avoids polluting the program
with global variables and using symbolic references.

=head2 How can I make a filehandle local to a subroutine?
How do I pass filehandles between subroutines? How do I make an array of filehandles?
X<filehandle, local> X<filehandle, passing> X<filehandle, reference>

As of perl5.6, open() autovivifies file and directory handles
as references if you pass it an uninitialized scalar variable.
You can then pass these references just like any other scalar,
and use them in the place of named handles.

    open my    $fh, '>', $file_name or die "Can't open $file_name: $!";

    open local $fh, '>', $file_name or die "Can't open $file_name: $!";

    print $fh "Hello World!\n";

    process_file( $fh );

If you like, you can store these filehandles in an array or a hash.
If you access them directly, they aren't simple scalars and you
need to give C<print> a little help by placing the filehandle
reference in braces. Perl can only figure it out on its own when
the filehandle reference is a simple scalar.

    my @fhs = ( $fh1, $fh2, $fh3 );

    for( my $i = 0; $i <= $#fhs; $i++ ) {
        print {$fhs[$i]} "just another Perl answer, \n";
    }

Before perl5.6, you had to deal with various typeglob idioms
which you may see in older code.

    open FILE, "> $filename";
    process_typeglob(   *FILE );
    process_reference( \*FILE );

    sub process_typeglob  { local *FH = shift; print FH  "Typeglob!" }
    sub process_reference { local $fh = shift; print $fh "Reference!" }

If you want to create many anonymous handles, you should
check out the Symbol or IO::Handle modules.

=head2 How can I use a filehandle indirectly?
X<filehandle, indirect>

An indirect filehandle is the use of something other than a symbol
in a place that a filehandle is expected.
Here are ways to get indirect filehandles:

    $fh =   SOME_FH;       # bareword is strict-subs hostile
    $fh =  "SOME_FH";      # strict-refs hostile; same package only
    $fh =  *SOME_FH;       # typeglob
    $fh = \*SOME_FH;       # ref to typeglob (bless-able)
    $fh =  *SOME_FH{IO};   # blessed IO::Handle from *SOME_FH typeglob

Or, you can use the C<new> method from one of the IO::* modules to
create an anonymous filehandle and store that in a scalar variable.

    use IO::Handle;                     # 5.004 or higher
    my $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
the C<< <FH> >> diamond operator will accept either a named filehandle
or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    my $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write
the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles.
(They might also work with strings under some circumstances, but this
is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable
before using it. That is because only simple scalar variables, not
expressions or subscripts of hashes or arrays, can be used with
built-ins like C<print>, C<printf>, or the diamond operator.
Using something other than a simple scalar variable as a filehandle is
illegal and won't even compile:

    my @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";                           # WRONG
    my $got = <$fd[0]>;                                 # WRONG
    print $fd[2] "What was that: $got";                 # WRONG

With C<print> and C<printf>, you get around this by using a block and
an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more
complicated code there. This sends the message out to one of two places:

    my $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ]  } "cat stat $ok\n";

This approach of treating C<print> and C<printf> like object method
calls doesn't work for the diamond operator. That's because it's a
real operator, not just a function with a comma-less argument. Assuming
you've been storing typeglobs in your structure as we did above, you
can use the built-in function named C<readline> to read a record just
as C<< <> >> does. Given the initialization shown above for @fd, this
would work, but only because readline() requires a typeglob. It doesn't
work with objects or strings, which might be a bug we haven't fixed yet.

    $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not
related to whether they're strings, typeglobs, objects, or anything else.
It's the syntax of the fundamental operators. Playing the object
game doesn't help you at all here.

=head2 How can I open a filehandle to a string?
X<string> X<open> X<IO::String> X<filehandle>

(contributed by Peter J.
Holzer, hjp-usenet2@hjp.at)

Since Perl 5.8.0 a file handle referring to a string can be created by
calling open with a reference to that string instead of the filename.
This file handle can then be used to read from or write to the string:

    open(my $fh, '>', \$string) or die "Could not open string for writing";
    print $fh "foo\n";
    print $fh "bar\n";    # $string now contains "foo\nbar\n"

    open(my $in_fh, '<', \$string) or die "Could not open string for reading";
    my $x = <$in_fh>;    # $x now contains "foo\n"

With older versions of Perl, the L<IO::String> module provides similar
functionality.

=head2 How can I set up a footer format to be used with write()?
X<footer>

There's no builtin way to do this, but L<perlform> has a couple of
techniques to make it possible for the intrepid hacker.

=head2 How can I write() into a string?
X<write, into a string>

(contributed by brian d foy)

If you want to C<write> into a string, you just have to C<open> a
filehandle to a string, which Perl has been able to do since Perl 5.8:

    open FH, '>', \my $string;
    write( FH );

Since you want to be a good programmer, you probably want to use a lexical
filehandle, even though formats are designed to work with bareword filehandles
since the default format names take the filehandle name. However, you can
control this with some Perl special per-filehandle variables: C<$^>, which
names the top-of-page format, and C<$~>, which names the line format. You have
to change the default filehandle to set these variables:

    open my($fh), '>', \my $string;

    {   # set per-filehandle variables
        my $old_fh = select( $fh );
        $~ = 'ANIMAL';
        $^ = 'ANIMAL_TOP';
        select( $old_fh );
    }

    format ANIMAL_TOP =
     ID  Type    Name
    .

    format ANIMAL =
    @##   @<<<    @<<<<<<<<<<<<<<
    $id,  $type,  $name
    .

Although write can work with lexical or package variables, whatever variables
you use have to be in scope for the format. That most likely means you'll want
to localize some package variables:

    {
        local( $id, $type, $name ) = qw( 12 cat Buster );
        write( $fh );
    }

    print $string;

There are also some tricks that you can play with C<formline> and the
accumulator variable C<$^A>, but you lose a lot of the value of formats
since C<formline> won't handle paging and so on. You end up reimplementing
formats when you use them.

=head2 How can I output my numbers with commas added?
X<number, commify>

(contributed by brian d foy and Benjamin Goldberg)

You can use L<Number::Format> to separate places in a number.
It handles locale information for those of you who want to insert
full stops instead (or anything else that they want to use,
really).

This subroutine will add commas to your number:

    sub commify {
        local $_  = shift;
        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
        return $_;
    }

This regex from Benjamin Goldberg will add commas to numbers:

    s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;

It is easier to see with comments:

    s/(
        ^[-+]?             # beginning of number.
        \d+?               # first digits before first comma
        (?=                # followed by, (but not included in the match) :
            (?>(?:\d{3})+) # some positive multiple of three digits.
            (?!\d)         # an *exact* multiple, not x * 3 + 1 or whatever.
        )
        |                  # or:
        \G\d{3}            # after the last group, get three digits
        (?=\d)             # but they have to have more digits after them.
    )/$1,/xg;

=head2 How can I translate tildes (~) in a filename?
X<tilde> X<tilde expansion>

Use the E<lt>E<gt> (C<glob()>) operator, documented in L<perlfunc>.
Versions of Perl older than 5.6 require that you have a shell
installed that groks tildes. Later versions of Perl have this feature
built in.
The L<File::KGlob> module (available from CPAN) gives more
portable glob functionality.

Within Perl, you may use this directly:

    $filename =~ s{
      ^ ~             # find a leading tilde
      (               # save this in $1
          [^/]        # a non-slash character
                *     # repeated 0 or more times (0 means me)
      )
    }{
      $1
          ? (getpwnam($1))[7]
          : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

=head2 How come when I open a file read-write it wipes it out?
X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating>

Because you're using something like this, which truncates the file
I<then> gives you read-write access:

    open my $fh, '+>', '/path/name'; # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file
doesn't exist:

    open my $fh, '+<', '/path/name'; # open for update

Using ">" always clobbers or creates. Using "<" never does
either. The "+" doesn't change this.

Here are examples of many kinds of file opens.
Those using C<sysopen>
all assume that you've pulled in the constants from L<Fcntl>:

    use Fcntl;

To open file for reading:

    open my $fh, '<', $path                               or die $!;
    sysopen my $fh, $path, O_RDONLY                       or die $!;

To open file for writing, create new file if needed or else truncate old file:

    open my $fh, '>', $path                               or die $!;
    sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT       or die $!;
    sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!;

To open file for writing, create new file, file must not exist:

    sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT        or die $!;
    sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666  or die $!;

To open file for appending, create if necessary:

    open my $fh, '>>', $path                              or die $!;
    sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT      or die $!;
    sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!;

To open file for appending, file must exist:

    sysopen my $fh, $path, O_WRONLY|O_APPEND              or die $!;

To open file for update, file must exist:

    open my $fh, '+<', $path                              or die $!;
    sysopen my $fh, $path, O_RDWR                         or die $!;

To open file for update, create file if necessary:

    sysopen my $fh, $path, O_RDWR|O_CREAT                 or die $!;
    sysopen my $fh, $path, O_RDWR|O_CREAT, 0666           or die $!;

To open file for update, file must not exist:

    sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT          or die $!;
    sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666    or die $!;

To open a file without blocking, creating if necessary:

    sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT
        or die "can't open /foo/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to
be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't as exclusive as you might wish.

See also L<perlopentut>.
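
To see the difference that C<< +< >> makes, here is a small sketch of
an in-place update. It is an illustration under stated assumptions
rather than a recommended pattern: the scratch file comes from
L<File::Temp>, and its contents are invented for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; the name and contents are
# assumptions made for this example only.
my ($tmp, $path) = tempfile( UNLINK => 1 );
print $tmp "old contents\n";
close $tmp or die "Can't close $path: $!";

# '+<' opens for update: the existing data is still there to read.
open my $fh, '+<', $path or die "Can't open $path for update: $!";
my $before = <$fh>;

# Rewind and overwrite the first three bytes in place.
seek $fh, 0, 0 or die "Can't rewind $path: $!";
print $fh "new";
close $fh or die "Can't close $path: $!";

open my $check, '<', $path or die "Can't reopen $path: $!";
my $after = <$check>;
close $check;

print "before: $before";
print "after:  $after";
```

Had the update been opened with C<< +> >> instead, the file would have
been truncated before the first read and C<$before> would be undefined.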

=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
X<argument list too long>

The C<< <> >> operator performs a globbing operation (see above).
In Perl versions earlier than v5.6.0, the internal glob() operator forks
csh(1) to do the actual glob expansion, but
csh can't handle more than 127 items and so gives the error message
C<Argument list too long>. People who installed tcsh as csh won't
have this problem, but their users may be surprised by it.

To get around this, either upgrade to Perl v5.6.0 or later, do the glob
yourself with readdir() and patterns, or use a module like L<File::Glob>,
one that doesn't use the shell to do globbing.

=head2 How can I open a file named with a leading ">" or trailing blanks?
X<filename, special characters>

(contributed by Brian McCauley)

The special two-argument form of Perl's open() function ignores
trailing blanks in filenames and infers the mode from certain leading
characters (or a trailing "|"). In older versions of Perl this was the
only version of open() and so it is prevalent in old code and books.

Unless you have a particular reason to use the two-argument form you
should use the three-argument form of open() which does not treat any
characters in the filename as special.

    open my $fh, "<", "  file  ";  # filename is "   file   "
    open my $fh, ">", ">file";     # filename is ">file"

=head2 How can I reliably rename a file?
X<rename> X<mv> X<move> X<file, rename>

If your operating system supports a proper mv(1) utility or its
functional equivalent, this works:

    rename($old, $new) or system("mv", $old, $new);

It may be more portable to use the L<File::Copy> module instead.
You just copy the file to the new name (checking return
values), then delete the old one.
This isn't really the same
semantically as a C<rename()>, which preserves meta-information like
permissions, timestamps, inode info, etc.

=head2 How can I lock a file?
X<lock> X<file, lock> X<flock>

Perl's builtin flock() function (see L<perlfunc> for details) will call
flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and
later), and lockf(3) if neither of the two previous system calls exists.
On some systems, it may even use a different form of native locking.
Here are some gotchas with Perl's flock():

=over 4

=item 1

Produces a fatal error if none of the three system calls (or their
close equivalent) exists.

=item 2

lockf(3) does not provide shared locking, and requires that the
filehandle be open for writing (or appending, or read/writing).

=item 3

Some versions of flock() can't lock files over a network (e.g. on NFS file
systems), so you'd need to force the use of fcntl(2) when you build Perl.
But even this is dubious at best. See the flock entry of L<perlfunc>
and the F<INSTALL> file in the source distribution for information on
building Perl to do this.

Two potentially non-obvious but traditional flock semantics are that
it waits indefinitely until the lock is granted, and that its locks are
I<merely advisory>. Such discretionary locks are more flexible, but
offer fewer guarantees. This means that files locked with flock() may
be modified by programs that do not also use flock(). Cars that stop
for red lights get on well with each other, but not with cars that don't
stop for red lights. See the perlport manpage, your port's specific
documentation, or your system-specific local manpages for details. It's
best to assume traditional behavior if you're writing portable programs.
(If you're not, you should as always feel perfectly free to write
for your own system's idiosyncrasies (sometimes called "features").
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done.)

For more information on file locking, see also
L<perlopentut/"File Locking"> if you have it (new for 5.6).

=back

=head2 Why can't I just open(FH, "E<gt>file.lock")?
X<lock, lockfile race condition>

A common bit of code B<NOT TO USE> is this:

    sleep(3) while -e 'file.lock';    # PLEASE DO NOT USE
    open my $lock, '>', 'file.lock';  # THIS BROKEN CODE

This is a classic race condition: you take two steps to do something
which must be done in one. That's why computer hardware provides an
atomic test-and-set instruction. In theory, this "ought" to work:

    sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT
        or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also less than desirable.

=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
X<counter> X<file, counter>

Didn't anyone ever tell you web-page hit counters were useless?
They don't count number of hits, they're a waste of time, and they serve
only to stroke the writer's vanity. It's better to pick a random number;
they're more realistic.

Anyway, this is what you can do if you can't help yourself.

    use Fcntl qw(:DEFAULT :flock);
    sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!";
    flock $fh, LOCK_EX                        or die "can't flock numfile: $!";
    my $num = <$fh> || 0;
    seek $fh, 0, 0                            or die "can't rewind numfile: $!";
    truncate $fh, 0                           or die "can't truncate numfile: $!";
    (print $fh $num+1, "\n")                  or die "can't write numfile: $!";
    close $fh                                 or die "can't close numfile: $!";

Here's a much better web-page hit counter:

    $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?
X<append> X<file, append>

If you are on a system that correctly implements C<flock> and you use
the example appending code from "perldoc -f flock" everything will be
OK even if the OS you are on doesn't implement append mode correctly
(if such a system exists). So if you are happy to restrict yourself to
OSs that implement C<flock> (and that's not really much of a
restriction) then that is what you should do.

If you know you are only going to use a system that does correctly
implement appending (i.e. not Win32) then you can omit the C<seek>
from the code in the previous answer.

If you know you are only writing code to run on an OS and filesystem
that does implement append mode correctly (a local filesystem on a
modern Unix for example), and you keep the file in block-buffered mode
and you write less than one buffer-full of output between each manual
flushing of the buffer then each bufferload is almost guaranteed to be
written to the end of the file in one chunk without getting
intermingled with anyone else's output. You can also use the
C<syswrite> function which is simply a wrapper around your system's
C<write(2)> system call.
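
As an illustration of the locked-append approach described in this
answer, here is a minimal sketch modeled on the example in
C<perldoc -f flock>. The helper name C<append_line> is hypothetical, and
the sketch assumes a system where C<flock> works on the file in question:

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# append_line() is a hypothetical helper name, not part of Perl.
sub append_line {
    my ($path, $text) = @_;

    open my $fh, '>>', $path or die "can't open $path: $!";
    flock $fh, LOCK_EX       or die "can't flock $path: $!";

    # Someone else may have appended while we waited for the lock,
    # so move to the current end of the file before writing.
    seek $fh, 0, SEEK_END    or die "can't seek $path: $!";

    print {$fh} $text        or die "can't write $path: $!";

    # close() flushes the buffer and releases the lock.
    close $fh                or die "can't close $path: $!";
    return 1;
}
```

Because the locks are advisory, this only helps if every program that
writes to the file takes the same lock before it appends.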

There is still a small theoretical chance that a signal will interrupt
the system-level C<write()> operation before completion. There is also
a possibility that some STDIO implementations may call multiple system
level C<write()>s even if the buffer was empty to start. There may be
some systems where this probability is reduced to zero, and this is
not a concern when using C<:perlio> instead of your system's STDIO.

=head2 How do I randomly update a binary file?
X<file, binary patch>

If you're just trying to patch a binary, in many cases something as
simple as this works:

    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed sized records, then you might do something more
like this:

    my $RECSIZE = 220; # size of record, in bytes
    my $recno   = 37;  # which record to update
    my $record;
    open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!";
    seek $fh, $recno * $RECSIZE, 0;
    read($fh, $record, $RECSIZE) == $RECSIZE
        or die "can't read record $recno: $!";
    # munge the record
    seek $fh, -$RECSIZE, 1;
    print $fh $record;
    close $fh;

Locking and error checking are left as an exercise for the reader.
Don't forget them or you'll be quite sorry.

=head2 How do I get a file's timestamp in perl?
X<timestamp> X<file, timestamp>

If you want to retrieve the time at which the file was last read,
written, or had its meta-data (owner, etc) changed, you use the B<-A>,
B<-M>, or B<-C> file test operations as documented in L<perlfunc>.
These retrieve the age of the file (measured against the start-time of
your program) in days as a floating point number. Some platforms may
not have all of these times. See L<perlport> for details.
To retrieve
the "raw" time in seconds since the epoch, you would call the stat
function, then use C<localtime()>, C<gmtime()>, or
C<POSIX::strftime()> to convert this into human-readable form.

Here's an example:

    my $write_secs = (stat($file))[9];
    printf "file %s updated at %s\n", $file,
        scalar localtime($write_secs);

If you prefer something more legible, use the File::stat module
(part of the standard distribution in version 5.004 and later):

    # error checking left as an exercise for reader.
    use File::stat;
    use Time::localtime;
    my $date_string = ctime(stat($file)->mtime);
    print "file $file updated at $date_string\n";

The POSIX::strftime() approach has the benefit of being,
in theory, independent of the current locale. See L<perllocale>
for details.

=head2 How do I set a file's timestamp in perl?
X<timestamp> X<file, timestamp>

You use the utime() function documented in L<perlfunc/utime>.
By way of example, here's a little program that copies the
read and write times from its first argument to all the rest
of them.

    if (@ARGV < 2) {
        die "usage: cptimes timestamp_file other_files ...\n";
    }
    my $timestamp = shift;
    my($atime, $mtime) = (stat($timestamp))[8,9];
    utime $atime, $mtime, @ARGV;

Error checking is, as usual, left as an exercise for the reader.

The perldoc for utime also has an example that has the same
effect as touch(1) on files that I<already exist>.

Certain file systems have a limited ability to store the times
on a file at the expected level of precision. For example, the
FAT and HPFS filesystems are unable to create dates on files with
a finer granularity than two seconds. This is a limitation of
the filesystems, not of utime().

=head2 How do I print to more than one file at once?
X<print, to multiple files>

To connect one filehandle to several output filehandles,
you can use the L<IO::Tee> or L<Tie::FileHandle::Multiplex> modules.

If you only have to do this once, you can print individually
to each filehandle.

    for my $fh ($fh1, $fh2, $fh3) { print $fh "whatever\n" }

=head2 How can I read in an entire file all at once?
X<slurp> X<file, slurping>

The customary Perl approach for processing all the lines in a file is to
do so one line at a time:

    open my $input, '<', $file or die "can't open $file: $!";
    while (<$input>) {
        chomp;
        # do something with $_
    }
    close $input or die "can't close $file: $!";

This is tremendously more efficient than reading the entire file into
memory as an array of lines and then processing it one element at a time,
which is often--if not almost always--the wrong approach. Whenever
you see someone do this:

    my @lines = <INPUT>;

you should think long and hard about why you need everything loaded at
once. It's just not a scalable solution.

If you "mmap" the file with the File::Map module from
CPAN, you can virtually load the entire file into a
string without actually storing it in memory:

    use File::Map qw(map_file);

    map_file my $string, $filename;

Once mapped, you can treat C<$string> as you would any other string.
Since you don't necessarily have to load the data, mmap-ing can be
very fast and may not increase your memory footprint.

You might also find it more
fun to use the standard L<Tie::File> module, or the L<DB_File> module's
C<$DB_RECNO> bindings, which allow you to tie an array to a file so that
accessing an element of the array actually accesses the corresponding
line in the file.
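
For instance, a minimal sketch with L<Tie::File> might look like the
following. The helper name C<nth_line> is made up for illustration, and
the file is assumed to be writable, because L<Tie::File> opens files
read/write by default:

```perl
use strict;
use warnings;
use Tie::File;

# nth_line() is a hypothetical helper: fetch line $n (counting from 0)
# of the file at $path without reading the whole file into memory.
sub nth_line {
    my ($path, $n) = @_;

    tie my @lines, 'Tie::File', $path or die "can't tie $path: $!";

    # autochomp is on by default, so the record separator is removed.
    my $line = $lines[$n];

    untie @lines;
    return $line;
}
```

Asking for an index past the end of the file yields C<undef>, just as
it would for an ordinary array.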

If you want to load the entire file, you can use the L<Path::Tiny>
module to do it in one simple and efficient step:

    use Path::Tiny;

    my $all_of_it = path($filename)->slurp; # entire file in scalar
    my @all_lines = path($filename)->lines; # one line per element

Or you can read the entire file contents into a scalar like this:

    my $var;
    {
        local $/;
        open my $fh, '<', $file or die "can't open $file: $!";
        $var = <$fh>;
    }

That temporarily undefs your record separator, and will automatically
close the file at block exit. If the file is already open, just use this:

    my $var = do { local $/; <$fh> };

You can also use a localized C<@ARGV> to eliminate the C<open>:

    my $var = do { local( @ARGV, $/ ) = $file; <> };

=head2 How can I read in a file by paragraphs?
X<file, reading by paragraphs>

Use the C<$/> variable (see L<perlvar> for details). You can either
set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.

Note that a blank line must have no blanks in it. Thus
S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.

=head2 How can I read a single character from a file? From the keyboard?
X<getc> X<file, reading one character at a time>

You can use the builtin C<getc()> function for most filehandles, but
it won't (easily) work on a terminal device. For STDIN, either use
the Term::ReadKey module from CPAN or use the sample code in
L<perlfunc/getc>.

If your system supports the portable operating system programming
interface (POSIX), you can use the following code, which you'll note
turns off echo processing as well.

    #!/usr/bin/perl -w
    use strict;
    $| = 1;
    for (1..4) {
        print "gimme: ";
        my $got = getone();
        print "--> $got\n";
    }
    exit;

    BEGIN {
        use POSIX qw(:termios_h);

        my ($term, $oterm, $echo, $noecho, $fd_stdin);

        $fd_stdin = fileno(STDIN);

        $term     = POSIX::Termios->new();
        $term->getattr($fd_stdin);
        $oterm    = $term->getlflag();

        $echo     = ECHO | ECHOK | ICANON;
        $noecho   = $oterm & ~$echo;

        sub cbreak {
            $term->setlflag($noecho);
            $term->setcc(VTIME, 1);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub cooked {
            $term->setlflag($oterm);
            $term->setcc(VTIME, 0);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub getone {
            my $key = '';
            cbreak();
            sysread(STDIN, $key, 1);
            cooked();
            return $key;
        }
    }

    END { cooked() }

The Term::ReadKey module from CPAN may be easier to use. Recent versions
also include support for non-portable systems.

    use Term::ReadKey;
    open my $tty, '<', '/dev/tty';
    print "Gimme a char: ";
    ReadMode "raw";
    my $key = ReadKey 0, $tty;
    ReadMode "normal";
    printf "\nYou said %s, char number %03d\n",
        $key, ord $key;

=head2 How can I tell whether there's a character waiting on a filehandle?

The very first thing you should do is look into getting the Term::ReadKey
extension from CPAN. As we mentioned earlier, it now even has limited
support for non-portable (read: not open systems, closed, proprietary,
not POSIX, not Unix, etc.) systems.

You should also check out the Frequently Asked Questions list in
comp.unix.* for things like this: the answer is essentially the same.
It's very system-dependent.
Here's one solution that works on BSD
systems:

    sub key_ready {
        my($rin, $nfd);
        vec($rin, fileno(STDIN), 1) = 1;
        return $nfd = select($rin,undef,undef,0);
    }

If you want to find out how many characters are waiting, there's
also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
comes with Perl tries to convert C include files to Perl code, which
can be C<require>d. FIONREAD ends up defined as a function in the
I<sys/ioctl.ph> file:

    require './sys/ioctl.ph';

    $size = pack("L", 0);
    ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

If I<h2ph> wasn't installed or doesn't work for you, you can
I<grep> the include files by hand:

    % grep FIONREAD /usr/include/*/*
    /usr/include/asm/ioctls.h:#define FIONREAD      0x541B

Or write a small C program using the editor of champions:

    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.

    $FIONREAD = 0x4004667f;         # XXX: opsys dependent

    $size = pack("L", 0);
    ioctl(FH, $FIONREAD, $size)     or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

FIONREAD requires a filehandle connected to a stream, meaning that sockets,
pipes, and tty devices work, but I<not> files.

=head2 How do I do a C<tail -f> in perl?
X<tail> X<IO::Handle> X<File::Tail> X<clearerr>

First try

    seek($gw_fh, 0, 1);

The statement C<seek($gw_fh, 0, 1)> doesn't change the current position,
but it does clear the end-of-file condition on the handle, so that the
next C<< <$gw_fh> >> makes Perl try again to read something.
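
A minimal sketch of this first approach, wrapped in a helper, could look
like the following. The name C<read_new_lines> is hypothetical, and the
sketch assumes the handle was opened for reading on a regular file that
other processes append to:

```perl
use strict;
use warnings;

# read_new_lines() is a hypothetical helper: return whatever lines have
# been appended to the file since the last read on this handle.
sub read_new_lines {
    my ($fh) = @_;

    # Seeking zero bytes from the current position moves nothing,
    # but it clears the end-of-file condition on the handle.
    seek($fh, 0, 1);

    my @lines = <$fh>;    # read up to the (new) end of file
    return @lines;
}
```

A caller would typically loop, printing the result of
C<read_new_lines($fh)> and then sleeping for a moment before trying again.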

If that doesn't work (it relies on features of your stdio implementation),
then you need something more like this:

    for (;;) {
        for ($curpos = tell($gw_fh); <$gw_fh>; $curpos = tell($gw_fh)) {
            # search for some stuff and put it into files
        }
        # sleep for a while
        seek($gw_fh, $curpos, 0);  # seek to where we had been
    }

If this still doesn't work, look into the C<clearerr> method
from L<IO::Handle>, which resets the error and end-of-file states
on the handle.

There's also a L<File::Tail> module from CPAN.

=head2 How do I dup() a filehandle in Perl?
X<dup>

If you check L<perlfunc/open>, you'll see that several of the ways
to call open() should do the trick. For example:

    open my $log, '>>', '/foo/logfile';
    open STDERR, '>&', $log;

Or even with a literal numeric descriptor:

    my $fd = $ENV{MHCONTEXTFD};
    open $mhcontext, "<&=$fd";  # like fdopen(3S)

Note that "<&STDIN" makes a copy, but "<&=STDIN" makes
an alias. That means if you close an aliased handle, all
aliases become inaccessible. This is not true with
a copied one.

Error checking, as always, has been left as an exercise for the reader.

=head2 How do I close a file descriptor by number?
X<file, closing file descriptors> X<POSIX> X<close>

If, for some reason, you have a file descriptor instead of a
filehandle (perhaps you used C<POSIX::open>), you can use the
C<close()> function from the L<POSIX> module:

    use POSIX ();

    POSIX::close( $fd );

This should rarely be necessary, as the Perl C<close()> function is to be
used for things that Perl opened itself, even if it was a dup of a
numeric descriptor as with C<MHCONTEXT> above.
But if you really have
to, you may be able to do this:

    require './sys/syscall.ph';
    my $rc = syscall(SYS_close(), $fd + 0);  # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1;

Or, just use the fdopen(3S) feature of C<open()>:

    {
        open my $fh, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
        close $fh;
    }

=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?
X<filename, DOS issues>

Whoops! You just put a tab and a formfeed into that filename!
Remember that within double quoted strings ("like\this"), the
backslash is an escape character. The full list of these is in
L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
have a file called "c:(tab)emp(formfeed)oo" or
"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.

Either single-quote your strings, or (preferably) use forward slashes.
Since all DOS and Windows versions since something like MS-DOS 2.0 or so
have treated C</> and C<\> the same in a path, you might as well use the
one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++,
awk, Tcl, Java, or Python, just to mention a few. POSIX paths
are more portable, too.

=head2 Why doesn't glob("*.*") get all the files?
X<glob>

Because even on non-Unix ports, Perl's glob function follows standard
Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
files. This makes glob() portable even to legacy systems. Your
port may include proprietary globbing functions as well. Check its
documentation for details.

=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?

This is elaborately and painstakingly described in the
F<file-dir-perms> article in the "Far More Than You Ever Wanted To
Know" collection in L<http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> .

The executive summary: learn how your filesystem works. The
permissions on a file say what can happen to the data in that file.
The permissions on a directory say what can happen to the list of
files in that directory. If you delete a file, you're removing its
name from the directory (so the operation depends on the permissions
of the directory, not of the file). If you try to write to the file,
the permissions of the file govern whether you're allowed to.

=head2 How do I select a random line from a file?
X<file, selecting a random line>

Short of loading the file into a database or pre-indexing the lines in
the file, there are a couple of things that you can do.

Here's a reservoir-sampling algorithm from the Camel Book:

    srand;
    rand($.) < 1 && ($line = $_) while <>;

This has a significant advantage in space over reading the whole file
in. You can find a proof of this method in I<The Art of Computer
Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth.

You can use the L<File::Random> module which provides a function
for that algorithm:

    use File::Random qw/random_line/;
    my $line = random_line($filename);

Another way is to use the L<Tie::File> module, which treats the entire
file as an array. Simply access a random array element.

=head2 Why do I get weird spaces when I print an array of lines?

(contributed by brian d foy)

If you are seeing spaces between the elements of your array when
you print the array, you are probably interpolating the array in
double quotes:

    my @animals = qw(camel llama alpaca vicuna);
    print "animals are: @animals\n";

It's the double quotes, not the C<print>, doing this. Whenever you
interpolate an array in a double quote context, Perl joins the
elements with spaces (or whatever is in C<$">, which is a space by
default):

    animals are: camel llama alpaca vicuna

This is different than printing the array without the interpolation:

    my @animals = qw(camel llama alpaca vicuna);
    print "animals are: ", @animals, "\n";

Now the output doesn't have the spaces between the elements because
the elements of C<@animals> simply become part of the list to
C<print>:

    animals are: camelllamaalpacavicuna

You might notice this when each of the elements of your array ends with
a newline. You expect to print one element per line, but notice that
every line after the first is indented:

    this is a line
     this is another line
     this is the third line

That extra space comes from the interpolation of the array. If you
don't want to put anything between your array elements, don't use the
array in double quotes. You can send it to print without them:

    print @lines;

=head2 How do I traverse a directory tree?

(contributed by brian d foy)

The L<File::Find> module, which comes with Perl, does all of the hard
work to traverse a directory structure. You simply
call the C<find> subroutine with a callback subroutine and the
directories you want to traverse:

    use File::Find;

    find( \&wanted, @directories );

    sub wanted {
        # full path in $File::Find::name
        # just filename in $_
        ... do whatever you want to do ...
    }

The L<File::Find::Closures> module, which you can download from CPAN,
provides many ready-to-use subroutines that you can use with L<File::Find>.

The L<File::Finder> module, which you can download from CPAN, can help you
create the callback subroutine using something closer to the syntax of
the C<find> command-line utility:

    use File::Find;
    use File::Finder;

    my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}');

    find( $deep_dirs->as_options, @places );

The L<File::Find::Rule> module, which you can download from CPAN, has
a similar interface, but does the traversal for you too:

    use File::Find::Rule;

    my @files = File::Find::Rule->file()
                                ->name( '*.pm' )
                                ->in( @INC );

=head2 How do I delete a directory tree?

(contributed by brian d foy)

If you have an empty directory, you can use Perl's built-in C<rmdir>.
If the directory is not empty (so, with files or subdirectories), you
either have to empty it yourself (a lot of work) or use a module to
help you.

The L<File::Path> module, which comes with Perl, has a C<remove_tree>
function which can take care of all of the hard work for you:

    use File::Path qw(remove_tree);

    remove_tree( @directories );

The L<File::Path> module also has a legacy interface to the older
C<rmtree> subroutine.

=head2 How do I copy an entire directory?

(contributed by Shlomi Fish)

To do the equivalent of C<cp -R> (i.e. copy an entire directory tree
recursively) in portable Perl, you'll either need to write something yourself
or find a good CPAN module such as L<File::Copy::Recursive>.

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public
domain. You are permitted and encouraged to use this code and any
derivatives thereof in your own programs for fun or for profit as you
see fit. A simple comment in the code giving credit to the FAQ would
be courteous but is not required.
