1=head1 NAME 2 3perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores) 4 5=head1 DESCRIPTION 6 7The basic IPC facilities of Perl are built out of the good old Unix 8signals, named pipes, pipe opens, the Berkeley socket routines, and SysV 9IPC calls. Each is used in slightly different situations. 10 11=head1 Signals 12 13Perl uses a simple signal handling model: the %SIG hash contains names 14or references of user-installed signal handlers. These handlers will 15be called with an argument which is the name of the signal that 16triggered it. A signal may be generated intentionally from a 17particular keyboard sequence like control-C or control-Z, sent to you 18from another process, or triggered automatically by the kernel when 19special events transpire, like a child process exiting, your own process 20running out of stack space, or hitting a process file-size limit. 21 22For example, to trap an interrupt signal, set up a handler like this: 23 24 our $shucks; 25 26 sub catch_zap { 27 my $signame = shift; 28 $shucks++; 29 die "Somebody sent me a SIG$signame"; 30 } 31 $SIG{INT} = __PACKAGE__ . "::catch_zap"; 32 $SIG{INT} = \&catch_zap; # best strategy 33 34Prior to Perl 5.8.0 it was necessary to do as little as you possibly 35could in your handler; notice how all we do is set a global variable 36and then raise an exception. That's because on most systems, 37libraries are not re-entrant; particularly, memory allocation and I/O 38routines are not. That meant that doing nearly I<anything> in your 39handler could in theory trigger a memory fault and subsequent core 40dump - see L</Deferred Signals (Safe Signals)> below. 41 42The names of the signals are the ones listed out by C<kill -l> on your 43system, or you can retrieve them using the CPAN module L<IPC::Signal>. 44 45You may also choose to assign the strings C<"IGNORE"> or C<"DEFAULT"> as 46the handler, in which case Perl will try to discard the signal or do the 47default thing. 48 49On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal 50has special behavior with respect to a value of C<"IGNORE">. 51Setting C<$SIG{CHLD}> to C<"IGNORE"> on such a platform has the effect of 52not creating zombie processes when the parent process fails to C<wait()> 53on its child processes (i.e., child processes are automatically reaped). 54Calling C<wait()> with C<$SIG{CHLD}> set to C<"IGNORE"> usually returns 55C<-1> on such platforms. 56 57Some signals can be neither trapped nor ignored, such as the KILL and STOP 58(but not the TSTP) signals. Note that ignoring signals makes them disappear. 59If you only want them blocked temporarily without them getting lost you'll 60have to use the C<POSIX> module's L<sigprocmask|POSIX/sigprocmask>. 61 62Sending a signal to a negative process ID means that you send the signal 63to the entire Unix process group. This code sends a hang-up signal to all 64processes in the current process group, and also sets $SIG{HUP} to C<"IGNORE"> 65so it doesn't kill itself: 66 67 # block scope for local 68 { 69 local $SIG{HUP} = "IGNORE"; 70 kill HUP => -getpgrp(); 71 # snazzy writing of: kill("HUP", -getpgrp()) 72 } 73 74Another interesting signal to send is signal number zero. This doesn't 75actually affect a child process, but instead checks whether it's alive 76or has changed its UIDs. 77 78 unless (kill 0 => $kid_pid) { 79 warn "something wicked happened to $kid_pid"; 80 } 81 82Signal number zero may fail because you lack permission to send the 83signal when directed at a process whose real or saved UID is not 84identical to the real or effective UID of the sending process, even 85though the process is alive. You may be able to determine the cause of 86failure using C<$!> or C<%!>. 87 88 unless (kill(0 => $pid) || $!{EPERM}) { 89 warn "$pid looks dead"; 90 } 91 92You might also want to employ anonymous functions for simple signal 93handlers: 94 95 $SIG{INT} = sub { die "\nOutta here!\n" }; 96 97SIGCHLD handlers require some special care. If a second child dies 98while in the signal handler caused by the first death, we won't get 99another signal. So must loop here else we will leave the unreaped child 100as a zombie. And the next time two children die we get another zombie. 101And so on. 102 103 use POSIX ":sys_wait_h"; 104 $SIG{CHLD} = sub { 105 while ((my $child = waitpid(-1, WNOHANG)) > 0) { 106 $Kid_Status{$child} = $?; 107 } 108 }; 109 # do something that forks... 110 111Be careful: qx(), system(), and some modules for calling external commands 112do a fork(), then wait() for the result. Thus, your signal handler 113will be called. Because wait() was already called by system() or qx(), 114the wait() in the signal handler will see no more zombies and will 115therefore block. 116 117The best way to prevent this issue is to use waitpid(), as in the following 118example: 119 120 use POSIX ":sys_wait_h"; # for nonblocking read 121 122 my %children; 123 124 $SIG{CHLD} = sub { 125 # don't change $! and $? outside handler 126 local ($!, $?); 127 while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) { 128 delete $children{$pid}; 129 cleanup_child($pid, $?); 130 } 131 }; 132 133 while (1) { 134 my $pid = fork(); 135 die "cannot fork" unless defined $pid; 136 if ($pid == 0) { 137 # ... 138 exit 0; 139 } else { 140 $children{$pid}=1; 141 # ... 142 system($command); 143 # ... 144 } 145 } 146 147Signal handling is also used for timeouts in Unix. While safely 148protected within an C<eval{}> block, you set a signal handler to trap 149alarm signals and then schedule to have one delivered to you in some 150number of seconds. Then try your blocking operation, clearing the alarm 151when it's done but not before you've exited your C<eval{}> block. If it 152goes off, you'll use die() to jump out of the block. 153 154Here's an example: 155 156 my $ALARM_EXCEPTION = "alarm clock restart"; 157 eval { 158 local $SIG{ALRM} = sub { die $ALARM_EXCEPTION }; 159 alarm 10; 160 flock($fh, 2) # blocking write lock 161 || die "cannot flock: $!"; 162 alarm 0; 163 }; 164 if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die } 165 166If the operation being timed out is system() or qx(), this technique 167is liable to generate zombies. If this matters to you, you'll 168need to do your own fork() and exec(), and kill the errant child process. 169 170For more complex signal handling, you might see the standard POSIX 171module. Lamentably, this is almost entirely undocumented, but the 172F<ext/POSIX/t/sigaction.t> file from the Perl source distribution has 173some examples in it. 174 175=head2 Handling the SIGHUP Signal in Daemons 176 177A process that usually starts when the system boots and shuts down 178when the system is shut down is called a daemon (Disk And Execution 179MONitor). If a daemon process has a configuration file which is 180modified after the process has been started, there should be a way to 181tell that process to reread its configuration file without stopping 182the process. Many daemons provide this mechanism using a C<SIGHUP> 183signal handler. When you want to tell the daemon to reread the file, 184simply send it the C<SIGHUP> signal. 185 186The following example implements a simple daemon, which restarts 187itself every time the C<SIGHUP> signal is received. The actual code is 188located in the subroutine C<code()>, which just prints some debugging 189info to show that it works; it should be replaced with the real code. 190 191 #!/usr/bin/perl 192 193 use strict; 194 use warnings; 195 196 use POSIX (); 197 use FindBin (); 198 use File::Basename (); 199 use File::Spec::Functions qw(catfile); 200 201 $| = 1; 202 203 # make the daemon cross-platform, so exec always calls the script 204 # itself with the right path, no matter how the script was invoked. 205 my $script = File::Basename::basename($0); 206 my $SELF = catfile($FindBin::Bin, $script); 207 208 # POSIX unmasks the sigprocmask properly 209 $SIG{HUP} = sub { 210 print "got SIGHUP\n"; 211 exec($SELF, @ARGV) || die "$0: couldn't restart: $!"; 212 }; 213 214 code(); 215 216 sub code { 217 print "PID: $$\n"; 218 print "ARGV: @ARGV\n"; 219 my $count = 0; 220 while (1) { 221 sleep 2; 222 print ++$count, "\n"; 223 } 224 } 225 226 227=head2 Deferred Signals (Safe Signals) 228 229Before Perl 5.8.0, installing Perl code to deal with signals exposed you to 230danger from two things. First, few system library functions are 231re-entrant. If the signal interrupts while Perl is executing one function 232(like malloc(3) or printf(3)), and your signal handler then calls the same 233function again, you could get unpredictable behavior--often, a core dump. 234Second, Perl isn't itself re-entrant at the lowest levels. If the signal 235interrupts Perl while Perl is changing its own internal data structures, 236similarly unpredictable behavior may result. 237 238There were two things you could do, knowing this: be paranoid or be 239pragmatic. The paranoid approach was to do as little as possible in your 240signal handler. Set an existing integer variable that already has a 241value, and return. This doesn't help you if you're in a slow system call, 242which will just restart. That means you have to C<die> to longjmp(3) out 243of the handler. Even this is a little cavalier for the true paranoiac, 244who avoids C<die> in a handler because the system I<is> out to get you. 245The pragmatic approach was to say "I know the risks, but prefer the 246convenience", and to do anything you wanted in your signal handler, 247and be prepared to clean up core dumps now and again. 248 249Perl 5.8.0 and later avoid these problems by "deferring" signals. That is, 250when the signal is delivered to the process by the system (to the C code 251that implements Perl) a flag is set, and the handler returns immediately. 252Then at strategic "safe" points in the Perl interpreter (e.g. when it is 253about to execute a new opcode) the flags are checked and the Perl level 254handler from %SIG is executed. The "deferred" scheme allows much more 255flexibility in the coding of signal handlers as we know the Perl 256interpreter is in a safe state, and that we are not in a system library 257function when the handler is called. However the implementation does 258differ from previous Perls in the following ways: 259 260=over 4 261 262=item Long-running opcodes 263 264As the Perl interpreter looks at signal flags only when it is about 265to execute a new opcode, a signal that arrives during a long-running 266opcode (e.g. a regular expression operation on a very large string) will 267not be seen until the current opcode completes. 268 269If a signal of any given type fires multiple times during an opcode 270(such as from a fine-grained timer), the handler for that signal will 271be called only once, after the opcode completes; all other 272instances will be discarded. Furthermore, if your system's signal queue 273gets flooded to the point that there are signals that have been raised 274but not yet caught (and thus not deferred) at the time an opcode 275completes, those signals may well be caught and deferred during 276subsequent opcodes, with sometimes surprising results. For example, you 277may see alarms delivered even after calling C<alarm(0)> as the latter 278stops the raising of alarms but does not cancel the delivery of alarms 279raised but not yet caught. Do not depend on the behaviors described in 280this paragraph as they are side effects of the current implementation and 281may change in future versions of Perl. 282 283=item Interrupting IO 284 285When a signal is delivered (e.g., SIGINT from a control-C) the operating 286system breaks into IO operations like I<read>(2), which is used to 287implement Perl's readline() function, the C<< <> >> operator. On older 288Perls the handler was called immediately (and as C<read> is not "unsafe", 289this worked well). With the "deferred" scheme the handler is I<not> called 290immediately, and if Perl is using the system's C<stdio> library that 291library may restart the C<read> without returning to Perl to give it a 292chance to call the %SIG handler. If this happens on your system the 293solution is to use the C<:perlio> layer to do IO--at least on those handles 294that you want to be able to break into with signals. (The C<:perlio> layer 295checks the signal flags and calls %SIG handlers before resuming IO 296operation.) 297 298The default in Perl 5.8.0 and later is to automatically use 299the C<:perlio> layer. 300 301Note that it is not advisable to access a file handle within a signal 302handler where that signal has interrupted an I/O operation on that same 303handle. While perl will at least try hard not to crash, there are no 304guarantees of data integrity; for example, some data might get dropped or 305written twice. 306 307Some networking library functions like gethostbyname() are known to have 308their own implementations of timeouts which may conflict with your 309timeouts. If you have problems with such functions, try using the POSIX 310sigaction() function, which bypasses Perl safe signals. Be warned that 311this does subject you to possible memory corruption, as described above. 312 313Instead of setting C<$SIG{ALRM}>: 314 315 local $SIG{ALRM} = sub { die "alarm" }; 316 317try something like the following: 318 319 use POSIX qw(SIGALRM); 320 POSIX::sigaction(SIGALRM, 321 POSIX::SigAction->new(sub { die "alarm" })) 322 || die "Error setting SIGALRM handler: $!\n"; 323 324Another way to disable the safe signal behavior locally is to use 325the C<Perl::Unsafe::Signals> module from CPAN, which affects 326all signals. 327 328=item Restartable system calls 329 330On systems that supported it, older versions of Perl used the 331SA_RESTART flag when installing %SIG handlers. This meant that 332restartable system calls would continue rather than returning when 333a signal arrived. In order to deliver deferred signals promptly, 334Perl 5.8.0 and later do I<not> use SA_RESTART. Consequently, 335restartable system calls can fail (with $! set to C<EINTR>) in places 336where they previously would have succeeded. 337 338The default C<:perlio> layer retries C<read>, C<write> 339and C<close> as described above; interrupted C<wait> and 340C<waitpid> calls will always be retried. 341 342=item Signals as "faults" 343 344Certain signals like SEGV, ILL, and BUS are generated by virtual memory 345addressing errors and similar "faults". These are normally fatal: there is 346little a Perl-level handler can do with them. So Perl delivers them 347immediately rather than attempting to defer them. 348 349=item Signals triggered by operating system state 350 351On some operating systems certain signal handlers are supposed to "do 352something" before returning. One example can be CHLD or CLD, which 353indicates a child process has completed. On some operating systems the 354signal handler is expected to C<wait> for the completed child 355process. On such systems the deferred signal scheme will not work for 356those signals: it does not do the C<wait>. Again the failure will 357look like a loop as the operating system will reissue the signal because 358there are completed child processes that have not yet been C<wait>ed for. 359 360=back 361 362If you want the old signal behavior back despite possible 363memory corruption, set the environment variable C<PERL_SIGNALS> to 364C<"unsafe">. This feature first appeared in Perl 5.8.1. 365 366=head1 Named Pipes 367 368A named pipe (often referred to as a FIFO) is an old Unix IPC 369mechanism for processes communicating on the same machine. It works 370just like regular anonymous pipes, except that the 371processes rendezvous using a filename and need not be related. 372 373To create a named pipe, use the C<POSIX::mkfifo()> function. 374 375 use POSIX qw(mkfifo); 376 mkfifo($path, 0700) || die "mkfifo $path failed: $!"; 377 378You can also use the Unix command mknod(1), or on some 379systems, mkfifo(1). These may not be in your normal path, though. 380 381 # system return val is backwards, so && not || 382 # 383 $ENV{PATH} .= ":/etc:/usr/etc"; 384 if ( system("mknod", $path, "p") 385 && system("mkfifo", $path) ) 386 { 387 die "mk{nod,fifo} $path failed"; 388 } 389 390 391A fifo is convenient when you want to connect a process to an unrelated 392one. When you open a fifo, the program will block until there's something 393on the other end. 394 395For example, let's say you'd like to have your F<.signature> file be a 396named pipe that has a Perl program on the other end. Now every time any 397program (like a mailer, news reader, finger program, etc.) tries to read 398from that file, the reading program will read the new signature from your 399program. We'll use the pipe-checking file-test operator, B<-p>, to find 400out whether anyone (or anything) has accidentally removed our fifo. 401 402 chdir(); # go home 403 my $FIFO = ".signature"; 404 405 while (1) { 406 unless (-p $FIFO) { 407 unlink $FIFO; # discard any failure, will catch later 408 require POSIX; # delayed loading of heavy module 409 POSIX::mkfifo($FIFO, 0700) 410 || die "can't mkfifo $FIFO: $!"; 411 } 412 413 # next line blocks till there's a reader 414 open (my $fh, ">", $FIFO) || die "can't open $FIFO: $!"; 415 print $fh "John Smith (smith\@host.org)\n", `fortune -s`; 416 close($fh) || die "can't close $FIFO: $!"; 417 sleep 2; # to avoid dup signals 418 } 419 420=head1 Using open() for IPC 421 422Perl's basic open() statement can also be used for unidirectional 423interprocess communication by specifying the open mode as C<|-> or C<-|>. 424Here's how to start 425something up in a child process you intend to write to: 426 427 open(my $spooler, "|-", "cat -v | lpr -h 2>/dev/null") 428 || die "can't fork: $!"; 429 local $SIG{PIPE} = sub { die "spooler pipe broke" }; 430 print $spooler "stuff\n"; 431 close $spooler || die "bad spool: $! $?"; 432 433And here's how to start up a child process you intend to read from: 434 435 open(my $status, "-|", "netstat -an 2>&1") 436 || die "can't fork: $!"; 437 while (<$status>) { 438 next if /^(tcp|udp)/; 439 print; 440 } 441 close $status || die "bad netstat: $! $?"; 442 443Be aware that these operations are full Unix forks, which means they may 444not be correctly implemented on all alien systems. See L<perlport/open> 445for portability details. 446 447In the two-argument form of open(), a pipe open can be achieved by 448either appending or prepending a pipe symbol to the second argument: 449 450 open(my $spooler, "| cat -v | lpr -h 2>/dev/null") 451 || die "can't fork: $!"; 452 open(my $status, "netstat -an 2>&1 |") 453 || die "can't fork: $!"; 454 455This can be used even on systems that do not support forking, but this 456possibly allows code intended to read files to unexpectedly execute 457programs. If one can be sure that a particular program is a Perl script 458expecting filenames in @ARGV using the two-argument form of open() or the 459C<< <> >> operator, the clever programmer can write something like this: 460 461 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile 462 463and no matter which sort of shell it's called from, the Perl program will 464read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile> 465in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3> 466file. Pretty nifty, eh? 467 468You might notice that you could use backticks for much the 469same effect as opening a pipe for reading: 470 471 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`; 472 die "bad netstatus ($?)" if $?; 473 474While this is true on the surface, it's much more efficient to process the 475file one line or record at a time because then you don't have to read the 476whole thing into memory at once. It also gives you finer control of the 477whole process, letting you kill off the child process early if you'd like. 478 479Be careful to check the return values from both open() and close(). If 480you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise, 481think of what happens when you start up a pipe to a command that doesn't 482exist: the open() will in all likelihood succeed (it only reflects the 483fork()'s success), but then your output will fail--spectacularly. Perl 484can't know whether the command worked, because your command is actually 485running in a separate process whose exec() might have failed. Therefore, 486while readers of bogus commands return just a quick EOF, writers 487to bogus commands will get hit with a signal, which they'd best be prepared 488to handle. Consider: 489 490 open(my $fh, "|-", "bogus") || die "can't fork: $!"; 491 print $fh "bang\n"; # neither necessary nor sufficient 492 # to check print retval! 493 close($fh) || die "can't close: $!"; 494 495The reason for not checking the return value from print() is because of 496pipe buffering; physical writes are delayed. That won't blow up until the 497close, and it will blow up with a SIGPIPE. To catch it, you could use 498this: 499 500 $SIG{PIPE} = "IGNORE"; 501 open(my $fh, "|-", "bogus") || die "can't fork: $!"; 502 print $fh "bang\n"; 503 close($fh) || die "can't close: status=$?"; 504 505=head2 Filehandles 506 507Both the main process and any child processes it forks share the same 508STDIN, STDOUT, and STDERR filehandles. If both processes try to access 509them at once, strange things can happen. You may also want to close 510or reopen the filehandles for the child. You can get around this by 511opening your pipe with open(), but on some systems this means that the 512child process cannot outlive the parent. 513 514=head2 Background Processes 515 516You can run a command in the background with: 517 518 system("cmd &"); 519 520The command's STDOUT and STDERR (and possibly STDIN, depending on your 521shell) will be the same as the parent's. You won't need to catch 522SIGCHLD because of the double-fork taking place; see below for details. 523 524=head2 Complete Dissociation of Child from Parent 525 526In some cases (starting server processes, for instance) you'll want to 527completely dissociate the child process from the parent. This is 528often called daemonization. A well-behaved daemon will also chdir() 529to the root directory so it doesn't prevent unmounting the filesystem 530containing the directory from which it was launched, and redirect its 531standard file descriptors from and to F</dev/null> so that random 532output doesn't wind up on the user's terminal. 533 534 use POSIX "setsid"; 535 536 sub daemonize { 537 chdir("/") || die "can't chdir to /: $!"; 538 open(STDIN, "<", "/dev/null") || die "can't read /dev/null: $!"; 539 open(STDOUT, ">", "/dev/null") || die "can't write /dev/null: $!"; 540 defined(my $pid = fork()) || die "can't fork: $!"; 541 exit if $pid; # non-zero now means I am the parent 542 (setsid() != -1) || die "Can't start a new session: $!"; 543 open(STDERR, ">&", STDOUT) || die "can't dup stdout: $!"; 544 } 545 546The fork() has to come before the setsid() to ensure you aren't a 547process group leader; the setsid() will fail if you are. If your 548system doesn't have the setsid() function, open F</dev/tty> and use the 549C<TIOCNOTTY> ioctl() on it instead. See tty(4) for details. 550 551Non-Unix users should check their C<< I<Your_OS>::Process >> module for 552other possible solutions. 553 554=head2 Safe Pipe Opens 555 556Another interesting approach to IPC is making your single program go 557multiprocess and communicate between--or even amongst--yourselves. The 558two-argument form of the 559open() function will accept a file argument of either C<"-|"> or C<"|-"> 560to do a very interesting thing: it forks a child connected to the 561filehandle you've opened. The child is running the same program as the 562parent. This is useful for safely opening a file when running under an 563assumed UID or GID, for example. If you open a pipe I<to> minus, you can 564write to the filehandle you opened and your kid will find it in I<his> 565STDIN. If you open a pipe I<from> minus, you can read from the filehandle 566you opened whatever your kid writes to I<his> STDOUT. 567 568 my $PRECIOUS = "/path/to/some/safe/file"; 569 my $sleep_count; 570 my $pid; 571 my $kid_to_write; 572 573 do { 574 $pid = open($kid_to_write, "|-"); 575 unless (defined $pid) { 576 warn "cannot fork: $!"; 577 die "bailing out" if $sleep_count++ > 6; 578 sleep 10; 579 } 580 } until defined $pid; 581 582 if ($pid) { # I am the parent 583 print $kid_to_write @some_data; 584 close($kid_to_write) || warn "kid exited $?"; 585 } else { # I am the child 586 # drop permissions in setuid and/or setgid programs: 587 ($>, $)) = ($<, $(); 588 open (my $outfile, ">", $PRECIOUS) 589 || die "can't open $PRECIOUS: $!"; 590 while (<STDIN>) { 591 print $outfile; # child STDIN is parent $kid_to_write 592 } 593 close($outfile) || die "can't close $PRECIOUS: $!"; 594 exit(0); # don't forget this!! 595 } 596 597Another common use for this construct is when you need to execute 598something without the shell's interference. With system(), it's 599straightforward, but you can't use a pipe open or backticks safely. 600That's because there's no way to stop the shell from getting its hands on 601your arguments. Instead, use lower-level control to call exec() directly. 602 603Here's a safe backtick or pipe open for read: 604 605 my $pid = open(my $kid_to_read, "-|"); 606 defined($pid) || die "can't fork: $!"; 607 608 if ($pid) { # parent 609 while (<$kid_to_read>) { 610 # do something interesting 611 } 612 close($kid_to_read) || warn "kid exited $?"; 613 614 } else { # child 615 ($>, $)) = ($<, $(); # suid only 616 exec($program, @options, @args) 617 || die "can't exec program: $!"; 618 # NOTREACHED 619 } 620 621And here's a safe pipe open for writing: 622 623 my $pid = open(my $kid_to_write, "|-"); 624 defined($pid) || die "can't fork: $!"; 625 626 $SIG{PIPE} = sub { die "whoops, $program pipe broke" }; 627 628 if ($pid) { # parent 629 print $kid_to_write @data; 630 close($kid_to_write) || warn "kid exited $?"; 631 632 } else { # child 633 ($>, $)) = ($<, $(); 634 exec($program, @options, @args) 635 || die "can't exec program: $!"; 636 # NOTREACHED 637 } 638 639It is very easy to dead-lock a process using this form of open(), or 640indeed with any use of pipe() with multiple subprocesses. The 641example above is "safe" because it is simple and calls exec(). See 642L</"Avoiding Pipe Deadlocks"> for general safety principles, but there 643are extra gotchas with Safe Pipe Opens. 644 645In particular, if you opened the pipe using C<open $fh, "|-">, then you 646cannot simply use close() in the parent process to close an unwanted 647writer. Consider this code: 648 649 my $pid = open(my $writer, "|-"); # fork open a kid 650 defined($pid) || die "first fork failed: $!"; 651 if ($pid) { 652 if (my $sub_pid = fork()) { 653 defined($sub_pid) || die "second fork failed: $!"; 654 close($writer) || die "couldn't close writer: $!"; 655 # now do something else... 656 } 657 else { 658 # first write to $writer 659 # ... 660 # then when finished 661 close($writer) || die "couldn't close writer: $!"; 662 exit(0); 663 } 664 } 665 else { 666 # first do something with STDIN, then 667 exit(0); 668 } 669 670In the example above, the true parent does not want to write to the $writer 671filehandle, so it closes it. However, because $writer was opened using 672C<open $fh, "|-">, it has a special behavior: closing it calls 673waitpid() (see L<perlfunc/waitpid>), which waits for the subprocess 674to exit. If the child process ends up waiting for something happening 675in the section marked "do something else", you have deadlock. 676 677This can also be a problem with intermediate subprocesses in more 678complicated code, which will call waitpid() on all open filehandles 679during global destruction--in no predictable order. 680 681To solve this, you must manually use pipe(), fork(), and the form of 682open() which sets one file descriptor to another, as shown below: 683 684 pipe(my $reader, my $writer) || die "pipe failed: $!"; 685 my $pid = fork(); 686 defined($pid) || die "first fork failed: $!"; 687 if ($pid) { 688 close $reader; 689 if (my $sub_pid = fork()) { 690 defined($sub_pid) || die "first fork failed: $!"; 691 close($writer) || die "can't close writer: $!"; 692 } 693 else { 694 # write to $writer... 695 # ... 696 # then when finished 697 close($writer) || die "can't close writer: $!"; 698 exit(0); 699 } 700 # write to $writer... 701 } 702 else { 703 open(STDIN, "<&", $reader) || die "can't reopen STDIN: $!"; 704 close($writer) || die "can't close writer: $!"; 705 # do something... 706 exit(0); 707 } 708 709Since Perl 5.8.0, you can also use the list form of C<open> for pipes. 710This is preferred when you wish to avoid having the shell interpret 711metacharacters that may be in your command string. 712 713So for example, instead of using: 714 715 open(my $ps_pipe, "-|", "ps aux") || die "can't open ps pipe: $!"; 716 717One would use either of these: 718 719 open(my $ps_pipe, "-|", "ps", "aux") 720 || die "can't open ps pipe: $!"; 721 722 my @ps_args = qw[ ps aux ]; 723 open(my $ps_pipe, "-|", @ps_args) 724 || die "can't open @ps_args|: $!"; 725 726Because there are more than three arguments to open(), it forks the ps(1) 727command I<without> spawning a shell, and reads its standard output via the 728C<$ps_pipe> filehandle. The corresponding syntax to I<write> to command 729pipes is to use C<"|-"> in place of C<"-|">. 730 731This was admittedly a rather silly example, because you're using string 732literals whose content is perfectly safe. There is therefore no cause to 733resort to the harder-to-read, multi-argument form of pipe open(). However, 734whenever you cannot be assured that the program arguments are free of shell 735metacharacters, the fancier form of open() should be used. For example: 736 737 my @grep_args = ("egrep", "-i", $some_pattern, @many_files); 738 open(my $grep_pipe, "-|", @grep_args) 739 || die "can't open @grep_args|: $!"; 740 741Here the multi-argument form of pipe open() is preferred because the 742pattern and indeed even the filenames themselves might hold metacharacters. 743 744=head2 Avoiding Pipe Deadlocks 745 746Whenever you have more than one subprocess, you must be careful that each 747closes whichever half of any pipes created for interprocess communication 748it is not using. This is because any child process reading from the pipe 749and expecting an EOF will never receive it, and therefore never exit. A 750single process closing a pipe is not enough to close it; the last process 751with the pipe open must close it for it to read EOF. 752 753Certain built-in Unix features help prevent this most of the time. For 754instance, filehandles have a "close on exec" flag, which is set I<en masse> 755under control of the C<$^F> variable. This is so any filehandles you 756didn't explicitly route to the STDIN, STDOUT or STDERR of a child 757I<program> will be automatically closed. 758 759Always explicitly and immediately call close() on the writable end of any 760pipe, unless that process is actually writing to it. Even if you don't 761explicitly call close(), Perl will still close() all filehandles during 762global destruction. As previously discussed, if those filehandles have 763been opened with Safe Pipe Open, this will result in calling waitpid(), 764which may again deadlock. 765 766=head2 Bidirectional Communication with Another Process 767 768While this works reasonably well for unidirectional communication, what 769about bidirectional communication? The most obvious approach doesn't work: 770 771 # THIS DOES NOT WORK!! 772 open(my $prog_for_reading_and_writing, "| some program |") 773 774If you forget to C<use warnings>, you'll miss out entirely on the 775helpful diagnostic message: 776 777 Can't do bidirectional pipe at -e line 1. 778 779If you really want to, you can use the standard open2() from the 780L<IPC::Open2> module to catch both ends. There's also an open3() in 781L<IPC::Open3> for tridirectional I/O so you can also catch your child's 782STDERR, but doing so would then require an awkward select() loop and 783wouldn't allow you to use normal Perl input operations. 784 785If you look at its source, you'll see that open2() uses low-level 786primitives like the pipe() and exec() syscalls to create all the 787connections. Although it might have been more efficient by using 788socketpair(), this would have been even less portable than it already 789is. The open2() and open3() functions are unlikely to work anywhere 790except on a Unix system, or at least one purporting POSIX compliance. 791 792=for TODO 793Hold on, is this even true? First it says that socketpair() is avoided 794for portability, but then it says it probably won't work except on 795Unixy systems anyway. Which one of those is true? 796 797Here's an example of using open2(): 798 799 use IPC::Open2; 800 my $pid = open2(my $reader, my $writer, "cat -un"); 801 print $writer "stuff\n"; 802 my $got = <$reader>; 803 waitpid $pid, 0; 804 805The problem with this is that buffering is really going to ruin your 806day. Even though your C<$writer> filehandle is auto-flushed so the process 807on the other end gets your data in a timely manner, you can't usually do 808anything to force that process to give its data to you in a similarly quick 809fashion. In this special case, we could actually so, because we gave 810I<cat> a B<-u> flag to make it unbuffered. But very few commands are 811designed to operate over pipes, so this seldom works unless you yourself 812wrote the program on the other end of the double-ended pipe. 813 814A solution to this is to use a library which uses pseudottys to make your 815program behave more reasonably. This way you don't have to have control 816over the source code of the program you're using. The C<Expect> module 817from CPAN also addresses this kind of thing. This module requires two 818other modules from CPAN, C<IO::Pty> and C<IO::Stty>. It sets up a pseudo 819terminal to interact with programs that insist on talking to the terminal 820device driver. If your system is supported, this may be your best bet. 821 822=head2 Bidirectional Communication with Yourself 823 824If you want, you may make low-level pipe() and fork() syscalls to stitch 825this together by hand. This example only talks to itself, but you could 826reopen the appropriate handles to STDIN and STDOUT and call other processes. 827(The following example lacks proper error checking.) 828 829 #!/usr/bin/perl 830 # pipe1 - bidirectional communication using two pipe pairs 831 # designed for the socketpair-challenged 832 use strict; 833 use warnings; 834 use IO::Handle; # enable autoflush method before Perl 5.14 835 pipe(my $parent_rdr, my $child_wtr); # XXX: check failure? 836 pipe(my $child_rdr, my $parent_wtr); # XXX: check failure? 837 $child_wtr->autoflush(1); 838 $parent_wtr->autoflush(1); 839 840 if ($pid = fork()) { 841 close $parent_rdr; 842 close $parent_wtr; 843 print $child_wtr "Parent Pid $$ is sending this\n"; 844 chomp(my $line = <$child_rdr>); 845 print "Parent Pid $$ just read this: '$line'\n"; 846 close $child_rdr; close $child_wtr; 847 waitpid($pid, 0); 848 } else { 849 die "cannot fork: $!" unless defined $pid; 850 close $child_rdr; 851 close $child_wtr; 852 chomp(my $line = <$parent_rdr>); 853 print "Child Pid $$ just read this: '$line'\n"; 854 print $parent_wtr "Child Pid $$ is sending this\n"; 855 close $parent_rdr; 856 close $parent_wtr; 857 exit(0); 858 } 859 860But you don't actually have to make two pipe calls. If you 861have the socketpair() system call, it will do this all for you. 862 863 #!/usr/bin/perl 864 # pipe2 - bidirectional communication using socketpair 865 # "the best ones always go both ways" 866 867 use strict; 868 use warnings; 869 use Socket; 870 use IO::Handle; # enable autoflush method before Perl 5.14 871 872 # We say AF_UNIX because although *_LOCAL is the 873 # POSIX 1003.1g form of the constant, many machines 874 # still don't have it. 875 socketpair(my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC) 876 || die "socketpair: $!"; 877 878 $child->autoflush(1); 879 $parent->autoflush(1); 880 881 if ($pid = fork()) { 882 close $parent; 883 print $child "Parent Pid $$ is sending this\n"; 884 chomp(my $line = <$child>); 885 print "Parent Pid $$ just read this: '$line'\n"; 886 close $child; 887 waitpid($pid, 0); 888 } else { 889 die "cannot fork: $!" unless defined $pid; 890 close $child; 891 chomp(my $line = <$parent>); 892 print "Child Pid $$ just read this: '$line'\n"; 893 print $parent "Child Pid $$ is sending this\n"; 894 close $parent; 895 exit(0); 896 } 897 898=head1 Sockets: Client/Server Communication 899 900While not entirely limited to Unix-derived operating systems (e.g., WinSock 901on PCs provides socket support, as do some VMS libraries), you might not have 902sockets on your system, in which case this section probably isn't going to 903do you much good. With sockets, you can do both virtual circuits like TCP 904streams and datagrams like UDP packets. You may be able to do even more 905depending on your system. 906 907The Perl functions for dealing with sockets have the same names as 908the corresponding system calls in C, but their arguments tend to differ 909for two reasons. First, Perl filehandles work differently than C file 910descriptors. Second, Perl already knows the length of its strings, so you 911don't need to pass that information. 912 913One of the major problems with ancient, antemillennial socket code in Perl 914was that it used hard-coded values for some of the constants, which 915severely hurt portability. If you ever see code that does anything like 916explicitly setting C<$AF_INET = 2>, you know you're in for big trouble. 917An immeasurably superior approach is to use the L<Socket> module, which more 918reliably grants access to the various constants and functions you'll need. 919 920If you're not writing a server/client for an existing protocol like 921NNTP or SMTP, you should give some thought to how your server will 922know when the client has finished talking, and vice-versa. Most 923protocols are based on one-line messages and responses (so one party 924knows the other has finished when a "\n" is received) or multi-line 925messages and responses that end with a period on an empty line 926("\n.\n" terminates a message/response). 927 928=head2 Internet Line Terminators 929 930The Internet line terminator is "\015\012". Under ASCII variants of 931Unix, that could usually be written as "\r\n", but under other systems, 932"\r\n" might at times be "\015\015\012", "\012\012\015", or something 933completely different. The standards specify writing "\015\012" to be 934conformant (be strict in what you provide), but they also recommend 935accepting a lone "\012" on input (be lenient in what you require). 936We haven't always been very good about that in the code in this manpage, 937but unless you're on a Mac from way back in its pre-Unix dark ages, you'll 938probably be ok. 939 940=head2 Internet TCP Clients and Servers 941 942Use Internet-domain sockets when you want to do client-server 943communication that might extend to machines outside of your own system. 944 945Here's a sample TCP client using Internet-domain sockets: 946 947 #!/usr/bin/perl 948 use strict; 949 use warnings; 950 use Socket; 951 952 my $remote = shift || "localhost"; 953 my $port = shift || 2345; # random port 954 if ($port =~ /\D/) { $port = getservbyname($port, "tcp") } 955 die "No port" unless $port; 956 my $iaddr = inet_aton($remote) || die "no host: $remote"; 957 my $paddr = sockaddr_in($port, $iaddr); 958 959 my $proto = getprotobyname("tcp"); 960 socket(my $sock, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 961 connect($sock, $paddr) || die "connect: $!"; 962 while (my $line = <$sock>) { 963 print $line; 964 } 965 966 close ($sock) || die "close: $!"; 967 exit(0); 968 969And here's a corresponding server to go along with it. We'll 970leave the address as C<INADDR_ANY> so that the kernel can choose 971the appropriate interface on multihomed hosts. If you want sit 972on a particular interface (like the external side of a gateway 973or firewall machine), fill this in with your real address instead. 974 975 #!/usr/bin/perl -T 976 use strict; 977 use warnings; 978 BEGIN { $ENV{PATH} = "/usr/bin:/bin" } 979 use Socket; 980 use Carp; 981 my $EOL = "\015\012"; 982 983 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } 984 985 my $port = shift || 2345; 986 die "invalid port" unless $port =~ /^ \d+ $/x; 987 988 my $proto = getprotobyname("tcp"); 989 990 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 991 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) 992 || die "setsockopt: $!"; 993 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; 994 listen($server, SOMAXCONN) || die "listen: $!"; 995 996 logmsg "server started on port $port"; 997 998 for (my $paddr; $paddr = accept(my $client, $server); close $client) { 999 my($port, $iaddr) = sockaddr_in($paddr); 1000 my $name = gethostbyaddr($iaddr, AF_INET); 1001 1002 logmsg "connection from $name [", 1003 inet_ntoa($iaddr), "] 1004 at port $port"; 1005 1006 print $client "Hello there, $name, it's now ", 1007 scalar localtime(), $EOL; 1008 } 1009 1010And here's a multitasking version. It's multitasked in that 1011like most typical servers, it spawns (fork()s) a slave server to 1012handle the client request so that the master server can quickly 1013go back to service a new client. 1014 1015 #!/usr/bin/perl -T 1016 use strict; 1017 use warnings; 1018 BEGIN { $ENV{PATH} = "/usr/bin:/bin" } 1019 use Socket; 1020 use Carp; 1021 my $EOL = "\015\012"; 1022 1023 sub spawn; # forward declaration 1024 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } 1025 1026 my $port = shift || 2345; 1027 die "invalid port" unless $port =~ /^ \d+ $/x; 1028 1029 my $proto = getprotobyname("tcp"); 1030 1031 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 1032 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) 1033 || die "setsockopt: $!"; 1034 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; 1035 listen($server, SOMAXCONN) || die "listen: $!"; 1036 1037 logmsg "server started on port $port"; 1038 1039 my $waitedpid = 0; 1040 1041 use POSIX ":sys_wait_h"; 1042 use Errno; 1043 1044 sub REAPER { 1045 local $!; # don't let waitpid() overwrite current error 1046 while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) { 1047 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : ""); 1048 } 1049 $SIG{CHLD} = \&REAPER; # loathe SysV 1050 } 1051 1052 $SIG{CHLD} = \&REAPER; 1053 1054 while (1) { 1055 my $paddr = accept(my $client, $server) || do { 1056 # try again if accept() returned because got a signal 1057 next if $!{EINTR}; 1058 die "accept: $!"; 1059 }; 1060 my ($port, $iaddr) = sockaddr_in($paddr); 1061 my $name = gethostbyaddr($iaddr, AF_INET); 1062 1063 logmsg "connection from $name [", 1064 inet_ntoa($iaddr), 1065 "] at port $port"; 1066 1067 spawn $client, sub { 1068 $| = 1; 1069 print "Hello there, $name, it's now ", 1070 scalar localtime(), 1071 $EOL; 1072 exec "/usr/games/fortune" # XXX: "wrong" line terminators 1073 or confess "can't exec fortune: $!"; 1074 }; 1075 close $client; 1076 } 1077 1078 sub spawn { 1079 my $client = shift; 1080 my $coderef = shift; 1081 1082 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") { 1083 confess "usage: spawn CLIENT CODEREF"; 1084 } 1085 1086 my $pid; 1087 unless (defined($pid = fork())) { 1088 logmsg "cannot fork: $!"; 1089 return; 1090 } 1091 elsif ($pid) { 1092 logmsg "begat $pid"; 1093 return; # I'm the parent 1094 } 1095 # else I'm the child -- go spawn 1096 1097 open(STDIN, "<&", $client) || die "can't dup client to stdin"; 1098 open(STDOUT, ">&", $client) || die "can't dup client to stdout"; 1099 ## open(STDERR, ">&", STDOUT) || die "can't dup stdout to stderr"; 1100 exit($coderef->()); 1101 } 1102 1103This server takes the trouble to clone off a child version via fork() 1104for each incoming request. That way it can handle many requests at 1105once, which you might not always want. Even if you don't fork(), the 1106listen() will allow that many pending connections. Forking servers 1107have to be particularly careful about cleaning up their dead children 1108(called "zombies" in Unix parlance), because otherwise you'll quickly 1109fill up your process table. The REAPER subroutine is used here to 1110call waitpid() for any child processes that have finished, thereby 1111ensuring that they terminate cleanly and don't join the ranks of the 1112living dead. 1113 1114Within the while loop we call accept() and check to see if it returns 1115a false value. This would normally indicate a system error needs 1116to be reported. However, the introduction of safe signals (see 1117L</Deferred Signals (Safe Signals)> above) in Perl 5.8.0 means that 1118accept() might also be interrupted when the process receives a signal. 1119This typically happens when one of the forked subprocesses exits and 1120notifies the parent process with a CHLD signal. 1121 1122If accept() is interrupted by a signal, $! will be set to EINTR. 1123If this happens, we can safely continue to the next iteration of 1124the loop and another call to accept(). It is important that your 1125signal handling code not modify the value of $!, or else this test 1126will likely fail. In the REAPER subroutine we create a local version 1127of $! before calling waitpid(). When waitpid() sets $! to ECHILD as 1128it inevitably does when it has no more children waiting, it 1129updates the local copy and leaves the original unchanged. 1130 1131You should use the B<-T> flag to enable taint checking (see L<perlsec>) 1132even if we aren't running setuid or setgid. This is always a good idea 1133for servers or any program run on behalf of someone else (like CGI 1134scripts), because it lessens the chances that people from the outside will 1135be able to compromise your system. 1136 1137Let's look at another TCP client. This one connects to the TCP "time" 1138service on a number of different machines and shows how far their clocks 1139differ from the system on which it's being run: 1140 1141 #!/usr/bin/perl 1142 use strict; 1143 use warnings; 1144 use Socket; 1145 1146 my $SECS_OF_70_YEARS = 2208988800; 1147 sub ctime { scalar localtime(shift() || time()) } 1148 1149 my $iaddr = gethostbyname("localhost"); 1150 my $proto = getprotobyname("tcp"); 1151 my $port = getservbyname("time", "tcp"); 1152 my $paddr = sockaddr_in(0, $iaddr); 1153 1154 $| = 1; 1155 printf "%-24s %8s %s\n", "localhost", 0, ctime(); 1156 1157 foreach my $host (@ARGV) { 1158 printf "%-24s ", $host; 1159 my $hisiaddr = inet_aton($host) || die "unknown host"; 1160 my $hispaddr = sockaddr_in($port, $hisiaddr); 1161 socket(my $socket, PF_INET, SOCK_STREAM, $proto) 1162 || die "socket: $!"; 1163 connect($socket, $hispaddr) || die "connect: $!"; 1164 my $rtime = pack("C4", ()); 1165 read($socket, $rtime, 4); 1166 close($socket); 1167 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS; 1168 printf "%8d %s\n", $histime - time(), ctime($histime); 1169 } 1170 1171=head2 Unix-Domain TCP Clients and Servers 1172 1173That's fine for Internet-domain clients and servers, but what about local 1174communications? While you can use the same setup, sometimes you don't 1175want to. Unix-domain sockets are local to the current host, and are often 1176used internally to implement pipes. Unlike Internet domain sockets, Unix 1177domain sockets can show up in the file system with an ls(1) listing. 1178 1179 % ls -l /dev/log 1180 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log 1181 1182You can test for these with Perl's B<-S> file test: 1183 1184 unless (-S "/dev/log") { 1185 die "something's wicked with the log system"; 1186 } 1187 1188Here's a sample Unix-domain client: 1189 1190 #!/usr/bin/perl 1191 use Socket; 1192 use strict; 1193 use warnings; 1194 1195 my $rendezvous = shift || "catsock"; 1196 socket(my $sock, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!"; 1197 connect($sock, sockaddr_un($rendezvous)) || die "connect: $!"; 1198 while (defined(my $line = <$sock>)) { 1199 print $line; 1200 } 1201 exit(0); 1202 1203And here's a corresponding server. You don't have to worry about silly 1204network terminators here because Unix domain sockets are guaranteed 1205to be on the localhost, and thus everything works right. 1206 1207 #!/usr/bin/perl -T 1208 use strict; 1209 use warnings; 1210 use Socket; 1211 use Carp; 1212 1213 BEGIN { $ENV{PATH} = "/usr/bin:/bin" } 1214 sub spawn; # forward declaration 1215 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } 1216 1217 my $NAME = "catsock"; 1218 my $uaddr = sockaddr_un($NAME); 1219 my $proto = getprotobyname("tcp"); 1220 1221 socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!"; 1222 unlink($NAME); 1223 bind ($server, $uaddr) || die "bind: $!"; 1224 listen($server, SOMAXCONN) || die "listen: $!"; 1225 1226 logmsg "server started on $NAME"; 1227 1228 my $waitedpid; 1229 1230 use POSIX ":sys_wait_h"; 1231 sub REAPER { 1232 my $child; 1233 while (($waitedpid = waitpid(-1, WNOHANG)) > 0) { 1234 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : ""); 1235 } 1236 $SIG{CHLD} = \&REAPER; # loathe SysV 1237 } 1238 1239 $SIG{CHLD} = \&REAPER; 1240 1241 1242 for ( $waitedpid = 0; 1243 accept(my $client, $server) || $waitedpid; 1244 $waitedpid = 0, close $client) 1245 { 1246 next if $waitedpid; 1247 logmsg "connection on $NAME"; 1248 spawn $client, sub { 1249 print "Hello there, it's now ", scalar localtime(), "\n"; 1250 exec("/usr/games/fortune") || die "can't exec fortune: $!"; 1251 }; 1252 } 1253 1254 sub spawn { 1255 my $client = shift(); 1256 my $coderef = shift(); 1257 1258 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") { 1259 confess "usage: spawn CLIENT CODEREF"; 1260 } 1261 1262 my $pid; 1263 unless (defined($pid = fork())) { 1264 logmsg "cannot fork: $!"; 1265 return; 1266 } 1267 elsif ($pid) { 1268 logmsg "begat $pid"; 1269 return; # I'm the parent 1270 } 1271 else { 1272 # I'm the child -- go spawn 1273 } 1274 1275 open(STDIN, "<&", $client) 1276 || die "can't dup client to stdin"; 1277 open(STDOUT, ">&", $client) 1278 || die "can't dup client to stdout"; 1279 ## open(STDERR, ">&", STDOUT) 1280 ## || die "can't dup stdout to stderr"; 1281 exit($coderef->()); 1282 } 1283 1284As you see, it's remarkably similar to the Internet domain TCP server, so 1285much so, in fact, that we've omitted several duplicate functions--spawn(), 1286logmsg(), ctime(), and REAPER()--which are the same as in the other server. 1287 1288So why would you ever want to use a Unix domain socket instead of a 1289simpler named pipe? Because a named pipe doesn't give you sessions. You 1290can't tell one process's data from another's. With socket programming, 1291you get a separate session for each client; that's why accept() takes two 1292arguments. 1293 1294For example, let's say that you have a long-running database server daemon 1295that you want folks to be able to access from the Web, but only 1296if they go through a CGI interface. You'd have a small, simple CGI 1297program that does whatever checks and logging you feel like, and then acts 1298as a Unix-domain client and connects to your private server. 1299 1300=head1 TCP Clients with IO::Socket 1301 1302For those preferring a higher-level interface to socket programming, the 1303IO::Socket module provides an object-oriented approach. If for some reason 1304you lack this module, you can just fetch IO::Socket from CPAN, where you'll also 1305find modules providing easy interfaces to the following systems: DNS, FTP, 1306Ident (RFC 931), NIS and NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, 1307Telnet, and Time--to name just a few. 1308 1309=head2 A Simple Client 1310 1311Here's a client that creates a TCP connection to the "daytime" 1312service at port 13 of the host name "localhost" and prints out everything 1313that the server there cares to provide. 1314 1315 #!/usr/bin/perl 1316 use strict; 1317 use warnings; 1318 use IO::Socket; 1319 my $remote = IO::Socket::INET->new( 1320 Proto => "tcp", 1321 PeerAddr => "localhost", 1322 PeerPort => "daytime(13)", 1323 ) 1324 || die "can't connect to daytime service on localhost"; 1325 while (<$remote>) { print } 1326 1327When you run this program, you should get something back that 1328looks like this: 1329 1330 Wed May 14 08:40:46 MDT 1997 1331 1332Here are what those parameters to the new() constructor mean: 1333 1334=over 4 1335 1336=item C<Proto> 1337 1338This is which protocol to use. In this case, the socket handle returned 1339will be connected to a TCP socket, because we want a stream-oriented 1340connection, that is, one that acts pretty much like a plain old file. 1341Not all sockets are this of this type. For example, the UDP protocol 1342can be used to make a datagram socket, used for message-passing. 1343 1344=item C<PeerAddr> 1345 1346This is the name or Internet address of the remote host the server is 1347running on. We could have specified a longer name like C<"www.perl.com">, 1348or an address like C<"207.171.7.72">. For demonstration purposes, we've 1349used the special hostname C<"localhost">, which should always mean the 1350current machine you're running on. The corresponding Internet address 1351for localhost is C<"127.0.0.1">, if you'd rather use that. 1352 1353=item C<PeerPort> 1354 1355This is the service name or port number we'd like to connect to. 1356We could have gotten away with using just C<"daytime"> on systems with a 1357well-configured system services file,[FOOTNOTE: The system services file 1358is found in I</etc/services> under Unixy systems.] but here we've specified the 1359port number (13) in parentheses. Using just the number would have also 1360worked, but numeric literals make careful programmers nervous. 1361 1362=back 1363 1364=head2 A Webget Client 1365 1366Here's a simple client that takes a remote host to fetch a document 1367from, and then a list of files to get from that host. This is a 1368more interesting client than the previous one because it first sends 1369something to the server before fetching the server's response. 1370 1371 #!/usr/bin/perl 1372 use strict; 1373 use warnings; 1374 use IO::Socket; 1375 unless (@ARGV > 1) { die "usage: $0 host url ..." } 1376 my $host = shift(@ARGV); 1377 my $EOL = "\015\012"; 1378 my $BLANK = $EOL x 2; 1379 for my $document (@ARGV) { 1380 my $remote = IO::Socket::INET->new( Proto => "tcp", 1381 PeerAddr => $host, 1382 PeerPort => "http(80)", 1383 ) || die "cannot connect to httpd on $host"; 1384 $remote->autoflush(1); 1385 print $remote "GET $document HTTP/1.0" . $BLANK; 1386 while ( <$remote> ) { print } 1387 close $remote; 1388 } 1389 1390The web server handling the HTTP service is assumed to be at 1391its standard port, number 80. If the server you're trying to 1392connect to is at a different port, like 1080 or 8080, you should specify it 1393as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush> 1394method is used on the socket because otherwise the system would buffer 1395up the output we sent it. (If you're on a prehistoric Mac, you'll also 1396need to change every C<"\n"> in your code that sends data over the network 1397to be a C<"\015\012"> instead.) 1398 1399Connecting to the server is only the first part of the process: once you 1400have the connection, you have to use the server's language. Each server 1401on the network has its own little command language that it expects as 1402input. The string that we send to the server starting with "GET" is in 1403HTTP syntax. In this case, we simply request each specified document. 1404Yes, we really are making a new connection for each document, even though 1405it's the same host. That's the way you always used to have to speak HTTP. 1406Recent versions of web browsers may request that the remote server leave 1407the connection open a little while, but the server doesn't have to honor 1408such a request. 1409 1410Here's an example of running that program, which we'll call I<webget>: 1411 1412 % webget www.perl.com /guanaco.html 1413 HTTP/1.1 404 File Not Found 1414 Date: Thu, 08 May 1997 18:02:32 GMT 1415 Server: Apache/1.2b6 1416 Connection: close 1417 Content-type: text/html 1418 1419 <HEAD><TITLE>404 File Not Found</TITLE></HEAD> 1420 <BODY><H1>File Not Found</H1> 1421 The requested URL /guanaco.html was not found on this server.<P> 1422 </BODY> 1423 1424Ok, so that's not very interesting, because it didn't find that 1425particular document. But a long response wouldn't have fit on this page. 1426 1427For a more featureful version of this program, you should look to 1428the I<lwp-request> program included with the LWP modules from CPAN. 1429 1430=head2 Interactive Client with IO::Socket 1431 1432Well, that's all fine if you want to send one command and get one answer, 1433but what about setting up something fully interactive, somewhat like 1434the way I<telnet> works? That way you can type a line, get the answer, 1435type a line, get the answer, etc. 1436 1437This client is more complicated than the two we've done so far, but if 1438you're on a system that supports the powerful C<fork> call, the solution 1439isn't that rough. Once you've made the connection to whatever service 1440you'd like to chat with, call C<fork> to clone your process. Each of 1441these two identical process has a very simple job to do: the parent 1442copies everything from the socket to standard output, while the child 1443simultaneously copies everything from standard input to the socket. 1444To accomplish the same thing using just one process would be I<much> 1445harder, because it's easier to code two processes to do one thing than it 1446is to code one process to do two things. (This keep-it-simple principle 1447a cornerstones of the Unix philosophy, and good software engineering as 1448well, which is probably why it's spread to other systems.) 1449 1450Here's the code: 1451 1452 #!/usr/bin/perl 1453 use strict; 1454 use warnings; 1455 use IO::Socket; 1456 1457 unless (@ARGV == 2) { die "usage: $0 host port" } 1458 my ($host, $port) = @ARGV; 1459 1460 # create a tcp connection to the specified host and port 1461 my $handle = IO::Socket::INET->new(Proto => "tcp", 1462 PeerAddr => $host, 1463 PeerPort => $port) 1464 || die "can't connect to port $port on $host: $!"; 1465 1466 $handle->autoflush(1); # so output gets there right away 1467 print STDERR "[Connected to $host:$port]\n"; 1468 1469 # split the program into two processes, identical twins 1470 die "can't fork: $!" unless defined(my $kidpid = fork()); 1471 1472 # the if{} block runs only in the parent process 1473 if ($kidpid) { 1474 # copy the socket to standard output 1475 while (defined (my $line = <$handle>)) { 1476 print STDOUT $line; 1477 } 1478 kill("TERM", $kidpid); # send SIGTERM to child 1479 } 1480 # the else{} block runs only in the child process 1481 else { 1482 # copy standard input to the socket 1483 while (defined (my $line = <STDIN>)) { 1484 print $handle $line; 1485 } 1486 exit(0); # just in case 1487 } 1488 1489The C<kill> function in the parent's C<if> block is there to send a 1490signal to our child process, currently running in the C<else> block, 1491as soon as the remote server has closed its end of the connection. 1492 1493If the remote server sends data a byte at time, and you need that 1494data immediately without waiting for a newline (which might not happen), 1495you may wish to replace the C<while> loop in the parent with the 1496following: 1497 1498 my $byte; 1499 while (sysread($handle, $byte, 1) == 1) { 1500 print STDOUT $byte; 1501 } 1502 1503Making a system call for each byte you want to read is not very efficient 1504(to put it mildly) but is the simplest to explain and works reasonably 1505well. 1506 1507=head1 TCP Servers with IO::Socket 1508 1509As always, setting up a server is little bit more involved than running a client. 1510The model is that the server creates a special kind of socket that 1511does nothing but listen on a particular port for incoming connections. 1512It does this by calling the C<< IO::Socket::INET->new() >> method with 1513slightly different arguments than the client did. 1514 1515=over 4 1516 1517=item Proto 1518 1519This is which protocol to use. Like our clients, we'll 1520still specify C<"tcp"> here. 1521 1522=item LocalPort 1523 1524We specify a local 1525port in the C<LocalPort> argument, which we didn't do for the client. 1526This is service name or port number for which you want to be the 1527server. (Under Unix, ports under 1024 are restricted to the 1528superuser.) In our sample, we'll use port 9000, but you can use 1529any port that's not currently in use on your system. If you try 1530to use one already in used, you'll get an "Address already in use" 1531message. Under Unix, the C<netstat -a> command will show 1532which services current have servers. 1533 1534=item Listen 1535 1536The C<Listen> parameter is set to the maximum number of 1537pending connections we can accept until we turn away incoming clients. 1538Think of it as a call-waiting queue for your telephone. 1539The low-level Socket module has a special symbol for the system maximum, which 1540is SOMAXCONN. 1541 1542=item Reuse 1543 1544The C<Reuse> parameter is needed so that we restart our server 1545manually without waiting a few minutes to allow system buffers to 1546clear out. 1547 1548=back 1549 1550Once the generic server socket has been created using the parameters 1551listed above, the server then waits for a new client to connect 1552to it. The server blocks in the C<accept> method, which eventually accepts a 1553bidirectional connection from the remote client. (Make sure to autoflush 1554this handle to circumvent buffering.) 1555 1556To add to user-friendliness, our server prompts the user for commands. 1557Most servers don't do this. Because of the prompt without a newline, 1558you'll have to use the C<sysread> variant of the interactive client above. 1559 1560This server accepts one of five different commands, sending output back to 1561the client. Unlike most network servers, this one handles only one 1562incoming client at a time. Multitasking servers are covered in 1563Chapter 16 of the Camel. 1564 1565Here's the code. 1566 1567 #!/usr/bin/perl 1568 use strict; 1569 use warnings; 1570 use IO::Socket; 1571 use Net::hostent; # for OOish version of gethostbyaddr 1572 1573 my $PORT = 9000; # pick something not in use 1574 1575 my $server = IO::Socket::INET->new( Proto => "tcp", 1576 LocalPort => $PORT, 1577 Listen => SOMAXCONN, 1578 Reuse => 1); 1579 1580 die "can't setup server" unless $server; 1581 print "[Server $0 accepting clients]\n"; 1582 1583 while (my $client = $server->accept()) { 1584 $client->autoflush(1); 1585 print $client "Welcome to $0; type help for command list.\n"; 1586 my $hostinfo = gethostbyaddr($client->peeraddr); 1587 printf "[Connect from %s]\n", 1588 $hostinfo ? $hostinfo->name : $client->peerhost; 1589 print $client "Command? "; 1590 while ( <$client>) { 1591 next unless /\S/; # blank line 1592 if (/quit|exit/i) { last } 1593 elsif (/date|time/i) { printf $client "%s\n", scalar localtime() } 1594 elsif (/who/i ) { print $client `who 2>&1` } 1595 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1` } 1596 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1` } 1597 else { 1598 print $client "Commands: quit date who cookie motd\n"; 1599 } 1600 } continue { 1601 print $client "Command? "; 1602 } 1603 close $client; 1604 } 1605 1606=head1 UDP: Message Passing 1607 1608Another kind of client-server setup is one that uses not connections, but 1609messages. UDP communications involve much lower overhead but also provide 1610less reliability, as there are no promises that messages will arrive at 1611all, let alone in order and unmangled. Still, UDP offers some advantages 1612over TCP, including being able to "broadcast" or "multicast" to a whole 1613bunch of destination hosts at once (usually on your local subnet). If you 1614find yourself overly concerned about reliability and start building checks 1615into your message system, then you probably should use just TCP to start 1616with. 1617 1618UDP datagrams are I<not> a bytestream and should not be treated as such. 1619This makes using I/O mechanisms with internal buffering like stdio (i.e. 1620print() and friends) especially cumbersome. Use syswrite(), or better 1621send(), like in the example below. 1622 1623Here's a UDP program similar to the sample Internet TCP client given 1624earlier. However, instead of checking one host at a time, the UDP version 1625will check many of them asynchronously by simulating a multicast and then 1626using select() to do a timed-out wait for I/O. To do something similar 1627with TCP, you'd have to use a different socket handle for each host. 1628 1629 #!/usr/bin/perl 1630 use strict; 1631 use warnings; 1632 use Socket; 1633 use Sys::Hostname; 1634 1635 my $SECS_OF_70_YEARS = 2_208_988_800; 1636 1637 my $iaddr = gethostbyname(hostname()); 1638 my $proto = getprotobyname("udp"); 1639 my $port = getservbyname("time", "udp"); 1640 my $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick 1641 1642 socket(my $socket, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!"; 1643 bind($socket, $paddr) || die "bind: $!"; 1644 1645 $| = 1; 1646 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime(); 1647 my $count = 0; 1648 for my $host (@ARGV) { 1649 $count++; 1650 my $hisiaddr = inet_aton($host) || die "unknown host"; 1651 my $hispaddr = sockaddr_in($port, $hisiaddr); 1652 defined(send($socket, 0, 0, $hispaddr)) || die "send $host: $!"; 1653 } 1654 1655 my $rout = my $rin = ""; 1656 vec($rin, fileno($socket), 1) = 1; 1657 1658 # timeout after 10.0 seconds 1659 while ($count && select($rout = $rin, undef, undef, 10.0)) { 1660 my $rtime = ""; 1661 my $hispaddr = recv($socket, $rtime, 4, 0) || die "recv: $!"; 1662 my ($port, $hisiaddr) = sockaddr_in($hispaddr); 1663 my $host = gethostbyaddr($hisiaddr, AF_INET); 1664 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS; 1665 printf "%-12s ", $host; 1666 printf "%8d %s\n", $histime - time(), scalar localtime($histime); 1667 $count--; 1668 } 1669 1670This example does not include any retries and may consequently fail to 1671contact a reachable host. The most prominent reason for this is congestion 1672of the queues on the sending host if the number of hosts to contact is 1673sufficiently large. 1674 1675=head1 SysV IPC 1676 1677While System V IPC isn't so widely used as sockets, it still has some 1678interesting uses. However, you cannot use SysV IPC or Berkeley mmap() to 1679have a variable shared amongst several processes. That's because Perl 1680would reallocate your string when you weren't wanting it to. You might 1681look into the C<IPC::Shareable> or C<threads::shared> modules for that. 1682 1683Here's a small example showing shared memory usage. 1684 1685 use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR); 1686 1687 my $size = 2000; 1688 my $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR); 1689 defined($id) || die "shmget: $!"; 1690 print "shm key $id\n"; 1691 1692 my $message = "Message #1"; 1693 shmwrite($id, $message, 0, 60) || die "shmwrite: $!"; 1694 print "wrote: '$message'\n"; 1695 shmread($id, my $buff, 0, 60) || die "shmread: $!"; 1696 print "read : '$buff'\n"; 1697 1698 # the buffer of shmread is zero-character end-padded. 1699 substr($buff, index($buff, "\0")) = ""; 1700 print "un" unless $buff eq $message; 1701 print "swell\n"; 1702 1703 print "deleting shm $id\n"; 1704 shmctl($id, IPC_RMID, 0) || die "shmctl: $!"; 1705 1706Here's an example of a semaphore: 1707 1708 use IPC::SysV qw(IPC_CREAT); 1709 1710 my $IPC_KEY = 1234; 1711 my $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT); 1712 defined($id) || die "semget: $!"; 1713 print "sem id $id\n"; 1714 1715Put this code in a separate file to be run in more than one process. 1716Call the file F<take>: 1717 1718 # create a semaphore 1719 1720 my $IPC_KEY = 1234; 1721 my $id = semget($IPC_KEY, 0, 0); 1722 defined($id) || die "semget: $!"; 1723 1724 my $semnum = 0; 1725 my $semflag = 0; 1726 1727 # "take" semaphore 1728 # wait for semaphore to be zero 1729 my $semop = 0; 1730 my $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag); 1731 1732 # Increment the semaphore count 1733 $semop = 1; 1734 my $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag); 1735 my $opstring = $opstring1 . $opstring2; 1736 1737 semop($id, $opstring) || die "semop: $!"; 1738 1739Put this code in a separate file to be run in more than one process. 1740Call this file F<give>: 1741 1742 # "give" the semaphore 1743 # run this in the original process and you will see 1744 # that the second process continues 1745 1746 my $IPC_KEY = 1234; 1747 my $id = semget($IPC_KEY, 0, 0); 1748 die unless defined($id); 1749 1750 my $semnum = 0; 1751 my $semflag = 0; 1752 1753 # Decrement the semaphore count 1754 my $semop = -1; 1755 my $opstring = pack("s!s!s!", $semnum, $semop, $semflag); 1756 1757 semop($id, $opstring) || die "semop: $!"; 1758 1759The SysV IPC code above was written long ago, and it's definitely 1760clunky looking. For a more modern look, see the IPC::SysV module. 1761 1762A small example demonstrating SysV message queues: 1763 1764 use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR); 1765 1766 my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR); 1767 defined($id) || die "msgget failed: $!"; 1768 1769 my $sent = "message"; 1770 my $type_sent = 1234; 1771 1772 msgsnd($id, pack("l! a*", $type_sent, $sent), 0) 1773 || die "msgsnd failed: $!"; 1774 1775 msgrcv($id, my $rcvd_buf, 60, 0, 0) 1776 || die "msgrcv failed: $!"; 1777 1778 my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf); 1779 1780 if ($rcvd eq $sent) { 1781 print "okay\n"; 1782 } else { 1783 print "not okay\n"; 1784 } 1785 1786 msgctl($id, IPC_RMID, 0) || die "msgctl failed: $!\n"; 1787 1788=head1 NOTES 1789 1790Most of these routines quietly but politely return C<undef> when they 1791fail instead of causing your program to die right then and there due to 1792an uncaught exception. (Actually, some of the new I<Socket> conversion 1793functions do croak() on bad arguments.) It is therefore essential to 1794check return values from these functions. Always begin your socket 1795programs this way for optimal success, and don't forget to add the B<-T> 1796taint-checking flag to the C<#!> line for servers: 1797 1798 #!/usr/bin/perl -T 1799 use strict; 1800 use warnings; 1801 use sigtrap; 1802 use Socket; 1803 1804=head1 BUGS 1805 1806These routines all create system-specific portability problems. As noted 1807elsewhere, Perl is at the mercy of your C libraries for much of its system 1808behavior. It's probably safest to assume broken SysV semantics for 1809signals and to stick with simple TCP and UDP socket operations; e.g., don't 1810try to pass open file descriptors over a local UDP datagram socket if you 1811want your code to stand a chance of being portable. 1812 1813=head1 AUTHOR 1814 1815Tom Christiansen, with occasional vestiges of Larry Wall's original 1816version and suggestions from the Perl Porters. 1817 1818=head1 SEE ALSO 1819 1820There's a lot more to networking than this, but this should get you 1821started. 1822 1823For intrepid programmers, the indispensable textbook is I<Unix Network 1824Programming, 2nd Edition, Volume 1> by W. Richard Stevens (published by 1825Prentice-Hall). Most books on networking address the subject from the 1826perspective of a C programmer; translation to Perl is left as an exercise 1827for the reader. 1828 1829The IO::Socket(3) manpage describes the object library, and the Socket(3) 1830manpage describes the low-level interface to sockets. Besides the obvious 1831functions in L<perlfunc>, you should also check out the F<modules> file at 1832your nearest CPAN site, especially 1833L<http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>. 1834See L<perlmodlib> or best yet, the F<Perl FAQ> for a description 1835of what CPAN is and where to get it if the previous link doesn't work 1836for you. 1837 1838Section 5 of CPAN's F<modules> file is devoted to "Networking, Device 1839Control (modems), and Interprocess Communication", and contains numerous 1840unbundled modules numerous networking modules, Chat and Expect operations, 1841CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet, 1842Threads, and ToolTalk--to name just a few. 1843