1=head1 NAME 2 3perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores) 4 5=head1 DESCRIPTION 6 7The basic IPC facilities of Perl are built out of the good old Unix 8signals, named pipes, pipe opens, the Berkeley socket routines, and SysV 9IPC calls. Each is used in slightly different situations. 10 11=head1 Signals 12 13Perl uses a simple signal handling model: the %SIG hash contains names 14or references of user-installed signal handlers. These handlers will 15be called with an argument which is the name of the signal that 16triggered it. A signal may be generated intentionally from a 17particular keyboard sequence like control-C or control-Z, sent to you 18from another process, or triggered automatically by the kernel when 19special events transpire, like a child process exiting, your own process 20running out of stack space, or hitting a process file-size limit. 21 22For example, to trap an interrupt signal, set up a handler like this: 23 24 our $shucks; 25 26 sub catch_zap { 27 my $signame = shift; 28 $shucks++; 29 die "Somebody sent me a SIG$signame"; 30 } 31 $SIG{INT} = __PACKAGE__ . "::catch_zap"; 32 $SIG{INT} = \&catch_zap; # best strategy 33 34Prior to Perl 5.8.0 it was necessary to do as little as you possibly 35could in your handler; notice how all we do is set a global variable 36and then raise an exception. That's because on most systems, 37libraries are not re-entrant; particularly, memory allocation and I/O 38routines are not. That meant that doing nearly I<anything> in your 39handler could in theory trigger a memory fault and subsequent core 40dump - see L</Deferred Signals (Safe Signals)> below. 41 42The names of the signals are the ones listed out by C<kill -l> on your 43system, or you can retrieve them using the CPAN module L<IPC::Signal>. 44 45You may also choose to assign the strings C<"IGNORE"> or C<"DEFAULT"> as 46the handler, in which case Perl will try to discard the signal or do the 47default thing. 48 49On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal 50has special behavior with respect to a value of C<"IGNORE">. 51Setting C<$SIG{CHLD}> to C<"IGNORE"> on such a platform has the effect of 52not creating zombie processes when the parent process fails to C<wait()> 53on its child processes (i.e., child processes are automatically reaped). 54Calling C<wait()> with C<$SIG{CHLD}> set to C<"IGNORE"> usually returns 55C<-1> on such platforms. 56 57Some signals can be neither trapped nor ignored, such as the KILL and STOP 58(but not the TSTP) signals. Note that ignoring signals makes them disappear. 59If you only want them blocked temporarily without them getting lost you'll 60have to use the C<POSIX> module's L<sigprocmask|POSIX/sigprocmask>. 61 62Sending a signal to a negative process ID means that you send the signal 63to the entire Unix process group. This code sends a hang-up signal to all 64processes in the current process group, and also sets $SIG{HUP} to C<"IGNORE"> 65so it doesn't kill itself: 66 67 # block scope for local 68 { 69 local $SIG{HUP} = "IGNORE"; 70 kill HUP => -getpgrp(); 71 # snazzy writing of: kill("HUP", -getpgrp()) 72 } 73 74Another interesting signal to send is signal number zero. This doesn't 75actually affect a child process, but instead checks whether it's alive 76or has changed its UIDs. 77 78 unless (kill 0 => $kid_pid) { 79 warn "something wicked happened to $kid_pid"; 80 } 81 82Signal number zero may fail because you lack permission to send the 83signal when directed at a process whose real or saved UID is not 84identical to the real or effective UID of the sending process, even 85though the process is alive. You may be able to determine the cause of 86failure using C<$!> or C<%!>. 87 88 unless (kill(0 => $pid) || $!{EPERM}) { 89 warn "$pid looks dead"; 90 } 91 92You might also want to employ anonymous functions for simple signal 93handlers: 94 95 $SIG{INT} = sub { die "\nOutta here!\n" }; 96 97SIGCHLD handlers require some special care. If a second child dies 98while in the signal handler caused by the first death, we won't get 99another signal. So must loop here else we will leave the unreaped child 100as a zombie. And the next time two children die we get another zombie. 101And so on. 102 103 use POSIX ":sys_wait_h"; 104 $SIG{CHLD} = sub { 105 while ((my $child = waitpid(-1, WNOHANG)) > 0) { 106 $Kid_Status{$child} = $?; 107 } 108 }; 109 # do something that forks... 110 111Be careful: qx(), system(), and some modules for calling external commands 112do a fork(), then wait() for the result. Thus, your signal handler 113will be called. Because wait() was already called by system() or qx(), 114the wait() in the signal handler will see no more zombies and will 115therefore block. 116 117The best way to prevent this issue is to use waitpid(), as in the following 118example: 119 120 use POSIX ":sys_wait_h"; # for nonblocking read 121 122 my %children; 123 124 $SIG{CHLD} = sub { 125 # don't change $! and $? outside handler 126 local ($!, $?); 127 while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) { 128 delete $children{$pid}; 129 cleanup_child($pid, $?); 130 } 131 }; 132 133 while (1) { 134 my $pid = fork(); 135 die "cannot fork" unless defined $pid; 136 if ($pid == 0) { 137 # ... 138 exit 0; 139 } else { 140 $children{$pid}=1; 141 # ... 142 system($command); 143 # ... 144 } 145 } 146 147Signal handling is also used for timeouts in Unix. While safely 148protected within an C<eval{}> block, you set a signal handler to trap 149alarm signals and then schedule to have one delivered to you in some 150number of seconds. Then try your blocking operation, clearing the alarm 151when it's done but not before you've exited your C<eval{}> block. If it 152goes off, you'll use die() to jump out of the block. 153 154Here's an example: 155 156 my $ALARM_EXCEPTION = "alarm clock restart"; 157 eval { 158 local $SIG{ALRM} = sub { die $ALARM_EXCEPTION }; 159 alarm 10; 160 flock($fh, 2) # blocking write lock 161 || die "cannot flock: $!"; 162 alarm 0; 163 }; 164 if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die } 165 166If the operation being timed out is system() or qx(), this technique 167is liable to generate zombies. If this matters to you, you'll 168need to do your own fork() and exec(), and kill the errant child process. 169 170For more complex signal handling, you might see the standard POSIX 171module. Lamentably, this is almost entirely undocumented, but the 172F<ext/POSIX/t/sigaction.t> file from the Perl source distribution has 173some examples in it. 174 175=head2 Handling the SIGHUP Signal in Daemons 176 177A process that usually starts when the system boots and shuts down 178when the system is shut down is called a daemon (Disk And Execution 179MONitor). If a daemon process has a configuration file which is 180modified after the process has been started, there should be a way to 181tell that process to reread its configuration file without stopping 182the process. Many daemons provide this mechanism using a C<SIGHUP> 183signal handler. When you want to tell the daemon to reread the file, 184simply send it the C<SIGHUP> signal. 185 186The following example implements a simple daemon, which restarts 187itself every time the C<SIGHUP> signal is received. The actual code is 188located in the subroutine C<code()>, which just prints some debugging 189info to show that it works; it should be replaced with the real code. 190 191 #!/usr/bin/perl 192 193 use v5.36; 194 195 use POSIX (); 196 use FindBin (); 197 use File::Basename (); 198 use File::Spec::Functions qw(catfile); 199 200 $| = 1; 201 202 # make the daemon cross-platform, so exec always calls the script 203 # itself with the right path, no matter how the script was invoked. 204 my $script = File::Basename::basename($0); 205 my $SELF = catfile($FindBin::Bin, $script); 206 207 # POSIX unmasks the sigprocmask properly 208 $SIG{HUP} = sub { 209 print "got SIGHUP\n"; 210 exec($SELF, @ARGV) || die "$0: couldn't restart: $!"; 211 }; 212 213 code(); 214 215 sub code { 216 print "PID: $$\n"; 217 print "ARGV: @ARGV\n"; 218 my $count = 0; 219 while (1) { 220 sleep 2; 221 print ++$count, "\n"; 222 } 223 } 224 225 226=head2 Deferred Signals (Safe Signals) 227 228Before Perl 5.8.0, installing Perl code to deal with signals exposed you to 229danger from two things. First, few system library functions are 230re-entrant. If the signal interrupts while Perl is executing one function 231(like malloc(3) or printf(3)), and your signal handler then calls the same 232function again, you could get unpredictable behavior--often, a core dump. 233Second, Perl isn't itself re-entrant at the lowest levels. If the signal 234interrupts Perl while Perl is changing its own internal data structures, 235similarly unpredictable behavior may result. 236 237There were two things you could do, knowing this: be paranoid or be 238pragmatic. The paranoid approach was to do as little as possible in your 239signal handler. Set an existing integer variable that already has a 240value, and return. This doesn't help you if you're in a slow system call, 241which will just restart. That means you have to C<die> to longjmp(3) out 242of the handler. Even this is a little cavalier for the true paranoiac, 243who avoids C<die> in a handler because the system I<is> out to get you. 244The pragmatic approach was to say "I know the risks, but prefer the 245convenience", and to do anything you wanted in your signal handler, 246and be prepared to clean up core dumps now and again. 247 248Perl 5.8.0 and later avoid these problems by "deferring" signals. That is, 249when the signal is delivered to the process by the system (to the C code 250that implements Perl) a flag is set, and the handler returns immediately. 251Then at strategic "safe" points in the Perl interpreter (e.g. when it is 252about to execute a new opcode) the flags are checked and the Perl level 253handler from %SIG is executed. The "deferred" scheme allows much more 254flexibility in the coding of signal handlers as we know the Perl 255interpreter is in a safe state, and that we are not in a system library 256function when the handler is called. However the implementation does 257differ from previous Perls in the following ways: 258 259=over 4 260 261=item Long-running opcodes 262 263As the Perl interpreter looks at signal flags only when it is about 264to execute a new opcode, a signal that arrives during a long-running 265opcode (e.g. a regular expression operation on a very large string) will 266not be seen until the current opcode completes. 267 268If a signal of any given type fires multiple times during an opcode 269(such as from a fine-grained timer), the handler for that signal will 270be called only once, after the opcode completes; all other 271instances will be discarded. Furthermore, if your system's signal queue 272gets flooded to the point that there are signals that have been raised 273but not yet caught (and thus not deferred) at the time an opcode 274completes, those signals may well be caught and deferred during 275subsequent opcodes, with sometimes surprising results. For example, you 276may see alarms delivered even after calling C<alarm(0)> as the latter 277stops the raising of alarms but does not cancel the delivery of alarms 278raised but not yet caught. Do not depend on the behaviors described in 279this paragraph as they are side effects of the current implementation and 280may change in future versions of Perl. 281 282=item Interrupting IO 283 284When a signal is delivered (e.g., SIGINT from a control-C) the operating 285system breaks into IO operations like I<read>(2), which is used to 286implement Perl's readline() function, the C<< <> >> operator. On older 287Perls the handler was called immediately (and as C<read> is not "unsafe", 288this worked well). With the "deferred" scheme the handler is I<not> called 289immediately, and if Perl is using the system's C<stdio> library that 290library may restart the C<read> without returning to Perl to give it a 291chance to call the %SIG handler. If this happens on your system the 292solution is to use the C<:perlio> layer to do IO--at least on those handles 293that you want to be able to break into with signals. (The C<:perlio> layer 294checks the signal flags and calls %SIG handlers before resuming IO 295operation.) 296 297The default in Perl 5.8.0 and later is to automatically use 298the C<:perlio> layer. 299 300Note that it is not advisable to access a file handle within a signal 301handler where that signal has interrupted an I/O operation on that same 302handle. While perl will at least try hard not to crash, there are no 303guarantees of data integrity; for example, some data might get dropped or 304written twice. 305 306Some networking library functions like gethostbyname() are known to have 307their own implementations of timeouts which may conflict with your 308timeouts. If you have problems with such functions, try using the POSIX 309sigaction() function, which bypasses Perl safe signals. Be warned that 310this does subject you to possible memory corruption, as described above. 311 312Instead of setting C<$SIG{ALRM}>: 313 314 local $SIG{ALRM} = sub { die "alarm" }; 315 316try something like the following: 317 318 use POSIX qw(SIGALRM); 319 POSIX::sigaction(SIGALRM, 320 POSIX::SigAction->new(sub { die "alarm" })) 321 || die "Error setting SIGALRM handler: $!\n"; 322 323Another way to disable the safe signal behavior locally is to use 324the C<Perl::Unsafe::Signals> module from CPAN, which affects 325all signals. 326 327=item Restartable system calls 328 329On systems that supported it, older versions of Perl used the 330SA_RESTART flag when installing %SIG handlers. This meant that 331restartable system calls would continue rather than returning when 332a signal arrived. In order to deliver deferred signals promptly, 333Perl 5.8.0 and later do I<not> use SA_RESTART. Consequently, 334restartable system calls can fail (with $! set to C<EINTR>) in places 335where they previously would have succeeded. 336 337The default C<:perlio> layer retries C<read>, C<write> 338and C<close> as described above; interrupted C<wait> and 339C<waitpid> calls will always be retried. 340 341=item Signals as "faults" 342 343Certain signals like SEGV, ILL, BUS and FPE are generated by virtual memory 344addressing errors and similar "faults". These are normally fatal: there is 345little a Perl-level handler can do with them. So Perl delivers them 346immediately rather than attempting to defer them. 347 348It is possible to catch these with a C<%SIG> handler (see L<perlvar>), 349but on top of the usual problems of "unsafe" signals the signal is likely 350to get rethrown immediately on return from the signal handler, so such 351a handler should C<die> or C<exit> instead. 352 353=item Signals triggered by operating system state 354 355On some operating systems certain signal handlers are supposed to "do 356something" before returning. One example can be CHLD or CLD, which 357indicates a child process has completed. On some operating systems the 358signal handler is expected to C<wait> for the completed child 359process. On such systems the deferred signal scheme will not work for 360those signals: it does not do the C<wait>. Again the failure will 361look like a loop as the operating system will reissue the signal because 362there are completed child processes that have not yet been C<wait>ed for. 363 364=back 365 366If you want the old signal behavior back despite possible 367memory corruption, set the environment variable C<PERL_SIGNALS> to 368C<"unsafe">. This feature first appeared in Perl 5.8.1. 369 370=head1 Named Pipes 371 372A named pipe (often referred to as a FIFO) is an old Unix IPC 373mechanism for processes communicating on the same machine. It works 374just like regular anonymous pipes, except that the 375processes rendezvous using a filename and need not be related. 376 377To create a named pipe, use the C<POSIX::mkfifo()> function. 378 379 use POSIX qw(mkfifo); 380 mkfifo($path, 0700) || die "mkfifo $path failed: $!"; 381 382You can also use the Unix command mknod(1), or on some 383systems, mkfifo(1). These may not be in your normal path, though. 384 385 # system return val is backwards, so && not || 386 # 387 $ENV{PATH} .= ":/etc:/usr/etc"; 388 if ( system("mknod", $path, "p") 389 && system("mkfifo", $path) ) 390 { 391 die "mk{nod,fifo} $path failed"; 392 } 393 394 395A fifo is convenient when you want to connect a process to an unrelated 396one. When you open a fifo, the program will block until there's something 397on the other end. 398 399For example, let's say you'd like to have your F<.signature> file be a 400named pipe that has a Perl program on the other end. Now every time any 401program (like a mailer, news reader, finger program, etc.) tries to read 402from that file, the reading program will read the new signature from your 403program. We'll use the pipe-checking file-test operator, B<-p>, to find 404out whether anyone (or anything) has accidentally removed our fifo. 405 406 chdir(); # go home 407 my $FIFO = ".signature"; 408 409 while (1) { 410 unless (-p $FIFO) { 411 unlink $FIFO; # discard any failure, will catch later 412 require POSIX; # delayed loading of heavy module 413 POSIX::mkfifo($FIFO, 0700) 414 || die "can't mkfifo $FIFO: $!"; 415 } 416 417 # next line blocks till there's a reader 418 open (my $fh, ">", $FIFO) || die "can't open $FIFO: $!"; 419 print $fh "John Smith (smith\@host.org)\n", `fortune -s`; 420 close($fh) || die "can't close $FIFO: $!"; 421 sleep 2; # to avoid dup signals 422 } 423 424=head1 Using open() for IPC 425 426Perl's basic open() statement can also be used for unidirectional 427interprocess communication by specifying the open mode as C<|-> or C<-|>. 428Here's how to start 429something up in a child process you intend to write to: 430 431 open(my $spooler, "|-", "cat -v | lpr -h 2>/dev/null") 432 || die "can't fork: $!"; 433 local $SIG{PIPE} = sub { die "spooler pipe broke" }; 434 print $spooler "stuff\n"; 435 close $spooler || die "bad spool: $! $?"; 436 437And here's how to start up a child process you intend to read from: 438 439 open(my $status, "-|", "netstat -an 2>&1") 440 || die "can't fork: $!"; 441 while (<$status>) { 442 next if /^(tcp|udp)/; 443 print; 444 } 445 close $status || die "bad netstat: $! $?"; 446 447Be aware that these operations are full Unix forks, which means they may 448not be correctly implemented on all alien systems. See L<perlport/open> 449for portability details. 450 451In the two-argument form of open(), a pipe open can be achieved by 452either appending or prepending a pipe symbol to the second argument: 453 454 open(my $spooler, "| cat -v | lpr -h 2>/dev/null") 455 || die "can't fork: $!"; 456 open(my $status, "netstat -an 2>&1 |") 457 || die "can't fork: $!"; 458 459This can be used even on systems that do not support forking, but this 460possibly allows code intended to read files to unexpectedly execute 461programs. If one can be sure that a particular program is a Perl script 462expecting filenames in @ARGV using the two-argument form of open() or the 463C<< <> >> operator, the clever programmer can write something like this: 464 465 % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile 466 467and no matter which sort of shell it's called from, the Perl program will 468read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile> 469in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3> 470file. Pretty nifty, eh? 471 472You might notice that you could use backticks for much the 473same effect as opening a pipe for reading: 474 475 print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`; 476 die "bad netstatus ($?)" if $?; 477 478While this is true on the surface, it's much more efficient to process the 479file one line or record at a time because then you don't have to read the 480whole thing into memory at once. It also gives you finer control of the 481whole process, letting you kill off the child process early if you'd like. 482 483Be careful to check the return values from both open() and close(). If 484you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise, 485think of what happens when you start up a pipe to a command that doesn't 486exist: the open() will in all likelihood succeed (it only reflects the 487fork()'s success), but then your output will fail--spectacularly. Perl 488can't know whether the command worked, because your command is actually 489running in a separate process whose exec() might have failed. Therefore, 490while readers of bogus commands return just a quick EOF, writers 491to bogus commands will get hit with a signal, which they'd best be prepared 492to handle. Consider: 493 494 open(my $fh, "|-", "bogus") || die "can't fork: $!"; 495 print $fh "bang\n"; # neither necessary nor sufficient 496 # to check print retval! 497 close($fh) || die "can't close: $!"; 498 499The reason for not checking the return value from print() is because of 500pipe buffering; physical writes are delayed. That won't blow up until the 501close, and it will blow up with a SIGPIPE. To catch it, you could use 502this: 503 504 $SIG{PIPE} = "IGNORE"; 505 open(my $fh, "|-", "bogus") || die "can't fork: $!"; 506 print $fh "bang\n"; 507 close($fh) || die "can't close: status=$?"; 508 509=head2 Filehandles 510 511Both the main process and any child processes it forks share the same 512STDIN, STDOUT, and STDERR filehandles. If both processes try to access 513them at once, strange things can happen. You may also want to close 514or reopen the filehandles for the child. You can get around this by 515opening your pipe with open(), but on some systems this means that the 516child process cannot outlive the parent. 517 518=head2 Background Processes 519 520You can run a command in the background with: 521 522 system("cmd &"); 523 524The command's STDOUT and STDERR (and possibly STDIN, depending on your 525shell) will be the same as the parent's. You won't need to catch 526SIGCHLD because of the double-fork taking place; see below for details. 527 528=head2 Complete Dissociation of Child from Parent 529 530In some cases (starting server processes, for instance) you'll want to 531completely dissociate the child process from the parent. This is 532often called daemonization. A well-behaved daemon will also chdir() 533to the root directory so it doesn't prevent unmounting the filesystem 534containing the directory from which it was launched, and redirect its 535standard file descriptors from and to F</dev/null> so that random 536output doesn't wind up on the user's terminal. 537 538 use POSIX "setsid"; 539 540 sub daemonize { 541 chdir("/") || die "can't chdir to /: $!"; 542 open(STDIN, "<", "/dev/null") || die "can't read /dev/null: $!"; 543 open(STDOUT, ">", "/dev/null") || die "can't write /dev/null: $!"; 544 defined(my $pid = fork()) || die "can't fork: $!"; 545 exit if $pid; # non-zero now means I am the parent 546 (setsid() != -1) || die "Can't start a new session: $!"; 547 open(STDERR, ">&", STDOUT) || die "can't dup stdout: $!"; 548 } 549 550The fork() has to come before the setsid() to ensure you aren't a 551process group leader; the setsid() will fail if you are. If your 552system doesn't have the setsid() function, open F</dev/tty> and use the 553C<TIOCNOTTY> ioctl() on it instead. See tty(4) for details. 554 555Non-Unix users should check their C<< I<Your_OS>::Process >> module for 556other possible solutions. 557 558=head2 Safe Pipe Opens 559 560Another interesting approach to IPC is making your single program go 561multiprocess and communicate between--or even amongst--yourselves. The 562two-argument form of the 563open() function will accept a file argument of either C<"-|"> or C<"|-"> 564to do a very interesting thing: it forks a child connected to the 565filehandle you've opened. The child is running the same program as the 566parent. This is useful for safely opening a file when running under an 567assumed UID or GID, for example. If you open a pipe I<to> minus, you can 568write to the filehandle you opened and your kid will find it in I<his> 569STDIN. If you open a pipe I<from> minus, you can read from the filehandle 570you opened whatever your kid writes to I<his> STDOUT. 571 572 my $PRECIOUS = "/path/to/some/safe/file"; 573 my $sleep_count; 574 my $pid; 575 my $kid_to_write; 576 577 do { 578 $pid = open($kid_to_write, "|-"); 579 unless (defined $pid) { 580 warn "cannot fork: $!"; 581 die "bailing out" if $sleep_count++ > 6; 582 sleep 10; 583 } 584 } until defined $pid; 585 586 if ($pid) { # I am the parent 587 print $kid_to_write @some_data; 588 close($kid_to_write) || warn "kid exited $?"; 589 } else { # I am the child 590 # drop permissions in setuid and/or setgid programs: 591 ($>, $)) = ($<, $(); 592 open (my $outfile, ">", $PRECIOUS) 593 || die "can't open $PRECIOUS: $!"; 594 while (<STDIN>) { 595 print $outfile; # child STDIN is parent $kid_to_write 596 } 597 close($outfile) || die "can't close $PRECIOUS: $!"; 598 exit(0); # don't forget this!! 599 } 600 601Another common use for this construct is when you need to execute 602something without the shell's interference. With system(), it's 603straightforward, but you can't use a pipe open or backticks safely. 604That's because there's no way to stop the shell from getting its hands on 605your arguments. Instead, use lower-level control to call exec() directly. 606 607Here's a safe backtick or pipe open for read: 608 609 my $pid = open(my $kid_to_read, "-|"); 610 defined($pid) || die "can't fork: $!"; 611 612 if ($pid) { # parent 613 while (<$kid_to_read>) { 614 # do something interesting 615 } 616 close($kid_to_read) || warn "kid exited $?"; 617 618 } else { # child 619 ($>, $)) = ($<, $(); # suid only 620 exec($program, @options, @args) 621 || die "can't exec program: $!"; 622 # NOTREACHED 623 } 624 625And here's a safe pipe open for writing: 626 627 my $pid = open(my $kid_to_write, "|-"); 628 defined($pid) || die "can't fork: $!"; 629 630 $SIG{PIPE} = sub { die "whoops, $program pipe broke" }; 631 632 if ($pid) { # parent 633 print $kid_to_write @data; 634 close($kid_to_write) || warn "kid exited $?"; 635 636 } else { # child 637 ($>, $)) = ($<, $(); 638 exec($program, @options, @args) 639 || die "can't exec program: $!"; 640 # NOTREACHED 641 } 642 643It is very easy to dead-lock a process using this form of open(), or 644indeed with any use of pipe() with multiple subprocesses. The 645example above is "safe" because it is simple and calls exec(). See 646L</"Avoiding Pipe Deadlocks"> for general safety principles, but there 647are extra gotchas with Safe Pipe Opens. 648 649In particular, if you opened the pipe using C<open $fh, "|-">, then you 650cannot simply use close() in the parent process to close an unwanted 651writer. Consider this code: 652 653 my $pid = open(my $writer, "|-"); # fork open a kid 654 defined($pid) || die "first fork failed: $!"; 655 if ($pid) { 656 if (my $sub_pid = fork()) { 657 defined($sub_pid) || die "second fork failed: $!"; 658 close($writer) || die "couldn't close writer: $!"; 659 # now do something else... 660 } 661 else { 662 # first write to $writer 663 # ... 664 # then when finished 665 close($writer) || die "couldn't close writer: $!"; 666 exit(0); 667 } 668 } 669 else { 670 # first do something with STDIN, then 671 exit(0); 672 } 673 674In the example above, the true parent does not want to write to the $writer 675filehandle, so it closes it. However, because $writer was opened using 676C<open $fh, "|-">, it has a special behavior: closing it calls 677waitpid() (see L<perlfunc/waitpid>), which waits for the subprocess 678to exit. If the child process ends up waiting for something happening 679in the section marked "do something else", you have deadlock. 680 681This can also be a problem with intermediate subprocesses in more 682complicated code, which will call waitpid() on all open filehandles 683during global destruction--in no predictable order. 684 685To solve this, you must manually use pipe(), fork(), and the form of 686open() which sets one file descriptor to another, as shown below: 687 688 pipe(my $reader, my $writer) || die "pipe failed: $!"; 689 my $pid = fork(); 690 defined($pid) || die "first fork failed: $!"; 691 if ($pid) { 692 close $reader; 693 if (my $sub_pid = fork()) { 694 defined($sub_pid) || die "first fork failed: $!"; 695 close($writer) || die "can't close writer: $!"; 696 } 697 else { 698 # write to $writer... 699 # ... 700 # then when finished 701 close($writer) || die "can't close writer: $!"; 702 exit(0); 703 } 704 # write to $writer... 705 } 706 else { 707 open(STDIN, "<&", $reader) || die "can't reopen STDIN: $!"; 708 close($writer) || die "can't close writer: $!"; 709 # do something... 710 exit(0); 711 } 712 713Since Perl 5.8.0, you can also use the list form of C<open> for pipes. 714This is preferred when you wish to avoid having the shell interpret 715metacharacters that may be in your command string. 716 717So for example, instead of using: 718 719 open(my $ps_pipe, "-|", "ps aux") || die "can't open ps pipe: $!"; 720 721One would use either of these: 722 723 open(my $ps_pipe, "-|", "ps", "aux") 724 || die "can't open ps pipe: $!"; 725 726 my @ps_args = qw[ ps aux ]; 727 open(my $ps_pipe, "-|", @ps_args) 728 || die "can't open @ps_args|: $!"; 729 730Because there are more than three arguments to open(), it forks the ps(1) 731command I<without> spawning a shell, and reads its standard output via the 732C<$ps_pipe> filehandle. The corresponding syntax to I<write> to command 733pipes is to use C<"|-"> in place of C<"-|">. 734 735This was admittedly a rather silly example, because you're using string 736literals whose content is perfectly safe. There is therefore no cause to 737resort to the harder-to-read, multi-argument form of pipe open(). However, 738whenever you cannot be assured that the program arguments are free of shell 739metacharacters, the fancier form of open() should be used. For example: 740 741 my @grep_args = ("egrep", "-i", $some_pattern, @many_files); 742 open(my $grep_pipe, "-|", @grep_args) 743 || die "can't open @grep_args|: $!"; 744 745Here the multi-argument form of pipe open() is preferred because the 746pattern and indeed even the filenames themselves might hold metacharacters. 747 748=head2 Avoiding Pipe Deadlocks 749 750Whenever you have more than one subprocess, you must be careful that each 751closes whichever half of any pipes created for interprocess communication 752it is not using. This is because any child process reading from the pipe 753and expecting an EOF will never receive it, and therefore never exit. A 754single process closing a pipe is not enough to close it; the last process 755with the pipe open must close it for it to read EOF. 756 757Certain built-in Unix features help prevent this most of the time. For 758instance, filehandles have a "close on exec" flag, which is set I<en masse> 759under control of the C<$^F> variable. This is so any filehandles you 760didn't explicitly route to the STDIN, STDOUT or STDERR of a child 761I<program> will be automatically closed. 762 763Always explicitly and immediately call close() on the writable end of any 764pipe, unless that process is actually writing to it. Even if you don't 765explicitly call close(), Perl will still close() all filehandles during 766global destruction. As previously discussed, if those filehandles have 767been opened with Safe Pipe Open, this will result in calling waitpid(), 768which may again deadlock. 769 770=head2 Bidirectional Communication with Another Process 771 772While this works reasonably well for unidirectional communication, what 773about bidirectional communication? The most obvious approach doesn't work: 774 775 # THIS DOES NOT WORK!! 776 open(my $prog_for_reading_and_writing, "| some program |") 777 778If you forget to C<use warnings>, you'll miss out entirely on the 779helpful diagnostic message: 780 781 Can't do bidirectional pipe at -e line 1. 782 783If you really want to, you can use the standard open2() from the 784L<IPC::Open2> module to catch both ends. There's also an open3() in 785L<IPC::Open3> for tridirectional I/O so you can also catch your child's 786STDERR, but doing so would then require an awkward select() loop and 787wouldn't allow you to use normal Perl input operations. 788 789If you look at its source, you'll see that open2() uses low-level 790primitives like the pipe() and exec() syscalls to create all the 791connections. Although it might have been more efficient by using 792socketpair(), this would have been even less portable than it already 793is. The open2() and open3() functions are unlikely to work anywhere 794except on a Unix system, or at least one purporting POSIX compliance. 795 796=for TODO 797Hold on, is this even true? First it says that socketpair() is avoided 798for portability, but then it says it probably won't work except on 799Unixy systems anyway. Which one of those is true? 800 801Here's an example of using open2(): 802 803 use IPC::Open2; 804 my $pid = open2(my $reader, my $writer, "cat -un"); 805 print $writer "stuff\n"; 806 my $got = <$reader>; 807 waitpid $pid, 0; 808 809The problem with this is that buffering is really going to ruin your 810day. Even though your C<$writer> filehandle is auto-flushed so the process 811on the other end gets your data in a timely manner, you can't usually do 812anything to force that process to give its data to you in a similarly quick 813fashion. In this special case, we could actually so, because we gave 814I<cat> a B<-u> flag to make it unbuffered. But very few commands are 815designed to operate over pipes, so this seldom works unless you yourself 816wrote the program on the other end of the double-ended pipe. 817 818A solution to this is to use a library which uses pseudottys to make your 819program behave more reasonably. This way you don't have to have control 820over the source code of the program you're using. The C<Expect> module 821from CPAN also addresses this kind of thing. This module requires two 822other modules from CPAN, C<IO::Pty> and C<IO::Stty>. It sets up a pseudo 823terminal to interact with programs that insist on talking to the terminal 824device driver. If your system is supported, this may be your best bet. 825 826=head2 Bidirectional Communication with Yourself 827 828If you want, you may make low-level pipe() and fork() syscalls to stitch 829this together by hand. This example only talks to itself, but you could 830reopen the appropriate handles to STDIN and STDOUT and call other processes. 831(The following example lacks proper error checking.) 832 833 #!/usr/bin/perl 834 # pipe1 - bidirectional communication using two pipe pairs 835 # designed for the socketpair-challenged 836 use v5.36; 837 use IO::Handle; # enable autoflush method before Perl 5.14 838 pipe(my $parent_rdr, my $child_wtr); # XXX: check failure? 839 pipe(my $child_rdr, my $parent_wtr); # XXX: check failure? 840 $child_wtr->autoflush(1); 841 $parent_wtr->autoflush(1); 842 843 if ($pid = fork()) { 844 close $parent_rdr; 845 close $parent_wtr; 846 print $child_wtr "Parent Pid $$ is sending this\n"; 847 chomp(my $line = <$child_rdr>); 848 print "Parent Pid $$ just read this: '$line'\n"; 849 close $child_rdr; close $child_wtr; 850 waitpid($pid, 0); 851 } else { 852 die "cannot fork: $!" unless defined $pid; 853 close $child_rdr; 854 close $child_wtr; 855 chomp(my $line = <$parent_rdr>); 856 print "Child Pid $$ just read this: '$line'\n"; 857 print $parent_wtr "Child Pid $$ is sending this\n"; 858 close $parent_rdr; 859 close $parent_wtr; 860 exit(0); 861 } 862 863But you don't actually have to make two pipe calls. If you 864have the socketpair() system call, it will do this all for you. 865 866 #!/usr/bin/perl 867 # pipe2 - bidirectional communication using socketpair 868 # "the best ones always go both ways" 869 870 use v5.36; 871 use Socket; 872 use IO::Handle; # enable autoflush method before Perl 5.14 873 874 # We say AF_UNIX because although *_LOCAL is the 875 # POSIX 1003.1g form of the constant, many machines 876 # still don't have it. 877 socketpair(my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC) 878 || die "socketpair: $!"; 879 880 $child->autoflush(1); 881 $parent->autoflush(1); 882 883 if ($pid = fork()) { 884 close $parent; 885 print $child "Parent Pid $$ is sending this\n"; 886 chomp(my $line = <$child>); 887 print "Parent Pid $$ just read this: '$line'\n"; 888 close $child; 889 waitpid($pid, 0); 890 } else { 891 die "cannot fork: $!" unless defined $pid; 892 close $child; 893 chomp(my $line = <$parent>); 894 print "Child Pid $$ just read this: '$line'\n"; 895 print $parent "Child Pid $$ is sending this\n"; 896 close $parent; 897 exit(0); 898 } 899 900=head1 Sockets: Client/Server Communication 901 902While not entirely limited to Unix-derived operating systems (e.g., WinSock 903on PCs provides socket support, as do some VMS libraries), you might not have 904sockets on your system, in which case this section probably isn't going to 905do you much good. With sockets, you can do both virtual circuits like TCP 906streams and datagrams like UDP packets. You may be able to do even more 907depending on your system. 908 909The Perl functions for dealing with sockets have the same names as 910the corresponding system calls in C, but their arguments tend to differ 911for two reasons. First, Perl filehandles work differently than C file 912descriptors. Second, Perl already knows the length of its strings, so you 913don't need to pass that information. 914 915One of the major problems with ancient, antemillennial socket code in Perl 916was that it used hard-coded values for some of the constants, which 917severely hurt portability. If you ever see code that does anything like 918explicitly setting C<$AF_INET = 2>, you know you're in for big trouble. 919An immeasurably superior approach is to use the L<Socket> module, which more 920reliably grants access to the various constants and functions you'll need. 921 922If you're not writing a server/client for an existing protocol like 923NNTP or SMTP, you should give some thought to how your server will 924know when the client has finished talking, and vice-versa. Most 925protocols are based on one-line messages and responses (so one party 926knows the other has finished when a "\n" is received) or multi-line 927messages and responses that end with a period on an empty line 928("\n.\n" terminates a message/response). 929 930=head2 Internet Line Terminators 931 932The Internet line terminator is "\015\012". Under ASCII variants of 933Unix, that could usually be written as "\r\n", but under other systems, 934"\r\n" might at times be "\015\015\012", "\012\012\015", or something 935completely different. The standards specify writing "\015\012" to be 936conformant (be strict in what you provide), but they also recommend 937accepting a lone "\012" on input (be lenient in what you require). 938We haven't always been very good about that in the code in this manpage, 939but unless you're on a Mac from way back in its pre-Unix dark ages, you'll 940probably be ok. 941 942=head2 Internet TCP Clients and Servers 943 944Use Internet-domain sockets when you want to do client-server 945communication that might extend to machines outside of your own system. 946 947Here's a sample TCP client using Internet-domain sockets: 948 949 #!/usr/bin/perl 950 use v5.36; 951 use Socket; 952 953 my $remote = shift || "localhost"; 954 my $port = shift || 2345; # random port 955 if ($port =~ /\D/) { $port = getservbyname($port, "tcp") } 956 die "No port" unless $port; 957 my $iaddr = inet_aton($remote) || die "no host: $remote"; 958 my $paddr = sockaddr_in($port, $iaddr); 959 960 my $proto = getprotobyname("tcp"); 961 socket(my $sock, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 962 connect($sock, $paddr) || die "connect: $!"; 963 while (my $line = <$sock>) { 964 print $line; 965 } 966 967 close ($sock) || die "close: $!"; 968 exit(0); 969 970And here's a corresponding server to go along with it. We'll 971leave the address as C<INADDR_ANY> so that the kernel can choose 972the appropriate interface on multihomed hosts. If you want sit 973on a particular interface (like the external side of a gateway 974or firewall machine), fill this in with your real address instead. 975 976 #!/usr/bin/perl -T 977 use v5.36; 978 BEGIN { $ENV{PATH} = "/usr/bin:/bin" } 979 use Socket; 980 use Carp; 981 my $EOL = "\015\012"; 982 983 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } 984 985 my $port = shift || 2345; 986 die "invalid port" unless $port =~ /^ \d+ $/x; 987 988 my $proto = getprotobyname("tcp"); 989 990 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 991 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) 992 || die "setsockopt: $!"; 993 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; 994 listen($server, SOMAXCONN) || die "listen: $!"; 995 996 logmsg "server started on port $port"; 997 998 for (my $paddr; $paddr = accept(my $client, $server); close $client) { 999 my($port, $iaddr) = sockaddr_in($paddr); 1000 my $name = gethostbyaddr($iaddr, AF_INET); 1001 1002 logmsg "connection from $name [", 1003 inet_ntoa($iaddr), "] 1004 at port $port"; 1005 1006 print $client "Hello there, $name, it's now ", 1007 scalar localtime(), $EOL; 1008 } 1009 1010And here's a multitasking version. It's multitasked in that 1011like most typical servers, it spawns (fork()s) a child server to 1012handle the client request so that the master server can quickly 1013go back to service a new client. 1014 1015 #!/usr/bin/perl -T 1016 use v5.36; 1017 BEGIN { $ENV{PATH} = "/usr/bin:/bin" } 1018 use Socket; 1019 use Carp; 1020 my $EOL = "\015\012"; 1021 1022 sub spawn; # forward declaration 1023 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } 1024 1025 my $port = shift || 2345; 1026 die "invalid port" unless $port =~ /^ \d+ $/x; 1027 1028 my $proto = getprotobyname("tcp"); 1029 1030 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!"; 1031 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) 1032 || die "setsockopt: $!"; 1033 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!"; 1034 listen($server, SOMAXCONN) || die "listen: $!"; 1035 1036 logmsg "server started on port $port"; 1037 1038 my $waitedpid = 0; 1039 1040 use POSIX ":sys_wait_h"; 1041 use Errno; 1042 1043 sub REAPER { 1044 local $!; # don't let waitpid() overwrite current error 1045 while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) { 1046 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : ""); 1047 } 1048 $SIG{CHLD} = \&REAPER; # loathe SysV 1049 } 1050 1051 $SIG{CHLD} = \&REAPER; 1052 1053 while (1) { 1054 my $paddr = accept(my $client, $server) || do { 1055 # try again if accept() returned because got a signal 1056 next if $!{EINTR}; 1057 die "accept: $!"; 1058 }; 1059 my ($port, $iaddr) = sockaddr_in($paddr); 1060 my $name = gethostbyaddr($iaddr, AF_INET); 1061 1062 logmsg "connection from $name [", 1063 inet_ntoa($iaddr), 1064 "] at port $port"; 1065 1066 spawn $client, sub { 1067 $| = 1; 1068 print "Hello there, $name, it's now ", 1069 scalar localtime(), 1070 $EOL; 1071 exec "/usr/games/fortune" # XXX: "wrong" line terminators 1072 or confess "can't exec fortune: $!"; 1073 }; 1074 close $client; 1075 } 1076 1077 sub spawn { 1078 my $client = shift; 1079 my $coderef = shift; 1080 1081 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") { 1082 confess "usage: spawn CLIENT CODEREF"; 1083 } 1084 1085 my $pid; 1086 unless (defined($pid = fork())) { 1087 logmsg "cannot fork: $!"; 1088 return; 1089 } 1090 elsif ($pid) { 1091 logmsg "begat $pid"; 1092 return; # I'm the parent 1093 } 1094 # else I'm the child -- go spawn 1095 1096 open(STDIN, "<&", $client) || die "can't dup client to stdin"; 1097 open(STDOUT, ">&", $client) || die "can't dup client to stdout"; 1098 ## open(STDERR, ">&", STDOUT) || die "can't dup stdout to stderr"; 1099 exit($coderef->()); 1100 } 1101 1102This server takes the trouble to clone off a child version via fork() 1103for each incoming request. That way it can handle many requests at 1104once, which you might not always want. Even if you don't fork(), the 1105listen() will allow that many pending connections. Forking servers 1106have to be particularly careful about cleaning up their dead children 1107(called "zombies" in Unix parlance), because otherwise you'll quickly 1108fill up your process table. The REAPER subroutine is used here to 1109call waitpid() for any child processes that have finished, thereby 1110ensuring that they terminate cleanly and don't join the ranks of the 1111living dead. 1112 1113Within the while loop we call accept() and check to see if it returns 1114a false value. This would normally indicate a system error needs 1115to be reported. However, the introduction of safe signals (see 1116L</Deferred Signals (Safe Signals)> above) in Perl 5.8.0 means that 1117accept() might also be interrupted when the process receives a signal. 1118This typically happens when one of the forked subprocesses exits and 1119notifies the parent process with a CHLD signal. 1120 1121If accept() is interrupted by a signal, $! will be set to EINTR. 1122If this happens, we can safely continue to the next iteration of 1123the loop and another call to accept(). It is important that your 1124signal handling code not modify the value of $!, or else this test 1125will likely fail. In the REAPER subroutine we create a local version 1126of $! before calling waitpid(). When waitpid() sets $! to ECHILD as 1127it inevitably does when it has no more children waiting, it 1128updates the local copy and leaves the original unchanged. 1129 1130You should use the B<-T> flag to enable taint checking (see L<perlsec>) 1131even if we aren't running setuid or setgid. This is always a good idea 1132for servers or any program run on behalf of someone else (like CGI 1133scripts), because it lessens the chances that people from the outside will 1134be able to compromise your system. 1135Note that perl can be built without taint support. There are two 1136different modes: in one, B<-T> will silently do nothing. In the other 1137mode B<-T> results in a fatal error. 1138 1139Let's look at another TCP client. This one connects to the TCP "time" 1140service on a number of different machines and shows how far their clocks 1141differ from the system on which it's being run: 1142 1143 #!/usr/bin/perl 1144 use v5.36; 1145 use Socket; 1146 1147 my $SECS_OF_70_YEARS = 2208988800; 1148 sub ctime { scalar localtime(shift() || time()) } 1149 1150 my $iaddr = gethostbyname("localhost"); 1151 my $proto = getprotobyname("tcp"); 1152 my $port = getservbyname("time", "tcp"); 1153 my $paddr = sockaddr_in(0, $iaddr); 1154 1155 $| = 1; 1156 printf "%-24s %8s %s\n", "localhost", 0, ctime(); 1157 1158 foreach my $host (@ARGV) { 1159 printf "%-24s ", $host; 1160 my $hisiaddr = inet_aton($host) || die "unknown host"; 1161 my $hispaddr = sockaddr_in($port, $hisiaddr); 1162 socket(my $socket, PF_INET, SOCK_STREAM, $proto) 1163 || die "socket: $!"; 1164 connect($socket, $hispaddr) || die "connect: $!"; 1165 my $rtime = pack("C4", ()); 1166 read($socket, $rtime, 4); 1167 close($socket); 1168 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS; 1169 printf "%8d %s\n", $histime - time(), ctime($histime); 1170 } 1171 1172=head2 Unix-Domain TCP Clients and Servers 1173 1174That's fine for Internet-domain clients and servers, but what about local 1175communications? While you can use the same setup, sometimes you don't 1176want to. Unix-domain sockets are local to the current host, and are often 1177used internally to implement pipes. Unlike Internet domain sockets, Unix 1178domain sockets can show up in the file system with an ls(1) listing. 1179 1180 % ls -l /dev/log 1181 srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log 1182 1183You can test for these with Perl's B<-S> file test: 1184 1185 unless (-S "/dev/log") { 1186 die "something's wicked with the log system"; 1187 } 1188 1189Here's a sample Unix-domain client: 1190 1191 #!/usr/bin/perl 1192 use v5.36; 1193 use Socket; 1194 1195 my $rendezvous = shift || "catsock"; 1196 socket(my $sock, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!"; 1197 connect($sock, sockaddr_un($rendezvous)) || die "connect: $!"; 1198 while (defined(my $line = <$sock>)) { 1199 print $line; 1200 } 1201 exit(0); 1202 1203And here's a corresponding server. You don't have to worry about silly 1204network terminators here because Unix domain sockets are guaranteed 1205to be on the localhost, and thus everything works right. 1206 1207 #!/usr/bin/perl -T 1208 use v5.36; 1209 use Socket; 1210 use Carp; 1211 1212 BEGIN { $ENV{PATH} = "/usr/bin:/bin" } 1213 sub spawn; # forward declaration 1214 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" } 1215 1216 my $NAME = "catsock"; 1217 my $uaddr = sockaddr_un($NAME); 1218 my $proto = getprotobyname("tcp"); 1219 1220 socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!"; 1221 unlink($NAME); 1222 bind ($server, $uaddr) || die "bind: $!"; 1223 listen($server, SOMAXCONN) || die "listen: $!"; 1224 1225 logmsg "server started on $NAME"; 1226 1227 my $waitedpid; 1228 1229 use POSIX ":sys_wait_h"; 1230 sub REAPER { 1231 my $child; 1232 while (($waitedpid = waitpid(-1, WNOHANG)) > 0) { 1233 logmsg "reaped $waitedpid" . ($? ? " with exit $?" : ""); 1234 } 1235 $SIG{CHLD} = \&REAPER; # loathe SysV 1236 } 1237 1238 $SIG{CHLD} = \&REAPER; 1239 1240 1241 for ( $waitedpid = 0; 1242 accept(my $client, $server) || $waitedpid; 1243 $waitedpid = 0, close $client) 1244 { 1245 next if $waitedpid; 1246 logmsg "connection on $NAME"; 1247 spawn $client, sub { 1248 print "Hello there, it's now ", scalar localtime(), "\n"; 1249 exec("/usr/games/fortune") || die "can't exec fortune: $!"; 1250 }; 1251 } 1252 1253 sub spawn { 1254 my $client = shift(); 1255 my $coderef = shift(); 1256 1257 unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") { 1258 confess "usage: spawn CLIENT CODEREF"; 1259 } 1260 1261 my $pid; 1262 unless (defined($pid = fork())) { 1263 logmsg "cannot fork: $!"; 1264 return; 1265 } 1266 elsif ($pid) { 1267 logmsg "begat $pid"; 1268 return; # I'm the parent 1269 } 1270 else { 1271 # I'm the child -- go spawn 1272 } 1273 1274 open(STDIN, "<&", $client) 1275 || die "can't dup client to stdin"; 1276 open(STDOUT, ">&", $client) 1277 || die "can't dup client to stdout"; 1278 ## open(STDERR, ">&", STDOUT) 1279 ## || die "can't dup stdout to stderr"; 1280 exit($coderef->()); 1281 } 1282 1283As you see, it's remarkably similar to the Internet domain TCP server, so 1284much so, in fact, that we've omitted several duplicate functions--spawn(), 1285logmsg(), ctime(), and REAPER()--which are the same as in the other server. 1286 1287So why would you ever want to use a Unix domain socket instead of a 1288simpler named pipe? Because a named pipe doesn't give you sessions. You 1289can't tell one process's data from another's. With socket programming, 1290you get a separate session for each client; that's why accept() takes two 1291arguments. 1292 1293For example, let's say that you have a long-running database server daemon 1294that you want folks to be able to access from the Web, but only 1295if they go through a CGI interface. You'd have a small, simple CGI 1296program that does whatever checks and logging you feel like, and then acts 1297as a Unix-domain client and connects to your private server. 1298 1299=head1 TCP Clients with IO::Socket 1300 1301For those preferring a higher-level interface to socket programming, the 1302IO::Socket module provides an object-oriented approach. If for some reason 1303you lack this module, you can just fetch IO::Socket from CPAN, where you'll also 1304find modules providing easy interfaces to the following systems: DNS, FTP, 1305Ident (RFC 931), NIS and NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, 1306Telnet, and Time--to name just a few. 1307 1308=head2 A Simple Client 1309 1310Here's a client that creates a TCP connection to the "daytime" 1311service at port 13 of the host name "localhost" and prints out everything 1312that the server there cares to provide. 1313 1314 #!/usr/bin/perl 1315 use v5.36; 1316 use IO::Socket; 1317 my $remote = IO::Socket::INET->new( 1318 Proto => "tcp", 1319 PeerAddr => "localhost", 1320 PeerPort => "daytime(13)", 1321 ) 1322 || die "can't connect to daytime service on localhost"; 1323 while (<$remote>) { print } 1324 1325When you run this program, you should get something back that 1326looks like this: 1327 1328 Wed May 14 08:40:46 MDT 1997 1329 1330Here are what those parameters to the new() constructor mean: 1331 1332=over 4 1333 1334=item C<Proto> 1335 1336This is which protocol to use. In this case, the socket handle returned 1337will be connected to a TCP socket, because we want a stream-oriented 1338connection, that is, one that acts pretty much like a plain old file. 1339Not all sockets are this of this type. For example, the UDP protocol 1340can be used to make a datagram socket, used for message-passing. 1341 1342=item C<PeerAddr> 1343 1344This is the name or Internet address of the remote host the server is 1345running on. We could have specified a longer name like C<"www.perl.com">, 1346or an address like C<"207.171.7.72">. For demonstration purposes, we've 1347used the special hostname C<"localhost">, which should always mean the 1348current machine you're running on. The corresponding Internet address 1349for localhost is C<"127.0.0.1">, if you'd rather use that. 1350 1351=item C<PeerPort> 1352 1353This is the service name or port number we'd like to connect to. 1354We could have gotten away with using just C<"daytime"> on systems with a 1355well-configured system services file,[FOOTNOTE: The system services file 1356is found in I</etc/services> under Unixy systems.] but here we've specified the 1357port number (13) in parentheses. Using just the number would have also 1358worked, but numeric literals make careful programmers nervous. 1359 1360=back 1361 1362=head2 A Webget Client 1363 1364Here's a simple client that takes a remote host to fetch a document 1365from, and then a list of files to get from that host. This is a 1366more interesting client than the previous one because it first sends 1367something to the server before fetching the server's response. 1368 1369 #!/usr/bin/perl 1370 use v5.36; 1371 use IO::Socket; 1372 unless (@ARGV > 1) { die "usage: $0 host url ..." } 1373 my $host = shift(@ARGV); 1374 my $EOL = "\015\012"; 1375 my $BLANK = $EOL x 2; 1376 for my $document (@ARGV) { 1377 my $remote = IO::Socket::INET->new( Proto => "tcp", 1378 PeerAddr => $host, 1379 PeerPort => "http(80)", 1380 ) || die "cannot connect to httpd on $host"; 1381 $remote->autoflush(1); 1382 print $remote "GET $document HTTP/1.0" . $BLANK; 1383 while ( <$remote> ) { print } 1384 close $remote; 1385 } 1386 1387The web server handling the HTTP service is assumed to be at 1388its standard port, number 80. If the server you're trying to 1389connect to is at a different port, like 1080 or 8080, you should specify it 1390as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush> 1391method is used on the socket because otherwise the system would buffer 1392up the output we sent it. (If you're on a prehistoric Mac, you'll also 1393need to change every C<"\n"> in your code that sends data over the network 1394to be a C<"\015\012"> instead.) 1395 1396Connecting to the server is only the first part of the process: once you 1397have the connection, you have to use the server's language. Each server 1398on the network has its own little command language that it expects as 1399input. The string that we send to the server starting with "GET" is in 1400HTTP syntax. In this case, we simply request each specified document. 1401Yes, we really are making a new connection for each document, even though 1402it's the same host. That's the way you always used to have to speak HTTP. 1403Recent versions of web browsers may request that the remote server leave 1404the connection open a little while, but the server doesn't have to honor 1405such a request. 1406 1407Here's an example of running that program, which we'll call I<webget>: 1408 1409 % webget www.perl.com /guanaco.html 1410 HTTP/1.1 404 File Not Found 1411 Date: Thu, 08 May 1997 18:02:32 GMT 1412 Server: Apache/1.2b6 1413 Connection: close 1414 Content-type: text/html 1415 1416 <HEAD><TITLE>404 File Not Found</TITLE></HEAD> 1417 <BODY><H1>File Not Found</H1> 1418 The requested URL /guanaco.html was not found on this server.<P> 1419 </BODY> 1420 1421Ok, so that's not very interesting, because it didn't find that 1422particular document. But a long response wouldn't have fit on this page. 1423 1424For a more featureful version of this program, you should look to 1425the I<lwp-request> program included with the LWP modules from CPAN. 1426 1427=head2 Interactive Client with IO::Socket 1428 1429Well, that's all fine if you want to send one command and get one answer, 1430but what about setting up something fully interactive, somewhat like 1431the way I<telnet> works? That way you can type a line, get the answer, 1432type a line, get the answer, etc. 1433 1434This client is more complicated than the two we've done so far, but if 1435you're on a system that supports the powerful C<fork> call, the solution 1436isn't that rough. Once you've made the connection to whatever service 1437you'd like to chat with, call C<fork> to clone your process. Each of 1438these two identical process has a very simple job to do: the parent 1439copies everything from the socket to standard output, while the child 1440simultaneously copies everything from standard input to the socket. 1441To accomplish the same thing using just one process would be I<much> 1442harder, because it's easier to code two processes to do one thing than it 1443is to code one process to do two things. (This keep-it-simple principle 1444a cornerstones of the Unix philosophy, and good software engineering as 1445well, which is probably why it's spread to other systems.) 1446 1447Here's the code: 1448 1449 #!/usr/bin/perl 1450 use v5.36; 1451 use IO::Socket; 1452 1453 unless (@ARGV == 2) { die "usage: $0 host port" } 1454 my ($host, $port) = @ARGV; 1455 1456 # create a tcp connection to the specified host and port 1457 my $handle = IO::Socket::INET->new(Proto => "tcp", 1458 PeerAddr => $host, 1459 PeerPort => $port) 1460 || die "can't connect to port $port on $host: $!"; 1461 1462 $handle->autoflush(1); # so output gets there right away 1463 print STDERR "[Connected to $host:$port]\n"; 1464 1465 # split the program into two processes, identical twins 1466 die "can't fork: $!" unless defined(my $kidpid = fork()); 1467 1468 # the if{} block runs only in the parent process 1469 if ($kidpid) { 1470 # copy the socket to standard output 1471 while (defined (my $line = <$handle>)) { 1472 print STDOUT $line; 1473 } 1474 kill("TERM", $kidpid); # send SIGTERM to child 1475 } 1476 # the else{} block runs only in the child process 1477 else { 1478 # copy standard input to the socket 1479 while (defined (my $line = <STDIN>)) { 1480 print $handle $line; 1481 } 1482 exit(0); # just in case 1483 } 1484 1485The C<kill> function in the parent's C<if> block is there to send a 1486signal to our child process, currently running in the C<else> block, 1487as soon as the remote server has closed its end of the connection. 1488 1489If the remote server sends data a byte at time, and you need that 1490data immediately without waiting for a newline (which might not happen), 1491you may wish to replace the C<while> loop in the parent with the 1492following: 1493 1494 my $byte; 1495 while (sysread($handle, $byte, 1) == 1) { 1496 print STDOUT $byte; 1497 } 1498 1499Making a system call for each byte you want to read is not very efficient 1500(to put it mildly) but is the simplest to explain and works reasonably 1501well. 1502 1503=head1 TCP Servers with IO::Socket 1504 1505As always, setting up a server is little bit more involved than running a client. 1506The model is that the server creates a special kind of socket that 1507does nothing but listen on a particular port for incoming connections. 1508It does this by calling the C<< IO::Socket::INET->new() >> method with 1509slightly different arguments than the client did. 1510 1511=over 4 1512 1513=item Proto 1514 1515This is which protocol to use. Like our clients, we'll 1516still specify C<"tcp"> here. 1517 1518=item LocalPort 1519 1520We specify a local 1521port in the C<LocalPort> argument, which we didn't do for the client. 1522This is service name or port number for which you want to be the 1523server. (Under Unix, ports under 1024 are restricted to the 1524superuser.) In our sample, we'll use port 9000, but you can use 1525any port that's not currently in use on your system. If you try 1526to use one already in used, you'll get an "Address already in use" 1527message. Under Unix, the C<netstat -a> command will show 1528which services current have servers. 1529 1530=item Listen 1531 1532The C<Listen> parameter is set to the maximum number of 1533pending connections we can accept until we turn away incoming clients. 1534Think of it as a call-waiting queue for your telephone. 1535The low-level Socket module has a special symbol for the system maximum, which 1536is SOMAXCONN. 1537 1538=item Reuse 1539 1540The C<Reuse> parameter is needed so that we restart our server 1541manually without waiting a few minutes to allow system buffers to 1542clear out. 1543 1544=back 1545 1546Once the generic server socket has been created using the parameters 1547listed above, the server then waits for a new client to connect 1548to it. The server blocks in the C<accept> method, which eventually accepts a 1549bidirectional connection from the remote client. (Make sure to autoflush 1550this handle to circumvent buffering.) 1551 1552To add to user-friendliness, our server prompts the user for commands. 1553Most servers don't do this. Because of the prompt without a newline, 1554you'll have to use the C<sysread> variant of the interactive client above. 1555 1556This server accepts one of five different commands, sending output back to 1557the client. Unlike most network servers, this one handles only one 1558incoming client at a time. Multitasking servers are covered in 1559Chapter 16 of the Camel. 1560 1561Here's the code. 1562 1563 #!/usr/bin/perl 1564 use v5.36; 1565 use IO::Socket; 1566 use Net::hostent; # for OOish version of gethostbyaddr 1567 1568 my $PORT = 9000; # pick something not in use 1569 1570 my $server = IO::Socket::INET->new( Proto => "tcp", 1571 LocalPort => $PORT, 1572 Listen => SOMAXCONN, 1573 Reuse => 1); 1574 1575 die "can't setup server" unless $server; 1576 print "[Server $0 accepting clients]\n"; 1577 1578 while (my $client = $server->accept()) { 1579 $client->autoflush(1); 1580 print $client "Welcome to $0; type help for command list.\n"; 1581 my $hostinfo = gethostbyaddr($client->peeraddr); 1582 printf "[Connect from %s]\n", 1583 $hostinfo ? $hostinfo->name : $client->peerhost; 1584 print $client "Command? "; 1585 while ( <$client>) { 1586 next unless /\S/; # blank line 1587 if (/quit|exit/i) { last } 1588 elsif (/date|time/i) { printf $client "%s\n", scalar localtime() } 1589 elsif (/who/i ) { print $client `who 2>&1` } 1590 elsif (/cookie/i ) { print $client `/usr/games/fortune 2>&1` } 1591 elsif (/motd/i ) { print $client `cat /etc/motd 2>&1` } 1592 else { 1593 print $client "Commands: quit date who cookie motd\n"; 1594 } 1595 } continue { 1596 print $client "Command? "; 1597 } 1598 close $client; 1599 } 1600 1601=head1 UDP: Message Passing 1602 1603Another kind of client-server setup is one that uses not connections, but 1604messages. UDP communications involve much lower overhead but also provide 1605less reliability, as there are no promises that messages will arrive at 1606all, let alone in order and unmangled. Still, UDP offers some advantages 1607over TCP, including being able to "broadcast" or "multicast" to a whole 1608bunch of destination hosts at once (usually on your local subnet). If you 1609find yourself overly concerned about reliability and start building checks 1610into your message system, then you probably should use just TCP to start 1611with. 1612 1613UDP datagrams are I<not> a bytestream and should not be treated as such. 1614This makes using I/O mechanisms with internal buffering like stdio (i.e. 1615print() and friends) especially cumbersome. Use syswrite(), or better 1616send(), like in the example below. 1617 1618Here's a UDP program similar to the sample Internet TCP client given 1619earlier. However, instead of checking one host at a time, the UDP version 1620will check many of them asynchronously by simulating a multicast and then 1621using select() to do a timed-out wait for I/O. To do something similar 1622with TCP, you'd have to use a different socket handle for each host. 1623 1624 #!/usr/bin/perl 1625 use v5.36; 1626 use Socket; 1627 use Sys::Hostname; 1628 1629 my $SECS_OF_70_YEARS = 2_208_988_800; 1630 1631 my $iaddr = gethostbyname(hostname()); 1632 my $proto = getprotobyname("udp"); 1633 my $port = getservbyname("time", "udp"); 1634 my $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick 1635 1636 socket(my $socket, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!"; 1637 bind($socket, $paddr) || die "bind: $!"; 1638 1639 $| = 1; 1640 printf "%-12s %8s %s\n", "localhost", 0, scalar localtime(); 1641 my $count = 0; 1642 for my $host (@ARGV) { 1643 $count++; 1644 my $hisiaddr = inet_aton($host) || die "unknown host"; 1645 my $hispaddr = sockaddr_in($port, $hisiaddr); 1646 defined(send($socket, 0, 0, $hispaddr)) || die "send $host: $!"; 1647 } 1648 1649 my $rout = my $rin = ""; 1650 vec($rin, fileno($socket), 1) = 1; 1651 1652 # timeout after 10.0 seconds 1653 while ($count && select($rout = $rin, undef, undef, 10.0)) { 1654 my $rtime = ""; 1655 my $hispaddr = recv($socket, $rtime, 4, 0) || die "recv: $!"; 1656 my ($port, $hisiaddr) = sockaddr_in($hispaddr); 1657 my $host = gethostbyaddr($hisiaddr, AF_INET); 1658 my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS; 1659 printf "%-12s ", $host; 1660 printf "%8d %s\n", $histime - time(), scalar localtime($histime); 1661 $count--; 1662 } 1663 1664This example does not include any retries and may consequently fail to 1665contact a reachable host. The most prominent reason for this is congestion 1666of the queues on the sending host if the number of hosts to contact is 1667sufficiently large. 1668 1669=head1 SysV IPC 1670 1671While System V IPC isn't so widely used as sockets, it still has some 1672interesting uses. However, you cannot use SysV IPC or Berkeley mmap() to 1673have a variable shared amongst several processes. That's because Perl 1674would reallocate your string when you weren't wanting it to. You might 1675look into the C<IPC::Shareable> or C<threads::shared> modules for that. 1676 1677Here's a small example showing shared memory usage. 1678 1679 use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR); 1680 1681 my $size = 2000; 1682 my $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR); 1683 defined($id) || die "shmget: $!"; 1684 print "shm key $id\n"; 1685 1686 my $message = "Message #1"; 1687 shmwrite($id, $message, 0, 60) || die "shmwrite: $!"; 1688 print "wrote: '$message'\n"; 1689 shmread($id, my $buff, 0, 60) || die "shmread: $!"; 1690 print "read : '$buff'\n"; 1691 1692 # the buffer of shmread is zero-character end-padded. 1693 substr($buff, index($buff, "\0")) = ""; 1694 print "un" unless $buff eq $message; 1695 print "swell\n"; 1696 1697 print "deleting shm $id\n"; 1698 shmctl($id, IPC_RMID, 0) || die "shmctl: $!"; 1699 1700Here's an example of a semaphore: 1701 1702 use IPC::SysV qw(IPC_CREAT); 1703 1704 my $IPC_KEY = 1234; 1705 my $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT); 1706 defined($id) || die "semget: $!"; 1707 print "sem id $id\n"; 1708 1709Put this code in a separate file to be run in more than one process. 1710Call the file F<take>: 1711 1712 # create a semaphore 1713 1714 my $IPC_KEY = 1234; 1715 my $id = semget($IPC_KEY, 0, 0); 1716 defined($id) || die "semget: $!"; 1717 1718 my $semnum = 0; 1719 my $semflag = 0; 1720 1721 # "take" semaphore 1722 # wait for semaphore to be zero 1723 my $semop = 0; 1724 my $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag); 1725 1726 # Increment the semaphore count 1727 $semop = 1; 1728 my $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag); 1729 my $opstring = $opstring1 . $opstring2; 1730 1731 semop($id, $opstring) || die "semop: $!"; 1732 1733Put this code in a separate file to be run in more than one process. 1734Call this file F<give>: 1735 1736 # "give" the semaphore 1737 # run this in the original process and you will see 1738 # that the second process continues 1739 1740 my $IPC_KEY = 1234; 1741 my $id = semget($IPC_KEY, 0, 0); 1742 die unless defined($id); 1743 1744 my $semnum = 0; 1745 my $semflag = 0; 1746 1747 # Decrement the semaphore count 1748 my $semop = -1; 1749 my $opstring = pack("s!s!s!", $semnum, $semop, $semflag); 1750 1751 semop($id, $opstring) || die "semop: $!"; 1752 1753The SysV IPC code above was written long ago, and it's definitely 1754clunky looking. For a more modern look, see the IPC::SysV module. 1755 1756A small example demonstrating SysV message queues: 1757 1758 use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR); 1759 1760 my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR); 1761 defined($id) || die "msgget failed: $!"; 1762 1763 my $sent = "message"; 1764 my $type_sent = 1234; 1765 1766 msgsnd($id, pack("l! a*", $type_sent, $sent), 0) 1767 || die "msgsnd failed: $!"; 1768 1769 msgrcv($id, my $rcvd_buf, 60, 0, 0) 1770 || die "msgrcv failed: $!"; 1771 1772 my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf); 1773 1774 if ($rcvd eq $sent) { 1775 print "okay\n"; 1776 } else { 1777 print "not okay\n"; 1778 } 1779 1780 msgctl($id, IPC_RMID, 0) || die "msgctl failed: $!\n"; 1781 1782=head1 NOTES 1783 1784Most of these routines quietly but politely return C<undef> when they 1785fail instead of causing your program to die right then and there due to 1786an uncaught exception. (Actually, some of the new I<Socket> conversion 1787functions do croak() on bad arguments.) It is therefore essential to 1788check return values from these functions. Always begin your socket 1789programs this way for optimal success, and don't forget to add the B<-T> 1790taint-checking flag to the C<#!> line for servers: 1791 1792 #!/usr/bin/perl -T 1793 use v5.36; 1794 use sigtrap; 1795 use Socket; 1796 1797=head1 BUGS 1798 1799These routines all create system-specific portability problems. As noted 1800elsewhere, Perl is at the mercy of your C libraries for much of its system 1801behavior. It's probably safest to assume broken SysV semantics for 1802signals and to stick with simple TCP and UDP socket operations; e.g., don't 1803try to pass open file descriptors over a local UDP datagram socket if you 1804want your code to stand a chance of being portable. 1805 1806=head1 AUTHOR 1807 1808Tom Christiansen, with occasional vestiges of Larry Wall's original 1809version and suggestions from the Perl Porters. 1810 1811=head1 SEE ALSO 1812 1813There's a lot more to networking than this, but this should get you 1814started. 1815 1816For intrepid programmers, the indispensable textbook is I<Unix Network 1817Programming, 2nd Edition, Volume 1> by W. Richard Stevens (published by 1818Prentice-Hall). Most books on networking address the subject from the 1819perspective of a C programmer; translation to Perl is left as an exercise 1820for the reader. 1821 1822The IO::Socket(3) manpage describes the object library, and the Socket(3) 1823manpage describes the low-level interface to sockets. Besides the obvious 1824functions in L<perlfunc>, you should also check out the F<modules> file at 1825your nearest CPAN site, especially 1826L<http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>. 1827See L<perlmodlib> or best yet, the F<Perl FAQ> for a description 1828of what CPAN is and where to get it if the previous link doesn't work 1829for you. 1830 1831Section 5 of CPAN's F<modules> file is devoted to "Networking, Device 1832Control (modems), and Interprocess Communication", and contains numerous 1833unbundled modules numerous networking modules, Chat and Expect operations, 1834CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet, 1835Threads, and ToolTalk--to name just a few. 1836