=head1 NAME

perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)

=head1 DESCRIPTION

The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
IPC calls. Each is used in slightly different situations.

=head1 Signals

Perl uses a simple signal handling model: the %SIG hash contains names
or references of user-installed signal handlers. Each handler will
be called with one argument, which is the name of the signal that
triggered it. A signal may be generated intentionally from a
particular keyboard sequence like control-C or control-Z, sent to you
from another process, or triggered automatically by the kernel when
special events transpire, like a child process exiting, your process
running out of stack space, or hitting a file size limit.

For example, to trap an interrupt signal, set up a handler like this:

    sub catch_zap {
        my $signame = shift;
        $shucks++;
        die "Somebody sent me a SIG$signame";
    }
    $SIG{INT} = 'catch_zap';  # could fail in modules
    $SIG{INT} = \&catch_zap;  # best strategy

Prior to Perl 5.7.3 it was necessary to do as little as you possibly
could in your handler; notice how all we do is set a global variable
and then raise an exception. That's because on most systems,
libraries are not re-entrant; particularly, memory allocation and I/O
routines are not. That meant that doing nearly I<anything> in your
handler could in theory trigger a memory fault and subsequent core
dump - see L</Deferred Signals (Safe Signals)> below.

The names of the signals are the ones listed out by C<kill -l> on your
system, or you can retrieve them from the Config module.
Set up an
@signame list indexed by number to get the name and a %signo table
indexed by name to get the number:

    use Config;
    defined $Config{sig_name} || die "No sigs?";
    $i = 0;     # signal numbers start at zero
    foreach $name (split(' ', $Config{sig_name})) {
        $signo{$name} = $i;
        $signame[$i] = $name;
        $i++;
    }

So to check whether signal 17 and SIGALRM were the same, do just this:

    print "signal #17 = $signame[17]\n";
    if ($signo{ALRM}) {
        print "SIGALRM is $signo{ALRM}\n";
    }

You may also choose to assign the strings C<'IGNORE'> or C<'DEFAULT'> as
the handler, in which case Perl will try to discard the signal or do the
default thing.

On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
has special behavior with respect to a value of C<'IGNORE'>.
Setting C<$SIG{CHLD}> to C<'IGNORE'> on such a platform has the effect of
not creating zombie processes when the parent process fails to C<wait()>
on its child processes (i.e. child processes are automatically reaped).
Calling C<wait()> with C<$SIG{CHLD}> set to C<'IGNORE'> usually returns
C<-1> on such platforms.

Some signals can be neither trapped nor ignored, such as
the KILL and STOP (but not the TSTP) signals. One strategy for
temporarily ignoring signals is to use a local() statement, which will be
automatically restored once your block is exited. (Remember that local()
values are "inherited" by functions called from within that block.)

    sub precious {
        local $SIG{INT} = 'IGNORE';
        &more_functions;
    }
    sub more_functions {
        # interrupts still ignored, for now...
    }

Sending a signal to a negative process ID means that you send the signal
to the entire Unix process-group.
This code sends a hang-up signal to all
processes in the current process group (and sets $SIG{HUP} to IGNORE so
it doesn't kill itself):

    {
        local $SIG{HUP} = 'IGNORE';
        kill HUP => -$$;
        # snazzy writing of: kill('HUP', -$$)
    }

Another interesting signal to send is signal number zero. This doesn't
actually affect a child process, but instead checks whether it's alive
or has changed its UID.

    unless (kill 0 => $kid_pid) {
        warn "something wicked happened to $kid_pid";
    }

When directed at a process whose UID is not identical to that
of the sending process, signal number zero may fail because
you lack permission to send the signal, even though the process is alive.
You may be able to determine the cause of failure using C<%!>.

    unless (kill 0 => $pid or $!{EPERM}) {
        warn "$pid looks dead";
    }

You might also want to employ anonymous functions for simple signal
handlers:

    $SIG{INT} = sub { die "\nOutta here!\n" };

But that will be problematic for the more complicated handlers that need
to reinstall themselves. Because Perl's signal mechanism is currently
based on the signal(3) function from the C library, you may sometimes be so
unfortunate as to run on systems where that function is "broken", that
is, it behaves in the old unreliable SysV way rather than the newer, more
reasonable BSD and POSIX fashion. So you'll see defensive people writing
signal handlers like this:

    sub REAPER {
        $waitedpid = wait;
        # loathe SysV: it makes us not only reinstate
        # the handler, but place it after the wait
        $SIG{CHLD} = \&REAPER;
    }
    $SIG{CHLD} = \&REAPER;
    # now do something that forks...

or better still:

    use POSIX ":sys_wait_h";
    sub REAPER {
        my $child;
        # If a second child dies while in the signal handler caused by the
        # first death, we won't get another signal.
        # So must loop here else
        # we will leave the unreaped child as a zombie. And the next time
        # two children die we get another zombie. And so on.
        while (($child = waitpid(-1,WNOHANG)) > 0) {
            $Kid_Status{$child} = $?;
        }
        $SIG{CHLD} = \&REAPER;  # still loathe SysV
    }
    $SIG{CHLD} = \&REAPER;
    # do something that forks...

Note: qx(), system() and some modules for calling external commands do a
fork() and wait() for the result. Thus, your signal handler (REAPER in the
example) will be called. Since wait() was already called by system() or qx(),
the wait() in the signal handler will not see any more zombies and will
therefore block.

The best way to prevent this issue is to use waitpid, as in the following
example:

    use POSIX ":sys_wait_h"; # for nonblocking read

    my %children;

    $SIG{CHLD} = sub {
        # don't change $! and $? outside handler
        local ($!,$?);
        my $pid = waitpid(-1, WNOHANG);
        return if $pid == -1;
        return unless defined $children{$pid};
        delete $children{$pid};
        cleanup_child($pid, $?);
    };

    while (1) {
        my $pid = fork();
        if ($pid == 0) {
            # ...
            exit 0;
        } else {
            $children{$pid}=1;
            # ...
            system($command);
            # ...
        }
    }

Signal handling is also used for timeouts in Unix. While safely
protected within an C<eval{}> block, you set a signal handler to trap
alarm signals and then schedule to have one delivered to you in some
number of seconds. Then try your blocking operation, clearing the alarm
when it's done but not before you've exited your C<eval{}> block. If it
goes off, you'll use die() to jump out of the block, much as you might
using longjmp() or throw() in other languages.

Here's an example:

    eval {
        local $SIG{ALRM} = sub { die "alarm clock restart" };
        alarm 10;
        flock(FH, 2);   # blocking write lock
        alarm 0;
    };
    if ($@ and $@ !~ /alarm clock restart/) { die }

If the operation being timed out is system() or qx(), this technique
is liable to generate zombies. If this matters to you, you'll
need to do your own fork() and exec(), and kill the errant child process.

For more complex signal handling, you might see the standard POSIX
module. Lamentably, this is almost entirely undocumented, but
the F<t/lib/posix.t> file from the Perl source distribution has some
examples in it.

=head2 Handling the SIGHUP Signal in Daemons

A process that usually starts when the system boots and shuts down
when the system is shut down is called a daemon (Disk And Execution
MONitor). If a daemon process has a configuration file which is
modified after the process has been started, there should be a way to
tell that process to re-read its configuration file, without stopping
the process. Many daemons provide this mechanism using the C<SIGHUP>
signal handler. When you want to tell the daemon to re-read the file
you simply send it the C<SIGHUP> signal.

Not all platforms automatically reinstall their (native) signal
handlers after a signal delivery. This means that the handler works
only the first time the signal is sent. The solution to this problem
is to use C<POSIX> signal handlers if available; their behaviour
is well-defined.

The following example implements a simple daemon, which restarts
itself every time the C<SIGHUP> signal is received. The actual code is
located in the subroutine C<code()>, which simply prints some debug
info to show that it works and should be replaced with the real code.

    #!/usr/bin/perl -w

    use POSIX ();
    use FindBin ();
    use File::Basename ();
    use File::Spec::Functions;

    $|=1;

    # make the daemon cross-platform, so exec always calls the script
    # itself with the right path, no matter how the script was invoked.
    my $script = File::Basename::basename($0);
    my $SELF = catfile $FindBin::Bin, $script;

    # POSIX unmasks the sigprocmask properly
    my $sigset = POSIX::SigSet->new();
    my $action = POSIX::SigAction->new('sigHUP_handler',
                                       $sigset,
                                       &POSIX::SA_NODEFER);
    POSIX::sigaction(&POSIX::SIGHUP, $action);

    sub sigHUP_handler {
        print "got SIGHUP\n";
        exec($SELF, @ARGV) or die "Couldn't restart: $!\n";
    }

    code();

    sub code {
        print "PID: $$\n";
        print "ARGV: @ARGV\n";
        my $c = 0;
        while (++$c) {
            sleep 2;
            print "$c\n";
        }
    }
    __END__

=head1 Named Pipes

A named pipe (often referred to as a FIFO) is an old Unix IPC
mechanism for processes communicating on the same machine. It works
just like regular, connected anonymous pipes, except that the
processes rendezvous using a filename and don't have to be related.

To create a named pipe, use the C<POSIX::mkfifo()> function.

    use POSIX qw(mkfifo);
    mkfifo($path, 0700) or die "mkfifo $path failed: $!";

You can also use the Unix command mknod(1) or on some
systems, mkfifo(1). These may not be in your normal path.

    # system return val is backwards, so && not ||
    #
    $ENV{PATH} .= ":/etc:/usr/etc";
    if  (      system('mknod',  $path, 'p')
            && system('mkfifo', $path) )
    {
        die "mk{nod,fifo} $path failed";
    }

A fifo is convenient when you want to connect a process to an unrelated
one. When you open a fifo, the program will block until there's something
on the other end.
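
The blocking rendezvous described above can be seen in a small
self-contained sketch. It assumes a Unix-like system with fork() and
C<POSIX::mkfifo()>; the C<fifo_roundtrip> name and the temporary
F<demo.fifo> path are made up for illustration and are not part of any
library.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

# Hypothetical helper: push one message through a fresh fifo and
# return what the reading side received.
sub fifo_roundtrip {
    my ($message) = @_;
    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/demo.fifo";
    mkfifo($path, 0700) or die "mkfifo $path failed: $!";

    defined(my $pid = fork()) or die "can't fork: $!";
    if ($pid == 0) {                    # child: the writer
        # this open blocks until the parent opens the fifo for reading
        open(my $out, '>', $path) or die "can't write $path: $!";
        print {$out} $message;
        close($out);                    # flushes the buffered message
        POSIX::_exit(0);                # leave without running cleanup
    }

    # parent: the reader; this open blocks until the child opens for writing
    open(my $in, '<', $path) or die "can't read $path: $!";
    my $got = do { local $/; <$in> };   # slurp until the writer closes
    close($in);
    waitpid($pid, 0);
    return $got;
}
```

Neither open() returns until the other side has opened its end, which
is why the two processes need no further synchronization.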

For example, let's say you'd like to have your F<.signature> file be a
named pipe that has a Perl program on the other end. Now every time any
program (like a mailer, news reader, finger program, etc.) tries to read
from that file, the reading program will block and your program will
supply the new signature. We'll use the pipe-checking file test B<-p>
to find out whether anyone (or anything) has accidentally removed our fifo.

    chdir; # go home
    $FIFO = '.signature';

    while (1) {
        unless (-p $FIFO) {
            unlink $FIFO;
            require POSIX;
            POSIX::mkfifo($FIFO, 0700)
                or die "can't mkfifo $FIFO: $!";
        }

        # next line blocks until there's a reader
        open (FIFO, "> $FIFO") || die "can't write $FIFO: $!";
        print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
        close FIFO;
        sleep 2;    # to avoid dup signals
    }

=head2 Deferred Signals (Safe Signals)

In Perls before Perl 5.7.3, by installing Perl code to deal with
signals, you were exposing yourself to danger from two things. First,
few system library functions are re-entrant. If the signal interrupts
while Perl is executing one function (like malloc(3) or printf(3)),
and your signal handler then calls the same function again, you could
get unpredictable behavior--often, a core dump. Second, Perl isn't
itself re-entrant at the lowest levels. If the signal interrupts Perl
while Perl is changing its own internal data structures, similarly
unpredictable behaviour may result.

There were two things you could do, knowing this: be paranoid or be
pragmatic. The paranoid approach was to do as little as possible in your
signal handler. Set an existing integer variable that already has a
value, and return. This doesn't help you if you're in a slow system call,
which will just restart. That means you have to C<die> to longjmp(3) out
of the handler.
Even this is a little cavalier for the true paranoiac,
who avoids C<die> in a handler because the system I<is> out to get you.
The pragmatic approach was to say "I know the risks, but prefer the
convenience", and to do anything you wanted in your signal handler,
and be prepared to clean up core dumps now and again.

Perl 5.7.3 and later avoid these problems by "deferring" signals.
That is, when the signal is delivered to the process by
the system (to the C code that implements Perl) a flag is set, and the
handler returns immediately. Then at strategic "safe" points in the
Perl interpreter (e.g. when it is about to execute a new opcode) the
flags are checked and the Perl level handler from %SIG is
executed. The "deferred" scheme allows much more flexibility in the
coding of signal handlers, as we know the Perl interpreter is in a safe
state, and that we are not in a system library function when the
handler is called. However the implementation does differ from
previous Perls in the following ways:

=over 4

=item Long-running opcodes

As the Perl interpreter only looks at the signal flags when it is about
to execute a new opcode, a signal that arrives during a long-running
opcode (e.g. a regular expression operation on a very large string) will
not be seen until the current opcode completes.

N.B. If a signal of any given type fires multiple times during an opcode
(such as from a fine-grained timer), the handler for that signal will
only be called once after the opcode completes, and all the other
instances will be discarded. Furthermore, if your system's signal queue
gets flooded to the point that there are signals that have been raised
but not yet caught (and thus not deferred) at the time an opcode
completes, those signals may well be caught and deferred during
subsequent opcodes, with sometimes surprising results.
For example, you
may see alarms delivered even after calling C<alarm(0)> as the latter
stops the raising of alarms but does not cancel the delivery of alarms
raised but not yet caught. Do not depend on the behaviors described in
this paragraph as they are side effects of the current implementation and
may change in future versions of Perl.

=item Interrupting IO

When a signal is delivered (e.g. INT control-C) the operating system
breaks into IO operations like C<read> (used to implement Perl's
E<lt>E<gt> operator). On older Perls the handler was called
immediately (and as C<read> is not "unsafe" this worked well). With
the "deferred" scheme the handler is not called immediately, and if
Perl is using the system's C<stdio> library that library may re-start the
C<read> without returning to Perl and giving it a chance to call the
%SIG handler. If this happens on your system the solution is to use
the C<:perlio> layer to do IO - at least on those handles which you want
to be able to break into with signals. (The C<:perlio> layer checks
the signal flags and calls %SIG handlers before resuming IO operation.)

Note that the default in Perl 5.7.3 and later is to automatically use
the C<:perlio> layer.

Note that some networking library functions like gethostbyname() are
known to have their own implementations of timeouts which may conflict
with your timeouts. If you are having problems with such functions,
you can try using the POSIX sigaction() function, which bypasses the
Perl safe signals (note that this means subjecting yourself to
possible memory corruption, as described above).
Instead of setting
C<$SIG{ALRM}>:

    local $SIG{ALRM} = sub { die "alarm" };

try something like the following:

    use POSIX qw(SIGALRM);
    POSIX::sigaction(SIGALRM,
                     POSIX::SigAction->new(sub { die "alarm" }))
        or die "Error setting SIGALRM handler: $!\n";

Another way to disable the safe signal behavior locally is to use
the C<Perl::Unsafe::Signals> module from CPAN (which will affect
all signals).

=item Restartable system calls

On systems that supported it, older versions of Perl used the
SA_RESTART flag when installing %SIG handlers. This meant that
restartable system calls would continue rather than returning when
a signal arrived. In order to deliver deferred signals promptly,
Perl 5.7.3 and later do I<not> use SA_RESTART. Consequently,
restartable system calls can fail (with $! set to C<EINTR>) in places
where they previously would have succeeded.

Note that the default C<:perlio> layer will retry C<read>, C<write>
and C<close> as described above and that interrupted C<wait> and
C<waitpid> calls will always be retried.

=item Signals as "faults"

Certain signals, e.g. SEGV, ILL, and BUS, are generated as a result of
virtual memory or other "faults". These are normally fatal and there is
little a Perl-level handler can do with them, so Perl now delivers them
immediately rather than attempting to defer them.

=item Signals triggered by operating system state

On some operating systems certain signal handlers are supposed to "do
something" before returning. One example can be CHLD or CLD which
indicates a child process has completed. On some operating systems the
signal handler is expected to C<wait> for the completed child
process. On such systems the deferred signal scheme will not work for
those signals (it does not do the C<wait>).
Again the failure will
look like a loop as the operating system will re-issue the signal as
there are un-waited-for completed child processes.

=back

If you want the old signal behaviour back regardless of possible
memory corruption, set the environment variable C<PERL_SIGNALS> to
C<"unsafe"> (a new feature since Perl 5.8.1).

=head1 Using open() for IPC

Perl's basic open() statement can also be used for unidirectional
interprocess communication by either appending or prepending a pipe
symbol to the second argument to open(). Here's how to start
something up in a child process you intend to write to:

    open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
        || die "can't fork: $!";
    local $SIG{PIPE} = sub { die "spooler pipe broke" };
    print SPOOLER "stuff\n";
    close SPOOLER || die "bad spool: $! $?";

And here's how to start up a child process you intend to read from:

    open(STATUS, "netstat -an 2>&1 |")
        || die "can't fork: $!";
    while (<STATUS>) {
        next if /^(tcp|udp)/;
        print;
    }
    close STATUS || die "bad netstat: $! $?";

If one can be sure that a particular program is a Perl script that is
expecting filenames in @ARGV, the clever programmer can write something
like this:

    % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

and irrespective of which shell it's called from, the Perl program will
read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
file. Pretty nifty, eh?
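
On the receiving side, such a program needs nothing special to honor
these arguments, because the C<E<lt>E<gt>> operator opens each element
of @ARGV with the two-argument form of open(). Here is a minimal sketch;
the C<count_lines> name is hypothetical, and it assumes a Unix shell is
available to run the piped command.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines supplied by a list of
# @ARGV-style sources, which may be file names or "command |" strings.
# With an empty list, <> would fall back to reading STDIN.
sub count_lines {
    local @ARGV = @_;           # <> walks @ARGV, opening each element
    my $count = 0;
    while (my $line = <>) {
        $count++;
    }
    return $count;
}
```

Because any argument that ends in a pipe symbol is run as a command,
this convenience is only appropriate when the arguments come from a
source you trust.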

You might notice that you could use backticks for much the
same effect as opening a pipe for reading:

    print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
    die "bad netstat" if $?;

While this is true on the surface, it's much more efficient to process the
file one line or record at a time because then you don't have to read the
whole thing into memory at once. It also gives you finer control of the
whole process, letting you kill off the child process early if you'd
like.

Be careful to check both the open() and the close() return values. If
you're I<writing> to a pipe, you should also trap SIGPIPE. Otherwise,
think of what happens when you start up a pipe to a command that doesn't
exist: the open() will in all likelihood succeed (it only reflects the
fork()'s success), but then your output will fail--spectacularly. Perl
can't know whether the command worked because your command is actually
running in a separate process whose exec() might have failed. Therefore,
while readers of bogus commands return just a quick end of file, writers
to bogus commands will trigger a signal they'd better be prepared to
handle. Consider:

    open(FH, "|bogus")  or die "can't fork: $!";
    print FH "bang\n"   or die "can't write: $!";
    close FH            or die "can't close: $!";

That won't blow up until the close, and it will blow up with a SIGPIPE.
To catch it, you could use this:

    $SIG{PIPE} = 'IGNORE';
    open(FH, "|bogus")  or die "can't fork: $!";
    print FH "bang\n"   or die "can't write: $!";
    close FH            or die "can't close: status=$?";

=head2 Filehandles

Both the main process and any child processes it forks share the same
STDIN, STDOUT, and STDERR filehandles. If both processes try to access
them at once, strange things can happen. You may also want to close
or reopen the filehandles for the child.
You can get around this by
opening your pipe with open(), but on some systems this means that the
child process cannot outlive the parent.

=head2 Background Processes

You can run a command in the background with:

    system("cmd &");

The command's STDOUT and STDERR (and possibly STDIN, depending on your
shell) will be the same as the parent's. You won't need to catch
SIGCHLD because of the double-fork taking place (see below for more
details).

=head2 Complete Dissociation of Child from Parent

In some cases (starting server processes, for instance) you'll want to
completely dissociate the child process from the parent. This is
often called daemonization. A well behaved daemon will also chdir()
to the root directory (so it doesn't prevent unmounting the filesystem
containing the directory from which it was launched) and redirect its
standard file descriptors from and to F</dev/null> (so that random
output doesn't wind up on the user's terminal).

    use POSIX 'setsid';

    sub daemonize {
        chdir '/'               or die "Can't chdir to /: $!";
        open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
        open STDOUT, '>/dev/null'
                                or die "Can't write to /dev/null: $!";
        defined(my $pid = fork) or die "Can't fork: $!";
        exit if $pid;
        die "Can't start a new session: $!" if setsid == -1;
        open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
    }

The fork() has to come before the setsid() to ensure that you aren't a
process group leader (the setsid() will fail if you are). If your
system doesn't have the setsid() function, open F</dev/tty> and use the
C<TIOCNOTTY> ioctl() on it instead. See tty(4) for details.

Non-Unix users should check their Your_OS::Process module for other
solutions.
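
The double fork mentioned under L</Background Processes> can also be
written by hand when you want a helper that the parent never has to
reap. What follows is only a sketch, assuming a Unix-like fork(); the
C<run_detached> name is made up for illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run $code in a grandchild process. The
# intermediate child exits at once and is reaped here, so the
# grandchild is orphaned and the caller never needs a CHLD handler.
sub run_detached {
    my ($code) = @_;
    defined(my $pid = fork()) or die "can't fork: $!";
    if ($pid == 0) {                        # intermediate child
        defined(my $grandkid = fork()) or POSIX::_exit(1);
        POSIX::_exit(0) if $grandkid;       # orphan the grandchild
        $code->();                          # grandchild does the work
        POSIX::_exit(0);
    }
    waitpid($pid, 0);           # reaps only the short-lived child
    return $? == 0;             # true if the second fork succeeded
}
```

Unlike daemonize() above, this sketch does not change directory, start
a new session, or redirect the standard filehandles; combine the two if
you need full dissociation.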

=head2 Safe Pipe Opens

Another interesting approach to IPC is making your single program go
multiprocess and communicate between (or even amongst) yourselves. The
open() function will accept a file argument of either C<"-|"> or C<"|-">
to do a very interesting thing: it forks a child connected to the
filehandle you've opened. The child is running the same program as the
parent. This is useful for safely opening a file when running under an
assumed UID or GID, for example. If you open a pipe I<to> minus, you can
write to the filehandle you opened and your kid will find it in his
STDIN. If you open a pipe I<from> minus, you can read from the filehandle
you opened whatever your kid writes to his STDOUT.

    use English '-no_match_vars';
    my $sleep_count = 0;

    do {
        $pid = open(KID_TO_WRITE, "|-");
        unless (defined $pid) {
            warn "cannot fork: $!";
            die "bailing out" if $sleep_count++ > 6;
            sleep 10;
        }
    } until defined $pid;

    if ($pid) {  # parent
        print KID_TO_WRITE @some_data;
        close(KID_TO_WRITE) || warn "kid exited $?";
    } else {     # child
        ($EUID, $EGID) = ($UID, $GID); # suid progs only
        open (FILE, "> /safe/file")
            || die "can't open /safe/file: $!";
        while (<STDIN>) {
            print FILE; # child's STDIN is parent's KID_TO_WRITE
        }
        exit;  # don't forget this
    }

Another common use for this construct is when you need to execute
something without the shell's interference. With system(), it's
straightforward, but you can't use a pipe open or backticks safely.
That's because there's no way to stop the shell from getting its hands on
your arguments. Instead, use lower-level control to call exec() directly.

Here's a safe backtick or pipe open for read:

    # add error processing as above
    $pid = open(KID_TO_READ, "-|");

    if ($pid) {   # parent
        while (<KID_TO_READ>) {
            # do something interesting
        }
        close(KID_TO_READ) || warn "kid exited $?";

    } else {      # child
        ($EUID, $EGID) = ($UID, $GID); # suid only
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

And here's a safe pipe open for writing:

    # add error processing as above
    $pid = open(KID_TO_WRITE, "|-");
    $SIG{PIPE} = sub { die "whoops, $program pipe broke" };

    if ($pid) {  # parent
        for (@data) {
            print KID_TO_WRITE;
        }
        close(KID_TO_WRITE) || warn "kid exited $?";

    } else {     # child
        ($EUID, $EGID) = ($UID, $GID);
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

It is very easy to dead-lock a process using this form of open(), or
indeed any use of pipe() and multiple sub-processes. The above
example is 'safe' because it is simple and calls exec(). See
L</"Avoiding Pipe Deadlocks"> for general safety principles, but there
are extra gotchas with Safe Pipe Opens.

In particular, if you opened the pipe using C<open FH, "|-">, then you
cannot simply use close() in the parent process to close an unwanted
writer. Consider this code:

    $pid = open WRITER, "|-";
    defined $pid or die "fork failed; $!";
    if ($pid) {
        if (my $sub_pid = fork()) {
            close WRITER;
            # do something else...
        }
        else {
            # write to WRITER...
            exit;
        }
    }
    else {
        # do something with STDIN...
        exit;
    }

In the above, the true parent does not want to write to the WRITER
filehandle, so it closes it.
However, because WRITER was opened using
C<open FH, "|-">, it has a special behaviour: closing it will call
waitpid() (see L<perlfunc/waitpid>), which waits for the sub-process
to exit. If the child process ends up waiting for something happening
in the section marked "do something else", then you have a deadlock.

This can also be a problem with intermediate sub-processes in more
complicated code, which will call waitpid() on all open filehandles
during global destruction, in no predictable order.

To solve this, you must manually use pipe(), fork(), and the form of
open() which sets one file descriptor to another, as below:

    pipe(READER, WRITER);
    $pid = fork();
    defined $pid or die "fork failed; $!";
    if ($pid) {
        close READER;
        if (my $sub_pid = fork()) {
            close WRITER;
        }
        else {
            # write to WRITER...
            exit;
        }
        # do something else...
    }
    else {
        open STDIN, "<&READER";
        close WRITER;
        # do something...
        exit;
    }

Since Perl 5.8.0, you can also use the list form of C<open> for pipes:
the syntax

    open KID_PS, "-|", "ps", "aux" or die $!;

forks the ps(1) command (without spawning a shell, as there are more than
three arguments to open()), and reads its standard output via the
C<KID_PS> filehandle. The corresponding syntax to write to command
pipes (with C<"|-"> in place of C<"-|">) is also implemented.

Note that these operations are full Unix forks, which means they may not be
correctly implemented on alien systems. Additionally, these are not true
multithreading. If you'd like to learn more about threading, see the
F<modules> file mentioned below in the SEE ALSO section.
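
The list form of open() described above makes a convenient shell-free
replacement for backticks. The following is a sketch rather than a
definitive implementation; the C<capture> name is hypothetical, and it
assumes Perl 5.8.0 or later and an C<echo> command on your path for the
usage shown afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list and return its
# output lines. With a program name plus at least one argument, open()
# receives more than three arguments, so no shell sees them.
sub capture {
    my @cmd = @_;
    open(my $kid, '-|', @cmd) or die "can't run $cmd[0]: $!";
    my @lines = <$kid>;
    close($kid) or warn "kid exited $?";
    return @lines;
}
```

Calling C<capture('echo', '$HOME; ls')> hands the second string to
echo(1) untouched, so the dollar sign and semicolon come back literally
instead of being interpreted by a shell.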

=head2 Avoiding Pipe Deadlocks

In general, if you have more than one sub-process, you need to be very
careful that any process which does not need the writer half of any
pipe you create for inter-process communication does not have it open.

The reason for this is that any child process which is reading from
the pipe and expecting an EOF will never receive it, and therefore
never exit. A single process closing a pipe is not enough to close it;
the last process with the pipe open must close it for it to read EOF.

Certain built-in Unix features help prevent this most of
the time. For instance, filehandles have a 'close on exec' flag (set
I<en masse> with Perl using the C<$^F> variable, see L<perlvar>), so that any
filehandles which you didn't explicitly route to the STDIN, STDOUT or
STDERR of a child I<program> will automatically be closed for you.

So, always explicitly and immediately call close() on the writable end
of any pipe, unless that process is actually writing to it. If you
don't explicitly call close() then be warned Perl will still close()
all the filehandles during global destruction. As warned above, if
those filehandles were opened with Safe Pipe Open, they will also call
waitpid() and you might again deadlock.

=head2 Bidirectional Communication with Another Process

While a pipe open works reasonably well for unidirectional communication, what
about bidirectional communication? The obvious thing you'd like to do
doesn't actually work:

    open(PROG_FOR_READING_AND_WRITING, "| some program |")

and if you forget to use the C<use warnings> pragma or the B<-w> flag,
then you'll miss out entirely on the diagnostic message:

    Can't do bidirectional pipe at -e line 1.

If you really want to, you can use the standard open2() library function
to catch both ends.
There's also an open3() for tridirectional I/O so you
can also catch your child's STDERR, but doing so would then require an
awkward select() loop and wouldn't allow you to use normal Perl input
operations.

If you look at its source, you'll see that open2() uses low-level
primitives like Unix pipe() and exec() calls to create all the connections.
While it might have been slightly more efficient by using socketpair(), it
would have then been even less portable than it already is. The open2()
and open3() functions are unlikely to work anywhere except on a Unix
system or some other one purporting to be POSIX compliant.

Here's an example of using open2():

    use FileHandle;
    use IPC::Open2;
    $pid = open2(*Reader, *Writer, "cat -u -n" );
    print Writer "stuff\n";
    $got = <Reader>;

The problem with this is that Unix buffering is really going to
ruin your day. Even though your C<Writer> filehandle is auto-flushed,
and the process on the other end will get your data in a timely manner,
you can't usually do anything to force it to give it back to you
in a similarly quick fashion. In this case, we could, because we
gave I<cat> a B<-u> flag to make it unbuffered. But very few Unix
commands are designed to operate over pipes, so this seldom works
unless you yourself wrote the program on the other end of the
double-ended pipe.

A solution to this is the nonstandard F<Comm.pl> library. It uses
pseudo-ttys to make your program behave more reasonably:

    require 'Comm.pl';
    $ph = open_proc('cat -n');
    for (1..10) {
        print $ph "a line\n";
        print "got back ", scalar <$ph>;
    }

This way you don't have to have control over the source code of the
program you're using. The F<Comm> library also has expect()
and interact() functions.
Find the library (and we hope its
successor F<IPC::Chat>) at your nearest CPAN archive as detailed
in the SEE ALSO section below.

The newer Expect.pm module from CPAN also addresses this kind of thing.
This module requires two other modules from CPAN: IO::Pty and IO::Stty.
It sets up a pseudo-terminal to interact with programs that insist on
talking to the terminal device driver. If your system is
amongst those supported, this may be your best bet.

=head2 Bidirectional Communication with Yourself

If you want, you may make low-level pipe() and fork() calls
to stitch this together by hand. This example only
talks to itself, but you could reopen the appropriate
handles to STDIN and STDOUT and call other processes.

    #!/usr/bin/perl -w
    # pipe1 - bidirectional communication using two pipe pairs
    #         designed for the socketpair-challenged
    use IO::Handle;     # thousands of lines just for autoflush :-(
    pipe(PARENT_RDR, CHILD_WTR)  || die "pipe: $!";
    pipe(CHILD_RDR,  PARENT_WTR) || die "pipe: $!";
    CHILD_WTR->autoflush(1);
    PARENT_WTR->autoflush(1);

    if ($pid = fork) {
        # parent: talk on CHILD_WTR, listen on CHILD_RDR
        close PARENT_RDR; close PARENT_WTR;
        print CHILD_WTR "Parent Pid $$ is sending this\n";
        chomp($line = <CHILD_RDR>);
        print "Parent Pid $$ just read this: `$line'\n";
        close CHILD_RDR; close CHILD_WTR;
        waitpid($pid,0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        # child: listen on PARENT_RDR, talk on PARENT_WTR
        close CHILD_RDR; close CHILD_WTR;
        chomp($line = <PARENT_RDR>);
        print "Child Pid $$ just read this: `$line'\n";
        print PARENT_WTR "Child Pid $$ is sending this\n";
        close PARENT_RDR; close PARENT_WTR;
        exit;
    }

But you don't actually have to make two pipe calls. If you
have the socketpair() system call, it will do this all for you.

    #!/usr/bin/perl -w
    # pipe2 - bidirectional communication using socketpair
    #   "the best ones always go both ways"

    use Socket;
    use IO::Handle;     # thousands of lines just for autoflush :-(
    # We say AF_UNIX because although *_LOCAL is the
    # POSIX 1003.1g form of the constant, many machines
    # still don't have it.
    socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
                                or  die "socketpair: $!";

    CHILD->autoflush(1);
    PARENT->autoflush(1);

    if ($pid = fork) {
        close PARENT;
        print CHILD "Parent Pid $$ is sending this\n";
        chomp($line = <CHILD>);
        print "Parent Pid $$ just read this: `$line'\n";
        close CHILD;
        waitpid($pid,0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close CHILD;
        chomp($line = <PARENT>);
        print "Child Pid $$ just read this: `$line'\n";
        print PARENT "Child Pid $$ is sending this\n";
        close PARENT;
        exit;
    }

=head1 Sockets: Client/Server Communication

Although sockets are not limited to Unix-derived operating systems (e.g.,
WinSock on PCs provides socket support, as do some VMS libraries), you may
not have them on your system, in which case this section probably isn't
going to do you much good. With sockets, you can do both virtual circuits
(i.e., TCP streams) and datagrams (i.e., UDP packets). You may be able to
do even more depending on your system.

The Perl function calls for dealing with sockets have the same names as
the corresponding system calls in C, but their arguments tend to differ
for two reasons: first, Perl filehandles work differently than C file
descriptors. Second, Perl already knows the length of its strings, so you
don't need to pass that information.

One of the major problems with old socket code in Perl was that it used
hard-coded values for some of the constants, which severely hurt
portability.
If you ever see code that does anything like explicitly
setting C<$AF_INET = 2>, you know you're in for big trouble. An
immeasurably superior approach is to use the C<Socket> module, which more
reliably grants access to various constants and functions you'll need.

If you're not writing a server/client for an existing protocol like
NNTP or SMTP, you should give some thought to how your server will
know when the client has finished talking, and vice-versa. Most
protocols are based on one-line messages and responses (so one party
knows the other has finished when a "\n" is received) or multi-line
messages and responses that end with a period on a line by itself
("\n.\n" terminates a message/response).

=head2 Internet Line Terminators

The Internet line terminator is "\015\012". Under ASCII variants of
Unix, that could usually be written as "\r\n", but under other systems,
"\r\n" might at times be "\015\015\012", "\012\012\015", or something
completely different. The standards specify writing "\015\012" to be
conformant (be strict in what you provide), but they also recommend
accepting a lone "\012" on input (be lenient in what you require).
We haven't always been very good about that in the code in this manpage,
but unless you're on a classic (pre-OS X) Mac, you'll probably be ok.

=head2 Internet TCP Clients and Servers

Use Internet-domain sockets when you want to do client-server
communication that might extend to machines outside of your own system.
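
As a minimal sketch of the line-terminator advice above, applied to the
kind of line-oriented traffic these sockets usually carry, a program might
wrap its reads and writes like this. The helper names net_chomp() and
net_line() are hypothetical, made up for this illustration; they are not
part of the C<Socket> module or of Perl itself.

    use strict;
    use warnings;

    my $CRLF = "\015\012";      # the Internet line terminator

    # Hypothetical helper: be lenient on input by removing either a
    # full "\015\012" or a lone "\012" from the end of a line.
    sub net_chomp {
        my ($line) = @_;
        $line =~ s/\015?\012\z//;
        return $line;
    }

    # Hypothetical helper: be strict on output by always terminating
    # a line with "\015\012".
    sub net_line {
        my ($text) = @_;
        return $text . $CRLF;
    }

With helpers like these, a program would print C<net_line($text)> to the
socket and pass each line it reads through net_chomp() before examining it.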

Here's a sample TCP client using Internet-domain sockets:

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    my ($remote, $port, $iaddr, $paddr, $proto, $line);

    $remote  = shift || 'localhost';
    $port    = shift || 2345;  # random port
    if ($port =~ /\D/) { $port = getservbyname($port, 'tcp') }
    die "No port" unless $port;
    $iaddr   = inet_aton($remote)               || die "no host: $remote";
    $paddr   = sockaddr_in($port, $iaddr);

    $proto   = getprotobyname('tcp');
    socket(SOCK, PF_INET, SOCK_STREAM, $proto)  || die "socket: $!";
    connect(SOCK, $paddr)                       || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }

    close (SOCK)                                || die "close: $!";
    exit;

And here's a corresponding server to go along with it. We'll
leave the address as INADDR_ANY so that the kernel can choose
the appropriate interface on multihomed hosts. If you want to sit
on a particular interface (like the external side of a gateway
or firewall machine), you should fill this in with your real address
instead.

    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $port = shift || 2345;
    my $proto = getprotobyname('tcp');

    ($port) = $port =~ /^(\d+)$/                        or die "invalid port";

    socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
                                        pack("l", 1))   || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
    listen(Server,SOMAXCONN)                            || die "listen: $!";

    logmsg "server started on port $port";

    my $paddr;

    # this server never forks, so it needs no CHLD handler
    for ( ; $paddr = accept(Client,Server); close Client) {
        my($port,$iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr,AF_INET);

        logmsg "connection from $name [",
                inet_ntoa($iaddr), "] at port $port";

        print Client "Hello there, $name, it's now ",
                        scalar localtime, $EOL;
    }

And here's a multitasking version. It's multitasked in that,
like most typical servers, it spawns (forks) a child server to
handle the client request so that the parent server can quickly
go back to service a new client.

    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    use Socket;
    use Carp;
    my $EOL = "\015\012";

    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $port = shift || 2345;
    my $proto = getprotobyname('tcp');

    ($port) = $port =~ /^(\d+)$/                        or die "invalid port";

    socket(Server, PF_INET, SOCK_STREAM, $proto)        || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR,
                                        pack("l", 1))   || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY))        || die "bind: $!";
    listen(Server,SOMAXCONN)                            || die "listen: $!";

    logmsg "server started on port $port";

    my $paddr;

    use POSIX ":sys_wait_h";
    use Errno;

    sub REAPER {
        local $!;   # don't let waitpid() overwrite current error
        while ((my $pid = waitpid(-1,WNOHANG)) > 0 && WIFEXITED($?)) {
            logmsg "reaped $pid" . ($? ? " with exit $?" : '');
        }
        $SIG{CHLD} = \&REAPER;  # loathe SysV
    }

    $SIG{CHLD} = \&REAPER;

    while(1) {
        $paddr = accept(Client, Server) || do {
            # try again if accept() returned because a signal was received
            next if $!{EINTR};
            die "accept: $!";
        };
        my ($port, $iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr, AF_INET);

        logmsg "connection from $name [",
               inet_ntoa($iaddr),
               "] at port $port";

        spawn sub {
            $|=1;
            print "Hello there, $name, it's now ", scalar localtime, $EOL;
            exec '/usr/games/fortune'       # XXX: `wrong' line terminators
                or confess "can't exec fortune: $!";
        };
        close Client;
    }

    sub spawn {
        my $coderef = shift;

        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        if (!
            defined($pid = fork)) {
            logmsg "cannot fork: $!";
            return;
        }
        elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        # else I'm the child -- go spawn

        open(STDIN,  "<&Client")   || die "can't dup client to stdin";
        open(STDOUT, ">&Client")   || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit &$coderef();
    }

This server takes the trouble to clone off a child version via fork()
for each incoming request. That way it can handle many requests at
once, which you might not always want. Even if you don't fork(), the
listen() call will still let up to SOMAXCONN connections queue up.
Forking servers
have to be particularly careful about cleaning up their dead children
(called "zombies" in Unix parlance), because otherwise you'll quickly
fill up your process table. The REAPER subroutine is used here to
call waitpid() for any child processes that have finished, thereby
ensuring that they terminate cleanly and don't join the ranks of the
living dead.

Within the while loop we call accept() and check to see if it returns
a false value. This would normally indicate a system error that needs
to be reported. However the introduction of safe signals (see
L</Deferred Signals (Safe Signals)> above) in Perl 5.7.3 means that
accept() may also be interrupted when the process receives a signal.
This typically happens when one of the forked sub-processes exits and
notifies the parent process with a CHLD signal.

If accept() is interrupted by a signal then $! will be set to EINTR.
If this happens then we can safely continue to the next iteration of
the loop and another call to accept(). It is important that your
signal handling code doesn't modify the value of $! or this test will
most likely fail. In the REAPER subroutine we create a local version
of $! before calling waitpid().
When waitpid() sets $! to ECHILD (as
it inevitably does when it has no more children waiting), it will
update the local copy leaving the original unchanged.

We suggest that you use the B<-T> flag to enable taint checking (see
L<perlsec>) even if you aren't running setuid or setgid. This is always a
good idea for servers and other programs run on behalf of someone else
(like CGI scripts), because it lessens the chances that people from the
outside will be able to compromise your system.

Let's look at another TCP client. This one connects to the TCP "time"
service on a number of different machines and shows how far their clocks
differ from the system on which it's being run:

    #!/usr/bin/perl -w
    use strict;
    use Socket;

    my $SECS_of_70_YEARS = 2208988800;
    sub ctime { scalar localtime(shift) }

    my $iaddr = gethostbyname('localhost');
    my $proto = getprotobyname('tcp');
    my $port = getservbyname('time', 'tcp');
    my $paddr = sockaddr_in(0, $iaddr);
    my($host);

    $| = 1;
    printf "%-24s %8s %s\n", "localhost", 0, ctime(time());

    foreach $host (@ARGV) {
        printf "%-24s ", $host;
        my $hisiaddr = inet_aton($host)     || die "unknown host";
        my $hispaddr = sockaddr_in($port, $hisiaddr);
        socket(SOCKET, PF_INET, SOCK_STREAM, $proto)   || die "socket: $!";
        connect(SOCKET, $hispaddr)          || die "connect: $!";
        my $rtime = '    ';
        read(SOCKET, $rtime, 4);
        close(SOCKET);
        my $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
        printf "%8d %s\n", $histime - time, ctime($histime);
    }

=head2 Unix-Domain TCP Clients and Servers

That's fine for Internet-domain clients and servers, but what about local
communications? While you can use the same setup, sometimes you don't
want to. Unix-domain sockets are local to the current host, and are often
used internally to implement pipes.
Unlike Internet domain sockets, Unix
domain sockets can show up in the file system with an ls(1) listing.

    % ls -l /dev/log
    srw-rw-rw-  1 root            0 Oct 31 07:23 /dev/log

You can test for these with Perl's B<-S> file test:

    unless ( -S '/dev/log' ) {
        die "something's wicked with the log system";
    }

Here's a sample Unix-domain client:

    #!/usr/bin/perl -w
    use Socket;
    use strict;
    my ($rendezvous, $line);

    $rendezvous = shift || 'catsock';
    socket(SOCK, PF_UNIX, SOCK_STREAM, 0)       || die "socket: $!";
    connect(SOCK, sockaddr_un($rendezvous))     || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }
    exit;

And here's a corresponding server. You don't have to worry about silly
network terminators here because Unix domain sockets are guaranteed
to be on the localhost, and thus everything works right.

    #!/usr/bin/perl -Tw
    use strict;
    use Socket;
    use Carp;

    BEGIN { $ENV{PATH} = '/usr/ucb:/bin' }
    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime, "\n" }

    my $NAME = 'catsock';
    my $uaddr = sockaddr_un($NAME);
    my $proto = getprotobyname('tcp');

    socket(Server,PF_UNIX,SOCK_STREAM,0)        || die "socket: $!";
    unlink($NAME);
    bind  (Server, $uaddr)                      || die "bind: $!";
    listen(Server,SOMAXCONN)                    || die "listen: $!";

    logmsg "server started on $NAME";

    my $waitedpid;

    use POSIX ":sys_wait_h";
    sub REAPER {
        my $child;
        while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
            logmsg "reaped $waitedpid" . ($? ? " with exit $?"
                                          : '');
        }
        $SIG{CHLD} = \&REAPER;  # loathe SysV
    }

    $SIG{CHLD} = \&REAPER;

    for ( $waitedpid = 0;
          accept(Client,Server) || $waitedpid;
          $waitedpid = 0, close Client)
    {
        next if $waitedpid;
        logmsg "connection on $NAME";
        spawn sub {
            print "Hello there, it's now ", scalar localtime, "\n";
            exec '/usr/games/fortune' or die "can't exec fortune: $!";
        };
    }

    sub spawn {
        my $coderef = shift;

        unless (@_ == 0 && $coderef && ref($coderef) eq 'CODE') {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        if (!defined($pid = fork)) {
            logmsg "cannot fork: $!";
            return;
        } elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        # else I'm the child -- go spawn

        open(STDIN,  "<&Client")   || die "can't dup client to stdin";
        open(STDOUT, ">&Client")   || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit &$coderef();
    }

As you see, it's remarkably similar to the Internet domain TCP server, so
much so, in fact, that its spawn(), logmsg(), and REAPER() functions are
nearly the same as in the other server.

So why would you ever want to use a Unix domain socket instead of a
simpler named pipe? Because a named pipe doesn't give you sessions. You
can't tell one process's data from another's. With socket programming,
you get a separate session for each client: that's why accept() takes two
arguments.

For example, let's say that you have a long running database server daemon
that you want folks from the World Wide Web to be able to access, but only
if they go through a CGI interface.
You'd have a small, simple CGI
program that does whatever checks and logging you feel like, and then acts
as a Unix-domain client and connects to your private server.

=head1 TCP Clients with IO::Socket

For those preferring a higher-level interface to socket programming, the
IO::Socket module provides an object-oriented approach. IO::Socket is
included as part of the standard Perl distribution as of the 5.004
release. If you're running an earlier version of Perl, just fetch
IO::Socket from CPAN, where you'll also find modules providing easy
interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and
NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--just
to name a few.

=head2 A Simple Client

Here's a client that creates a TCP connection to the "daytime"
service at port 13 of the host named "localhost" and prints out everything
that the server there cares to provide.

    #!/usr/bin/perl -w
    use IO::Socket;
    $remote = IO::Socket::INET->new(
                        Proto    => "tcp",
                        PeerAddr => "localhost",
                        PeerPort => "daytime(13)",
                    )
                  or die "cannot connect to daytime port at localhost";
    while ( <$remote> ) { print }

When you run this program, you should get something back that
looks like this:

    Wed May 14 08:40:46 MDT 1997

Here's what those parameters to the C<new> constructor mean:

=over 4

=item C<Proto>

This is which protocol to use. In this case, the socket handle returned
will be connected to a TCP socket, because we want a stream-oriented
connection, that is, one that acts pretty much like a plain old file.
Not all sockets are of this type. For example, the UDP protocol
can be used to make a datagram socket, used for message-passing.

=item C<PeerAddr>

This is the name or Internet address of the remote host the server is
running on.
We could have specified a longer name like C<"www.perl.com">,
or an address like C<"204.148.40.9">. For demonstration purposes, we've
used the special hostname C<"localhost">, which should always mean the
current machine you're running on. The corresponding Internet address
for localhost is C<"127.0.0.1">, if you'd rather use that.

=item C<PeerPort>

This is the service name or port number we'd like to connect to.
We could have gotten away with using just C<"daytime"> on systems with a
well-configured system services file (F</etc/services> under Unix),
but just in case, we've specified the
port number (13) in parentheses. Using just the number would also have
worked, but constant numbers make careful programmers nervous.

=back

Notice how the return value from the C<new> constructor is used as
a filehandle in the C<while> loop? That's what's called an indirect
filehandle, a scalar variable containing a filehandle. You can use
it the same way you would a normal filehandle. For example, you
can read one line from it this way:

    $line = <$handle>;

all remaining lines from it this way:

    @lines = <$handle>;

and send a line of data to it this way:

    print $handle "some data\n";

=head2 A Webget Client

Here's a simple client that takes a remote host to fetch a document
from, and then a list of documents to get from that host. This is a
more interesting client than the previous one because it first sends
something to the server before fetching the server's response.

    #!/usr/bin/perl -w
    use IO::Socket;
    unless (@ARGV > 1) { die "usage: $0 host document ..."
    }
    $host = shift(@ARGV);
    $EOL = "\015\012";
    $BLANK = $EOL x 2;
    foreach $document ( @ARGV ) {
        $remote = IO::Socket::INET->new( Proto     => "tcp",
                                         PeerAddr  => $host,
                                         PeerPort  => "http(80)",
                                        );
        unless ($remote) { die "cannot connect to http daemon on $host" }
        $remote->autoflush(1);
        print $remote "GET $document HTTP/1.0" . $BLANK;
        while ( <$remote> ) { print }
        close $remote;
    }

The web server handling the "http" service is assumed to be at
its standard port, number 80. If the web server you're trying to
connect to is at a different port (like 1080 or 8080), you should specify
it as the named-parameter pair, C<< PeerPort => 8080 >>. The C<autoflush>
method is used on the socket because otherwise the system would buffer
up the output we sent it. (If you're on a classic, pre-OS X Mac, you'll
also need to change every C<"\n"> in your code that sends data over the
network to be a C<"\015\012"> instead.)

Connecting to the server is only the first part of the process: once you
have the connection, you have to use the server's language. Each server
on the network has its own little command language that it expects as
input. The string that we send to the server starting with "GET" is in
HTTP syntax. In this case, we simply request each specified document.
Yes, we really are making a new connection for each document, even though
it's the same host. That's the way you always used to have to speak HTTP.
Recent versions of web browsers may request that the remote server leave
the connection open a little while, but the server doesn't have to honor
such a request.
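
To make the request syntax concrete, here is a small sketch, separate
from the client itself, of building the request text and picking the
status code out of the first line of a reply. The helper names
http10_request() and http_status() are hypothetical, invented for this
illustration; nothing here comes from IO::Socket or any other module.

    use strict;
    use warnings;

    my $EOL = "\015\012";

    # Hypothetical helper: the request line for one document, followed
    # by the blank line that tells the server the request is complete.
    sub http10_request {
        my ($document) = @_;
        return "GET $document HTTP/1.0" . $EOL x 2;
    }

    # Hypothetical helper: extract the three-digit status code from a
    # status line such as "HTTP/1.1 404 File Not Found"; returns undef
    # if the line doesn't look like an HTTP status line.
    sub http_status {
        my ($status_line) = @_;
        return $status_line =~ m{^HTTP/\d+\.\d+\s+(\d{3})\b} ? $1 : undef;
    }

The client would print the string from http10_request() to the socket and
could hand the first line it reads back to http_status() to decide whether
the rest of the reply is worth printing.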

Here's an example of running the client above, which we'll call I<webget>:

    % webget www.perl.com /guanaco.html
    HTTP/1.1 404 File Not Found
    Date: Thu, 08 May 1997 18:02:32 GMT
    Server: Apache/1.2b6
    Connection: close
    Content-type: text/html

    <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
    <BODY><H1>File Not Found</H1>
    The requested URL /guanaco.html was not found on this server.<P>
    </BODY>

Ok, so that's not very interesting, because it didn't find that
particular document. But a long response wouldn't have fit on this page.

For a more fully-featured version of this program, you should look to
the I<lwp-request> program included with the LWP modules from CPAN.

=head2 Interactive Client with IO::Socket

Well, that's all fine if you want to send one command and get one answer,
but what about setting up something fully interactive, somewhat like
the way I<telnet> works? That way you can type a line, get the answer,
type a line, get the answer, etc.

This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful C<fork> call, the solution
isn't that rough. Once you've made the connection to whatever service
you'd like to chat with, call C<fork> to clone your process. Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket.
To accomplish the same thing using just one process would be I<much>
harder, because it's easier to code two processes to do one thing than it
is to code one process to do two things. (This keep-it-simple principle
is a cornerstone of the Unix philosophy, and good software engineering as
well, which is probably why it's spread to other systems.)

Here's the code:

    #!/usr/bin/perl -w
    use strict;
    use IO::Socket;
    my ($host, $port, $kidpid, $handle, $line);

    unless (@ARGV == 2) { die "usage: $0 host port" }
    ($host, $port) = @ARGV;

    # create a tcp connection to the specified host and port
    $handle = IO::Socket::INET->new(Proto     => "tcp",
                                    PeerAddr  => $host,
                                    PeerPort  => $port)
           or die "can't connect to port $port on $host: $!";

    $handle->autoflush(1);              # so output gets there right away
    print STDERR "[Connected to $host:$port]\n";

    # split the program into two processes, identical twins
    die "can't fork: $!" unless defined($kidpid = fork());

    # the if{} block runs only in the parent process
    if ($kidpid) {
        # copy the socket to standard output
        while (defined ($line = <$handle>)) {
            print STDOUT $line;
        }
        kill("TERM", $kidpid);          # send SIGTERM to child
    }
    # the else{} block runs only in the child process
    else {
        # copy standard input to the socket
        while (defined ($line = <STDIN>)) {
            print $handle $line;
        }
    }

The C<kill> function in the parent's C<if> block is there to send a
signal to our child process (currently running in the C<else> block)
as soon as the remote server has closed its end of the connection.

If the remote server sends data a byte at a time, and you need that
data immediately without waiting for a newline (which might not happen),
you may wish to replace the C<while> loop in the parent with the
following:

    my $byte;
    while (sysread($handle, $byte, 1) == 1) {
        print STDOUT $byte;
    }

Making a system call for each byte you want to read is not very efficient
(to put it mildly) but is the simplest to explain and works reasonably
well.

=head1 TCP Servers with IO::Socket

As always, setting up a server is a little bit more involved than running
a client.
The model is that the server creates a special kind of socket that
does nothing but listen on a particular port for incoming connections.
It does this by calling the C<< IO::Socket::INET->new() >> method with
slightly different arguments than the client did.

=over 4

=item Proto

This is which protocol to use. Like our clients, we'll
still specify C<"tcp"> here.

=item LocalPort

We specify a local
port in the C<LocalPort> argument, which we didn't do for the client.
This is the service name or port number for which you want to be the
server. (Under Unix, ports under 1024 are restricted to the
superuser.) In our sample, we'll use port 9000, but you can use
any port that's not currently in use on your system. If you try
to use one already in use, you'll get an "Address already in use"
message. Under Unix, the C<netstat -a> command will show
which services currently have servers.

=item Listen

The C<Listen> parameter is set to the maximum number of
pending connections we can accept until we turn away incoming clients.
Think of it as a call-waiting queue for your telephone.
The low-level Socket module has a special symbol for the system maximum,
which is SOMAXCONN.

=item Reuse

The C<Reuse> parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to
clear out.

=back

Once the generic server socket has been created using the parameters
listed above, the server then waits for a new client to connect
to it. The server blocks in the C<accept> method, which eventually accepts
a bidirectional connection from the remote client. (Make sure to autoflush
this handle to circumvent buffering.)

To add to user-friendliness, our server prompts the user for commands.
Most servers don't do this.
Because of the prompt without a newline,
you'll have to use the C<sysread> variant of the interactive client above.

This server accepts one of five different commands, sending output
back to the client. Note that unlike most network servers, this one
only handles one incoming client at a time. Multithreaded servers are
covered in Chapter 6 of the Camel.

Here's the code.

    #!/usr/bin/perl -w
    use IO::Socket;
    use Net::hostent;              # for OO version of gethostbyaddr

    $PORT = 9000;                  # pick something not in use

    $server = IO::Socket::INET->new( Proto     => 'tcp',
                                     LocalPort => $PORT,
                                     Listen    => SOMAXCONN,
                                     Reuse     => 1);

    die "can't setup server" unless $server;
    print "[Server $0 accepting clients]\n";

    while ($client = $server->accept()) {
        $client->autoflush(1);
        print $client "Welcome to $0; type help for command list.\n";
        $hostinfo = gethostbyaddr($client->peeraddr);
        printf "[Connect from %s]\n",
               $hostinfo ? $hostinfo->name : $client->peerhost;
        print $client "Command? ";
        while ( <$client>) {
            next unless /\S/;       # blank line
            if    (/quit|exit/i)    { last; }
            elsif (/date|time/i)    { printf $client "%s\n", scalar localtime; }
            elsif (/who/i )         { print  $client `who 2>&1`; }
            elsif (/cookie/i )      { print  $client `/usr/games/fortune 2>&1`; }
            elsif (/motd/i )        { print  $client `cat /etc/motd 2>&1`; }
            else {
                print $client "Commands: quit date who cookie motd\n";
            }
        } continue {
            print $client "Command? ";
        }
        close $client;
    }

=head1 UDP: Message Passing

Another kind of client-server setup is one that uses not connections, but
messages. UDP communications involve much lower overhead but also provide
less reliability, as there are no promises that messages will arrive at
all, let alone in order and unmangled.
Still, UDP offers some advantages
over TCP, including being able to "broadcast" or "multicast" to a whole
bunch of destination hosts at once (usually on your local subnet). If you
find yourself overly concerned about reliability and start building checks
into your message system, then you probably should use just TCP to start
with.

Note that UDP datagrams are I<not> a bytestream and should not be treated
as such. This makes using I/O mechanisms with internal buffering
like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(),
or better send(), like in the example below.

Here's a UDP program similar to the sample Internet TCP client given
earlier. However, instead of checking one host at a time, the UDP version
will check many of them asynchronously by simulating a multicast and then
using select() to do a timed-out wait for I/O. To do something similar
with TCP, you'd have to use a different socket handle for each host.

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    use Sys::Hostname;

    my ( $count, $hisiaddr, $hispaddr, $histime,
         $host, $iaddr, $paddr, $port, $proto,
         $rin, $rout, $rtime, $SECS_of_70_YEARS);

    $SECS_of_70_YEARS = 2208988800;

    $iaddr = gethostbyname(hostname());
    $proto = getprotobyname('udp');
    $port = getservbyname('time', 'udp');
    $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick

    socket(SOCKET, PF_INET, SOCK_DGRAM, $proto)   || die "socket: $!";
    bind(SOCKET, $paddr)                          || die "bind: $!";

    $| = 1;
    printf "%-12s %8s %s\n", "localhost", 0, scalar localtime time;
    $count = 0;
    for $host (@ARGV) {
        $count++;
        $hisiaddr = inet_aton($host)    || die "unknown host";
        $hispaddr = sockaddr_in($port, $hisiaddr);
        defined(send(SOCKET, 0, 0, $hispaddr))    || die "send $host: $!";
    }

    $rin = '';
    vec($rin, fileno(SOCKET), 1) = 1;

    # timeout after 10.0 seconds
    while ($count && select($rout = $rin, undef, undef, 10.0)) {
        $rtime = '';
        ($hispaddr = recv(SOCKET, $rtime, 4, 0))  || die "recv: $!";
        ($port, $hisiaddr) = sockaddr_in($hispaddr);
        $host = gethostbyaddr($hisiaddr, AF_INET);
        $histime = unpack("N", $rtime) - $SECS_of_70_YEARS;
        printf "%-12s ", $host;
        printf "%8d %s\n", $histime - time, scalar localtime($histime);
        $count--;
    }

Note that this example does not include any retries and may consequently
fail to contact a reachable host. The most prominent reason for this
is congestion of the queues on the sending host if the number of
hosts to contact is sufficiently large.

=head1 SysV IPC

While System V IPC isn't so widely used as sockets, it still has some
interesting uses. You can't, however, effectively use SysV IPC or
Berkeley mmap() to have shared memory so as to share a variable amongst
several processes.
That's because Perl would reallocate your string when
you didn't want it to.

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRUSR|S_IWUSR) // die "shmget: $!";
    print "shm id $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60) || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60) || die "shmread: $!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = '';
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0) || die "shmctl: $!";

Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT ) // die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process.
Call the file F<take>:

    # look up the existing semaphore set

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0 , 0 );
    defined($id) || die "semget: $!";

    $semnum = 0;
    $semflag = 0;

    # 'take' semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    $opstring = $opstring1 . $opstring2;

    semop($id,$opstring) || die "semop: $!";

Put this code in a separate file to be run in more than one process.
Call this file F<give>:

    # 'give' the semaphore
    # run this in the original process and you will see
    # that the second process continues

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id) || die "semget: $!";

    $semnum = 0;
    $semflag = 0;

    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id,$opstring) || die "semop: $!";

The SysV IPC code above was written long ago, and it's definitely
clunky looking.  For a more modern look, see the IPC::SysV module
which is included with Perl starting from Perl 5.005.

A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);

    my $sent = "message";
    my $type_sent = 1234;
    my $rcvd;
    my $type_rcvd;

    if (defined $id) {
        if (msgsnd($id, pack("l! a*", $type_sent, $sent), 0)) {
            if (msgrcv($id, $rcvd, 60, 0, 0)) {
                # split the native long type tag from the payload
                ($type_rcvd, $rcvd) = unpack("l! a*", $rcvd);
                if ($rcvd eq $sent) {
                    print "okay\n";
                } else {
                    print "not okay\n";
                }
            } else {
                die "# msgrcv failed: $!\n";
            }
        } else {
            die "# msgsnd failed: $!\n";
        }
        msgctl($id, IPC_RMID, 0) || die "# msgctl failed: $!\n";
    } else {
        die "# msgget failed: $!\n";
    }

=head1 NOTES

Most of these routines quietly but politely return C<undef> when they
fail instead of causing your program to die right then and there due to
an uncaught exception.  (Actually, some of the new I<Socket> conversion
functions croak() on bad arguments.)  It is therefore essential to
check return values from these functions.  Always begin your socket
programs this way for optimal success, and don't forget to add the B<-T>
taint-checking flag to the #!
line for servers:

    #!/usr/bin/perl -Tw
    use strict;
    use sigtrap;
    use Socket;

=head1 BUGS

All these routines create system-specific portability problems.  As noted
elsewhere, Perl is at the mercy of your C libraries for much of its system
behaviour.  It's probably safest to assume broken SysV semantics for
signals and to stick with simple TCP and UDP socket operations; e.g., don't
try to pass open file descriptors over a local UDP datagram socket if you
want your code to stand a chance of being portable.

=head1 AUTHOR

Tom Christiansen, with occasional vestiges of Larry Wall's original
version and suggestions from the Perl Porters.

=head1 SEE ALSO

There's a lot more to networking than this, but this should get you
started.

For intrepid programmers, the indispensable textbook is I<Unix
Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
(published by Prentice-Hall).  Note that most books on networking
address the subject from the perspective of a C programmer; translation
to Perl is left as an exercise for the reader.

The IO::Socket(3) manpage describes the object library, and the Socket(3)
manpage describes the low-level interface to sockets.  Besides the obvious
functions in L<perlfunc>, you should also check out the F<modules> file
at your nearest CPAN site.  (See L<perlmodlib> or, better yet, the F<Perl
FAQ> for a description of what CPAN is and where to get it.)

Section 5 of the F<modules> file is devoted to "Networking, Device Control
(modems), and Interprocess Communication", and lists numerous unbundled
modules covering networking, Chat and Expect operations, CGI
programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
Threads, and ToolTalk--just to name a few.