xref: /openbsd/gnu/usr.bin/perl/pod/perlipc.pod (revision eac174f2)
1=head1 NAME
2
3perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)
4
5=head1 DESCRIPTION
6
7The basic IPC facilities of Perl are built out of the good old Unix
8signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
9IPC calls.  Each is used in slightly different situations.
10
11=head1 Signals
12
13Perl uses a simple signal handling model: the %SIG hash contains names
14or references of user-installed signal handlers.  These handlers will
15be called with an argument which is the name of the signal that
16triggered it.  A signal may be generated intentionally from a
17particular keyboard sequence like control-C or control-Z, sent to you
18from another process, or triggered automatically by the kernel when
19special events transpire, like a child process exiting, your own process
20running out of stack space, or hitting a process file-size limit.
21
22For example, to trap an interrupt signal, set up a handler like this:
23
24    our $shucks;
25
26    sub catch_zap {
27        my $signame = shift;
28        $shucks++;
29        die "Somebody sent me a SIG$signame";
30    }
31    $SIG{INT} = __PACKAGE__ . "::catch_zap";
32    $SIG{INT} = \&catch_zap;  # best strategy
33
34Prior to Perl 5.8.0 it was necessary to do as little as you possibly
35could in your handler; notice how all we do is set a global variable
36and then raise an exception.  That's because on most systems,
37libraries are not re-entrant; particularly, memory allocation and I/O
38routines are not.  That meant that doing nearly I<anything> in your
39handler could in theory trigger a memory fault and subsequent core
40dump - see L</Deferred Signals (Safe Signals)> below.
41
42The names of the signals are the ones listed out by C<kill -l> on your
43system, or you can retrieve them using the CPAN module L<IPC::Signal>.
44
45You may also choose to assign the strings C<"IGNORE"> or C<"DEFAULT"> as
46the handler, in which case Perl will try to discard the signal or do the
47default thing.
48
49On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
50has special behavior with respect to a value of C<"IGNORE">.
51Setting C<$SIG{CHLD}> to C<"IGNORE"> on such a platform has the effect of
52not creating zombie processes when the parent process fails to C<wait()>
53on its child processes (i.e., child processes are automatically reaped).
54Calling C<wait()> with C<$SIG{CHLD}> set to C<"IGNORE"> usually returns
55C<-1> on such platforms.
56
57Some signals can be neither trapped nor ignored, such as the KILL and STOP
58(but not the TSTP) signals. Note that ignoring signals makes them disappear.
59If you only want them blocked temporarily without them getting lost you'll
60have to use the C<POSIX> module's L<sigprocmask|POSIX/sigprocmask>.
61
62Sending a signal to a negative process ID means that you send the signal
63to the entire Unix process group.  This code sends a hang-up signal to all
64processes in the current process group, and also sets $SIG{HUP} to C<"IGNORE">
65so it doesn't kill itself:
66
67    # block scope for local
68    {
69        local $SIG{HUP} = "IGNORE";
70        kill HUP => -getpgrp();
71        # snazzy writing of: kill("HUP", -getpgrp())
72    }
73
74Another interesting signal to send is signal number zero.  This doesn't
75actually affect a child process, but instead checks whether it's alive
76or has changed its UIDs.
77
78    unless (kill 0 => $kid_pid) {
79        warn "something wicked happened to $kid_pid";
80    }
81
82Signal number zero may fail because you lack permission to send the
83signal when directed at a process whose real or saved UID is not
84identical to the real or effective UID of the sending process, even
85though the process is alive.  You may be able to determine the cause of
86failure using C<$!> or C<%!>.
87
88    unless (kill(0 => $pid) || $!{EPERM}) {
89        warn "$pid looks dead";
90    }
91
92You might also want to employ anonymous functions for simple signal
93handlers:
94
95    $SIG{INT} = sub { die "\nOutta here!\n" };
96
97SIGCHLD handlers require some special care.  If a second child dies
98while in the signal handler caused by the first death, we won't get
99another signal. So must loop here else we will leave the unreaped child
100as a zombie. And the next time two children die we get another zombie.
101And so on.
102
103    use POSIX ":sys_wait_h";
104    $SIG{CHLD} = sub {
105        while ((my $child = waitpid(-1, WNOHANG)) > 0) {
106            $Kid_Status{$child} = $?;
107        }
108    };
109    # do something that forks...
110
111Be careful: qx(), system(), and some modules for calling external commands
112do a fork(), then wait() for the result. Thus, your signal handler
113will be called. Because wait() was already called by system() or qx(),
114the wait() in the signal handler will see no more zombies and will
115therefore block.
116
117The best way to prevent this issue is to use waitpid(), as in the following
118example:
119
120    use POSIX ":sys_wait_h"; # for nonblocking read
121
122    my %children;
123
124    $SIG{CHLD} = sub {
125        # don't change $! and $? outside handler
126        local ($!, $?);
127        while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) {
128            delete $children{$pid};
129            cleanup_child($pid, $?);
130        }
131    };
132
133    while (1) {
134        my $pid = fork();
135        die "cannot fork" unless defined $pid;
136        if ($pid == 0) {
137            # ...
138            exit 0;
139        } else {
140            $children{$pid}=1;
141            # ...
142            system($command);
143            # ...
144       }
145    }
146
147Signal handling is also used for timeouts in Unix.  While safely
148protected within an C<eval{}> block, you set a signal handler to trap
149alarm signals and then schedule to have one delivered to you in some
150number of seconds.  Then try your blocking operation, clearing the alarm
151when it's done but not before you've exited your C<eval{}> block.  If it
152goes off, you'll use die() to jump out of the block.
153
154Here's an example:
155
156    my $ALARM_EXCEPTION = "alarm clock restart";
157    eval {
158        local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
159        alarm 10;
160        flock($fh, 2)    # blocking write lock
161                        || die "cannot flock: $!";
162        alarm 0;
163    };
164    if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }
165
166If the operation being timed out is system() or qx(), this technique
167is liable to generate zombies.    If this matters to you, you'll
168need to do your own fork() and exec(), and kill the errant child process.
169
170For more complex signal handling, you might see the standard POSIX
171module.  Lamentably, this is almost entirely undocumented, but the
172F<ext/POSIX/t/sigaction.t> file from the Perl source distribution has
173some examples in it.
174
175=head2 Handling the SIGHUP Signal in Daemons
176
177A process that usually starts when the system boots and shuts down
178when the system is shut down is called a daemon (Disk And Execution
179MONitor). If a daemon process has a configuration file which is
180modified after the process has been started, there should be a way to
181tell that process to reread its configuration file without stopping
182the process. Many daemons provide this mechanism using a C<SIGHUP>
183signal handler. When you want to tell the daemon to reread the file,
184simply send it the C<SIGHUP> signal.
185
186The following example implements a simple daemon, which restarts
187itself every time the C<SIGHUP> signal is received. The actual code is
188located in the subroutine C<code()>, which just prints some debugging
189info to show that it works; it should be replaced with the real code.
190
191  #!/usr/bin/perl
192
193  use v5.36;
194
195  use POSIX ();
196  use FindBin ();
197  use File::Basename ();
198  use File::Spec::Functions qw(catfile);
199
200  $| = 1;
201
202  # make the daemon cross-platform, so exec always calls the script
203  # itself with the right path, no matter how the script was invoked.
204  my $script = File::Basename::basename($0);
205  my $SELF  = catfile($FindBin::Bin, $script);
206
207  # POSIX unmasks the sigprocmask properly
208  $SIG{HUP} = sub {
209      print "got SIGHUP\n";
210      exec($SELF, @ARGV)        || die "$0: couldn't restart: $!";
211  };
212
213  code();
214
215  sub code {
216      print "PID: $$\n";
217      print "ARGV: @ARGV\n";
218      my $count = 0;
219      while (1) {
220          sleep 2;
221          print ++$count, "\n";
222      }
223  }
224
225
226=head2 Deferred Signals (Safe Signals)
227
228Before Perl 5.8.0, installing Perl code to deal with signals exposed you to
229danger from two things.  First, few system library functions are
230re-entrant.  If the signal interrupts while Perl is executing one function
231(like malloc(3) or printf(3)), and your signal handler then calls the same
232function again, you could get unpredictable behavior--often, a core dump.
233Second, Perl isn't itself re-entrant at the lowest levels.  If the signal
234interrupts Perl while Perl is changing its own internal data structures,
235similarly unpredictable behavior may result.
236
237There were two things you could do, knowing this: be paranoid or be
238pragmatic.  The paranoid approach was to do as little as possible in your
239signal handler.  Set an existing integer variable that already has a
240value, and return.  This doesn't help you if you're in a slow system call,
241which will just restart.  That means you have to C<die> to longjmp(3) out
242of the handler.  Even this is a little cavalier for the true paranoiac,
243who avoids C<die> in a handler because the system I<is> out to get you.
244The pragmatic approach was to say "I know the risks, but prefer the
245convenience", and to do anything you wanted in your signal handler,
246and be prepared to clean up core dumps now and again.
247
248Perl 5.8.0 and later avoid these problems by "deferring" signals.  That is,
249when the signal is delivered to the process by the system (to the C code
250that implements Perl) a flag is set, and the handler returns immediately.
251Then at strategic "safe" points in the Perl interpreter (e.g. when it is
252about to execute a new opcode) the flags are checked and the Perl level
253handler from %SIG is executed. The "deferred" scheme allows much more
254flexibility in the coding of signal handlers as we know the Perl
255interpreter is in a safe state, and that we are not in a system library
256function when the handler is called.  However the implementation does
257differ from previous Perls in the following ways:
258
259=over 4
260
261=item Long-running opcodes
262
263As the Perl interpreter looks at signal flags only when it is about
264to execute a new opcode, a signal that arrives during a long-running
265opcode (e.g. a regular expression operation on a very large string) will
266not be seen until the current opcode completes.
267
268If a signal of any given type fires multiple times during an opcode
269(such as from a fine-grained timer), the handler for that signal will
270be called only once, after the opcode completes; all other
271instances will be discarded.  Furthermore, if your system's signal queue
272gets flooded to the point that there are signals that have been raised
273but not yet caught (and thus not deferred) at the time an opcode
274completes, those signals may well be caught and deferred during
275subsequent opcodes, with sometimes surprising results.  For example, you
276may see alarms delivered even after calling C<alarm(0)> as the latter
277stops the raising of alarms but does not cancel the delivery of alarms
278raised but not yet caught.  Do not depend on the behaviors described in
279this paragraph as they are side effects of the current implementation and
280may change in future versions of Perl.
281
282=item Interrupting IO
283
284When a signal is delivered (e.g., SIGINT from a control-C) the operating
285system breaks into IO operations like I<read>(2), which is used to
286implement Perl's readline() function, the C<< <> >> operator. On older
287Perls the handler was called immediately (and as C<read> is not "unsafe",
288this worked well). With the "deferred" scheme the handler is I<not> called
289immediately, and if Perl is using the system's C<stdio> library that
290library may restart the C<read> without returning to Perl to give it a
291chance to call the %SIG handler. If this happens on your system the
292solution is to use the C<:perlio> layer to do IO--at least on those handles
293that you want to be able to break into with signals. (The C<:perlio> layer
294checks the signal flags and calls %SIG handlers before resuming IO
295operation.)
296
297The default in Perl 5.8.0 and later is to automatically use
298the C<:perlio> layer.
299
300Note that it is not advisable to access a file handle within a signal
301handler where that signal has interrupted an I/O operation on that same
302handle. While perl will at least try hard not to crash, there are no
303guarantees of data integrity; for example, some data might get dropped or
304written twice.
305
306Some networking library functions like gethostbyname() are known to have
307their own implementations of timeouts which may conflict with your
308timeouts.  If you have problems with such functions, try using the POSIX
309sigaction() function, which bypasses Perl safe signals.  Be warned that
310this does subject you to possible memory corruption, as described above.
311
312Instead of setting C<$SIG{ALRM}>:
313
314   local $SIG{ALRM} = sub { die "alarm" };
315
316try something like the following:
317
318 use POSIX qw(SIGALRM);
319 POSIX::sigaction(SIGALRM,
320                  POSIX::SigAction->new(sub { die "alarm" }))
321          || die "Error setting SIGALRM handler: $!\n";
322
323Another way to disable the safe signal behavior locally is to use
324the C<Perl::Unsafe::Signals> module from CPAN, which affects
325all signals.
326
327=item Restartable system calls
328
329On systems that supported it, older versions of Perl used the
330SA_RESTART flag when installing %SIG handlers.  This meant that
331restartable system calls would continue rather than returning when
332a signal arrived.  In order to deliver deferred signals promptly,
333Perl 5.8.0 and later do I<not> use SA_RESTART.  Consequently,
334restartable system calls can fail (with $! set to C<EINTR>) in places
335where they previously would have succeeded.
336
337The default C<:perlio> layer retries C<read>, C<write>
338and C<close> as described above; interrupted C<wait> and
339C<waitpid> calls will always be retried.
340
341=item Signals as "faults"
342
343Certain signals like SEGV, ILL, BUS and FPE are generated by virtual memory
344addressing errors and similar "faults". These are normally fatal: there is
345little a Perl-level handler can do with them.  So Perl delivers them
346immediately rather than attempting to defer them.
347
348It is possible to catch these with a C<%SIG> handler (see L<perlvar>),
349but on top of the usual problems of "unsafe" signals the signal is likely
350to get rethrown immediately on return from the signal handler, so such
351a handler should C<die> or C<exit> instead.
352
353=item Signals triggered by operating system state
354
355On some operating systems certain signal handlers are supposed to "do
356something" before returning. One example can be CHLD or CLD, which
357indicates a child process has completed. On some operating systems the
358signal handler is expected to C<wait> for the completed child
359process. On such systems the deferred signal scheme will not work for
360those signals: it does not do the C<wait>. Again the failure will
361look like a loop as the operating system will reissue the signal because
362there are completed child processes that have not yet been C<wait>ed for.
363
364=back
365
366If you want the old signal behavior back despite possible
367memory corruption, set the environment variable C<PERL_SIGNALS> to
368C<"unsafe">.  This feature first appeared in Perl 5.8.1.
369
370=head1 Named Pipes
371
372A named pipe (often referred to as a FIFO) is an old Unix IPC
373mechanism for processes communicating on the same machine.  It works
374just like regular anonymous pipes, except that the
375processes rendezvous using a filename and need not be related.
376
377To create a named pipe, use the C<POSIX::mkfifo()> function.
378
379    use POSIX qw(mkfifo);
380    mkfifo($path, 0700)     ||  die "mkfifo $path failed: $!";
381
382You can also use the Unix command mknod(1), or on some
383systems, mkfifo(1).  These may not be in your normal path, though.
384
385    # system return val is backwards, so && not ||
386    #
387    $ENV{PATH} .= ":/etc:/usr/etc";
388    if  (      system("mknod",  $path, "p")
389            && system("mkfifo", $path) )
390    {
391        die "mk{nod,fifo} $path failed";
392    }
393
394
395A fifo is convenient when you want to connect a process to an unrelated
396one.  When you open a fifo, the program will block until there's something
397on the other end.
398
399For example, let's say you'd like to have your F<.signature> file be a
400named pipe that has a Perl program on the other end.  Now every time any
401program (like a mailer, news reader, finger program, etc.) tries to read
402from that file, the reading program will read the new signature from your
403program.  We'll use the pipe-checking file-test operator, B<-p>, to find
404out whether anyone (or anything) has accidentally removed our fifo.
405
406    chdir();    # go home
407    my $FIFO = ".signature";
408
409    while (1) {
410        unless (-p $FIFO) {
411            unlink $FIFO;   # discard any failure, will catch later
412            require POSIX;  # delayed loading of heavy module
413            POSIX::mkfifo($FIFO, 0700)
414                                  || die "can't mkfifo $FIFO: $!";
415        }
416
417        # next line blocks till there's a reader
418        open (my $fh, ">", $FIFO) || die "can't open $FIFO: $!";
419        print $fh "John Smith (smith\@host.org)\n", `fortune -s`;
420        close($fh)                || die "can't close $FIFO: $!";
421        sleep 2;                # to avoid dup signals
422    }
423
424=head1 Using open() for IPC
425
426Perl's basic open() statement can also be used for unidirectional
427interprocess communication by specifying the open mode as C<|-> or C<-|>.
428Here's how to start
429something up in a child process you intend to write to:
430
431    open(my $spooler, "|-", "cat -v | lpr -h 2>/dev/null")
432                        || die "can't fork: $!";
433    local $SIG{PIPE} = sub { die "spooler pipe broke" };
434    print $spooler "stuff\n";
435    close $spooler      || die "bad spool: $! $?";
436
437And here's how to start up a child process you intend to read from:
438
439    open(my $status, "-|", "netstat -an 2>&1")
440                        || die "can't fork: $!";
441    while (<$status>) {
442        next if /^(tcp|udp)/;
443        print;
444    }
445    close $status       || die "bad netstat: $! $?";
446
447Be aware that these operations are full Unix forks, which means they may
448not be correctly implemented on all alien systems.  See L<perlport/open>
449for portability details.
450
451In the two-argument form of open(), a pipe open can be achieved by
452either appending or prepending a pipe symbol to the second argument:
453
454    open(my $spooler, "| cat -v | lpr -h 2>/dev/null")
455                        || die "can't fork: $!";
456    open(my $status, "netstat -an 2>&1 |")
457                        || die "can't fork: $!";
458
459This can be used even on systems that do not support forking, but this
460possibly allows code intended to read files to unexpectedly execute
461programs.  If one can be sure that a particular program is a Perl script
462expecting filenames in @ARGV using the two-argument form of open() or the
463C<< <> >> operator, the clever programmer can write something like this:
464
465    % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile
466
467and no matter which sort of shell it's called from, the Perl program will
468read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
469in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
470file.  Pretty nifty, eh?
471
472You might notice that you could use backticks for much the
473same effect as opening a pipe for reading:
474
475    print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
476    die "bad netstatus ($?)" if $?;
477
478While this is true on the surface, it's much more efficient to process the
479file one line or record at a time because then you don't have to read the
480whole thing into memory at once.  It also gives you finer control of the
481whole process, letting you kill off the child process early if you'd like.
482
483Be careful to check the return values from both open() and close().  If
484you're I<writing> to a pipe, you should also trap SIGPIPE.  Otherwise,
485think of what happens when you start up a pipe to a command that doesn't
486exist: the open() will in all likelihood succeed (it only reflects the
487fork()'s success), but then your output will fail--spectacularly.  Perl
488can't know whether the command worked, because your command is actually
489running in a separate process whose exec() might have failed.  Therefore,
490while readers of bogus commands return just a quick EOF, writers
491to bogus commands will get hit with a signal, which they'd best be prepared
492to handle.  Consider:
493
494    open(my $fh, "|-", "bogus") || die "can't fork: $!";
495    print $fh "bang\n";         #  neither necessary nor sufficient
496                                #  to check print retval!
497    close($fh)                  || die "can't close: $!";
498
499The reason for not checking the return value from print() is because of
500pipe buffering; physical writes are delayed.  That won't blow up until the
501close, and it will blow up with a SIGPIPE.  To catch it, you could use
502this:
503
504    $SIG{PIPE} = "IGNORE";
505    open(my $fh, "|-", "bogus") || die "can't fork: $!";
506    print $fh "bang\n";
507    close($fh)                  || die "can't close: status=$?";
508
509=head2 Filehandles
510
511Both the main process and any child processes it forks share the same
512STDIN, STDOUT, and STDERR filehandles.  If both processes try to access
513them at once, strange things can happen.  You may also want to close
514or reopen the filehandles for the child.  You can get around this by
515opening your pipe with open(), but on some systems this means that the
516child process cannot outlive the parent.
517
518=head2 Background Processes
519
520You can run a command in the background with:
521
522    system("cmd &");
523
524The command's STDOUT and STDERR (and possibly STDIN, depending on your
525shell) will be the same as the parent's.  You won't need to catch
526SIGCHLD because of the double-fork taking place; see below for details.
527
528=head2 Complete Dissociation of Child from Parent
529
530In some cases (starting server processes, for instance) you'll want to
531completely dissociate the child process from the parent.  This is
532often called daemonization.  A well-behaved daemon will also chdir()
533to the root directory so it doesn't prevent unmounting the filesystem
534containing the directory from which it was launched, and redirect its
535standard file descriptors from and to F</dev/null> so that random
536output doesn't wind up on the user's terminal.
537
538 use POSIX "setsid";
539
540 sub daemonize {
541     chdir("/")                     || die "can't chdir to /: $!";
542     open(STDIN,  "<", "/dev/null") || die "can't read /dev/null: $!";
543     open(STDOUT, ">", "/dev/null") || die "can't write /dev/null: $!";
544     defined(my $pid = fork())      || die "can't fork: $!";
545     exit if $pid;              # non-zero now means I am the parent
546     (setsid() != -1)           || die "Can't start a new session: $!";
547     open(STDERR, ">&", STDOUT) || die "can't dup stdout: $!";
548 }
549
550The fork() has to come before the setsid() to ensure you aren't a
551process group leader; the setsid() will fail if you are.  If your
552system doesn't have the setsid() function, open F</dev/tty> and use the
553C<TIOCNOTTY> ioctl() on it instead.  See tty(4) for details.
554
555Non-Unix users should check their C<< I<Your_OS>::Process >> module for
556other possible solutions.
557
558=head2 Safe Pipe Opens
559
560Another interesting approach to IPC is making your single program go
561multiprocess and communicate between--or even amongst--yourselves.  The
562two-argument form of the
563open() function will accept a file argument of either C<"-|"> or C<"|-">
564to do a very interesting thing: it forks a child connected to the
565filehandle you've opened.  The child is running the same program as the
566parent.  This is useful for safely opening a file when running under an
567assumed UID or GID, for example.  If you open a pipe I<to> minus, you can
568write to the filehandle you opened and your kid will find it in I<his>
569STDIN.  If you open a pipe I<from> minus, you can read from the filehandle
570you opened whatever your kid writes to I<his> STDOUT.
571
572    my $PRECIOUS = "/path/to/some/safe/file";
573    my $sleep_count;
574    my $pid;
575    my $kid_to_write;
576
577    do {
578        $pid = open($kid_to_write, "|-");
579        unless (defined $pid) {
580            warn "cannot fork: $!";
581            die "bailing out" if $sleep_count++ > 6;
582            sleep 10;
583        }
584    } until defined $pid;
585
586    if ($pid) {                 # I am the parent
587        print $kid_to_write @some_data;
588        close($kid_to_write)    || warn "kid exited $?";
589    } else {                    # I am the child
590        # drop permissions in setuid and/or setgid programs:
591        ($>, $)) = ($<, $();
592        open (my $outfile, ">", $PRECIOUS)
593                                || die "can't open $PRECIOUS: $!";
594        while (<STDIN>) {
595            print $outfile;     # child STDIN is parent $kid_to_write
596        }
597        close($outfile)         || die "can't close $PRECIOUS: $!";
598        exit(0);                # don't forget this!!
599    }
600
601Another common use for this construct is when you need to execute
602something without the shell's interference.  With system(), it's
603straightforward, but you can't use a pipe open or backticks safely.
604That's because there's no way to stop the shell from getting its hands on
605your arguments.   Instead, use lower-level control to call exec() directly.
606
607Here's a safe backtick or pipe open for read:
608
609    my $pid = open(my $kid_to_read, "-|");
610    defined($pid)            || die "can't fork: $!";
611
612    if ($pid) {             # parent
613        while (<$kid_to_read>) {
614                            # do something interesting
615        }
616        close($kid_to_read)  || warn "kid exited $?";
617
618    } else {                # child
619        ($>, $)) = ($<, $(); # suid only
620        exec($program, @options, @args)
621                             || die "can't exec program: $!";
622        # NOTREACHED
623    }
624
625And here's a safe pipe open for writing:
626
627    my $pid = open(my $kid_to_write, "|-");
628    defined($pid)            || die "can't fork: $!";
629
630    $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
631
632    if ($pid) {             # parent
633        print $kid_to_write @data;
634        close($kid_to_write) || warn "kid exited $?";
635
636    } else {                # child
637        ($>, $)) = ($<, $();
638        exec($program, @options, @args)
639                             || die "can't exec program: $!";
640        # NOTREACHED
641    }
642
643It is very easy to dead-lock a process using this form of open(), or
644indeed with any use of pipe() with multiple subprocesses.  The
645example above is "safe" because it is simple and calls exec().  See
646L</"Avoiding Pipe Deadlocks"> for general safety principles, but there
647are extra gotchas with Safe Pipe Opens.
648
649In particular, if you opened the pipe using C<open $fh, "|-">, then you
650cannot simply use close() in the parent process to close an unwanted
651writer.  Consider this code:
652
653    my $pid = open(my $writer, "|-");        # fork open a kid
654    defined($pid)               || die "first fork failed: $!";
655    if ($pid) {
656        if (my $sub_pid = fork()) {
657            defined($sub_pid)   || die "second fork failed: $!";
658            close($writer)      || die "couldn't close writer: $!";
659            # now do something else...
660        }
661        else {
662            # first write to $writer
663            # ...
664            # then when finished
665            close($writer)      || die "couldn't close writer: $!";
666            exit(0);
667        }
668    }
669    else {
670        # first do something with STDIN, then
671        exit(0);
672    }
673
674In the example above, the true parent does not want to write to the $writer
675filehandle, so it closes it.  However, because $writer was opened using
676C<open $fh, "|-">, it has a special behavior: closing it calls
677waitpid() (see L<perlfunc/waitpid>), which waits for the subprocess
678to exit.  If the child process ends up waiting for something happening
679in the section marked "do something else", you have deadlock.
680
681This can also be a problem with intermediate subprocesses in more
682complicated code, which will call waitpid() on all open filehandles
683during global destruction--in no predictable order.
684
685To solve this, you must manually use pipe(), fork(), and the form of
686open() which sets one file descriptor to another, as shown below:
687
688    pipe(my $reader, my $writer)   || die "pipe failed: $!";
689    my $pid = fork();
690    defined($pid)                  || die "first fork failed: $!";
691    if ($pid) {
692        close $reader;
693        if (my $sub_pid = fork()) {
694            defined($sub_pid)      || die "first fork failed: $!";
695            close($writer)         || die "can't close writer: $!";
696        }
697        else {
698            # write to $writer...
699            # ...
700            # then  when finished
701            close($writer)         || die "can't close writer: $!";
702            exit(0);
703        }
704        # write to $writer...
705    }
706    else {
707        open(STDIN, "<&", $reader) || die "can't reopen STDIN: $!";
708        close($writer)             || die "can't close writer: $!";
709        # do something...
710        exit(0);
711    }
712
713Since Perl 5.8.0, you can also use the list form of C<open> for pipes.
714This is preferred when you wish to avoid having the shell interpret
715metacharacters that may be in your command string.
716
717So for example, instead of using:
718
719    open(my $ps_pipe, "-|", "ps aux") || die "can't open ps pipe: $!";
720
721One would use either of these:
722
723    open(my $ps_pipe, "-|", "ps", "aux")
724                                      || die "can't open ps pipe: $!";
725
726    my @ps_args = qw[ ps aux ];
727    open(my $ps_pipe, "-|", @ps_args)
728                                      || die "can't open @ps_args|: $!";
729
730Because there are more than three arguments to open(), it forks the ps(1)
731command I<without> spawning a shell, and reads its standard output via the
732C<$ps_pipe> filehandle.  The corresponding syntax to I<write> to command
733pipes is to use C<"|-"> in place of C<"-|">.
734
735This was admittedly a rather silly example, because you're using string
736literals whose content is perfectly safe.  There is therefore no cause to
737resort to the harder-to-read, multi-argument form of pipe open().  However,
738whenever you cannot be assured that the program arguments are free of shell
739metacharacters, the fancier form of open() should be used.  For example:
740
741    my @grep_args = ("egrep", "-i", $some_pattern, @many_files);
742    open(my $grep_pipe, "-|", @grep_args)
743                        || die "can't open @grep_args|: $!";
744
745Here the multi-argument form of pipe open() is preferred because the
746pattern and indeed even the filenames themselves might hold metacharacters.
747
748=head2 Avoiding Pipe Deadlocks
749
750Whenever you have more than one subprocess, you must be careful that each
751closes whichever half of any pipes created for interprocess communication
752it is not using.  This is because any child process reading from the pipe
753and expecting an EOF will never receive it, and therefore never exit. A
754single process closing a pipe is not enough to close it; the last process
755with the pipe open must close it for it to read EOF.
756
757Certain built-in Unix features help prevent this most of the time.  For
758instance, filehandles have a "close on exec" flag, which is set I<en masse>
759under control of the C<$^F> variable.  This is so any filehandles you
760didn't explicitly route to the STDIN, STDOUT or STDERR of a child
761I<program> will be automatically closed.
762
763Always explicitly and immediately call close() on the writable end of any
764pipe, unless that process is actually writing to it.  Even if you don't
765explicitly call close(), Perl will still close() all filehandles during
766global destruction.  As previously discussed, if those filehandles have
767been opened with Safe Pipe Open, this will result in calling waitpid(),
768which may again deadlock.
769
770=head2 Bidirectional Communication with Another Process
771
772While this works reasonably well for unidirectional communication, what
773about bidirectional communication?  The most obvious approach doesn't work:
774
775    # THIS DOES NOT WORK!!
776    open(my $prog_for_reading_and_writing, "| some program |")
777
778If you forget to C<use warnings>, you'll miss out entirely on the
779helpful diagnostic message:
780
781    Can't do bidirectional pipe at -e line 1.
782
783If you really want to, you can use the standard open2() from the
784L<IPC::Open2> module to catch both ends.  There's also an open3() in
785L<IPC::Open3> for tridirectional I/O so you can also catch your child's
786STDERR, but doing so would then require an awkward select() loop and
787wouldn't allow you to use normal Perl input operations.
788
789If you look at its source, you'll see that open2() uses low-level
790primitives like the pipe() and exec() syscalls to create all the
791connections.  Although it might have been more efficient by using
792socketpair(), this would have been even less portable than it already
793is. The open2() and open3() functions are unlikely to work anywhere
794except on a Unix system, or at least one purporting POSIX compliance.
795
796=for TODO
797Hold on, is this even true?  First it says that socketpair() is avoided
798for portability, but then it says it probably won't work except on
799Unixy systems anyway.  Which one of those is true?
800
801Here's an example of using open2():
802
803    use IPC::Open2;
804    my $pid = open2(my $reader, my $writer, "cat -un");
805    print $writer "stuff\n";
806    my $got = <$reader>;
807    waitpid $pid, 0;
808
809The problem with this is that buffering is really going to ruin your
810day.  Even though your C<$writer> filehandle is auto-flushed so the process
811on the other end gets your data in a timely manner, you can't usually do
812anything to force that process to give its data to you in a similarly quick
813fashion.  In this special case, we could actually so, because we gave
814I<cat> a B<-u> flag to make it unbuffered.  But very few commands are
815designed to operate over pipes, so this seldom works unless you yourself
816wrote the program on the other end of the double-ended pipe.
817
818A solution to this is to use a library which uses pseudottys to make your
819program behave more reasonably.  This way you don't have to have control
820over the source code of the program you're using.  The C<Expect> module
821from CPAN also addresses this kind of thing.  This module requires two
822other modules from CPAN, C<IO::Pty> and C<IO::Stty>.  It sets up a pseudo
823terminal to interact with programs that insist on talking to the terminal
824device driver.  If your system is supported, this may be your best bet.
825
826=head2 Bidirectional Communication with Yourself
827
828If you want, you may make low-level pipe() and fork() syscalls to stitch
829this together by hand.  This example only talks to itself, but you could
830reopen the appropriate handles to STDIN and STDOUT and call other processes.
831(The following example lacks proper error checking.)
832
833 #!/usr/bin/perl
834 # pipe1 - bidirectional communication using two pipe pairs
835 #         designed for the socketpair-challenged
836 use v5.36;
837 use IO::Handle;  # enable autoflush method before Perl 5.14
838 pipe(my $parent_rdr, my $child_wtr);  # XXX: check failure?
839 pipe(my $child_rdr,  my $parent_wtr); # XXX: check failure?
840 $child_wtr->autoflush(1);
841 $parent_wtr->autoflush(1);
842
843 if ($pid = fork()) {
844     close $parent_rdr;
845     close $parent_wtr;
846     print $child_wtr "Parent Pid $$ is sending this\n";
847     chomp(my $line = <$child_rdr>);
848     print "Parent Pid $$ just read this: '$line'\n";
849     close $child_rdr; close $child_wtr;
850     waitpid($pid, 0);
851 } else {
852     die "cannot fork: $!" unless defined $pid;
853     close $child_rdr;
854     close $child_wtr;
855     chomp(my $line = <$parent_rdr>);
856     print "Child Pid $$ just read this: '$line'\n";
857     print $parent_wtr "Child Pid $$ is sending this\n";
858     close $parent_rdr;
859     close $parent_wtr;
860     exit(0);
861 }
862
863But you don't actually have to make two pipe calls.  If you
864have the socketpair() system call, it will do this all for you.
865
866 #!/usr/bin/perl
867 # pipe2 - bidirectional communication using socketpair
868 #   "the best ones always go both ways"
869
870 use v5.36;
871 use Socket;
872 use IO::Handle;  # enable autoflush method before Perl 5.14
873
874 # We say AF_UNIX because although *_LOCAL is the
875 # POSIX 1003.1g form of the constant, many machines
876 # still don't have it.
877 socketpair(my $child, my $parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
878                             ||  die "socketpair: $!";
879
880 $child->autoflush(1);
881 $parent->autoflush(1);
882
883 if ($pid = fork()) {
884     close $parent;
885     print $child "Parent Pid $$ is sending this\n";
886     chomp(my $line = <$child>);
887     print "Parent Pid $$ just read this: '$line'\n";
888     close $child;
889     waitpid($pid, 0);
890 } else {
891     die "cannot fork: $!" unless defined $pid;
892     close $child;
893     chomp(my $line = <$parent>);
894     print "Child Pid $$ just read this: '$line'\n";
895     print $parent "Child Pid $$ is sending this\n";
896     close $parent;
897     exit(0);
898 }
899
900=head1 Sockets: Client/Server Communication
901
902While not entirely limited to Unix-derived operating systems (e.g., WinSock
903on PCs provides socket support, as do some VMS libraries), you might not have
904sockets on your system, in which case this section probably isn't going to
905do you much good.  With sockets, you can do both virtual circuits like TCP
906streams and datagrams like UDP packets.  You may be able to do even more
907depending on your system.
908
909The Perl functions for dealing with sockets have the same names as
910the corresponding system calls in C, but their arguments tend to differ
911for two reasons.  First, Perl filehandles work differently than C file
912descriptors.  Second, Perl already knows the length of its strings, so you
913don't need to pass that information.
914
915One of the major problems with ancient, antemillennial socket code in Perl
916was that it used hard-coded values for some of the constants, which
917severely hurt portability.  If you ever see code that does anything like
918explicitly setting C<$AF_INET = 2>, you know you're in for big trouble.
919An immeasurably superior approach is to use the L<Socket> module, which more
920reliably grants access to the various constants and functions you'll need.
921
922If you're not writing a server/client for an existing protocol like
923NNTP or SMTP, you should give some thought to how your server will
924know when the client has finished talking, and vice-versa.  Most
925protocols are based on one-line messages and responses (so one party
926knows the other has finished when a "\n" is received) or multi-line
927messages and responses that end with a period on an empty line
928("\n.\n" terminates a message/response).
929
930=head2 Internet Line Terminators
931
932The Internet line terminator is "\015\012".  Under ASCII variants of
933Unix, that could usually be written as "\r\n", but under other systems,
934"\r\n" might at times be "\015\015\012", "\012\012\015", or something
935completely different.  The standards specify writing "\015\012" to be
936conformant (be strict in what you provide), but they also recommend
937accepting a lone "\012" on input (be lenient in what you require).
938We haven't always been very good about that in the code in this manpage,
939but unless you're on a Mac from way back in its pre-Unix dark ages, you'll
940probably be ok.
941
942=head2 Internet TCP Clients and Servers
943
944Use Internet-domain sockets when you want to do client-server
945communication that might extend to machines outside of your own system.
946
947Here's a sample TCP client using Internet-domain sockets:
948
949    #!/usr/bin/perl
950    use v5.36;
951    use Socket;
952
953    my $remote  = shift || "localhost";
954    my $port    = shift || 2345;  # random port
955    if ($port =~ /\D/) { $port = getservbyname($port, "tcp") }
956    die "No port" unless $port;
957    my $iaddr   = inet_aton($remote)       || die "no host: $remote";
958    my $paddr   = sockaddr_in($port, $iaddr);
959
960    my $proto   = getprotobyname("tcp");
961    socket(my $sock, PF_INET, SOCK_STREAM, $proto)  || die "socket: $!";
962    connect($sock, $paddr)              || die "connect: $!";
963    while (my $line = <$sock>) {
964        print $line;
965    }
966
967    close ($sock)                        || die "close: $!";
968    exit(0);
969
970And here's a corresponding server to go along with it.  We'll
971leave the address as C<INADDR_ANY> so that the kernel can choose
972the appropriate interface on multihomed hosts.  If you want sit
973on a particular interface (like the external side of a gateway
974or firewall machine), fill this in with your real address instead.
975
976 #!/usr/bin/perl -T
977 use v5.36;
978 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
979 use Socket;
980 use Carp;
981 my $EOL = "\015\012";
982
983 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
984
985 my $port  = shift || 2345;
986 die "invalid port" unless $port =~ /^ \d+ $/x;
987
988 my $proto = getprotobyname("tcp");
989
990 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
991 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
992                                               || die "setsockopt: $!";
993 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
994 listen($server, SOMAXCONN)                    || die "listen: $!";
995
996 logmsg "server started on port $port";
997
998 for (my $paddr; $paddr = accept(my $client, $server); close $client) {
999     my($port, $iaddr) = sockaddr_in($paddr);
1000     my $name = gethostbyaddr($iaddr, AF_INET);
1001
1002     logmsg "connection from $name [",
1003             inet_ntoa($iaddr), "]
1004             at port $port";
1005
1006     print $client "Hello there, $name, it's now ",
1007                     scalar localtime(), $EOL;
1008 }
1009
1010And here's a multitasking version.  It's multitasked in that
1011like most typical servers, it spawns (fork()s) a child server to
1012handle the client request so that the master server can quickly
1013go back to service a new client.
1014
1015 #!/usr/bin/perl -T
1016 use v5.36;
1017 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1018 use Socket;
1019 use Carp;
1020 my $EOL = "\015\012";
1021
1022 sub spawn;  # forward declaration
1023 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1024
1025 my $port  = shift || 2345;
1026 die "invalid port" unless $port =~ /^ \d+ $/x;
1027
1028 my $proto = getprotobyname("tcp");
1029
1030 socket(my $server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
1031 setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
1032                                               || die "setsockopt: $!";
1033 bind($server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
1034 listen($server, SOMAXCONN)                    || die "listen: $!";
1035
1036 logmsg "server started on port $port";
1037
1038 my $waitedpid = 0;
1039
1040 use POSIX ":sys_wait_h";
1041 use Errno;
1042
1043 sub REAPER {
1044     local $!;   # don't let waitpid() overwrite current error
1045     while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) {
1046         logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1047     }
1048     $SIG{CHLD} = \&REAPER;  # loathe SysV
1049 }
1050
1051 $SIG{CHLD} = \&REAPER;
1052
1053 while (1) {
1054     my $paddr = accept(my $client, $server) || do {
1055         # try again if accept() returned because got a signal
1056         next if $!{EINTR};
1057         die "accept: $!";
1058     };
1059     my ($port, $iaddr) = sockaddr_in($paddr);
1060     my $name = gethostbyaddr($iaddr, AF_INET);
1061
1062     logmsg "connection from $name [",
1063            inet_ntoa($iaddr),
1064            "] at port $port";
1065
1066     spawn $client, sub {
1067         $| = 1;
1068         print "Hello there, $name, it's now ",
1069               scalar localtime(),
1070               $EOL;
1071         exec "/usr/games/fortune"       # XXX: "wrong" line terminators
1072             or confess "can't exec fortune: $!";
1073     };
1074     close $client;
1075 }
1076
1077 sub spawn {
1078     my $client = shift;
1079     my $coderef = shift;
1080
1081     unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1082         confess "usage: spawn CLIENT CODEREF";
1083     }
1084
1085     my $pid;
1086     unless (defined($pid = fork())) {
1087         logmsg "cannot fork: $!";
1088         return;
1089     }
1090     elsif ($pid) {
1091         logmsg "begat $pid";
1092         return; # I'm the parent
1093     }
1094     # else I'm the child -- go spawn
1095
1096     open(STDIN,  "<&", $client)   || die "can't dup client to stdin";
1097     open(STDOUT, ">&", $client)   || die "can't dup client to stdout";
1098     ## open(STDERR, ">&", STDOUT) || die "can't dup stdout to stderr";
1099     exit($coderef->());
1100 }
1101
1102This server takes the trouble to clone off a child version via fork()
1103for each incoming request.  That way it can handle many requests at
1104once, which you might not always want.  Even if you don't fork(), the
1105listen() will allow that many pending connections.  Forking servers
1106have to be particularly careful about cleaning up their dead children
1107(called "zombies" in Unix parlance), because otherwise you'll quickly
1108fill up your process table.  The REAPER subroutine is used here to
1109call waitpid() for any child processes that have finished, thereby
1110ensuring that they terminate cleanly and don't join the ranks of the
1111living dead.
1112
1113Within the while loop we call accept() and check to see if it returns
1114a false value.  This would normally indicate a system error needs
1115to be reported.  However, the introduction of safe signals (see
1116L</Deferred Signals (Safe Signals)> above) in Perl 5.8.0 means that
1117accept() might also be interrupted when the process receives a signal.
1118This typically happens when one of the forked subprocesses exits and
1119notifies the parent process with a CHLD signal.
1120
1121If accept() is interrupted by a signal, $! will be set to EINTR.
1122If this happens, we can safely continue to the next iteration of
1123the loop and another call to accept().  It is important that your
1124signal handling code not modify the value of $!, or else this test
1125will likely fail.  In the REAPER subroutine we create a local version
1126of $! before calling waitpid().  When waitpid() sets $! to ECHILD as
1127it inevitably does when it has no more children waiting, it
1128updates the local copy and leaves the original unchanged.
1129
1130You should use the B<-T> flag to enable taint checking (see L<perlsec>)
1131even if we aren't running setuid or setgid.  This is always a good idea
1132for servers or any program run on behalf of someone else (like CGI
1133scripts), because it lessens the chances that people from the outside will
1134be able to compromise your system.
1135Note that perl can be built without taint support.  There are two
1136different modes: in one, B<-T> will silently do nothing.  In the other
1137mode B<-T> results in a fatal error.
1138
1139Let's look at another TCP client.  This one connects to the TCP "time"
1140service on a number of different machines and shows how far their clocks
1141differ from the system on which it's being run:
1142
1143    #!/usr/bin/perl
1144    use v5.36;
1145    use Socket;
1146
1147    my $SECS_OF_70_YEARS = 2208988800;
1148    sub ctime { scalar localtime(shift() || time()) }
1149
1150    my $iaddr = gethostbyname("localhost");
1151    my $proto = getprotobyname("tcp");
1152    my $port = getservbyname("time", "tcp");
1153    my $paddr = sockaddr_in(0, $iaddr);
1154
1155    $| = 1;
1156    printf "%-24s %8s %s\n", "localhost", 0, ctime();
1157
1158    foreach my $host (@ARGV) {
1159        printf "%-24s ", $host;
1160        my $hisiaddr = inet_aton($host)     || die "unknown host";
1161        my $hispaddr = sockaddr_in($port, $hisiaddr);
1162        socket(my $socket, PF_INET, SOCK_STREAM, $proto)
1163                                            || die "socket: $!";
1164        connect($socket, $hispaddr)         || die "connect: $!";
1165        my $rtime = pack("C4", ());
1166        read($socket, $rtime, 4);
1167        close($socket);
1168        my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1169        printf "%8d %s\n", $histime - time(), ctime($histime);
1170    }
1171
1172=head2 Unix-Domain TCP Clients and Servers
1173
1174That's fine for Internet-domain clients and servers, but what about local
1175communications?  While you can use the same setup, sometimes you don't
1176want to.  Unix-domain sockets are local to the current host, and are often
1177used internally to implement pipes.  Unlike Internet domain sockets, Unix
1178domain sockets can show up in the file system with an ls(1) listing.
1179
1180    % ls -l /dev/log
1181    srw-rw-rw-  1 root            0 Oct 31 07:23 /dev/log
1182
1183You can test for these with Perl's B<-S> file test:
1184
1185    unless (-S "/dev/log") {
1186        die "something's wicked with the log system";
1187    }
1188
1189Here's a sample Unix-domain client:
1190
1191    #!/usr/bin/perl
1192    use v5.36;
1193    use Socket;
1194
1195    my $rendezvous = shift || "catsock";
1196    socket(my $sock, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1197    connect($sock, sockaddr_un($rendezvous))  || die "connect: $!";
1198    while (defined(my $line = <$sock>)) {
1199        print $line;
1200    }
1201    exit(0);
1202
1203And here's a corresponding server.  You don't have to worry about silly
1204network terminators here because Unix domain sockets are guaranteed
1205to be on the localhost, and thus everything works right.
1206
1207    #!/usr/bin/perl -T
1208    use v5.36;
1209    use Socket;
1210    use Carp;
1211
1212    BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
1213    sub spawn;  # forward declaration
1214    sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
1215
1216    my $NAME = "catsock";
1217    my $uaddr = sockaddr_un($NAME);
1218    my $proto = getprotobyname("tcp");
1219
1220    socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
1221    unlink($NAME);
1222    bind  ($server, $uaddr)                     || die "bind: $!";
1223    listen($server, SOMAXCONN)                  || die "listen: $!";
1224
1225    logmsg "server started on $NAME";
1226
1227    my $waitedpid;
1228
1229    use POSIX ":sys_wait_h";
1230    sub REAPER {
1231        my $child;
1232        while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
1233            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
1234        }
1235        $SIG{CHLD} = \&REAPER;  # loathe SysV
1236    }
1237
1238    $SIG{CHLD} = \&REAPER;
1239
1240
1241    for ( $waitedpid = 0;
1242          accept(my $client, $server) || $waitedpid;
1243          $waitedpid = 0, close $client)
1244    {
1245        next if $waitedpid;
1246        logmsg "connection on $NAME";
1247        spawn $client, sub {
1248            print "Hello there, it's now ", scalar localtime(), "\n";
1249            exec("/usr/games/fortune")  || die "can't exec fortune: $!";
1250        };
1251    }
1252
1253    sub spawn {
1254        my $client = shift();
1255        my $coderef = shift();
1256
1257        unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
1258            confess "usage: spawn CLIENT CODEREF";
1259        }
1260
1261        my $pid;
1262        unless (defined($pid = fork())) {
1263            logmsg "cannot fork: $!";
1264            return;
1265        }
1266        elsif ($pid) {
1267            logmsg "begat $pid";
1268            return; # I'm the parent
1269        }
1270        else {
1271            # I'm the child -- go spawn
1272        }
1273
1274        open(STDIN,  "<&", $client)
1275            || die "can't dup client to stdin";
1276        open(STDOUT, ">&", $client)
1277            || die "can't dup client to stdout";
1278        ## open(STDERR, ">&", STDOUT)
1279        ##  || die "can't dup stdout to stderr";
1280        exit($coderef->());
1281    }
1282
1283As you see, it's remarkably similar to the Internet domain TCP server, so
1284much so, in fact, that we've omitted several duplicate functions--spawn(),
1285logmsg(), ctime(), and REAPER()--which are the same as in the other server.
1286
1287So why would you ever want to use a Unix domain socket instead of a
1288simpler named pipe?  Because a named pipe doesn't give you sessions.  You
1289can't tell one process's data from another's.  With socket programming,
1290you get a separate session for each client; that's why accept() takes two
1291arguments.
1292
1293For example, let's say that you have a long-running database server daemon
1294that you want folks to be able to access from the Web, but only
1295if they go through a CGI interface.  You'd have a small, simple CGI
1296program that does whatever checks and logging you feel like, and then acts
1297as a Unix-domain client and connects to your private server.
1298
1299=head1 TCP Clients with IO::Socket
1300
1301For those preferring a higher-level interface to socket programming, the
1302IO::Socket module provides an object-oriented approach.  If for some reason
1303you lack this module, you can just fetch IO::Socket from CPAN, where you'll also
1304find modules providing easy interfaces to the following systems: DNS, FTP,
1305Ident (RFC 931), NIS and NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay,
1306Telnet, and Time--to name just a few.
1307
1308=head2 A Simple Client
1309
1310Here's a client that creates a TCP connection to the "daytime"
1311service at port 13 of the host name "localhost" and prints out everything
1312that the server there cares to provide.
1313
1314    #!/usr/bin/perl
1315    use v5.36;
1316    use IO::Socket;
1317    my $remote = IO::Socket::INET->new(
1318                        Proto    => "tcp",
1319                        PeerAddr => "localhost",
1320                        PeerPort => "daytime(13)",
1321                    )
1322                 || die "can't connect to daytime service on localhost";
1323    while (<$remote>) { print }
1324
1325When you run this program, you should get something back that
1326looks like this:
1327
1328    Wed May 14 08:40:46 MDT 1997
1329
1330Here are what those parameters to the new() constructor mean:
1331
1332=over 4
1333
1334=item C<Proto>
1335
1336This is which protocol to use.  In this case, the socket handle returned
1337will be connected to a TCP socket, because we want a stream-oriented
1338connection, that is, one that acts pretty much like a plain old file.
1339Not all sockets are this of this type.  For example, the UDP protocol
1340can be used to make a datagram socket, used for message-passing.
1341
1342=item C<PeerAddr>
1343
1344This is the name or Internet address of the remote host the server is
1345running on.  We could have specified a longer name like C<"www.perl.com">,
1346or an address like C<"207.171.7.72">.  For demonstration purposes, we've
1347used the special hostname C<"localhost">, which should always mean the
1348current machine you're running on.  The corresponding Internet address
1349for localhost is C<"127.0.0.1">, if you'd rather use that.
1350
1351=item C<PeerPort>
1352
1353This is the service name or port number we'd like to connect to.
1354We could have gotten away with using just C<"daytime"> on systems with a
1355well-configured system services file,[FOOTNOTE: The system services file
1356is found in I</etc/services> under Unixy systems.] but here we've specified the
1357port number (13) in parentheses.  Using just the number would have also
1358worked, but numeric literals make careful programmers nervous.
1359
1360=back
1361
1362=head2 A Webget Client
1363
1364Here's a simple client that takes a remote host to fetch a document
1365from, and then a list of files to get from that host.  This is a
1366more interesting client than the previous one because it first sends
1367something to the server before fetching the server's response.
1368
1369    #!/usr/bin/perl
1370    use v5.36;
1371    use IO::Socket;
1372    unless (@ARGV > 1) { die "usage: $0 host url ..." }
1373    my $host = shift(@ARGV);
1374    my $EOL = "\015\012";
1375    my $BLANK = $EOL x 2;
1376    for my $document (@ARGV) {
1377        my $remote = IO::Socket::INET->new( Proto     => "tcp",
1378                                            PeerAddr  => $host,
1379                                            PeerPort  => "http(80)",
1380                  )     || die "cannot connect to httpd on $host";
1381        $remote->autoflush(1);
1382        print $remote "GET $document HTTP/1.0" . $BLANK;
1383        while ( <$remote> ) { print }
1384        close $remote;
1385    }
1386
1387The web server handling the HTTP service is assumed to be at
1388its standard port, number 80.  If the server you're trying to
1389connect to is at a different port, like 1080 or 8080, you should specify it
1390as the named-parameter pair, C<< PeerPort => 8080 >>.  The C<autoflush>
1391method is used on the socket because otherwise the system would buffer
1392up the output we sent it.  (If you're on a prehistoric Mac, you'll also
1393need to change every C<"\n"> in your code that sends data over the network
1394to be a C<"\015\012"> instead.)
1395
1396Connecting to the server is only the first part of the process: once you
1397have the connection, you have to use the server's language.  Each server
1398on the network has its own little command language that it expects as
1399input.  The string that we send to the server starting with "GET" is in
1400HTTP syntax.  In this case, we simply request each specified document.
1401Yes, we really are making a new connection for each document, even though
1402it's the same host.  That's the way you always used to have to speak HTTP.
1403Recent versions of web browsers may request that the remote server leave
1404the connection open a little while, but the server doesn't have to honor
1405such a request.
1406
1407Here's an example of running that program, which we'll call I<webget>:
1408
1409    % webget www.perl.com /guanaco.html
1410    HTTP/1.1 404 File Not Found
1411    Date: Thu, 08 May 1997 18:02:32 GMT
1412    Server: Apache/1.2b6
1413    Connection: close
1414    Content-type: text/html
1415
1416    <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
1417    <BODY><H1>File Not Found</H1>
1418    The requested URL /guanaco.html was not found on this server.<P>
1419    </BODY>
1420
1421Ok, so that's not very interesting, because it didn't find that
1422particular document.  But a long response wouldn't have fit on this page.
1423
1424For a more featureful version of this program, you should look to
1425the I<lwp-request> program included with the LWP modules from CPAN.
1426
1427=head2 Interactive Client with IO::Socket
1428
1429Well, that's all fine if you want to send one command and get one answer,
1430but what about setting up something fully interactive, somewhat like
1431the way I<telnet> works?  That way you can type a line, get the answer,
1432type a line, get the answer, etc.
1433
1434This client is more complicated than the two we've done so far, but if
1435you're on a system that supports the powerful C<fork> call, the solution
1436isn't that rough.  Once you've made the connection to whatever service
1437you'd like to chat with, call C<fork> to clone your process.  Each of
1438these two identical process has a very simple job to do: the parent
1439copies everything from the socket to standard output, while the child
1440simultaneously copies everything from standard input to the socket.
1441To accomplish the same thing using just one process would be I<much>
1442harder, because it's easier to code two processes to do one thing than it
1443is to code one process to do two things.  (This keep-it-simple principle
1444a cornerstones of the Unix philosophy, and good software engineering as
1445well, which is probably why it's spread to other systems.)
1446
1447Here's the code:
1448
1449    #!/usr/bin/perl
1450    use v5.36;
1451    use IO::Socket;
1452
1453    unless (@ARGV == 2) { die "usage: $0 host port" }
1454    my ($host, $port) = @ARGV;
1455
1456    # create a tcp connection to the specified host and port
1457    my $handle = IO::Socket::INET->new(Proto     => "tcp",
1458                                       PeerAddr  => $host,
1459                                       PeerPort  => $port)
1460               || die "can't connect to port $port on $host: $!";
1461
1462    $handle->autoflush(1);       # so output gets there right away
1463    print STDERR "[Connected to $host:$port]\n";
1464
1465    # split the program into two processes, identical twins
1466    die "can't fork: $!" unless defined(my $kidpid = fork());
1467
1468    # the if{} block runs only in the parent process
1469    if ($kidpid) {
1470        # copy the socket to standard output
1471        while (defined (my $line = <$handle>)) {
1472            print STDOUT $line;
1473        }
1474        kill("TERM", $kidpid);   # send SIGTERM to child
1475    }
1476    # the else{} block runs only in the child process
1477    else {
1478        # copy standard input to the socket
1479        while (defined (my $line = <STDIN>)) {
1480            print $handle $line;
1481        }
1482        exit(0);                # just in case
1483    }
1484
1485The C<kill> function in the parent's C<if> block is there to send a
1486signal to our child process, currently running in the C<else> block,
1487as soon as the remote server has closed its end of the connection.
1488
1489If the remote server sends data a byte at time, and you need that
1490data immediately without waiting for a newline (which might not happen),
1491you may wish to replace the C<while> loop in the parent with the
1492following:
1493
1494    my $byte;
1495    while (sysread($handle, $byte, 1) == 1) {
1496        print STDOUT $byte;
1497    }
1498
1499Making a system call for each byte you want to read is not very efficient
1500(to put it mildly) but is the simplest to explain and works reasonably
1501well.
1502
1503=head1 TCP Servers with IO::Socket
1504
1505As always, setting up a server is little bit more involved than running a client.
1506The model is that the server creates a special kind of socket that
1507does nothing but listen on a particular port for incoming connections.
1508It does this by calling the C<< IO::Socket::INET->new() >> method with
1509slightly different arguments than the client did.
1510
1511=over 4
1512
1513=item Proto
1514
1515This is which protocol to use.  Like our clients, we'll
1516still specify C<"tcp"> here.
1517
1518=item LocalPort
1519
1520We specify a local
1521port in the C<LocalPort> argument, which we didn't do for the client.
1522This is service name or port number for which you want to be the
1523server. (Under Unix, ports under 1024 are restricted to the
1524superuser.)  In our sample, we'll use port 9000, but you can use
1525any port that's not currently in use on your system.  If you try
1526to use one already in used, you'll get an "Address already in use"
1527message.  Under Unix, the C<netstat -a> command will show
1528which services current have servers.
1529
1530=item Listen
1531
1532The C<Listen> parameter is set to the maximum number of
1533pending connections we can accept until we turn away incoming clients.
1534Think of it as a call-waiting queue for your telephone.
1535The low-level Socket module has a special symbol for the system maximum, which
1536is SOMAXCONN.
1537
1538=item Reuse
1539
1540The C<Reuse> parameter is needed so that we restart our server
1541manually without waiting a few minutes to allow system buffers to
1542clear out.
1543
1544=back
1545
1546Once the generic server socket has been created using the parameters
1547listed above, the server then waits for a new client to connect
1548to it.  The server blocks in the C<accept> method, which eventually accepts a
1549bidirectional connection from the remote client.  (Make sure to autoflush
1550this handle to circumvent buffering.)
1551
1552To add to user-friendliness, our server prompts the user for commands.
1553Most servers don't do this.  Because of the prompt without a newline,
1554you'll have to use the C<sysread> variant of the interactive client above.
1555
1556This server accepts one of five different commands, sending output back to
1557the client.  Unlike most network servers, this one handles only one
1558incoming client at a time.  Multitasking servers are covered in
1559Chapter 16 of the Camel.
1560
1561Here's the code.
1562
1563 #!/usr/bin/perl
1564 use v5.36;
1565 use IO::Socket;
1566 use Net::hostent;      # for OOish version of gethostbyaddr
1567
1568 my $PORT = 9000;       # pick something not in use
1569
1570 my $server = IO::Socket::INET->new( Proto     => "tcp",
1571                                     LocalPort => $PORT,
1572                                     Listen    => SOMAXCONN,
1573                                     Reuse     => 1);
1574
1575 die "can't setup server" unless $server;
1576 print "[Server $0 accepting clients]\n";
1577
1578 while (my $client = $server->accept()) {
1579   $client->autoflush(1);
1580   print $client "Welcome to $0; type help for command list.\n";
1581   my $hostinfo = gethostbyaddr($client->peeraddr);
1582   printf "[Connect from %s]\n",
1583          $hostinfo ? $hostinfo->name : $client->peerhost;
1584   print $client "Command? ";
1585   while ( <$client>) {
1586     next unless /\S/;     # blank line
1587     if    (/quit|exit/i)  { last                                      }
1588     elsif (/date|time/i)  { printf $client "%s\n", scalar localtime() }
1589     elsif (/who/i )       { print  $client `who 2>&1`                 }
1590     elsif (/cookie/i )    { print  $client `/usr/games/fortune 2>&1`  }
1591     elsif (/motd/i )      { print  $client `cat /etc/motd 2>&1`       }
1592     else {
1593       print $client "Commands: quit date who cookie motd\n";
1594     }
1595   } continue {
1596      print $client "Command? ";
1597   }
1598   close $client;
1599 }
1600
1601=head1 UDP: Message Passing
1602
1603Another kind of client-server setup is one that uses not connections, but
1604messages.  UDP communications involve much lower overhead but also provide
1605less reliability, as there are no promises that messages will arrive at
1606all, let alone in order and unmangled.  Still, UDP offers some advantages
1607over TCP, including being able to "broadcast" or "multicast" to a whole
1608bunch of destination hosts at once (usually on your local subnet).  If you
1609find yourself overly concerned about reliability and start building checks
1610into your message system, then you probably should use just TCP to start
1611with.
1612
1613UDP datagrams are I<not> a bytestream and should not be treated as such.
1614This makes using I/O mechanisms with internal buffering like stdio (i.e.
1615print() and friends) especially cumbersome. Use syswrite(), or better
1616send(), like in the example below.
1617
1618Here's a UDP program similar to the sample Internet TCP client given
1619earlier.  However, instead of checking one host at a time, the UDP version
1620will check many of them asynchronously by simulating a multicast and then
1621using select() to do a timed-out wait for I/O.  To do something similar
1622with TCP, you'd have to use a different socket handle for each host.
1623
1624 #!/usr/bin/perl
1625 use v5.36;
1626 use Socket;
1627 use Sys::Hostname;
1628
1629 my $SECS_OF_70_YEARS = 2_208_988_800;
1630
1631 my $iaddr = gethostbyname(hostname());
1632 my $proto = getprotobyname("udp");
1633 my $port = getservbyname("time", "udp");
1634 my $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
1635
1636 socket(my $socket, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
1637 bind($socket, $paddr)                           || die "bind: $!";
1638
1639 $| = 1;
1640 printf "%-12s %8s %s\n",  "localhost", 0, scalar localtime();
1641 my $count = 0;
1642 for my $host (@ARGV) {
1643     $count++;
1644     my $hisiaddr = inet_aton($host)         || die "unknown host";
1645     my $hispaddr = sockaddr_in($port, $hisiaddr);
1646     defined(send($socket, 0, 0, $hispaddr)) || die "send $host: $!";
1647 }
1648
1649 my $rout = my $rin = "";
1650 vec($rin, fileno($socket), 1) = 1;
1651
1652 # timeout after 10.0 seconds
1653 while ($count && select($rout = $rin, undef, undef, 10.0)) {
1654     my $rtime = "";
1655     my $hispaddr = recv($socket, $rtime, 4, 0) || die "recv: $!";
1656     my ($port, $hisiaddr) = sockaddr_in($hispaddr);
1657     my $host = gethostbyaddr($hisiaddr, AF_INET);
1658     my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
1659     printf "%-12s ", $host;
1660     printf "%8d %s\n", $histime - time(), scalar localtime($histime);
1661     $count--;
1662 }
1663
1664This example does not include any retries and may consequently fail to
1665contact a reachable host. The most prominent reason for this is congestion
1666of the queues on the sending host if the number of hosts to contact is
1667sufficiently large.
1668
1669=head1 SysV IPC
1670
1671While System V IPC isn't so widely used as sockets, it still has some
1672interesting uses.  However, you cannot use SysV IPC or Berkeley mmap() to
1673have a variable shared amongst several processes.  That's because Perl
1674would reallocate your string when you weren't wanting it to.  You might
1675look into the C<IPC::Shareable> or C<threads::shared> modules for that.
1676
1677Here's a small example showing shared memory usage.
1678
1679    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);
1680
1681    my $size = 2000;
1682    my $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
1683    defined($id)                    || die "shmget: $!";
1684    print "shm key $id\n";
1685
1686    my $message = "Message #1";
1687    shmwrite($id, $message, 0, 60)  || die "shmwrite: $!";
1688    print "wrote: '$message'\n";
1689    shmread($id, my $buff, 0, 60)      || die "shmread: $!";
1690    print "read : '$buff'\n";
1691
1692    # the buffer of shmread is zero-character end-padded.
1693    substr($buff, index($buff, "\0")) = "";
1694    print "un" unless $buff eq $message;
1695    print "swell\n";
1696
1697    print "deleting shm $id\n";
1698    shmctl($id, IPC_RMID, 0)        || die "shmctl: $!";
1699
1700Here's an example of a semaphore:
1701
1702    use IPC::SysV qw(IPC_CREAT);
1703
1704    my $IPC_KEY = 1234;
1705    my $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
1706    defined($id)                    || die "semget: $!";
1707    print "sem id $id\n";
1708
1709Put this code in a separate file to be run in more than one process.
1710Call the file F<take>:
1711
1712    # create a semaphore
1713
1714    my $IPC_KEY = 1234;
1715    my $id = semget($IPC_KEY, 0, 0);
1716    defined($id)                    || die "semget: $!";
1717
1718    my $semnum  = 0;
1719    my $semflag = 0;
1720
1721    # "take" semaphore
1722    # wait for semaphore to be zero
1723    my $semop = 0;
1724    my $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);
1725
1726    # Increment the semaphore count
1727    $semop = 1;
1728    my $opstring2 = pack("s!s!s!", $semnum, $semop,  $semflag);
1729    my $opstring  = $opstring1 . $opstring2;
1730
1731    semop($id, $opstring)   || die "semop: $!";
1732
1733Put this code in a separate file to be run in more than one process.
1734Call this file F<give>:
1735
1736    # "give" the semaphore
1737    # run this in the original process and you will see
1738    # that the second process continues
1739
1740    my $IPC_KEY = 1234;
1741    my $id = semget($IPC_KEY, 0, 0);
1742    die unless defined($id);
1743
1744    my $semnum  = 0;
1745    my $semflag = 0;
1746
1747    # Decrement the semaphore count
1748    my $semop = -1;
1749    my $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
1750
1751    semop($id, $opstring)   || die "semop: $!";
1752
1753The SysV IPC code above was written long ago, and it's definitely
1754clunky looking.  For a more modern look, see the IPC::SysV module.
1755
1756A small example demonstrating SysV message queues:
1757
1758    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);
1759
1760    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
1761    defined($id)                || die "msgget failed: $!";
1762
1763    my $sent      = "message";
1764    my $type_sent = 1234;
1765
1766    msgsnd($id, pack("l! a*", $type_sent, $sent), 0)
1767                                || die "msgsnd failed: $!";
1768
1769    msgrcv($id, my $rcvd_buf, 60, 0, 0)
1770                                || die "msgrcv failed: $!";
1771
1772    my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf);
1773
1774    if ($rcvd eq $sent) {
1775        print "okay\n";
1776    } else {
1777        print "not okay\n";
1778    }
1779
1780    msgctl($id, IPC_RMID, 0)    || die "msgctl failed: $!\n";
1781
1782=head1 NOTES
1783
1784Most of these routines quietly but politely return C<undef> when they
1785fail instead of causing your program to die right then and there due to
1786an uncaught exception.  (Actually, some of the new I<Socket> conversion
1787functions do croak() on bad arguments.)  It is therefore essential to
1788check return values from these functions.  Always begin your socket
1789programs this way for optimal success, and don't forget to add the B<-T>
1790taint-checking flag to the C<#!> line for servers:
1791
1792    #!/usr/bin/perl -T
1793    use v5.36;
1794    use sigtrap;
1795    use Socket;
1796
1797=head1 BUGS
1798
1799These routines all create system-specific portability problems.  As noted
1800elsewhere, Perl is at the mercy of your C libraries for much of its system
1801behavior.  It's probably safest to assume broken SysV semantics for
1802signals and to stick with simple TCP and UDP socket operations; e.g., don't
1803try to pass open file descriptors over a local UDP datagram socket if you
1804want your code to stand a chance of being portable.
1805
1806=head1 AUTHOR
1807
1808Tom Christiansen, with occasional vestiges of Larry Wall's original
1809version and suggestions from the Perl Porters.
1810
1811=head1 SEE ALSO
1812
1813There's a lot more to networking than this, but this should get you
1814started.
1815
1816For intrepid programmers, the indispensable textbook is I<Unix Network
1817Programming, 2nd Edition, Volume 1> by W. Richard Stevens (published by
1818Prentice-Hall).  Most books on networking address the subject from the
1819perspective of a C programmer; translation to Perl is left as an exercise
1820for the reader.
1821
1822The IO::Socket(3) manpage describes the object library, and the Socket(3)
1823manpage describes the low-level interface to sockets.  Besides the obvious
1824functions in L<perlfunc>, you should also check out the F<modules> file at
1825your nearest CPAN site, especially
1826L<http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>.
1827See L<perlmodlib> or best yet, the F<Perl FAQ> for a description
1828of what CPAN is and where to get it if the previous link doesn't work
1829for you.
1830
1831Section 5 of CPAN's F<modules> file is devoted to "Networking, Device
1832Control (modems), and Interprocess Communication", and contains numerous
1833unbundled modules numerous networking modules, Chat and Expect operations,
1834CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
1835Threads, and ToolTalk--to name just a few.
1836