sh/USD.doc/t3

         $NetBSD: t3,v 1.3 2010/08/22 02:19:07 perry Exp $

 Copyright (C) Caldera International Inc. 2001-2002. All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are
 met:

 Redistributions of source code and documentation must retain the above
 copyright notice, this list of conditions and the following
 disclaimer.

 Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.

 All advertising materials mentioning features or use of this software
 must display the following acknowledgement:

 This product includes software developed or owned by Caldera
 International, Inc. Neither the name of Caldera International, Inc.
 nor the names of other contributors may be used to endorse or promote
 products derived from this software without specific prior written
 permission.

 USE OF THE SOFTWARE PROVIDED FOR UNDER THIS LICENSE BY CALDERA
 INTERNATIONAL, INC. AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL CALDERA INTERNATIONAL, INC. BE LIABLE
 FOR ANY DIRECT, INDIRECT INCIDENTAL, SPECIAL, EXEMPLARY, OR
 CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
 OR OTHERWISE) RISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
 IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

 @(#)t3 8.1 (Berkeley) 6/8/93

3.0 Keyword parameters

Shell variables may be given values
by assignment
or when a shell script is invoked.
An argument to a command of the form
name=value
that precedes the command name
causes value
to be assigned to name
before execution of the command begins.
The value of name in the invoking
shell is not affected.
For example,

 user=fred command

will execute command with
user set to fred.
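
The effect can be checked with a short sketch. It assumes a POSIX shell is installed as sh; the name seen_by_child is introduced here only for illustration.

```shell
# Start with user unset in the invoking shell.
unset user

# The assignment in front of the command name applies to that command only.
seen_by_child=$(user=fred sh -c 'echo "$user"')
echo "child saw: $seen_by_child"

# The invoking shell's own variable is untouched (still unset here).
echo "parent has: ${user-unset}"
```

The child prints the value it was given, while the invoking shell still reports the variable as unset.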

The set command
may also be used to set positional parameters
from within a script.
For example,

 set -- *

will set $1 to the first file name
in the current directory, $2 to the next,
and so on.
Note that the first argument, --, ensures correct treatment
when the first file name begins with a -.
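
As a small sketch of the same idea with fixed words instead of file names (the words themselves are arbitrary and chosen only for illustration):

```shell
# Replace the positional parameters. The -- ends option parsing,
# so a first word beginning with - is taken as data.
set -- -n "two words" three

count=$#
first=$1
second=$2
echo "$count parameters; first is $first, second is $second"
```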

3.1 Parameter transmission

When a command is executed both positional parameters
and shell variables may be set on invocation.
Variables are also made available implicitly
to a command
by specifying in advance that such parameters
are to be exported from the invoking shell.
For example,

 export user box=red

marks the variables user and box
for export (setting box to ``red'' in the process).
When a command is invoked
copies are made of all exportable variables
(also known as environment variables)
for use within the invoked program.
Modification of such variables
within an invoked command does not
affect the values in the invoking shell.
It is generally true of
a shell script or other program
that it
cannot modify the state
of its caller without explicit
actions on the part of the caller.
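
A minimal sketch of these rules, assuming a POSIX shell installed as sh. The variable local_only and the result names are hypothetical and used only for illustration.

```shell
user=fred
export user box=red      # mark both for export, setting box on the way
local_only=green         # never exported

# The child gets a copy of box and changes only its own copy.
child_box=$(sh -c 'box=blue; echo "$box"')

# A variable that was not exported is not visible to the child at all.
child_local=$(sh -c 'echo "${local_only-unset}"')

echo "child box: $child_box, parent box: $box"
echo "child sees local_only as: $child_local"
```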

Names whose value is intended to remain
constant may be declared readonly.
The form of this command is the same as that of the export
command,

 readonly name[=value] ...

Subsequent attempts to set readonly variables
are illegal.
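
The following sketch shows one way to observe this. The names limit and outcome are hypothetical. The assignment is tried in a subshell because a failed assignment may cause a non-interactive shell to exit.

```shell
readonly limit=10

# Attempt to change the value inside a subshell and record what happened.
if ( limit=20 ) 2>/dev/null
then outcome=changed
else outcome=rejected
fi

echo "limit=$limit ($outcome)"
```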

3.2 Parameter substitution

If a shell parameter is not set
then the null string is substituted for it.
For example, if the variable d
is not set

 echo $d

or

 echo ${d}

will echo nothing.
A default string may be given
as in

 echo ${d:-.}

which will echo
the value of the variable d
if it is set and not null and `.' otherwise.
The default string is evaluated using the usual
quoting conventions so that

 echo ${d:-'*'}

will echo * if the variable d
is not set or null.
Similarly

 echo ${d:-$1}

will echo the value of d if it is set and not null
and the value (if any) of $1 otherwise.

The notation ${d:+.} performs the inverse operation. It
substitutes `.' if d is set and not null, and otherwise
substitutes null.
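
The two forms can be compared side by side. This is only a sketch; the result variable names are introduced for illustration.

```shell
unset d
unset_default=${d:-.}     # d unset: the default is used
unset_alt=${d:+.}         # d unset: null is substituted

d=/tmp
set_default=${d:-.}       # d set and not null: its value is used
set_alt=${d:+.}           # d set and not null: the alternative is used

echo "unset: default='$unset_default' alt='$unset_alt'"
echo "set:   default='$set_default' alt='$set_alt'"
```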

A variable may be assigned a default value
using
the notation

 echo ${d:=.}

which substitutes the same string as

 echo ${d:-.}

and, if d was previously unset or null,
also sets d to the string `.'.

If there is no sensible default then
the notation

 echo ${d:?message}

will echo the value of the variable d if it is set and not null,
otherwise message is printed by the shell and
execution of the shell script is abandoned.
If message is absent then a standard message
is printed.
A shell script that requires some variables
to be set might start as follows:

 : ${user:?} ${acct:?} ${bin:?}
 ...

Colon (:) is a command
that is
built in to the shell and does nothing
once its arguments have been evaluated.
If any of the variables user, acct
or bin is not set, or is null, then the shell
will abandon execution of the script.
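
A short sketch of both notations follows. It assumes acct is not set in the environment. The check is run in a subshell so that the abandoned execution does not end the example itself; the names first and check are hypothetical.

```shell
unset d
first=${d:=.}     # substitutes . and also assigns it to d
echo "first=$first d=$d"

# ${name:?message} abandons a non-interactive shell,
# so the test is confined to a subshell here.
unset acct
if ( : "${acct:?account not set}" ) 2>/dev/null
then check=passed
else check=failed
fi
echo "check $check"
```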

3.3 Command substitution

The standard output from a command can be
substituted in a similar way to parameters.
The command pwd prints on its standard
output the name of the current directory.
For example, if the current directory is
/usr/fred/bin
then the command

 d=$(pwd)

(or the older notation d=`pwd`)
is equivalent to

 d=/usr/fred/bin


The entire string inside $(...) (or between grave accents `...`)
is taken as the command
to be executed
and is replaced with the output from
the command.
(The difference between the $(...) and `...` notations is that
the former may be nested directly, while the latter can be nested
only by escaping the inner grave accents.)
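
As an illustration, the sketch below captures the same output with both notations and then nests one $(...) inside another. The path is taken from the example above and need not exist, since basename and dirname only manipulate strings.

```shell
# Both notations capture the standard output of pwd.
new_style=$(pwd)
old_style=`pwd`

# $(...) nests directly: the inner substitution is performed first.
parent=$(basename "$(dirname /usr/fred/bin)")
echo "$parent"
```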

The command is written using the usual quoting conventions,
except that inside `...`
a ` must be escaped using
a \.
For example,

 ls $(echo "$HOME")

is equivalent to

 ls $HOME

Command substitution occurs in all contexts
where parameter substitution occurs (including here documents) and the
treatment of the resulting text is the same
in both cases.
This mechanism allows string
processing commands to be used within
shell scripts.
An example of such a command is basename
which removes a specified suffix from a string.
For example,

 basename main.c .c

will print the string main.
Its use is illustrated by the following
fragment from a cc command.

 case $A in
      ...
      *.c) B=$(basename $A .c)
      ...
 esac

that sets B to the part of $A
with the suffix .c stripped.
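
A self-contained variant of the fragment might look as follows. The function strip_c is a hypothetical helper written for this sketch and is not part of any cc command.

```shell
# Print the argument with a trailing .c removed, if it has one.
strip_c() {
    case $1 in
    *.c) basename "$1" .c ;;
    *)   echo "$1" ;;
    esac
}

B=$(strip_c main.c)
echo "$B"
```

A name without the suffix, such as notes.txt, falls through to the second branch and is printed unchanged.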

Here are some composite examples.

 • for i in `ls -t`; do ...

   The variable i is set
   to the names of files in time order,
   most recent first.

 • set -- `date`; echo $6 $2 $3, $4

   will print, e.g.,
   1977 Nov 1, 23:59:59

3.4 Arithmetic Expansion

Within a $((...)) construct, integer arithmetic operations are
evaluated.
(The $ in front of variable names is optional within $((...)).)
For example:

 x=5; y=1
 echo $(($x+3*2))
 echo $((y+=x))
 echo $y

will print `11', then `6', then `6' again.
Most of the constructs permitted in C arithmetic operations are
permitted, though some (like `++') are not universally supported;
see the shell manual page for details.
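
A common use is keeping counters in a loop. The sketch below avoids `++' for the reason just given; the variable names and the numbers are arbitrary.

```shell
count=0
total=0
for n in 3 4 5
do
    count=$((count + 1))     # the $ before names is optional here
    total=$((total + n))
done

average=$((total / count))   # integer division
echo "$count values, total $total, average $average"
```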

3.5 Evaluation and quoting

The shell is a macro processor that
provides parameter substitution, command substitution and file
name generation for the arguments to commands.
This section discusses the order in which
these evaluations occur and the
effects of the various quoting mechanisms.

Commands are parsed initially according to the grammar
given in appendix A.
Before a command is executed
the following
substitutions occur.

 • parameter substitution, e.g. $user
 • command substitution, e.g. $(pwd) or `pwd`
 • arithmetic expansion, e.g. $(($count+1))


Only one evaluation occurs so that if, for example, the value of the variable
X
is the string $y
then

 echo $X

will echo $y.

 • blank interpretation

Following the above substitutions
the resulting characters
are broken into non-blank words (blank interpretation).
For this purpose `blanks' are the characters of the string
$IFS.
By default, this string consists of blank, tab and newline.
The null string
is not regarded as a word unless it is quoted.
For example,

 echo ''

will pass on the null string as the first argument to echo,
whereas

 echo $null

will call echo with no arguments
if the variable null is not set
or set to the null string.

 • file name generation

Each word
is then scanned for the file pattern characters
*, ? and [...]
and an alphabetical list of file names
is generated to replace the word.
Each such file name is a separate argument.


The evaluations just described also occur
in the list of words associated with a for
loop.
Only substitution occurs
in the word used
for a case branch.
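
The blank interpretation step described above can be observed by counting arguments. This sketch assumes the default value of IFS; count_args is a hypothetical helper.

```shell
# Print the number of arguments received.
count_args() { echo "$#"; }

null=
quoted=$(count_args '')         # a quoted null string is one argument
unquoted=$(count_args $null)    # an unquoted null value yields no argument

words="a b  c"
split=$(count_args $words)      # broken at blanks into three words
kept=$(count_args "$words")     # quoting suppresses the splitting

echo "$quoted $unquoted $split $kept"
```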

As well as the quoting mechanisms described
earlier using \ and '...'
a third quoting mechanism is provided using double quotes.
Within double quotes parameter and command substitution
occur but file name generation and the interpretation
of blanks do not.
The following characters
have a special meaning within double quotes
and may be quoted using \.

 $     parameter substitution
 $()   command substitution
 `     command substitution
 "     ends the quoted string
 \     quotes the special characters $ ` " \

For example,

 echo "$x"

will pass the value of the variable x as a
single argument to echo.
Similarly,

 echo "$*"

will pass the positional parameters as a single
argument and is equivalent to

 echo "$1 $2 ..."

The notation $@
is the same as $*
except when it is quoted.

 echo "$@"

will pass the positional parameters, unevaluated, to echo
and is equivalent to

 echo "$1" "$2" ...
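
The difference is easiest to see by counting the arguments a command receives. The sketch assumes the default IFS; count_args is a hypothetical helper.

```shell
# Print the number of arguments received.
count_args() { echo "$#"; }

set -- "one two" three

star=$(count_args "$*")   # a single argument: one two three
at=$(count_args "$@")     # two arguments, each passed intact
bare=$(count_args $*)     # unquoted: split again into three words

echo "$star $at $bare"
```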


The following table gives, for each quoting mechanism,
the shell metacharacters that are evaluated.


      metacharacter
      \    $    *    `    "    '
 '    n    n    n    n    n    t
 `    y    n    n    t    n    n
 "    y    y    n    y    t    n

 t    terminator
 y    interpreted
 n    not interpreted

Figure 2. Quoting mechanisms


In cases where more than one evaluation of a string
is required the built-in command eval
may be used.
For example,
if the variable X has the value
$y, and if y has the value pqr
then

 eval echo $X

will echo the string pqr.

In general the eval command
evaluates its arguments (as do all commands)
and treats the result as input to the shell.
The input is read and the resulting command(s)
executed.
For example,

 wg='eval who|grep'
 $wg fred

is equivalent to

 who|grep fred

In this example,
eval is required
since there is no interpretation
of metacharacters, such as |, following
substitution.
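
Both uses of eval can be tried without depending on who. In this sketch the pipeline stored in cmd, built from printf and grep, is a hypothetical stand-in for who|grep.

```shell
y=pqr
X='$y'
once=$(echo $X)          # one evaluation: the literal string $y
twice=$(eval echo $X)    # eval scans the result again: the value of y

# A pipeline held in a variable needs eval for the | to be recognised.
cmd='printf "%s\n" fred alice | grep fred'
found=$(eval "$cmd")

echo "$once $twice $found"
```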

3.6 Error handling

The treatment of errors detected by
the shell depends on the type of error
and on whether the shell is being
used interactively.
An interactive shell is one whose
input and output are connected
to a terminal.
A shell invoked with the -i
flag is also interactive.

Execution of a command (see also 3.8) may fail
for any of the following reasons.

 • Input-output redirection may fail,
   for example, if a file does not exist
   or cannot be created.
 • The command itself does not exist
   or cannot be executed.
 • The command terminates abnormally,
   for example, with a "bus error"
   or "memory fault".
   See Figure 3 below for a complete list
   of UNIX signals.
 • The command terminates normally
   but returns a non-zero exit status.

In all of these cases the shell
will go on to execute the next command.
Except for the last case an error
message will be printed by the shell.
All remaining errors cause the shell
to exit from a script.
An interactive shell will return
to read another command from the terminal.
Such errors include the following.
 • Syntax errors,
   e.g., if ... then ... done
 • A signal such as interrupt.
   The shell waits for the current
   command, if any, to finish execution and
   then either exits or returns to the terminal.
 • Failure of any of the built-in commands
   such as cd.

The shell flag -e
causes the shell to terminate
if any error is detected.
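
The effect of -e can be seen by running the same two commands with and without it. This sketch assumes a POSIX shell installed as sh; the result names are hypothetical.

```shell
# By default a failing command does not stop the script.
plain=$(sh -c 'false; echo continued')

# With -e the shell terminates at the first failing command.
strict_status=0
strict=$(sh -e -c 'false; echo continued') || strict_status=$?

echo "without -e: '$plain'"
echo "with -e:    '$strict' (exit status $strict_status)"
```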

1 hangup
2 interrupt
3* quit
4* illegal instruction
5* trace trap
6* IOT instruction
7* EMT instruction
8* floating point exception
9 kill (cannot be caught or ignored)
10* bus error
11* segmentation violation
12* bad argument to system call
13 write on a pipe with no one to read it
14 alarm clock
15 software termination (from kill (1))


Figure 3. UNIX signals†

† Additional signals have been added in modern Unix.
See sigvec(2) or signal(3) for an up-to-date list.

Those signals marked with an asterisk
produce a core dump
if not caught.
However,
the shell itself ignores quit which is the only
external signal that can cause a dump.
The signals in this list of potential interest
to shell programs are 1, 2, 3, 14 and 15.

3.7 Fault handling

Shell scripts normally terminate
when an interrupt is received from the
terminal.
The trap command is used
if some cleaning up is required, such
as removing temporary files.
For example,

 trap 'rm /tmp/ps$$; exit' 2

sets a trap for signal 2 (terminal
interrupt), and if this signal is received
will execute the commands

 rm /tmp/ps$$; exit

exit is
another built-in command
that terminates execution of a shell script.
The exit is required; otherwise,
after the trap has been taken,
the shell will resume executing
the script
at the place where it was interrupted.

UNIX signals can be handled in one of three ways.
They can be ignored, in which case
the signal is never sent to the process.
They can be caught, in which case the process
must decide what action to take when the
signal is received.
Lastly, they can be left to cause
termination of the process without
it having to take any further action.
If a signal is being ignored
on entry to the shell script, for example,
by invoking it in the background (see 3.8), then trap
commands (and the signal) are ignored.

The use of trap is illustrated
by this modified version of the touch
command (Figure 4).
The cleanup action is to remove the file junk$$.

 #!/bin/sh

 flag=
 trap 'rm -f junk$$; exit' 1 2 3 15
 for i
 do case $i in
         -c) flag=N ;;
         *)  if test -f $i
             then cp $i junk$$; mv junk$$ $i
             elif test $flag
             then echo file \'$i\' does not exist
             else >$i
             fi
    esac
 done

Figure 4. The touch command

The trap command
appears before the creation
of the temporary file;
otherwise it would be
possible for the process
to die without removing
the file.
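
The ordering can be exercised with a child shell that sends itself signal 15. This is only a sketch: it assumes a POSIX shell installed as sh and a writable temporary directory, and the names script, tmp, output, status and left are hypothetical.

```shell
tmp=${TMPDIR:-/tmp}/junk$$

# Script for the child shell: the trap is set before the file is created.
script='
tmp=$1
trap "rm -f \"\$tmp\"; exit 1" 1 2 15
: >"$tmp"
kill -15 $$            # simulate a termination signal
echo "not reached"
'

status=0
output=$(sh -c "$script" sh "$tmp") || status=$?

# The trap should have removed the temporary file.
if test -f "$tmp"; then left=yes; else left=no; fi
echo "status=$status output='$output' file left behind: $left"
```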

Since there is no signal 0 in UNIX
it is used by the shell to indicate the
commands to be executed on exit from the
shell script.

A script may, itself, elect to
ignore signals by specifying the null
string as the argument to trap.
The following fragment is taken from the
nohup command.

 trap '' 1 2 3 15

which causes hangup, interrupt, quit and software termination
to be ignored both by the
script and by invoked commands.

Traps may be reset by saying

 trap 2 3

which resets the traps for signals 2 and 3 to their default values.
A list of the current values of traps may be obtained
by writing

 trap
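
A brief sketch of the exit trap (signal 0) and of resetting it follows. It assumes a POSIX shell installed as sh and uses the form trap - 0, which POSIX also accepts for resetting a trap; the result names are hypothetical.

```shell
# Commands trapped on 0 run when the shell exits.
with_trap=$(sh -c 'trap "echo cleanup" 0; echo working')

# After the trap is reset, nothing extra runs on exit.
reset=$(sh -c 'trap "echo cleanup" 0; trap - 0; echo working')

echo "$with_trap"
echo "$reset"
```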


The script scan (Figure 5) is an example
of the use of trap where there is no exit
in the trap command.
scan takes each directory
in the current directory, prompts
with its name, and then executes
commands typed at the terminal
until an end of file or an interrupt is received.
Interrupts are ignored while executing
the requested commands but cause
termination when scan is
waiting for input.

 d=`pwd`
 for i in *
 do if test -d $d/$i
    then cd $d/$i
         while echo "$i:"
               trap exit 2
               read x
         do trap : 2; eval $x; done
    fi
 done

Figure 5. The scan command

read x is a built-in command that reads one line from the
standard input
and places the result in the variable x.
It returns a non-zero exit status if either
an end-of-file is read or an interrupt
is received.
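
The end-of-file behaviour of read is what terminates a loop such as the one sketched below. The input lines and the names lines and last are arbitrary; the -r option, which stops read from treating \ specially, is a choice made for this sketch.

```shell
lines=0
last=
# read returns a non-zero status at end of file, which ends the loop.
while read -r x
do
    lines=$((lines + 1))
    last=$x
done <<EOF
first
second
EOF

echo "read $lines lines, the last was $last"
```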

3.8 Command execution

To run a command (other than a built-in) the shell first creates
a new process using the system call fork.
The execution environment for the command
includes input, output and the states of signals, and
is established in the child process
before the command is executed.
The built-in command exec
is used in the rare cases when no fork
is required
and simply replaces the shell with a new command.
For example, a simple version of the nohup
command looks like

 trap '' 1 2 3 15
 exec $*

The trap turns off the signals specified
so that they are ignored by subsequently created commands
and exec replaces the shell by the command
specified.
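
That exec replaces the shell, rather than returning to it, can be checked with a sketch like this one. It assumes a POSIX shell installed as sh and an echo program on the search path; the name result is hypothetical.

```shell
# The shell is replaced by echo, so the second command is never run.
result=$(sh -c 'exec echo replaced; echo "not reached"')
echo "$result"
```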

Most forms of input-output redirection have already been
described.
In the following, word is subject only
to parameter and command substitution.
No file name generation or blank interpretation
takes place so that, for example,

 echo ... >*.c

will write its output into a file whose name is *.c.
Input output specifications are evaluated left to right
as they appear in the command.

 > word
The standard output (file descriptor 1)
is sent to the file word which is
created if it does not already exist.

 >> word
The standard output is sent to file word.
If the file exists then output is appended
(by seeking to the end);
otherwise the file is created.

 < word
The standard input (file descriptor 0)
is taken from the file word.

 << word
The standard input is taken from the lines
of shell input that follow up to but not
including a line consisting only of word.
If word is quoted then no interpretation
of the document occurs.
If word is not quoted
then parameter and command substitution
occur and \ is used to quote
the characters \ $ ` and the first character
of word.
In the latter case \newline is ignored (cf. quoted strings).

 >& digit
The file descriptor digit is duplicated using the system
call dup (2)
and the result is used as the standard output.

 <& digit
The standard input is duplicated from file descriptor digit.

 <&-
The standard input is closed.

 >&-
The standard output is closed.
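
The effect of quoting the word of a here document is sketched below. The variable names and the delimiter EOF are arbitrary choices for illustration.

```shell
name=fred

# Unquoted word: parameter substitution occurs in the document.
read -r expanded <<EOF
user is $name
EOF

# Quoted word: the document is taken literally.
read -r literal <<'EOF'
user is $name
EOF

echo "$expanded"
echo "$literal"
```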

Any of the above may be preceded by a digit in which
case the file descriptor created is that specified by the digit
instead of the default 0 or 1.
For example,

 ... 2>file

runs a command with message output (file descriptor 2)
directed to file.

 ... 2>&1

runs a command with its standard output and message output
merged.
(Strictly speaking file descriptor 2 is created
by duplicating file descriptor 1 but the effect is usually to
merge the two streams.)
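
Because the specifications are evaluated left to right, the order in which they are written matters. In this sketch emit is a hypothetical function that writes one line to each stream.

```shell
# Write one line to standard output and one to message output.
emit() { echo out; echo err >&2; }

only_out=$(emit 2>/dev/null)     # message output discarded
merged=$(emit 2>&1)              # both streams captured together

# Here 2 is first duplicated from the current 1 (the capture),
# and only then is 1 redirected away.
only_err=$(emit 2>&1 >/dev/null)

echo "$only_out / $only_err"
```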

3.9 Invoking the shell

The following flags are interpreted by the shell
when it is invoked.
If the first character of argument zero is a minus,
then commands are read from the file .profile.
 -c string
If the -c flag is present then
commands are read from string.
 -s
If the -s flag is present or if no
arguments remain
then commands are read from the standard input.
Shell output is written to
file descriptor 2.
 -i
If the -i flag is present or
if the shell input and output are attached to a terminal (as told by gtty)
then this shell is interactive.
In this case TERMINATE is ignored (so that kill 0
does not kill an interactive shell) and INTERRUPT is caught and ignored
(so that wait is interruptible).
In all cases QUIT is ignored by the shell.
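
The -c and -s flags can be tried as follows. The sketch assumes a POSIX shell installed as sh; the word arg and the result names are hypothetical.

```shell
# -c: commands are read from the string.
from_string=$(sh -c 'echo "from -c"')

# -s: commands are read from the standard input;
# remaining arguments become positional parameters.
from_stdin=$(echo 'echo "from stdin: $1"' | sh -s arg)

echo "$from_string"
echo "$from_stdin"
```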

3.10 Job Control

When a command or pipeline (also known as a job) is running in
the foreground, entering the stop character (typically
CONTROL-Z, but user-settable with the stty(1) command)
will usually cause the job to stop.

The jobs associated with the current shell may be listed by entering
the jobs command.
Each job has an associated job number.
Jobs that are stopped may be continued in the background by entering

 bg %jobnumber

and jobs may be moved to the foreground by entering

 fg %jobnumber

If there is a sole job with a particular name (say only one instance
of cc running), fg and bg may also use the name of the
command in place of the number, as in:

 bg %cc

If no `%' clause is entered, the most recently stopped job
(indicated with a `+' by the jobs command) will be assumed.
See the manual page for the shell for more details.

3.11 Aliases

The alias command creates a so-called shell alias, which is an
abbreviation that macro-expands at run time into some other command.
For example:

 alias ls="ls -F"

would cause the command sequence ls -F to be executed whenever
the user types ls into the shell.
Note that if the user types ls -a, the shell will in fact
execute ls -F -a.
The command alias on its own prints out all current aliases.
The unalias command, as in:

 unalias ls

will remove an existing alias.
Aliases can shadow pre-existing commands, as in the example above.
They are typically used to override the interactive behavior of
commands in small ways, for example to always invoke some program with
a favorite option, and are almost never found in scripts.

3.12 Command Line Editing and Recall

When working interactively with the shell, it is often tedious to
retype previously entered commands, especially if they are complicated.
The shell therefore maintains a so-called history, which is
stored in the file specified by the HISTFILE environment
variable if it is set.
Users may view, edit, and re-enter previous lines of input using
a small subset of the commands of the vi(1) or
emacs(1)† editors.

† Technically, vi command editing is standardized by POSIX while emacs
is not.
However, all modern shells support both styles.

Emacs style editing may be selected by entering

 set -o emacs

and vi style editing may be selected with

 set -o vi

The details of how command line editing works are beyond the scope of
this document.
See the shell manual page for details.

Acknowledgements

The design of the shell is
based in part on the original UNIX shell (Thompson)
and the PWB/UNIX shell (Mashey),
some
features having been taken from both.
Similarities also exist with the
command interpreters
of the Cambridge Multiple Access System (Hartley)
and of CTSS.

I would like to thank Dennis Ritchie
and John Mashey for many
discussions during the design of the shell.
I am also grateful to the members of the Computing Science Research Center
and to Joe Maranzano for their
comments on drafts of this document.