xref: /original-bsd/share/doc/usd/01.begin/u4 (revision 79cf7955)
@(#)u4 6.1 (Berkeley) 06/02/86

IV. PROGRAMMING

There will be no attempt made to teach any of the programming languages available but a few words of advice are in order. One of the reasons why the C UNIX system is a productive programming environment is that there is already a rich set of tools available, and facilities like pipes, I/O redirection, and the capabilities of the shell often make it possible to do a job by pasting together programs that already exist instead of writing from scratch.

The Shell

The pipe mechanism lets you fabricate quite complicated operations out of spare parts that already exist. For example, the first draft of the spell program was (roughly)

1 cat ... \f2collect the files\f3 | tr ... \f2put each word on a new line\f3 | tr ... \f2delete punctuation, etc.\f3 | sort \f2into dictionary order\f3 | uniq \f2discard duplicates\f3 | comm \f2print words in text\f3 \f2 but not in dictionary\f3

2 More pieces have been added subsequently, but this goes a long way for such a small effort.

The editor can be made to do things that would normally require special programs on other systems. For example, to list the first and last lines of each of a set of files, such as a book, you could laboriously type

1 ed e chap1.1 1p $p e chap1.2 1p $p etc.

2 But you can do the job much more easily. One way is to type

1 ls chap* >temp

2 to get the list of filenames into a file. Then edit this file to make the necessary series of editing commands (using the global commands of ed ), and write it into script . Now the command

1 ed <script

2 will produce the same output as the laborious hand typing. Alternately (and more easily), you can use the fact that the shell will perform loops, repeating a set of commands over and over again for a set of arguments:

1 for i in chap* do ed $i <script done

2 This sets the shell variable i to each file name in turn, then does the command. You can type this command at the terminal, or put it in a file for later execution.

Programming the Shell

An option often overlooked by newcomers is that the shell is itself a programming language, with variables, control flow if-else , ( while , for , case ), subroutines, and interrupt handling. Since there are many building-block programs, you can sometimes avoid writing a new program merely by piecing together some of the building blocks with shell command files.

We will not go into any details here; examples and rules can be found in .ul An Introduction to the .ul C UNIX T Shell , by S. R. Bourne.

Programming in C

If you are undertaking anything substantial, C is the only reasonable choice of programming language: everything in the C UNIX system is tuned to it. The system itself is written in C, as are most of the programs that run on it. It is also a easy language to use once you get started. C is introduced and fully described in .ul The C Programming Language by B. W. Kernighan and D. M. Ritchie (Prentice-Hall, 1978). Several sections of the manual describe the system interfaces, that is, how you do I/O and similar functions. Read .ul UNIX Programming for more complicated things.

Most input and output in C is best handled with the standard I/O library, which provides a set of I/O functions that exist in compatible form on most machines that have C compilers. In general, it's wisest to confine the system interactions in a program to the facilities provided by this library.

C programs that don't depend too much on special features of C UNIX (such as pipes) can be moved to other computers that have C compilers. The list of such machines grows daily; in addition to the original C PDP -11, it currently includes at least Honeywell 6000, IBM 370 and PC families, Interdata 8/32, Data General Nova and Eclipse, HP 2100, Harris /7, Motorola 68000 family (including machines like Sun Microsystems and Apple Macintosh), VAX 11 family, SEL 86, and Zilog Z80. Calls to the standard I/O library will work on all of these machines.

There are a number of supporting programs that go with C. lint checks C programs for potential portability problems, and detects errors such as mismatched argument types and uninitialized variables.

For larger programs (anything whose source is on more than one file) make allows you to specify the dependencies among the source files and the processing steps needed to make a new version; it then checks the times that the pieces were last changed and does the minimal amount of recompiling to create a consistent updated version.

The debugger adb is useful for digging through the dead bodies of C programs, but is rather hard to learn to use effectively. The most effective debugging tool is still careful thought, coupled with judiciously placed print statements.\(dg .FS \(dg The "dbx" debugger, supplied starting with 4.2BSD, has extensive facilities for high-level debugging of C programs and is much easier to use than "adb". .FE

The C compiler provides a limited instrumentation service, so you can find out where programs spend their time and what parts are worth optimizing. Compile the routines with the -p option; after the test run, use prof to print an execution profile. The command time will give you the gross run-time statistics of a program, but they are not super accurate or reproducible.

Other Languages

If you .ul have to use Fortran, there are two possibilities. You might consider Ratfor, which gives you the decent control structures and free-form input that characterize C, yet lets you write code that is still portable to other environments. Bear in mind that C UNIX Fortran tends to produce large and relatively slow-running programs. Furthermore, supporting software like adb , prof , etc., are all virtually useless with Fortran programs. There may also be a Fortran 77 compiler on your system. If so, this is a viable alternative to Ratfor, and has the non-trivial advantage that it is compatible with C and related programs. (The Ratfor processor and C tools can be used with Fortran 77 too.)

If your application requires you to translate a language into a set of actions or another language, you are in effect building a compiler, though probably a small one. In that case, you should be using the yacc compiler-compiler, which helps you develop a compiler quickly. The lex lexical analyzer generator does the same job for the simpler languages that can be expressed as regular expressions. It can be used by itself, or as a front end to recognize inputs for a yacc -based program. Both yacc and lex require some sophistication to use, but the initial effort of learning them can be repaid many times over in programs that are easy to change later on.

Most C UNIX systems also make available other languages, such as Algol 68, APL, Basic, Lisp, Pascal, and Snobol. Whether these are useful depends largely on the local environment: if someone cares about the language and has worked on it, it may be in good shape. If not, the odds are strong that it will be more trouble than it's worth.