xref: /original-bsd/share/doc/usd/01.begin/u2 (revision c3e32dec)

@(#)u2 8.1 (Berkeley) 06/08/93

II. DAY-TO-DAY USE
Creating Files - The Editor

If you have to type a paper or a letter or a program, how do you get the information stored in the machine? Most of these tasks are done with the UNIX ``text editor'' ed. Since ed is thoroughly documented in ed(1) and explained in A Tutorial Introduction to the UNIX Text Editor, we won't spend any time here describing how to use it. All we want it for right now is to make some files. (A file is just a collection of information stored in the machine, a simplistic but adequate definition.)

To create a file called junk with some text in it, do the following:

    ed junk     (invokes the text editor)
    a           (command to ``ed'', to add text)
    now type in
    whatever text you want ...
    .           (signals the end of adding text)

The ``.'' that signals the end of adding text must be at the beginning of a line by itself. Don't forget it, for until it is typed, no other ed commands will be recognized; everything you type will be treated as text to be added.

At this point you can do various editing operations on the text you typed in, such as correcting spelling mistakes, rearranging paragraphs and the like. Finally, you must write the information you have typed into a file with the editor command w:

    w

ed will respond with the number of characters it wrote into the file junk.

Until the w command, nothing is stored permanently, so if you hang up and go home the information is lost. (This is not strictly true: if you hang up while editing, the data you were working on is saved in a file called ed.hup, which you can continue with at your next session.) But after w the information is there permanently; you can re-access it any time by typing

    ed junk

Type a q command to quit the editor. (If you try to quit without writing, ed will print a ? to remind you. A second q gets you out regardless.)

Now create a second file called temp in the same manner. You should now have two files, junk and temp.

What files are out there?

The ls (for ``list'') command lists the names (not contents) of any of the files that UNIX knows about. If you type

    ls

the response will be

    junk
    temp

which are indeed the two files just created. The names are sorted into alphabetical order automatically, but other variations are possible. For example, the command

    ls -t

causes the files to be listed in the order in which they were last changed, most recent first. The -l option gives a ``long'' listing:

    ls -l

will produce something like

    -rw-rw-rw- 1 bwk  41 Jul 22 2:56 junk
    -rw-rw-rw- 1 bwk  78 Jul 22 2:57 temp

The date and time are of the last change to the file. The 41 and 78 are the number of characters (which should agree with the numbers you got from ed). bwk is the owner of the file, that is, the person who created it. The -rw-rw-rw- tells who has permission to read and write the file, in this case everyone.

Options can be combined: ls -lt gives the same thing as ls -l, but sorted into time order. You can also name the files you're interested in, and ls will list the information about them only. More details can be found in ls(1).

The use of optional arguments that begin with a minus sign, like -t and -lt, is a common convention for UNIX programs. In general, if a program accepts such optional arguments, they precede any filename arguments. It is also vital that you separate the various arguments with spaces: ls-l is not the same as ls -l.
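The listing options described above can be tried safely in a scratch directory. What follows is only a sketch, not part of the original tutorial: the scratch directory made with mktemp, the use of output redirection instead of ed to create the files, and the touch -t calls that pin the modification times are all assumptions made to keep the example non-interactive and its ordering predictable.

```shell
# Assumption: mktemp and touch are available; neither is discussed in the text.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Create the two files by redirection rather than with ed.
echo "some text" >junk
echo "some more text" >temp

# Give the files known modification times: junk older, temp newer.
touch -t 202001010000 junk
touch -t 202001020000 temp

by_name=$(ls)       # names in alphabetical order
by_time=$(ls -t)    # most recently changed first

echo "$by_name"
echo "$by_time"
ls -l junk          # long listing restricted to one named file
```

The exact columns printed by ls -l differ from system to system, so the long listing is shown but not relied upon.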

Printing Files

Now that you've got a file of text, how do you print it so people can look at it? There are a host of programs that do that, probably more than are needed.

One simple thing is to use the editor, since printing is often done just before making changes anyway. You can say

    ed junk
    1,$p

ed will reply with the count of the characters in junk and then print all the lines in the file. After you learn how to use the editor, you can be selective about the parts you print.

There are times when it's not feasible to use the editor for printing. For example, there is a limit on how big a file ed can handle (several thousand lines). Secondly, it will only print one file at a time, and sometimes you want to print several, one after another. So here are a couple of alternatives.

First is cat, the simplest of all the printing programs. cat simply prints on the terminal the contents of all the files named in a list. Thus

    cat junk

prints one file, and

    cat junk temp

prints two. The files are simply concatenated (hence the name ``cat'') onto the terminal.

pr produces formatted printouts of files. As with cat, pr prints all the files named in a list. The difference is that it produces headings with date, time, page number and file name at the top of each page, and extra lines to skip over the fold in the paper. Thus,

    pr junk temp

will print junk neatly, then skip to the top of a new page and print temp neatly.

pr can also produce multi-column output:

    pr -3 junk

prints junk in 3-column format. You can use any reasonable number in place of ``3'' and pr will do its best. pr has other capabilities as well; see pr(1).

It should be noted that pr is not a formatting program in the sense of shuffling lines around and justifying margins. The true formatters are nroff and troff, which we will get to in the section on document preparation.

There are also programs that print files on a high-speed printer. Look in your manual under opr and lpr. Which to use depends on what equipment is attached to your machine.

Shuffling Files About

Now that you have some files in the file system and some experience in printing them, you can try bigger things. For example, you can move a file from one place to another (which amounts to giving it a new name), like this:

    mv junk precious

This means that what used to be ``junk'' is now ``precious''. If you do an ls command now, you will get

    precious
    temp

Beware that if you move a file to another one that already exists, the already existing contents are lost forever.

If you want to make a copy of a file (that is, to have two versions of something), you can use the cp command:

    cp precious temp1

makes a duplicate copy of precious in temp1.

Finally, when you get tired of creating and moving files, there is a command to remove files from the file system, called rm.

    rm temp temp1

will remove both of the files named.

You will get a warning message if one of the named files wasn't there, but otherwise rm, like most UNIX commands, does its work silently. There is no prompting or chatter, and error messages are occasionally curt. This terseness is sometimes disconcerting to newcomers, but experienced users find it desirable.
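The move, copy and remove steps can be replayed in order to watch the directory change. This is a hedged sketch rather than anything from the original text: the scratch directory from mktemp and the file contents are assumptions, and the files are made by redirection so that no editor session is needed.

```shell
# Assumption: mktemp is available; the file contents are illustrative only.
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "first file" >junk
echo "second file" >temp

mv junk precious        # what used to be junk is now precious
cp precious temp1       # temp1 is a duplicate of precious
after_copy=$(ls)

rm temp temp1           # silent when it succeeds
after_rm=$(ls)

echo "$after_copy"
echo "$after_rm"
```

Note that none of mv, cp or rm prints anything here; the only output comes from the two echo commands.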

What's in a Filename

So far we have used filenames without ever saying what's a legal name, so it's time for a couple of rules. First, filenames are limited to 14 characters, which is enough to be descriptive. (In 4.2 BSD the limit was extended to 255 characters.) Second, although you can use almost any character in a filename, common sense says you should stick to ones that are visible, and that you should probably avoid characters that might be used with other meanings. We have already seen, for example, that in the ls command, ls -t means to list in time order. So if you had a file whose name was -t, you would have a tough time listing it by name. Besides the minus sign, there are other characters which have special meaning. To avoid pitfalls, you would do well to use only letters, numbers and the period until you're familiar with the situation.

On to some more positive suggestions. Suppose you're typing a large document like a book. Logically this divides into many small pieces, like chapters and perhaps sections. Physically it must be divided too, for ed will not handle really big files. Thus you should type the document as a number of files. You might have a separate file for each chapter, called

    chap1
    chap2
    etc...

Or, if each chapter were broken into several files, you might have

    chap1.1
    chap1.2
    chap1.3
    ...
    chap2.1
    chap2.2
    ...

You can now tell at a glance where a particular file fits into the whole.

There are advantages to a systematic naming convention which are not obvious to the novice UNIX user. What if you wanted to print the whole book? You could say

    pr chap1.1 chap1.2 chap1.3 ......

but you would get tired pretty fast, and would probably even make mistakes. Fortunately, there is a shortcut. You can say

    pr chap*

The * means ``anything at all,'' so this translates into ``print all files whose names begin with chap'', listed in alphabetical order.

This shorthand notation is not a property of the pr command, by the way. It is system-wide, a service of the program that interprets commands (the ``shell,'' sh(1)). Using that fact, you can see how to list the names of the files in the book:

    ls chap*

produces

    chap1.1 chap1.2 chap1.3 ...

The * is not limited to the last position in a filename; it can be anywhere and can occur several times. Thus

    rm *junk* *temp*

removes all files that contain junk or temp as any part of their name. As a special case, * by itself matches every filename, so

    pr *

prints all your files (alphabetical order), and

    rm *

removes all files. (You had better be very sure that's what you wanted to say!)

The * is not the only pattern-matching feature available. Suppose you want to print only chapters 1 through 4 and 9. Then you can say

    pr chap[12349]*

The [...] means to match any of the characters inside the brackets. A range of consecutive letters or digits can be abbreviated, so you can also do this with

    pr chap[1-49]*

Letters can also be used within brackets: [a-z] matches any character in the range a through z.

The ? pattern matches any single character, so

    ls ?

lists all files which have single-character names, and

    ls -l chap?.1

lists information about the first file of each chapter (chap1.1, chap2.1, etc.).

Of these niceties, * is certainly the most useful, and you should get used to it. The others are frills, but worth knowing.

If you should ever have to turn off the special meaning of *, ?, etc., enclose the entire argument in single quotes, as in

    ls '?'

We'll see some more examples of this shortly.
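Because the shell, not the individual command, expands these patterns, they can be inspected with any command that simply repeats its arguments. The sketch below is an illustration under stated assumptions: it uses echo (not introduced in this section) in place of pr or ls, and the chapter files and the one-letter file x are hypothetical, created empty in a scratch directory.

```shell
# Assumption: mktemp, touch and echo are available; the file names are made up.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch chap1.1 chap1.2 chap2.1 chap5.1 chap9.1 x

all=$(echo chap*)            # every name beginning with chap
some=$(echo chap[1-49]*)     # chapters 1 through 4, and 9
firsts=$(echo chap?.1)       # first file of each chapter
single=$(echo ?)             # single-character names only
literal=$(echo 'chap*')      # quoted, so the * is left alone

echo "$all"
echo "$some"
echo "$firsts"
echo "$single"
echo "$literal"
```

The quoted case prints the pattern itself, which is a quick way to confirm that the quotes really did turn off the special meaning.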

What's in a Filename, Continued

When you first made that file called junk, how did the system know that there wasn't another junk somewhere else, especially since the person in the next office is also reading this tutorial? The answer is that generally each user has a private directory, which contains only the files that belong to him. When you log in, you are ``in'' your directory. Unless you take special action, when you create a new file, it is made in the directory that you are currently in; this is most often your own directory, and thus the file is unrelated to any other file of the same name that might exist in someone else's directory.

The set of all files is organized into a (usually big) tree, with your files located several branches into the tree. It is possible for you to ``walk'' around this tree, and to find any file in the system, by starting at the root of the tree and walking along the proper set of branches. Conversely, you can start where you are and walk toward the root.

Let's try the latter first. The basic tool is the command pwd (``print working directory''), which prints the name of the directory you are currently in.

Although the details will vary according to the system you are on, if you give the command pwd, it will print something like

    /usr/your-name

This says that you are currently in the directory your-name, which is in turn in the directory /usr, which is in turn in the root directory called by convention just /. (Even if it's not called /usr on your system, you will get something analogous. Make the corresponding mental adjustment and read on.)

If you now type

    ls /usr/your-name

you should get exactly the same list of file names as you get from a plain ls: with no arguments, ls lists the contents of the current directory; given the name of a directory, it lists the contents of that directory.

Next, try

    ls /usr

This should print a long series of names, among which is your own login name your-name. On many systems, usr is a directory that contains the directories of all the normal users of the system, like you.

The next step is to try

    ls /

You should get a response something like this (although again the details may be different):

    bin
    dev
    etc
    lib
    tmp
    usr

This is a collection of the basic directories of files that the system knows about; we are at the root of the tree.

Now try

    cat /usr/your-name/junk

(if junk is still around in your directory). The name

    /usr/your-name/junk

is called the pathname of the file that you normally think of as ``junk''. ``Pathname'' has an obvious meaning: it represents the full name of the path you have to follow from the root through the tree of directories to get to a particular file. It is a universal rule in the UNIX system that anywhere you can use an ordinary filename, you can use a pathname.
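The rule that a pathname works wherever a filename does can be checked directly. The following is a small sketch, assuming a scratch directory from mktemp stands in for /usr/your-name; the shell variables here, short and full are illustrative names, not anything from the original.

```shell
# Assumption: mktemp is available; the scratch directory plays the part of your own directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "hello" >junk

here=$(pwd)                 # full pathname of the current directory
short=$(cat junk)           # the file named by its ordinary filename
full=$(cat "$here/junk")    # the same file named by its pathname

echo "$here"
echo "$short"
echo "$full"
```

Both cat commands read the same file, so they print the same text; likewise ls with no arguments and ls given the pathname of the current directory list the same names.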

Here is a picture which may make this clearer:

                  (root)
                  / | \
                 /  |  \
                /   |   \
      bin    etc   usr   dev    tmp
     / | \  / | \ / | \ / | \  / | \
                 /  |  \
                /   |   \
            adam   eve   mary
            /     /   \     \
                 /     \     junk
              junk     temp

Notice that Mary's junk is unrelated to Eve's.

This isn't too exciting if all the files of interest are in your own directory, but if you work with someone else or on several projects concurrently, it becomes handy indeed. For example, your friends can print your book by saying

    pr /usr/your-name/chap*

Similarly, you can find out what files your neighbor has by saying

    ls /usr/neighbor-name

or make your own copy of one of his files by

    cp /usr/your-neighbor/his-file yourfile

If your neighbor doesn't want you poking around in his files, or vice versa, privacy can be arranged. Each file and directory has read-write-execute permissions for the owner, a group, and everyone else, which can be set to control access. See ls(1) and chmod(1) for details. As a matter of observed fact, most users most of the time find openness of more benefit than privacy.

As a final experiment with pathnames, try

    ls /bin /usr/bin

Do some of the names look familiar? When you run a program, by typing its name after the prompt character, the system simply looks for a file of that name. It normally looks first in your directory (where it typically doesn't find it), then in /bin and finally in /usr/bin. There is nothing magic about commands like cat or ls, except that they have been collected into a couple of places to be easy to find and administer.

What if you work regularly with someone else on common information in his directory? You could just log in as your friend each time you want to, but you can also say ``I want to work on his files instead of my own''. This is done by changing the directory that you are currently in:

    cd /usr/your-friend

(On some systems, cd is spelled chdir.) Now when you use a filename in something like cat or pr, it refers to the file in your friend's directory. Changing directories doesn't affect any permissions associated with a file; if you couldn't access a file from your own directory, changing to another directory won't alter that fact. Of course, if you forget what directory you're in, type

    pwd

to find out.

It is usually convenient to arrange your own files so that all the files related to one thing are in a directory separate from other projects. For example, when you write your book, you might want to keep all the text in a directory called book. So make one with

    mkdir book

then go to it with

    cd book

then start typing chapters. The book is now found in (presumably)

    /usr/your-name/book

To remove the directory book, type

    rm book/*
    rmdir book

The first command removes all files from the directory; the second removes the empty directory.

You can go up one level in the tree of files by saying

    cd ..

``..'' is the name of the parent of whatever directory you are currently in. For completeness, ``.'' is an alternate name for the directory you are in.
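The whole life of a project directory, from mkdir through rmdir, fits in a few lines. This is a sketch under assumptions: a scratch directory from mktemp takes the place of your login directory, and the single chapter file is created by redirection so the example needs no editor session.

```shell
# Assumption: mktemp is available; the scratch directory stands in for /usr/your-name.
top=$(mktemp -d) && cd "$top" || exit 1
start=$(pwd)

mkdir book
cd book
echo "Chapter 1" >chap1
inside=$(pwd)       # the pathname now ends in /book

cd ..               # go up one level, back to the parent
back=$(pwd)

rm book/*           # empty the directory first
rmdir book          # then remove the now-empty directory

echo "$inside"
echo "$back"
```

Reversing the last two commands would fail, since rmdir refuses to remove a directory that still contains files.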

Using Files instead of the Terminal

Most of the commands we have seen so far produce output on the terminal; some, like the editor, also take their input from the terminal. It is universal in UNIX systems that the terminal can be replaced by a file for either or both of input and output. As one example,

    ls

makes a list of files on your terminal. But if you say

    ls >filelist

a list of your files will be placed in the file filelist (which will be created if it doesn't already exist, or overwritten if it does). The symbol > means ``put the output on the following file, rather than on the terminal.'' Nothing is produced on the terminal. As another example, you could combine several files into one by capturing the output of cat in a file:

    cat f1 f2 f3 >temp

The symbol >> operates very much like > does, except that it means ``add to the end of.'' That is,

    cat f1 f2 f3 >>temp

means to concatenate f1, f2 and f3 to the end of whatever is already in temp, instead of overwriting the existing contents. As with >, if temp doesn't exist, it will be created for you.
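The difference between overwriting and appending is easiest to see by looking at the file after each step. A minimal sketch, assuming a scratch directory from mktemp and two one-line files whose contents are made up for the purpose:

```shell
# Assumption: mktemp is available; f1 and f2 hold illustrative one-line contents.
dir=$(mktemp -d) && cd "$dir" || exit 1
echo "one" >f1
echo "two" >f2

cat f1 f2 >temp         # temp is created and holds f1 followed by f2
cat f1 >>temp           # f1 is added to the end of temp
appended=$(cat temp)

cat f2 >temp            # > again: the old contents of temp are replaced
overwritten=$(cat temp)

echo "$appended"
echo "$overwritten"
```

After the append, temp holds three lines; after the final >, only the single line from f2 remains.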

In a similar way, the symbol < means to take the input for a program from the following file, instead of from the terminal. Thus, you could make up a script of commonly used editing commands and put them into a file called script. Then you can run the script on a file by saying

    ed file <script

As another example, you can use ed to prepare a letter in file let, then send it to several people with

    mail adam eve mary joe <let

Pipes

One of the novel contributions of the UNIX system is the idea of a pipe. A pipe is simply a way to connect the output of one program to the input of another program, so the two run as a sequence of processes: a pipeline.

For example,

    pr f g h

will print the files f, g, and h, beginning each on a new page. Suppose you want them run together instead. You could say

    cat f g h >temp
    pr <temp
    rm temp

but this is more work than necessary. Clearly what we want is to take the output of cat and connect it to the input of pr. So let us use a pipe:

    cat f g h | pr

The vertical bar | means to take the output from cat, which would normally have gone to the terminal, and put it into pr to be neatly formatted.

There are many other examples of pipes. For example,

    ls | pr -3

prints a list of your files in three columns. The program wc counts the number of lines, words and characters in its input, and as we saw earlier, who prints a list of currently logged-on people, one per line. Thus

    who | wc

tells how many people are logged on. And of course

    ls | wc

counts your files.

Any program that reads from the terminal can read from a pipe instead; any program that writes on the terminal can drive a pipe. You can have as many elements in a pipeline as you wish.

Many UNIX programs are written so that they will take their input from one or more files if file arguments are given; if no arguments are given they will read from the terminal, and thus can be used in pipelines. pr is one example:

    pr -3 a b c

prints files a, b and c in order in three columns. But in

    cat a b c | pr -3

pr prints the information coming down the pipeline, still in three columns.
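The temporary-file version and the pipe version can be compared side by side. In this sketch wc -l stands in for pr, purely so that the result is a single number that is easy to check; that substitution, the scratch directory from mktemp, and the use of printf to make multi-line files are all assumptions, not part of the original text.

```shell
# Assumption: mktemp and printf are available; the file contents are illustrative.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'a\nb\n' >f
printf 'c\n' >g
printf 'd\ne\nf\n' >h

# The long way: a temporary file sits between the two programs.
cat f g h >temp
long_way=$(wc -l <temp)
rm temp

# The same job with a pipe, and no temporary file to clean up.
piped=$(cat f g h | wc -l)

# Count the files in the directory.
nfiles=$(ls | wc -l)

echo "$long_way" "$piped" "$nfiles"
```

Some versions of wc pad the count with leading spaces, so the numbers are best compared numerically rather than as strings.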

The Shell

We have already mentioned once or twice the mysterious ``shell,'' which is in fact sh(1). (On Berkeley UNIX systems, the usual shell for interactive use is the C shell, csh(1).) The shell is the program that interprets what you type as commands and arguments. It also looks after translating *, etc., into lists of filenames, and <, >, and | into changes of input and output streams.

The shell has other capabilities too. For example, you can run two programs with one command line by separating the commands with a semicolon; the shell recognizes the semicolon and breaks the line into two commands. Thus

    date; who

does both commands before returning with a prompt character.

You can also have more than one program running simultaneously if you wish. For example, if you are doing something time-consuming, like the editor script of an earlier section, and you don't want to wait around for the results before starting something else, you can say

    ed file <script &

The ampersand at the end of a command line says ``start this command running, then take further commands from the terminal immediately,'' that is, don't wait for it to complete. Thus the script will begin, but you can do something else at the same time. Of course, to keep the output from interfering with what you're doing on the terminal, it would be better to say

    ed file <script >script.out &

which saves the output lines in a file called script.out.

When you initiate a command with &, the system replies with a number called the process number, which identifies the command in case you later want to stop it. If you do, you can say

    kill process-number

If you forget the process number, the command ps will tell you about everything you have running. (If you are desperate, kill 0 will kill all your processes.) And if you're curious about other people, ps a will tell you about all programs that are currently running.
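Starting a command in the background and then stopping it by process number can be sketched as follows. Several things here are assumptions that the text above does not cover: sleep stands in for a time-consuming command, the shell variable $! is used to pick up the process number (a script has no terminal on which to read the number the system prints), and wait is used to collect the finished process.

```shell
sleep 30 &              # a stand-in for something time-consuming, run in the background
pid=$!                  # assumption: $! holds the process number of the last & command

kill "$pid"             # stop it by process number

# Collect the stopped process; its status is nonzero because it was killed.
status=0
wait "$pid" 2>/dev/null || status=$?

echo "process $pid ended with status $status"
```

At an interactive terminal you would simply read the process number the shell prints after the & line and type it after kill.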

You can say

    (command-1; command-2; command-3) &

to start three commands in the background, or you can start a background pipeline with

    command-1 | command-2 &

Just as you can tell the editor or some similar program to take its input from a file instead of from the terminal, you can tell the shell to read a file to get commands. (Why not? The shell, after all, is just a program, albeit a clever one.) For instance, suppose you want to set tabs on your terminal, and find out the date and who's on the system every time you log in. Then you can put the three necessary commands (tabs, date, who) into a file, let's call it startup, and then run it with

    sh startup

This says to run the shell with the file startup as input. The effect is as if you had typed the contents of startup on the terminal.

If this is to be a regular thing, you can eliminate the need to type sh: simply type, once only, the command

    chmod +x startup

and thereafter you need only say

    startup

to run the sequence of commands. The chmod(1) command marks the file executable; the shell recognizes this and runs it as a sequence of commands.
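Both ways of running a command file can be tried in a scratch directory. This is a sketch under assumptions: the contents of startup are simplified to date plus an echo line (tabs and who are left out because their output depends on the terminal and on who is logged in), and the file is invoked as ./startup on the assumption that the current directory may not be among the places the shell searches for commands.

```shell
# Assumption: mktemp is available; the contents of startup are illustrative.
dir=$(mktemp -d) && cd "$dir" || exit 1

echo 'date' >startup
echo 'echo startup done' >>startup

via_sh=$(sh startup)        # run the shell with startup as its input

chmod +x startup            # mark the file executable, once only
direct=$(./startup)         # now the file can be run by name

echo "$via_sh"
echo "$direct"
```

On a system where the current directory is searched for commands, plain startup works as the text describes and the ./ prefix is unnecessary.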

If you want startup to run automatically every time you log in, create a file in your login directory called .profile, and place in it the line startup. When the shell first gains control when you log in, it looks for the .profile file and does whatever commands it finds in it. (The C shell instead reads a file called .login.) We'll get back to the shell in the section on programming.