
@(#)p3 8.1 (Berkeley) 06/08/93

THE STANDARD I/O LIBRARY

The ``Standard I/O Library'' is a collection of routines intended to provide efficient and portable I/O services for most C programs. The standard I/O library is available on each system that supports C, so programs that confine their system interactions to its facilities can be transported from one system to another essentially without change.

In this section, we will discuss the basics of the standard I/O library. The appendix contains a more complete description of its capabilities.

File Access

The programs written so far have all read the standard input and written the standard output, which we have assumed are magically pre-defined. The next step is to write a program that accesses a file that is not already connected to the program. One simple example is wc, which counts the lines, words and characters in a set of files. For instance, the command

    wc x.c y.c

prints the number of lines, words and characters in x.c and y.c and the totals.

The question is how to arrange for the named files to be read; that is, how to connect the file system names to the I/O statements which actually read the data.

The rules are simple. Before it can be read or written, a file has to be opened by the standard library function fopen. fopen takes an external name (like x.c or y.c), does some housekeeping and negotiation with the operating system, and returns an internal name which must be used in subsequent reads or writes of the file.

This internal name is actually a pointer, called a file pointer, to a structure which contains information about the file, such as the location of a buffer, the current character position in the buffer, whether the file is being read or written, and the like. Users don't need to know the details, because part of the standard I/O definitions obtained by including stdio.h is a structure definition called FILE. The only declaration needed for a file pointer is exemplified by

    FILE *fp, *fopen();

This says that fp is a pointer to a FILE, and fopen returns a pointer to a FILE. (FILE is a type name, like int, not a structure tag.)

The actual call to fopen in a program is

    fp = fopen(name, mode);

The first argument of fopen is the name of the file, as a character string. The second argument is the mode, also as a character string, which indicates how you intend to use the file. The only allowable modes are read ("r"), write ("w"), or append ("a").

If a file that you open for writing or appending does not exist, it is created (if possible). Opening an existing file for writing causes the old contents to be discarded. Trying to read a file that does not exist is an error, and there may be other causes of error as well (like trying to read a file when you don't have permission). If there is any error, fopen will return the null pointer value NULL (which is defined as zero in stdio.h).

The next thing needed is a way to read or write the file once it is open. There are several possibilities, of which getc and putc are the simplest. getc returns the next character from a file; it needs the file pointer to tell it what file. Thus

    c = getc(fp)

places in c the next character from the file referred to by fp; it returns EOF when it reaches end of file. putc is the inverse of getc:

    putc(c, fp)

puts the character c on the file fp and returns c. getc and putc return EOF on error.

When a program is started, three files are opened automatically, and file pointers are provided for them. These files are the standard input, the standard output, and the standard error output; the corresponding file pointers are called stdin, stdout, and stderr. Normally these are all connected to the terminal, but may be redirected to files or pipes as described in Section 2.2. stdin, stdout and stderr are pre-defined in the I/O library as the standard input, output and error files; they may be used anywhere an object of type FILE * can be. They are constants, however, not variables, so don't try to assign to them.

With some of the preliminaries out of the way, we can now write wc. The basic design is one that has been found convenient for many programs: if there are command-line arguments, they are processed in order. If there are no arguments, the standard input is processed. This way the program can be used stand-alone or as part of a larger process.

    #include <stdio.h>

    main(argc, argv)    /* wc: count lines, words, chars */
    int argc;
    char *argv[];
    {
        int c, i, inword;
        FILE *fp, *fopen();
        long linect, wordct, charct;
        long tlinect = 0, twordct = 0, tcharct = 0;

        i = 1;
        fp = stdin;
        do {
            if (argc > 1 && (fp=fopen(argv[i], "r")) == NULL) {
                fprintf(stderr, "wc: can't open %s\n", argv[i]);
                continue;
            }
            linect = wordct = charct = inword = 0;
            while ((c = getc(fp)) != EOF) {
                charct++;
                if (c == '\n')
                    linect++;
                if (c == ' ' || c == '\t' || c == '\n')
                    inword = 0;
                else if (inword == 0) {
                    inword = 1;
                    wordct++;
                }
            }
            printf("%7ld %7ld %7ld", linect, wordct, charct);
            printf(argc > 1 ? " %s\n" : "\n", argv[i]);
            fclose(fp);
            tlinect += linect;
            twordct += wordct;
            tcharct += charct;
        } while (++i < argc);
        if (argc > 2)
            printf("%7ld %7ld %7ld total\n", tlinect, twordct, tcharct);
        exit(0);
    }

The function fprintf is identical to printf, save that the first argument is a file pointer that specifies the file to be written.

The function fclose is the inverse of fopen; it breaks the connection between the file pointer and the external name that was established by fopen, freeing the file pointer for another file. Since there is a limit on the number of files that a program may have open simultaneously, it's a good idea to free things when they are no longer needed. There is also another reason to call fclose on an output file: it flushes the buffer in which putc is collecting output. (fclose is called automatically for each open file when a program terminates normally.)

Error Handling: Stderr and Exit

stderr is assigned to a program in the same way that stdin and stdout are. Output written on stderr appears on the user's terminal even if the standard output is redirected. wc writes its diagnostics on stderr instead of stdout so that if one of the files can't be accessed for some reason, the message finds its way to the user's terminal instead of disappearing down a pipeline or into an output file.

The program actually signals errors in another way, using the function exit to terminate program execution. The argument of exit is available to whatever process called it (see Section 6), so the success or failure of the program can be tested by another program that uses this one as a sub-process. By convention, a return value of 0 signals that all is well; non-zero values signal abnormal situations.

exit itself calls fclose for each open output file, to flush out any buffered output, then calls a routine named _exit. The function _exit causes immediate termination without any buffer flushing; it may be called directly if desired.

Miscellaneous I/O Functions

The standard I/O library provides several other I/O functions besides those we have illustrated above.

Normally output with putc, etc., is buffered (except to stderr); to force it out immediately, use fflush(fp).

fscanf is identical to scanf, except that its first argument is a file pointer (as with fprintf) that specifies the file from which the input comes; it returns EOF at end of file.

The functions sscanf and sprintf are identical to fscanf and fprintf, except that the first argument names a character string instead of a file pointer. The conversion is done from the string for sscanf and into it for sprintf.

fgets(buf, size, fp) copies the next line from fp, up to and including a newline, into buf; at most size-1 characters are copied; it returns NULL at end of file. fputs(buf, fp) writes the string in buf onto file fp.

The function ungetc(c, fp) ``pushes back'' the character c onto the input stream fp; a subsequent call to getc, fscanf, etc., will encounter c. Only one character of pushback per file is permitted.