Copyright (c) 1980, 1993
The Regents of the University of California. All rights reserved.


@(#)puman3.n 8.2 (Berkeley) 06/01/94

Error diagnostics

This section of the User's Manual discusses the error diagnostics of the programs pi, pc and px. Pix is a simple but useful program which invokes pi and px to do all the real processing. See its manual section pix (1) and section 5.2 below for more details. All the diagnostics given by pi will also be given by pc.

Translator syntax errors

A few comments on the general nature of the syntax errors usually made by Pascal programmers and the recovery mechanisms of the current translator may help in using the system.

Illegal characters

Characters such as `$', `!', and `@' are not part of the language Pascal. If they are found in the source program, and are not part of a constant string, a constant character, or a comment, they are considered to be `illegal characters'. This can happen if you leave off an opening string quote `''. Note that the character `"', although used in English to quote strings, is not used to quote strings in Pascal. Most non-printing characters in your input are also illegal except in character constants and character strings. Except for the tab and form feed characters, which are used to ease formatting of the program, non-printing characters in the input file print as the character `?' so that they will show in your listing.

String errors

There is no character string of length 0 in Pascal. Consequently the input `''' (two adjacent quote characters) is not acceptable. Similarly, encountering an end-of-line after an opening string quote `'' without encountering the matching closing quote yields the diagnostic ``Unmatched ' for string''. It is permissible to use the character `#' instead of `'' to delimit character and constant strings for portability reasons. For this reason, a spuriously placed `#' sometimes causes the diagnostic about unbalanced quotes. Similarly, a `#' in column one is used when preparing programs which are to be kept in multiple files. See section 5.11 for details.

Comments in a comment, non-terminated comments

As we saw above, these errors are usually caused by leaving off a comment delimiter. You can convert parts of your program to comments without generating this diagnostic since there are two different kinds of comments - those delimited by `{' and `}', and those delimited by `(*' and `*)'. Thus consider:
.LS
{ This is a comment enclosing a piece of program
a := functioncall;      (* comment within comment *)
procedurecall;
lhs := rhs;             (* another comment *)
}
.LE

By using one kind of comment exclusively in your program you can use the other delimiters when you need to ``comment out'' parts of your program†. In this way you will also allow the translator to help by detecting statements accidentally placed within comments.
.FS
† If you wish to transport your program, especially to the 6000-3.4 implementation, you should use the character sequence `(*' to delimit comments. For transportation over the rcslink to Pascal 6000-3.4, the character `#' should be used to delimit characters and constant strings.
.FE

If a comment does not terminate before the end of the input file, the translator will point to the beginning of the comment, indicating that the comment is not terminated. In this case processing will terminate immediately. See the discussion of ``QUIT'' below.

Digits in numbers

This part of the language is a minor nuisance. Pascal requires digits in real numbers both before and after the decimal point. Thus the following statements, which look quite reasonable to FORTRAN users, generate diagnostics in Pascal:
.LS
.so digitsout
.LE
These same constructs are also illegal as input to the Pascal interpreter px.
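As a minimal sketch of the rule about digits, the hypothetical program below (the names digits and x are illustrative only) keeps the two offending forms inside comments and assigns the corrected forms. It does not reproduce the translator's actual diagnostics.

```pascal
program digits(output);
var
  x: real;
begin
  { x := .5;  would draw a diagnostic: no digit before the decimal point }
  { x := 5.;  would draw a diagnostic: no digit after the decimal point }
  x := 0.5;   { accepted: digits on both sides of the point }
  x := 5.0;   { accepted: digits on both sides of the point }
  writeln(x)
end.
```

Writing a leading or trailing zero is all that is needed to make a FORTRAN-style constant acceptable.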

Replacements, insertions, and deletions

When a syntax error is encountered in the input text, the parser invokes an error recovery procedure. This procedure examines the input text immediately after the point of error and considers a set of simple corrections to see whether they will allow the analysis to continue. These corrections involve replacing an input token with a different token, inserting a token, or deleting an input token. Most of these changes will not cause fatal syntax errors. The exception is the insertion of or replacement with a symbol such as an identifier or a number; in this case the recovery makes no attempt to determine which identifier or what number should be inserted, hence these are considered fatal syntax errors.

Consider the following example.
.LS
% pix -l synerr.p
.so synerrout
%
.LE
The only surprise here may be that Pascal does not have an exponentiation operator, hence the complaint about `**'. This error illustrates that, if you assume that the language has a feature which it does not, the translator diagnostic may not indicate this, as the translator is unlikely to recognize the construct you supply.

Undefined or improper identifiers

If an identifier is encountered in the input but is undefined, the error recovery will replace it with an identifier of the appropriate class. Further references to this identifier will be summarized at the end of the containing procedure or function or at the end of the program if the reference occurred in the main program. Similarly, if an identifier is used in an inappropriate way, e.g. if a type identifier is used in an assignment statement, or if a simple variable is used where a record variable is required, a diagnostic will be produced and an identifier of the appropriate type inserted. Further incorrect references to this identifier will be flagged only if they involve incorrect use in a different way, with all incorrect uses being summarized in the same way as undefined variable uses are.

Expected symbols, malformed constructs

If none of the above mentioned corrections appear reasonable, the error recovery will examine the input to the left of the point of error to see if there is only one symbol which can follow this input. If this is the case, the recovery will print a diagnostic which indicates that the given symbol was `Expected'.

In cases where none of these corrections resolve the problems in the input, the recovery may issue a diagnostic that indicates that the input is ``malformed''. If necessary, the translator may then skip forward in the input to a place where analysis can continue. This process may cause some errors in the text to be missed.

Consider the following example:
.LS
% pix -l synerr2.p
.so synerr2out
%
.LE
Here we misspelled output and gave a FORTRAN style variable declaration which the translator diagnosed as a `Malformed declaration'. When, on line 6, we used `(' and `)' for subscripting (as in FORTRAN) rather than the `[' and `]' which are used in Pascal, the translator noted that a was not defined as a procedure. This occurred because procedure and function argument lists are delimited by parentheses in Pascal. As it is not permissible to assign to procedure calls, the translator diagnosed a malformed statement at the point of assignment.

Expected and unexpected end-of-file, ``QUIT''

If the translator finds a complete program, but there is more non-comment text in the input file, then it will indicate that an end-of-file was expected. This situation may occur after a bracketing error, or if too many ends are present in the input. The message may appear after the recovery says that it ``Expected `.' '' since `.' is the symbol that terminates a program.

If severe errors in the input prohibit further processing, the translator may produce a diagnostic followed by ``QUIT''. One example of this was given above - a non-terminated comment; another example is a line which is longer than 160 characters. Consider also the following example.
.LS
% pix -l mism.p
.so mismout
%
.LE

Translator semantic errors

The extremely large number of semantic diagnostic messages which the translator produces makes it unreasonable to discuss each message or group of messages in detail. The messages are, however, very informative. We will here explain the typical formats and the terminology used in the error messages so that you will be able to make sense out of them. In any case in which a diagnostic is not completely comprehensible you can refer to the "User Manual" by Jensen and Wirth for examples.

Format of the error diagnostics

As we saw in the example program above, the error diagnostics from the Pascal translator include the number of a line in the text of the program as well as the text of the error message. While this number is most often the line where the error occurred, it is occasionally the number of a line containing a bracketing keyword like end or until. In this case, the diagnostic may refer to the previous statement. This occurs because of the method the translator uses for sampling line numbers. The absence of a trailing `;' in the previous statement causes the line number corresponding to the end or until to become associated with the statement. As Pascal is a free-format language, the line number associations can only be approximate and may seem arbitrary to some users. This is the only notable exception, however, to reasonable associations.

Incompatible types

Since Pascal is a strongly typed language, many semantic errors manifest themselves as type errors. These are called `type clashes' by the translator. The types allowed for various operators in the language are summarized on page 108 of the Jensen-Wirth "User Manual". It is important to know that the Pascal translator, in its diagnostics, distinguishes between the following type `classes':

array    Boolean  char     file     integer
pointer  real     record   scalar   string

These words are plugged into a great number of error messages. Thus, if you tried to assign an integer value to a char variable you would receive a diagnostic like the following:
.LS
.so clashout
.LE
In this case, one error produced a two line error message. If the same error occurs more than once, the same explanatory diagnostic will be given each time.

Scalar

The only class whose meaning is not self-explanatory is `scalar'. Scalar has a precise meaning in the Jensen-Wirth "User Manual" where, in fact, it refers to char, integer, real, and Boolean types as well as the enumerated types. For the purposes of the Pascal translator, scalar in an error message refers to a user-defined, enumerated type, such as ops in the example above or color in
.LS
type color = (red, green, blue)
.LE
For integers, the more explicit denotation integer is used. Although it would be correct, in the context of the "User Manual", to refer to an integer variable as a scalar variable, pi prefers the more specific identification.

Function and procedure type errors

For built-in procedures and functions, two kinds of errors occur. If the routines are called with the wrong number of arguments a message similar to:
.LS
.so sinout1
.LE
is given. If the type of the argument is wrong, a message like
.LS
.so sinout2
.LE
is produced. A few functions and procedures implemented in Pascal 6000-3.4 are diagnosed as unimplemented in Berkeley Pascal, notably those related to segmented files.

Can't read and write scalars, etc.

The messages which state that scalar (user-defined) types cannot be read from or written to files are often mysterious. It is in fact the case that if you define
.LS
type color = (red, green, blue)
.LE
``standard'' Pascal does not associate these constants with the strings `red', `green', and `blue' in any way. An extension has been added which allows enumerated types to be read and written; however, if the program is to be portable, you will have to write your own routines to perform these functions. Standard Pascal only allows the reading of characters, integers and real numbers from text files. You cannot read strings or Booleans. It is possible to make a
.LS
file of color
.LE
but the representation is binary rather than string.
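A portable output routine of the kind mentioned above might look like the following minimal sketch. The names colorio, writecolor, and shade are hypothetical and are not part of Berkeley Pascal. The routine maps each constant to its name with a case statement, so it relies only on standard Pascal.

```pascal
program colorio(output);
type
  color = (red, green, blue);
var
  c: color;

{ Write the name of an enumerated value as text. }
procedure writecolor(shade: color);
begin
  case shade of
    red:   write('red');
    green: write('green');
    blue:  write('blue')
  end
end;

begin
  { Print every constant of the type, one per line. }
  for c := red to blue do
  begin
    writecolor(c);
    writeln
  end
end.
```

A matching input routine would have to read characters and compare them against the three names itself, since standard Pascal cannot read strings directly.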

Expression diagnostics

The diagnostics for semantically ill-formed expressions are very explicit. Consider this sample translation:
.LS
% pi -l expr.p
.so exprout
%
.LE
This example is admittedly far-fetched, but illustrates that the error messages are sufficiently clear to allow easy determination of the problem in the expressions.

Type equivalence

Several diagnostics produced by the Pascal translator complain about `non-equivalent types'. In general, Berkeley Pascal considers variables to have the same type only if they were declared with the same constructed type or with the same type identifier. Thus, the variables x and y declared as
.LS
var
    x: ^ integer;
    y: ^ integer;
.LE
do not have the same type. The assignment
.LS
x := y
.LE
thus produces the diagnostics:
.LS
.so typequout
.LE
Thus it is always necessary to declare a type such as
.LS
type intptr = ^ integer;
.LE
and use it to declare
.LS
var x: intptr; y: intptr;
.LE
Note that if we had initially declared
.LS
var x, y: ^ integer;
.LE
then the assignment statement would have worked. The statement
.LS
x^ := y^
.LE
is allowed in either case. Since the parameter to a procedure or function must be declared with a type identifier rather than a constructed type, it is always necessary, in practice, to declare any type which will be used in this way.
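Putting these pieces together, a complete program using a declared pointer type might look like the sketch below. The program name ptrs and the procedure show are hypothetical, chosen only for illustration. Because x, y, and the parameter p are all declared with the type identifier intptr, both the assignment and the call are type-correct.

```pascal
program ptrs(output);
type
  intptr = ^integer;
var
  x, y: intptr;

{ The parameter is declared with a type identifier, as required. }
procedure show(p: intptr);
begin
  writeln(p^)
end;

begin
  new(y);
  y^ := 3;
  x := y;       { allowed: x and y share the type identifier intptr }
  show(x)
end.
```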

Unreachable statements

Berkeley Pascal flags unreachable statements. Such statements usually correspond to errors in the program logic. Note that a statement is considered to be reachable if there is a potential path of control, even if it can never be taken. Thus, no diagnostic is produced for the statement:
.LS
if false then
    writeln('impossible!')
.LE

Goto's into structured statements

The translator detects and complains about goto statements which transfer control into structured statements (for, while, etc.). It does not allow such jumps, nor does it allow branching from the then part of an if statement into the else part. Such checks are made only within the body of a single procedure or function.

Unused variables, never set variables

Although pi always clears variables to 0 at procedure and function entry, pc does not unless runtime checking is enabled using the C option. It is not good programming practice to rely on this initialization. To discourage this practice, and to help detect errors in program logic, pi flags as a `w' warning error:

1) Use of a variable which is never assigned a value.

2) A variable which is declared but never used, distinguishing between those variables for which values are computed but which are never used, and those completely unused.
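The hypothetical program below (all names are illustrative) contains one variable of each kind just listed. The exact wording of the resulting warnings is not shown here.

```pascal
program unused(output);
var
  i, j, k, m: integer;
begin
  i := 1;          { i is assigned and used }
  j := i + 1;      { a value is computed for j, but j is never used }
  writeln(i, k)    { k is used but never assigned a value }
                   { m is declared but completely unused }
end.
```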

In fact, these diagnostics are applied to all declared items. Thus a const or a procedure which is declared but never used is flagged. The w option of pi may be used to suppress these warnings; see sections 5.1 and 5.2.

Translator panics, i/o errors

Panics

One class of error which rarely occurs, but which causes termination of all processing when it does, is a panic. A panic indicates a translator-detected internal inconsistency. A typical panic message is:
.LS
snark (rvalue) line=110 yyline=109
Snark in pi
.LE
If you receive such a message, the translation will be quickly and perhaps ungracefully terminated. You should contact a teaching assistant or a member of the system staff, after saving a copy of your program for later inspection. If you were making changes to an existing program when the problem occurred, you may be able to work around the problem by ascertaining which change caused the snark and making a different change or correcting an error in the program. A small number of panics are possible in px. All panics should be reported to a teaching assistant or systems staff so that they can be fixed.

Out of memory

The only other error which will abort translation when no errors are detected is running out of memory. All tables in the translator, with the exception of the parse stack, are dynamically allocated, and can grow to take up the full available process space of 64000 bytes on the PDP-11. On the VAX-11, table sizes are extremely generous and very large (25000 line) programs have been easily accommodated. For the PDP-11, it is generally true that the size of the largest translatable program is directly related to procedure and function size. A number of non-trivial Pascal programs, including some with more than 2000 lines and 2500 statements, have been translated and interpreted using Berkeley Pascal on PDP-11's. Notable among these are the Pascal-S interpreter, a large set of programs for automated generation of code generators, and a general context-free parsing program which has been used to parse sentences with a grammar for a superset of English. In general, very large programs should be translated using pc and the separate compilation facility.

If you receive an out of space message from the translator during translation of a large procedure or function or one containing a large number of string constants you may yet be able to translate your program if you break this one procedure or function into several routines.

I/O errors

Other errors which you may encounter when running pi relate to input-output. If pi cannot open the file you specify, or if the file is empty, you will be so informed.

Run-time errors

We saw, in our second example, a run-time error. We here give the general description of run-time errors. The more unusual interpreter error messages are explained briefly in the manual section for px (1).

Start-up errors

These errors occur when the object file to be executed is not available or appropriate. Typical errors here are caused by the specified object file not existing, not being a Pascal object, or being inaccessible to the user.

Program execution errors

These errors occur when the program interacts with the Pascal runtime environment in an inappropriate way. Typical errors are values or subscripts out of range, bad arguments to built-in functions, exceeding the statement limit because of an infinite loop, or running out of memory‡. The interpreter will produce a backtrace after the error occurs, showing all the active routine calls, unless the p option was disabled when the program was translated. Unfortunately, no variable values are given and no way of extracting them is available.*
.FS
‡ The checks for running out of memory are not foolproof and there is a chance that the interpreter will fault, producing a core image when it runs out of memory. This situation occurs very rarely.
.FE
.FS
* On the VAX-11, each variable is restricted to allocate at most 65000 bytes of storage (this is a PDP-11ism that has survived to the VAX.)
.FE

As an example of such an error, assume that we have accidentally declared the constant n1 to be 6, instead of 7, on line 2 of the program primes as given in section 2.6 above. If we run this program we get the following response.
.LS
% pix primes.p
.so primeout3
%
.LE

Here the interpreter indicates that the program terminated abnormally due to a subscript out of range near line 14, which is eight lines into the body of the program primes.

Interrupts

If the program is interrupted while executing and the p option was not specified, then a backtrace will be printed.† The file pmon.out of profile information will be written if the program was translated with the z option enabled to pi or pix.
.FS
† Occasionally, the Pascal system will be in an inconsistent state when this occurs, e.g. when an interrupt terminates a procedure or function entry or exit. In this case, the backtrace will only contain the current line. A reverse call order list of procedures will not be given.
.FE

I/O interaction errors

The final class of interpreter errors results from inappropriate interactions with files, including the user's terminal. Included here are bad formats for integer and real numbers (such as no digits after the decimal point) when reading.