There was a bug which caused compiler errors when floating values were assigned to bit fields. Some overenthusiastic optimization on my part led to a float-to-int conversion on the rhs of the assignment being dropped. Bit fields are now treated as a special case. [pcc.vax: local2.c] At the instigation of Rob Pike, enums were neutered: they now behave exactly like ints. The only traces of Johnson's treatment of enums are warnings about clashes between one enum type and another enum type in certain expressions. This fix was trickier than it looked; it would have been much simpler to drop ALL warnings about enums, as Rob would recommend. [mip: trees.c] Arthur Olsen pointed out a bug with the evaluation of constant expressions -- the usual arithmetic conversions are not always performed. Rather than follow Arthur's simple suggestion, I decided to be safe and arranged to use the same conversion code on constants that we use on variables. We short-cut the test if the operand(s) are ints, which is the usual case, so the impact on compile time should be small. [mip: trees.c] Guy Harris noted that chkpun() considered multiply dimensioned arrays and multiply indirect pointers to be the same thing, which they are most certainly not. His fix causes an 'illegal pointer combination' warning to be emitted for code like: char **cpp, c2a[10][20]; cpp = c2a; cpp[5][3] = 'a'; The new code is actually simpler than the old... [mip: trees.c] Another irritation of Rob Pike's is fixed -- the old-style assignment ops have been ifdef'ed out. No more of those obnoxious 'ambiguous assignment: assignment op taken' messages! [mip: scan.c] Yet another Rob Pike complaint: you couldn't refer directly to the members of a structure which had been returned by a function. You may now utter things like 'f().a' and get away with them. The illegal form '(&f())->a' used to work; since this has the same tree representation as the legal form, an ugly wart was added to the action for '&' which specifically rules this form out. To be consistent, a similar exception was made for expressions of the form '(a = b).c'. [mip: cgram.y] Moved configuration flags from Makefile into macdefs.h and converted some more trivial routines into macros. This cleans up the Makefile considerably and consolidates assembler-dependent flags. [pcc.vax: macdefs.h, code.c, order.c, Makefile; mip: onepass.h] More efficiency hacks: the compiler now does its best to avoid emitting ANY code after an error is detected. Previously only code generation through the code table was suppressed after an error. This change can buy up to 25% in speed improvements if the dbx stab flag is enabled. The latest 'ccom' runs twice as fast as the 4.2 BSD compiler in this situation... [pcc.vax: macdefs.h, local.c, code.c; mip: cgram.y, scan.c, pftn.c] For ANSI compatibility, we now accept void expressions in the colon part of a conditional expression. Both subexpressions under the colon operator must be void if one is void. The overall type of the conditional expression is void in this case. [mip: trees.c] As long as we're doing away with old-fashioned assignment operators, we might as well terminate old-fashioned initializations too. [mip: cgram.y] A fix from James Schoner for bitfield assignments was installed. The problem was that the rhs of a bitfield assignment was used as its value, an overlooked aspect of the several earlier bug fixes for problems of this variety. The bug caused struct { unsigned a:4, b:3, c:2; } x; int i; i = x.b = 255; to store 255 in i instead of 7. [pcc.vax: table.c, local2.c] Scanner code was added to handle the System V/ANSI preprocessor extensions #ident and #pragma. Currently #ident is ignored. #pragma may be used as a substitute for lint-style comment directives; e.g. use '#pragma LINTLIBRARY' for '/* LINTLIBRARY */'. The #pragma stuff requires the hacked-up cpp with ANSI extensions which may eventually be put into circulation here at Utah. The #line control from cpp is now recognized with the same syntax by ccom. Unknown control lines elicit a warning. A small bug is fixed by this new code -- previously #ident and other unknown controls caused the line control mechanism to get screwed up, so that bogus dbx stabs were put in the output. [mip: scan.c] There seems to be a loophole in initializations -- apparently it is legal to initialize a bitfield with a symbolic constant (i.e. the address of something). No loader that I know of will handle this! The compiler now emits an error when someone tries this trick; it really can't do any better. [mip: pftn.c] Someone complained that all illegal characters were being printed as octal numbers in the error message. This was changed so that printable characters are printed normally, and all funny characters are printed in C 'char' style. [mip: scan.c] Conversion of unsigned constants to floating point values was broken because the requisite cast in makety() had been commented out! Argh. Unsigned comparisons of constants were similarly botched. [mip: trees.c] Ray Butterworth made a sensible suggestion that array definitions which aren't external and allocate no storage should elicit errors. I modified his suggestion slightly by moving the test into nidcl(), where we can be sure that an array isn't being initialized. I also adopted his suggestion for a lint warning for arrays with explicit dimensions that are lost by the rule that converts array types into pointer types in formal arguments; the warning is contingent on lint's -h flag. [mip: pftn.c] The structures for fpn and dpn nodes (floating point constants) were changed to make them conform in shape to the other nodes. [mip: ndu.h] Overenthusiastic SCONV optimization code in optim2() led to functions not cooperating with certain casts; e.g. unsigned char x(); ... y((char)x()) ... failed to sign-extend the result of x(). [pcc.vax: local2.c] A bug in outstruct() caused it to check 'stab[-1].name == NULL' for unnamed structures. For some reason this didn't break until my recent lint fixes tweaked the compiler in some subtle way... [pcc.vax: stab.c] The makefile was changed to pass C options to lint properly and to drop the '-a' flag to lint in the 'lintall' entry. '-a' really ought to work but unfortunately the arrangement of 'int'-like types in the compiler is extremely confusing and inconsistent, so I eventually gave up trying to force the issue. Sam Leffler's 'rel.c' for release information was also added. [pcc.vax: Makefile, rel.c] I fixed defid() so that it now insists that all argument type declarations must refer to names in the argument list. Previously you could get away with: int a; x() int a; { ... } Use of 'a' in the body of x() would elicit the immortal error message 'warning: bad arg temp'. [mip: pftn.c] For some reason, mainp2() in the two-pass version of the compiler had a switch statement with two 'default' cases. I zapped the obvious one. [mip: reader.c] Botched initializations sometimes left the declarations code in a funny state, so a fixinit() routine was invented to aid error recovery. For example, the following illegal program forced a core dump: char m[] = ; x() { y("splat"); } The compiler tried to read "splat" into m[] and died horribly. [mip: cgram.y, pftn.c] The compiler now warns 'can't take size of function' when asked to do something amazing like '... void f(); ... if (f[10]) ...'. Previously it produced a compiler error instead. [mip: pftn.c] If a program tried to access a value below the argument list on the stack using the clever tactic of manipulating the address of an argument, it drew a warning 'bad arg temp'. For the sake of these clever programs, the warning has been suppressed. [pcc.vax: local2.c] The assembler would sometimes print 'Branch too long: try -J flag' even when you used the -J flag and when there was no reason for it to be making long branches. This was due to a bug in jxxxfix() which caused tests for explodable branch instructions to terminate early in later phases of the topological sort, because the loop termination code didn't take into account the fact that not all code addresses are updated by jxxxbump(). The fix is to match the pointer to the data structure for the code in the sorted table rather than check for its generated address. [as: asjxxx.c] The peephole optimizer normally compares instructions for equality based on their instruction type and their operands. Unfortunately several instructions are too complex for c2 to handle and are given an instruction type of 0; thus all such instructions compared equal. The equop() routine was changed to test the names of these '0' type instructions for equality. [c2: c21.c] The compiler sometimes botched computations into stack temporaries by treating expressions like '-40(fp)[r1]' as permissible temporaries. I rewrote the shtemp() routine to make it explicit that the 'stack temporary' goal in code generation means precisely that the final expression must not contain any references to temporary registers (like r1). I had to add a couple templates to the code table to push exceedingly complex OREG expressions onto the stack when this goal is attempted. [pcc.vax: local2.c, table.c] A bug sometimes caused a redeclaration of an array in an inner scope to affect the outer array when the outer array was incompletely specified (by leaving out the most significant dimension). For example: extern int a[]; x(){ int a[10]; } int a[20]; This code would elicit the error 'foo.c, line 3: redeclaration of a'. The routine defid() is used to enter new definitions; for some reason scope problems are not resolved in defid() until much other code has been executed, including code that deals with filling out array sizes. The change makes defid() notice inner scopes with auto and register declarations earlier than usual. [mip: pftn.c] Case expressions are explicitly restricted to contain only int constants, char constants and sizeof expressions (C Reference Manual section 15). Previously the compiler didn't test for expressions like '(int) &foo[10]' and thus it would generate some rather bogus code. Expressions which resolve to names now elicit the same 'non-constant case expression' warning which you receive for variables. [mip: cgram.y] The value of an assignment to an unsigned bitfield was signed through an oversight in the code table. [pcc.vax: table.c] From Sam Kendall, a fix to prevent structs, unions, floats, doubles and void from being cast to pointer types. [mip: trees.c] Some relict code in moditype() was causing void functions not to be converted into pointers in some situations. [mip: trees.c] A minor optimization -- when the lhs of a simple assignment op (&=, |=, ^=, +-, -=) is smaller than int size, we can sometimes pun the rhs and avoid promoting the lhs to int, performing the operation in int width and converting from int back to the lhs type. For example: register char *ap, *bp; *ap++ |= *bp << 1; This used to require 7 instructions, but now needs only 3. [pcc.vax: table.c] At some point I added code to conval() to balance types before performing constant folding... While hacking on the tahoe compiler, I decided that this code was too complex and replaced it with equivalent code that's shorter and easier to understand. [mip: trees.c] Lines containing multiple statements were broken up for the sake of tracing with the source debugger in tcheck() and talloc(). [mip: common.c] I discovered that the C compiler called urem() in three different places with a constant divisor... In my subsequent rampage I hacked the compiler to generate inline code for all unsigned division and modulus operations with constant divisors. The largest inline expansion should use only 5 instructions, with most using just 3 or 4. The changes touched several files but really weren't very messy. [mip: pass1.h, match.c; pcc.vax: local2.c, order.c, table.c] A lot of new code was added to handle a really simple problem: unsigned char uc = 255; if (uc == -1) ... This incorrectly tested true, because the compiler generated a test that looked only at the low order byte of the constant. Not only that, but the compiler didn't realize that this test could be short- circuited, since -1 is equal to 4294967295 unsigned and is hence out of the range of an unsigned char. Rather than add lots of cruft to the code table, I shoved it into optim2() -- the compiler now picks up all the absurd cases where a constant is out of the range of precision of a variable it's tested against. To avoid having to write lots of code templates to handle unbalanced unsigned/signed expressions, I forced tymatch() to take notice of unbalanced expressions and promote the signed operand to unsigned (except with assignment operators, sigh). This change in turned required tweaks in autoincr() and in the code table to get code quality back. I hope I can come up with a better way to do this... [mip: trees.c; pcc.vax: local2.c, order.c, table.c] The value of TNULL was changed from 'pointer to undef' to 'pointer to member of enum' so that 'void *' can be a real type. TNULL is used to tag unused symbol table slots. [mip: manifest.h] A bug in clearst() led to problems with 'schain botch' errors. When a hash collision occurs, a symbol is (linearly) rehashed; if the symbol which forced the rehash is deleted, the relook() loop in clearst() will cause another symbol with the same hash code to move up and replace the deleted symbol. Torek's 'schain' hack for speedy identification of symbols at the same block level will get screwed up by this operation since it relies on a linked list of table entries -- moving an entry garbles the list. How did this code ever work before? [mip: pftn.c] Changed putins() in ascode.c in the assembler to permit 0(pc) as a write operand... Previously the assembler automatically optimized this to (pc), which is an illegal operand. [as.vax: ascode.c] The complement of an unsigned char or unsigned short value should have its high bits set, since the 'usual arithmetic conversions' widen these small integers 'before' the operation. [pcc.vax: table.c] A minor code improvement in ccom led to problems in c2 -- c2 was able to optimize sequences like 'cvtbl -4(fp),r0; bicl2 $-256,r0' but not the (shorter and faster) 'cvtbl -4(fp),r0; movzbl r0,r0'. A change in bflow() causes redundant conversions to be noted and removed, restoring code quality. [c2: c21.c] A typo in the 'bitsize' array definition resulted in an unterminated comment which screwed up the bit sizes for several types. I only noticed this because I ran the source off with vgrind and the error was exposed by comment highlighting... [c2: c21.c] An earlier change to conval() caused LONG and ULONG types to be hacked into INT and UNSIGNED; this was fine for the (VAX) compiler, but led to inconsistencies with lint. [mip: trees.c] When a syntax error occurs, the parser throws away tokens until it can enter a known state. If a string or character constant delimiter is tossed, the parser will try to interpret the contents of the constant as code and can get very confused. A hack was added to yylex() to detect this situation -- basically, if a delimiter is seen but the string or character constant has not been processed by lxstr() at the next call to yylex(), yylex() will call lxstr() itself and dispose of the rest of the constant. [mip: scan.c] Following a suggestion by Arthur Olsen, the production for 'switch' was modified to complain about constant switch expressions with 'lint -h'. [mip: cgram.y] Another Arthur Olsen bug report pointed out a problem with increment operations that don't match a code template... Two attempts at rewriting the increment are made: the first tries to turn the lvalue operand into an OREG, and the second applies a tree transformation to convert 'x++' into '(x += sizeof x) - sizeof x'. A mistake in the routine setincr() caused the lvalue operand in its entirety to be generated into a register instead of just the lvalue operand's address, producing something like 'r0 = x, (r0 += sizeof x) - sizeof x' instead of 'r0 = &x, (*r0 += sizeof x) - sizeof x'. [pcc.vax: order.c] Better code for floating post-increment and -decrement can be generated with a simple change to the code table and to zzzcode() so that the same hack for ordinary post-increment will work for floating point too. [pcc.vax: local2.c, table.c] I added Arthur Olsen's massive lint fixes for typechecking printf(). It sure would be nice if there were a way to specify new printf-like commands at execute time, perhaps through lint directives embedded in include files. [lint: lint.c] Arthur's warning about superfluous backslashes was added to lxstr(). Rather than adding Arthur's (expensive) code for warning about the use of '$', I simply made it illegal (unless 'VMS' is defined). I also took the opportunity to remove '`' gcos BCD constants. I made a slight alteration to yylex() to cause it to eat unknown characters rather than punt, since this seemed more useful. [mip: scan.c] Lint would sometimes print a bogus 'i set but not used' warning in situations like this: static int i; static int *ip = &i; i = 1; return *ip; If you moved the initialization out of the declaration, the warning disappeared. I installed Arthur's hack for forcing lint to examine initializations. This causes lint to treat initializations of auto, register and static variables as 'uses' and to ignore sizeof expressions as 'uses'. Also, '&i' in a static or external initialization is now a 'set' and a 'use' of 'i'. [lint: lint.c; mip: cgram.y, pftn.c] VARARGS0 is now correctly treated differently from plain VARARGS. I don't remember who originally noticed this... [lint: lint.c, lpass2.c] The register allocation code failed to 'share' register pairs. I don't know why this escaped notice for this long... I added a bit in the 'busy' array to keep track of pairs and modified the code in usable() to notice pairs and try to 'share' them. Some other code which treated the values of busy[] elements as arithmetic values had to be changed; there is now a macro which performs the proper test. [mip: allo.c, match.c, pass2.h] Some extensive code tweaking... (1) If order() is called on to rewrite a UNARY MUL node and that node has a usable index expression, we now try to rewrite the base into a register so that oreg2() will produce a doubly-indexed OREG. This is usually an impressive space saving. (2) Instead of laboriously copying a constant 0.0 in data space to clear a double or a float, we issue the proper 'clrd' or 'clrf'. This is done by a trick using an alternate prtdcon() routine; I'm not sure who invented it. I guess I'm still not prepared to hack in support for floating literals and immediate constants. (3) The conversion code now handles stack pushes directly, which often saves a spill to register. With very little adjustment, this also buys us optimally small pushes of constants. (4) Pointer comparisons are now unsigned; I'm not sure what this really buys us, but I added it anyway. (5) AND tests against constants are 'small' if both the constant and the other operand are also 'small'. (6) base() now recognizes that NAME nodes can be used in pc relative deferred indexed addressing, which is much more compact than the equivalent code to compute the address into a register and indirect through it. (7) The optimization code for ANDing with a constant now tries to produce a positive mask when small types are used so that literal operands are possible; a side effect is that the code is more readable. (8) UCHAR/USHORT to FLOAT/DOUBLE conversions take an extra step through INT type to avoid the overhead of an UNSIGNED to FLOAT/DOUBLE conversion. (9) If a logical operator sits above a pair of FLOAT to DOUBLE conversions, the conversions are deleted. (10) Vast numbers of redundant or useless templates were deleted from the code table. (11) Conversions to FLOAT in double-only arithmetic now go to the trouble of clipping off excess precision from INT and DOUBLE operands. (12) DOUBLE to DOUBLE conversions introduced by reclaim() are now silently deleted in the table. (13) A few 'movd's were turned into 'movq's -- more work needs to be done to make this consistent. [mip: reader.c; pcc.vax: macdefs.h, local.c, local2.c, table.c] A bug which caused assignment op expressions with an unsigned char or unsigned short lhs and a floating rhs to treat the lhs as signed was fixed. Some conversion-related stuff which used to be done in the table is now done in sconv() so that it's easier to handle and so that zzzcode() and its descendants can more safely perform conversions by calling zzzcode(p, 'A'). [pcc.vax: local2.c, table.c] The code for setting the type of a floating point constant was bogus. A floating constant was float if it fit in a float without loss of precision, otherwise it was double. This caused silliness like unexpectedly losing low order bits of integers in mixed floating and integral expressions. The fix was to adopt the ANSI proposal that all floating constants are type double unless they bear an 'f' or 'F' suffix, in which case they are type float. (Note that a cast to float has the same effect as a 'f' suffix and is just as efficient, but I conceded to the evident popularity of the 'f' suffix...) [mip: scan.c; pcc.vax: local.c] The ASG OPSIMP templates that produce byte and word instructions for byte and word destinations weren't being activated very often because the constant operands weren't normalized. I added code to optim2() to appropriately reduce the range of constant operands of ASG OPSIMP operators and sign-extend. This blows away many useless conversions to and from int. [pcc.vax: local2.c] The template for assignment ops with unsigned char/short lhs and floating rhs indicated register sharing for the wrong operand... [pcc.vax: table.c] The new template that handled OREG for INTEMP failed to take into account the size variation in OREG objects. [pcc.vax: table.c] The offstar() routine tries to tweak UNARY MUL trees so that they can be handled most effectively by VAX addressing modes. The code for identifying index expressions was adjusted so that more indexed addressing modes can be produced. [pcc.vax: order.c] Bogus error messages were being emitted for certain initializations following an earlier legitimate error. It turns out that the optimization to prevent initialization code from being emitted after errors was preventing the initializer offset counter from being updated, and when this occurs, the initialization code screws up -- for example, string constants appear to be zero length. The initialization code now always updates the offset even if errors have been detected, although code generation is still suppressed. [pcc.vax: local.c] An assignment to a bitfield in an indexed int-width structure led to code generation failure due to an indexed OREG child of FLD. This is taboo because the VAX field instructions have byte-size side effects, and code in clocal() arranged for indexed structs to have int width. I changed clocal() to use byte width instead and it appears to work now (and even uses indexed byte addressing correctly). [pcc.vax: local.c] For some reason the unsigned-to-floating conversion code has always been long and complex when it could be short and simple. I used the simple code in the Tahoe compiler but didn't think to put it in the VAX compiler until prodded by Robert Firth... [pcc.vax: local2.c] John Gilmore noticed that the ! operator didn't work with floating constants; this was pretty easy to fix. [mip: trees.c] For some reason, opact() put left and right shifts through tymatch(). The type balancing of tymatch() is wrong for shifts -- the type of the shift depends only on the left operand, while the right operand is converted to int. We now use the shift special case in buildtree() to fix the type. [mip: trees.c] Following ANSI (for once) we eliminate warnings for pointer conversions involving void *. [mip: trees.c] There were at least a couple bugs in c2 with code that converts 'ashl $2,rA,rB; movab _x[rB],rC' into 'moval _x[rB],rC'; one caused the type to be wrong ('movab' for 'moval'), one caused neighboring instructions to get deleted. [c2.vax: c21.c] A branch to a redundant test sometimes resulted in c2's deleting the label too, even if the label itself was not redundant. [c2.vax: c21.c]