There was a bug which caused compiler errors when floating values were
assigned to bit fields. Some overenthusiastic optimization on my part
led to a float-to-int conversion on the rhs of the assignment being
dropped. Bit fields are now treated as a special case. [pcc.vax:
local2.c]

At the instigation of Rob Pike, enums were neutered: they now behave
exactly like ints. The only traces of Johnson's treatment of enums are
warnings about clashes between one enum type and another enum type in
certain expressions. This fix was trickier than it looked; it would
have been much simpler to drop ALL warnings about enums, as Rob would
recommend. [mip: trees.c]

Arthur Olsen pointed out a bug with the evaluation of constant
expressions -- the usual arithmetic conversions are not always
performed. Rather than follow Arthur's simple suggestion, I decided to
be safe and arranged to use the same conversion code on constants that
we use on variables. We short-cut the test if the operand(s) are ints,
which is the usual case, so the impact on compile time should be
small. [mip: trees.c]

Guy Harris noted that chkpun() considered multiply dimensioned arrays
and multiply indirect pointers to be the same thing, which they are
most certainly not. His fix causes an 'illegal pointer combination'
warning to be emitted for code like:

        char **cpp, c2a[10][20]; cpp = c2a; cpp[5][3] = 'a';

The new code is actually simpler than the old... [mip: trees.c]

Another irritation of Rob Pike's is fixed -- the old-style assignment
ops have been ifdef'ed out. No more of those obnoxious 'ambiguous
assignment: assignment op taken' messages! [mip: scan.c]

Yet another Rob Pike complaint: you couldn't refer directly to the
members of a structure which had been returned by a function. You may
now utter things like 'f().a' and get away with them.
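For the record, here is a minimal sketch of the newly legal form. It is
written in present-day C rather than the dialect ccom accepted, and the
names 'struct pair', f() and first_of_f() are invented for illustration;
none of them come from the compiler sources.

```c
/* Hypothetical structure type, used only for this illustration. */
struct pair { int a; int b; };

/* Returns a structure by value. */
struct pair f(void)
{
    struct pair p;
    p.a = 1;
    p.b = 2;
    return p;
}

/* Selects a member directly from the returned structure. */
int first_of_f(void)
{
    return f().a;
}
```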
The illegal form '(&f())->a' used to work; since this has the same tree
representation as the legal form, an ugly wart was added to the action
for '&' which specifically rules this form out. To be consistent, a
similar exception was made for expressions of the form '(a = b).c'.
[mip: cgram.y]

Moved configuration flags from Makefile into macdefs.h and converted
some more trivial routines into macros. This cleans up the Makefile
considerably and consolidates assembler-dependent flags. [pcc.vax:
macdefs.h, code.c, order.c, Makefile; mip: onepass.h]

More efficiency hacks: the compiler now does its best to avoid emitting
ANY code after an error is detected. Previously only code generation
through the code table was suppressed after an error. This change can
buy up to 25% in speed improvements if the dbx stab flag is enabled.
The latest 'ccom' runs twice as fast as the 4.2 BSD compiler in this
situation... [pcc.vax: macdefs.h, local.c, code.c; mip: cgram.y,
scan.c, pftn.c]

For ANSI compatibility, we now accept void expressions in the colon
part of a conditional expression. Both subexpressions under the colon
operator must be void if one is void. The overall type of the
conditional expression is void in this case. [mip: trees.c]

As long as we're doing away with old-fashioned assignment operators, we
might as well terminate old-fashioned initializations too. [mip: cgram.y]

A fix from James Schoner for bitfield assignments was installed. The
problem was that the rhs of a bitfield assignment was used as its
value, an overlooked aspect of the several earlier bug fixes for
problems of this variety. The bug caused

        struct { unsigned a:4, b:3, c:2; } x;
        int i;
        i = x.b = 255;

to store 255 in i instead of 7. [pcc.vax: table.c, local2.c]

Scanner code was added to handle the System V/ANSI preprocessor
extensions #ident and #pragma. Currently #ident is ignored.
#pragma may be used as a substitute for lint-style comment directives;
e.g. use '#pragma LINTLIBRARY' for '/* LINTLIBRARY */'. The #pragma
stuff requires the hacked-up cpp with ANSI extensions which may
eventually be put into circulation here at Utah. The #line control from
cpp is now recognized with the same syntax by ccom. Unknown control
lines elicit a warning. A small bug is fixed by this new code --
previously #ident and other unknown controls caused the line control
mechanism to get screwed up, so that bogus dbx stabs were put in the
output. [mip: scan.c]

There seems to be a loophole in initializations -- apparently it is
legal to initialize a bitfield with a symbolic constant (i.e. the
address of something). No loader that I know of will handle this! The
compiler now emits an error when someone tries this trick; it really
can't do any better. [mip: pftn.c]

Someone complained that all illegal characters were being printed as
octal numbers in the error message. This was changed so that printable
characters are printed normally, and all funny characters are printed
in C 'char' style. [mip: scan.c]

Conversion of unsigned constants to floating point values was broken
because the requisite cast in makety() had been commented out! Argh.
Unsigned comparisons of constants were similarly botched. [mip: trees.c]

Ray Butterworth made a sensible suggestion that array definitions which
aren't external and allocate no storage should elicit errors. I
modified his suggestion slightly by moving the test into nidcl(), where
we can be sure that an array isn't being initialized. I also adopted
his suggestion for a lint warning for arrays with explicit dimensions
that are lost by the rule that converts array types into pointer types
in formal arguments; the warning is contingent on lint's -h flag.
[mip: pftn.c]

The structures for fpn and dpn nodes (floating point constants) were
changed to make them conform in shape to the other nodes. [mip: ndu.h]

Overenthusiastic SCONV optimization code in optim2() led to functions
not cooperating with certain casts; e.g.

        unsigned char x(); ... y((char)x()) ...

failed to sign-extend the result of x(). [pcc.vax: local2.c]

A bug in outstruct() caused it to check 'stab[-1].name == NULL' for
unnamed structures. For some reason this didn't break until my recent
lint fixes tweaked the compiler in some subtle way... [pcc.vax: stab.c]

The makefile was changed to pass C options to lint properly and to drop
the '-a' flag to lint in the 'lintall' entry. '-a' really ought to
work but unfortunately the arrangement of 'int'-like types in the
compiler is extremely confusing and inconsistent, so I eventually gave
up trying to force the issue. Sam Leffler's 'rel.c' for release
information was also added. [pcc.vax: Makefile, rel.c]

I fixed defid() so that it now insists that all argument type
declarations must refer to names in the argument list. Previously you
could get away with:

        int a;
        x()
        int a;
        { ... }

Use of 'a' in the body of x() would elicit the immortal error message
'warning: bad arg temp'. [mip: pftn.c]

For some reason, mainp2() in the two-pass version of the compiler had a
switch statement with two 'default' cases. I zapped the obvious one.
[mip: reader.c]

Botched initializations sometimes left the declarations code in a funny
state, so a fixinit() routine was invented to aid error recovery. For
example, the following illegal program forced a core dump:

        char m[] = ;
        x() {
                y("splat");
        }

The compiler tried to read "splat" into m[] and died horribly.
[mip: cgram.y, pftn.c]

The compiler now warns 'can't take size of function' when asked to do
something amazing like '... void f(); ... if (f[10]) ...'. Previously
it produced a compiler error instead. [mip: pftn.c]

If a program tried to access a value below the argument list on the
stack using the clever tactic of manipulating the address of an
argument, it drew a warning 'bad arg temp'. For the sake of these
clever programs, the warning has been suppressed. [pcc.vax: local2.c]

The assembler would sometimes print 'Branch too long: try -J flag' even
when you used the -J flag and when there was no reason for it to be
making long branches. This was due to a bug in jxxxfix() which caused
tests for explodable branch instructions to terminate early in later
phases of the topological sort, because the loop termination code
didn't take into account the fact that not all code addresses are
updated by jxxxbump(). The fix is to match the pointer to the data
structure for the code in the sorted table rather than check for its
generated address. [as: asjxxx.c]

The peephole optimizer normally compares instructions for equality
based on their instruction type and their operands. Unfortunately
several instructions are too complex for c2 to handle and are given an
instruction type of 0; thus all such instructions compared equal. The
equop() routine was changed to test the names of these '0' type
instructions for equality. [c2: c21.c]

The compiler sometimes botched computations into stack temporaries by
treating expressions like '-40(fp)[r1]' as permissible temporaries. I
rewrote the shtemp() routine to make it explicit that the 'stack
temporary' goal in code generation means precisely that the final
expression must not contain any references to temporary registers (like
r1).
I had to add a couple templates to the code table to push exceedingly
complex OREG expressions onto the stack when this goal is attempted.
[pcc.vax: local2.c, table.c]

A bug sometimes caused a redeclaration of an array in an inner scope to
affect the outer array when the outer array was incompletely specified
(by leaving out the most significant dimension). For example:

        extern int a[];
        x(){ int a[10]; }
        int a[20];

This code would elicit the error 'foo.c, line 3: redeclaration of a'.
The routine defid() is used to enter new definitions; for some reason
scope problems are not resolved in defid() until much other code has
been executed, including code that deals with filling out array sizes.
The change makes defid() notice inner scopes with auto and register
declarations earlier than usual. [mip: pftn.c]

Case expressions are explicitly restricted to contain only int constants,
char constants and sizeof expressions (C Reference Manual section 15).
Previously the compiler didn't test for expressions like '(int) &foo[10]'
and thus it would generate some rather bogus code. Expressions which
resolve to names now elicit the same 'non-constant case expression'
warning which you receive for variables. [mip: cgram.y]

The value of an assignment to an unsigned bitfield was signed through
an oversight in the code table. [pcc.vax: table.c]

From Sam Kendall, a fix to prevent structs, unions, floats, doubles
and void from being cast to pointer types. [mip: trees.c]

Some relict code in moditype() was causing void functions not to be
converted into pointers in some situations. [mip: trees.c]

A minor optimization -- when the lhs of a simple assignment op (&=, |=,
^=, +=, -=) is smaller than int size, we can sometimes pun the rhs and
avoid promoting the lhs to int, performing the operation in int width
and converting from int back to the lhs type.
For example:

        register char *ap, *bp;
        *ap++ |= *bp << 1;

This used to require 7 instructions, but now needs only 3. [pcc.vax:
table.c]

At some point I added code to conval() to balance types before
performing constant folding... While hacking on the tahoe compiler, I
decided that this code was too complex and replaced it with equivalent
code that's shorter and easier to understand. [mip: trees.c]

Lines containing multiple statements were broken up for the sake of
tracing with the source debugger in tcheck() and talloc(). [mip:
common.c]

I discovered that the C compiler called urem() in three different
places with a constant divisor... In my subsequent rampage I hacked
the compiler to generate inline code for all unsigned division and
modulus operations with constant divisors. The largest inline
expansion should use only 5 instructions, with most using just 3 or 4.
The changes touched several files but really weren't very messy.
[mip: pass1.h, match.c; pcc.vax: local2.c, order.c, table.c]

A lot of new code was added to handle a really simple problem:

        unsigned char uc = 255;
        if (uc == -1) ...

This incorrectly tested true, because the compiler generated a test
that looked only at the low order byte of the constant. Not only that,
but the compiler didn't realize that this test could be short-circuited,
since -1 is equal to 4294967295 unsigned and is hence out of
the range of an unsigned char. Rather than add lots of cruft to the
code table, I shoved it into optim2() -- the compiler now picks up all
the absurd cases where a constant is out of the range of precision of
a variable it's tested against. To avoid having to write lots of code
templates to handle unbalanced unsigned/signed expressions, I forced
tymatch() to take notice of unbalanced expressions and promote the
signed operand to unsigned (except with assignment operators, sigh).
This change in turn required tweaks in autoincr() and in the code
table to get code quality back. I hope I can come up with a better way
to do this... [mip: trees.c; pcc.vax: local2.c, order.c, table.c]

The value of TNULL was changed from 'pointer to undef' to 'pointer to
member of enum' so that 'void *' can be a real type. TNULL is used to
tag unused symbol table slots. [mip: manifest.h]

A bug in clearst() led to problems with 'schain botch' errors. When a
hash collision occurs, a symbol is (linearly) rehashed; if the symbol
which forced the rehash is deleted, the relook() loop in clearst() will
cause another symbol with the same hash code to move up and replace the
deleted symbol. Torek's 'schain' hack for speedy identification of
symbols at the same block level will get screwed up by this operation
since it relies on a linked list of table entries -- moving an entry
garbles the list. How did this code ever work before? [mip: pftn.c]

Changed putins() in ascode.c in the assembler to permit 0(pc) as a
write operand... Previously the assembler automatically optimized
this to (pc), which is an illegal operand. [as.vax: ascode.c]

The complement of an unsigned char or unsigned short value should have
its high bits set, since the 'usual arithmetic conversions' widen these
small integers 'before' the operation. [pcc.vax: table.c]

A minor code improvement in ccom led to problems in c2 -- c2 was able
to optimize sequences like 'cvtbl -4(fp),r0; bicl2 $-256,r0' but not
the (shorter and faster) 'cvtbl -4(fp),r0; movzbl r0,r0'. A change in
bflow() causes redundant conversions to be noted and removed, restoring
code quality. [c2: c21.c]

A typo in the 'bitsize' array definition resulted in an unterminated
comment which screwed up the bit sizes for several types.
I only noticed this because I ran the source off with vgrind and the
error was exposed by comment highlighting... [c2: c21.c]

An earlier change to conval() caused LONG and ULONG types to be hacked
into INT and UNSIGNED; this was fine for the (VAX) compiler, but led
to inconsistencies with lint. [mip: trees.c]

When a syntax error occurs, the parser throws away tokens until it can
enter a known state. If a string or character constant delimiter is
tossed, the parser will try to interpret the contents of the constant
as code and can get very confused. A hack was added to yylex() to
detect this situation -- basically, if a delimiter is seen but the
string or character constant has not been processed by lxstr() at the
next call to yylex(), yylex() will call lxstr() itself and dispose of
the rest of the constant. [mip: scan.c]

Following a suggestion by Arthur Olsen, the production for 'switch' was
modified to complain about constant switch expressions with 'lint -h'.
[mip: cgram.y]

Another Arthur Olsen bug report pointed out a problem with increment
operations that don't match a code template... Two attempts at
rewriting the increment are made: the first tries to turn the lvalue
operand into an OREG, and the second applies a tree transformation to
convert 'x++' into '(x += sizeof x) - sizeof x'. A mistake in the
routine setincr() caused the lvalue operand in its entirety to be
generated into a register instead of just the lvalue operand's address,
producing something like 'r0 = x, (r0 += sizeof x) - sizeof x' instead
of 'r0 = &x, (*r0 += sizeof x) - sizeof x'. [pcc.vax: order.c]

Better code for floating post-increment and -decrement can be generated
with a simple change to the code table and to zzzcode() so that the
same hack for ordinary post-increment will work for floating point too.
[pcc.vax: local2.c, table.c]

I added Arthur Olsen's massive lint fixes for typechecking printf().
It sure would be nice if there were a way to specify new printf-like
commands at execute time, perhaps through lint directives embedded in
include files. [lint: lint.c]

Arthur's warning about superfluous backslashes was added to lxstr().
Rather than adding Arthur's (expensive) code for warning about the
use of '$', I simply made it illegal (unless 'VMS' is defined). I
also took the opportunity to remove '`' gcos BCD constants. I made
a slight alteration to yylex() to cause it to eat unknown characters
rather than punt, since this seemed more useful. [mip: scan.c]

Lint would sometimes print a bogus 'i set but not used' warning in
situations like this:

        static int i;
        static int *ip = &i;

        i = 1;
        return *ip;

If you moved the initialization out of the declaration, the warning
disappeared. I installed Arthur's hack for forcing lint to examine
initializations. This causes lint to treat initializations of auto,
register and static variables as 'uses' and to ignore sizeof
expressions as 'uses'. Also, '&i' in a static or external
initialization is now a 'set' and a 'use' of 'i'. [lint: lint.c; mip:
cgram.y, pftn.c]

VARARGS0 is now correctly treated differently from plain VARARGS.
I don't remember who originally noticed this... [lint: lint.c,
lpass2.c]

The register allocation code failed to 'share' register pairs. I don't
know why this escaped notice for this long... I added a bit in the
'busy' array to keep track of pairs and modified the code in usable()
to notice pairs and try to 'share' them. Some other code which treated
the values of busy[] elements as arithmetic values had to be changed;
there is now a macro which performs the proper test. [mip: allo.c,
match.c, pass2.h]

Some extensive code tweaking...
(1) If order() is called on to rewrite
a UNARY MUL node and that node has a usable index expression, we now
try to rewrite the base into a register so that oreg2() will produce a
doubly-indexed OREG. This is usually an impressive space saving. (2)
Instead of laboriously copying a constant 0.0 in data space to clear a
double or a float, we issue the proper 'clrd' or 'clrf'. This is done
by a trick using an alternate prtdcon() routine; I'm not sure who
invented it. I guess I'm still not prepared to hack in support for
floating literals and immediate constants. (3) The conversion code now
handles stack pushes directly, which often saves a spill to register.
With very little adjustment, this also buys us optimally small pushes
of constants. (4) Pointer comparisons are now unsigned; I'm not sure
what this really buys us, but I added it anyway. (5) AND tests against
constants are 'small' if both the constant and the other operand are
also 'small'. (6) base() now recognizes that NAME nodes can be used in
pc relative deferred indexed addressing, which is much more compact
than the equivalent code to compute the address into a register and
indirect through it. (7) The optimization code for ANDing with a
constant now tries to produce a positive mask when small types are used
so that literal operands are possible; a side effect is that the code
is more readable. (8) UCHAR/USHORT to FLOAT/DOUBLE conversions take an
extra step through INT type to avoid the overhead of an UNSIGNED to
FLOAT/DOUBLE conversion. (9) If a logical operator sits above a pair
of FLOAT to DOUBLE conversions, the conversions are deleted. (10) Vast
numbers of redundant or useless templates were deleted from the code
table. (11) Conversions to FLOAT in double-only arithmetic now go to
the trouble of clipping off excess precision from INT and DOUBLE
operands.
(12) DOUBLE to DOUBLE conversions introduced by reclaim()
are now silently deleted in the table. (13) A few 'movd's were turned
into 'movq's -- more work needs to be done to make this consistent.
[mip: reader.c; pcc.vax: macdefs.h, local.c, local2.c, table.c]

A bug which caused assignment op expressions with an unsigned char or
unsigned short lhs and a floating rhs to treat the lhs as signed was
fixed. Some conversion-related stuff which used to be done in the
table is now done in sconv() so that it's easier to handle and so that
zzzcode() and its descendants can more safely perform conversions by
calling zzzcode(p, 'A'). [pcc.vax: local2.c, table.c]

The code for setting the type of a floating point constant was bogus.
A floating constant was float if it fit in a float without loss of
precision, otherwise it was double. This caused silliness like
unexpectedly losing low order bits of integers in mixed floating and
integral expressions. The fix was to adopt the ANSI proposal that all
floating constants are type double unless they bear an 'f' or 'F'
suffix, in which case they are type float. (Note that a cast to float
has the same effect as an 'f' suffix and is just as efficient, but I
conceded to the evident popularity of the 'f' suffix...) [mip: scan.c;
pcc.vax: local.c]

The ASG OPSIMP templates that produce byte and word instructions for
byte and word destinations weren't being activated very often because
the constant operands weren't normalized. I added code to optim2() to
appropriately reduce the range of constant operands of ASG OPSIMP
operators and sign-extend. This blows away many useless conversions to
and from int. [pcc.vax: local2.c]

The template for assignment ops with unsigned char/short lhs and
floating rhs indicated register sharing for the wrong operand...
[pcc.vax: table.c]

The new template that handled OREG for INTEMP failed to take into
account the size variation in OREG objects. [pcc.vax: table.c]

The offstar() routine tries to tweak UNARY MUL trees so that they can
be handled most effectively by VAX addressing modes. The code for
identifying index expressions was adjusted so that more indexed
addressing modes can be produced. [pcc.vax: order.c]

Bogus error messages were being emitted for certain initializations
following an earlier legitimate error. It turns out that the
optimization to prevent initialization code from being emitted after
errors was preventing the initializer offset counter from being
updated, and when this occurs, the initialization code screws up -- for
example, string constants appear to be zero length. The initialization
code now always updates the offset even if errors have been detected,
although code generation is still suppressed. [pcc.vax: local.c]

An assignment to a bitfield in an indexed int-width structure led to
code generation failure due to an indexed OREG child of FLD. This is
taboo because the VAX field instructions have byte-size side effects,
and code in clocal() arranged for indexed structs to have int width.
I changed clocal() to use byte width instead and it appears to work now
(and even uses indexed byte addressing correctly). [pcc.vax: local.c]

For some reason the unsigned-to-floating conversion code has always
been long and complex when it could be short and simple. I used the
simple code in the Tahoe compiler but didn't think to put it in the VAX
compiler until prodded by Robert Firth... [pcc.vax: local2.c]

John Gilmore noticed that the ! operator didn't work with floating
constants; this was pretty easy to fix. [mip: trees.c]

For some reason, opact() put left and right shifts through tymatch().
The type balancing of tymatch() is wrong for shifts -- the type of the
shift depends only on the left operand, while the right operand is
converted to int. We now use the shift special case in buildtree() to
fix the type. [mip: trees.c]

Following ANSI (for once) we eliminate warnings for pointer conversions
involving void *. [mip: trees.c]

There were at least a couple bugs in c2 with code that converts 'ashl
$2,rA,rB; movab _x[rB],rC' into 'moval _x[rB],rC'; one caused the type
to be wrong ('movab' for 'moval'), one caused neighboring instructions
to get deleted. [c2.vax: c21.c]

A branch to a redundant test sometimes resulted in c2's deleting the
label too, even if the label itself was not redundant. [c2.vax: c21.c]
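
As an illustration of the shift typing rule from the opact()/tymatch()
entry above (the type of a shift follows the left operand alone), here
is a small sketch in present-day C. The function names are invented for
this note, and the sizeof trick merely exposes the result type; it is
not how ccom checks anything.

```c
#include <stddef.h>

/* An int shifted by a long count: the result keeps the type of the
 * (promoted) left operand, so this is sizeof(int), not sizeof(long). */
size_t shift_int_by_long(void)
{
    return sizeof(1 << 2L);
}

/* A long shifted by a char count: again the left operand decides,
 * so this is sizeof(long). */
size_t shift_long_by_char(void)
{
    return sizeof(1L << (char)2);
}
```

Had shifts been balanced like ordinary binary operators, the first
expression would have been widened to long by its right operand.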