/****************************************************************
Copyright (C) Lucent Technologies 1997
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
****************************************************************/

This file lists all bug fixes, changes, etc., made since the AWK book
was sent to the printers in August, 1987.

July 2, 2020:
	Merge PRs 85 and 86 which fix regressions. Thanks to
	Tim van der Molen for the fixes.

June 25, 2020:
	Merge PRs 82 and 84. The latter fixes issue #83. Thanks to
	Todd Miller and awkfan77.

June 12, 2020:
	Clear errno before calling errcheck to avoid any spurious errors
	left over from previous calls that may have set it. Thanks to
	Todd Miller for the fix, from PR #80.
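
	As an illustration only (this is not the code from PR #80, and
	convert_checked is a made-up helper), the general pattern in C
	looks roughly like this: errno is zeroed immediately before the
	call being checked, so a stale ERANGE from earlier work cannot be
	mistaken for a new error.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not awk's errcheck(): convert a string with
 * strtod() and report whether this particular call went out of range.
 */
int convert_checked(const char *s, double *out)
{
	assert(s != NULL && out != NULL);
	errno = 0;                 /* drop any value left by an earlier call */
	*out = strtod(s, NULL);
	return errno != ERANGE;    /* 1 if the conversion stayed in range */
}
```

	Without the errno = 0 line, a caller that had already seen ERANGE
	somewhere else would get a false failure for an ordinary input
	such as "2.5".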

	Fix Issue #78 by allowing \r to follow floating point numbers in
	lib.c:is_number. Thanks to GitHub user ajcarr for the report
	and to Arnold Robbins for the fix.
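
	A rough sketch of the idea, assuming a strtod()-based check
	(looks_numeric is a hypothetical name, not the function in lib.c):
	after the number is parsed, any trailing white space is skipped,
	and since isspace() accepts '\r', input with CRLF line endings
	still counts as numeric.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Sketch only: accept optional leading white space, one number as
 * parsed by strtod(), then optional trailing white space.
 */
int looks_numeric(const char *s)
{
	char *end;

	assert(s != NULL);
	while (isspace((unsigned char)*s))
		s++;
	if (*s == '\0')
		return 0;               /* empty or blank: not a number */
	(void)strtod(s, &end);
	if (end == s)
		return 0;               /* strtod consumed nothing */
	while (isspace((unsigned char)*end))
		end++;                  /* '\r' and '\n' are skipped here */
	return *end == '\0';        /* only white space may follow */
}
```

	With this, "3.14\r" is treated as a number while "3.14x" is not.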

June 5, 2020:
	In fldbld(), make sure that inputFS is set before trying to
	use it. Thanks to Steffen Nurpmeso <steffen@sdaoden.eu>
	for the report.

May 5, 2020:
	Fix checks for compilers that can handle noreturn. Thanks to
	GitHub user enh-google for pointing it out. Closes Issue #79.

April 16, 2020:
	Handle old compilers that don't support C11 (for noreturn).
	Thanks to Arnold Robbins.

April 5, 2020:
	Use <stdnoreturn.h> and noreturn instead of GCC attributes.
	Thanks to GitHub user awkfan77. Closes PR #77.

February 28, 2020:
	More cleanups from Christos Zoulas: notably backslash continuation
	inside strings removes the newline and a fix for RS = "^a".
	Fix for address sanitizer-found problem. Thanks to GitHub user
	enh-google.

February 19, 2020:
	More small cleanups from Christos Zoulas.

February 18, 2020:
	Additional cleanups from Christos Zoulas. It's no longer necessary
	to use the -y flag to bison.

February 6, 2020:
	Additional small cleanups from Christos Zoulas. awk is now
	a little more robust about reporting I/O errors upon exit.

January 31, 2020:
	Merge PR #70, which avoids use of variable length arrays. Thanks
	to GitHub user michaelforney.  Fix issue #60 ({0} in interval
	expressions doesn't work).  Also get all tests working again.
	Thanks to Arnold Robbins.

January 24, 2020:
	A number of small cleanups from Christos Zoulas.  Add the close
	on exec flag to files/pipes opened for redirection; courtesy of
	Arnold Robbins.
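
	One plausible way to set the flag on an already-open stdio stream
	is fcntl(2) with FD_CLOEXEC. The helper name set_cloexec is
	invented for this sketch, and the actual change may set the flag
	by other means.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: mark the descriptor behind a stdio stream
 * close-on-exec so that child processes do not inherit it.
 * Returns 0 on success, -1 on failure.
 */
int set_cloexec(FILE *fp)
{
	int fd, flags;

	assert(fp != NULL);
	fd = fileno(fp);
	if (fd < 0)
		return -1;
	flags = fcntl(fd, F_GETFD);          /* keep any flags already set */
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

	The point of the flag is that commands started later (through
	system() or a pipe) do not see descriptors that belong to
	unrelated redirections.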

January 19, 2020:
	If POSIXLY_CORRECT is set in the environment, then sub and gsub
	use POSIX rules for multiple backslashes.  This fixes Issue #66,
	while maintaining backwards compatibility.

January 9, 2020:
	Input/output errors on closing files are now fatal instead of
	mere warnings. Thanks to Martijn Dekker <martijn@inlv.org>.

January 5, 2020:
	Fix a bug in the concatenation of two string constants into
	one done in the grammar.  Fixes GitHub issue #61.  Thanks
	to GitHub user awkfan77 for pointing out the direction for
	the fix.  New test T.concat added to the test suite.
	Fix a few memory leaks reported by valgrind, as well.

December 27, 2019:
	Fix a bug whereby a{0,3} could match four a's.  Thanks to
	"Anonymous AWK fan" for the report.
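
	The intended semantics can be checked against the system's POSIX
	regex library. That library is used here only as a reference; awk
	has its own matcher, and match_length is a hypothetical helper
	written for this sketch.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Length of the first match of an extended regular expression in
 * text, or -1 if the pattern does not compile or does not match.
 */
int match_length(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m;
	int len = -1;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	if (regexec(&re, text, 1, &m, 0) == 0)
		len = (int)(m.rm_eo - m.rm_so);
	regfree(&re);
	return len;
}
```

	An anchored a{0,3} should take at most three characters of
	"aaaa", and ^a{0,3}$ should reject "aaaa" altogether.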

December 11, 2019:
	Further printf-related fixes for 32 bit systems.
	Thanks again to Christos Zoulas.

December 8, 2019:
	Fix the return value of sprintf("%d") on 32 bit systems.
	Thanks to Jim Lowe for the report and to Christos Zoulas
	for the fix.

November 10, 2019:
	Convert a number of Boolean integer variables into
	actual bools. Convert compile_time variable into an
	enum and simplify some of the related code.  Thanks
	to Arnold Robbins.

November 8, 2019:
	Fix from Ori Bernstein to get UTF-8 characters instead of
	bytes when FS = "".  This is currently the only bit of
	the One True Awk that understands multibyte characters.
	From Arnold Robbins, apply some cleanups in the test suite.
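
	To illustrate what splitting into characters rather than bytes
	involves, here is a sketch that walks a UTF-8 string by lead
	byte. The function names are hypothetical, continuation bytes
	are not validated, and this is not the code from the fix.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of bytes in the UTF-8 sequence that starts with lead byte c.
 * Bytes that are not valid lead bytes are treated as one-byte units.
 */
size_t utf8_seq_len(unsigned char c)
{
	if (c < 0x80)
		return 1;               /* ASCII */
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	if ((c & 0xF8) == 0xF0)
		return 4;
	return 1;
}

/* Count characters, not bytes, in a NUL-terminated UTF-8 string. */
size_t utf8_count(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	while (*s != '\0') {
		size_t len = utf8_seq_len((unsigned char)*s);
		size_t i = 1;

		/* stop early if the string ends inside a sequence */
		while (i < len && s[i] != '\0')
			i++;
		s += i;
		n++;
	}
	return n;
}
```

	With FS = "", each character counted this way would become one
	field, so a two-byte character yields one field rather than two.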

October 25, 2019:
	More fixes and cleanups from NetBSD, courtesy of Christos
	Zoulas. Merges PRs 54 and 55.

October 24, 2019:
	Import second round of code cleanups from NetBSD. Much thanks
	to Christos Zoulas (GitHub user zoulasc). Merges PR 53.
	Add an optimization for string concatenation, also from
	Christos.

October 17, 2019:
	Import code cleanups from NetBSD. Much thanks to Christos
	Zoulas (GitHub user zoulasc). Merges PR 51.

October 6, 2019:
	Import code from NetBSD awk that implements RS as a regular
	expression.

September 10, 2019:
	Fixes for various array / memory overruns found via gcc's
	-fsanitize=unknown. Thanks to Alexander Richardson (GitHub
	user arichardson). Merges PRs 47 and 48.

July 28, 2019:
	Import grammar optimization from NetBSD: Two string constants
	concatenated together get turned into a single string.

July 26, 2019:
	Support POSIX-specified C-style escape sequences "\a" (alarm)
	and "\v" (vertical tab) in command line arguments and regular
	expressions, further to the support for them in strings added on
	Apr 9, 1989. These now no longer match as literal "a" and "v"
	characters (as they don't on other awk implementations).
	Thanks to Martijn Dekker.

July 17, 2019:
	Pull in a number of code cleanups and minor fixes from
	Warner Losh's bsd-ota branch.  The only user visible change
	is the use of random(3) as the random number generator.
	Thanks to Warner Losh for collecting all these fixes in
	one easy place to get them from.

July 16, 2019:
	Fix field splitting to use FS value as of the time a record
	was read or assigned to.  Thanks to GitHub user Cody Mello (melloc)
	for the fix. (Merged from his branch, via PR #42.) Updated
	testdir/T.split per said PR as well.

June 24, 2019:
	Extract awktest.tar into testdir directory. Add some very
	simple mechanics to the makefile for running the tests and
	for cleaning up. No changes to awk itself.

June 17, 2019:
	Disallow deleting SYMTAB and its elements, which creates
	use-after-free bugs. Thanks to GitHub user Cody Mello (melloc)
	for the fix. (Merged from PR #43.)

June 5, 2019:
	Allow unmatched right parenthesis in a regular expression to
	be treated literally. Fixes Issue #40. Thanks to GitHub user
	Warner Losh (bsdimp) for the report. Thanks to Arnold Robbins
	for the fix.

May 29, 2019:
	Fix check for command line arguments to no longer require that
	first character after '=' not be another '='. Reverts change of
	August 11, 1989. Thanks to GitHub user Jamie Landeg Jones for
	pointing out the issue; from Issue #38.

Apr 7, 2019:
	Update awktest.tar(p.50) to use modern options to sort. Needed
	for Android development. Thanks to GitHub user mohd-akram (Mohamed
	Akram).  From Issue #33.

Mar 12, 2019:
	Added very simplistic support for cross-compiling in the
	makefile.  We are NOT going to go in the direction of the
	autotools, though.  Thanks to GitHub user nee-san for
	the basic change. (Merged from PR #34.)

Mar 5, 2019:
	Added support for POSIX-standard interval expressions (a.k.a.
	bounds, a.k.a. repetition expressions) in regular expressions,
	backported (via NetBSD) from Apple awk-24 (20070501).
	Thanks to Martijn Dekker <martijn@inlv.org> for the port.
	(Merged from PR #30.)

Mar 3, 2019:
	Merge PRs as follows:
	#12: Avoid undefined behaviour when using ctype(3) functions in
	     relex(). Thanks to GitHub user iamleot.
	#31: Make getline handle numeric strings, and update FIXES. Thanks
	     to GitHub user arnoldrobbins.
	#32: maketab: support build systems with read-only source. Thanks
	     to GitHub user enh.

Jan 25, 2019:
	Make getline handle numeric strings properly in all cases.
	(Thanks, Arnold.)

Jan 21, 2019:
	Merged a number of small fixes from GitHub pull requests.
	Thanks to GitHub users Arnold Robbins (arnoldrobbins),
	Cody Mello (melloc) and Christoph Junghans (junghans).
	PR numbers: 13-21, 23, 24, 27.

Oct 25, 2018:
	Added test in maketab.c to prevent generating a proctab entry
	for YYSTYPE_IS_DEFINED.  It was harmless but some gcc settings
	generated a warning message.  Thanks to Nan Xiao for the report.

Aug 27, 2018:
	Disallow '$' in printf formats; arguments evaluated in order
	and printed in order.

	Added some casts to silence warnings on debugging printfs.
	(Thanks, Arnold.)

Aug 23, 2018:
	A long list of fixes courtesy of Arnold Robbins,
	to whom profound thanks.

	1. ofs-rebuild: OFS value used to rebuild the record was incorrect.
	Fixed August 19, 2014. Revised fix August 2018.

	2. system-status: Instead of a floating-point division by 256, use
	the wait(2) macros to create a reasonable exit status.
	Fixed March 12, 2016.

	3. space: Use provided xisblank() function instead of isspace() for
	matching [[:blank:]].

	4. a-format: Add POSIX standard %a and %A to supported formats. Check
	at runtime that this format is available.

	5. decr-NF: Decrementing NF did not change $0. This is a decades-old
	bug. There are interactions with the old and new value of OFS as well.
	Most of the fix came from the NetBSD awk.

	6. string-conv: String conversions of scalars were sticky.  Once a
	conversion to string happened, even with OFMT, that value was used until
	a new numeric value was assigned, even if OFMT differed from CONVFMT,
	and also if CONVFMT changed.

	7. unary-plus: Unary plus on a string constant returned the string.
	Instead, it should convert the value to numeric and give that value.

	Also added Arnold's tests for these to awktest.tar as T.arnold.
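
	For item 2 (system-status), decoding with the wait(2) macros
	might look like the sketch below. The function names are
	hypothetical, and reporting a signal as 256 plus the signal
	number is a choice made for this example, not necessarily what
	awk's system() returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Turn a raw status from system(3) into one small integer:
 * the exit code for a normal exit, 256 + signal number for a
 * signal death, and -1 if the command could not be run at all.
 */
int decode_status(int status)
{
	if (status == -1)
		return -1;
	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	if (WIFSIGNALED(status))
		return 256 + WTERMSIG(status);
	return -1;                      /* stopped or otherwise unexpected */
}

/* Run a shell command and return its decoded status. */
int run_command(const char *cmd)
{
	assert(cmd != NULL);
	return decode_status(system(cmd));
}
```

	Dividing the raw status by 256 only gives the right answer for
	normal exits on systems that happen to encode the status that
	way; the macros make no such assumption.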

Aug 15, 2018:
	fixed mangled awktest.tar (thanks, Arnold), posted all
	current (very minor) fixes to github / onetrueawk

Jun 7, 2018:
	(yes, a long layoff)
	Updated some broken tests (beebe.tar, T.lilly)
	[thanks to Arnold Robbins]

Mar 26, 2015:
	buffer overflow in error reporting; thanks to tobias ulmer
	and john-mark gurney for spotting it and the fix.

Feb 4, 2013:
	cleaned up a handful of tests that didn't seem to actually
	test for correct behavior: T.latin1, T.gawk.

Jan 5, 2013:
	added ,NULL initializer to static Cells in run.c; not really
	needed but cleaner.  Thanks to Michael Bombardieri.

Dec 20, 2012:
	fiddled makefile to get correct yacc and bison flags.  pick yacc
	(linux) or bison (mac) as necessary.

	added __attribute__((__noreturn__)) to a couple of lines in
	proto.h, to silence someone's enthusiastic checker.

	fixed obscure call by value bug in split(a[1],a) reported on
	9fans.  the management of temporary values is just a mess; i
	took a shortcut by making an extra string copy.  thanks
	to paul patience and arnold robbins for passing it on and for
	proposed patches.

	tiny fiddle in setfval to eliminate -0 results in T.expr, which
	has irritated me for 20+ years.

Aug 10, 2011:
	another fix to avoid core dump with delete(ARGV); again, many thanks
	to ruslan ermilov.

Aug 7, 2011:
	split(s, a, //) now behaves the same as split(s, a, "")

Jun 12, 2011:
	/pat/, \n /pat/ {...} is now legal, though bad style to use.

	added checks to new -v code that permits -vnospace; thanks to
	ruslan ermilov for spotting this and providing the patch.

	removed fixed limit on number of open files; thanks to aleksey
	cheusov and christos zoulas.

	fixed day 1 bug that resurrected deleted elements of ARGV when
	used as filenames (in lib.c).

	minor type fiddles to make gcc -Wall -pedantic happier (but not
	totally so); turned on -fno-strict-aliasing in makefile.

May 6, 2011:
	added #ifdef for isblank.
	now allows -ffoo as well as -f foo arguments.
	(thanks, ruslan)

May 1, 2011:
	after advice from todd miller, kevin lo, ruslan ermilov,
	and arnold robbins, changed srand() to return the previous
	seed (which is 1 on the first call of srand).  the seed is
	an Awkfloat internally though converted to unsigned int to
	pass to the library srand().  thanks, everyone.
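
	a minimal sketch of that behavior, assuming Awkfloat is a double
	(the function and variable names here are made up, and this is
	not the code in run.c):

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

double prev_seed = 1.0;   /* seed reported by the first call */

/*
 * Install a new seed and hand back the one it replaces. The seed is
 * kept as a double and converted to unsigned int only for srand().
 */
double seed_and_return_previous(double new_seed)
{
	double old = prev_seed;

	/* out-of-range doubles cannot be converted safely */
	assert(new_seed >= 0.0 && new_seed <= (double)UINT_MAX);
	srand((unsigned int)new_seed);
	prev_seed = new_seed;
	return old;
}
```

	the first call returns 1, and each later call returns whatever
	seed the call before it installed.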

	fixed a subtle (and i hope low-probability) overflow error
	in fldbld, by adding space for one extra \0.  thanks to
	robert bassett for spotting this one and providing a fix.

	removed the files related to compilation on windows.  i no
	longer have anything like a current windows environment, so
	i can't test any of it.

May 23, 2010:
	fixed long-standing overflow bug in run.c; many thanks to
	nelson beebe for spotting it and providing the fix.

	fixed bug that didn't parse -vd=1 properly; thanks to santiago
	vila for spotting it.

Feb 8, 2010:
	i give up.  replaced isblank with isspace in b.c; there are
	no consistent header files.

Nov 26, 2009:
	fixed a long-standing issue with when FS takes effect.  a
	change to FS is now noticed immediately for subsequent splits.

	changed the name getline() to awkgetline() to avoid yet another
	name conflict somewhere.

Feb 11, 2009:
	temporarily for now defined HAS_ISBLANK, since that seems to
	be the best way through the thicket.  isblank arrived in C99,
	but seems to be arriving at different systems at different
	times.

Oct 8, 2008:
	fixed typo in b.c that set tmpvec wrongly.  no one had ever
	run into the problem, apparently.  thanks to alistair crooks.

Oct 23, 2007:
	minor fix in lib.c: increase inputFS to 100, change malloc
	for fields to n+1.

	fixed memory fault caused by out of order test in setsval.

	thanks to david o'brien, freebsd, for both fixes.

May 1, 2007:
	fiddle in makefile to fix for BSD make; thanks to igor sobrado.

Mar 31, 2007:
	fixed some null pointer refs calling adjbuf.

Feb 21, 2007:
	fixed a bug in matching the null RE in sub and gsub.  thanks to al aho
	who actually did the fix (in b.c), and to wolfgang seeberg for finding
	it and providing a very compact test case.

	fixed quotation in b.c; thanks to Hal Pratt and the Princeton Dante
	Project.

	removed some no-effect asserts in run.c.

	fiddled maketab.c to not complain about bison-generated values.

	removed the obsolete -V argument; fixed --version to print the
	version and exit.

	fixed wording and an outright error in the usage message; thanks to igor
	sobrado and jason mcintyre.

	fixed a bug in -d that caused core dump if no program followed.

Jan 1, 2007:
	dropped mac.code from makefile; there are few non-MacOSX
	mac's these days.

Jan 17, 2006:
	system() not flagged as unsafe in the unadvertised -safe option.
	found it while enhancing tests before shipping the ;login: article.
	practice what you preach.

	removed the 9-years-obsolete -mr and -mf flags.

	added -version and --version options.

	core dump on linux with BEGIN {nextfile}, now fixed.

	removed some #ifdef's in run.c and lex.c that appear to no
	longer be necessary.

Apr 24, 2005:
	modified lib.c so that values of $0 et al are preserved in the END
	block, apparently as required by posix.  thanks to havard eidnes
	for the report and code.

Jan 14, 2005:
	fixed infinite loop in parsing, originally found by brian tsang.
	thanks to arnold robbins for a suggestion that started me
	rethinking it.

Dec 31, 2004:
	prevent overflow of -f array in main, head off potential error in
	call of SYNTAX(), test malloc return in lib.c, all with thanks to
	todd miller.

Dec 22, 2004:
	cranked up size of NCHARS; coverity thinks it can be overrun with
	smaller size, and i think that's right.  added some assertions to b.c
	to catch places where it might overrun.  the RE code is still fragile.

Dec 5, 2004:
	fixed a couple of overflow problems with ridiculous field numbers:
	e.g., print $(2^32-1).  thanks to ruslan ermilov, giorgos keramidas
	and david o'brien at freebsd.org for patches.  this really should
	be re-done from scratch.

Nov 21, 2004:
	fixed another 25-year-old RE bug, in split.  it's another failure
	to (re-)initialize.  thanks to steve fisher for spotting this and
	providing a good test case.

Nov 22, 2003:
	fixed a bug in regular expressions that dates (so help me) from 1977;
	it's been there from the beginning.  an anchored longest match that
	was longer than the number of states triggered a failure to initialize
	the machine properly.  many thanks to moinak ghosh for not only finding
	this one but for providing a fix, in some of the most mysterious
	code known to man.

	fixed a storage leak in call() that appears to have been there since
	1983 or so -- a function without an explicit return that assigns a
	string to a parameter leaked a Cell.  thanks to moinak ghosh for
	spotting this very subtle one.

Jul 31, 2003:
	fixed, thanks to andrey chernov and ruslan ermilov, a bug in lex.c
	that mis-handled the character 255 in input.  (it was being compared
	to EOF with a signed comparison.)
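
	the general pitfall, sketched with made-up function names (this
	is not the code in lex.c, and it assumes 8-bit chars and EOF
	equal to -1, as on typical systems): a byte 0xFF held in a signed
	char becomes -1 and is indistinguishable from EOF, whereas the
	same byte widened from unsigned char to int is 255.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Buggy form: a signed 8-bit value of 0xFF compares equal to EOF. */
int is_eof_buggy(signed char c)
{
	return c == EOF;
}

/* Safe form: c is an int holding 0..255 or EOF, as getc() returns. */
int is_eof_fixed(int c)
{
	return c == EOF;
}

/* Fetch the next byte as an unsigned char widened to int, or EOF. */
int next_byte(const unsigned char *buf, size_t len, size_t *pos)
{
	assert(buf != NULL && pos != NULL);
	if (*pos >= len)
		return EOF;
	return buf[(*pos)++];
}
```

	keeping input characters in an int, or casting through unsigned
	char before comparing, keeps 255 and EOF distinct.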

Jul 29, 2003:
	fixed (i think) the long-standing botch that included the beginning of
	line state ^ for RE's in the set of valid characters; this led to a
	variety of odd problems, including failure to properly match certain
	regular expressions in non-US locales.  thanks to ruslan for keeping
	at this one.

Jul 28, 2003:
	n-th try at getting internationalization right, with thanks to volker
	kiefel, arnold robbins and ruslan ermilov for advice, though they
	should not be blamed for the outcome.  according to posix, "."  is the
	radix character in programs and command line arguments regardless of
	the locale; otherwise, the locale should prevail for input and output
	of numbers.  so it's intended to work that way.

	i have rescinded the attempt to use strcoll in expanding shorthands in
	regular expressions (cclenter).  its properties are much too
	surprising; for example [a-c] matches aAbBc in locale en_US but abBcC
	in locale fr_CA.  i can see how this might arise by implementation
	but i cannot explain it to a human user.  (this behavior can be seen
	in gawk as well; we're leaning on the same library.)

	the issue appears to be that strcoll is meant for sorting, where
	merging upper and lower case may make sense (though note that unix
	sort does not do this by default either).  it is not appropriate
	for regular expressions, where the goal is to match specific
	patterns of characters.  in any case, the notations [:lower:], etc.,
	are available in awk, and they are more likely to work correctly in
	most locales.

	a moratorium is hereby declared on internationalization changes.
	i apologize to friends and colleagues in other parts of the world.
	i would truly like to get this "right", but i don't know what
	that is, and i do not want to keep making changes until it's clear.

Jul 4, 2003:
	fixed bug that permitted non-terminated RE, as in "awk /x".

Jun 1, 2003:
	subtle change to split: if source is empty, number of elems
	is always 0 and the array is not set.

Mar 21, 2003:
	added some parens to isblank, in another attempt to make things
	internationally portable.

Mar 14, 2003:
	the internationalization changes, somewhat modified, are now
	reinstated.  in theory awk will now do character comparisons
	and case conversions in national language, but "." will always
	be the decimal point separator on input and output regardless
	of national language.  isblank(){} has an #ifndef.

	this no longer compiles on windows: LC_MESSAGES isn't defined
	in vc6++.

	fixed subtle behavior in field and record splitting: if FS is
	a single character and RS is not empty, \n is NOT a separator.
	this tortuous reading is found in the awk book; behavior now
	matches gawk and mawk.

Dec 13, 2002:
	for the moment, the internationalization changes of nov 29 are
	rolled back -- programs like x = 1.2 don't work in some locales,
	because the parser is expecting x = 1,2.  until i understand this
	better, this will have to wait.

Nov 29, 2002:
	modified b.c (with tiny changes in main and run) to support
	locales, using strcoll and iswhatever tests for posix character
	classes.  thanks to ruslan ermilov (ru@freebsd.org) for code.
	the function isblank doesn't seem to have propagated to any
	header file near me, so it's there explicitly.  not properly
	tested on non-ascii character sets by me.

Jun 28, 2002:
	modified run/format() and tran/getsval() to do a slightly better
	job on using OFMT for output from print and CONVFMT for other
	number->string conversions, as promised by posix and done by
	gawk and mawk.  there are still places where it doesn't work
	right if CONVFMT is changed; by then the STR attribute of the
	variable has been irrevocably set.  thanks to arnold robbins for
	code and examples.

	fixed subtle bug in format that could get core dump.  thanks to
	Jaromir Dolecek <jdolecek@NetBSD.org> for finding and fixing.
	minor cleanup in run.c / format() at the same time.

	added some tests for null pointers to debugging printf's, which
	were never intended for external consumption.  thanks to dave
	kerns (dkerns@lucent.com) for pointing this out.

	GNU compatibility: an empty regexp matches anything (thanks to
	dag-erling smorgrav, des@ofug.org).  subject to reversion if
	this does more harm than good.

	pervasive small changes to make things more const-correct, as
	reported by gcc's -Wwrite-strings.  as it says in the gcc manual,
	this may be more nuisance than useful.  provoked by a suggestion
	and code from arnaud desitter, arnaud@nimbus.geog.ox.ac.uk

	minor documentation changes to note that this now compiles out
	of the box on Mac OS X.

Feb 10, 2002:
	changed types in posix chars structure to quiet solaris cc.

Jan 1, 2002:
	fflush() or fflush("") flushes all files and pipes.

	length(arrayname) returns number of elements; thanks to
	arnold robbins for suggestion.

	added a makefile.win to make it easier to build on windows.
	based on dan allen's buildwin.bat.

Nov 16, 2001:
	added support for posix character class names like [:digit:],
	which are not exactly shorter than [0-9] and perhaps no more
	portable.  thanks to dag-erling smorgrav for code.

Feb 16, 2001:
	removed -m option; no longer needed, and it was actually
	broken (noted thanks to volker kiefel).

Feb 10, 2001:
	fixed an appalling bug in gettok: any sequence of digits, +,-, E, e,
	and period was accepted as a valid number if it started with a period.
	this would never have happened with the lex version.

	other 1-character botches, now fixed, include a bare $ and a
	bare " at the end of the input.

Feb 7, 2001:
	more (const char *) casts in b.c and tran.c to silence warnings.

Nov 15, 2000:
	fixed a bug introduced in august 1997 that caused expressions
	like $f[1] to be syntax errors.  thanks to arnold robbins for
	noticing this and providing a fix.

Oct 30, 2000:
	fixed some nextfile bugs: not handling all cases.  thanks to
	arnold robbins for pointing this out.  new regressions added.

	close() is now a function.  it returns whatever the library
	fclose returns, and -1 for closing a file or pipe that wasn't
	opened.

Sep 24, 2000:
	permit \n explicitly in character classes; won't work right
	if comes in as "[\n]" but ok as /[\n]/, because of multiple
	processing of \'s.  thanks to arnold robbins.

July 5, 2000:
	minor fiddles in tran.c to keep compilers happy about uschar.
	thanks to norman wilson.

May 25, 2000:
	yet another attempt at making 8-bit input work, with another
	band-aid in b.c (member()), and some (uschar) casts to head
	off potential errors in subscripts (like isdigit).  also
	changed HAT to NCHARS-2.  thanks again to santiago vila.

	changed maketab.c to ignore apparently out of range definitions
	instead of halting; new freeBSD generates one.  thanks to
	jon snader <jsnader@ix.netcom.com> for pointing out the problem.

May 2, 2000:
	fixed an 8-bit problem in b.c by making several char*'s into
	unsigned char*'s.  not clear i have them all yet.  thanks to
	Santiago Vila <sanvila@unex.es> for the bug report.

Apr 21, 2000:
	finally found and fixed a memory leak in function call; it's
	been there since functions were added ~1983.  thanks to
	jon bentley for the test case that found it.

	added test in envinit to catch environment "variables" with
	names beginning with '='; thanks to Berend Hasselman.

Jul 28, 1999:
	added test in defn() to catch function foo(foo), which
	otherwise recurses until core dump.  thanks to arnold
	robbins for noticing this.

Jun 20, 1999:
	added *bp in gettok in lex.c; appears possible to exit function
	without terminating the string.  thanks to russ cox.

Jun 2, 1999:
	added function stdinit() to run to initialize files[] array,
	in case stdin, etc., are not constants; some compilers care.

May 10, 1999:
	replaced the ERROR ... FATAL, etc., macros with functions
	based on vprintf, to avoid problems caused by overrunning
	fixed-size errbuf array.  thanks to ralph corderoy for the
	impetus, and for pointing out a string termination bug in
	qstring as well.
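
	a sketch of what a vprintf-style reporting function can look
	like (warn_to is a hypothetical name and signature, not one of
	awk's functions): the message is formatted straight onto the
	stream, so no fixed-size buffer has to hold it first.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Write "awk: <formatted message>\n" to out. Returns the number of
 * characters produced by the format itself, or a negative value if
 * vfprintf() fails.
 */
int warn_to(FILE *out, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(out != NULL && fmt != NULL);
	fputs("awk: ", out);
	va_start(ap, fmt);
	n = vfprintf(out, fmt, ap);   /* stdio handles any message length */
	va_end(ap);
	fputc('\n', out);
	return n;
}
```

	a call such as warn_to(stderr, "bad field %d in %s", 12, "input")
	works the same way however long the expanded arguments are.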

Apr 21, 1999:
	fixed bug that caused occasional core dumps with commandline
	variable with value ending in \.  (thanks to nelson beebe for
	the test case.)

Apr 16, 1999:
	with code kindly provided by Bruce Lilly, awk now parses
	/=/ and similar constructs more sensibly in more places.
	Bruce also provided some helpful test cases.

Apr 5, 1999:
	changed true/false to True/False in run.c to make it
	easier to compile with C++.  Added some casts on malloc
	and realloc to be honest about casts; ditto.  changed
	ltype int to long in struct rrow to reduce some 64-bit
	complaints; other changes scattered throughout for the
	same purpose.  thanks to Nelson Beebe for these portability
	improvements.

	removed some horrible pointer-int casting in b.c and elsewhere
	by adding ptoi and itonp to localize the casts, which are
	all benign.  fixed one incipient bug that showed up on sgi
	in 64-bit mode.

	reset lineno for new source file; include filename in error
	message.  also fixed line number error in continuation lines.
	(thanks to Nelson Beebe for both of these.)

Mar 24, 1999:
	Nelson Beebe notes that irix 5.3 yacc dies with a bogus
	error; use a newer version or switch to bison, since sgi
	is unlikely to fix it.

Mar 5, 1999:
	changed isnumber to is_number to avoid the problem caused by
	versions of ctype.h that include the name isnumber.

	distribution now includes a script for building on a Mac,
	thanks to Dan Allen.

Feb 20, 1999:
	fixed memory leaks in run.c (call) and tran.c (setfval).
	thanks to Stephen Nutt for finding these and providing the fixes.

Jan 13, 1999:
	replaced srand argument by (unsigned int) in run.c;
	avoids problem on Mac and potentially on Unix & Windows.
	thanks to Dan Allen.

	added a few (int) casts to silence useless compiler warnings.
	e.g., errorflag= in run.c jump().

	added proctab.c to the bundle output; one less thing
	to have to compile out of the box.

	added calls to _popen and _pclose to the win95 stub for
	pipes (thanks to Steve Adams for this helpful suggestion).
	seems to work, though properties are not well understood
	by me, and it appears that under some circumstances the
	pipe output is truncated.  Be careful.

Oct 19, 1998:
	fixed a couple of bugs in getrec: could fail to update $0
	after a getline var; because inputFS wasn't initialized,
	could split $0 on every character, a misleading diversion.

	fixed caching bug in makedfa: LRU was actually removing
	least often used.
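
	the difference between the two policies, in a sketch with a
	made-up slot structure (awk's real cache of compiled expressions
	is laid out differently): least recently used evicts the entry
	with the oldest access time, least often used evicts the entry
	with the fewest accesses, and the two can disagree.

```c
#include <assert.h>
#include <stddef.h>

struct slot {
	int in_use;
	unsigned long last_used;   /* logical clock at the last access */
	unsigned long hits;        /* total number of accesses */
};

/* LRU: a free slot if there is one, else the oldest last_used. */
size_t lru_victim(const struct slot *slots, size_t n)
{
	size_t i, victim = 0;

	assert(slots != NULL && n > 0);
	for (i = 0; i < n; i++) {
		if (!slots[i].in_use)
			return i;
		if (slots[i].last_used < slots[victim].last_used)
			victim = i;
	}
	return victim;
}

/* The unintended policy: the slot with the fewest accesses. */
size_t least_often_used(const struct slot *slots, size_t n)
{
	size_t i, victim = 0;

	assert(slots != NULL && n > 0);
	for (i = 1; i < n; i++)
		if (slots[i].hits < slots[victim].hits)
			victim = i;
	return victim;
}
```

	an entry used many times long ago is the right LRU victim, but a
	least-often-used policy would keep it and discard a newer entry
	that has only been used once so far.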

	thanks to ross ridge for finding these, and for providing
	great bug reports.

May 12, 1998:
	fixed potential bug in readrec: might fail to update record
	pointer after growing.  thanks to dan levy for spotting this
	and suggesting the fix.

Mar 12, 1998:
	added -V to print version number and die.

[notify dave kerns, dkerns@dacsoup.ih.lucent.com]

Feb 11, 1998:
	subtle silent bug in lex.c: if the program ended with a number
	longer than 1 digit, part of the input would be pushed back and
	parsed again because token buffer wasn't terminated right.
	example:  awk 'length($0) > 10'.  blush.  at least i found it
	myself.

Aug 31, 1997:
	s/adelete/awkdelete/: SGI uses this in malloc.h.
	thanks to nelson beebe for pointing this one out.

Aug 21, 1997:
	fixed some bugs in sub and gsub when replacement includes \\.
	this is a dark, horrible corner, but at least now i believe that
	the behavior is the same as gawk and the intended posix standard.
	thanks to arnold robbins for advice here.

Aug 9, 1997:
	somewhat regretfully, replaced the ancient lex-based lexical
	analyzer with one written in C.  it's longer, generates less code,
	and is more portable; the old one depended too much on mysterious
	properties of lex that were not preserved in other environments.
	in theory these recognize the same language.

	now using strtod to test whether a string is a number, instead of
	the convoluted original function.  should be more portable and
	reliable if strtod is implemented right.

	removed now-pointless optimization in makefile that tries to avoid
	recompilation when awkgram.y is changed but symbols are not.

	removed most fixed-size arrays, though a handful remain, some
	of which are unchecked.  you have been warned.

Aug 4, 1997:
	with some trepidation, replaced the ancient code that managed
	fields and $0 in fixed-size arrays with arrays that grow on
	demand.  there is still some tension between trying to make this
	run fast and making it clean; not sure it's right yet.

	the ill-conceived -mr and -mf arguments are now useful only
	for debugging.  previous dynamic string code removed.

	numerous other minor cleanups along the way.

Jul 30, 1997:
	using code provided by dan levy (to whom profuse thanks), replaced
	fixed-size arrays and awkward kludges by a fairly uniform mechanism
	to grow arrays as needed for printf, sub, gsub, etc.
821
822Jul 23, 1997:
823	falling off the end of a function returns "" and 0, not 0.
824	thanks to arnold robbins.
825
826Jun 17, 1997:
827	replaced several fixed-size arrays by dynamically-created ones
828	in run.c; added overflow tests to some previously unchecked cases.
829	getline, toupper, tolower.
830
831	getline code is still broken in that recursive calls may wind
832	up using the same space.  [fixed later]
833
834	increased RECSIZE to 8192 to push problems further over the horizon.
835
836	added \r to \n as input line separator for programs, not data.
837	damn CRLFs.
838
839	modified format() to permit explicit printf("%c", 0) to include
840	a null byte in output.  thanks to ken stailey for the fix.
841
842	added a "-safe" argument that disables file output (print >,
843	print >>), process creation (cmd|getline, print |, system), and
844	access to the environment (ENVIRON).  this is a first approximation
845	to a "safe" version of awk, but don't rely on it too much.  thanks
846	to joan feigenbaum and matt blaze for the inspiration long ago.
847
848Jul 8, 1996:
849	fixed long-standing bug in sub, gsub(/a/, "\\\\&"); thanks to
850	ralph corderoy.
851
852Jun 29, 1996:
853	fixed awful bug in new field splitting; didn't get all the places
854	where input was done.
855
856Jun 28, 1996:
857	changed field-splitting to conform to posix definition: fields are
858	split using the value of FS at the time of input; it used to be
859	the value when the field or NF was first referred to, a much less
860	predictable definition.  thanks to arnold robbins for encouragement
861	to do the right thing.
862
863May 28, 1996:
864	fixed appalling but apparently unimportant bug in parsing octal
865	numbers in reg exprs.
866
867	explicit hex in reg exprs now limited to 2 chars: \xa, \xaa.

May 27, 1996:
	cleaned up some declarations so gcc -Wall is now almost silent.

	makefile now includes backup copies of ytab.c and lexyy.c in case
	one makes before looking; it also avoids recreating lexyy.c unless
	really needed.

	s/aprintf/awkprintf/, s/asprintf/awksprintf/ to avoid some name clashes
	with unwisely-written header files.

	thanks to jeffrey friedl for several of these.

May 26, 1996:
	an attempt to rationalize the (unsigned) char issue.  almost all
	instances of unsigned char have been removed; the handful of places
	in b.c where chars are used as table indices have been hand-crafted.
	added some latin-1 tests to the regression, but i'm not confident;
	none of my compilers seem to care much.  thanks to nelson beebe for
	pointing out some others that do care.

May 2, 1996:
	removed all register declarations.

	enhanced split(), as in gawk, etc:  split(s, a, "") splits s into
	a[1]...a[length(s)] with each character a single element.

	made the same changes for field-splitting if FS is "".

	added nextfile, as in gawk: causes immediate advance to next
	input file. (thanks to arnold robbins for inspiration and code).

	small fixes to regexpr code:  can now handle []], [[], and
	variants;  [] is now a syntax error, rather than matching
	everything;  [z-a] is now empty, not z.  far from complete
	or correct, however.  (thanks to jeffrey friedl for pointing out
	some awful behaviors.)

Apr 29, 1996:
	replaced uchar by uschar everywhere; apparently some compilers
	usurp this name and this causes conflicts.

	fixed call to time in run.c (bltin); arg is time_t *.

	replaced horrible pointer/long punning in b.c by a legitimate
	union.  should be safer on 64-bit machines and cleaner everywhere.
	(thanks to nelson beebe for pointing out some of these problems.)
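
	a minimal sketch of the idea, not the declaration actually used
	in the awk source: a union is as large as its largest member, so
	a pointer stored in it survives even where a long is narrower
	than a pointer.  slot, entry and the helper functions are
	hypothetical names.

```c
#include <assert.h>

/* hypothetical sketch, not the awk declaration: one slot that holds
 * either an integer or a pointer. */
typedef union {
	long	i;	/* integer use */
	int	*p;	/* pointer use */
} slot;

typedef struct {
	int	is_ptr;	/* says which member of val is live */
	slot	val;
} entry;

entry int_entry(long i)
{
	entry e;

	e.is_ptr = 0;
	e.val.i = i;
	return e;
}

entry ptr_entry(int *p)
{
	entry e;

	e.is_ptr = 1;
	e.val.p = p;
	return e;
}

/* read the pointer back; only legal if a pointer was stored */
int *entry_ptr(const entry *e)
{
	assert(e->is_ptr);
	return e->val.p;
}
```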

	replaced nested comments by #if 0...#endif in run.c, lib.c.

	removed getsval, setsval, execute macros from run.c and lib.c.
	machines are 100x faster than they were when these macros were
	first used.

	revised filenames: awk.g.y => awkgram.y, awk.lx.l => awklex.l,
	y.tab.[ch] => ytab.[ch], lex.yy.c => lexyy.c, all in the aid of
	portability to nameless systems.

	"make bundle" now includes yacc and lex output files for recipients
	who don't have yacc or lex.

Aug 15, 1995:
	initialized Cells in setsymtab more carefully; some fields
	were not set.  (thanks to purify, all of whose complaints i
	think i now understand.)

	fixed at least one error in gsub that looked at -1-th element
	of an array when substituting for a null match (e.g., $).

	delete arrayname is now legal; it clears the elements but leaves
	the array, which may not be the right behavior.

	modified makefile: my current make can't cope with the test used
	to avoid unnecessary yacc invocations.

Jul 17, 1995:
	added dynamically growing strings to awk.lx.l and b.c
	to permit regular expressions to be much bigger.
	the state arrays can still overflow.

Aug 24, 1994:
	detect duplicate arguments in function definitions (mdm).

May 11, 1994:
	trivial fix to printf to limit string size in sub().

Apr 22, 1994:
	fixed yet another subtle self-assignment problem:
	$1 = $2; $1 = $1 clobbered $1.

	Regression tests now use private echo, to avoid quoting problems.

Feb 2, 1994:
	changed error() to print line number as %d, not %g.

Jul 23, 1993:
	cosmetic changes: increased sizes of some arrays,
	reworded some error messages.

	added CONVFMT as in posix (just replaced OFMT in getsval)

	FILENAME is now "" until the first thing that causes a file
	to be opened.

Nov 28, 1992:
	deleted yyunput and yyoutput from proto.h;
	different versions of lex give these different declarations.

May 31, 1992:
	added -mr N and -mf N options: more record and fields.
	these really ought to adjust automatically.

	cleaned up some error messages; "out of space" now means
	malloc returned NULL in all cases.

	changed rehash so that if it runs out, it just returns;
	things will continue to run slow, but maybe a bit longer.

Apr 24, 1992:
	remove redundant close of stdin when using -f -.

	got rid of core dump with -d; awk -d just prints date.

Apr 12, 1992:
	added explicit check for /dev/std(in,out,err) in redirection.
	unlike gawk, no /dev/fd/n yet.

	added (file/pipe) builtin.  hard to test satisfactorily.
	not posix.

Feb 20, 1992:
	recompile after abortive changes;  should be unchanged.

Dec 2, 1991:
	die-casting time:  converted to ansi C, installed that.

Nov 30, 1991:
	fixed storage leak in freefa, failing to recover [N]CCL.
	thanks to Bill Jones (jones@cs.usask.ca)

Nov 19, 1991:
	use RAND_MAX instead of literal in builtin().

Nov 12, 1991:
	cranked up some fixed-size arrays in b.c, and added a test for
	overflow in penter.  thanks to mark larsen.

Sep 24, 1991:
	increased buffer in gsub.  a very crude fix to a general problem.
	and again on Sep 26.

Aug 18, 1991:
	enforce variable name syntax for commandline variables: has to
	start with letter or _.

Jul 27, 1991:
	allow newline after ; in for statements.

Jul 21, 1991:
	fixed so that in self-assignment like $1=$1, side effects
	like recomputing $0 take place.  (this is getting subtle.)

Jun 30, 1991:
	better test for detecting too-long output record.

Jun 2, 1991:
	better defense against very long printf strings.
	made break and continue illegal outside of loops.

May 13, 1991:
	removed extra arg on gettemp, tempfree.  minor error message rewording.

May 6, 1991:
	fixed silly bug in hex parsing in hexstr().
	removed an apparently unnecessary test in isnumber().
	warn about weird printf conversions.
	fixed unchecked array overwrite in relex().

	changed for (i in array) to access elements in sorted order.
	then unchanged it -- it really does run slower in too many cases.
	left the code in place, commented out.

Feb 10, 1991:
	check error status on all writes, to avoid banging on full disks.

Jan 28, 1991:
	awk -f - reads the program from stdin.

Jan 11, 1991:
	failed to set numeric state on $0 in cmd|getline context in run.c.

Nov 2, 1990:
	fixed sleazy test for integrality in getsval;  use modf.
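
	a minimal sketch of an integrality test built on modf, not the
	code in getsval; is_integral is a hypothetical name.  modf splits
	a double into integer and fractional parts without converting to
	an integer type, so large values cannot overflow on the way.  what
	the earlier test looked like is not recorded here.

```c
#include <math.h>

/* hypothetical sketch, not the code in getsval: returns 1 if f has no
 * fractional part, 0 otherwise (nan yields 0). */
int is_integral(double f)
{
	double ipart;	/* integer part, required by modf but unused here */

	return modf(f, &ipart) == 0.0;
}
```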

Oct 29, 1990:
	fixed sleazy buggy code in lib.c that looked (incorrectly) for
	too long input lines.

Oct 14, 1990:
	fixed the bug on p. 198 in which it couldn't deduce that an
	argument was an array in some contexts.  replaced the error
	message in intest() by code that damn well makes it an array.

Oct 8, 1990:
	fixed horrible bug:  types and values were not preserved in
	some kinds of self-assignment. (in assign().)

Aug 24, 1990:
	changed NCHARS to 256 to handle 8-bit characters in strings
	presented to match(), etc.

Jun 26, 1990:
	changed struct rrow (awk.h) to use long instead of int for lval,
	since cfoll() stores a pointer in it.  now works better when ints
	are smaller than pointers!

May 6, 1990:
	AVA fixed the grammar so that ! is uniformly of the same precedence as
	unary + and -.  This renders illegal some constructs like !x=y, which
	now has to be parenthesized as !(x=y), and makes others work properly:
	!x+y is (!x)+y, and x!y is x !y, not two pattern-action statements.
	(These problems were pointed out by Bob Lenk of Posix.)

	Added \x to regular expressions (already in strings).
	Limited octal to octal digits; \8 and \9 are not octal.
	Centralized the code for parsing escapes in regular expressions.
	Added a bunch of tests to T.re and T.sub to verify some of this.

Feb 9, 1990:
	fixed null pointer dereference bug in main.c:  -F[nothing].  sigh.

	restored srand behavior:  it returns the current seed.

Jan 18, 1990:
	srand now returns previous seed value (0 to start).

Jan 5, 1990:
	fix potential problem in tran.c -- something was freed,
	then used in freesymtab.

Oct 18, 1989:
	another try to get the max number of open files set with
	relatively machine-independent code.

	small fix to input() in case of multiple reads after EOF.

Oct 11, 1989:
	FILENAME is now defined in the BEGIN block -- too many old
	programs broke.

	"-" means stdin in getline as well as on the commandline.

	added a bunch of casts to the code to tell the truth about
	char * vs. unsigned char *, a right royal pain.  added a
	setlocale call to the front of main, though probably no one
	has it usefully implemented yet.

Aug 24, 1989:
	removed redundant relational tests against nullnode if parse
	tree already had a relational at that point.

Aug 11, 1989:
	fixed bug:  commandline variable assignment has to look like
	var=something.  (consider the man page for =, in file =.1)

	changed number of arguments to functions to static arrays
	to avoid repeated malloc calls.

Aug 2, 1989:
	restored -F (space) separator

Jul 30, 1989:
	added -v x=1 y=2 ... for immediate commandline variable assignment;
	done before the BEGIN block for sure.  they have to precede the
	program if the program is on the commandline.
	Modified Aug 2 to require a separate -v for each assignment.

Jul 10, 1989:
	fixed ref-thru-zero bug in environment code in tran.c

Jun 23, 1989:
	add newline to usage message.

Jun 14, 1989:
	added some missing ansi printf conversion letters: %i %X %E %G.
	no sensible meaning for h or L, so they may not do what one expects.

	made %* conversions work.

	changed x^y so that if y is a positive integer, it's done
	by explicit multiplication, thus achieving maximum accuracy.
	(this should be done by pow() but it seems not to be locally.)
	done to x ^= y as well.
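
	a minimal sketch of integer exponentiation by explicit
	multiplication, here done by repeated squaring; it is an
	illustration, not the code in run.c, and the name ipow and its
	signature are assumptions.  every step is a plain multiply, so
	results such as 2^10 come out exact whenever the products are
	representable.

```c
#include <assert.h>

/* hypothetical sketch, not the code in run.c: x raised to the
 * non-negative integer power n, using only multiplication. */
double ipow(double x, int n)
{
	double v = 1.0;

	assert(n >= 0);
	while (n > 0) {
		if (n & 1)
			v *= x;		/* this bit of n contributes x^(2^k) */
		x *= x;			/* square for the next bit */
		n >>= 1;
	}
	return v;
}
```

	a caller would fall back to pow() when the exponent is negative
	or not an integer.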

Jun 4, 1989:
	ENVIRON array contains environment: if shell variable V=thing,
		ENVIRON["V"] is "thing"

	multiple -f arguments permitted.  error reporting is naive.
	(they were permitted before, but only the last was used.)

	fixed a really stupid botch in the debugging macro dprintf

	fixed order of evaluation of commandline assignments to match
	what the book claims:  an argument of the form x=e is evaluated
	at the time it would have been opened if it were a filename (p 63).
	this invalidates the suggested answer to ex 4-1 (p 195).

	removed some code that permitted -F (space) fieldseparator,
	since it didn't quite work right anyway.  (restored aug 2)

Apr 27, 1989:
	Line number now accumulated correctly for comment lines.

Apr 26, 1989:
	Debugging output now includes a version date,
	if one compiles it into the source each time.

Apr 9, 1989:
	Changed grammar to prohibit constants as 3rd arg of sub and gsub;
	prevents class of overwriting-a-constant errors.  (Last one?)
	This invalidates the "banana" example on page 43 of the book.

	Added \a ("alert"), \v (vertical tab), \xhhh (hexadecimal),
	as in ANSI, for strings.  Rescinded the sloppiness that permitted
	non-octal digits in \ooo.  Warning:  not all compilers and libraries
	will be able to deal with \x correctly.

Jan 9, 1989:
	Fixed bug that caused tempcell list to contain a duplicate.
	The fix is kludgy.

Dec 17, 1988:
	Catches some more commandline errors in main.
	Removed redundant decl of modf in run.c (confuses some compilers).
	Warning:  there's no single declaration of malloc, etc., in awk.h
	that seems to satisfy all compilers.

Dec 7, 1988:
	Added a bit of code to error printing to avoid printing nulls.
	(Not clear that it actually would.)

Nov 27, 1988:
	With fear and trembling, modified the grammar to permit
	multiple pattern-action statements on one line without
	an explicit separator.  By definition, this capitulation
	to the ghost of ancient implementations remains undefined
	and thus subject to change without notice or apology.
	DO NOT COUNT ON IT.

Oct 30, 1988:
	Fixed bug in call() that failed to recover storage.

	A warning is now generated if there are more arguments
	in the call than in the definition (in lieu of fixing
	another storage leak).

Oct 20, 1988:
	Fixed %c:  if expr is numeric, use numeric value;
	otherwise print 1st char of string value.  still
	doesn't work if the value is 0 -- won't print \0.

	Added a few more checks for running out of malloc.

Oct 12, 1988:
	Fixed bug in call() that freed local arrays twice.

	Fixed to handle deletion of non-existent array right;
	complains about attempt to delete non-array element.

Sep 30, 1988:
	Now guarantees to evaluate all arguments of built-in
	functions, as in C;  the appearance is that arguments
	are evaluated before the function is called.  Places
	affected are sub (gsub was ok), substr, printf, and
	all the built-in arithmetic functions in bltin().
	A warning is generated if a bltin() is called with
	the wrong number of arguments.

	This requires changing makeprof on p167 of the book.

Aug 23, 1988:
	setting FILENAME in BEGIN caused core dump, apparently
	because it was freeing space not allocated by malloc.

July 24, 1988:
	fixed egregious error in toupper/tolower functions.
	still subject to rescinding, however.

July 2, 1988:
	flush stdout before opening file or pipe

July 2, 1988:
	performance bug in b.c/cgoto(): not freeing some sets of states.
	partial fix only right now, and the number of states increased
	to make it less obvious.

June 1, 1988:
	check error status on close

May 28, 1988:
	srand returns seed value it's using.
	see 1/18/90

May 22, 1988:
	Removed limit on depth of function calls.

May 10, 1988:
	Fixed lib.c to permit _ in commandline variable names.

Mar 25, 1988:
	main.c fixed to recognize -- as terminator of command-
	line options.  Illegal options flagged.
	Error reporting slightly cleaned up.

Dec 2, 1987:
	Newer C compilers apply a strict scope rule to extern
	declarations within functions.  Two extern declarations in
	lib.c and tran.c have been moved to obviate this problem.

Oct xx, 1987:
	Reluctantly added toupper and tolower functions.
	Subject to rescinding without notice.

Sep 17, 1987:
	Error-message printer had printf(s) instead of
	printf("%s",s);  got core dumps when the message
	included a %.
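
	A minimal sketch of the safe form, using snprintf so the result
	can be inspected; it is not the awk error printer, and
	format_message is a hypothetical name.  Passing the message as
	the format string itself would make any % in it start a
	conversion with no matching argument, which is undefined
	behavior in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch, not the awk error printer: copies msg into buf
 * as data, so a % in msg is printed literally.  Returns what snprintf
 * returns (the length msg would need, excluding the final '\0'). */
int format_message(char *buf, size_t size, const char *msg)
{
	assert(buf != NULL && msg != NULL);
	return snprintf(buf, size, "%s", msg);
}
```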

Sep 12, 1987:
	Very long printf strings caused core dump;
	fixed aprintf, asprintf, format to catch them.
	Can still get a core dump in printf itself.