/****************************************************************
Copyright (C) Lucent Technologies 1997
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
****************************************************************/

This file lists all bug fixes, changes, etc., made since the
second edition of the AWK book was published in September 2023.

Jan 22, 2024:
	Restore the ability to compile with g++. Thanks to
	Arnold Robbins.

Dec 24, 2023:
	Fix a dereference-after-free in matchop when the first
	argument is a function call. Thanks to Oguz Ismail Uysal.
	Fix inconsistent handling of --csv and FS set in the
	command line. Thanks to Wilbert van der Poel.
	Casting changes to int for is* functions.

Nov 27, 2023:
	Fix exit status of system on macOS. Update to REGRESS.
	Thanks to Arnold Robbins.
	Fix inconsistent handling of -F and --csv, and loss of csv
	mode when FS is set.

Nov 24, 2023:
	Fix issue #199: gototab now resizes its table dynamically and
	uses qsort and bsearch to speed up lookups as the table grows
	for multibyte input. Thanks to Arnold Robbins.
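
	The idea of a growable table kept sorted with qsort and
	searched with bsearch can be sketched in C as below. This is
	only an illustration of the technique, not the gototab code:
	the Entry and Table types and the table_set and table_get
	functions are hypothetical names, and re-sorting after every
	insertion is a simplification chosen for brevity.

```c
#include <stdlib.h>

/* Hypothetical transition entry: input code point -> next state. */
typedef struct { int ch; int state; } Entry;

/* Hypothetical growable table, kept sorted by ch. */
typedef struct { Entry *e; size_t n, cap; } Table;

static int entry_cmp(const void *a, const void *b)
{
	int x = ((const Entry *)a)->ch;
	int y = ((const Entry *)b)->ch;

	return (x > y) - (x < y);
}

/* Binary search for ch; returns NULL if absent or if the table is empty. */
static Entry *table_find(const Table *t, int ch)
{
	Entry key;

	key.ch = ch;
	key.state = 0;
	if (t->n == 0)
		return NULL;
	return (Entry *)bsearch(&key, t->e, t->n, sizeof(Entry), entry_cmp);
}

/* Add or update ch -> state. Returns 0 on success, -1 if out of memory. */
int table_set(Table *t, int ch, int state)
{
	Entry *hit = table_find(t, ch);

	if (hit != NULL) {
		hit->state = state;
		return 0;
	}
	if (t->n == t->cap) {		/* full: double the capacity */
		size_t cap = t->cap ? 2 * t->cap : 8;
		Entry *p = (Entry *)realloc(t->e, cap * sizeof(Entry));

		if (p == NULL)
			return -1;
		t->e = p;
		t->cap = cap;
	}
	t->e[t->n].ch = ch;
	t->e[t->n].state = state;
	t->n++;
	qsort(t->e, t->n, sizeof(Entry), entry_cmp);	/* restore order */
	return 0;
}

/* Look up ch; returns its state, or -1 if there is no entry. */
int table_get(const Table *t, int ch)
{
	Entry *hit = table_find(t, ch);

	return hit != NULL ? hit->state : -1;
}
```

	A caller would start from a zeroed Table, call table_set for
	each transition, and free the e member when done.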

Nov 23, 2023:
	Fix Issue #169, related to escape sequences in strings.
	Thanks to GitHub user rajeevvp.
	Fix Issue #147, reported by GitHub user drawkula, and fixed
	by Miguel Pineiro Jr.

Nov 20, 2023:
	Rewrite of fnematch to fix a number of issues, including
	extraneous output, out-of-bounds access, number of bytes
	to push back after a failed match, etc.
	Thanks to Miguel Pineiro Jr.

Nov 15, 2023:
	Man page edit, regression test fixes. Thanks to Arnold Robbins.
	Consolidation of sub and gsub into dosub, removing duplicate
	code. Thanks to Miguel Pineiro Jr.
	gcc replaced with cc everywhere.

Oct 30, 2023:
	Multiple fixes and a minor code cleanup.
	Disabled utf-8 for non-multibyte locales, such as C or POSIX.
	Fixed a bad char * cast that caused incorrect results on big-endian
	systems. Also fixed an out-of-bounds read for empty CCL.
	Fixed a buffer overflow in substr with utf-8 strings.
	Many thanks to Todd C Miller.

Sep 24, 2023:
	fnematch and getrune have been overhauled to solve issues around
	unicode FS and RS. Also fixed gsub null match issue with unicode.
	Big thanks to Arnold Robbins.

Sep 12, 2023:
	Fixed a length error in u8_byte2char that set RSTART to
	an incorrect (cannot happen) value for an end-of-line match,
	match(str, /$/).


-----------------------------------------------------------------

[This entry is a summary, not a precise list of changes.]

	Added --csv option to enable processing of comma-separated
	values (CSV) input.  When --csv is enabled, fields are
	separated by commas, fields may be quoted with double quotes
	("), and fields may contain embedded newlines.
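
	As a rough illustration of these rules, the following C sketch
	splits one record into fields in place. It is not the code awk
	uses; csv_split is a hypothetical name, and two points are
	assumptions that this entry does not spell out: a doubled
	quote ("") inside a quoted field stands for one literal quote,
	and an empty record yields no fields.

```c
#include <stddef.h>

/*
 * Split one CSV record in place (hypothetical helper).
 * Commas separate fields; a field that starts with '"' runs to the
 * matching '"' and may contain commas and newlines.
 * Stores up to max field pointers in fields[] and returns the count.
 */
size_t csv_split(char *rec, char **fields, size_t max)
{
	char *r = rec;		/* read position */
	char *w = rec;		/* write position, never ahead of r */
	size_t n = 0;

	if (*rec == '\0')
		return 0;	/* assumption: empty record, no fields */
	while (n < max) {
		fields[n++] = w;
		if (*r == '"') {	/* quoted field */
			r++;
			while (*r != '\0') {
				if (*r == '"') {
					if (r[1] == '"') {	/* "" -> " */
						*w++ = '"';
						r += 2;
						continue;
					}
					r++;	/* closing quote */
					break;
				}
				*w++ = *r++;	/* includes ',' and '\n' */
			}
		}
		while (*r != '\0' && *r != ',')	/* unquoted text */
			*w++ = *r++;
		if (*r == '\0') {	/* end of record */
			*w = '\0';
			break;
		}
		r++;		/* skip the separating comma */
		*w++ = '\0';	/* terminate this field */
	}
	return n;
}
```

	Because the unquoted copy of a field is never longer than its
	quoted form, the fields can be rewritten inside the original
	buffer without extra allocation.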

	If no explicit separator argument is provided, split() uses
	the setting of --csv to determine how fields are split.

	Strings may now contain UTF-8 code points (not necessarily
	characters).  Functions that operate on characters, like
	length, substr, index, match, etc., use UTF-8, so the length
	of a string of 3 emojis is 3, not 12 as it would be if bytes
	were counted.
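
	The difference between counting bytes and counting code points
	can be sketched in C as follows. count_codepoints is a
	hypothetical helper, not a function from the awk source, and
	it assumes that its input is valid UTF-8.

```c
#include <stddef.h>

/*
 * Count the code points in a NUL-terminated UTF-8 string
 * (hypothetical helper). Every code point contributes exactly one
 * byte that is not a continuation byte (bit pattern 10xxxxxx), so
 * counting those bytes counts code points. Assumes valid UTF-8.
 */
size_t count_codepoints(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}
```

	For three emojis that each encode as four bytes, strlen
	reports 12 bytes while count_codepoints reports 3.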

	Regular expressions are processed as UTF-8.

	Unicode literals can be written as \u followed by one
	to eight hexadecimal digits.  These may appear in strings and
	regular expressions.
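
	A minimal C sketch of reading such a literal and turning it
	into UTF-8 bytes is given below. The names parse_u_escape and
	encode_utf8 are hypothetical and this is not awk's lexer.
	Taking as many hex digits as are present, up to eight, and
	rejecting surrogates and values above U+10FFFF are choices
	made for this sketch, not behavior stated in this entry.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Parse "\u" followed by 1 to 8 hex digits at the start of s
 * (hypothetical helper). On success, stores the value in *cp and
 * returns the number of bytes consumed; otherwise returns 0.
 */
size_t parse_u_escape(const char *s, unsigned long *cp)
{
	unsigned long v = 0;
	size_t i = 2;

	if (s[0] != '\\' || s[1] != 'u')
		return 0;
	while (i < 10 && isxdigit((unsigned char)s[i])) {
		int c = (unsigned char)s[i];
		int d = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;

		v = v * 16 + (unsigned long)d;
		i++;
	}
	if (i == 2)
		return 0;	/* no hex digits after \u */
	*cp = v;
	return i;
}

/*
 * Encode cp as UTF-8 into out, which must hold at least 4 bytes.
 * Returns the number of bytes written, or 0 if cp is a surrogate
 * or lies beyond U+10FFFF.
 */
size_t encode_utf8(unsigned long cp, unsigned char *out)
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* surrogate range */
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

	Under these assumptions, \u00e9 yields the two bytes C3 A9 and
	\u1F600 yields the four bytes F0 9F 98 80.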