/dragonfly/gnu/usr.bin/grep/egrep/ |
H A D | egrep | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | Makefile | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
/dragonfly/gnu/usr.bin/grep/fgrep/ |
H A D | fgrep | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | Makefile | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
/dragonfly/gnu/usr.bin/grep/libgreputils/sys/ |
H A D | time.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | types.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | stat.h | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
/dragonfly/gnu/usr.bin/grep/grep/ |
H A D | Makefile | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
/dragonfly/gnu/usr.bin/grep/libgreputils/ |
H A D | ctype.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | dirent.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | wchar.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | wctype.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | inttypes.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | langinfo.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | stdio.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | stdlib.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | string.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | unistd.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | configmake.h | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | iconv.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | locale.h | 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | fcntl.h | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | unitypes.h | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | getopt.h | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|
H A D | unistr.h | diff 51ddd709 Fri Oct 10 21:33:40 GMT 2014 John Marino <draco@marino.st> Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes grep --max-count=N FILE would no longer stop reading after Nth match. I.e., while grep would still print the correct output, it would continue reading until end of input, and hence, potentially forever. [bug introduced in grep-2.19]
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly report the input as a matched line. [bug introduced in grep-2.19]
** 2.20 Changes in behavior grep --exclude-dir='FOO/' now excludes the directory FOO. Previously, the trailing slash meant the option was ineffective.
** 2.19 Improvements Performance has improved, typically by 10% and in some cases by a factor of 200. However, performance of grep -P in UTF-8 locales has gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes grep no longer mishandles patterns like [a-[.z.]], and no longer mishandles patterns like [^a] in locales that have multicharacter collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list. [bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero, and similarly for grep -A NUM and grep -B NUM. [bug present since "the beginning"]
grep -f no longer mishandles patterns containing NUL bytes. [bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns the same way the GNU regular expression matcher treats them, with respect to whether the errors can match parts of multibyte characters in data. [bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that takes up two or more bytes in a multibyte encoding. Similarly, the patterns '\<', '\>', '\b', and '\B' no longer mishandle word-boundary matches in multibyte locales. [bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data. Previously it was unreliable, and sometimes crashed or looped. [bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before, echo aa|grep -Pw '(.)\1' would fail to match, yet echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be preceded and followed by non-word components or the beginning and end of the line (as opposed to word boundaries before). Before, this echo a@@a| grep -Pw @@ would match, yet this echo a@@a| grep -w @@ would not. Now, they both fail to match, per the documentation on how grep's -w works.
grep -i no longer mishandles patterns containing titlecase characters. For example, in a locale containing the titlecase character 'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J), 'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ) and 'lj' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes grep no longer mishandles patterns like [^^-~] in unibyte locales. [bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements grep -i in a multibyte locale is now typically 10 times faster for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster when processing many matched lines.
** 2.16 Bug fixes The fix to make \s and \S work with multi-byte white space broke the use of each shortcut whenever followed by a repetition operator. For example, \s*, \s+, \s? and \s{3} would all malfunction in a multi-byte locale. [bug introduced in grep-2.15]
The fix to make grep -P work better with UTF-8 made it possible for grep to evoke a larger set of PCRE errors, some of which could trigger an abort. E.g., this would abort: printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient read system call.
** 2.15 Bug fixes grep's \s and \S failed to work with multi-byte white space characters. For example, \s would fail to match a non-breaking space, and this would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s' A related bug is that \S would mistakenly match an invalid multibyte character. For example, the following would match: printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$' [bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin) when converting an input string containing certain 4-byte UTF-8 sequences to lower case. The conversions to wchar_t and back to a UTF-8 multibyte string did not take surrogate pairs into account. [bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}' for any multibyte character M. [bug introduced in grep-2.6, which would segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would hit a failed assertion. ]
grep -F would get stuck in an infinite loop when given a search string that is an invalid byte sequence in the current locale and that matches the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE with UTF-8 locales, grep did not activate it. This would cause failures to match multibyte characters against some regular expressions, especially those including the '.' or '\p' metacharacters.
** 2.15 New features grep -P can now use a just-in-time compiler to greatly speed up matches, This feature is transparent to the user; no flag is required to enable it. It is only available if the corresponding support in the PCRE library is detected when grep is compiled.
|