.xx meta.keywords="regular expression pattern match regression test"
.MT 4
.TL
AT&T Research regex(3) regression tests
.AF "AT&T Research - Florham Park NJ"
.AU "Glenn Fowler <gsf@research.att.com>"
.H 1
.xx link="testregex.c	testregex.c 2004-05-31"
is the latest source for the AT&T Research regression test
harness for the
.xx link="http://www.opengroup.org/onlinepubs/007904975/functions/regcomp.html	X/Open regex"
pattern match interface.
See
.BR testregex (1)
for option and test input details.
The source and test data posted here are license free.
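.P
For orientation, the interface under test can be exercised roughly as
sketched here.
This is a minimal illustration that uses only the standard
\fBregcomp\fP, \fBregexec\fP and \fBregfree\fP calls;
the helper name \fBmatch_span\fP is hypothetical and is not part of
\fBtestregex\fP.

```c
#include <regex.h>

/*
 * Hypothetical helper (not part of testregex): compile pattern with
 * cflags, match it against subject, and store the overall match span.
 * Returns 0 on a match, REG_NOMATCH on no match, or the regcomp()
 * error code if the pattern does not compile.
 */
static int
match_span(const char *pattern, const char *subject, int cflags,
           regoff_t *so, regoff_t *eo)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    rc = regcomp(&re, pattern, cflags);
    if (rc != 0)
        return rc;              /* pattern did not compile */
    rc = regexec(&re, subject, 1, m, 0);
    if (rc == 0) {
        *so = m[0].rm_so;       /* offset of the first matched byte */
        *eo = m[0].rm_eo;       /* offset just past the match */
    }
    regfree(&re);
    return rc;
}
```

Each test in the data files amounts to a call of this kind,
a pattern, a subject string and compile flags,
paired with the match offsets a compliant implementation is expected to report.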
.P
.B testregex
can:
.BL
.LI
verify stability for a particular implementation in the face of
source code and/or compilation environment changes
.LI
verify standard compliance for all implementations
.LI
provide a basis for discussions on what
.I compliance
means
.LE
.P
See
.xx link="re-interpretation.html	An Interpretation of the POSIX regex Standards"
for an analysis of the POSIX-X/Open
.B regex
standards.
.H 1 "Reference Implementations"
.B testregex
is currently built against these reference implementations:
.TS
center box;
rb cb lb
r c l.
NAME	LABEL	AUTHORS
AT&T ast	\h'0*\w"http://www.research.att.com/sw/download/"'A\h'0'	Glenn Fowler and Doug McIlroy
bsd	\h'0*\w"ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-1.5.2/source/sets/src.tgz"'B\h'0'	\|
Bell Labs	\h'0*\w"http://www.bell-labs.com/"'D\h'0'	Doug McIlroy
old gnu	\h'0*\w"http://www.gnu.org"'G\h'0'	\|
gnu	\h'0*\w"http://www.gnu.org"'H\h'0'	Isamu Hasegawa
irix	\h'0*\w"http://www.sgi.com"'I\h'0'	\|
boost	\h'0*\w"http://www.boost.org/libs/regex/"'J\h'0'	John Maddock
regex++	\h'0*\w"http://ourworld.compuserve.com/homepages/John_Maddock/regexpp.htm"'M\h'0'	John Maddock
pcre perl compatible	\h'0*\w"http://www.pcre.org/"'P\h'0'	Philip Hazel
rx	\h'0*\w"ftp://regexps.com/pub/src/hackerlab/"'R\h'0'	Tom Lord
spencer	\h'0*\w"http://arglist.com/regex/rxspencer-alpha3.8.g2.tar.gz"'S\h'0'	Henry Spencer
libtre	\h'0*\w"http://kouli.iki.fi/~vlaurika/libtre/"'T\h'0'	Ville Laurikari
unix caldera	\h'0*\w"http://unixtools.sourceforge.net/"'U\h'0'	\|
.TE
.H 1 "Test Data Repository"
.TS
center box;
r l.
\h'0*\w"basic.dat"'basic.dat\h'0'	\|\|basic regex(3) -- all implementations should pass these
\h'0*\w"categorize.dat"'categorize.dat\h'0'	\|\|\h'0*\w"./re-categorize.html"'implementation categorization\h'0'
\h'0*\w"nullsubexpr.dat"'nullsubexpr.dat\h'0'	\|\|\h'0*\w"./re-nullsubexpr.html"'null (...)* tests\h'0'
\h'0*\w"leftassoc.dat"'leftassoc.dat\h'0'	\|\|\h'0*\w"./re-assoc.html"'left associative catenation implementation must pass these\h'0'
\h'0*\w"rightassoc.dat"'rightassoc.dat\h'0'	\|\|\h'0*\w"./re-assoc.html"'right associative catenation implementation must pass these\h'0'
\h'0*\w"forcedassoc.dat"'forcedassoc.dat\h'0'	\|\|\h'0*\w"./re-assoc.html"'subexpression grouping to force associativity\h'0'
\h'0*\w"repetition.dat"'repetition.dat\h'0'	\|\|\h'0*\w"./re-repetition.html"'explicit vs. implicit repetitions\h'0'
.TE
.H 1 "Usage"
To run the
.B basic.dat
tests:
.EX
testregex < basic.dat
.EE
.P
If the local implementation hangs or dumps core on some tests, run with
the \fB-c\fP option, which catches signals and non-terminating calls.
The \fB-h\fP option lists the test data format details.
The test data files exercise all features;
the test harness detects and ignores features not
supported by the local implementation.
.H 1 "Reference Implementation Notes"
.H 2 "D: diet libc"
The
.xx link="http://www.fefe.de/dietlibc/	diet libc"
implementation is currently omitted because it fails all but one
.B basic.dat
test.
.H 2 "P: PCRE"
The
.B P
implementation emulates
.BR perl (1)
and is not X/Open compliant by design.
The main differences are:
.BL
.LI
.B P
uses
.I "leftmost-first"
matching, as opposed to the X/Open
.IR "leftmost-longest" .
.LI
.B P
supports
.B REG_EXTENDED
patterns only.
.LE
.P
However, the
.B P
package regression tests, and
.BR perl (1)
features creeping into other implementations,
make it reasonable to include it here.
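.P
The leftmost-longest rule mentioned above can be observed with a short
sketch against the standard interface.
The helper name \fBere_match_length\fP is hypothetical, introduced only
for illustration, and the pattern and subject are chosen here as an
assumed minimal case rather than taken from the test data.

```c
#include <regex.h>

/*
 * Hypothetical helper: return the length of the overall match of an
 * extended RE against subject, or -1 if the pattern fails to compile
 * or does not match.
 */
static long
ere_match_length(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t m[1];
    long len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, subject, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so);
    regfree(&re);
    return len;
}
```

With the pattern \fBa|ab\fP and the subject \fBab\fP,
a leftmost-longest implementation should report a match of length 2,
because the longer alternative wins.
A leftmost-first matcher in the style of
.BR perl (1)
would stop at the first alternative that succeeds and report length 1.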
.H 1 "testregex Notes"
Extensions to the standard terminology are derived from the AT&T
implementation, unified under
.B <regex.h>
with these modes:
.TS
center allbox;
cb lb lb
r l l.
MODE	FLAGS	DESCRIPTION
BRE	0	basic RE
ERE	REG_EXTENDED	egrep RE with perl (...) extensions
ARE	REG_AUGMENTED	ERE with ! negation, <> word boundaries
SRE	REG_SHELL	sh patterns
KRE	REG_SHELL|REG_AUGMENTED	ksh93 patterns: ! @ ( | & ) { }
LRE	REG_LITERAL	fgrep patterns
.TE
.P
and a few flags to handle
.BR fnmatch (3):
.TS
center allbox;
lb lb
l l.
regex FLAG	fnmatch FLAG
REG_SHELL_ESCAPED	FNM_NOESCAPE
REG_SHELL_PATH	FNM_PATHNAME
REG_SHELL_DOT	FNM_PERIOD
.TE
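.P
The \fBREG_SHELL_*\fP flags are AT&T extensions and are not available in
other implementations, so the sketch below illustrates only the standard
.BR fnmatch (3)
side of the table.
The helper name \fBshell_match\fP is hypothetical.

```c
#include <fnmatch.h>

/*
 * Hypothetical helper: return 1 if the shell pattern matches name
 * under the given fnmatch() flags, 0 otherwise.
 */
static int
shell_match(const char *pattern, const char *name, int flags)
{
    return fnmatch(pattern, name, flags) == 0;
}
```

Without flags, a lone star pattern matches both a name containing a
slash and a name with a leading period.
\fBFNM_PATHNAME\fP requires each slash to be matched explicitly,
\fBFNM_PERIOD\fP requires a leading period to be matched explicitly,
and \fBFNM_NOESCAPE\fP makes a backslash in the pattern an ordinary
character instead of a quoting character.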
.P
The original
.L testregex.c
was written by Doug McIlroy at Bell Labs.
The current implementation is maintained by Glenn Fowler <gsf@research.att.com>.
150