.\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
.\" Copyright (c) 1992, 1993, 1994
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Henry Spencer.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)re_format.7	8.3 (Berkeley) 3/20/94
.\" $FreeBSD: src/lib/libc/regex/re_format.7,v 1.12 2008/09/05 17:41:20 keramida Exp $
.\"
.Dd August 6, 2015
.Dt RE_FORMAT 7
.Os
.Sh NAME
.Nm re_format
.Nd POSIX 1003.2 regular expressions
.Sh DESCRIPTION
Regular expressions
.Pq Dq RE Ns s ,
as defined in
.St -p1003.2 ,
come in two forms:
modern REs (roughly those of
.Xr egrep 1 ;
1003.2 calls these
.Dq extended
REs)
and obsolete REs (roughly those of
.Xr ed 1 ;
1003.2
.Dq basic
REs).
Obsolete REs mostly exist for backward compatibility in some old programs;
they will be discussed at the end.
.St -p1003.2
leaves some aspects of RE syntax and semantics open;
`\(dd' marks decisions on these aspects that
may not be fully portable to other
.St -p1003.2
implementations.
.Pp
A (modern) RE is one\(dd or more non-empty\(dd
.Em branches ,
separated by
.Ql \&| .
It matches anything that matches one of the branches.
.Pp
A branch is one\(dd or more
.Em pieces ,
concatenated.
It matches a match for the first, followed by a match for the second, etc.
.Pp
A piece is an
.Em atom
possibly followed
by a single\(dd
.Ql \&* ,
.Ql \&+ ,
.Ql \&? ,
or
.Em bound .
An atom followed by
.Ql \&*
matches a sequence of 0 or more matches of the atom.
An atom followed by
.Ql \&+
matches a sequence of 1 or more matches of the atom.
An atom followed by
.Ql ?\&
matches a sequence of 0 or 1 matches of the atom.
.Pp
A
.Em bound
is
.Ql \&{
followed by an unsigned decimal integer,
possibly followed by
.Ql \&,
possibly followed by another unsigned decimal integer,
always followed by
.Ql \&} .
The integers must lie between 0 and
.Dv RE_DUP_MAX
(255\(dd) inclusive,
and if there are two of them, the first may not exceed the second.
An atom followed by a bound containing one integer
.Em i
and no comma matches
a sequence of exactly
.Em i
matches of the atom.
An atom followed by a bound
containing one integer
.Em i
and a comma matches
a sequence of
.Em i
or more matches of the atom.
An atom followed by a bound
containing two integers
.Em i
and
.Em j
matches
a sequence of
.Em i
through
.Em j
(inclusive) matches of the atom.
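.Pp
As a minimal sketch of the bound rules above, the following C fragment
compiles an extended RE with
.Fn regcomp
and tests it with
.Fn regexec .
The helper name
.Fn re_matches
is hypothetical and exists only for this illustration; it is not part of
the regex library.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of regex(3)): returns 1 if the extended RE
 * `pattern` matches somewhere in `text`, 0 if it does not, and -1 if the
 * pattern fails to compile.
 */
int
re_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0);
}
```

With this helper,
.Ql ^ab{2}c$
accepts
.Ql abbc
but rejects
.Ql abc ,
and a bound whose first integer exceeds the second, such as
.Ql a{3,2} ,
is expected to be rejected at compile time.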
.Pp
An atom is a regular expression enclosed in
.Ql ()
(matching a match for the
regular expression),
an empty set of
.Ql ()
(matching the null string)\(dd,
a
.Em bracket expression
(see below),
.Ql .\&
(matching any single character),
.Ql \&^
(matching the null string at the beginning of a line),
.Ql \&$
(matching the null string at the end of a line), a
.Ql \e
followed by one of the characters
.Ql ^.[$()|*+?{\e
(matching that character taken as an ordinary character),
a
.Ql \e
followed by any other character\(dd
(matching that character taken as an ordinary character,
as if the
.Ql \e
had not been present\(dd),
or a single character with no other significance (matching that character).
A
.Ql \&{
followed by a character other than a digit is an ordinary
character, not the beginning of a bound\(dd.
It is illegal to end an RE with
.Ql \e .
.Pp
A
.Em bracket expression
is a list of characters enclosed in
.Ql [] .
It normally matches any single character from the list (but see below).
If the list begins with
.Ql \&^ ,
it matches any single character
(but see below)
.Em not
from the rest of the list.
If two characters in the list are separated by
.Ql \&- ,
this is shorthand
for the full
.Em range
of characters between those two (inclusive) in the
collating sequence,
.No e.g. Ql [0-9]
in ASCII matches any decimal digit.
It is illegal\(dd for two ranges to share an
endpoint,
.No e.g. Ql a-c-e .
Ranges are very collating-sequence-dependent,
and portable programs should avoid relying on them.
.Pp
To include a literal
.Ql \&]
in the list, make it the first character
(following a possible
.Ql \&^ ) .
To include a literal
.Ql \&- ,
make it the first or last character,
or the second endpoint of a range.
To use a literal
.Ql \&-
as the first endpoint of a range,
enclose it in
.Ql [.\&
and
.Ql .]\&
to make it a collating element (see below).
With the exception of these and some combinations using
.Ql \&[
(see next paragraphs), all other special characters, including
.Ql \e ,
lose their special significance within a bracket expression.
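.Pp
The placement rules for literal
.Ql \&]
and
.Ql \&-
can be checked with a short C sketch.
The helper
.Fn bracket_matches
is a hypothetical name introduced here for illustration; it anchors nothing
itself, so the patterns passed to it carry their own
.Ql \&^
and
.Ql \&$ .

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the extended RE `pattern` matches
 * `text`, 0 if not, and -1 if the pattern does not compile.
 */
int
bracket_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0);
}

/*
 * Patterns exercised by the sketch:
 *   "^[]a-]+$"  a list holding ']' (first), 'a', and '-' (last)
 *   "^[^]]$"    any single character except ']'
 *   "^[\\.]$"   the RE [\.] : a backslash or a dot, since the backslash
 *               is an ordinary character inside a bracket expression
 */
```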
.Pp
Within a bracket expression, a collating element (a character,
a multi-character sequence that collates as if it were a single character,
or a collating-sequence name for either)
enclosed in
.Ql [.\&
and
.Ql .]\&
stands for the
sequence of characters of that collating element.
The sequence is a single element of the bracket expression's list.
A bracket expression containing a multi-character collating element
can thus match more than one character,
e.g.\& if the collating sequence includes a
.Ql ch
collating element,
then the RE
.Ql [[.ch.]]*c
matches the first five characters
of
.Ql chchcc .
.Pp
Within a bracket expression, a collating element enclosed in
.Ql [=
and
.Ql =]
is an equivalence class, standing for the sequences of characters
of all collating elements equivalent to that one, including itself.
(If there are no other equivalent collating elements,
the treatment is as if the enclosing delimiters were
.Ql [.\&
and
.Ql .] . )
For example, if
.Ql x
and
.Ql y
are the members of an equivalence class,
then
.Ql [[=x=]] ,
.Ql [[=y=]] ,
and
.Ql [xy]
are all synonymous.
An equivalence class may not\(dd be an endpoint
of a range.
.Pp
Within a bracket expression, the name of a
.Em character class
enclosed in
.Ql [:
and
.Ql :]
stands for the list of all characters belonging to that
class.
Standard character class names are:
.Bl -column "alnum" "digit" "xdigit" -offset indent
.It Em "alnum	digit	punct"
.It Em "alpha	graph	space"
.It Em "blank	lower	upper"
.It Em "cntrl	print	xdigit"
.El
.Pp
These stand for the character classes defined in
.Xr ctype 3 .
A locale may provide others.
A character class may not be used as an endpoint of a range.
.Pp
A bracketed expression like
.Ql [[:class:]]
can be used to match a single character that belongs to a character
class.
For the reverse, matching any character that does not belong to a specific
class, the negation operator of bracket expressions may be used:
.Ql [^[:class:]] .
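.Pp
The following C sketch exercises a character class and its negation.
The helper name
.Fn class_matches
is hypothetical, and the results assume the default
.Dq C
locale, in which
.Em digit
covers 0 through 9 and
.Em space
covers the usual white-space characters.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the extended RE `pattern` matches
 * `text`, 0 if not, and -1 if the pattern does not compile.
 */
int
class_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0);
}

/*
 * "^[[:digit:]]+$"   one or more characters from the digit class
 * "^[^[:space:]]+$"  one or more characters NOT in the space class
 */
```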
.Pp
There are two special cases\(dd of bracket expressions:
the bracket expressions
.Ql [[:<:]]
and
.Ql [[:>:]]
match the null string at the beginning and end of a word respectively.
A word is defined as a sequence of word characters
which is neither preceded nor followed by
word characters.
A word character is an
.Em alnum
character (as defined by
.Xr ctype 3 )
or an underscore.
This is an extension,
compatible with but not specified by
.St -p1003.2 ,
and should be used with
caution in software intended to be portable to other systems.
.Pp
In the event that an RE could match more than one substring of a given
string,
the RE matches the one starting earliest in the string.
If the RE could match more than one substring starting at that point,
it matches the longest.
Subexpressions also match the longest possible substrings, subject to
the constraint that the whole match be as long as possible,
with subexpressions starting earlier in the RE taking priority over
ones starting later.
Note that higher-level subexpressions thus take priority over
their lower-level component subexpressions.
.Pp
Match lengths are measured in characters, not collating elements.
A null string is considered longer than no match at all.
For example,
.Ql bb*
matches the three middle characters of
.Ql abbbc ,
.Ql (wee|week)(knights|nights)
matches all ten characters of
.Ql weeknights ,
when
.Ql (.*).*\&
is matched against
.Ql abc
the parenthesized subexpression
matches all three characters, and
when
.Ql (a*)*
is matched against
.Ql bc
both the whole RE and the parenthesized
subexpression match the null string.
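.Pp
The leftmost-longest rule for the overall match can be observed through the
offsets that
.Fn regexec
reports in its
.Fa pmatch
array.
The sketch below is illustrative only;
.Fn first_match
is a hypothetical helper, and it inspects just the whole-match offsets, not
those of the subexpressions.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns the offsets of the overall match of the
 * extended RE `pattern` in `text`.  Both offsets are -1 if the pattern
 * does not compile or does not match.
 */
regmatch_t
first_match(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m;

	assert(pattern != NULL && text != NULL);
	m.rm_so = -1;
	m.rm_eo = -1;
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (m);
	if (regexec(&re, text, 1, &m, 0) != 0) {
		/* No match: do not rely on what regexec left in m. */
		m.rm_so = -1;
		m.rm_eo = -1;
	}
	regfree(&re);
	return (m);
}
```

For
.Ql bb*
against
.Ql abbbc
the helper reports offsets 1 and 4, and for
.Ql (a*)*
against
.Ql bc
it reports the null string at offset 0.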
.Pp
If case-independent matching is specified,
the effect is much as if all case distinctions had vanished from the
alphabet.
When an alphabetic that exists in multiple cases appears as an
ordinary character outside a bracket expression, it is effectively
transformed into a bracket expression containing both cases,
.No e.g. Ql x
becomes
.Ql [xX] .
When it appears inside a bracket expression, all case counterparts
of it are added to the bracket expression, so that (e.g.)
.Ql [x]
becomes
.Ql [xX]
and
.Ql [^x]
becomes
.Ql [^xX] .
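.Pp
Case-independent matching is requested with the
.Dv REG_ICASE
compilation flag of
.Fn regcomp .
The sketch below, built around a hypothetical helper named
.Fn matches_with_flags ,
shows the three transformations described above.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compiles `pattern` with the caller-supplied
 * regcomp() flags and returns 1 on a match against `text`, 0 on no
 * match, and -1 if the pattern does not compile.
 */
int
matches_with_flags(const char *pattern, const char *text, int cflags)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0);
}

/*
 * With REG_EXTENDED | REG_ICASE:
 *   "^x$"     behaves like ^[xX]$
 *   "^[x]$"   behaves like ^[xX]$
 *   "^[^x]$"  behaves like ^[^xX]$
 */
```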
.Pp
No particular limit is imposed on the length of REs\(dd.
Programs intended to be portable should not employ REs longer
than 256 bytes,
as an implementation can refuse to accept such REs and remain
POSIX-compliant.
.Pp
Obsolete
.Pq Dq basic
regular expressions differ in several respects.
.Ql \&|
is an ordinary character and there is no equivalent
for its functionality.
.Ql \&+
and
.Ql ?\&
are ordinary characters, and their functionality
can be expressed using bounds
.No ( Ql {1,}
or
.Ql {0,1}
respectively).
Also note that
.Ql x+
in modern REs is equivalent to
.Ql xx* .
The delimiters for bounds are
.Ql \e{
and
.Ql \e} ,
with
.Ql \&{
and
.Ql \&}
by themselves ordinary characters.
The parentheses for nested subexpressions are
.Ql \e(
and
.Ql \e) ,
with
.Ql \&(
and
.Ql \&)
by themselves ordinary characters.
.Ql \&^
is an ordinary character except at the beginning of the
RE or\(dd the beginning of a parenthesized subexpression,
.Ql \&$
is an ordinary character except at the end of the
RE or\(dd the end of a parenthesized subexpression,
and
.Ql \&*
is an ordinary character if it appears at the beginning of the
RE or the beginning of a parenthesized subexpression
(after a possible leading
.Ql \&^ ) .
Finally, there is one new type of atom, a
.Em back reference :
.Ql \e
followed by a non-zero decimal digit
.Em d
matches the same sequence of characters
matched by the
.Em d Ns th
parenthesized subexpression
(numbering subexpressions by the positions of their opening parentheses,
left to right),
so that (e.g.)
.Ql \e([bc]\e)\e1
matches
.Ql bb
or
.Ql cc
but not
.Ql bc .
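.Pp
A basic RE is what
.Fn regcomp
compiles when
.Dv REG_EXTENDED
is not given.
The sketch below uses a hypothetical helper,
.Fn bre_matches ,
to illustrate the back reference, the escaped bound delimiters, and the
ordinary-character status of
.Ql \&+ .
Note that each backslash of the RE is doubled inside a C string literal.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compiles `pattern` as a basic (obsolete) RE and
 * returns 1 on a match against `text`, 0 on no match, and -1 if the
 * pattern does not compile.
 */
int
bre_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	/* No REG_EXTENDED: the pattern is parsed as a basic RE. */
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0);
}

/*
 * "^\\([bc]\\)\\1$"  the RE ^\([bc]\)\1$ : bb or cc, but not bc
 * "^a\\{1,\\}$"      the RE ^a\{1,\}$    : one or more a's
 * "^a+$"             an 'a' followed by a literal '+'
 */
```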
.Sh ENHANCED FEATURES
When the
.Dv REG_ENHANCED
flag is passed to one of the
.Fn regcomp
variants, additional features are activated.
Like the enhanced
.Nm regex
implementations in scripting languages such as
.Xr perl 1
and
.Xr python 1 ,
these additional features may conflict with the
.St -p1003.2
standards in some ways.
Use this with care in situations which require portability
(including to past versions of Mac OS X using the previous
.Nm regex
implementation).
.Pp
For enhanced basic REs,
.Ql \&+ ,
.Ql \&?
and
.Ql \&|
remain regular characters, but
.Ql \e+ ,
.Ql \e?
and
.Ql \e|
have the same special meaning as the unescaped characters do for
extended REs, i.e., one or more matches, zero or one matches and alternation,
respectively.
For enhanced extended REs,
back references are available.
Additional enhanced features are listed below.
.Pp
Within a bracket expression, most characters lose their magic.
This also applies to the additional enhanced features, which don't operate
inside a bracket expression.
.Ss Assertions (available for both enhanced basic and enhanced extended REs)
In addition to
.Ql \&^
and
.Ql \&$
(the assertions that match the null string at the beginning and end of line,
respectively), the following assertions become available:
.Bl -tag -width ".Sy \eB" -offset indent
.It Sy \e<
Matches the null string at the beginning of a word.
This is equivalent to
.Ql [[:<:]] .
.It Sy \e>
Matches the null string at the end of a word.
This is equivalent to
.Ql [[:>:]] .
.It Sy \eb
Matches the null string at a word boundary (either the beginning or end of
a word).
.It Sy \eB
Matches the null string where there is no word boundary.
This is the opposite of
.Ql \eb .
.El
.Ss Shortcuts (available for both enhanced basic and enhanced extended REs)
The following shortcuts can be used to replace more complicated
bracket expressions.
.Bl -tag -width ".Sy \eD" -offset indent
.It Sy \ed
Matches a digit character.
This is equivalent to
.Ql [[:digit:]] .
.It Sy \eD
Matches a non-digit character.
This is equivalent to
.Ql [^[:digit:]] .
.It Sy \es
Matches a space character.
This is equivalent to
.Ql [[:space:]] .
.It Sy \eS
Matches a non-space character.
This is equivalent to
.Ql [^[:space:]] .
.It Sy \ew
Matches a word character.
This is equivalent to
.Ql [[:alnum:]_] .
.It Sy \eW
Matches a non-word character.
This is equivalent to
.Ql [^[:alnum:]_] .
.El
.Ss Literal Sequences (available for both enhanced basic and enhanced extended REs)
Literals are normally just ordinary characters that are matched directly.
Under enhanced mode, certain character sequences are
converted to specific literals.
.Bl -tag -width ".Sy \ea" -offset indent
.It Sy \ea
The
.Dq bell
character (ASCII code 7).
.It Sy \ee
The
.Dq escape
character (ASCII code 27).
.It Sy \ef
The
.Dq form-feed
character (ASCII code 12).
.It Sy \en
The
.Dq new-line/line-feed
character (ASCII code 10).
.It Sy \er
The
.Dq carriage-return
character (ASCII code 13).
.It Sy \et
The
.Dq horizontal-tab
character (ASCII code 9).
.El
.Pp
Literals can also be specified directly, using their wide character values.
Note that when matching a multibyte character string, the string's bytes
are converted to wide characters before comparing.
This means that a single literal wide character value may match more than
one string byte, depending on the locale's wide character encoding.
.Bl -tag -width ".Sy \ex{ Ns Em x.. Ns Sy \&}" -offset indent
.It Sy \ex Ns Em x..
An arbitrary eight-bit value.
The
.Em x..
sequence represents zero, one or two hexadecimal digits.
(Note: if
.Em x..
is less than two hexadecimal digits, and the character following this sequence
happens to be a hexadecimal digit, use the (following) brace form to avoid
confusion.)
.It Sy \ex{ Ns Em x.. Ns Sy \&}
An arbitrary, up to 32-bit value.
The
.Em x..
sequence is an arbitrary sequence of hexadecimal digits that is long enough
to represent the necessary value.
.El
.Ss Inline Literal Mode (available for both enhanced basic and enhanced extended REs)
A
.Ql \eQ
sequence causes literal
.Pq Dq quote
mode to be entered,
while
.Ql \eE
ends literal mode, and returns to normal regular expression processing.
This is similar to specifying the
.Dv REG_NOSPEC
(or
.Dv REG_LITERAL )
option to
.Fn regcomp ,
except that rather than applying to the whole RE string, it only applies to
the part between the
.Ql \eQ
and
.Ql \eE .
Note that it is not possible to have a
.Ql \eE
in the middle of an inline literal range, as that would terminate literal mode
prematurely.
.Ss Minimal Repetitions (available for enhanced extended REs only)
By default, the repetition operators,
.Ql \&* ,
.Em bound ,
.Ql \&?
and
.Ql \&+
are
.Em greedy ;
they try to match as many times as possible.
In enhanced mode, appending a
.Ql \&?
to a repetition operator makes it minimal (or
.Em ungreedy ) ;
it tries to match as few times as possible (including zero times, as
appropriate).
.Pp
For example, against the string
.Ql aaa ,
the RE
.Ql a*
would match the entire string,
while
.Ql a*?
would match the null string at the beginning of the string
(matching zero times).
Likewise, against the string
.Ql ababab ,
the RE
.Ql .*b
would also match the entire string,
while
.Ql .*?b
would only match the first two characters.
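.Pp
The greedy and minimal examples above can be checked with a short C sketch.
It assumes that enhanced mode is requested with the
.Dv REG_ENHANCED
flag described in
.Xr regex 3 ;
the names
.Fn match_length
and
.Dv MATCH_UNSUPPORTED
are hypothetical and exist only for this illustration.
```c
#include <regex.h>

/* Hypothetical sentinel: the C library offers no enhanced mode. */
#define MATCH_UNSUPPORTED (-2)

/*
 * Return the length of the first match of `pattern` in `text`,
 * -1 if the pattern does not compile or does not match, or
 * MATCH_UNSUPPORTED if REG_ENHANCED is not available.
 */
static long
match_length(const char *pattern, const char *text)
{
#ifdef REG_ENHANCED
    regex_t re;
    regmatch_t m[1];
    long len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_ENHANCED) != 0)
        return (-1);
    if (regexec(&re, text, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so);
    regfree(&re);
    return (len);
#else
    (void)pattern;
    (void)text;
    return (MATCH_UNSUPPORTED);
#endif
}
```
Going by the text above, on a library with enhanced mode
.Ql a*
and
.Ql a*?
against
.Ql aaa
should give lengths 3 and 0, and
.Ql .*b
and
.Ql .*?b
against
.Ql ababab
should give lengths 6 and 2.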
.Pp
The
.Fn regcomp
flag
.Dv REG_UNGREEDY
will make the regular
.Pq greedy
repetition operators ungreedy by default.
Appending
.Ql \&?
makes them greedy again.
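.Pp
The effect of
.Dv REG_UNGREEDY
can be sketched in the same way.
The helper name
.Fn ungreedy_match_length
and the sentinel
.Dv MATCH_UNSUPPORTED
are hypothetical, and combining the flag with
.Dv REG_ENHANCED
is an assumption made so that the
.Ql \&?
suffix is recognized.
```c
#include <regex.h>

/* Hypothetical sentinel: the C library lacks the extension flags. */
#define MATCH_UNSUPPORTED (-2)

/*
 * Compile `pattern` with REG_UNGREEDY and return the length of its
 * first match in `text`, -1 on compile failure or no match, or
 * MATCH_UNSUPPORTED if the flags are not available.
 */
static long
ungreedy_match_length(const char *pattern, const char *text)
{
#if defined(REG_ENHANCED) && defined(REG_UNGREEDY)
    regex_t re;
    regmatch_t m[1];
    long len = -1;
    int flags = REG_EXTENDED | REG_ENHANCED | REG_UNGREEDY;

    if (regcomp(&re, pattern, flags) != 0)
        return (-1);
    if (regexec(&re, text, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so);
    regfree(&re);
    return (len);
#else
    (void)pattern;
    (void)text;
    return (MATCH_UNSUPPORTED);
#endif
}
```
With the defaults inverted, the roles in the earlier example should swap:
.Ql a*
against
.Ql aaa
is expected to match the null string, and
.Ql a*?
the entire string.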
.Pp
Note that minimal repetitions are not specified by an official
standard, so there may be differences between different implementations.
In the current implementation, minimal repetitions have a high precedence,
and can cause other standards requirements to be violated.
For instance, on the string
.Ql aaaaa ,
the RE
.Ql (aaa??)*
will only match the first four characters, violating the rules that the longest
possible match is made and the longest subexpressions are matched.
Using
.Ql (aaa??)*$
forces the entire string to be matched.
.Ss Non-capturing Parenthesized Subexpressions (available for enhanced extended REs only)
Normally, the match offsets to parenthesized subexpressions are
recorded in the
.Fa pmatch
array (that is, when
.Dv REG_NOSUB
is not specified, and
.Fa nmatch
is large enough to encompass the parenthesized subexpression in question).
In enhanced mode, if the first two characters following the left parenthesis
are
.Ql ?: ,
grouping of the remaining contents is done, but the corresponding offsets are
not recorded in the
.Fa pmatch
array.
For example, against the string
.Ql fubar ,
the RE
.Ql (fu)(bar)
would have two subexpression matches in
.Fa pmatch ;
the first for
.Ql fu
and the second for
.Ql bar .
But with the RE
.Ql (?:fu)(bar) ,
there would only be one subexpression match, that of
.Ql bar .
Furthermore,
against the string
.Ql fufubar ,
the RE
.Ql (?:fu)*(bar)
would again match the entire string, but only
.Ql bar
would be recorded in
.Fa pmatch .
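.Pp
The following C sketch shows how the recorded offsets could be inspected.
It assumes the
.Dv REG_ENHANCED
flag from
.Xr regex 3 ;
.Fn sub_start ,
.Dv NSLOTS
and
.Dv MATCH_UNSUPPORTED
are hypothetical names chosen for this illustration.
```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical sentinel: the C library offers no enhanced mode. */
#define MATCH_UNSUPPORTED (-2)
/* Illustrative pmatch size: whole match plus up to three groups. */
#define NSLOTS 4

/*
 * Return the start offset recorded in pmatch[idx] after matching
 * `pattern` against `text`.  The result is -1 if the slot is unused,
 * the pattern fails to compile, or nothing matches, and
 * MATCH_UNSUPPORTED if REG_ENHANCED is not available.
 */
static long
sub_start(const char *pattern, const char *text, size_t idx)
{
#ifdef REG_ENHANCED
    regex_t re;
    regmatch_t m[NSLOTS];
    long start = -1;

    if (idx >= NSLOTS)
        return (-1);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ENHANCED) != 0)
        return (-1);
    if (regexec(&re, text, NSLOTS, m, 0) == 0)
        start = (long)m[idx].rm_so;
    regfree(&re);
    return (start);
#else
    (void)pattern;
    (void)text;
    (void)idx;
    return (MATCH_UNSUPPORTED);
#endif
}
```
For
.Ql (?:fu)(bar)
against
.Ql fubar ,
slot 1 should start at offset 2 (the
.Ql bar
match) and slot 2 should be unused, whereas
.Ql (fu)(bar)
fills slots 1 and 2 with offsets 0 and 2.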
.Ss Inline Options (available for enhanced extended REs only)
Like the inline literal mode mentioned above, other options can be switched
on and off for part of a RE.
.Ql (? Ns Em o.. Ns \&)
will turn on the options specified in
.Em o..
(one or more option characters; see below), while
.Ql (?- Ns Em o.. Ns \&)
will turn off the specified options, and
.Ql (? Ns Em o1.. Ns \&- Ns Em o2.. Ns \&)
will turn on the first set of options, and turn off the second set.
.Pp
The available options are:
.Bl -tag -width ".Sy \&U" -offset indent
.It Sy \&i
Turning on this option will ignore case during matching, while turning it off
will restore case-sensitive matching.
If
.Dv REG_ICASE
was specified to
.Fn regcomp ,
this option can be used to turn that off.
.It Sy \&n
Turn on or off special handling of the newline character.
If
.Dv REG_NEWLINE
was specified to
.Fn regcomp ,
this option can be used to turn that off.
.It Sy \&U
Turning on this option will make ungreedy repetitions the default, while
turning it off will make greedy repetitions the default.
If
.Dv REG_UNGREEDY
was specified to
.Fn regcomp ,
this option can be used to turn that off.
.El
.Pp
The scope of the option change begins immediately following the right
parenthesis
and extends to the end of the enclosing subexpression (if any).
Thus, for example, given the RE
.Ql (fu(?i)bar)baz ,
the
.Ql fu
portion matches case sensitively,
.Ql bar
matches case insensitively, and
.Ql baz
matches case sensitively again (since it is outside the scope of the
subexpression in which the inline option was specified).
.Pp
The inline options syntax can be combined with the non-capturing parenthesized
subexpression to limit the option scope to just that of the subexpression.
Then, for example,
.Ql fu(?i:bar)baz
is similar to the previous example, except for the parenthesized subexpression
around
.Ql fu(?i)bar
in the previous example.
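.Pp
The scoping rules can be exercised with a small C sketch.
It assumes the
.Dv REG_ENHANCED
flag from
.Xr regex 3 ;
.Fn re_matches
and
.Dv MATCH_UNSUPPORTED
are hypothetical names used only here.
```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical sentinel: the C library offers no enhanced mode. */
#define MATCH_UNSUPPORTED (-2)

/*
 * Return 1 if `pattern` matches somewhere in `text`, 0 if it does
 * not, -1 if the pattern fails to compile, or MATCH_UNSUPPORTED if
 * REG_ENHANCED is not available.
 */
static int
re_matches(const char *pattern, const char *text)
{
#ifdef REG_ENHANCED
    regex_t re;
    int rc;
    int flags = REG_EXTENDED | REG_ENHANCED | REG_NOSUB;

    if (regcomp(&re, pattern, flags) != 0)
        return (-1);
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return (rc == 0);
#else
    (void)pattern;
    (void)text;
    return (MATCH_UNSUPPORTED);
#endif
}
```
Both
.Ql (fu(?i)bar)baz
and
.Ql fu(?i:bar)baz
should accept
.Ql fuBARbaz
but reject
.Ql fubarBAZ ,
because only the
.Ql bar
portion is matched case insensitively.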
.Ss Inline Comments (available for enhanced extended REs only)
The syntax
.Ql (?# Ns Em comment Ns \&)
can be used to embed comments within a RE.
Note that
.Em comment
cannot contain a right parenthesis.
Also note that while syntactically, option characters can be added before
the
.Ql \&#
character, they will be ignored.
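.Pp
As a sketch, an embedded comment should leave the match unchanged.
The code assumes the
.Dv REG_ENHANCED
flag from
.Xr regex 3 ;
.Fn commented_match_length
and
.Dv MATCH_UNSUPPORTED
are hypothetical names.
```c
#include <regex.h>

/* Hypothetical sentinel: the C library offers no enhanced mode. */
#define MATCH_UNSUPPORTED (-2)

/*
 * Return the length of the first match of `pattern` (which may
 * contain "(?#...)" comments) in `text`, -1 on compile failure or
 * no match, or MATCH_UNSUPPORTED if REG_ENHANCED is not available.
 */
static long
commented_match_length(const char *pattern, const char *text)
{
#ifdef REG_ENHANCED
    regex_t re;
    regmatch_t m[1];
    long len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_ENHANCED) != 0)
        return (-1);
    if (regexec(&re, text, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so);
    regfree(&re);
    return (len);
#else
    (void)pattern;
    (void)text;
    return (MATCH_UNSUPPORTED);
#endif
}
```
Under these assumptions
.Ql fu(?#prefix)bar
behaves like
.Ql fubar :
it matches all five characters of
.Ql fubar
and does not match
.Ql fu bar .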
.Sh SEE ALSO
.Xr regex 3
.Rs
.%T Regular Expression Notation
.%R IEEE Std
.%N 1003.2
.%P section 2.8
.Re
.Sh BUGS
Having two kinds of REs is a botch.
.Pp
The current
.St -p1003.2
spec says that
.Ql \&)
is an ordinary character in
the absence of an unmatched
.Ql \&( ;
this was an unintentional result of a wording error,
and change is likely.
Avoid relying on it.
.Pp
Back references are a dreadful botch,
posing major problems for efficient implementations.
They are also somewhat vaguely defined
(does
.Ql a\e(\e(b\e)*\e2\e)*d
match
.Ql abbbd ? ) .
Avoid using them.
.Pp
.St -p1003.2
specification of case-independent matching is vague.
The
.Dq one case implies all cases
definition given above
is the current consensus among implementors as to the right interpretation.
.Pp
The bracket syntax for word boundaries is incredibly ugly.