\input texinfo  @c -*-texinfo-*-
@c %**start of header
@setfilename grep.info
@include version.texi
@settitle GNU Grep @value{VERSION}

@c Combine indices.
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp
@defcodeindex op
@syncodeindex op cp
@syncodeindex vr cp
@c %**end of header

@documentencoding UTF-8

@copying
This manual is for @command{grep}, a pattern matching engine.

Copyright @copyright{} 1999-2002, 2005, 2008-2015 Free Software Foundation,
Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
Texts.  A copy of the license is included in the section entitled
``GNU Free Documentation License''.
@end quotation
@end copying

@dircategory Text creation and manipulation
@direntry
* grep: (grep).                 Print lines matching a pattern.
@end direntry

@titlepage
@title GNU Grep: Print lines matching a pattern
@subtitle version @value{VERSION}, @value{UPDATED}
@author Alain Magloire et al.
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents


@ifnottex
@node Top
@top grep

@command{grep} prints lines that contain a match for a pattern.

This manual is for version @value{VERSION} of GNU Grep.

@insertcopying
@end ifnottex

@menu
* Introduction::                Introduction.
* Invoking::                    Command-line options, environment, exit status.
* Regular Expressions::         Regular Expressions.
* Usage::                       Examples.
* Reporting Bugs::              Reporting Bugs.
* Copying::                     License terms for this manual.
* Index::                       Combined index.
@end menu


@node Introduction
@chapter Introduction

@cindex searching for a pattern

@command{grep} searches input files
for lines containing a match to a given pattern list.
When it finds a match in a line,
it copies the line to standard output (by default),
or produces whatever other sort of output you have requested with options.

Though @command{grep} expects to do the matching on text,
it has no limits on input line length other than available memory,
and it can match arbitrary characters within a line.
If the final byte of an input file is not a newline,
@command{grep} silently supplies one.
Since newline is also a separator for the list of patterns,
there is no way to match newline characters in a text.


@node Invoking
@chapter Invoking @command{grep}

The general synopsis of the @command{grep} command line is

@example
grep @var{options} @var{pattern} @var{input_file_names}
@end example

@noindent
There can be zero or more @var{options}.
@var{pattern} will only be seen as such
(and not as an @var{input_file_name})
if it wasn't already specified within @var{options}
(by using the @samp{-e@ @var{pattern}}
or @samp{-f@ @var{file}} options).
There can be zero or more @var{input_file_names}.
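
As a rough illustration of the rule above, the following sketch assumes
GNU @command{grep} and a scratch file named @file{hello} (a made-up
name whose contents are also made up).  Without @option{-e} the first
operand is taken as the pattern; once @option{-e} supplies the pattern,
every operand is a file name.

@example
```shell
# Hypothetical scratch file whose name doubles as a plausible pattern.
tmp=$(mktemp -d)
printf '%s\n' 'hello' > "$tmp/hello"

# No -e: the first operand 'hello' is the pattern, the second is the file.
first_is_pattern=$(cd "$tmp" && grep 'hello' hello)

# With -e: the pattern is 'ell', so the operand 'hello' names the file.
first_is_file=$(cd "$tmp" && grep -e 'ell' hello)

printf '%s\n' "$first_is_pattern" "$first_is_file"
rm -r "$tmp"
```
@end example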

@menu
* Command-line Options::        Short and long names, grouped by category.
* Environment Variables::       POSIX, GNU generic, and GNU grep specific.
* Exit Status::                 Exit status returned by @command{grep}.
* grep Programs::               @command{grep} programs.
@end menu

@node Command-line Options
@section Command-line Options

@command{grep} comes with a rich set of options:
some from POSIX and some being GNU extensions.
Long option names are always a GNU extension,
even for options that are from POSIX specifications.
Options that are specified by POSIX,
under their short names,
are explicitly marked as such
to facilitate POSIX-portable programming.
A few option names are provided
for compatibility with older or more exotic implementations.

@menu
* Generic Program Information::
* Matching Control::
* General Output Control::
* Output Line Prefix Control::
* Context Line Control::
* File and Directory Selection::
* Other Options::
@end menu

Several additional options control
which variant of the @command{grep} matching engine is used.
@xref{grep Programs}.

@node Generic Program Information
@subsection Generic Program Information

@table @option

@item --help
@opindex --help
@cindex usage summary, printing
Print a usage message briefly summarizing the command-line options
and the bug-reporting address, then exit.

@item -V
@itemx --version
@opindex -V
@opindex --version
@cindex version, printing
Print the version number of @command{grep} to the standard output stream.
This version number should be included in all bug reports.

@end table

@node Matching Control
@subsection Matching Control

@table @option

@item -e @var{pattern}
@itemx --regexp=@var{pattern}
@opindex -e
@opindex --regexp=@var{pattern}
@cindex pattern list
Use @var{pattern} as the pattern.
This can be used to specify multiple search patterns,
or to protect a pattern beginning with a @samp{-}.
(@option{-e} is specified by POSIX.)

@item -f @var{file}
@itemx --file=@var{file}
@opindex -f
@opindex --file
@cindex pattern from file
Obtain patterns from @var{file}, one per line.
The empty file contains zero patterns, and therefore matches nothing.
(@option{-f} is specified by POSIX.)

@item -i
@itemx -y
@itemx --ignore-case
@opindex -i
@opindex -y
@opindex --ignore-case
@cindex case insensitive search
Ignore case distinctions, so that characters that differ only in case
match each other.  Although this is straightforward when letters
differ in case only via lowercase-uppercase pairs, the behavior is
unspecified in other situations.  For example, uppercase ``S'' has an
unusual lowercase counterpart ``ſ'' (Unicode character U+017F, LATIN
SMALL LETTER LONG S) in many locales, and it is unspecified whether
this unusual character matches ``S'' or ``s'' even though uppercasing
it yields ``S''.  Another example: the lowercase German letter ``ß''
(U+00DF, LATIN SMALL LETTER SHARP S) is normally capitalized as the
two-character string ``SS'' but it does not match ``SS'', and it might
not match the uppercase letter ``ẞ'' (U+1E9E, LATIN CAPITAL LETTER
SHARP S) even though lowercasing the latter yields the former.

@option{-y} is an obsolete synonym that is provided for compatibility.
(@option{-i} is specified by POSIX.)

@item -v
@itemx --invert-match
@opindex -v
@opindex --invert-match
@cindex invert matching
@cindex print non-matching lines
Invert the sense of matching, to select non-matching lines.
(@option{-v} is specified by POSIX.)

@item -w
@itemx --word-regexp
@opindex -w
@opindex --word-regexp
@cindex matching whole words
Select only those lines containing matches that form whole words.
The test is that the matching substring must either
be at the beginning of the line,
or preceded by a non-word constituent character.
Similarly,
it must be either at the end of the line
or followed by a non-word constituent character.
Word-constituent characters are letters, digits, and the underscore.

@item -x
@itemx --line-regexp
@opindex -x
@opindex --line-regexp
@cindex match the whole line
Select only those matches that exactly match the whole line.
(@option{-x} is specified by POSIX.)

@end table
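
The matching-control options above can be tried out with a sketch like
the following.  It assumes GNU @command{grep}; the file name
@file{sample.txt} and its four lines are invented for the illustration.

@example
```shell
# Hypothetical sample data: four short lines.
tmp=$(mktemp -d)
printf '%s\n' 'foo' 'foo bar' 'food' '-v flag' > "$tmp/sample.txt"

# -e protects a pattern that begins with '-'.
dash=$(grep -e '-v' "$tmp/sample.txt")

# Several -e options select lines that match any of the patterns.
either=$(grep -c -e 'bar' -e 'flag' "$tmp/sample.txt")

# -w: 'foo' must form a whole word, so 'food' is not selected.
words=$(grep -c -w 'foo' "$tmp/sample.txt")

# -x: the match must span the whole line, so only 'foo' is selected.
whole=$(grep -c -x 'foo' "$tmp/sample.txt")

# -v: count the lines that do not contain 'foo'.
inverted=$(grep -c -v 'foo' "$tmp/sample.txt")

printf '%s\n' "$dash" "$either" "$words" "$whole" "$inverted"
rm -r "$tmp"
```
@end example

The counts come from @option{-c}, which is used here only to keep the
results short.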

@node General Output Control
@subsection General Output Control

@table @option

@item -c
@itemx --count
@opindex -c
@opindex --count
@cindex counting lines
Suppress normal output;
instead print a count of matching lines for each input file.
With the @option{-v} (@option{--invert-match}) option,
count non-matching lines.
(@option{-c} is specified by POSIX.)

@item --color[=@var{WHEN}]
@itemx --colour[=@var{WHEN}]
@opindex --color
@opindex --colour
@cindex highlight, color, colour
Surround the matched (non-empty) strings, matching lines, context lines,
file names, line numbers, byte offsets, and separators (for fields and
groups of context lines) with escape sequences to display them in color
on the terminal.
The colors are defined by the environment variable @env{GREP_COLORS}
and default to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
for bold red matched text, magenta file names, green line numbers,
green byte offsets, cyan separators, and default terminal colors otherwise.
The deprecated environment variable @env{GREP_COLOR} is still supported,
but its setting does not have priority;
it defaults to @samp{01;31} (bold red)
which only covers the color for matched text.
@var{WHEN} is @samp{never}, @samp{always}, or @samp{auto}.

@item -L
@itemx --files-without-match
@opindex -L
@opindex --files-without-match
@cindex files which don't match
Suppress normal output;
instead print the name of each input file from which
no output would normally have been printed.
The scanning of each file stops on the first match.

@item -l
@itemx --files-with-matches
@opindex -l
@opindex --files-with-matches
@cindex names of matching files
Suppress normal output;
instead print the name of each input file from which
output would normally have been printed.
The scanning of each file stops on the first match.
(@option{-l} is specified by POSIX.)

@item -m @var{num}
@itemx --max-count=@var{num}
@opindex -m
@opindex --max-count
@cindex max-count
Stop reading a file after @var{num} matching lines.
If the input is standard input from a regular file,
and @var{num} matching lines are output,
@command{grep} ensures that the standard input is positioned
just after the last matching line before exiting,
regardless of the presence of trailing context lines.
This enables a calling process to resume a search.
For example, the following shell script makes use of it:

@example
while grep -m 1 PATTERN
do
  echo xxxx
done < FILE
@end example

But the following probably will not work because a pipe is not a regular
file:

@example
# This probably will not work.
cat FILE |
while grep -m 1 PATTERN
do
  echo xxxx
done
@end example

When @command{grep} stops after @var{num} matching lines,
it outputs any trailing context lines.
Since context does not include matching lines,
@command{grep} will stop when it encounters another matching line.
When the @option{-c} or @option{--count} option is also used,
@command{grep} does not output a count greater than @var{num}.
When the @option{-v} or @option{--invert-match} option is also used,
@command{grep} stops after outputting @var{num} non-matching lines.

@item -o
@itemx --only-matching
@opindex -o
@opindex --only-matching
@cindex only matching
Print only the matched (non-empty) parts of matching lines,
with each such part on a separate output line.

@item -q
@itemx --quiet
@itemx --silent
@opindex -q
@opindex --quiet
@opindex --silent
@cindex quiet, silent
Quiet; do not write anything to standard output.
Exit immediately with zero status if any match is found,
even if an error was detected.
Also see the @option{-s} or @option{--no-messages} option.
(@option{-q} is specified by POSIX.)

@item -s
@itemx --no-messages
@opindex -s
@opindex --no-messages
@cindex suppress error messages
Suppress error messages about nonexistent or unreadable files.
Portability note:
unlike GNU @command{grep},
7th Edition Unix @command{grep} did not conform to POSIX,
because it lacked @option{-q}
and its @option{-s} option behaved like
GNU @command{grep}'s @option{-q} option.@footnote{Of course, 7th Edition
Unix predated POSIX by several years!}
USG-style @command{grep} also lacked @option{-q}
but its @option{-s} option behaved like GNU @command{grep}'s.
Portable shell scripts should avoid both
@option{-q} and @option{-s} and should redirect
standard and error output to @file{/dev/null} instead.
(@option{-s} is specified by POSIX.)

@end table
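
To make the difference between these output modes concrete, here is a
small sketch.  It assumes GNU @command{grep} (@option{-L} and
@option{-o} are not in POSIX), and the files @file{a.txt} and
@file{b.txt} are invented sample data.

@example
```shell
# Hypothetical sample data: a.txt mentions apples, b.txt does not.
tmp=$(mktemp -d)
printf '%s\n' 'one apple' 'two apples, one apple' 'pear' > "$tmp/a.txt"
printf '%s\n' 'plum' > "$tmp/b.txt"

# -c counts matching lines, not individual matches.
count=$(grep -c 'apple' "$tmp/a.txt")

# -o prints each matched part on its own line, so occurrences can be counted.
parts=$(grep -o 'apple' "$tmp/a.txt" | wc -l | tr -d ' ')

# -m 1 stops after the first matching line.
first=$(grep -m 1 'apple' "$tmp/a.txt")

# -l lists files with a match, -L lists files without one.
with=$(cd "$tmp" && grep -l 'apple' a.txt b.txt)
without=$(cd "$tmp" && grep -L 'apple' a.txt b.txt)

# -q prints nothing; only the exit status reports whether a match exists.
if grep -q 'apple' "$tmp/a.txt"; then found=yes; else found=no; fi

printf '%s\n' "$count" "$parts" "$first" "$with" "$without" "$found"
rm -r "$tmp"
```
@end example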

@node Output Line Prefix Control
@subsection Output Line Prefix Control

When several prefix fields are to be output,
the order is always file name, line number, and byte offset,
regardless of the order in which these options were specified.

@table @option

@item -b
@itemx --byte-offset
@opindex -b
@opindex --byte-offset
@cindex byte offset
Print the 0-based byte offset within the input file
before each line of output.
If @option{-o} (@option{--only-matching}) is specified,
print the offset of the matching part itself.
When @command{grep} runs on MS-DOS or MS-Windows,
the printed byte offsets depend on whether
the @option{-u} (@option{--unix-byte-offsets}) option is used;
see below.

@item -H
@itemx --with-filename
@opindex -H
@opindex --with-filename
@cindex with filename prefix
Print the file name for each match.
This is the default when there is more than one file to search.

@item -h
@itemx --no-filename
@opindex -h
@opindex --no-filename
@cindex no filename prefix
Suppress the prefixing of file names on output.
This is the default when there is only one file
(or only standard input) to search.

@item --label=@var{LABEL}
@opindex --label
@cindex changing name of standard input
Display input actually coming from standard input
as input coming from file @var{LABEL}.  This is
especially useful when implementing tools like
@command{zgrep}; e.g.:

@example
gzip -cd foo.gz | grep --label=foo -H something
@end example

@item -n
@itemx --line-number
@opindex -n
@opindex --line-number
@cindex line numbering
Prefix each line of output with the 1-based line number within its input file.
(@option{-n} is specified by POSIX.)

@item -T
@itemx --initial-tab
@opindex -T
@opindex --initial-tab
@cindex tab-aligned content lines
Make sure that the first character of actual line content lies on a tab stop,
so that the alignment of tabs looks normal.
This is useful with options that prefix their output to the actual content:
@option{-H}, @option{-n}, and @option{-b}.
In order to improve the probability that lines
from a single file will all start at the same column,
this also causes the line number and byte offset (if present)
to be printed in a minimum-size field width.

@item -u
@itemx --unix-byte-offsets
@opindex -u
@opindex --unix-byte-offsets
@cindex MS-DOS/MS-Windows byte offsets
@cindex byte offsets, on MS-DOS/MS-Windows
Report Unix-style byte offsets.
This option causes @command{grep} to report byte offsets
as if the file were a Unix-style text file,
i.e., the byte offsets ignore carriage returns that were stripped.
This will produce results identical
to running @command{grep} on a Unix machine.
This option has no effect unless the @option{-b} option is also used;
it has no effect on platforms other than MS-DOS and MS-Windows.

@item -Z
@itemx --null
@opindex -Z
@opindex --null
@cindex zero-terminated file names
Output a zero byte (the ASCII NUL character)
instead of the character that normally follows a file name.
For example,
@samp{grep -lZ} outputs a zero byte after each file name
instead of the usual newline.
This option makes the output unambiguous,
even in the presence of file names containing unusual characters like newlines.
This option can be used with commands like
@samp{find -print0}, @samp{perl -0}, @samp{sort -z}, and @samp{xargs -0}
to process arbitrary file names,
even those that contain newline characters.

@end table
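
The fixed ordering of the prefix fields can be seen in a sketch such as
the one below.  It assumes GNU @command{grep} on a Unix-like system
(where lines end in a single newline byte); the file @file{f.txt} and
the label @samp{stdin-data} are made-up names.

@example
```shell
# Hypothetical sample data: three lines of 6, 5, and 6 bytes.
tmp=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'gamma' > "$tmp/f.txt"

# -n: 1-based line number.
numbered=$(grep -n 'beta' "$tmp/f.txt")

# -b: 0-based byte offset of the start of the line.
offset=$(grep -b 'gamma' "$tmp/f.txt")

# The options are given as -b -n -H, yet the fields appear as
# file name, line number, byte offset.
all=$(cd "$tmp" && grep -b -n -H 'beta' f.txt)

# --label names standard input when -H asks for a file name.
labeled=$(printf '%s\n' 'alpha' 'beta' | grep --label=stdin-data -H 'beta')

printf '%s\n' "$numbered" "$offset" "$all" "$labeled"
rm -r "$tmp"
```
@end example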

@node Context Line Control
@subsection Context Line Control

Regardless of how these options are set,
@command{grep} will never print any given line more than once.
If the @option{-o} (@option{--only-matching}) option is specified,
these options have no effect and a warning is given upon their use.

@table @option

@item -A @var{num}
@itemx --after-context=@var{num}
@opindex -A
@opindex --after-context
@cindex after context
@cindex context lines, after match
Print @var{num} lines of trailing context after matching lines.

@item -B @var{num}
@itemx --before-context=@var{num}
@opindex -B
@opindex --before-context
@cindex before context
@cindex context lines, before match
Print @var{num} lines of leading context before matching lines.

@item -C @var{num}
@itemx -@var{num}
@itemx --context=@var{num}
@opindex -C
@opindex --context
@opindex -@var{num}
@cindex context
Print @var{num} lines of leading and trailing output context.

@item --group-separator=@var{string}
@opindex --group-separator
@cindex group separator
When @option{-A}, @option{-B} or @option{-C} are in use,
print @var{string} instead of @option{--} between groups of lines.

@item --no-group-separator
@opindex --no-group-separator
@cindex group separator
When @option{-A}, @option{-B} or @option{-C} are in use,
do not print a separator between groups of lines.

@end table

Here are some points about how @command{grep} chooses
the separator to print between prefix fields and line content:

@itemize @bullet
@item
Matching lines normally use @samp{:} as a separator
between prefix fields and actual line content.

@item
Context (i.e., non-matching) lines use @samp{-} instead.

@item
When context is not specified,
matching lines are simply output one right after another.

@item
When context is specified,
lines that are adjacent in the input form a group
and are output one right after another, while
by default a separator appears between non-adjacent groups.

@item
The default separator
is a @samp{--} line; its presence and appearance
can be changed with the options above.

@item
Each group may contain
several matching lines when they are close enough to each other
that two adjacent groups connect and can merge into a single
contiguous one.
@end itemize
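
The separator rules listed above can be observed with a sketch like
this one.  It assumes GNU @command{grep}; the eight-line file
@file{ctx.txt} is invented so that the two matches are far enough apart
to form two separate groups.

@example
```shell
# Hypothetical sample data: matches on lines 2 and 7.
tmp=$(mktemp -d)
printf '%s\n' 'a' 'hit 1' 'b' 'c' 'd' 'e' 'hit 2' 'f' > "$tmp/ctx.txt"

# One line of trailing context; -n adds a line-number prefix field.
# Matching lines use ':' after the prefix, context lines use '-',
# and a '--' line separates the two non-adjacent groups.
with_sep=$(grep -n -A 1 'hit' "$tmp/ctx.txt")

# The same search with the separator between groups suppressed.
without_sep=$(grep -n -A 1 --no-group-separator 'hit' "$tmp/ctx.txt")

printf '%s\n' "$with_sep"
rm -r "$tmp"
```
@end example

With this input the first command is expected to print @samp{2:hit 1},
@samp{3-b}, @samp{--}, @samp{7:hit 2}, and @samp{8-f}, one per line.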

@node File and Directory Selection
@subsection File and Directory Selection

@table @option

@item -a
@itemx --text
@opindex -a
@opindex --text
@cindex suppress binary data
@cindex binary files
Process a binary file as if it were text;
this is equivalent to the @samp{--binary-files=text} option.

@item --binary-files=@var{type}
@opindex --binary-files
@cindex binary files
If a file's allocation metadata,
or if its data read before a line is selected for output,
indicate that the file contains binary data,
assume that the file is of type @var{type}.
Non-text bytes indicate binary data; these are either data bytes
improperly encoded for the current locale, or null bytes when the
@option{-z} (@option{--null-data}) option is not given (@pxref{Other
Options}).

By default, @var{type} is @samp{binary},
and @command{grep} normally outputs either
a one-line message saying that a binary file matches,
or no message if there is no match.
When processing binary data, @command{grep} may treat non-text bytes
as line terminators; for example, the pattern @samp{.} (period) might
not match a null byte, as the null byte might be treated as a line
terminator even without the @option{-z} (@option{--null-data}) option.

If @var{type} is @samp{without-match},
@command{grep} assumes that a binary file does not match;
this is equivalent to the @option{-I} option.

If @var{type} is @samp{text},
@command{grep} processes a binary file as if it were text;
this is equivalent to the @option{-a} option.

@emph{Warning:} @samp{--binary-files=text} might output binary garbage,
which can have nasty side effects
if the output is a terminal and
if the terminal driver interprets some of it as commands.

@item -D @var{action}
@itemx --devices=@var{action}
@opindex -D
@opindex --devices
@cindex device search
If an input file is a device, FIFO, or socket, use @var{action} to process it.
If @var{action} is @samp{read},
all devices are read just as if they were ordinary files.
If @var{action} is @samp{skip},
devices, FIFOs, and sockets are silently skipped.
By default, devices are read if they are on the command line or if the
@option{-R} (@option{--dereference-recursive}) option is used, and are
skipped if they are encountered recursively and the @option{-r}
(@option{--recursive}) option is used.
This option has no effect on a file that is read via standard input.

@item -d @var{action}
@itemx --directories=@var{action}
@opindex -d
@opindex --directories
@cindex directory search
@cindex symbolic links
If an input file is a directory, use @var{action} to process it.
By default, @var{action} is @samp{read},
which means that directories are read just as if they were ordinary files
(some operating systems and file systems disallow this,
and will cause @command{grep}
to print error messages for every directory or silently skip them).
If @var{action} is @samp{skip}, directories are silently skipped.
If @var{action} is @samp{recurse},
@command{grep} reads all files under each directory, recursively,
following command-line symbolic links and skipping other symlinks;
this is equivalent to the @option{-r} option.

@item --exclude=@var{glob}
@opindex --exclude
@cindex exclude files
@cindex searching directory trees
Skip files whose name matches the pattern @var{glob}, using wildcard
matching.  When searching recursively, skip any subfile whose base
name matches @var{glob}; the base name is the part after the last
@samp{/}.  A pattern can use
@samp{*}, @samp{?}, and @samp{[}...@samp{]} as wildcards,
and @code{\} to quote a wildcard or backslash character literally.

@item --exclude-from=@var{file}
@opindex --exclude-from
@cindex exclude files
@cindex searching directory trees
Skip files whose name matches any of the patterns
read from @var{file} (using wildcard matching as described
under @option{--exclude}).

@item --exclude-dir=@var{glob}
@opindex --exclude-dir
@cindex exclude directories
Skip any directory whose name matches the pattern @var{glob}.  When
searching recursively, skip any subdirectory whose base name matches
@var{glob}.  Ignore any redundant trailing slashes in @var{glob}.

@item -I
@opindex -I
Process a binary file as if it did not contain matching data;
this is equivalent to the @samp{--binary-files=without-match} option.

@item --include=@var{glob}
@opindex --include
@cindex include files
@cindex searching directory trees
Search only files whose name matches @var{glob},
using wildcard matching as described under @option{--exclude}.

@item -r
@itemx --recursive
@opindex -r
@opindex --recursive
@cindex recursive search
@cindex searching directory trees
@cindex symbolic links
For each directory operand,
read and process all files in that directory, recursively.
Follow symbolic links on the command line, but skip symlinks
that are encountered recursively.
Note that if no file operand is given,
@command{grep} searches the working directory.
This is the same as the @samp{--directories=recurse} option.

@item -R
@itemx --dereference-recursive
@opindex -R
@opindex --dereference-recursive
@cindex recursive search
@cindex searching directory trees
@cindex symbolic links
For each directory operand, read and process all files in that
directory, recursively, following all symbolic links.

@end table
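
As an illustration of how recursive search combines with the wildcard
filters above, consider the following sketch.  It assumes GNU
@command{grep}; the directory layout (@file{src}, @file{build}) and the
file names are hypothetical.

@example
```shell
# Hypothetical tree: two source files and one generated copy.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/build"
printf '%s\n' 'needle' > "$tmp/src/main.c"
printf '%s\n' 'needle' > "$tmp/src/notes.txt"
printf '%s\n' 'needle' > "$tmp/build/main.c"

# --include: search only files whose base name matches the glob.
only_c=$(cd "$tmp" && grep -r -l --include='*.c' 'needle' . | sort)

# --exclude-dir: skip any subdirectory whose base name matches the glob.
no_build=$(cd "$tmp" && grep -r -l --exclude-dir=build 'needle' . | sort)

printf '%s\n' "$only_c" "$no_build"
rm -r "$tmp"
```
@end example

The globs are quoted so that the shell passes them to @command{grep}
unexpanded.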

@node Other Options
@subsection Other Options

@table @option

@item --line-buffered
@opindex --line-buffered
@cindex line buffering
Use line buffering on output.
This can cause a performance penalty.

@item -U
@itemx --binary
@opindex -U
@opindex --binary
@cindex MS-DOS/MS-Windows binary files
@cindex binary files, MS-DOS/MS-Windows
Treat the file(s) as binary.
By default, under MS-DOS and MS-Windows,
@command{grep} guesses whether a file is text or binary
as described for the @option{--binary-files} option.
If @command{grep} decides the file is a text file,
it strips carriage returns from the original file contents
(to make regular expressions with @code{^} and @code{$} work correctly).
Specifying @option{-U} overrules this guesswork,
causing all files to be read and passed to the matching mechanism verbatim;
if the file is a text file with @code{CR/LF} pairs at the end of each line,
this will cause some regular expressions to fail.
This option has no effect
on platforms other than MS-DOS and MS-Windows.

@item -z
@itemx --null-data
@opindex -z
@opindex --null-data
@cindex zero-terminated lines
Treat the input as a set of lines, each terminated by a zero byte (the
ASCII NUL character) instead of a newline.
Like the @option{-Z} or @option{--null} option,
this option can be used with commands like
@samp{sort -z} to process arbitrary file names.

@end table
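
One possible use of @option{-z} is to filter a NUL-terminated list of
file names.  The sketch below assumes GNU @command{grep} and a
@command{find} that supports @samp{-print0}; the file names are made
up, and @command{tr} is used only to make the result printable.

@example
```shell
# Hypothetical directory with one log file and one text file.
tmp=$(mktemp -d)
touch "$tmp/keep.log" "$tmp/skip.txt"

# Each NUL-terminated name is one input line for grep -z, so '$'
# anchors at the end of a file name rather than at a newline.
logs=$(cd "$tmp" && find . -type f -print0 | grep -z '\.log$' | tr '\0' '\n')

printf '%s\n' "$logs"
rm -r "$tmp"
```
@end example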

@node Environment Variables
@section Environment Variables

The behavior of @command{grep} is affected
by the following environment variables.

@vindex LANGUAGE @r{environment variable}
@vindex LC_ALL @r{environment variable}
@vindex LC_MESSAGES @r{environment variable}
@vindex LANG @r{environment variable}
The locale for category @w{@code{LC_@var{foo}}}
is specified by examining the three environment variables
@env{LC_ALL}, @w{@env{LC_@var{foo}}}, and @env{LANG},
in that order.
The first of these variables that is set specifies the locale.
For example, if @env{LC_ALL} is not set,
but @env{LC_COLLATE} is set to @samp{pt_BR},
then the Brazilian Portuguese locale is used
for the @env{LC_COLLATE} category.
As a special case for @env{LC_MESSAGES} only, the environment variable
@env{LANGUAGE} can contain a colon-separated list of languages that
overrides the three environment variables that ordinarily specify
the @env{LC_MESSAGES} category.
The @samp{C} locale is used if none of these environment variables are set,
if the locale catalog is not installed,
or if @command{grep} was not compiled
with national language support (NLS).

Many of the environment variables in the following list let you
control highlighting using
Select Graphic Rendition (SGR)
commands interpreted by the terminal or terminal emulator.
(See the Select Graphic Rendition (SGR) section
in the documentation of your text terminal
for permitted values and their meanings as character attributes.)
These substring values are integers in decimal representation
and can be concatenated with semicolons.
@command{grep} takes care of assembling the result
into a complete SGR sequence (@samp{\33[}...@samp{m}).
Common values to concatenate include
@samp{1} for bold,
@samp{4} for underline,
@samp{5} for blink,
@samp{7} for inverse,
@samp{39} for default foreground color,
@samp{30} to @samp{37} for foreground colors,
@samp{90} to @samp{97} for 16-color mode foreground colors,
@samp{38;5;0} to @samp{38;5;255}
for foreground colors in 88-color and 256-color modes,
@samp{49} for default background color,
@samp{40} to @samp{47} for background colors,
@samp{100} to @samp{107} for 16-color mode background colors,
and @samp{48;5;0} to @samp{48;5;255}
for background colors in 88-color and 256-color modes.
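
To see how an SGR substring ends up in the output, one can capture the
colored output and inspect it, as in the following sketch.  It assumes
GNU @command{grep} with support for @env{GREP_COLORS} and
@samp{--color=always}; the value @samp{ms=01;32} (bold green) is just
an example choice.

@example
```shell
# Force color even though the output is captured, and ask for bold
# green (SGR values 01 and 32) for matched text in selected lines.
colored=$(printf '%s\n' 'abc' | GREP_COLORS='ms=01;32' grep --color=always 'b')

# Remove the escape sequences again to recover the plain line.
esc=$(printf '\033')
plain=$(printf '%s\n' "$colored" | sed "s/$esc\[[0-9;]*[mK]//g")

printf '%s\n' "$plain"
```
@end example

The captured text contains the SGR sequence that @command{grep}
assembled around the match, so it should normally be sent only to a
terminal or to a pager that understands such sequences.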
822
823The two-letter names used in the @env{GREP_COLORS} environment variable
824(and some of the others) refer to terminal ``capabilities,'' the ability
825of a terminal to highlight text, or change its color, and so on.
826These capabilities are stored in an online database and accessed by
827the @code{terminfo} library.
828
829@cindex environment variables
830
831@table @env
832
833@item GREP_OPTIONS
834@vindex GREP_OPTIONS @r{environment variable}
835@cindex default options environment variable
836This variable specifies default options to be placed in front of any
837explicit options.
838As this causes problems when writing portable scripts, this feature
839will be removed in a future release of @command{grep}, and @command{grep}
840warns if it is used.  Please use an alias or script instead.
841For example, if @command{grep} is in the directory @samp{/usr/bin} you
842can prepend @file{$HOME/bin} to your @env{PATH} and create an
843executable script @file{$HOME/bin/grep} containing the following:
844
845@example
846#! /bin/sh
847export PATH=/usr/bin
848exec grep --color=auto --devices=skip "$@@"
849@end example
850
851@item GREP_COLOR
852@vindex GREP_COLOR @r{environment variable}
853@cindex highlight markers
854This variable specifies the color used to highlight matched (non-empty) text.
855It is deprecated in favor of @env{GREP_COLORS}, but still supported.
856The @samp{mt}, @samp{ms}, and @samp{mc} capabilities of @env{GREP_COLORS}
857have priority over it.
858It can only specify the color used to highlight
859the matching non-empty text in any matching line
860(a selected line when the @option{-v} command-line option is omitted,
861or a context line when @option{-v} is specified).
862The default is @samp{01;31},
863which means a bold red foreground text on the terminal's default background.
864
865@item GREP_COLORS
866@vindex GREP_COLORS @r{environment variable}
867@cindex highlight markers
868This variable specifies the colors and other attributes
869used to highlight various parts of the output.
870Its value is a colon-separated list of @code{terminfo} capabilities
871that defaults to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
872with the @samp{rv} and @samp{ne} boolean capabilities omitted (i.e., false).
873Supported capabilities are as follows.
874
875@table @code
876@item sl=
877@vindex sl GREP_COLORS @r{capability}
878SGR substring for whole selected lines
879(i.e.,
880matching lines when the @option{-v} command-line option is omitted,
881or non-matching lines when @option{-v} is specified).
882If however the boolean @samp{rv} capability
883and the @option{-v} command-line option are both specified,
884it applies to context matching lines instead.
885The default is empty (i.e., the terminal's default color pair).
886
887@item cx=
888@vindex cx GREP_COLORS @r{capability}
889SGR substring for whole context lines
890(i.e.,
891non-matching lines when the @option{-v} command-line option is omitted,
892or matching lines when @option{-v} is specified).
893If however the boolean @samp{rv} capability
894and the @option{-v} command-line option are both specified,
895it applies to selected non-matching lines instead.
896The default is empty (i.e., the terminal's default color pair).
897
898@item rv
899@vindex rv GREP_COLORS @r{capability}
900Boolean value that reverses (swaps) the meanings of
901the @samp{sl=} and @samp{cx=} capabilities
902when the @option{-v} command-line option is specified.
903The default is false (i.e., the capability is omitted).
904
905@item mt=01;31
906@vindex mt GREP_COLORS @r{capability}
907SGR substring for matching non-empty text in any matching line
908(i.e.,
909a selected line when the @option{-v} command-line option is omitted,
910or a context line when @option{-v} is specified).
911Setting this is equivalent to setting both @samp{ms=} and @samp{mc=}
912at once to the same value.
913The default is a bold red text foreground over the current line background.
914
915@item ms=01;31
916@vindex ms GREP_COLORS @r{capability}
917SGR substring for matching non-empty text in a selected line.
918(This is used only when the @option{-v} command-line option is omitted.)
919The effect of the @samp{sl=} (or @samp{cx=} if @samp{rv}) capability
920remains active when this takes effect.
921The default is a bold red text foreground over the current line background.
922
923@item mc=01;31
924@vindex mc GREP_COLORS @r{capability}
925SGR substring for matching non-empty text in a context line.
926(This is used only when the @option{-v} command-line option is specified.)
927The effect of the @samp{cx=} (or @samp{sl=} if @samp{rv}) capability
928remains active when this takes effect.
929The default is a bold red text foreground over the current line background.
930
931@item fn=35
932@vindex fn GREP_COLORS @r{capability}
933SGR substring for file names prefixing any content line.
934The default is a magenta text foreground over the terminal's default background.
935
936@item ln=32
937@vindex ln GREP_COLORS @r{capability}
938SGR substring for line numbers prefixing any content line.
939The default is a green text foreground over the terminal's default background.
940
941@item bn=32
942@vindex bn GREP_COLORS @r{capability}
943SGR substring for byte offsets prefixing any content line.
944The default is a green text foreground over the terminal's default background.
945
946@item se=36
@vindex se GREP_COLORS @r{capability}
948SGR substring for separators that are inserted
949between selected line fields (@samp{:}),
950between context line fields (@samp{-}),
951and between groups of adjacent lines
952when nonzero context is specified (@samp{--}).
953The default is a cyan text foreground over the terminal's default background.
954
955@item ne
956@vindex ne GREP_COLORS @r{capability}
957Boolean value that prevents clearing to the end of line
958using Erase in Line (EL) to Right (@samp{\33[K})
959each time a colorized item ends.
960This is needed on terminals on which EL is not supported.
961It is otherwise useful on terminals
962for which the @code{back_color_erase}
963(@code{bce}) boolean @code{terminfo} capability does not apply,
964when the chosen highlight colors do not affect the background,
965or when EL is too slow or causes too much flicker.
966The default is false (i.e., the capability is omitted).
967@end table
968
Note that boolean capabilities have no @samp{=@dots{}} part.
970They are omitted (i.e., false) by default and become true when specified.
971
972
973@item LC_ALL
974@itemx LC_COLLATE
975@itemx LANG
976@vindex LC_ALL @r{environment variable}
977@vindex LC_COLLATE @r{environment variable}
978@vindex LANG @r{environment variable}
979@cindex character type
980@cindex national language support
981@cindex NLS
982These variables specify the locale for the @env{LC_COLLATE} category,
983which might affect how range expressions like @samp{[a-z]} are
984interpreted.
985
986@item LC_ALL
987@itemx LC_CTYPE
988@itemx LANG
989@vindex LC_ALL @r{environment variable}
990@vindex LC_CTYPE @r{environment variable}
991@vindex LANG @r{environment variable}
992These variables specify the locale for the @env{LC_CTYPE} category,
993which determines the type of characters,
994e.g., which characters are whitespace.
995
996@item LANGUAGE
997@itemx LC_ALL
998@itemx LC_MESSAGES
999@itemx LANG
1000@vindex LANGUAGE @r{environment variable}
1001@vindex LC_ALL @r{environment variable}
1002@vindex LC_MESSAGES @r{environment variable}
1003@vindex LANG @r{environment variable}
1004@cindex language of messages
1005@cindex message language
1006@cindex national language support
1007@cindex translation of message language
1008These variables specify the locale for the @env{LC_MESSAGES} category,
1009which determines the language that @command{grep} uses for messages.
1010The default @samp{C} locale uses American English messages.
1011
1012@item POSIXLY_CORRECT
1013@vindex POSIXLY_CORRECT @r{environment variable}
1014If set, @command{grep} behaves as POSIX requires; otherwise,
1015@command{grep} behaves more like other GNU programs.
1016POSIX
1017requires that options that
1018follow file names must be treated as file names;
1019by default,
1020such options are permuted to the front of the operand list
1021and are treated as options.
1022Also, @env{POSIXLY_CORRECT} disables special handling of an
1023invalid bracket expression.  @xref{invalid-bracket-expr}.
1024
1025@item _@var{N}_GNU_nonoption_argv_flags_
1026@vindex _@var{N}_GNU_nonoption_argv_flags_ @r{environment variable}
1027(Here @code{@var{N}} is @command{grep}'s numeric process ID.)
1028If the @var{i}th character of this environment variable's value is @samp{1},
1029do not consider the @var{i}th operand of @command{grep} to be an option,
1030even if it appears to be one.
1031A shell can put this variable in the environment for each command it runs,
1032specifying which operands are the results of file name wildcard expansion
1033and therefore should not be treated as options.
1034This behavior is available only with the GNU C library,
1035and only when @env{POSIXLY_CORRECT} is not set.
1036
1037@end table
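
As a rough illustration of the @env{GREP_COLORS} syntax described above,
the following sketch overrides only the @samp{ms=} capability, so that
matching text in selected lines is shown in bold green rather than the
default bold red.
The sample input is invented for illustration, and
@option{--color=always} is assumed so that the escape sequences are
emitted even when the output is not a terminal.

@example
# Highlight the match in bold green (SGR 01;32) instead of bold red.
printf 'hello world\n' | GREP_COLORS='ms=01;32' grep --color=always 'world'
@end example

@noindent
Capabilities that are not mentioned in the value, such as @samp{fn=}
or @samp{ln=}, keep their defaults.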
1038
1039
1040@node Exit Status
1041@section Exit Status
1042@cindex exit status
1043@cindex return status
1044
1045Normally the exit status is 0 if a line is selected, 1 if no lines
1046were selected, and 2 if an error occurred.  However, if the
1047@option{-q} or @option{--quiet} or @option{--silent} option is used
1048and a line is selected, the exit status is 0 even if an error
1049occurred.  Other @command{grep} implementations may exit with status
1050greater than 2 on error.
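
The three normal exit statuses can be observed from a shell as in the
following sketch.
The sample input is invented, and the path @file{/nonexistent/file} is
assumed not to exist on the system.

@example
printf 'alpha\nbeta\n' | grep -q 'beta';  echo $?     # prints 0
printf 'alpha\nbeta\n' | grep -q 'gamma'; echo $?     # prints 1
grep 'beta' /nonexistent/file 2>/dev/null; echo $?    # prints 2
@end example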
1051
1052@node grep Programs
1053@section @command{grep} Programs
1054@cindex @command{grep} programs
1055@cindex variants of @command{grep}
1056
1057@command{grep} searches the named input files
1058for lines containing a match to the given pattern.
1059By default, @command{grep} prints the matching lines.
1060A file named @file{-} stands for standard input.
1061If no input is specified, @command{grep} searches the working
1062directory @file{.} if given a command-line option specifying
1063recursion; otherwise, @command{grep} searches standard input.
1064There are four major variants of @command{grep},
1065controlled by the following options.
1066
1067@table @option
1068
1069@item -G
1070@itemx --basic-regexp
1071@opindex -G
1072@opindex --basic-regexp
1073@cindex matching basic regular expressions
1074Interpret the pattern as a basic regular expression (BRE).
1075This is the default.
1076
1077@item -E
1078@itemx --extended-regexp
1079@opindex -E
1080@opindex --extended-regexp
1081@cindex matching extended regular expressions
1082Interpret the pattern as an extended regular expression (ERE).
1083(@option{-E} is specified by POSIX.)
1084
1085@item -F
1086@itemx --fixed-strings
1087@opindex -F
1088@opindex --fixed-strings
1089@cindex matching fixed strings
1090Interpret the pattern as a list of fixed strings, separated
1091by newlines, any of which is to be matched.
1092(@option{-F} is specified by POSIX.)
1093
1094@item -P
1095@itemx --perl-regexp
1096@opindex -P
1097@opindex --perl-regexp
1098@cindex matching Perl regular expressions
1099Interpret the pattern as a Perl regular expression.
1100This is highly experimental and
1101@samp{grep@ -P} may warn of unimplemented features.
1102
1103@end table
1104
1105In addition,
1106two variant programs @command{egrep} and @command{fgrep} are available.
1107@command{egrep} is the same as @samp{grep@ -E}.
1108@command{fgrep} is the same as @samp{grep@ -F}.
1109Direct invocation as either
1110@command{egrep} or @command{fgrep} is deprecated,
1111but is provided to allow historical applications
1112that rely on them to run unmodified.
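
As an informal comparison of the first three variants, the following
sketch feeds the same pattern @samp{a+b} to each of them.
The two input lines are invented for illustration.

@example
# BRE: + is an ordinary character, so only the literal text matches.
printf 'aab\na+b\n' | grep -G 'a+b'    # prints a+b
# ERE: + means "one or more of the preceding item".
printf 'aab\na+b\n' | grep -E 'a+b'    # prints aab
# Fixed strings: no character is special.
printf 'aab\na+b\n' | grep -F 'a+b'    # prints a+b
@end example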
1113
1114
1115@node Regular Expressions
1116@chapter Regular Expressions
1117@cindex regular expressions
1118
1119A @dfn{regular expression} is a pattern that describes a set of strings.
1120Regular expressions are constructed analogously to arithmetic expressions,
1121by using various operators to combine smaller expressions.
1122@command{grep} understands
1123three different versions of regular expression syntax:
``basic'' (BRE), ``extended'' (ERE), and ``perl''.
1125In GNU @command{grep},
1126there is no difference in available functionality between the basic and
1127extended syntaxes.
1128In other implementations, basic regular expressions are less powerful.
1129The following description applies to extended regular expressions;
1130differences for basic regular expressions are summarized afterwards.
1131Perl regular expressions give additional functionality, and are
1132documented in the @i{pcresyntax}(3) and @i{pcrepattern}(3) manual pages,
1133but may not be available on every system.
1134
1135@menu
1136* Fundamental Structure::
1137* Character Classes and Bracket Expressions::
1138* The Backslash Character and Special Expressions::
1139* Anchoring::
1140* Back-references and Subexpressions::
1141* Basic vs Extended::
1142@end menu
1143
1144@node Fundamental Structure
1145@section Fundamental Structure
1146
1147The fundamental building blocks are the regular expressions that match
1148a single character.
1149Most characters, including all letters and digits,
1150are regular expressions that match themselves.
1151Any meta-character
1152with special meaning may be quoted by preceding it with a backslash.
1153
The following table describes the period, which matches any single
character, and the repetition operators that may follow a regular
expression:
1156
1157@table @samp
1158
1159@item .
1160@opindex .
1161@cindex dot
1162@cindex period
1163The period @samp{.} matches any single character.
1164
1165@item ?
1166@opindex ?
1167@cindex question mark
1168@cindex match expression at most once
1169The preceding item is optional and will be matched at most once.
1170
1171@item *
1172@opindex *
1173@cindex asterisk
1174@cindex match expression zero or more times
1175The preceding item will be matched zero or more times.
1176
1177@item +
1178@opindex +
1179@cindex plus sign
1180@cindex match expression one or more times
1181The preceding item will be matched one or more times.
1182
1183@item @{@var{n}@}
1184@opindex @{@var{n}@}
1185@cindex braces, one argument
1186@cindex match expression @var{n} times
1187The preceding item is matched exactly @var{n} times.
1188
1189@item @{@var{n},@}
1190@opindex @{@var{n},@}
1191@cindex braces, second argument omitted
1192@cindex match expression @var{n} or more times
1193The preceding item is matched @var{n} or more times.
1194
1195@item @{,@var{m}@}
1196@opindex @{,@var{m}@}
1197@cindex braces, first argument omitted
1198@cindex match expression at most @var{m} times
1199The preceding item is matched at most @var{m} times.
1200This is a GNU extension.
1201
1202@item @{@var{n},@var{m}@}
1203@opindex @{@var{n},@var{m}@}
1204@cindex braces, two arguments
1205@cindex match expression from @var{n} to @var{m} times
1206The preceding item is matched at least @var{n} times, but not more than
1207@var{m} times.
1208
1209@end table
1210
1211The empty regular expression matches the empty string.
1212Two regular expressions may be concatenated;
1213the resulting regular expression
1214matches any string formed by concatenating two substrings
1215that respectively match the concatenated expressions.
1216
1217Two regular expressions may be joined by the infix operator @samp{|};
1218the resulting regular expression
1219matches any string matching either alternate expression.
1220
1221Repetition takes precedence over concatenation,
1222which in turn takes precedence over alternation.
1223A whole expression may be enclosed in parentheses
1224to override these precedence rules and form a subexpression.
1225An unmatched @samp{)} matches just itself.
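
The precedence rules can be checked with a few small commands.
This is only a sketch: the input lines are invented, and
@option{-x} is used so that each pattern has to match a whole line.

@example
# Repetition binds more tightly than concatenation.
printf 'ab\nabb\nabab\n' | grep -E -x 'ab@{2@}'        # prints abb
printf 'ab\nabb\nabab\n' | grep -E -x '(ab)@{2@}'      # prints abab
# Concatenation binds more tightly than alternation.
printf 'ab\ncd\nacd\nabd\n' | grep -E -x 'ab|cd'      # prints ab and cd
printf 'ab\ncd\nacd\nabd\n' | grep -E -x 'a(b|c)d'    # prints acd and abd
@end example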
1226
1227@node Character Classes and Bracket Expressions
1228@section Character Classes and Bracket Expressions
1229
1230@cindex bracket expression
1231@cindex character class
1232A @dfn{bracket expression} is a list of characters enclosed by @samp{[} and
1233@samp{]}.
1234It matches any single character in that list;
1235if the first character of the list is the caret @samp{^},
1236then it matches any character @strong{not} in the list.
1237For example, the regular expression
1238@samp{[0123456789]} matches any single digit.
1239
1240@cindex range expression
1241Within a bracket expression, a @dfn{range expression} consists of two
1242characters separated by a hyphen.
1243It matches any single character that
1244sorts between the two characters, inclusive.
1245In the default C locale, the sorting sequence is the native character
1246order; for example, @samp{[a-d]} is equivalent to @samp{[abcd]}.
1247In other locales, the sorting sequence is not specified, and
1248@samp{[a-d]} might be equivalent to @samp{[abcd]} or to
1249@samp{[aBbCcDd]}, or it might fail to match any character, or the set of
1250characters that it matches might even be erratic.
1251To obtain the traditional interpretation
1252of bracket expressions, you can use the @samp{C} locale by setting the
1253@env{LC_ALL} environment variable to the value @samp{C}.
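
For instance, the following sketch forces the @samp{C} locale for a
single command, so that @samp{[a-d]} has its traditional meaning.
The input lines are invented for illustration.

@example
# Only the lower-case letter in the range a..d is selected.
printf 'b\nB\ne\n' | LC_ALL=C grep '[a-d]'    # prints b
@end example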
1254
1255Finally, certain named classes of characters are predefined within
1256bracket expressions, as follows.
1257Their interpretation depends on the @env{LC_CTYPE} locale;
1258for example, @samp{[[:alnum:]]} means the character class of numbers and letters
1259in the current locale.
1260
1261@cindex classes of characters
1262@cindex character classes
1263@table @samp
1264
1265@item [:alnum:]
1266@opindex alnum @r{character class}
1267@cindex alphanumeric characters
1268Alphanumeric characters:
1269@samp{[:alpha:]} and @samp{[:digit:]}; in the @samp{C} locale and ASCII
1270character encoding, this is the same as @samp{[0-9A-Za-z]}.
1271
1272@item [:alpha:]
1273@opindex alpha @r{character class}
1274@cindex alphabetic characters
1275Alphabetic characters:
1276@samp{[:lower:]} and @samp{[:upper:]}; in the @samp{C} locale and ASCII
1277character encoding, this is the same as @samp{[A-Za-z]}.
1278
1279@item [:blank:]
1280@opindex blank @r{character class}
1281@cindex blank characters
1282Blank characters:
1283space and tab.
1284
1285@item [:cntrl:]
1286@opindex cntrl @r{character class}
1287@cindex control characters
1288Control characters.
1289In ASCII, these characters have octal codes 000
1290through 037, and 177 (DEL).
1291In other character sets, these are
1292the equivalent characters, if any.
1293
1294@item [:digit:]
1295@opindex digit @r{character class}
1296@cindex digit characters
1297@cindex numeric characters
1298Digits: @code{0 1 2 3 4 5 6 7 8 9}.
1299
1300@item [:graph:]
1301@opindex graph @r{character class}
1302@cindex graphic characters
1303Graphical characters:
1304@samp{[:alnum:]} and @samp{[:punct:]}.
1305
1306@item [:lower:]
1307@opindex lower @r{character class}
1308@cindex lower-case letters
1309Lower-case letters; in the @samp{C} locale and ASCII character
1310encoding, this is
1311@code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
1312
1313@item [:print:]
1314@opindex print @r{character class}
1315@cindex printable characters
1316Printable characters:
1317@samp{[:alnum:]}, @samp{[:punct:]}, and space.
1318
1319@item [:punct:]
1320@opindex punct @r{character class}
1321@cindex punctuation characters
1322Punctuation characters; in the @samp{C} locale and ASCII character
1323encoding, this is
1324@code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
1325
1326@item [:space:]
1327@opindex space @r{character class}
1328@cindex space characters
1329@cindex whitespace characters
1330Space characters: in the @samp{C} locale, this is
1331tab, newline, vertical tab, form feed, carriage return, and space.
1332@xref{Usage}, for more discussion of matching newlines.
1333
1334@item [:upper:]
1335@opindex upper @r{character class}
1336@cindex upper-case letters
Upper-case letters; in the @samp{C} locale and ASCII character
1338encoding, this is
1339@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
1340
1341@item [:xdigit:]
1342@opindex xdigit @r{character class}
1343@cindex xdigit class
1344@cindex hexadecimal digits
1345Hexadecimal digits:
1346@code{0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f}.
1347
1348@end table
1349Note that the brackets in these class names are
1350part of the symbolic names, and must be included in addition to
1351the brackets delimiting the bracket expression.
1352
1353@anchor{invalid-bracket-expr}
If you mistakenly omit the outer brackets and search for, say, @samp{[:upper:]},
1355GNU @command{grep} prints a diagnostic and exits with status 2, on
1356the assumption that you did not intend to search for the nominally
1357equivalent regular expression: @samp{[:epru]}.
1358Set the @env{POSIXLY_CORRECT} environment variable to disable this feature.
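
To illustrate the correct form, the following sketch places the class
names inside the brackets of a bracket expression.
The input lines are invented for illustration.

@example
# One class inside a bracket expression.
printf 'abc\nABC\n123\n' | grep '[[:upper:]]'            # prints ABC
# Two classes combined in the same bracket expression.
printf 'abc\nABC\n123\n' | grep '[[:digit:][:upper:]]'   # prints ABC and 123
@end example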
1359
1360Most meta-characters lose their special meaning inside bracket expressions.
1361
1362@table @samp
1363@item ]
1364ends the bracket expression if it's not the first list item.
1365So, if you want to make the @samp{]} character a list item,
1366you must put it first.
1367
1368@item [.
1369represents the open collating symbol.
1370
1371@item .]
1372represents the close collating symbol.
1373
1374@item [=
1375represents the open equivalence class.
1376
1377@item =]
1378represents the close equivalence class.
1379
1380@item [:
1381represents the open character class symbol, and should be followed by a
1382valid character class name.
1383
1384@item :]
1385represents the close character class symbol.
1386
1387@item -
1388represents the range if it's not first or last in a list or the ending point
1389of a range.
1390
1391@item ^
1392represents the characters not in the list.
1393If you want to make the @samp{^}
1394character a list item, place it anywhere but first.
1395
1396@end table
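
As an illustration of the placement rules in the table above, the
following sketch builds one bracket expression that contains
@samp{]}, @samp{^}, and @samp{-} as ordinary list items:
@samp{]} comes first, @samp{^} is not first, and @samp{-} comes last.
The input lines are invented for illustration.

@example
printf 'a]b\na^b\na-b\naxb\n' | grep 'a[]^-]b'    # prints a]b, a^b and a-b
@end example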
1397
1398@node The Backslash Character and Special Expressions
1399@section The Backslash Character and Special Expressions
1400@cindex backslash
1401
1402The @samp{\} character,
1403when followed by certain ordinary characters,
1404takes a special meaning:
1405
1406@table @samp
1407
1408@item \b
1409Match the empty string at the edge of a word.
1410
1411@item \B
1412Match the empty string provided it's not at the edge of a word.
1413
1414@item \<
Match the empty string at the beginning of a word.

@item \>
Match the empty string at the end of a word.
1419
1420@item \w
Match a word constituent; it is a synonym for @samp{[_[:alnum:]]}.

@item \W
Match a non-word constituent; it is a synonym for @samp{[^_[:alnum:]]}.

@item \s
Match whitespace; it is a synonym for @samp{[[:space:]]}.

@item \S
Match non-whitespace; it is a synonym for @samp{[^[:space:]]}.
1431
1432@end table
1433
For example, @samp{\brat\b} matches the separate word @samp{rat}, whereas
@samp{\Brat\B} matches @samp{crate} but not @samp{furry rat}.
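
These expressions can be tried out as in the following sketch.
The input lines are invented, and @option{-o} is used in the last
command only to print each matched word on its own line.

@example
printf 'rat\ncrate\nfurry rat\n' | grep '\brat\b'    # prints rat and furry rat
printf 'rat\ncrate\nfurry rat\n' | grep '\Brat\B'    # prints crate
# The underscore counts as a word constituent.
printf 'foo_bar baz\n' | grep -o '\w\+'              # prints foo_bar and baz
@end example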
1436
1437@node Anchoring
1438@section Anchoring
1439@cindex anchoring
1440
1441The caret @samp{^} and the dollar sign @samp{$} are meta-characters that
1442respectively match the empty string at the beginning and end of a line.
They are termed @dfn{anchors}, since they force the match to be ``anchored''
to the beginning or end of a line, respectively.
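
A short sketch of the two anchors, used separately and together,
follows.
The input lines are invented for illustration.

@example
printf 'foo\nfoobar\nbarfoo\n' | grep '^foo'     # prints foo and foobar
printf 'foo\nfoobar\nbarfoo\n' | grep 'foo$'     # prints foo and barfoo
printf 'foo\nfoobar\nbarfoo\n' | grep '^foo$'    # prints foo
@end example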
1445
1446@node Back-references and Subexpressions
1447@section Back-references and Subexpressions
1448@cindex subexpression
1449@cindex back-reference
1450
1451The back-reference @samp{\@var{n}}, where @var{n} is a single digit, matches
1452the substring previously matched by the @var{n}th parenthesized subexpression
1453of the regular expression.
1454For example, @samp{(a)\1} matches @samp{aa}.
1455When used with alternation, if the group does not participate in the match then
1456the back-reference makes the whole match fail.
1457For example, @samp{a(.)|b\1}
1458will not match @samp{ba}.
1459When multiple regular expressions are given with
1460@option{-e} or from a file (@samp{-f @var{file}}),
1461back-references are local to each expression.
1462
1463@node Basic vs Extended
1464@section Basic vs Extended Regular Expressions
1465@cindex basic regular expressions
1466
1467In basic regular expressions the meta-characters @samp{?}, @samp{+},
1468@samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
1469instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
1470@samp{\|}, @samp{\(}, and @samp{\)}.
1471
1472@cindex interval specifications
1473Traditional @command{egrep} did not support the @samp{@{} meta-character,
1474and some @command{egrep} implementations support @samp{\@{} instead, so
1475portable scripts should avoid @samp{@{} in @samp{grep@ -E} patterns and
1476should use @samp{[@{]} to match a literal @samp{@{}.
1477
1478GNU @command{grep@ -E} attempts to support traditional usage by
1479assuming that @samp{@{} is not special if it would be the start of an
1480invalid interval specification.
1481For example, the command
1482@samp{grep@ -E@ '@{1'} searches for the two-character string @samp{@{1}
1483instead of reporting a syntax error in the regular expression.
1484POSIX allows this behavior as an extension, but portable scripts
1485should avoid it.
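
The correspondence between the two syntaxes can be sketched as
follows, assuming GNU @command{grep}.
The input lines are invented, and @option{-x} is used so that each
pattern has to match a whole line.

@example
# The same pattern, written first as a BRE and then as an ERE.
printf 'ab\nabab\n' | grep -x '\(ab\)\@{2\@}'     # prints abab
printf 'ab\nabab\n' | grep -E -x '(ab)@{2@}'      # prints abab
# A bracket expression is the portable way to match a literal brace.
printf 'a@{b\nab\n' | grep -E 'a[@{]b'            # prints a@{b
@end example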
1486
1487
1488@node Usage
1489@chapter Usage
1490
1491@cindex usage, examples
1492Here is an example command that invokes GNU @command{grep}:
1493
1494@example
1495grep -i 'hello.*world' menu.h main.c
1496@end example
1497
1498@noindent
1499This lists all lines in the files @file{menu.h} and @file{main.c} that
1500contain the string @samp{hello} followed by the string @samp{world};
1501this is because @samp{.*} matches zero or more characters within a line.
1502@xref{Regular Expressions}.
1503The @option{-i} option causes @command{grep}
1504to ignore case, causing it to match the line @samp{Hello, world!}, which
1505it would not otherwise match.
1506@xref{Invoking}, for more details about
1507how to invoke @command{grep}.
1508
1509@cindex using @command{grep}, Q&A
1510@cindex FAQ about @command{grep} usage
1511Here are some common questions and answers about @command{grep} usage.
1512
1513@enumerate
1514
1515@item
1516How can I list just the names of matching files?
1517
1518@example
1519grep -l 'main' *.c
1520@end example
1521
1522@noindent
1523lists the names of all C files in the current directory whose contents
1524mention @samp{main}.
1525
1526@item
1527How do I search directories recursively?
1528
1529@example
1530grep -r 'hello' /home/gigi
1531@end example
1532
1533@noindent
1534searches for @samp{hello} in all files
1535under the @file{/home/gigi} directory.
1536For more control over which files are searched,
1537use @command{find}, @command{grep}, and @command{xargs}.
1538For example, the following command searches only C files:
1539
1540@example
1541find /home/gigi -name '*.c' -print0 | xargs -0r grep -H 'hello'
1542@end example
1543
1544This differs from the command:
1545
1546@example
1547grep -H 'hello' *.c
1548@end example
1549
1550which merely looks for @samp{hello} in all files in the current
1551directory whose names end in @samp{.c}.
1552The @samp{find ...} command line above is more similar to the command:
1553
1554@example
1555grep -rH --include='*.c' 'hello' /home/gigi
1556@end example
1557
1558@item
1559What if a pattern has a leading @samp{-}?
1560
1561@example
1562grep -e '--cut here--' *
1563@end example
1564
1565@noindent
1566searches for all lines matching @samp{--cut here--}.
1567Without @option{-e},
1568@command{grep} would attempt to parse @samp{--cut here--} as a list of
1569options.
1570
1571@item
1572Suppose I want to search for a whole word, not a part of a word?
1573
1574@example
1575grep -w 'hello' *
1576@end example
1577
1578@noindent
1579searches only for instances of @samp{hello} that are entire words;
1580it does not match @samp{Othello}.
1581For more control, use @samp{\<} and
1582@samp{\>} to match the start and end of words.
1583For example:
1584
1585@example
1586grep 'hello\>' *
1587@end example
1588
1589@noindent
1590searches only for words ending in @samp{hello}, so it matches the word
1591@samp{Othello}.
1592
1593@item
1594How do I output context around the matching lines?
1595
1596@example
1597grep -C 2 'hello' *
1598@end example
1599
1600@noindent
1601prints two lines of context around each matching line.
1602
1603@item
1604How do I force @command{grep} to print the name of the file?
1605
1606Append @file{/dev/null}:
1607
1608@example
1609grep 'eli' /etc/passwd /dev/null
1610@end example
1611
1612gets you:
1613
1614@example
1615/etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
1616@end example
1617
1618Alternatively, use @option{-H}, which is a GNU extension:
1619
1620@example
1621grep -H 'eli' /etc/passwd
1622@end example
1623
1624@item
1625Why do people use strange regular expressions on @command{ps} output?
1626
1627@example
1628ps -ef | grep '[c]ron'
1629@end example
1630
1631If the pattern had been written without the square brackets, it would
1632have matched not only the @command{ps} output line for @command{cron},
1633but also the @command{ps} output line for @command{grep}.
1634Note that on some platforms,
1635@command{ps} limits the output to the width of the screen;
1636@command{grep} does not have any limit on the length of a line
1637except the available memory.
1638
1639@item
1640Why does @command{grep} report ``Binary file matches''?
1641
1642If @command{grep} listed all matching ``lines'' from a binary file, it
1643would probably generate output that is not useful, and it might even
1644muck up your display.
1645So GNU @command{grep} suppresses output from
1646files that appear to be binary files.
1647To force GNU @command{grep}
1648to output lines even from files that appear to be binary, use the
1649@option{-a} or @samp{--binary-files=text} option.
1650To eliminate the
1651``Binary file matches'' messages, use the @option{-I} or
1652@samp{--binary-files=without-match} option.
1653
1654@item
1655Why doesn't @samp{grep -lv} print non-matching file names?
1656
1657@samp{grep -lv} lists the names of all files containing one or more
1658lines that do not match.
1659To list the names of all files that contain no
1660matching lines, use the @option{-L} or @option{--files-without-match}
1661option.
1662
1663@item
1664I can do ``OR'' with @samp{|}, but what about ``AND''?
1665
1666@example
1667grep 'paul' /etc/motd | grep 'franc,ois'
1668@end example
1669
1670@noindent
1671finds all lines that contain both @samp{paul} and @samp{franc,ois}.
1672
1673@item
1674Why does the empty pattern match every input line?
1675
1676The @command{grep} command searches for lines that contain strings
1677that match a pattern.  Every line contains the empty string, so an
1678empty pattern causes @command{grep} to find a match on each line.  It
1679is not the only such pattern: @samp{^}, @samp{$}, @samp{.*}, and many
1680other patterns cause @command{grep} to match every line.
1681
1682To match empty lines, use the pattern @samp{^$}.  To match blank
1683lines, use the pattern @samp{^[[:blank:]]*$}.  To match no lines at
1684all, use the command @samp{grep -f /dev/null}.
1685
1686@item
1687How can I search in both standard input and in files?
1688
1689Use the special file name @samp{-}:
1690
1691@example
1692cat /etc/passwd | grep 'alain' - /etc/motd
1693@end example
1694
1695@item
1696@cindex palindromes
1697How to express palindromes in a regular expression?
1698
1699It can be done by using back-references;
1700for example,
a palindrome of 5 characters can be written with a BRE:
1702
1703@example
1704grep -w -e '\(.\)\(.\).\2\1' file
1705@end example
1706
1707It matches the word ``radar'' or ``civic.''
1708
1709Guglielmo Bondioni proposed a single RE
1710that finds all palindromes up to 19 characters long
1711using @w{9 subexpressions} and @w{9 back-references}:
1712
1713@smallexample
1714grep -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file
1715@end smallexample
1716
1717Note this is done by using GNU ERE extensions;
1718it might not be portable to other implementations of @command{grep}.
1719
1720@item
1721Why is this back-reference failing?
1722
1723@example
1724echo 'ba' | grep -E '(a)\1|b\1'
1725@end example
1726
1727This gives no output, because the first alternate @samp{(a)\1} does not match,
1728as there is no @samp{aa} in the input, so the @samp{\1} in the second alternate
1729has nothing to refer back to, meaning it will never match anything.
1730(The second alternate in this example can only match
1731if the first alternate has matched---making the second one superfluous.)
1732
1733@item
1734How can I match across lines?
1735
Standard @command{grep} cannot do this, as it is fundamentally line-based.
1737Therefore, merely using the @code{[:space:]} character class does not
1738match newlines in the way you might expect.
1739
1740With the GNU @command{grep} option @option{-z} (@option{--null-data}), each
1741input ``line'' is terminated by a null byte; @pxref{Other Options}.  Thus,
1742you can match newlines in the input, but typically if there is a match
1743the entire input is output, so this usage is often combined with
1744output-suppressing options like @option{-q}, e.g.:
1745
1746@example
1747printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'
1748@end example
1749
1750If this does not suffice, you can transform the input
1751before giving it to @command{grep}, or turn to @command{awk},
1752@command{sed}, @command{perl}, or many other utilities that are
1753designed to operate across lines.
1754
1755@item
1756What do @command{grep}, @command{fgrep}, and @command{egrep} stand for?
1757
1758The name @command{grep} comes from the way line editing was done on Unix.
1759For example,
1760@command{ed} uses the following syntax
1761to print a list of matching lines on the screen:
1762
1763@example
1764global/regular expression/print
1765g/re/p
1766@end example
1767
1768@command{fgrep} stands for Fixed @command{grep};
1769@command{egrep} stands for Extended @command{grep}.
1770
1771@end enumerate
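
The difference between @samp{grep -lv} and @option{-L} described in
the answers above can be seen with two small files.
This is only a sketch: the file names and contents are hypothetical,
and a scratch directory created with @command{mktemp} is assumed to be
acceptable.

@example
dir=$(mktemp -d)
printf 'foo\nbar\n' > "$dir/mixed.txt"    # one matching line, one not
printf 'bar\nbaz\n' > "$dir/nofoo.txt"    # no matching lines at all

# -lv lists files that have at least one non-matching line.
(cd "$dir" && grep -lv 'foo' mixed.txt nofoo.txt)   # prints both names
# -L lists files that have no matching lines.
(cd "$dir" && grep -L 'foo' mixed.txt nofoo.txt)    # prints nofoo.txt

rm -r "$dir"
@end example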
1772
1773
1774@node Reporting Bugs
@chapter Reporting Bugs
1776
1777@cindex bugs, reporting
1778Bug reports can be found at the
1779@url{http://debbugs.gnu.org/cgi/pkgreport.cgi?package=grep,
1780GNU bug report logs for @command{grep}}.
1781If you find a bug not listed there, please email it to
1782@email{bug-grep@@gnu.org} to create a new bug report.
1783
1784@section Known Bugs
1785@cindex Bugs, known
1786
1787Large repetition counts in the @samp{@{n,m@}} construct may cause
1788@command{grep} to use lots of memory.
1789In addition, certain other
1790obscure regular expressions require exponential time and
1791space, and may cause @command{grep} to run out of memory.
1792
1793Back-references are very slow, and may require exponential time.
1794
1795
1796@node Copying
1797@chapter Copying
1798@cindex copying
1799
1800GNU @command{grep} is licensed under the GNU GPL, which makes it @dfn{free
1801software}.
1802
1803The ``free'' in ``free software'' refers to liberty, not price. As
1804some GNU project advocates like to point out, think of ``free speech''
1805rather than ``free beer''.  In short, you have the right (freedom) to
1806run and change @command{grep} and distribute it to other people, and---if you
1807want---charge money for doing either.  The important restriction is
1808that you have to grant your recipients the same rights and impose the
1809same restrictions.
1810
1811This general method of licensing software is sometimes called
1812@dfn{open source}.  The GNU project prefers the term ``free software''
1813for reasons outlined at
1814@url{http://www.gnu.org/philosophy/open-source-misses-the-point.html}.
1815
1816This manual is free documentation in the same sense.  The
1817documentation license is included below.  The license for the program
1818is available with the source code, or at
1819@url{http://www.gnu.org/licenses/gpl.html}.
1820
1821@menu
1822* GNU Free Documentation License::
1823@end menu
1824
1825@node GNU Free Documentation License
1826@section GNU Free Documentation License
1827
1828@include fdl.texi
1829
1830
1831@node Index
1832@unnumbered Index
1833
1834@printindex cp
1835
1836@bye
1837