1\input texinfo  @c -*-texinfo-*-
2@c %**start of header
3@setfilename grep.info
4@include version.texi
5@settitle GNU Grep @value{VERSION}
6
7@c Combine indices.
8@syncodeindex ky cp
9@syncodeindex pg cp
10@syncodeindex tp cp
11@defcodeindex op
12@syncodeindex op cp
13@syncodeindex vr cp
14@c %**end of header
15
16@documentencoding UTF-8
17
18@copying
19This manual is for @command{grep}, a pattern matching engine.
20
21Copyright @copyright{} 1999--2002, 2005, 2008--2014 Free Software Foundation,
22Inc.
23
24@quotation
25Permission is granted to copy, distribute and/or modify this document
26under the terms of the GNU Free Documentation License, Version 1.3 or
27any later version published by the Free Software Foundation; with no
28Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
29Texts.  A copy of the license is included in the section entitled
30``GNU Free Documentation License''.
31@end quotation
32@end copying
33
34@dircategory Text creation and manipulation
35@direntry
36* grep: (grep).                 Print lines matching a pattern.
37@end direntry
38
39@titlepage
40@title GNU Grep: Print lines matching a pattern
41@subtitle version @value{VERSION}, @value{UPDATED}
42@author Alain Magloire et al.
43@page
44@vskip 0pt plus 1filll
45@insertcopying
46@end titlepage
47
48@contents
49
50
51@ifnottex
52@node Top
53@top grep
54
55@command{grep} prints lines that contain a match for a pattern.
56
57This manual is for version @value{VERSION} of GNU Grep.
58
59@insertcopying
60@end ifnottex
61
62@menu
63* Introduction::                Introduction.
64* Invoking::                    Command-line options, environment, exit status.
65* Regular Expressions::         Regular Expressions.
66* Usage::                       Examples.
67* Reporting Bugs::              Reporting Bugs.
68* Copying::                     License terms for this manual.
69* Index::                       Combined index.
70@end menu
71
72
73@node Introduction
74@chapter Introduction
75
76@cindex searching for a pattern
77
78@command{grep} searches input files
79for lines containing a match to a given pattern list.
80When it finds a match in a line,
81it copies the line to standard output (by default),
82or produces whatever other sort of output you have requested with options.
83
84Though @command{grep} expects to do the matching on text,
85it has no limits on input line length other than available memory,
86and it can match arbitrary characters within a line.
87If the final byte of an input file is not a newline,
88@command{grep} silently supplies one.
89Since newline is also a separator for the list of patterns,
90there is no way to match newline characters in a text.
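
As a small illustration of the missing-newline behavior described above,
the following sketch pipes input whose final byte is not a newline.
The sample text and the variable name are made up for the example.

```shell
# The input "alpha\nbeta" has no newline after "beta".
# grep still writes the matching line as a complete, newline-terminated line,
# so wc -l (which counts newline bytes) reports one line.
lines=$(printf 'alpha\nbeta' | grep beta | wc -l | tr -d ' ')
echo "newline-terminated lines printed: $lines"
```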
91
92
93@node Invoking
94@chapter Invoking @command{grep}
95
96The general synopsis of the @command{grep} command line is
97
@example
grep @var{options} @var{pattern} @var{input_file_names}
@end example
101
102@noindent
103There can be zero or more @var{options}.
104@var{pattern} will only be seen as such
105(and not as an @var{input_file_name})
106if it wasn't already specified within @var{options}
107(by using the @samp{-e@ @var{pattern}}
108or @samp{-f@ @var{file}} options).
109There can be zero or more @var{input_file_names}.
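
The interaction between @option{-e} and the operands can be sketched as follows.
This is only an illustration: the scratch directory, the file name
@file{sample.txt}, and its contents are assumptions made for the example.

```shell
# Work in a scratch directory; file name and contents are made up.
dir=$(mktemp -d)
printf 'one\n-two\nthree\n' > "$dir/sample.txt"

# The pattern is given with -e, so the first operand is already a file name.
# -e also protects a pattern that begins with "-".
dash_hit=$(grep -e '-two' "$dir/sample.txt")

# Several -e options build a pattern list; a line matching any pattern is selected.
multi_hits=$(grep -e one -e three "$dir/sample.txt")

printf '%s\n' "$dash_hit" "$multi_hits"
rm -r "$dir"
```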
110
111@menu
112* Command-line Options::        Short and long names, grouped by category.
113* Environment Variables::       POSIX, GNU generic, and GNU grep specific.
114* Exit Status::                 Exit status returned by @command{grep}.
115* grep Programs::               @command{grep} programs.
116@end menu
117
118@node Command-line Options
119@section Command-line Options
120
121@command{grep} comes with a rich set of options:
122some from POSIX and some being GNU extensions.
123Long option names are always a GNU extension,
124even for options that are from POSIX specifications.
125Options that are specified by POSIX,
126under their short names,
127are explicitly marked as such
128to facilitate POSIX-portable programming.
129A few option names are provided
130for compatibility with older or more exotic implementations.
131
132@menu
133* Generic Program Information::
134* Matching Control::
135* General Output Control::
136* Output Line Prefix Control::
137* Context Line Control::
138* File and Directory Selection::
139* Other Options::
140@end menu
141
142Several additional options control
143which variant of the @command{grep} matching engine is used.
144@xref{grep Programs}.
145
146@node Generic Program Information
147@subsection Generic Program Information
148
149@table @option
150
151@item --help
152@opindex --help
153@cindex usage summary, printing
154Print a usage message briefly summarizing the command-line options
155and the bug-reporting address, then exit.
156
157@item -V
158@itemx --version
159@opindex -V
160@opindex --version
161@cindex version, printing
162Print the version number of @command{grep} to the standard output stream.
163This version number should be included in all bug reports.
164
165@end table
166
167@node Matching Control
168@subsection Matching Control
169
170@table @option
171
172@item -e @var{pattern}
173@itemx --regexp=@var{pattern}
174@opindex -e
175@opindex --regexp=@var{pattern}
176@cindex pattern list
177Use @var{pattern} as the pattern.
178This can be used to specify multiple search patterns,
179or to protect a pattern beginning with a @samp{-}.
180(@option{-e} is specified by POSIX.)
181
182@item -f @var{file}
183@itemx --file=@var{file}
184@opindex -f
185@opindex --file
186@cindex pattern from file
187Obtain patterns from @var{file}, one per line.
188The empty file contains zero patterns, and therefore matches nothing.
189(@option{-f} is specified by POSIX.)
190
191@item -i
192@itemx -y
193@itemx --ignore-case
194@opindex -i
195@opindex -y
196@opindex --ignore-case
197@cindex case insensitive search
198Ignore case distinctions, so that characters that differ only in case
199match each other.  Although this is straightforward when letters
200differ in case only via lowercase-uppercase pairs, the behavior is
201unspecified in other situations.  For example, uppercase ``S'' has an
202unusual lowercase counterpart ``ſ'' (Unicode character U+017F, LATIN
203SMALL LETTER LONG S) in many locales, and it is unspecified whether
204this unusual character matches ``S'' or ``s'' even though uppercasing
205it yields ``S''.  Another example: the lowercase German letter ``ß''
206(U+00DF, LATIN SMALL LETTER SHARP S) is normally capitalized as the
207two-character string ``SS'' but it does not match ``SS'', and it might
208not match the uppercase letter ``ẞ'' (U+1E9E, LATIN CAPITAL LETTER
209SHARP S) even though lowercasing the latter yields the former.
210
211@option{-y} is an obsolete synonym that is provided for compatibility.
212(@option{-i} is specified by POSIX.)
213
214@item -v
215@itemx --invert-match
216@opindex -v
217@opindex --invert-match
218@cindex invert matching
219@cindex print non-matching lines
220Invert the sense of matching, to select non-matching lines.
221(@option{-v} is specified by POSIX.)
222
223@item -w
224@itemx --word-regexp
225@opindex -w
226@opindex --word-regexp
227@cindex matching whole words
Select only those lines containing matches that form whole words.
The test is that the matching substring must either
be at the beginning of the line,
or preceded by a non-word-constituent character.
Similarly,
it must be either at the end of the line
or followed by a non-word-constituent character.
Word-constituent characters are letters, digits, and the underscore.
236
237@item -x
238@itemx --line-regexp
239@opindex -x
240@opindex --line-regexp
241@cindex match the whole line
242Select only those matches that exactly match the whole line.
243(@option{-x} is specified by POSIX.)
244
245@end table
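
The matching-control options described above can be compared on one small input.
This is a hedged sketch: the sample lines and variable names are invented,
and @option{-c} (described under General Output Control) is used only to keep the output short.

```shell
sample=$(printf 'Cat\ncatalog\nthe cat sat\nconcatenate\ndog\n')

# -i: ignore case, so "Cat" matches along with every line containing "cat".
count_i=$(printf '%s\n' "$sample" | grep -c -i cat)

# -w: the match must be bounded by line edges or non-word-constituent characters.
words=$(printf '%s\n' "$sample" | grep -w cat)

# -x: the match must cover the whole line.
whole=$(printf '%s\n' "$sample" | grep -x -i cat)

# -v: select the lines that do not match.
inverted=$(printf '%s\n' "$sample" | grep -v -i cat)

echo "count_i=$count_i words=$words whole=$whole inverted=$inverted"
```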
246
247@node General Output Control
248@subsection General Output Control
249
250@table @option
251
252@item -c
253@itemx --count
254@opindex -c
255@opindex --count
256@cindex counting lines
257Suppress normal output;
258instead print a count of matching lines for each input file.
259With the @option{-v} (@option{--invert-match}) option,
260count non-matching lines.
261(@option{-c} is specified by POSIX.)
262
263@item --color[=@var{WHEN}]
264@itemx --colour[=@var{WHEN}]
265@opindex --color
266@opindex --colour
267@cindex highlight, color, colour
268Surround the matched (non-empty) strings, matching lines, context lines,
269file names, line numbers, byte offsets, and separators (for fields and
270groups of context lines) with escape sequences to display them in color
271on the terminal.
272The colors are defined by the environment variable @env{GREP_COLORS}
273and default to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
274for bold red matched text, magenta file names, green line numbers,
275green byte offsets, cyan separators, and default terminal colors otherwise.
276The deprecated environment variable @env{GREP_COLOR} is still supported,
277but its setting does not have priority;
278it defaults to @samp{01;31} (bold red)
279which only covers the color for matched text.
280@var{WHEN} is @samp{never}, @samp{always}, or @samp{auto}.
281
282@item -L
283@itemx --files-without-match
284@opindex -L
285@opindex --files-without-match
286@cindex files which don't match
287Suppress normal output;
288instead print the name of each input file from which
289no output would normally have been printed.
290The scanning of each file stops on the first match.
291
292@item -l
293@itemx --files-with-matches
294@opindex -l
295@opindex --files-with-matches
296@cindex names of matching files
297Suppress normal output;
298instead print the name of each input file from which
299output would normally have been printed.
300The scanning of each file stops on the first match.
301(@option{-l} is specified by POSIX.)
302
303@item -m @var{num}
304@itemx --max-count=@var{num}
305@opindex -m
306@opindex --max-count
307@cindex max-count
308Stop reading a file after @var{num} matching lines.
309If the input is standard input from a regular file,
310and @var{num} matching lines are output,
311@command{grep} ensures that the standard input is positioned
312just after the last matching line before exiting,
313regardless of the presence of trailing context lines.
314This enables a calling process to resume a search.
315For example, the following shell script makes use of it:
316
@example
while grep -m 1 PATTERN
do
  echo xxxx
done < FILE
@end example
323
324But the following probably will not work because a pipe is not a regular
325file:
326
@example
# This probably will not work.
cat FILE |
while grep -m 1 PATTERN
do
  echo xxxx
done
@end example
335
336When @command{grep} stops after @var{num} matching lines,
337it outputs any trailing context lines.
338Since context does not include matching lines,
339@command{grep} will stop when it encounters another matching line.
340When the @option{-c} or @option{--count} option is also used,
341@command{grep} does not output a count greater than @var{num}.
342When the @option{-v} or @option{--invert-match} option is also used,
343@command{grep} stops after outputting @var{num} non-matching lines.
344
345@item -o
346@itemx --only-matching
347@opindex -o
348@opindex --only-matching
349@cindex only matching
350Print only the matched (non-empty) parts of matching lines,
351with each such part on a separate output line.
352
353@item -q
354@itemx --quiet
355@itemx --silent
356@opindex -q
357@opindex --quiet
358@opindex --silent
359@cindex quiet, silent
360Quiet; do not write anything to standard output.
361Exit immediately with zero status if any match is found,
362even if an error was detected.
363Also see the @option{-s} or @option{--no-messages} option.
364(@option{-q} is specified by POSIX.)
365
366@item -s
367@itemx --no-messages
368@opindex -s
369@opindex --no-messages
370@cindex suppress error messages
371Suppress error messages about nonexistent or unreadable files.
372Portability note:
373unlike GNU @command{grep},
3747th Edition Unix @command{grep} did not conform to POSIX,
375because it lacked @option{-q}
376and its @option{-s} option behaved like
377GNU @command{grep}'s @option{-q} option.@footnote{Of course, 7th Edition
378Unix predated POSIX by several years!}
379USG-style @command{grep} also lacked @option{-q}
380but its @option{-s} option behaved like GNU @command{grep}'s.
381Portable shell scripts should avoid both
382@option{-q} and @option{-s} and should redirect
383standard and error output to @file{/dev/null} instead.
384(@option{-s} is specified by POSIX.)
385
386@end table
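
The difference between counting lines, printing individual matches,
listing files, and testing silently can be sketched as below.
The log file names and their contents are assumptions made for the illustration.

```shell
dir=$(mktemp -d)
printf 'error: disk\nok\nerror: net error\n' > "$dir/a.log"
printf 'ok\nok\n' > "$dir/b.log"

# -c counts matching lines, not individual matches.
count=$(grep -c error "$dir/a.log")

# -o prints every non-empty match on its own output line.
matches=$(grep -o error "$dir/a.log" | wc -l | tr -d ' ')

# -l lists files that contain a match; -L lists files that do not.
with=$(cd "$dir" && grep -l error a.log b.log)
without=$(cd "$dir" && grep -L error a.log b.log)

# -q prints nothing; only the exit status reports whether a match exists.
if grep -q error "$dir/a.log"; then found=yes; else found=no; fi

echo "count=$count matches=$matches with=$with without=$without found=$found"
rm -r "$dir"
```

Here the third line of the first file contains two matches,
which is why the @option{-o} total is larger than the @option{-c} count.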
387
388@node Output Line Prefix Control
389@subsection Output Line Prefix Control
390
391When several prefix fields are to be output,
392the order is always file name, line number, and byte offset,
393regardless of the order in which these options were specified.
394
395@table @option
396
397@item -b
398@itemx --byte-offset
399@opindex -b
400@opindex --byte-offset
401@cindex byte offset
402Print the 0-based byte offset within the input file
403before each line of output.
404If @option{-o} (@option{--only-matching}) is specified,
405print the offset of the matching part itself.
406When @command{grep} runs on MS-DOS or MS-Windows,
407the printed byte offsets depend on whether
408the @option{-u} (@option{--unix-byte-offsets}) option is used;
409see below.
410
411@item -H
412@itemx --with-filename
413@opindex -H
414@opindex --with-filename
415@cindex with filename prefix
416Print the file name for each match.
417This is the default when there is more than one file to search.
418
419@item -h
420@itemx --no-filename
421@opindex -h
422@opindex --no-filename
423@cindex no filename prefix
424Suppress the prefixing of file names on output.
425This is the default when there is only one file
426(or only standard input) to search.
427
428@item --label=@var{LABEL}
429@opindex --label
430@cindex changing name of standard input
431Display input actually coming from standard input
432as input coming from file @var{LABEL}.  This is
433especially useful when implementing tools like
434@command{zgrep}; e.g.:
435
@example
gzip -cd foo.gz | grep --label=foo -H something
@end example
439
440@item -n
441@itemx --line-number
442@opindex -n
443@opindex --line-number
444@cindex line numbering
445Prefix each line of output with the 1-based line number within its input file.
446(@option{-n} is specified by POSIX.)
447
448@item -T
449@itemx --initial-tab
450@opindex -T
451@opindex --initial-tab
452@cindex tab-aligned content lines
453Make sure that the first character of actual line content lies on a tab stop,
454so that the alignment of tabs looks normal.
455This is useful with options that prefix their output to the actual content:
456@option{-H}, @option{-n}, and @option{-b}.
457In order to improve the probability that lines
458from a single file will all start at the same column,
459this also causes the line number and byte offset (if present)
460to be printed in a minimum-size field width.
461
462@item -u
463@itemx --unix-byte-offsets
464@opindex -u
465@opindex --unix-byte-offsets
466@cindex MS-DOS/MS-Windows byte offsets
467@cindex byte offsets, on MS-DOS/MS-Windows
468Report Unix-style byte offsets.
469This option causes @command{grep} to report byte offsets
470as if the file were a Unix-style text file,
471i.e., the byte offsets ignore the @code{CR} characters that were stripped.
472This will produce results identical
473to running @command{grep} on a Unix machine.
474This option has no effect unless the @option{-b} option is also used;
475it has no effect on platforms other than MS-DOS and MS-Windows.
476
477@item -Z
478@itemx --null
479@opindex -Z
480@opindex --null
481@cindex zero-terminated file names
482Output a zero byte (the ASCII @code{NUL} character)
483instead of the character that normally follows a file name.
484For example,
485@samp{grep -lZ} outputs a zero byte after each file name
486instead of the usual newline.
487This option makes the output unambiguous,
488even in the presence of file names containing unusual characters like newlines.
489This option can be used with commands like
490@samp{find -print0}, @samp{perl -0}, @samp{sort -z}, and @samp{xargs -0}
491to process arbitrary file names,
492even those that contain newline characters.
493
494@end table
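
The fixed ordering of prefix fields and the use of @option{-Z}
can be tried with a sketch like the following.
The file @file{f.txt} and its two lines are made up for the example,
and @samp{xargs -0} is assumed to be available, as in the commands mentioned above.

```shell
dir=$(mktemp -d)
printf 'foo\nbar\n' > "$dir/f.txt"

# The options are given as -b -n -H, but the prefix is still printed as
# file name, line number, byte offset.
prefixed=$(cd "$dir" && grep -b -n -H bar f.txt)
echo "$prefixed"

# -lZ ends each file name with a NUL byte, which xargs -0 splits on.
listed=$(cd "$dir" && grep -lZ bar f.txt | xargs -0 ls)
echo "$listed"
rm -r "$dir"
```

The byte offset is 4 because the first line, including its newline, occupies bytes 0 through 3.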
495
496@node Context Line Control
497@subsection Context Line Control
498
499Regardless of how these options are set,
500@command{grep} will never print any given line more than once.
501If the @option{-o} (@option{--only-matching}) option is specified,
502these options have no effect and a warning is given upon their use.
503
504@table @option
505
506@item -A @var{num}
507@itemx --after-context=@var{num}
508@opindex -A
509@opindex --after-context
510@cindex after context
511@cindex context lines, after match
512Print @var{num} lines of trailing context after matching lines.
513
514@item -B @var{num}
515@itemx --before-context=@var{num}
516@opindex -B
517@opindex --before-context
518@cindex before context
519@cindex context lines, before match
520Print @var{num} lines of leading context before matching lines.
521
522@item -C @var{num}
523@itemx -@var{num}
524@itemx --context=@var{num}
525@opindex -C
526@opindex --context
527@opindex -@var{num}
528@cindex context
529Print @var{num} lines of leading and trailing output context.
530
531@item --group-separator=@var{string}
532@opindex --group-separator
533@cindex group separator
When @option{-A}, @option{-B} or @option{-C} are in use,
print @var{string} instead of @samp{--} between groups of lines.
536
537@item --no-group-separator
@opindex --no-group-separator
539@cindex group separator
540When @option{-A}, @option{-B} or @option{-C} are in use,
541do not print a separator between groups of lines.
542
543@end table
544
545Here are some points about how @command{grep} chooses
546the separator to print between prefix fields and line content:
547
548@itemize @bullet
549@item
550Matching lines normally use @samp{:} as a separator
551between prefix fields and actual line content.
552
553@item
554Context (i.e., non-matching) lines use @samp{-} instead.
555
556@item
557When context is not specified,
558matching lines are simply output one right after another.
559
560@item
561When context is specified,
562lines that are adjacent in the input form a group
563and are output one right after another, while
564by default a separator appears between non-adjacent groups.
565
566@item
567The default separator
568is a @samp{--} line; its presence and appearance
569can be changed with the options above.
570
571@item
572Each group may contain
573several matching lines when they are close enough to each other
574that two adjacent groups connect and can merge into a single
575contiguous one.
576@end itemize
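
The separator rules listed above can be observed with a short sketch.
The seven input lines are invented; the matches are on lines 2 and 6.

```shell
input=$(printf 'l1\nhit2\nl3\nl4\nl5\nhit6\nl7\n')

# One line of trailing context: the two groups are not adjacent,
# so a "--" line separates them. Matching lines use ":" after the
# line number and context lines use "-".
out=$(printf '%s\n' "$input" | grep -n -A 1 hit)
printf '%s\n' "$out"

# Two lines of context on each side: the groups touch and merge,
# so no separator line is printed at all.
separators=$(printf '%s\n' "$input" | grep -C 2 hit | grep -c -x -e '--')
echo "separator lines with -C 2: $separators"
```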
577
578@node File and Directory Selection
579@subsection File and Directory Selection
580
581@table @option
582
583@item -a
584@itemx --text
585@opindex -a
586@opindex --text
587@cindex suppress binary data
588@cindex binary files
589Process a binary file as if it were text;
590this is equivalent to the @samp{--binary-files=text} option.
591
592@item --binary-files=@var{type}
593@opindex --binary-files
594@cindex binary files
595If a file's allocation metadata or its first few bytes
596indicate that the file contains binary data,
597assume that the file is of type @var{type}.
598By default, @var{type} is @samp{binary},
599and @command{grep} normally outputs either
600a one-line message saying that a binary file matches,
601or no message if there is no match.
602
603If @var{type} is @samp{without-match},
604@command{grep} assumes that a binary file does not match;
605this is equivalent to the @option{-I} option.
606
607If @var{type} is @samp{text},
608@command{grep} processes a binary file as if it were text;
609this is equivalent to the @option{-a} option.
610
611@emph{Warning:} @samp{--binary-files=text} might output binary garbage,
612which can have nasty side effects
613if the output is a terminal and
614if the terminal driver interprets some of it as commands.
615
616@item -D @var{action}
617@itemx --devices=@var{action}
618@opindex -D
619@opindex --devices
620@cindex device search
621If an input file is a device, FIFO, or socket, use @var{action} to process it.
622If @var{action} is @samp{read},
623all devices are read just as if they were ordinary files.
624If @var{action} is @samp{skip},
625devices, FIFOs, and sockets are silently skipped.
626By default, devices are read if they are on the command line or if the
627@option{-R} (@option{--dereference-recursive}) option is used, and are
628skipped if they are encountered recursively and the @option{-r}
629(@option{--recursive}) option is used.
630This option has no effect on a file that is read via standard input.
631
632@item -d @var{action}
633@itemx --directories=@var{action}
634@opindex -d
635@opindex --directories
636@cindex directory search
637@cindex symbolic links
638If an input file is a directory, use @var{action} to process it.
639By default, @var{action} is @samp{read},
640which means that directories are read just as if they were ordinary files
641(some operating systems and file systems disallow this,
642and will cause @command{grep}
643to print error messages for every directory or silently skip them).
644If @var{action} is @samp{skip}, directories are silently skipped.
645If @var{action} is @samp{recurse},
646@command{grep} reads all files under each directory, recursively,
647following command-line symbolic links and skipping other symlinks;
648this is equivalent to the @option{-r} option.
649
650@item --exclude=@var{glob}
651@opindex --exclude
652@cindex exclude files
653@cindex searching directory trees
654Skip files whose base name matches @var{glob}
655(using wildcard matching).
656A file-name glob can use
657@samp{*}, @samp{?}, and @samp{[}...@samp{]} as wildcards,
658and @code{\} to quote a wildcard or backslash character literally.
659
660@item --exclude-from=@var{file}
661@opindex --exclude-from
662@cindex exclude files
663@cindex searching directory trees
664Skip files whose base name matches any of the file-name globs
665read from @var{file} (using wildcard matching as described
666under @option{--exclude}).
667
668@item --exclude-dir=@var{dir}
669@opindex --exclude-dir
670@cindex exclude directories
671Skip any directory whose name matches the pattern @var{dir}, ignoring
672any redundant trailing slashes in @var{dir}.
673
@item -I
@opindex -I
Process a binary file as if it did not contain matching data;
this is equivalent to the @samp{--binary-files=without-match} option.
677
678@item --include=@var{glob}
679@opindex --include
680@cindex include files
681@cindex searching directory trees
682Search only files whose base name matches @var{glob}
683(using wildcard matching as described under @option{--exclude}).
684
685@item -r
686@itemx --recursive
687@opindex -r
688@opindex --recursive
689@cindex recursive search
690@cindex searching directory trees
691@cindex symbolic links
692For each directory operand,
693read and process all files in that directory, recursively.
694Follow symbolic links on the command line, but skip symlinks
695that are encountered recursively.
696This is the same as the @samp{--directories=recurse} option.
697
698@item -R
699@itemx --dereference-recursive
700@opindex -R
701@opindex --dereference-recursive
702@cindex recursive search
703@cindex searching directory trees
704@cindex symbolic links
705For each directory operand, read and process all files in that
706directory, recursively, following all symbolic links.
707
708@end table
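
A recursive search restricted by file-name globs might look like the sketch below.
The directory layout (@file{src}, @file{build}) and the file contents
are assumptions chosen only to show which files are visited.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/src" "$dir/build"
printf 'TODO: fix\n' > "$dir/src/main.c"
printf 'TODO: doc\n' > "$dir/src/notes.txt"
printf 'TODO: generated\n' > "$dir/build/out.c"

# -r walks the tree below ".".
# --include keeps only files whose base name matches the glob.
# --exclude-dir skips directories whose name matches.
hits=$(cd "$dir" && grep -r -l --include='*.c' --exclude-dir=build TODO .)
echo "$hits"
rm -r "$dir"
```

Quoting the glob keeps the shell from expanding it before @command{grep} sees it.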
709
710@node Other Options
711@subsection Other Options
712
713@table @option
714
715@item --line-buffered
716@opindex --line-buffered
717@cindex line buffering
718Use line buffering on output.
719This can cause a performance penalty.
720
721@item -U
722@itemx --binary
723@opindex -U
724@opindex --binary
725@cindex MS-DOS/MS-Windows binary files
726@cindex binary files, MS-DOS/MS-Windows
727Treat the file(s) as binary.
728By default, under MS-DOS and MS-Windows,
729@command{grep} guesses whether a file is text or binary
730as described for the @option{--binary-files} option.
731If @command{grep} decides the file is a text file,
732it strips the @code{CR} characters from the original file contents
733(to make regular expressions with @code{^} and @code{$} work correctly).
734Specifying @option{-U} overrules this guesswork,
735causing all files to be read and passed to the matching mechanism verbatim;
736if the file is a text file with @code{CR/LF} pairs at the end of each line,
737this will cause some regular expressions to fail.
738This option has no effect
739on platforms other than MS-DOS and MS-Windows.
740
741@item -z
742@itemx --null-data
743@opindex -z
744@opindex --null-data
745@cindex zero-terminated lines
746Treat the input as a set of lines, each terminated by a zero byte (the
747ASCII @code{NUL} character) instead of a newline.
748Like the @option{-Z} or @option{--null} option,
749this option can be used with commands like
750@samp{sort -z} to process arbitrary file names.
751
752@end table
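
As an illustration of @option{-z}, the sketch below searches
zero-terminated records, one of which contains a newline.
The record contents are made up, and @command{tr} is used only to make
the separators visible on a terminal.

```shell
# Three NUL-terminated records; the second contains an embedded newline.
# With -z the whole record is one "line", so it is selected and printed intact.
# tr then shows the embedded newline as a space and the terminating NUL as "|".
records=$(printf 'plain\0with\nnewline\0other\0' | grep -z newline | tr '\n\0' ' |')
echo "$records"
```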
753
754@node Environment Variables
755@section Environment Variables
756
757The behavior of @command{grep} is affected
758by the following environment variables.
759
760The locale for category @w{@code{LC_@var{foo}}}
761is specified by examining the three environment variables
762@env{LC_ALL}, @w{@env{LC_@var{foo}}}, and @env{LANG},
763in that order.
764The first of these variables that is set specifies the locale.
765For example, if @env{LC_ALL} is not set,
766but @env{LC_MESSAGES} is set to @samp{pt_BR},
767then the Brazilian Portuguese locale is used
768for the @code{LC_MESSAGES} category.
769The @samp{C} locale is used if none of these environment variables are set,
770if the locale catalog is not installed,
771or if @command{grep} was not compiled
772with national language support (NLS).
773
774Many of the environment variables in the following list let you
775control highlighting using
776Select Graphic Rendition (SGR)
777commands interpreted by the terminal or terminal emulator.
(See the SGR section
in the documentation of your text terminal
for permitted values and their meanings as character attributes.)
782These substring values are integers in decimal representation
783and can be concatenated with semicolons.
784@command{grep} takes care of assembling the result
785into a complete SGR sequence (@samp{\33[}...@samp{m}).
786Common values to concatenate include
787@samp{1} for bold,
788@samp{4} for underline,
789@samp{5} for blink,
790@samp{7} for inverse,
791@samp{39} for default foreground color,
792@samp{30} to @samp{37} for foreground colors,
793@samp{90} to @samp{97} for 16-color mode foreground colors,
794@samp{38;5;0} to @samp{38;5;255}
795for 88-color and 256-color modes foreground colors,
796@samp{49} for default background color,
797@samp{40} to @samp{47} for background colors,
798@samp{100} to @samp{107} for 16-color mode background colors,
799and @samp{48;5;0} to @samp{48;5;255}
800for 88-color and 256-color modes background colors.
801
802The two-letter names used in the @env{GREP_COLORS} environment variable
803(and some of the others) refer to terminal ``capabilities,'' the ability
804of a terminal to highlight text, or change its color, and so on.
805These capabilities are stored in an online database and accessed by
806the @code{terminfo} library.
807
808@cindex environment variables
809
810@table @env
811
812@item GREP_OPTIONS
813@vindex GREP_OPTIONS @r{environment variable}
814@cindex default options environment variable
815This variable specifies default options to be placed in front of any
816explicit options.
817For example, if @code{GREP_OPTIONS} is
818@samp{--binary-files=without-match --directories=skip}, @command{grep}
819behaves as if the two options @samp{--binary-files=without-match} and
820@samp{--directories=skip} had been specified before
821any explicit options.
822Option specifications are separated by
823whitespace.
824A backslash escapes the next character, so it can be used to
825specify an option containing whitespace or a backslash.
826
827The @code{GREP_OPTIONS} value does not affect whether @command{grep}
828without file operands searches standard input or the working
829directory; that is affected only by command-line options.  For
830example, the command @samp{grep PAT} searches standard input and the
831command @samp{grep -r PAT} searches the working directory, regardless
832of whether @code{GREP_OPTIONS} contains @option{-r}.
833
834@item GREP_COLOR
835@vindex GREP_COLOR @r{environment variable}
836@cindex highlight markers
837This variable specifies the color used to highlight matched (non-empty) text.
838It is deprecated in favor of @env{GREP_COLORS}, but still supported.
839The @samp{mt}, @samp{ms}, and @samp{mc} capabilities of @env{GREP_COLORS}
840have priority over it.
841It can only specify the color used to highlight
842the matching non-empty text in any matching line
843(a selected line when the @option{-v} command-line option is omitted,
844or a context line when @option{-v} is specified).
845The default is @samp{01;31},
846which means a bold red foreground text on the terminal's default background.
847
848@item GREP_COLORS
849@vindex GREP_COLORS @r{environment variable}
850@cindex highlight markers
851This variable specifies the colors and other attributes
852used to highlight various parts of the output.
853Its value is a colon-separated list of @code{terminfo} capabilities
854that defaults to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
855with the @samp{rv} and @samp{ne} boolean capabilities omitted (i.e., false).
856Supported capabilities are as follows.
857
858@table @code
859@item sl=
860@vindex sl GREP_COLORS @r{capability}
861SGR substring for whole selected lines
862(i.e.,
863matching lines when the @option{-v} command-line option is omitted,
864or non-matching lines when @option{-v} is specified).
865If however the boolean @samp{rv} capability
866and the @option{-v} command-line option are both specified,
867it applies to context matching lines instead.
868The default is empty (i.e., the terminal's default color pair).
869
870@item cx=
871@vindex cx GREP_COLORS @r{capability}
872SGR substring for whole context lines
873(i.e.,
874non-matching lines when the @option{-v} command-line option is omitted,
875or matching lines when @option{-v} is specified).
876If however the boolean @samp{rv} capability
877and the @option{-v} command-line option are both specified,
878it applies to selected non-matching lines instead.
879The default is empty (i.e., the terminal's default color pair).
880
881@item rv
882@vindex rv GREP_COLORS @r{capability}
883Boolean value that reverses (swaps) the meanings of
884the @samp{sl=} and @samp{cx=} capabilities
885when the @option{-v} command-line option is specified.
886The default is false (i.e., the capability is omitted).
887
888@item mt=01;31
889@vindex mt GREP_COLORS @r{capability}
890SGR substring for matching non-empty text in any matching line
891(i.e.,
892a selected line when the @option{-v} command-line option is omitted,
893or a context line when @option{-v} is specified).
894Setting this is equivalent to setting both @samp{ms=} and @samp{mc=}
895at once to the same value.
896The default is a bold red text foreground over the current line background.
897
898@item ms=01;31
899@vindex ms GREP_COLORS @r{capability}
900SGR substring for matching non-empty text in a selected line.
901(This is used only when the @option{-v} command-line option is omitted.)
902The effect of the @samp{sl=} (or @samp{cx=} if @samp{rv}) capability
903remains active when this takes effect.
904The default is a bold red text foreground over the current line background.
905
906@item mc=01;31
907@vindex mc GREP_COLORS @r{capability}
908SGR substring for matching non-empty text in a context line.
909(This is used only when the @option{-v} command-line option is specified.)
910The effect of the @samp{cx=} (or @samp{sl=} if @samp{rv}) capability
911remains active when this takes effect.
912The default is a bold red text foreground over the current line background.
913
914@item fn=35
915@vindex fn GREP_COLORS @r{capability}
916SGR substring for file names prefixing any content line.
917The default is a magenta text foreground over the terminal's default background.
918
919@item ln=32
920@vindex ln GREP_COLORS @r{capability}
921SGR substring for line numbers prefixing any content line.
922The default is a green text foreground over the terminal's default background.
923
924@item bn=32
925@vindex bn GREP_COLORS @r{capability}
926SGR substring for byte offsets prefixing any content line.
927The default is a green text foreground over the terminal's default background.
928
929@item se=36
@vindex se GREP_COLORS @r{capability}
931SGR substring for separators that are inserted
932between selected line fields (@samp{:}),
933between context line fields (@samp{-}),
934and between groups of adjacent lines
935when nonzero context is specified (@samp{--}).
936The default is a cyan text foreground over the terminal's default background.
937
938@item ne
939@vindex ne GREP_COLORS @r{capability}
940Boolean value that prevents clearing to the end of line
941using Erase in Line (EL) to Right (@samp{\33[K})
942each time a colorized item ends.
943This is needed on terminals on which EL is not supported.
944It is otherwise useful on terminals
945for which the @code{back_color_erase}
946(@code{bce}) boolean @code{terminfo} capability does not apply,
947when the chosen highlight colors do not affect the background,
948or when EL is too slow or causes too much flicker.
949The default is false (i.e., the capability is omitted).
950@end table
951
952Note that boolean capabilities have no @samp{=}... part.
953They are omitted (i.e., false) by default and become true when specified.
954
955
956@item LC_ALL
957@itemx LC_COLLATE
958@itemx LANG
959@vindex LC_ALL @r{environment variable}
960@vindex LC_COLLATE @r{environment variable}
961@vindex LANG @r{environment variable}
962@cindex character type
963@cindex national language support
964@cindex NLS
965These variables specify the locale for the @code{LC_COLLATE} category,
966which might affect how range expressions like @samp{[a-z]} are
967interpreted.
968
969@item LC_ALL
970@itemx LC_CTYPE
971@itemx LANG
972@vindex LC_ALL @r{environment variable}
973@vindex LC_CTYPE @r{environment variable}
974@vindex LANG @r{environment variable}
975These variables specify the locale for the @code{LC_CTYPE} category,
976which determines the type of characters,
977e.g., which characters are whitespace.
978
979@item LC_ALL
980@itemx LC_MESSAGES
981@itemx LANG
982@vindex LC_ALL @r{environment variable}
983@vindex LC_MESSAGES @r{environment variable}
984@vindex LANG @r{environment variable}
985@cindex language of messages
986@cindex message language
987@cindex national language support
988@cindex translation of message language
989These variables specify the locale for the @code{LC_MESSAGES} category,
990which determines the language that @command{grep} uses for messages.
991The default @samp{C} locale uses American English messages.
992
993@item POSIXLY_CORRECT
994@vindex POSIXLY_CORRECT @r{environment variable}
995If set, @command{grep} behaves as POSIX requires; otherwise,
996@command{grep} behaves more like other GNU programs.
997POSIX
998requires that options that
999follow file names must be treated as file names;
1000by default,
1001such options are permuted to the front of the operand list
1002and are treated as options.
1003Also, @code{POSIXLY_CORRECT} disables special handling of an
1004invalid bracket expression.  @xref{invalid-bracket-expr}.
1005
1006@item _@var{N}_GNU_nonoption_argv_flags_
1007@vindex _@var{N}_GNU_nonoption_argv_flags_ @r{environment variable}
1008(Here @code{@var{N}} is @command{grep}'s numeric process ID.)
1009If the @var{i}th character of this environment variable's value is @samp{1},
1010do not consider the @var{i}th operand of @command{grep} to be an option,
1011even if it appears to be one.
1012A shell can put this variable in the environment for each command it runs,
1013specifying which operands are the results of file name wildcard expansion
1014and therefore should not be treated as options.
1015This behavior is available only with the GNU C library,
1016and only when @code{POSIXLY_CORRECT} is not set.
1017
1018@end table
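
As a rough illustration of the @env{GREP_COLORS} capabilities described
above, the following sketch overrides the color of matched text and sets
the boolean @samp{ne} capability.
The input text and the SGR value @samp{01;32} (bold green) are
arbitrary choices made for this example.

@example
# Highlight matched text in bold green and, via 'ne', skip the
# Erase in Line sequence after each colorized item.
colored=$(printf 'hello world\n' |
  GREP_COLORS='mt=01;32:ne' grep --color=always 'hello')
printf '%s\n' "$colored"
@end example

@noindent
The @option{--color=always} option is used here so that the SGR
sequences are emitted even though the output is captured
rather than written to a terminal.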
1019
1020
1021@node Exit Status
1022@section Exit Status
1023@cindex exit status
1024@cindex return status
1025
1026Normally, the exit status is 0 if selected lines are found and 1 otherwise.
1027But the exit status is 2 if an error occurred, unless the @option{-q} or
1028@option{--quiet} or @option{--silent} option is used and a selected line
1029is found.
1030Note, however, that POSIX only mandates,
1031for programs such as @command{grep}, @command{cmp}, and @command{diff},
1032that the exit status in case of error be greater than 1;
1033it is therefore advisable, for the sake of portability,
1034to use logic that tests for this general condition
1035instead of strict equality with@ 2.
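
The portability advice above can be sketched as follows.
The input text and the variable names @code{status} and @code{result}
are assumptions made only for this illustration.

@example
# Capture the exit status without letting a non-zero value
# abort a script that runs under 'set -e'.
if printf 'alpha\nbeta\n' | grep -q 'gamma'; then
  status=0
else
  status=$?
fi

# Test for "greater than 1" rather than for equality with 2.
if [ "$status" -gt 1 ]; then
  result='error'
elif [ "$status" -eq 1 ]; then
  result='not found'
else
  result='found'
fi
printf '%s\n' "$result"
@end example

@noindent
Since neither input line contains @samp{gamma},
this should print @samp{not found}.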
1036
1037
1038@node grep Programs
1039@section @command{grep} Programs
1040@cindex @command{grep} programs
1041@cindex variants of @command{grep}
1042
1043@command{grep} searches the named input files
1044for lines containing a match to the given pattern.
1045By default, @command{grep} prints the matching lines.
1046A file named @file{-} stands for standard input.
1047If no input is specified, @command{grep} searches the working
1048directory @file{.} if given a command-line option specifying
1049recursion; otherwise, @command{grep} searches standard input.
1050There are four major variants of @command{grep},
1051controlled by the following options.
1052
1053@table @option
1054
1055@item -G
1056@itemx --basic-regexp
1057@opindex -G
1058@opindex --basic-regexp
1059@cindex matching basic regular expressions
1060Interpret the pattern as a basic regular expression (BRE).
1061This is the default.
1062
1063@item -E
1064@itemx --extended-regexp
1065@opindex -E
1066@opindex --extended-regexp
1067@cindex matching extended regular expressions
1068Interpret the pattern as an extended regular expression (ERE).
1069(@option{-E} is specified by POSIX.)
1070
1071@item -F
1072@itemx --fixed-strings
1073@opindex -F
1074@opindex --fixed-strings
1075@cindex matching fixed strings
1076Interpret the pattern as a list of fixed strings, separated
1077by newlines, any of which is to be matched.
1078(@option{-F} is specified by POSIX.)
1079
1080@item -P
1081@itemx --perl-regexp
1082@opindex -P
1083@opindex --perl-regexp
1084@cindex matching Perl regular expressions
1085Interpret the pattern as a Perl regular expression.
1086This is highly experimental and
1087@samp{grep@ -P} may warn of unimplemented features.
1088
1089@end table
1090
1091In addition,
1092two variant programs @command{egrep} and @command{fgrep} are available.
1093@command{egrep} is the same as @samp{grep@ -E}.
1094@command{fgrep} is the same as @samp{grep@ -F}.
1095Direct invocation as either
1096@command{egrep} or @command{fgrep} is deprecated,
1097but is provided to allow historical applications
1098that rely on them to run unmodified.
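
As a small sketch of the @option{-F} variant described above,
the following passes two fixed strings separated by a newline.
The sample input and the variable names are made up for this example.

@example
# Two fixed strings, one per line; neither is treated as a regexp,
# so '.' and '*' stand only for themselves.
patterns='a.c
x*y'
fixed=$(printf 'a.c\nabc\nx*y\nxxy\n' | grep -F -e "$patterns")
printf '%s\n' "$fixed"
@end example

@noindent
Only the lines @samp{a.c} and @samp{x*y} should be printed,
because @samp{abc} and @samp{xxy} do not contain either string literally.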
1099
1100
1101@node Regular Expressions
1102@chapter Regular Expressions
1103@cindex regular expressions
1104
1105A @dfn{regular expression} is a pattern that describes a set of strings.
1106Regular expressions are constructed analogously to arithmetic expressions,
1107by using various operators to combine smaller expressions.
1108@command{grep} understands
1109three different versions of regular expression syntax:
``basic'' (BRE), ``extended'' (ERE), and ``perl''.
1111In GNU @command{grep},
1112there is no difference in available functionality between the basic and
1113extended syntaxes.
1114In other implementations, basic regular expressions are less powerful.
1115The following description applies to extended regular expressions;
1116differences for basic regular expressions are summarized afterwards.
1117Perl regular expressions give additional functionality, and are
1118documented in the @i{pcresyntax}(3) and @i{pcrepattern}(3) manual pages,
1119but may not be available on every system.
1120
1121@menu
1122* Fundamental Structure::
1123* Character Classes and Bracket Expressions::
1124* The Backslash Character and Special Expressions::
1125* Anchoring::
1126* Back-references and Subexpressions::
1127* Basic vs Extended::
1128@end menu
1129
1130@node Fundamental Structure
1131@section Fundamental Structure
1132
1133The fundamental building blocks are the regular expressions that match
1134a single character.
1135Most characters, including all letters and digits,
1136are regular expressions that match themselves.
1137Any meta-character
1138with special meaning may be quoted by preceding it with a backslash.
1139
The table below lists the period, which matches any single character,
together with the repetition operators that may follow a regular expression:
1142
1143@table @samp
1144
1145@item .
1146@opindex .
1147@cindex dot
1148@cindex period
1149The period @samp{.} matches any single character.
1150
1151@item ?
1152@opindex ?
1153@cindex question mark
1154@cindex match expression at most once
1155The preceding item is optional and will be matched at most once.
1156
1157@item *
1158@opindex *
1159@cindex asterisk
1160@cindex match expression zero or more times
1161The preceding item will be matched zero or more times.
1162
1163@item +
1164@opindex +
1165@cindex plus sign
1166@cindex match expression one or more times
1167The preceding item will be matched one or more times.
1168
1169@item @{@var{n}@}
1170@opindex @{@var{n}@}
1171@cindex braces, one argument
1172@cindex match expression @var{n} times
1173The preceding item is matched exactly @var{n} times.
1174
1175@item @{@var{n},@}
1176@opindex @{@var{n},@}
1177@cindex braces, second argument omitted
1178@cindex match expression @var{n} or more times
1179The preceding item is matched @var{n} or more times.
1180
1181@item @{,@var{m}@}
1182@opindex @{,@var{m}@}
1183@cindex braces, first argument omitted
1184@cindex match expression at most @var{m} times
1185The preceding item is matched at most @var{m} times.
1186This is a GNU extension.
1187
1188@item @{@var{n},@var{m}@}
1189@opindex @{@var{n},@var{m}@}
1190@cindex braces, two arguments
1191@cindex match expression from @var{n} to @var{m} times
1192The preceding item is matched at least @var{n} times, but not more than
1193@var{m} times.
1194
1195@end table
1196
1197The empty regular expression matches the empty string.
1198Two regular expressions may be concatenated;
1199the resulting regular expression
1200matches any string formed by concatenating two substrings
1201that respectively match the concatenated expressions.
1202
1203Two regular expressions may be joined by the infix operator @samp{|};
1204the resulting regular expression
1205matches any string matching either alternate expression.
1206
1207Repetition takes precedence over concatenation,
1208which in turn takes precedence over alternation.
1209A whole expression may be enclosed in parentheses
1210to override these precedence rules and form a subexpression.
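
The precedence rules above can be checked with a few small searches.
This is only a sketch: the input lines are invented, and
@option{-x} is used so that each pattern must match a whole line.

@example
# Alternation binds loosest: 'ab|cd' means (ab) or (cd).
loose=$(printf 'ab\nacd\nabd\n' | grep -E -x 'ab|cd')

# Parentheses limit the alternation to the middle character.
grouped=$(printf 'ab\nacd\nabd\n' | grep -E -x 'a(b|c)d')

# Repetition binds tightest: 'ab*' is 'a' followed by zero or more 'b'.
repeated=$(printf 'a\nabb\nabab\n' | grep -E -x 'ab*')

printf '%s\n' "$loose" "$grouped" "$repeated"
@end example

@noindent
The first search should select only @samp{ab},
the second @samp{acd} and @samp{abd},
and the third @samp{a} and @samp{abb}.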
1211
1212@node Character Classes and Bracket Expressions
1213@section Character Classes and Bracket Expressions
1214
1215@cindex bracket expression
1216@cindex character class
1217A @dfn{bracket expression} is a list of characters enclosed by @samp{[} and
1218@samp{]}.
1219It matches any single character in that list;
1220if the first character of the list is the caret @samp{^},
1221then it matches any character @strong{not} in the list.
1222For example, the regular expression
1223@samp{[0123456789]} matches any single digit.
1224
1225@cindex range expression
1226Within a bracket expression, a @dfn{range expression} consists of two
1227characters separated by a hyphen.
1228It matches any single character that
1229sorts between the two characters, inclusive.
1230In the default C locale, the sorting sequence is the native character
1231order; for example, @samp{[a-d]} is equivalent to @samp{[abcd]}.
1232In other locales, the sorting sequence is not specified, and
1233@samp{[a-d]} might be equivalent to @samp{[abcd]} or to
1234@samp{[aBbCcDd]}, or it might fail to match any character, or the set of
1235characters that it matches might even be erratic.
1236To obtain the traditional interpretation
1237of bracket expressions, you can use the @samp{C} locale by setting the
1238@env{LC_ALL} environment variable to the value @samp{C}.
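
For instance, the following sketch forces the @samp{C} locale for a
single command, so that the range is interpreted in native character order.
The sample input is invented for this illustration.

@example
# In the C locale, [a-d] is exactly the letters a, b, c and d.
ranged=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C grep '^[a-d]')
printf '%s\n' "$ranged"
@end example

@noindent
This should print @samp{apple} and @samp{cherry} but not @samp{Banana},
since upper-case @samp{B} lies outside the range in the @samp{C} locale.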
1239
1240Finally, certain named classes of characters are predefined within
1241bracket expressions, as follows.
1242Their interpretation depends on the @code{LC_CTYPE} locale;
1243for example, @samp{[[:alnum:]]} means the character class of numbers and letters
1244in the current locale.
1245
1246@cindex classes of characters
1247@cindex character classes
1248@table @samp
1249
1250@item [:alnum:]
1251@opindex alnum @r{character class}
1252@cindex alphanumeric characters
1253Alphanumeric characters:
1254@samp{[:alpha:]} and @samp{[:digit:]}; in the @samp{C} locale and ASCII character encoding, this is the same as @samp{[0-9A-Za-z]}.
1255
1256@item [:alpha:]
1257@opindex alpha @r{character class}
1258@cindex alphabetic characters
1259Alphabetic characters:
1260@samp{[:lower:]} and @samp{[:upper:]}; in the @samp{C} locale and ASCII character encoding, this is the same as @samp{[A-Za-z]}.
1261
1262@item [:blank:]
1263@opindex blank @r{character class}
1264@cindex blank characters
1265Blank characters:
1266space and tab.
1267
1268@item [:cntrl:]
1269@opindex cntrl @r{character class}
1270@cindex control characters
1271Control characters.
1272In ASCII, these characters have octal codes 000
1273through 037, and 177 (@code{DEL}).
1274In other character sets, these are
1275the equivalent characters, if any.
1276
1277@item [:digit:]
1278@opindex digit @r{character class}
1279@cindex digit characters
1280@cindex numeric characters
1281Digits: @code{0 1 2 3 4 5 6 7 8 9}.
1282
1283@item [:graph:]
1284@opindex graph @r{character class}
1285@cindex graphic characters
1286Graphical characters:
1287@samp{[:alnum:]} and @samp{[:punct:]}.
1288
1289@item [:lower:]
1290@opindex lower @r{character class}
1291@cindex lower-case letters
1292Lower-case letters; in the @samp{C} locale and ASCII character
1293encoding, this is
1294@code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
1295
1296@item [:print:]
1297@opindex print @r{character class}
1298@cindex printable characters
1299Printable characters:
1300@samp{[:alnum:]}, @samp{[:punct:]}, and space.
1301
1302@item [:punct:]
1303@opindex punct @r{character class}
1304@cindex punctuation characters
1305Punctuation characters; in the @samp{C} locale and ASCII character
1306encoding, this is
1307@code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
1308
1309@item [:space:]
1310@opindex space @r{character class}
1311@cindex space characters
1312@cindex whitespace characters
1313Space characters: in the @samp{C} locale, this is
1314tab, newline, vertical tab, form feed, carriage return, and space.
1315@xref{Usage}, for more discussion of matching newlines.
1316
1317@item [:upper:]
1318@opindex upper @r{character class}
1319@cindex upper-case letters
1320Upper-case letters: in the @samp{C} locale and ASCII character
1321encoding, this is
1322@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
1323
1324@item [:xdigit:]
1325@opindex xdigit @r{character class}
1326@cindex xdigit class
1327@cindex hexadecimal digits
1328Hexadecimal digits:
1329@code{0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f}.
1330
1331@end table
1332Note that the brackets in these class names are
1333part of the symbolic names, and must be included in addition to
1334the brackets delimiting the bracket expression.
1335
1336@anchor{invalid-bracket-expr}
If you mistakenly omit the outer brackets and search for, say, @samp{[:upper:]},
1338GNU @command{grep} prints a diagnostic and exits with status 2, on
1339the assumption that you did not intend to search for the nominally
1340equivalent regular expression: @samp{[:epru]}.
1341Set the @code{POSIXLY_CORRECT} environment variable to disable this feature.
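
The difference between the two spellings can be sketched as follows,
assuming GNU @command{grep} and that @code{POSIXLY_CORRECT} is not set.
The sample input and variable names are invented for this example.

@example
# Correct: the class name sits inside a bracket expression.
upper=$(printf 'Hello\nworld\n' | grep '[[:upper:]]')
printf '%s\n' "$upper"

# Mistaken: the outer brackets are missing.  GNU grep is expected to
# print a diagnostic (discarded here) and exit with status 2.
status=0
printf 'Hello\n' | grep '[:upper:]' 2>/dev/null || status=$?
printf '%s\n' "$status"
@end example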
1342
1343Most meta-characters lose their special meaning inside bracket expressions.
1344
1345@table @samp
1346@item ]
1347ends the bracket expression if it's not the first list item.
1348So, if you want to make the @samp{]} character a list item,
1349you must put it first.
1350
1351@item [.
1352represents the open collating symbol.
1353
1354@item .]
1355represents the close collating symbol.
1356
1357@item [=
1358represents the open equivalence class.
1359
1360@item =]
1361represents the close equivalence class.
1362
1363@item [:
1364represents the open character class symbol, and should be followed by a valid character class name.
1365
1366@item :]
1367represents the close character class symbol.
1368
1369@item -
1370represents the range if it's not first or last in a list or the ending point
1371of a range.
1372
1373@item ^
1374represents the characters not in the list.
1375If you want to make the @samp{^}
1376character a list item, place it anywhere but first.
1377
1378@end table
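
As an illustration of these placement rules, the following sketch
builds a bracket expression whose list items are
@samp{]}, @samp{^}, and @samp{-}.
The sample input is invented for this example.

@example
# ']' comes first, '^' is not first, and '-' is last, so all three
# are ordinary list items here.
special=$(printf 'a]b\nc^d\ne-f\nghi\n' | grep '[]^-]')
printf '%s\n' "$special"
@end example

@noindent
The first three input lines should be printed;
@samp{ghi} contains none of the listed characters.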
1379
1380@node The Backslash Character and Special Expressions
1381@section The Backslash Character and Special Expressions
1382@cindex backslash
1383
1384The @samp{\} character,
1385when followed by certain ordinary characters,
1386takes a special meaning:
1387
1388@table @samp
1389
1390@item \b
1391Match the empty string at the edge of a word.
1392
1393@item \B
1394Match the empty string provided it's not at the edge of a word.
1395
1396@item \<
Match the empty string at the beginning of a word.

@item \>
Match the empty string at the end of a word.

@item \w
Match a word constituent; this is a synonym for @samp{[_[:alnum:]]}.

@item \W
Match a non-word constituent; this is a synonym for @samp{[^_[:alnum:]]}.

@item \s
Match whitespace; this is a synonym for @samp{[[:space:]]}.

@item \S
Match non-whitespace; this is a synonym for @samp{[^[:space:]]}.
1413
1414@end table
1415
For example, @samp{\brat\b} matches the separate word @samp{rat},
whereas @samp{\Brat\B} matches @samp{crate} but not @samp{furry rat}.
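
The word-edge operators can be tried out with a sketch like the following,
which assumes GNU @command{grep}; the sample input is invented.

@example
# 'rat' as a separate word.
words=$(printf 'rat\ncrate\nfurry rat\nratio\n' | grep '\brat\b')

# 'rat' strictly inside a longer word.
inner=$(printf 'rat\ncrate\nfurry rat\nratio\n' | grep '\Brat\B')

# Any word that starts with 'rat'.
starts=$(printf 'rat\ncrate\nfurry rat\nratio\n' | grep '\<rat')

printf '%s\n' "$words" "$inner" "$starts"
@end example

@noindent
The first search should select @samp{rat} and @samp{furry rat},
the second only @samp{crate},
and the third @samp{rat}, @samp{furry rat}, and @samp{ratio}.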
1418
1419@node Anchoring
1420@section Anchoring
1421@cindex anchoring
1422
1423The caret @samp{^} and the dollar sign @samp{$} are meta-characters that
1424respectively match the empty string at the beginning and end of a line.
1425They are termed @dfn{anchors}, since they force the match to be ``anchored''
to the beginning or end of a line, respectively.
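
A minimal sketch of both anchors, using invented input lines:

@example
# Lines that begin with 'foo'.
start=$(printf 'foo\nfoobar\nbarfoo\n' | grep '^foo')

# Lines that end with 'foo'.
finish=$(printf 'foo\nfoobar\nbarfoo\n' | grep 'foo$')

# Lines that consist of 'foo' and nothing else.
whole=$(printf 'foo\nfoobar\nbarfoo\n' | grep '^foo$')

printf '%s\n' "$start" "$finish" "$whole"
@end example

@noindent
The first search should select @samp{foo} and @samp{foobar},
the second @samp{foo} and @samp{barfoo},
and the third only @samp{foo}.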
1427
1428@node Back-references and Subexpressions
1429@section Back-references and Subexpressions
1430@cindex subexpression
1431@cindex back-reference
1432
1433The back-reference @samp{\@var{n}}, where @var{n} is a single digit, matches
1434the substring previously matched by the @var{n}th parenthesized subexpression
1435of the regular expression.
1436For example, @samp{(a)\1} matches @samp{aa}.
1437When used with alternation, if the group does not participate in the match then
1438the back-reference makes the whole match fail.
1439For example, @samp{a(.)|b\1}
1440will not match @samp{ba}.
1441When multiple regular expressions are given with
1442@option{-e} or from a file (@samp{-f @var{file}}),
1443back-references are local to each expression.
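
As a small sketch of a back-reference in use,
the following selects lines that consist of some text
immediately followed by a copy of itself.
The sample input is invented for this example.

@example
# \1 must repeat exactly what the parenthesized group matched.
doubled=$(printf 'aa\nab\nxyzxyz\n' | grep -E '^(.+)\1$')
printf '%s\n' "$doubled"
@end example

@noindent
This should print @samp{aa} and @samp{xyzxyz};
@samp{ab} is rejected because its second half differs from its first.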
1444
1445@node Basic vs Extended
1446@section Basic vs Extended Regular Expressions
1447@cindex basic regular expressions
1448
1449In basic regular expressions the meta-characters @samp{?}, @samp{+},
1450@samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
1451instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
1452@samp{\|}, @samp{\(}, and @samp{\)}.
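
The correspondence can be sketched with one search written both ways,
assuming GNU @command{grep}; the sample input is invented.

@example
# The same search written as a BRE and as an ERE.
bre=$(printf 'abc\nabbc\nac\nab+c\n' | grep 'ab\+c')
ere=$(printf 'abc\nabbc\nac\nab+c\n' | grep -E 'ab+c')

# Without the backslash, '+' is an ordinary character in a BRE.
literal=$(printf 'abc\nabbc\nac\nab+c\n' | grep 'ab+c')

printf '%s\n' "$bre" "$ere" "$literal"
@end example

@noindent
The first two searches should both select @samp{abc} and @samp{abbc},
while the third selects only the line @samp{ab+c}.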
1453
1454@cindex interval specifications
1455Traditional @command{egrep} did not support the @samp{@{} meta-character,
1456and some @command{egrep} implementations support @samp{\@{} instead, so
1457portable scripts should avoid @samp{@{} in @samp{grep@ -E} patterns and
1458should use @samp{[@{]} to match a literal @samp{@{}.
1459
1460GNU @command{grep@ -E} attempts to support traditional usage by
1461assuming that @samp{@{} is not special if it would be the start of an
1462invalid interval specification.
1463For example, the command
1464@samp{grep@ -E@ '@{1'} searches for the two-character string @samp{@{1}
1465instead of reporting a syntax error in the regular expression.
1466POSIX allows this behavior as an extension, but portable scripts
1467should avoid it.
1468
1469
1470@node Usage
1471@chapter Usage
1472
1473@cindex usage, examples
1474Here is an example command that invokes GNU @command{grep}:
1475
1476@example
1477grep -i 'hello.*world' menu.h main.c
1478@end example
1479
1480@noindent
1481This lists all lines in the files @file{menu.h} and @file{main.c} that
1482contain the string @samp{hello} followed by the string @samp{world};
1483this is because @samp{.*} matches zero or more characters within a line.
1484@xref{Regular Expressions}.
1485The @option{-i} option causes @command{grep}
1486to ignore case, causing it to match the line @samp{Hello, world!}, which
1487it would not otherwise match.
1488@xref{Invoking}, for more details about
1489how to invoke @command{grep}.
1490
1491@cindex using @command{grep}, Q&A
1492@cindex FAQ about @command{grep} usage
1493Here are some common questions and answers about @command{grep} usage.
1494
1495@enumerate
1496
1497@item
1498How can I list just the names of matching files?
1499
1500@example
1501grep -l 'main' *.c
1502@end example
1503
1504@noindent
1505lists the names of all C files in the current directory whose contents
1506mention @samp{main}.
1507
1508@item
1509How do I search directories recursively?
1510
1511@example
1512grep -r 'hello' /home/gigi
1513@end example
1514
1515@noindent
1516searches for @samp{hello} in all files
1517under the @file{/home/gigi} directory.
1518For more control over which files are searched,
1519use @command{find}, @command{grep}, and @command{xargs}.
1520For example, the following command searches only C files:
1521
1522@example
1523find /home/gigi -name '*.c' -print0 | xargs -0r grep -H 'hello'
1524@end example
1525
1526This differs from the command:
1527
1528@example
1529grep -H 'hello' *.c
1530@end example
1531
1532which merely looks for @samp{hello} in all files in the current
1533directory whose names end in @samp{.c}.
1534The @samp{find ...} command line above is more similar to the command:
1535
1536@example
1537grep -rH --include='*.c' 'hello' /home/gigi
1538@end example
1539
1540@item
1541What if a pattern has a leading @samp{-}?
1542
1543@example
1544grep -e '--cut here--' *
1545@end example
1546
1547@noindent
1548searches for all lines matching @samp{--cut here--}.
1549Without @option{-e},
1550@command{grep} would attempt to parse @samp{--cut here--} as a list of
1551options.
1552
1553@item
1554Suppose I want to search for a whole word, not a part of a word?
1555
1556@example
1557grep -w 'hello' *
1558@end example
1559
1560@noindent
1561searches only for instances of @samp{hello} that are entire words;
1562it does not match @samp{Othello}.
1563For more control, use @samp{\<} and
1564@samp{\>} to match the start and end of words.
1565For example:
1566
1567@example
1568grep 'hello\>' *
1569@end example
1570
1571@noindent
1572searches only for words ending in @samp{hello}, so it matches the word
1573@samp{Othello}.
1574
1575@item
1576How do I output context around the matching lines?
1577
1578@example
1579grep -C 2 'hello' *
1580@end example
1581
1582@noindent
1583prints two lines of context around each matching line.
1584
1585@item
1586How do I force @command{grep} to print the name of the file?
1587
1588Append @file{/dev/null}:
1589
1590@example
1591grep 'eli' /etc/passwd /dev/null
1592@end example
1593
1594gets you:
1595
1596@example
1597/etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
1598@end example
1599
1600Alternatively, use @option{-H}, which is a GNU extension:
1601
1602@example
1603grep -H 'eli' /etc/passwd
1604@end example
1605
1606@item
1607Why do people use strange regular expressions on @command{ps} output?
1608
1609@example
1610ps -ef | grep '[c]ron'
1611@end example
1612
1613If the pattern had been written without the square brackets, it would
1614have matched not only the @command{ps} output line for @command{cron},
1615but also the @command{ps} output line for @command{grep}.
1616Note that on some platforms,
1617@command{ps} limits the output to the width of the screen;
1618@command{grep} does not have any limit on the length of a line
1619except the available memory.
1620
1621@item
1622Why does @command{grep} report ``Binary file matches''?
1623
1624If @command{grep} listed all matching ``lines'' from a binary file, it
1625would probably generate output that is not useful, and it might even
1626muck up your display.
1627So GNU @command{grep} suppresses output from
1628files that appear to be binary files.
1629To force GNU @command{grep}
1630to output lines even from files that appear to be binary, use the
1631@option{-a} or @samp{--binary-files=text} option.
1632To eliminate the
1633``Binary file matches'' messages, use the @option{-I} or
1634@samp{--binary-files=without-match} option.
1635
1636@item
1637Why doesn't @samp{grep -lv} print non-matching file names?
1638
1639@samp{grep -lv} lists the names of all files containing one or more
1640lines that do not match.
1641To list the names of all files that contain no
1642matching lines, use the @option{-L} or @option{--files-without-match}
1643option.
1644
1645@item
1646I can do ``OR'' with @samp{|}, but what about ``AND''?
1647
1648@example
1649grep 'paul' /etc/motd | grep 'franc,ois'
1650@end example
1651
1652@noindent
1653finds all lines that contain both @samp{paul} and @samp{franc,ois}.
1654
1655@item
1656Why does the empty pattern match every input line?
1657
1658The @command{grep} command searches for lines that contain strings
1659that match a pattern.  Every line contains the empty string, so an
1660empty pattern causes @command{grep} to find a match on each line.  It
1661is not the only such pattern: @samp{^}, @samp{$}, @samp{.*}, and many
1662other patterns cause @command{grep} to match every line.
1663
1664To match empty lines, use the pattern @samp{^$}.  To match blank
1665lines, use the pattern @samp{^[[:blank:]]*$}.  To match no lines at
1666all, use the command @samp{grep -f /dev/null}.
1667
1668@item
1669How can I search in both standard input and in files?
1670
1671Use the special file name @samp{-}:
1672
1673@example
1674cat /etc/passwd | grep 'alain' - /etc/motd
1675@end example
1676
1677@item
1678@cindex palindromes
1679How to express palindromes in a regular expression?
1680
1681It can be done by using back-references;
1682for example,
a palindrome of 5 characters can be written with a BRE:

@example
grep -w -e '\(.\)\(.\).\2\1' file
@end example

It matches words such as ``radar'' or ``civic''.
1690
1691Guglielmo Bondioni proposed a single RE
1692that finds all palindromes up to 19 characters long
1693using @w{9 subexpressions} and @w{9 back-references}:
1694
1695@smallexample
1696grep -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file
1697@end smallexample
1698
1699Note this is done by using GNU ERE extensions;
1700it might not be portable to other implementations of @command{grep}.
1701
1702@item
1703Why is this back-reference failing?
1704
1705@example
1706echo 'ba' | grep -E '(a)\1|b\1'
1707@end example
1708
1709This gives no output, because the first alternate @samp{(a)\1} does not match,
1710as there is no @samp{aa} in the input, so the @samp{\1} in the second alternate
1711has nothing to refer back to, meaning it will never match anything.
1712(The second alternate in this example can only match
1713if the first alternate has matched---making the second one superfluous.)
1714
1715@item
1716How can I match across lines?
1717
Standard @command{grep} cannot do this, as it is fundamentally line-based.
Therefore, merely using the @code{[:space:]} character class does not
match newlines in the way you might expect.

With the GNU @command{grep} option @option{-z} (@pxref{File and
Directory Selection}), input lines are terminated by null bytes
instead of newlines.  Thus,
you can match newlines in the input, but typically if there is a match
the entire input is output, so this usage is often combined with
output-suppressing options like @option{-q}, e.g.:
1727
1728@example
1729printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'
1730@end example
1731
1732If this does not suffice, you can transform the input
1733before giving it to @command{grep}, or turn to @command{awk},
1734@command{sed}, @command{perl}, or many other utilities that are
1735designed to operate across lines.
1736
1737@item
1738What do @command{grep}, @command{fgrep}, and @command{egrep} stand for?
1739
1740The name @command{grep} comes from the way line editing was done on Unix.
1741For example,
1742@command{ed} uses the following syntax
1743to print a list of matching lines on the screen:
1744
1745@example
1746global/regular expression/print
1747g/re/p
1748@end example
1749
1750@command{fgrep} stands for Fixed @command{grep};
1751@command{egrep} stands for Extended @command{grep}.
1752
1753@end enumerate
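
To make the difference between @samp{grep -lv} and @option{-L}
from the list above concrete, here is a sketch that runs both
on three small files.
The scratch directory and the file names @file{mixed.txt},
@file{none.txt}, and @file{all.txt} are hypothetical,
created only for this illustration.

@example
# Work in a scratch directory holding three small sample files.
dir=$(mktemp -d)
printf 'hello\nworld\n' > "$dir/mixed.txt"
printf 'world\n' > "$dir/none.txt"
printf 'hello\n' > "$dir/all.txt"

# Files with at least one line that does not match.
some=$(cd "$dir" && grep -lv 'hello' mixed.txt none.txt all.txt)

# Files with no matching line at all.
without=$(cd "$dir" && grep -L 'hello' mixed.txt none.txt all.txt)

printf '%s\n' "$some" "$without"
rm -r "$dir"
@end example

@noindent
The first search should list @file{mixed.txt} and @file{none.txt},
whereas the second should list only @file{none.txt}.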
1754
1755
1756@node Reporting Bugs
@chapter Reporting Bugs
1758
1759@cindex bugs, reporting
1760Email bug reports to @email{bug-grep@@gnu.org},
1761a mailing list whose web page is
1762@url{http://lists.gnu.org/mailman/listinfo/bug-grep}.
1763The Savannah bug tracker for @command{grep} is located at
1764@url{http://savannah.gnu.org/bugs/?group=grep}.
1765
1766@section Known Bugs
1767@cindex Bugs, known
1768
1769Large repetition counts in the @samp{@{n,m@}} construct may cause
1770@command{grep} to use lots of memory.
1771In addition, certain other
1772obscure regular expressions require exponential time and
1773space, and may cause @command{grep} to run out of memory.
1774
1775Back-references are very slow, and may require exponential time.
1776
1777
1778@node Copying
1779@chapter Copying
1780@cindex copying
1781
1782GNU @command{grep} is licensed under the GNU GPL, which makes it @dfn{free
1783software}.
1784
1785The ``free'' in ``free software'' refers to liberty, not price. As
1786some GNU project advocates like to point out, think of ``free speech''
1787rather than ``free beer''.  In short, you have the right (freedom) to
1788run and change @command{grep} and distribute it to other people, and---if you
1789want---charge money for doing either.  The important restriction is
1790that you have to grant your recipients the same rights and impose the
1791same restrictions.
1792
1793This general method of licensing software is sometimes called
1794@dfn{open source}.  The GNU project prefers the term ``free software''
1795for reasons outlined at
1796@url{http://www.gnu.org/philosophy/open-source-misses-the-point.html}.
1797
1798This manual is free documentation in the same sense.  The
1799documentation license is included below.  The license for the program
1800is available with the source code, or at
1801@url{http://www.gnu.org/licenses/gpl.html}.
1802
1803@menu
1804* GNU Free Documentation License::
1805@end menu
1806
1807@node GNU Free Documentation License
1808@section GNU Free Documentation License
1809
1810@include fdl.texi
1811
1812
1813@node Index
1814@unnumbered Index
1815
1816@printindex cp
1817
1818@bye
1819