1\input texinfo  @c -*-texinfo-*-
2@c %**start of header
3@setfilename grep.info
4@include version.texi
5@settitle GNU Grep @value{VERSION}
6
7@c Combine indices.
8@syncodeindex ky cp
9@syncodeindex pg cp
10@syncodeindex tp cp
11@defcodeindex op
12@syncodeindex op cp
13@syncodeindex vr cp
14@c %**end of header
15
16@copying
17This manual is for @command{grep}, a pattern matching engine.
18
19Copyright @copyright{} 1999-2002, 2005, 2008-2012 Free Software Foundation,
20Inc.
21
22@quotation
23Permission is granted to copy, distribute and/or modify this document
24under the terms of the GNU Free Documentation License, Version 1.3 or
25any later version published by the Free Software Foundation; with no
26Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
27Texts.  A copy of the license is included in the section entitled
28``GNU Free Documentation License''.
29@end quotation
30@end copying
31
32@dircategory Text creation and manipulation
33@direntry
34* grep: (grep).                 Print lines matching a pattern.
35@end direntry
36
37@titlepage
38@title GNU Grep: Print lines matching a pattern
39@subtitle version @value{VERSION}, @value{UPDATED}
40@author Alain Magloire et al.
41@page
42@vskip 0pt plus 1filll
43@insertcopying
44@end titlepage
45
46@contents
47
48
49@ifnottex
50@node Top
51@top grep
52
53@command{grep} prints lines that contain a match for a pattern.
54
55This manual is for version @value{VERSION} of GNU Grep.
56
57@insertcopying
58@end ifnottex
59
60@menu
61* Introduction::                Introduction.
62* Invoking::                    Command-line options, environment, exit status.
63* Regular Expressions::         Regular Expressions.
64* Usage::                       Examples.
65* Reporting Bugs::              Reporting Bugs.
66* Copying::                     License terms for this manual.
67* Index::                       Combined index.
68@end menu
69
70
71@node Introduction
72@chapter Introduction
73
74@cindex searching for a pattern
75
76@command{grep} searches input files
77for lines containing a match to a given pattern list.
78When it finds a match in a line,
79it copies the line to standard output (by default),
80or produces whatever other sort of output you have requested with options.
81
82Though @command{grep} expects to do the matching on text,
83it has no limits on input line length other than available memory,
84and it can match arbitrary characters within a line.
85If the final byte of an input file is not a newline,
86@command{grep} silently supplies one.
87Since newline is also a separator for the list of patterns,
88there is no way to match newline characters in a text.
89
90
91@node Invoking
92@chapter Invoking @command{grep}
93
94The general synopsis of the @command{grep} command line is
95
96@example
97grep @var{options} @var{pattern} @var{input_file_names}
98@end example
99
100@noindent
101There can be zero or more @var{options}.
102@var{pattern} will only be seen as such
103(and not as an @var{input_file_name})
104if it wasn't already specified within @var{options}
105(by using the @samp{-e@ @var{pattern}}
106or @samp{-f@ @var{file}} options).
107There can be zero or more @var{input_file_names}.
108
109@menu
110* Command-line Options::        Short and long names, grouped by category.
111* Environment Variables::       POSIX, GNU generic, and GNU grep specific.
112* Exit Status::                 Exit status returned by @command{grep}.
113* grep Programs::               @command{grep} programs.
114@end menu
115
116@node Command-line Options
117@section Command-line Options
118
119@command{grep} comes with a rich set of options:
120some from POSIX and some being GNU extensions.
121Long option names are always a GNU extension,
122even for options that are from POSIX specifications.
123Options that are specified by POSIX,
124under their short names,
125are explicitly marked as such
126to facilitate POSIX-portable programming.
127A few option names are provided
128for compatibility with older or more exotic implementations.
129
130@menu
131* Generic Program Information::
132* Matching Control::
133* General Output Control::
134* Output Line Prefix Control::
135* Context Line Control::
136* File and Directory Selection::
137* Other Options::
138@end menu
139
140Several additional options control
141which variant of the @command{grep} matching engine is used.
142@xref{grep Programs}.
143
144@node Generic Program Information
145@subsection Generic Program Information
146
147@table @option
148
149@item --help
150@opindex --help
151@cindex usage summary, printing
152Print a usage message briefly summarizing the command-line options
153and the bug-reporting address, then exit.
154
155@item -V
156@itemx --version
157@opindex -V
158@opindex --version
159@cindex version, printing
160Print the version number of @command{grep} to the standard output stream.
161This version number should be included in all bug reports.
162
163@end table
164
165@node Matching Control
166@subsection Matching Control
167
168@table @option
169
170@item -e @var{pattern}
171@itemx --regexp=@var{pattern}
172@opindex -e
173@opindex --regexp=@var{pattern}
174@cindex pattern list
175Use @var{pattern} as the pattern.
176This can be used to specify multiple search patterns,
177or to protect a pattern beginning with a @samp{-}.
178(@option{-e} is specified by POSIX.)
179
180@item -f @var{file}
181@itemx --file=@var{file}
182@opindex -f
183@opindex --file
184@cindex pattern from file
185Obtain patterns from @var{file}, one per line.
186The empty file contains zero patterns, and therefore matches nothing.
187(@option{-f} is specified by POSIX.)
188
189@item -i
190@itemx -y
191@itemx --ignore-case
192@opindex -i
193@opindex -y
194@opindex --ignore-case
195@cindex case insensitive search
196Ignore case distinctions in both the pattern and the input files.
197@option{-y} is an obsolete synonym that is provided for compatibility.
198(@option{-i} is specified by POSIX.)
199
200@item -v
201@itemx --invert-match
202@opindex -v
203@opindex --invert-match
204@cindex invert matching
205@cindex print non-matching lines
206Invert the sense of matching, to select non-matching lines.
207(@option{-v} is specified by POSIX.)
208
209@item -w
210@itemx --word-regexp
211@opindex -w
212@opindex --word-regexp
213@cindex matching whole words
214Select only those lines containing matches that form whole words.
The test is that the matching substring must either
be at the beginning of the line,
or preceded by a non-word-constituent character.
Similarly,
it must be either at the end of the line
or followed by a non-word-constituent character.
221Word-constituent characters are letters, digits, and the underscore.
222
223@item -x
224@itemx --line-regexp
225@opindex -x
226@opindex --line-regexp
227@cindex match the whole line
228Select only those matches that exactly match the whole line.
229(@option{-x} is specified by POSIX.)
230
231@end table
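
As a rough illustration of the matching-control options above,
the following sketch runs @command{grep} on three made-up input lines.
The sample text and the shell variable names are assumptions
chosen for this example only.

@example
# Hypothetical sample input: three short lines.
input='foo bar
foobar
-foo'

# -w: 'foo' must be bounded by non-word characters or line ends.
words=$(printf '%s\n' "$input" | grep -w foo)

# -x: the pattern must match the entire line.
whole=$(printf '%s\n' "$input" | grep -x foobar)

# -e protects a pattern that begins with '-'.
dashed=$(printf '%s\n' "$input" | grep -e -foo)

# -v selects the lines that do not match.
others=$(printf '%s\n' "$input" | grep -v bar)

printf '%s\n' "$words" "$whole" "$dashed" "$others"
@end example

@noindent
Here @option{-w} should select @samp{foo bar} and @samp{-foo}
but not @samp{foobar}, because in @samp{foobar} the match is followed
by a word-constituent character.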
232
233@node General Output Control
234@subsection General Output Control
235
236@table @option
237
238@item -c
239@itemx --count
240@opindex -c
241@opindex --count
242@cindex counting lines
243Suppress normal output;
244instead print a count of matching lines for each input file.
245With the @option{-v} (@option{--invert-match}) option,
246count non-matching lines.
247(@option{-c} is specified by POSIX.)
248
249@item --color[=@var{WHEN}]
250@itemx --colour[=@var{WHEN}]
251@opindex --color
252@opindex --colour
253@cindex highlight, color, colour
254Surround the matched (non-empty) strings, matching lines, context lines,
255file names, line numbers, byte offsets, and separators (for fields and
256groups of context lines) with escape sequences to display them in color
257on the terminal.
258The colors are defined by the environment variable @env{GREP_COLORS}
259and default to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
260for bold red matched text, magenta file names, green line numbers,
261green byte offsets, cyan separators, and default terminal colors otherwise.
262The deprecated environment variable @env{GREP_COLOR} is still supported,
263but its setting does not have priority;
it defaults to @samp{01;31} (bold red)
265which only covers the color for matched text.
266@var{WHEN} is @samp{never}, @samp{always}, or @samp{auto}.
267
268@item -L
269@itemx --files-without-match
270@opindex -L
271@opindex --files-without-match
272@cindex files which don't match
273Suppress normal output;
274instead print the name of each input file from which
275no output would normally have been printed.
276The scanning of each file stops on the first match.
277
278@item -l
279@itemx --files-with-matches
280@opindex -l
281@opindex --files-with-matches
282@cindex names of matching files
283Suppress normal output;
284instead print the name of each input file from which
285output would normally have been printed.
286The scanning of each file stops on the first match.
287(@option{-l} is specified by POSIX.)
288
289@item -m @var{num}
290@itemx --max-count=@var{num}
291@opindex -m
292@opindex --max-count
293@cindex max-count
294Stop reading a file after @var{num} matching lines.
295If the input is standard input from a regular file,
296and @var{num} matching lines are output,
297@command{grep} ensures that the standard input is positioned
298just after the last matching line before exiting,
299regardless of the presence of trailing context lines.
300This enables a calling process to resume a search.
301For example, the following shell script makes use of it:
302
303@example
304while grep -m 1 PATTERN
305do
306  echo xxxx
307done < FILE
308@end example
309
310But the following probably will not work because a pipe is not a regular
311file:
312
313@example
314# This probably will not work.
315cat FILE |
316while grep -m 1 PATTERN
317do
318  echo xxxx
319done
320@end example
321
322When @command{grep} stops after @var{num} matching lines,
323it outputs any trailing context lines.
324Since context does not include matching lines,
325@command{grep} will stop when it encounters another matching line.
326When the @option{-c} or @option{--count} option is also used,
327@command{grep} does not output a count greater than @var{num}.
328When the @option{-v} or @option{--invert-match} option is also used,
329@command{grep} stops after outputting @var{num} non-matching lines.
330
331@item -o
332@itemx --only-matching
333@opindex -o
334@opindex --only-matching
335@cindex only matching
336Print only the matched (non-empty) parts of matching lines,
337with each such part on a separate output line.
338
339@item -q
340@itemx --quiet
341@itemx --silent
342@opindex -q
343@opindex --quiet
344@opindex --silent
345@cindex quiet, silent
346Quiet; do not write anything to standard output.
347Exit immediately with zero status if any match is found,
348even if an error was detected.
349Also see the @option{-s} or @option{--no-messages} option.
350(@option{-q} is specified by POSIX.)
351
352@item -s
353@itemx --no-messages
354@opindex -s
355@opindex --no-messages
356@cindex suppress error messages
357Suppress error messages about nonexistent or unreadable files.
358Portability note:
359unlike GNU @command{grep},
3607th Edition Unix @command{grep} did not conform to POSIX,
361because it lacked @option{-q}
362and its @option{-s} option behaved like
363GNU @command{grep}'s @option{-q} option.@footnote{Of course, 7th Edition
364Unix predated POSIX by several years!}
365USG-style @command{grep} also lacked @option{-q}
366but its @option{-s} option behaved like GNU @command{grep}'s.
367Portable shell scripts should avoid both
368@option{-q} and @option{-s} and should redirect
369standard and error output to @file{/dev/null} instead.
370(@option{-s} is specified by POSIX.)
371
372@end table
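
The output-control options above can be compared side by side
with a small sketch such as the following.
The input lines and variable names are invented for illustration.

@example
# Hypothetical sample input.
input='apple pie
banana
apple tart'

# -c prints the number of matching lines instead of the lines.
count=$(printf '%s\n' "$input" | grep -c apple)

# -o prints each matched part on its own output line.
parts=$(printf '%s\n' "$input" | grep -o an)

# -m 1 stops reading after the first matching line.
first=$(printf '%s\n' "$input" | grep -m 1 apple)

# -q prints nothing; only the exit status reports the result.
if printf '%s\n' "$input" | grep -q banana
then found=yes
else found=no
fi

printf '%s\n' "$count" "$parts" "$first" "$found"
@end example

@noindent
Note that @option{-o} reports @samp{an} twice for @samp{banana},
once for each non-overlapping match in the line.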
373
374@node Output Line Prefix Control
375@subsection Output Line Prefix Control
376
377When several prefix fields are to be output,
378the order is always file name, line number, and byte offset,
379regardless of the order in which these options were specified.
380
381@table @option
382
383@item -b
384@itemx --byte-offset
385@opindex -b
386@opindex --byte-offset
387@cindex byte offset
388Print the 0-based byte offset within the input file
389before each line of output.
390If @option{-o} (@option{--only-matching}) is specified,
391print the offset of the matching part itself.
392When @command{grep} runs on MS-DOS or MS-Windows,
393the printed byte offsets depend on whether
394the @option{-u} (@option{--unix-byte-offsets}) option is used;
395see below.
396
397@item -H
398@itemx --with-filename
399@opindex -H
400@opindex --with-filename
401@cindex with filename prefix
402Print the file name for each match.
403This is the default when there is more than one file to search.
404
405@item -h
406@itemx --no-filename
407@opindex -h
408@opindex --no-filename
409@cindex no filename prefix
410Suppress the prefixing of file names on output.
411This is the default when there is only one file
412(or only standard input) to search.
413
414@item --label=@var{LABEL}
415@opindex --label
416@cindex changing name of standard input
417Display input actually coming from standard input
418as input coming from file @var{LABEL}.  This is
419especially useful when implementing tools like
420@command{zgrep}; e.g.:
421
422@example
423gzip -cd foo.gz | grep --label=foo -H something
424@end example
425
426@item -n
427@itemx --line-number
428@opindex -n
429@opindex --line-number
430@cindex line numbering
431Prefix each line of output with the 1-based line number within its input file.
432(@option{-n} is specified by POSIX.)
433
434@item -T
435@itemx --initial-tab
436@opindex -T
437@opindex --initial-tab
438@cindex tab-aligned content lines
439Make sure that the first character of actual line content lies on a tab stop,
440so that the alignment of tabs looks normal.
441This is useful with options that prefix their output to the actual content:
442@option{-H}, @option{-n}, and @option{-b}.
443In order to improve the probability that lines
444from a single file will all start at the same column,
445this also causes the line number and byte offset (if present)
446to be printed in a minimum-size field width.
447
448@item -u
449@itemx --unix-byte-offsets
450@opindex -u
451@opindex --unix-byte-offsets
452@cindex MS-DOS/MS-Windows byte offsets
453@cindex byte offsets, on MS-DOS/MS-Windows
454Report Unix-style byte offsets.
455This option causes @command{grep} to report byte offsets
456as if the file were a Unix-style text file,
457i.e., the byte offsets ignore the @code{CR} characters that were stripped.
458This will produce results identical
459to running @command{grep} on a Unix machine.
460This option has no effect unless the @option{-b} option is also used;
461it has no effect on platforms other than MS-DOS and MS-Windows.
462
463@item -Z
464@itemx --null
465@opindex -Z
466@opindex --null
467@cindex zero-terminated file names
468Output a zero byte (the ASCII @code{NUL} character)
469instead of the character that normally follows a file name.
470For example,
471@samp{grep -lZ} outputs a zero byte after each file name
472instead of the usual newline.
473This option makes the output unambiguous,
474even in the presence of file names containing unusual characters like newlines.
475This option can be used with commands like
476@samp{find -print0}, @samp{perl -0}, @samp{sort -z}, and @samp{xargs -0}
477to process arbitrary file names,
478even those that contain newline characters.
479
480@end table
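
The fixed order of prefix fields described above
(file name, line number, byte offset)
can be observed with a sketch like the one below.
The input and the label @samp{greek} are made up for this example.

@example
# Hypothetical sample input; 'alpha' plus newline is 6 bytes,
# 'beta' plus newline is 5, so 'gamma' starts at byte offset 11.
input='alpha
beta
gamma'

# -n prefixes the 1-based line number, -b the 0-based byte offset.
numbered=$(printf '%s\n' "$input" | grep -n -b gamma)

# --label names standard input; -H forces the file name prefix.
labeled=$(printf '%s\n' "$input" | grep --label=greek -H beta)

printf '%s\n' "$numbered" "$labeled"
@end example

@noindent
The line number appears before the byte offset
even if @option{-b} is given before @option{-n}.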
481
482@node Context Line Control
483@subsection Context Line Control
484
485Regardless of how these options are set,
486@command{grep} will never print any given line more than once.
487If the @option{-o} (@option{--only-matching}) option is specified,
488these options have no effect and a warning is given upon their use.
489
490@table @option
491
492@item -A @var{num}
493@itemx --after-context=@var{num}
494@opindex -A
495@opindex --after-context
496@cindex after context
497@cindex context lines, after match
498Print @var{num} lines of trailing context after matching lines.
499
500@item -B @var{num}
501@itemx --before-context=@var{num}
502@opindex -B
503@opindex --before-context
504@cindex before context
505@cindex context lines, before match
506Print @var{num} lines of leading context before matching lines.
507
508@item -C @var{num}
509@itemx -@var{num}
510@itemx --context=@var{num}
511@opindex -C
512@opindex --context
513@opindex -@var{num}
514@cindex context
515Print @var{num} lines of leading and trailing output context.
516
517@item --group-separator=@var{string}
518@opindex --group-separator
519@cindex group separator
When @option{-A}, @option{-B} or @option{-C} is in use,
print @var{string} instead of @samp{--} between disjoint groups
of lines.
523
524@item --no-group-separator
@opindex --no-group-separator
@cindex group separator
When @option{-A}, @option{-B} or @option{-C} is in use,
do not print a separator between disjoint groups of lines.
529
530@end table
531
532Here are some points about how @command{grep} chooses
533the separator to print between prefix fields and line content:
534
535@itemize @bullet
536@item
537Matching lines normally use @samp{:} as a separator
538between prefix fields and actual line content.
539
540@item
541Context (i.e., non-matching) lines use @samp{-} instead.
542
543@item
544When no context is specified,
545matching lines are simply output one right after another.
546
547@item
548When nonzero context is specified,
549lines that are adjacent in the input form a group
550and are output one right after another, while
551a separator appears by default between disjoint groups on a line
552of its own and without any prefix.
553
554@item
The default separator is @samp{--};
whether it is printed, and what it looks like,
can be changed with the options above.
558
559@item
A group may contain several matching lines:
when matches are close enough to each other
that their context lines touch or overlap,
the groups merge into a single contiguous one.
564@end itemize
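
The separator rules listed above can be seen in a short sketch.
The input here (the digits 1 through 9, one per line)
and the variable names are assumptions made for the example.

@example
# Hypothetical input: the digits 1 through 9, one per line.
input=$(printf '%s\n' 1 2 3 4 5 6 7 8 9)

# Lines 2 and 8 match; -C 1 adds one context line on each side,
# and -n shows which separator follows the line number.
context=$(printf '%s\n' "$input" | grep -n -C 1 -e 2 -e 8)

printf '%s\n' "$context"
@end example

@noindent
With GNU @command{grep} this should print two groups,
lines 1 to 3 and lines 7 to 9,
with @samp{:} after the line number of each matching line,
@samp{-} after the line number of each context line,
and a line containing only @samp{--} between the groups.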
565
566@node File and Directory Selection
567@subsection File and Directory Selection
568
569@table @option
570
571@item -a
572@itemx --text
573@opindex -a
574@opindex --text
575@cindex suppress binary data
576@cindex binary files
577Process a binary file as if it were text;
578this is equivalent to the @samp{--binary-files=text} option.
579
580@item --binary-files=@var{type}
581@opindex --binary-files
582@cindex binary files
583If the first few bytes of a file indicate that the file contains binary data,
584assume that the file is of type @var{type}.
585By default, @var{type} is @samp{binary},
586and @command{grep} normally outputs either
587a one-line message saying that a binary file matches,
588or no message if there is no match.
589
590If @var{type} is @samp{without-match},
591@command{grep} assumes that a binary file does not match;
592this is equivalent to the @option{-I} option.
593
594If @var{type} is @samp{text},
595@command{grep} processes a binary file as if it were text;
596this is equivalent to the @option{-a} option.
597
598@emph{Warning:} @samp{--binary-files=text} might output binary garbage,
599which can have nasty side effects
600if the output is a terminal and
601if the terminal driver interprets some of it as commands.
602
603@item -D @var{action}
604@itemx --devices=@var{action}
605@opindex -D
606@opindex --devices
607@cindex device search
608If an input file is a device, FIFO, or socket, use @var{action} to process it.
609If @var{action} is @samp{read},
610all devices are read just as if they were ordinary files.
611If @var{action} is @samp{skip},
612devices, FIFOs, and sockets are silently skipped.
613By default, devices are read if they are on the command line or if the
614@option{-R} (@option{--dereference-recursive}) option is used, and are
615skipped if they are encountered recursively and the @option{-r}
616(@option{--recursive}) option is used.
617This option has no effect on a file that is read via standard input.
618
619@item -d @var{action}
620@itemx --directories=@var{action}
621@opindex -d
622@opindex --directories
623@cindex directory search
624@cindex symbolic links
625If an input file is a directory, use @var{action} to process it.
626By default, @var{action} is @samp{read},
627which means that directories are read just as if they were ordinary files
628(some operating systems and file systems disallow this,
629and will cause @command{grep}
630to print error messages for every directory or silently skip them).
631If @var{action} is @samp{skip}, directories are silently skipped.
632If @var{action} is @samp{recurse},
633@command{grep} reads all files under each directory, recursively,
634following command-line symbolic links and skipping other symlinks;
635this is equivalent to the @option{-r} option.
636
637@item --exclude=@var{glob}
638@opindex --exclude
639@cindex exclude files
640@cindex searching directory trees
641Skip files whose base name matches @var{glob}
642(using wildcard matching).
643A file-name glob can use
644@samp{*}, @samp{?}, and @samp{[}...@samp{]} as wildcards,
645and @code{\} to quote a wildcard or backslash character literally.
646
647@item --exclude-from=@var{file}
648@opindex --exclude-from
649@cindex exclude files
650@cindex searching directory trees
651Skip files whose base name matches any of the file-name globs
652read from @var{file} (using wildcard matching as described
653under @option{--exclude}).
654
655@item --exclude-dir=@var{dir}
656@opindex --exclude-dir
657@cindex exclude directories
658Exclude directories matching the pattern @var{dir} from recursive
659directory searches.
660
@item -I
@opindex -I
662Process a binary file as if it did not contain matching data;
663this is equivalent to the @samp{--binary-files=without-match} option.
664
665@item --include=@var{glob}
666@opindex --include
667@cindex include files
668@cindex searching directory trees
669Search only files whose base name matches @var{glob}
670(using wildcard matching as described under @option{--exclude}).
671
672@item -r
673@itemx --recursive
674@opindex -r
675@opindex --recursive
676@cindex recursive search
677@cindex searching directory trees
678@cindex symbolic links
679For each directory operand,
680read and process all files in that directory, recursively.
681Follow symbolic links on the command line, but skip symlinks
682that are encountered recursively.
683This is the same as the @samp{--directories=recurse} option.
684
685@item -R
686@itemx --dereference-recursive
687@opindex -R
688@opindex --dereference-recursive
689@cindex recursive search
690@cindex searching directory trees
691@cindex symbolic links
692For each directory operand, read and process all files in that
693directory, recursively, following all symbolic links.
694
695@end table
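
As a sketch of how the selection options above combine,
the following builds a small throw-away directory tree and searches it.
All file and directory names are hypothetical,
and the example assumes that the common (but non-POSIX)
@command{mktemp} utility is available.

@example
# Build a small throw-away tree (all names are hypothetical).
dir=$(mktemp -d)
mkdir "$dir/src" "$dir/build"
printf 'needle\n' > "$dir/src/a.c"
printf 'needle\n' > "$dir/src/notes.txt"
printf 'needle\n' > "$dir/build/b.c"

# -r descends into the directory; --include keeps only *.c files;
# --exclude-dir prunes the build directory; -l lists file names.
hits=$(cd "$dir" &&
  grep -r -l --include='*.c' --exclude-dir=build needle .)

printf '%s\n' "$hits"
rm -r "$dir"
@end example

@noindent
Only @file{./src/a.c} should be listed:
@file{notes.txt} fails the @option{--include} glob,
and @file{build} is never entered.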
696
697@node Other Options
698@subsection Other Options
699
700@table @option
701
702@item --line-buffered
703@opindex --line-buffered
704@cindex line buffering
705Use line buffering on output.
706This can cause a performance penalty.
707
708@item --mmap
709@opindex --mmap
710@cindex memory mapped input
711This option is deprecated and now elicits a warning, but is otherwise a no-op.
712It used to make @command{grep} read
713input with the @code{mmap} system call, instead of the default @code{read}
714system call.  On modern systems, @code{mmap} would rarely if ever yield
715better performance.
716
717@item -U
718@itemx --binary
719@opindex -U
720@opindex --binary
721@cindex MS-DOS/MS-Windows binary files
722@cindex binary files, MS-DOS/MS-Windows
723Treat the file(s) as binary.
724By default, under MS-DOS and MS-Windows,
725@command{grep} guesses the file type
726by looking at the contents of the first 32kB read from the file.
727If @command{grep} decides the file is a text file,
728it strips the @code{CR} characters from the original file contents
729(to make regular expressions with @code{^} and @code{$} work correctly).
730Specifying @option{-U} overrules this guesswork,
731causing all files to be read and passed to the matching mechanism verbatim;
732if the file is a text file with @code{CR/LF} pairs at the end of each line,
733this will cause some regular expressions to fail.
734This option has no effect
735on platforms other than MS-DOS and MS-Windows.
736
737@item -z
738@itemx --null-data
739@opindex -z
740@opindex --null-data
741@cindex zero-terminated lines
742Treat the input as a set of lines, each terminated by a zero byte (the
743ASCII @code{NUL} character) instead of a newline.
744Like the @option{-Z} or @option{--null} option,
745this option can be used with commands like
746@samp{sort -z} to process arbitrary file names.
747
748@end table
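
A minimal sketch of @option{-z} follows.
The two NUL-terminated records are invented for illustration;
the first one deliberately contains a newline
to show that it is treated as ordinary data.

@example
# Two NUL-terminated records; the first contains a newline.
count=$(printf 'one\ntwo\0three\0' | grep -z -c two)

# Print the matching record, turning NUL back into newline
# so that it can be displayed on a terminal.
record=$(printf 'one\ntwo\0three\0' | grep -z two | tr '\0' '\n')

printf '%s\n' "$count" "$record"
@end example

@noindent
The count should be 1, since only the first record contains @samp{two},
and that record is printed whole, embedded newline included.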
749
750@node Environment Variables
751@section Environment Variables
752
753The behavior of @command{grep} is affected
754by the following environment variables.
755
756The locale for category @w{@code{LC_@var{foo}}}
757is specified by examining the three environment variables
758@env{LC_ALL}, @w{@env{LC_@var{foo}}}, and @env{LANG},
759in that order.
760The first of these variables that is set specifies the locale.
761For example, if @env{LC_ALL} is not set,
762but @env{LC_MESSAGES} is set to @samp{pt_BR},
763then the Brazilian Portuguese locale is used
764for the @code{LC_MESSAGES} category.
765The @samp{C} locale is used if none of these environment variables are set,
766if the locale catalog is not installed,
767or if @command{grep} was not compiled
768with national language support (NLS).
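
The lookup order described above can be sketched in shell as follows.
The lowercase variables are hypothetical stand-ins
for @env{LC_ALL}, @env{LC_MESSAGES}, and @env{LANG},
used so that the example does not disturb the real environment;
an empty value is treated here as ``not set'', which is an assumption
of this sketch.

@example
# Stand-in values (hypothetical); empty means "not set" here.
lc_all=
lc_messages=pt_BR
lang=en_US.UTF-8

# Take the first non-empty value, in the documented order.
if [ -n "$lc_all" ]; then locale=$lc_all
elif [ -n "$lc_messages" ]; then locale=$lc_messages
elif [ -n "$lang" ]; then locale=$lang
else locale=C
fi

printf '%s\n' "$locale"
@end example

@noindent
With these values the result is @samp{pt_BR},
matching the Brazilian Portuguese example in the text.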
769
770Many of the environment variables in the following list let you
771control highlighting using
772Select Graphic Rendition (SGR)
773commands interpreted by the terminal or terminal emulator.
(See the SGR section
776in the documentation of your text terminal
777for permitted values and their meanings as character attributes.)
778These substring values are integers in decimal representation
779and can be concatenated with semicolons.
780@command{grep} takes care of assembling the result
781into a complete SGR sequence (@samp{\33[}...@samp{m}).
782Common values to concatenate include
783@samp{1} for bold,
784@samp{4} for underline,
785@samp{5} for blink,
786@samp{7} for inverse,
787@samp{39} for default foreground color,
788@samp{30} to @samp{37} for foreground colors,
789@samp{90} to @samp{97} for 16-color mode foreground colors,
790@samp{38;5;0} to @samp{38;5;255}
791for 88-color and 256-color modes foreground colors,
792@samp{49} for default background color,
793@samp{40} to @samp{47} for background colors,
794@samp{100} to @samp{107} for 16-color mode background colors,
795and @samp{48;5;0} to @samp{48;5;255}
796for 88-color and 256-color modes background colors.
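
To see what a complete SGR sequence looks like,
the following sketch assembles one by hand with @command{printf},
independently of @command{grep}.
The substring @samp{1;31} (bold, red foreground) and the sample text
are chosen only for illustration.

@example
# ESC [ 1;31 m turns on bold red; ESC [ 0 m resets all attributes.
sample=$(printf '\033[1;31m%s\033[0m' 'bold red')

printf '%s\n' "$sample"
@end example

@noindent
On a terminal that honors SGR,
the words should appear in bold red,
and the text after them should return to the default rendition.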
797
798The two-letter names used in the @env{GREP_COLORS} environment variable
799(and some of the others) refer to terminal ``capabilities,'' the ability
800of a terminal to highlight text, or change its color, and so on.
801These capabilities are stored in an online database and accessed by
802the @code{terminfo} library.
803
804@cindex environment variables
805
806@table @env
807
808@item GREP_OPTIONS
809@vindex GREP_OPTIONS @r{environment variable}
810@cindex default options environment variable
811This variable specifies default options to be placed in front of any
812explicit options.
813For example, if @code{GREP_OPTIONS} is
814@samp{--binary-files=without-match --directories=skip}, @command{grep}
815behaves as if the two options @samp{--binary-files=without-match} and
816@samp{--directories=skip} had been specified before
817any explicit options.
818Option specifications are separated by
819whitespace.
820A backslash escapes the next character, so it can be used to
821specify an option containing whitespace or a backslash.
822
823The @code{GREP_OPTIONS} value does not affect whether @command{grep}
824without file operands searches standard input or the working
825directory; that is affected only by command-line options.  For
826example, the command @samp{grep PAT} searches standard input and the
827command @samp{grep -r PAT} searches the working directory, regardless
828of whether @code{GREP_OPTIONS} contains @option{-r}.
829
830@item GREP_COLOR
831@vindex GREP_COLOR @r{environment variable}
832@cindex highlight markers
833This variable specifies the color used to highlight matched (non-empty) text.
834It is deprecated in favor of @env{GREP_COLORS}, but still supported.
835The @samp{mt}, @samp{ms}, and @samp{mc} capabilities of @env{GREP_COLORS}
836have priority over it.
837It can only specify the color used to highlight
838the matching non-empty text in any matching line
839(a selected line when the @option{-v} command-line option is omitted,
840or a context line when @option{-v} is specified).
841The default is @samp{01;31},
842which means a bold red foreground text on the terminal's default background.
843
844@item GREP_COLORS
845@vindex GREP_COLORS @r{environment variable}
846@cindex highlight markers
847This variable specifies the colors and other attributes
848used to highlight various parts of the output.
849Its value is a colon-separated list of @code{terminfo} capabilities
850that defaults to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
851with the @samp{rv} and @samp{ne} boolean capabilities omitted (i.e., false).
852Supported capabilities are as follows.
853
854@table @code
855@item sl=
856@vindex sl GREP_COLORS @r{capability}
857SGR substring for whole selected lines
858(i.e.,
859matching lines when the @option{-v} command-line option is omitted,
860or non-matching lines when @option{-v} is specified).
861If however the boolean @samp{rv} capability
862and the @option{-v} command-line option are both specified,
863it applies to context matching lines instead.
864The default is empty (i.e., the terminal's default color pair).
865
866@item cx=
867@vindex cx GREP_COLORS @r{capability}
868SGR substring for whole context lines
869(i.e.,
870non-matching lines when the @option{-v} command-line option is omitted,
871or matching lines when @option{-v} is specified).
872If however the boolean @samp{rv} capability
873and the @option{-v} command-line option are both specified,
874it applies to selected non-matching lines instead.
875The default is empty (i.e., the terminal's default color pair).
876
877@item rv
878@vindex rv GREP_COLORS @r{capability}
879Boolean value that reverses (swaps) the meanings of
880the @samp{sl=} and @samp{cx=} capabilities
881when the @option{-v} command-line option is specified.
882The default is false (i.e., the capability is omitted).
883
884@item mt=01;31
885@vindex mt GREP_COLORS @r{capability}
886SGR substring for matching non-empty text in any matching line
887(i.e.,
888a selected line when the @option{-v} command-line option is omitted,
889or a context line when @option{-v} is specified).
890Setting this is equivalent to setting both @samp{ms=} and @samp{mc=}
891at once to the same value.
892The default is a bold red text foreground over the current line background.
893
894@item ms=01;31
895@vindex ms GREP_COLORS @r{capability}
896SGR substring for matching non-empty text in a selected line.
897(This is used only when the @option{-v} command-line option is omitted.)
898The effect of the @samp{sl=} (or @samp{cx=} if @samp{rv}) capability
899remains active when this takes effect.
900The default is a bold red text foreground over the current line background.
901
902@item mc=01;31
903@vindex mc GREP_COLORS @r{capability}
904SGR substring for matching non-empty text in a context line.
905(This is used only when the @option{-v} command-line option is specified.)
906The effect of the @samp{cx=} (or @samp{sl=} if @samp{rv}) capability
907remains active when this takes effect.
908The default is a bold red text foreground over the current line background.
909
910@item fn=35
911@vindex fn GREP_COLORS @r{capability}
912SGR substring for file names prefixing any content line.
913The default is a magenta text foreground over the terminal's default background.
914
915@item ln=32
916@vindex ln GREP_COLORS @r{capability}
917SGR substring for line numbers prefixing any content line.
918The default is a green text foreground over the terminal's default background.
919
920@item bn=32
921@vindex bn GREP_COLORS @r{capability}
922SGR substring for byte offsets prefixing any content line.
923The default is a green text foreground over the terminal's default background.
924
925@item se=36
@vindex se GREP_COLORS @r{capability}
927SGR substring for separators that are inserted
928between selected line fields (@samp{:}),
929between context line fields (@samp{-}),
930and between groups of adjacent lines
931when nonzero context is specified (@samp{--}).
932The default is a cyan text foreground over the terminal's default background.
933
934@item ne
935@vindex ne GREP_COLORS @r{capability}
936Boolean value that prevents clearing to the end of line
937using Erase in Line (EL) to Right (@samp{\33[K})
938each time a colorized item ends.
939This is needed on terminals on which EL is not supported.
940It is otherwise useful on terminals
941for which the @code{back_color_erase}
942(@code{bce}) boolean @code{terminfo} capability does not apply,
943when the chosen highlight colors do not affect the background,
944or when EL is too slow or causes too much flicker.
945The default is false (i.e., the capability is omitted).
946@end table
947
948Note that boolean capabilities have no @samp{=}... part.
949They are omitted (i.e., false) by default and become true when specified.
950
951
952@item LC_ALL
953@itemx LC_COLLATE
954@itemx LANG
955@vindex LC_ALL @r{environment variable}
956@vindex LC_COLLATE @r{environment variable}
957@vindex LANG @r{environment variable}
958@cindex character type
959@cindex national language support
960@cindex NLS
961These variables specify the locale for the @code{LC_COLLATE} category,
962which determines the collating sequence
963used to interpret range expressions like @samp{[a-z]}.
964
965@item LC_ALL
966@itemx LC_CTYPE
967@itemx LANG
968@vindex LC_ALL @r{environment variable}
969@vindex LC_CTYPE @r{environment variable}
970@vindex LANG @r{environment variable}
971These variables specify the locale for the @code{LC_CTYPE} category,
972which determines the type of characters,
973e.g., which characters are whitespace.
974
975@item LC_ALL
976@itemx LC_MESSAGES
977@itemx LANG
978@vindex LC_ALL @r{environment variable}
979@vindex LC_MESSAGES @r{environment variable}
980@vindex LANG @r{environment variable}
981@cindex language of messages
982@cindex message language
983@cindex national language support
984@cindex translation of message language
985These variables specify the locale for the @code{LC_MESSAGES} category,
986which determines the language that @command{grep} uses for messages.
987The default @samp{C} locale uses American English messages.
988
989@item POSIXLY_CORRECT
990@vindex POSIXLY_CORRECT @r{environment variable}
991If set, @command{grep} behaves as POSIX requires; otherwise,
992@command{grep} behaves more like other GNU programs.
993POSIX
994requires that options that
995follow file names must be treated as file names;
996by default,
997such options are permuted to the front of the operand list
998and are treated as options.
999Also, @code{POSIXLY_CORRECT} disables special handling of an
1000invalid bracket expression.  @xref{invalid-bracket-expr}.
1001
1002@item _@var{N}_GNU_nonoption_argv_flags_
1003@vindex _@var{N}_GNU_nonoption_argv_flags_ @r{environment variable}
1004(Here @code{@var{N}} is @command{grep}'s numeric process ID.)
1005If the @var{i}th character of this environment variable's value is @samp{1},
1006do not consider the @var{i}th operand of @command{grep} to be an option,
1007even if it appears to be one.
1008A shell can put this variable in the environment for each command it runs,
1009specifying which operands are the results of file name wildcard expansion
1010and therefore should not be treated as options.
1011This behavior is available only with the GNU C library,
1012and only when @code{POSIXLY_CORRECT} is not set.
1013
1014@end table
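
As an illustrative sketch of how these variables can be set for a single
command (the sample input lines are made up for this example, and the
commands assume a POSIX-style shell that supports per-command variable
assignments):

@example
# Highlight matched text in bold green instead of the default bold red,
# and turn on the boolean ne capability.
printf 'hello world\n' | GREP_COLORS='mt=01;32:ne' grep --color=always 'hello'

# Use the C locale so that [a-d] means exactly [abcd]; this prints only b.
printf 'b\nB\n' | LC_ALL=C grep '[a-d]'

# With POSIXLY_CORRECT set, -i after an operand is treated as a file name,
# so grep looks for a file called -i and normally reports an error.
POSIXLY_CORRECT=1 grep 'hello' /dev/null -i || echo "exit status: $?"
@end example

@noindent
Assigning a variable on the command line in this way affects only that
one invocation, which is usually preferable to exporting it for the
whole session.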
1015
1016
1017@node Exit Status
1018@section Exit Status
1019@cindex exit status
1020@cindex return status
1021
1022Normally, the exit status is 0 if selected lines are found and 1 otherwise.
1023But the exit status is 2 if an error occurred, unless the @option{-q} or
1024@option{--quiet} or @option{--silent} option is used and a selected line
1025is found.
1026Note, however, that POSIX only mandates,
1027for programs such as @command{grep}, @command{cmp}, and @command{diff},
1028that the exit status in case of error be greater than 1;
1029it is therefore advisable, for the sake of portability,
1030to use logic that tests for this general condition
1031instead of strict equality with@ 2.
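
A minimal sketch of such portable logic in a POSIX shell follows.
The input line and the messages are invented for illustration;
the point is that the final branch handles every status greater than 1
rather than testing for exactly@ 2.

@example
status=0
printf 'hello world\n' | grep -q 'hello' || status=$?

if [ "$status" -eq 0 ]; then
  echo 'match found'
elif [ "$status" -eq 1 ]; then
  echo 'no match'
else
  # Any status greater than 1 indicates an error.
  echo 'grep reported an error' >&2
fi
@end example

@noindent
Capturing the status with @samp{|| status=$?} also keeps the script
working when the shell is told to exit on any failing command.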
1032
1033
1034@node grep Programs
1035@section @command{grep} Programs
1036@cindex @command{grep} programs
1037@cindex variants of @command{grep}
1038
1039@command{grep} searches the named input files
1040for lines containing a match to the given pattern.
1041By default, @command{grep} prints the matching lines.
1042A file named @file{-} stands for standard input.
1043If no input is specified, @command{grep} searches the working
1044directory @file{.} if given a command-line option specifying
1045recursion; otherwise, @command{grep} searches standard input.
1046There are four major variants of @command{grep},
1047controlled by the following options.
1048
1049@table @option
1050
1051@item -G
1052@itemx --basic-regexp
1053@opindex -G
1054@opindex --basic-regexp
1055@cindex matching basic regular expressions
1056Interpret the pattern as a basic regular expression (BRE).
1057This is the default.
1058
1059@item -E
1060@itemx --extended-regexp
1061@opindex -E
1062@opindex --extended-regexp
1063@cindex matching extended regular expressions
1064Interpret the pattern as an extended regular expression (ERE).
1065(@option{-E} is specified by POSIX.)
1066
1067@item -F
1068@itemx --fixed-strings
1069@opindex -F
1070@opindex --fixed-strings
1071@cindex matching fixed strings
1072Interpret the pattern as a list of fixed strings, separated
1073by newlines, any of which is to be matched.
1074(@option{-F} is specified by POSIX.)
1075
1076@item -P
1077@itemx --perl-regexp
1078@opindex -P
1079@opindex --perl-regexp
1080@cindex matching Perl regular expressions
1081Interpret the pattern as a Perl regular expression.
1082This is highly experimental and
1083@samp{grep@ -P} may warn of unimplemented features.
1084
1085@end table
1086
1087In addition,
1088two variant programs @command{egrep} and @command{fgrep} are available.
1089@command{egrep} is the same as @samp{grep@ -E}.
1090@command{fgrep} is the same as @samp{grep@ -F}.
1091Direct invocation as either
1092@command{egrep} or @command{fgrep} is deprecated,
1093but is provided to allow historical applications
1094that rely on them to run unmodified.
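
The following sketch shows how the same pattern text is interpreted
under three of these options.
The two-line input is made up for the example.

@example
printf 'a+b\naab\n' | grep -G 'a+b'    # BRE: + is literal, prints a+b
printf 'a+b\naab\n' | grep -E 'a+b'    # ERE: + is repetition, prints aab
printf 'a+b\naab\n' | grep -F 'a+b'    # fixed string, prints a+b

# With -F, a pattern containing a newline is a list of fixed strings;
# this prints a+b and xyz.
printf 'a+b\naab\nxyz\n' | grep -F 'a+b
xyz'
@end example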
1095
1096
1097@node Regular Expressions
1098@chapter Regular Expressions
1099@cindex regular expressions
1100
1101A @dfn{regular expression} is a pattern that describes a set of strings.
1102Regular expressions are constructed analogously to arithmetic expressions,
1103by using various operators to combine smaller expressions.
1104@command{grep} understands
1105three different versions of regular expression syntax:
``basic'' (BRE), ``extended'' (ERE), and ``perl''.
1107In GNU @command{grep},
1108there is no difference in available functionality between the basic and
1109extended syntaxes.
1110In other implementations, basic regular expressions are less powerful.
1111The following description applies to extended regular expressions;
1112differences for basic regular expressions are summarized afterwards.
1113Perl regular expressions give additional functionality, and are
1114documented in the @i{pcresyntax}(3) and @i{pcrepattern}(3) manual pages,
1115but may not be available on every system.
1116
1117@menu
1118* Fundamental Structure::
1119* Character Classes and Bracket Expressions::
1120* The Backslash Character and Special Expressions::
1121* Anchoring::
1122* Back-references and Subexpressions::
1123* Basic vs Extended::
1124@end menu
1125
1126@node Fundamental Structure
1127@section Fundamental Structure
1128
1129The fundamental building blocks are the regular expressions that match
1130a single character.
1131Most characters, including all letters and digits,
1132are regular expressions that match themselves.
1133Any meta-character
1134with special meaning may be quoted by preceding it with a backslash.
1135
@opindex .
@cindex dot
@cindex period
The period @samp{.} matches any single character.

A regular expression may be followed by one of several
repetition operators:

@table @samp
1146
1147@item ?
1148@opindex ?
1149@cindex question mark
1150@cindex match expression at most once
1151The preceding item is optional and will be matched at most once.
1152
1153@item *
1154@opindex *
1155@cindex asterisk
1156@cindex match expression zero or more times
1157The preceding item will be matched zero or more times.
1158
1159@item +
1160@opindex +
1161@cindex plus sign
1162@cindex match expression one or more times
1163The preceding item will be matched one or more times.
1164
1165@item @{@var{n}@}
1166@opindex @{@var{n}@}
1167@cindex braces, one argument
1168@cindex match expression @var{n} times
1169The preceding item is matched exactly @var{n} times.
1170
1171@item @{@var{n},@}
1172@opindex @{@var{n},@}
1173@cindex braces, second argument omitted
1174@cindex match expression @var{n} or more times
1175The preceding item is matched @var{n} or more times.
1176
1177@item @{,@var{m}@}
1178@opindex @{,@var{m}@}
1179@cindex braces, first argument omitted
1180@cindex match expression at most @var{m} times
1181The preceding item is matched at most @var{m} times.
1182
1183@item @{@var{n},@var{m}@}
1184@opindex @{@var{n},@var{m}@}
1185@cindex braces, two arguments
1186@cindex match expression from @var{n} to @var{m} times
1187The preceding item is matched at least @var{n} times, but not more than
1188@var{m} times.
1189
1190@end table
1191
1192The empty regular expression matches the empty string.
1193Two regular expressions may be concatenated;
1194the resulting regular expression
1195matches any string formed by concatenating two substrings
1196that respectively match the concatenated expressions.
1197
1198Two regular expressions may be joined by the infix operator @samp{|};
1199the resulting regular expression
1200matches any string matching either alternate expression.
1201
1202Repetition takes precedence over concatenation,
1203which in turn takes precedence over alternation.
1204A whole expression may be enclosed in parentheses
1205to override these precedence rules and form a subexpression.
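
The precedence rules can be checked with a few small experiments.
This is only a sketch: the input lines are invented, and the
@option{-x} option is used so that a pattern must match a whole line.

@example
# Repetition binds more tightly than concatenation: only b is repeated.
printf 'abab\nabb\n' | grep -E -x 'ab+'               # prints abb

# Parentheses make ab a subexpression, so all of it is repeated.
printf 'abab\nabb\n' | grep -E -x '(ab)+'             # prints abab

# Alternation binds least tightly: the whole line is cat or dog.
printf 'cat\ndog\ncatdog\n' | grep -E -x 'cat|dog'    # prints cat and dog

# An interval: exactly three digits.
printf '12\n123\n1234\n' | grep -E -x '[0-9]@{3@}'     # prints 123
@end example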
1206
1207@node Character Classes and Bracket Expressions
1208@section Character Classes and Bracket Expressions
1209
1210@cindex bracket expression
1211@cindex character class
1212A @dfn{bracket expression} is a list of characters enclosed by @samp{[} and
1213@samp{]}.
1214It matches any single character in that list;
1215if the first character of the list is the caret @samp{^},
1216then it matches any character @strong{not} in the list.
1217For example, the regular expression
1218@samp{[0123456789]} matches any single digit.
1219
1220@cindex range expression
1221Within a bracket expression, a @dfn{range expression} consists of two
1222characters separated by a hyphen.
1223It matches any single character that
1224sorts between the two characters, inclusive, using the locale's
1225collating sequence and character set.
1226For example, in the default C
1227locale, @samp{[a-d]} is equivalent to @samp{[abcd]}.
1228Many locales sort
1229characters in dictionary order, and in these locales @samp{[a-d]} is
1230typically not equivalent to @samp{[abcd]};
1231it might be equivalent to @samp{[aBbCcDd]}, for example.
1232To obtain the traditional interpretation
1233of bracket expressions, you can use the @samp{C} locale by setting the
1234@env{LC_ALL} environment variable to the value @samp{C}.
1235
1236Finally, certain named classes of characters are predefined within
1237bracket expressions, as follows.
1238Their interpretation depends on the @code{LC_CTYPE} locale;
for example, @samp{[[:alnum:]]} means the character class of letters and digits
1240in the current locale.
1241
1242@cindex classes of characters
1243@cindex character classes
1244@table @samp
1245
1246@item [:alnum:]
1247@opindex alnum @r{character class}
1248@cindex alphanumeric characters
1249Alphanumeric characters:
1250@samp{[:alpha:]} and @samp{[:digit:]}; in the @samp{C} locale and ASCII character encoding, this is the same as @samp{[0-9A-Za-z]}.
1251
1252@item [:alpha:]
1253@opindex alpha @r{character class}
1254@cindex alphabetic characters
1255Alphabetic characters:
1256@samp{[:lower:]} and @samp{[:upper:]}; in the @samp{C} locale and ASCII character encoding, this is the same as @samp{[A-Za-z]}.
1257
1258@item [:blank:]
1259@opindex blank @r{character class}
1260@cindex blank characters
1261Blank characters:
1262space and tab.
1263
1264@item [:cntrl:]
1265@opindex cntrl @r{character class}
1266@cindex control characters
1267Control characters.
1268In ASCII, these characters have octal codes 000
1269through 037, and 177 (@code{DEL}).
1270In other character sets, these are
1271the equivalent characters, if any.
1272
1273@item [:digit:]
1274@opindex digit @r{character class}
1275@cindex digit characters
1276@cindex numeric characters
1277Digits: @code{0 1 2 3 4 5 6 7 8 9}.
1278
1279@item [:graph:]
1280@opindex graph @r{character class}
1281@cindex graphic characters
1282Graphical characters:
1283@samp{[:alnum:]} and @samp{[:punct:]}.
1284
1285@item [:lower:]
1286@opindex lower @r{character class}
1287@cindex lower-case letters
1288Lower-case letters; in the @samp{C} locale and ASCII character
1289encoding, this is
1290@code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
1291
1292@item [:print:]
1293@opindex print @r{character class}
1294@cindex printable characters
1295Printable characters:
1296@samp{[:alnum:]}, @samp{[:punct:]}, and space.
1297
1298@item [:punct:]
1299@opindex punct @r{character class}
1300@cindex punctuation characters
1301Punctuation characters; in the @samp{C} locale and ASCII character
1302encoding, this is
1303@code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
1304
1305@item [:space:]
1306@opindex space @r{character class}
1307@cindex space characters
1308@cindex whitespace characters
1309Space characters: in the @samp{C} locale, this is
1310tab, newline, vertical tab, form feed, carriage return, and space.
1311@xref{Usage}, for more discussion of matching newlines.
1312
1313@item [:upper:]
1314@opindex upper @r{character class}
1315@cindex upper-case letters
1316Upper-case letters: in the @samp{C} locale and ASCII character
1317encoding, this is
1318@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
1319
1320@item [:xdigit:]
1321@opindex xdigit @r{character class}
1322@cindex xdigit class
1323@cindex hexadecimal digits
1324Hexadecimal digits:
1325@code{0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f}.
1326
1327@end table
1328Note that the brackets in these class names are
1329part of the symbolic names, and must be included in addition to
1330the brackets delimiting the bracket expression.
1331
1332@anchor{invalid-bracket-expr}
If you mistakenly omit the outer brackets, and search for, say, @samp{[:upper:]},
1334GNU @command{grep} prints a diagnostic and exits with status 2, on
1335the assumption that you did not intend to search for the nominally
1336equivalent regular expression: @samp{[:epru]}.
1337Set the @code{POSIXLY_CORRECT} environment variable to disable this feature.
1338
1339Most meta-characters lose their special meaning inside bracket expressions.
1340
1341@table @samp
1342@item ]
1343ends the bracket expression if it's not the first list item.
1344So, if you want to make the @samp{]} character a list item,
1345you must put it first.
1346
1347@item [.
1348represents the open collating symbol.
1349
1350@item .]
1351represents the close collating symbol.
1352
1353@item [=
1354represents the open equivalence class.
1355
1356@item =]
1357represents the close equivalence class.
1358
1359@item [:
1360represents the open character class symbol, and should be followed by a valid character class name.
1361
1362@item :]
1363represents the close character class symbol.
1364
1365@item -
1366represents the range if it's not first or last in a list or the ending point
1367of a range.
1368
1369@item ^
1370represents the characters not in the list.
1371If you want to make the @samp{^}
1372character a list item, place it anywhere but first.
1373
1374@end table
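
As a brief sketch of these rules in practice (the input lines are
invented, and @env{LC_ALL} is set to @samp{C} where the result would
otherwise depend on the locale):

@example
# A named class must sit inside the outer brackets.
printf 'abc\nABC\n123\n' | LC_ALL=C grep '[[:upper:]]'     # prints ABC

# A leading caret negates the list: lines with at least one non-digit.
printf 'abc\nABC\n123\n' | LC_ALL=C grep '[^[:digit:]]'    # prints abc and ABC

# To list ] put it first; to list - put it last.
printf 'a]b\na-b\nab\n' | grep '[]-]'                       # prints a]b and a-b

# A caret that is not first is an ordinary list item.
printf 'x^y\nx+y\nxy\n' | grep 'x[+^]y'                     # prints x^y and x+y
@end example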
1375
1376@node The Backslash Character and Special Expressions
1377@section The Backslash Character and Special Expressions
1378@cindex backslash
1379
1380The @samp{\} character,
1381when followed by certain ordinary characters,
1382takes a special meaning:
1383
1384@table @samp
1385
1386@item \b
1387Match the empty string at the edge of a word.
1388
1389@item \B
1390Match the empty string provided it's not at the edge of a word.
1391
1392@item \<
Match the empty string at the beginning of a word.

@item \>
Match the empty string at the end of a word.

@item \w
Match a word constituent; it is a synonym for @samp{[_[:alnum:]]}.

@item \W
Match a non-word constituent; it is a synonym for @samp{[^_[:alnum:]]}.

@item \s
Match whitespace; it is a synonym for @samp{[[:space:]]}.

@item \S
Match non-whitespace; it is a synonym for @samp{[^[:space:]]}.
1409
1410@end table
1411
For example, @samp{\brat\b} matches the separate word @samp{rat},
whereas @samp{\Brat\B} matches @samp{crate} but not @samp{furry rat}.
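
These expressions can be tried directly from the shell.
The following is a sketch with invented input; it uses the GNU
@option{-o} option to print each match on its own line.

@example
printf 'rat\ncrate\nfurry rat\n' | grep '\brat\b'    # prints rat and furry rat
printf 'rat\ncrate\nfurry rat\n' | grep '\Brat\B'    # prints crate
printf 'rat\nrats\nbrat\n' | grep '\<rat'            # prints rat and rats
printf 'rat\nrats\nbrat\n' | grep 'rat\>'            # prints rat and brat
printf 'foo_bar baz\n' | grep -o '\w\+'              # prints foo_bar, then baz
@end example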
1414
1415@node Anchoring
1416@section Anchoring
1417@cindex anchoring
1418
1419The caret @samp{^} and the dollar sign @samp{$} are meta-characters that
1420respectively match the empty string at the beginning and end of a line.
1421They are termed @dfn{anchors}, since they force the match to be ``anchored''
to the beginning or end of a line, respectively.
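
For instance, with three invented input lines, the anchors select
different subsets:

@example
printf 'hello world\nsay hello\nhello\n' | grep '^hello'    # hello world, hello
printf 'hello world\nsay hello\nhello\n' | grep 'hello$'    # say hello, hello
printf 'hello world\nsay hello\nhello\n' | grep '^hello$'   # hello
@end example

@noindent
Using both anchors together requires the pattern to match the entire line.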
1423
1424@node Back-references and Subexpressions
1425@section Back-references and Subexpressions
1426@cindex subexpression
1427@cindex back-reference
1428
1429The back-reference @samp{\@var{n}}, where @var{n} is a single digit, matches
1430the substring previously matched by the @var{n}th parenthesized subexpression
1431of the regular expression.
1432For example, @samp{(a)\1} matches @samp{aa}.
1433When used with alternation, if the group does not participate in the match then
1434the back-reference makes the whole match fail.
1435For example, @samp{a(.)|b\1}
1436will not match @samp{ba}.
1437When multiple regular expressions are given with
1438@option{-e} or from a file (@samp{-f @var{file}}),
1439back-references are local to each expression.
1440
1441@node Basic vs Extended
1442@section Basic vs Extended Regular Expressions
1443@cindex basic regular expressions
1444
1445In basic regular expressions the meta-characters @samp{?}, @samp{+},
1446@samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
1447instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
1448@samp{\|}, @samp{\(}, and @samp{\)}.
1449
1450@cindex interval specifications
1451Traditional @command{egrep} did not support the @samp{@{} meta-character,
1452and some @command{egrep} implementations support @samp{\@{} instead, so
1453portable scripts should avoid @samp{@{} in @samp{grep@ -E} patterns and
1454should use @samp{[@{]} to match a literal @samp{@{}.
1455
1456GNU @command{grep@ -E} attempts to support traditional usage by
1457assuming that @samp{@{} is not special if it would be the start of an
1458invalid interval specification.
1459For example, the command
1460@samp{grep@ -E@ '@{1'} searches for the two-character string @samp{@{1}
1461instead of reporting a syntax error in the regular expression.
1462POSIX allows this behavior as an extension, but portable scripts
1463should avoid it.
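
The correspondence between the two syntaxes can be sketched as follows,
using invented input.
Each pair of commands selects the same lines.

@example
# One or more b: backslashed in a BRE, plain in an ERE.
printf 'ab\nabb\na+b\n' | grep 'ab\+'       # prints ab and abb
printf 'ab\nabb\na+b\n' | grep -E 'ab+'     # prints ab and abb

# A literal plus sign: plain in a BRE, backslashed in an ERE.
printf 'ab\nabb\na+b\n' | grep 'a+b'        # prints a+b
printf 'ab\nabb\na+b\n' | grep -E 'a\+b'    # prints a+b

# Alternation.
printf 'cat\ndog\nbird\n' | grep 'cat\|dog'      # prints cat and dog
printf 'cat\ndog\nbird\n' | grep -E 'cat|dog'    # prints cat and dog
@end example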
1464
1465
1466@node Usage
1467@chapter Usage
1468
1469@cindex usage, examples
1470Here is an example command that invokes GNU @command{grep}:
1471
1472@example
1473grep -i 'hello.*world' menu.h main.c
1474@end example
1475
1476@noindent
1477This lists all lines in the files @file{menu.h} and @file{main.c} that
1478contain the string @samp{hello} followed by the string @samp{world};
1479this is because @samp{.*} matches zero or more characters within a line.
1480@xref{Regular Expressions}.
1481The @option{-i} option causes @command{grep}
1482to ignore case, causing it to match the line @samp{Hello, world!}, which
1483it would not otherwise match.
1484@xref{Invoking}, for more details about
1485how to invoke @command{grep}.
1486
1487@cindex using @command{grep}, Q&A
1488@cindex FAQ about @command{grep} usage
1489Here are some common questions and answers about @command{grep} usage.
1490
1491@enumerate
1492
1493@item
1494How can I list just the names of matching files?
1495
1496@example
1497grep -l 'main' *.c
1498@end example
1499
1500@noindent
1501lists the names of all C files in the current directory whose contents
1502mention @samp{main}.
1503
1504@item
1505How do I search directories recursively?
1506
1507@example
1508grep -r 'hello' /home/gigi
1509@end example
1510
1511@noindent
1512searches for @samp{hello} in all files
1513under the @file{/home/gigi} directory.
1514For more control over which files are searched,
1515use @command{find}, @command{grep}, and @command{xargs}.
1516For example, the following command searches only C files:
1517
1518@example
1519find /home/gigi -name '*.c' -print0 | xargs -0r grep -H 'hello'
1520@end example
1521
1522This differs from the command:
1523
1524@example
1525grep -H 'hello' *.c
1526@end example
1527
1528which merely looks for @samp{hello} in all files in the current
1529directory whose names end in @samp{.c}.
1530The @samp{find ...} command line above is more similar to the command:
1531
1532@example
1533grep -rH --include='*.c' 'hello' /home/gigi
1534@end example
1535
1536@item
1537What if a pattern has a leading @samp{-}?
1538
1539@example
1540grep -e '--cut here--' *
1541@end example
1542
1543@noindent
1544searches for all lines matching @samp{--cut here--}.
1545Without @option{-e},
1546@command{grep} would attempt to parse @samp{--cut here--} as a list of
1547options.
1548
1549@item
1550Suppose I want to search for a whole word, not a part of a word?
1551
1552@example
1553grep -w 'hello' *
1554@end example
1555
1556@noindent
1557searches only for instances of @samp{hello} that are entire words;
1558it does not match @samp{Othello}.
1559For more control, use @samp{\<} and
1560@samp{\>} to match the start and end of words.
1561For example:
1562
1563@example
1564grep 'hello\>' *
1565@end example
1566
1567@noindent
1568searches only for words ending in @samp{hello}, so it matches the word
1569@samp{Othello}.
1570
1571@item
1572How do I output context around the matching lines?
1573
1574@example
1575grep -C 2 'hello' *
1576@end example
1577
1578@noindent
1579prints two lines of context around each matching line.
1580
1581@item
1582How do I force @command{grep} to print the name of the file?
1583
1584Append @file{/dev/null}:
1585
1586@example
1587grep 'eli' /etc/passwd /dev/null
1588@end example
1589
1590gets you:
1591
1592@example
1593/etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
1594@end example
1595
1596Alternatively, use @option{-H}, which is a GNU extension:
1597
1598@example
1599grep -H 'eli' /etc/passwd
1600@end example
1601
1602@item
1603Why do people use strange regular expressions on @command{ps} output?
1604
1605@example
1606ps -ef | grep '[c]ron'
1607@end example
1608
1609If the pattern had been written without the square brackets, it would
1610have matched not only the @command{ps} output line for @command{cron},
1611but also the @command{ps} output line for @command{grep}.
1612Note that on some platforms,
1613@command{ps} limits the output to the width of the screen;
1614@command{grep} does not have any limit on the length of a line
1615except the available memory.
1616
1617@item
1618Why does @command{grep} report ``Binary file matches''?
1619
1620If @command{grep} listed all matching ``lines'' from a binary file, it
1621would probably generate output that is not useful, and it might even
1622muck up your display.
1623So GNU @command{grep} suppresses output from
1624files that appear to be binary files.
1625To force GNU @command{grep}
1626to output lines even from files that appear to be binary, use the
1627@option{-a} or @samp{--binary-files=text} option.
1628To eliminate the
1629``Binary file matches'' messages, use the @option{-I} or
1630@samp{--binary-files=without-match} option.
1631
1632@item
1633Why doesn't @samp{grep -lv} print non-matching file names?
1634
1635@samp{grep -lv} lists the names of all files containing one or more
1636lines that do not match.
1637To list the names of all files that contain no
1638matching lines, use the @option{-L} or @option{--files-without-match}
1639option.
1640
1641@item
1642I can do ``OR'' with @samp{|}, but what about ``AND''?
1643
1644@example
1645grep 'paul' /etc/motd | grep 'franc,ois'
1646@end example
1647
1648@noindent
1649finds all lines that contain both @samp{paul} and @samp{franc,ois}.
1650
1651@item
1652Why does the empty pattern match every input line?
1653
1654The @command{grep} command searches for lines that contain strings
1655that match a pattern.  Every line contains the empty string, so an
1656empty pattern causes @command{grep} to find a match on each line.  It
1657is not the only such pattern: @samp{^}, @samp{$}, @samp{.*}, and many
1658other patterns cause @command{grep} to match every line.
1659
1660To match empty lines, use the pattern @samp{^$}.  To match blank
1661lines, use the pattern @samp{^[[:blank:]]*$}.  To match no lines at
1662all, use the command @samp{grep -f /dev/null}.
1663
1664@item
1665How can I search in both standard input and in files?
1666
1667Use the special file name @samp{-}:
1668
1669@example
1670cat /etc/passwd | grep 'alain' - /etc/motd
1671@end example
1672
1673@item
1674@cindex palindromes
How can I express palindromes in a regular expression?
1676
1677It can be done by using back-references;
1678for example,
a palindrome of 5 characters can be written with a BRE:
1680
1681@example
1682grep -w -e '\(.\)\(.\).\2\1' file
1683@end example
1684
1685It matches the word ``radar'' or ``civic.''
1686
1687Guglielmo Bondioni proposed a single RE
1688that finds all palindromes up to 19 characters long
1689using @w{9 subexpressions} and @w{9 back-references}:
1690
1691@smallexample
1692grep -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file
1693@end smallexample
1694
1695Note this is done by using GNU ERE extensions;
1696it might not be portable to other implementations of @command{grep}.
1697
1698@item
1699Why is this back-reference failing?
1700
1701@example
1702echo 'ba' | grep -E '(a)\1|b\1'
1703@end example
1704
1705This gives no output, because the first alternate @samp{(a)\1} does not match,
1706as there is no @samp{aa} in the input, so the @samp{\1} in the second alternate
1707has nothing to refer back to, meaning it will never match anything.
1708(The second alternate in this example can only match
1709if the first alternate has matched---making the second one superfluous.)
1710
1711@item
1712How can I match across lines?
1713
Standard @command{grep} cannot do this, as it is fundamentally line-based.
Therefore, merely using the @code{[:space:]} character class does not
match newlines in the way you might expect.  However, if your @command{grep} is
1717compiled with Perl patterns enabled, the Perl @samp{s}
1718modifier (which makes @code{.} match newlines) can be used:
1719
1720@example
1721printf 'foo\nbar\n' | grep -P '(?s)foo.*?bar'
1722@end example
1723
1724With the GNU @command{grep} option @code{-z} (@pxref{File and
1725Directory Selection}), the input is terminated by null bytes.  Thus,
1726you can match newlines in the input, but the output will be the whole
1727file, so this is really only useful to determine if the pattern is
1728present:
1729
1730@example
1731printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'
1732@end example
1733
1734Failing either of those options, you need to transform the input
1735before giving it to @command{grep}, or turn to @command{awk},
1736@command{sed}, @command{perl}, or many other utilities that are
1737designed to operate across lines.
1738
1739@item
1740What do @command{grep}, @command{fgrep}, and @command{egrep} stand for?
1741
1742The name @command{grep} comes from the way line editing was done on Unix.
1743For example,
1744@command{ed} uses the following syntax
1745to print a list of matching lines on the screen:
1746
1747@example
1748global/regular expression/print
1749g/re/p
1750@end example
1751
1752@command{fgrep} stands for Fixed @command{grep};
1753@command{egrep} stands for Extended @command{grep}.
1754
1755@end enumerate
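
Some of the answers above that lack a command can be tried on scratch files.
The following sketch assumes that @command{mktemp} is available;
the directory and file names are hypothetical.

@example
dir=$(mktemp -d)
printf 'hello\nworld\n' > "$dir/a.txt"
printf 'hello\n'        > "$dir/b.txt"
printf 'world\n'        > "$dir/c.txt"

# -lv lists files with at least one non-matching line: a.txt and c.txt.
grep -lv 'hello' "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# -L lists files with no matching line at all: only c.txt.
grep -L 'hello' "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# Count empty lines, then blank lines (empty, or only spaces and tabs).
printf 'a\n\n \nb\n' | grep -c '^$'               # prints 1
printf 'a\n\n \nb\n' | grep -c '^[[:blank:]]*$'   # prints 2

rm -r "$dir"
@end example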
1756
1757
1758@node Reporting Bugs
@chapter Reporting Bugs
1760
1761@cindex bugs, reporting
1762Email bug reports to @email{bug-grep@@gnu.org},
1763a mailing list whose web page is
1764@url{http://lists.gnu.org/mailman/listinfo/bug-grep}.
1765The Savannah bug tracker for @command{grep} is located at
1766@url{http://savannah.gnu.org/bugs/?group=grep}.
1767
1768@section Known Bugs
1769@cindex Bugs, known
1770
Large repetition counts in the @samp{@{@var{n},@var{m}@}} construct may cause
1772@command{grep} to use lots of memory.
1773In addition, certain other
1774obscure regular expressions require exponential time and
1775space, and may cause @command{grep} to run out of memory.
1776
1777Back-references are very slow, and may require exponential time.
1778
1779
1780@node Copying
1781@chapter Copying
1782@cindex copying
1783
1784GNU @command{grep} is licensed under the GNU GPL, which makes it @dfn{free
1785software}.
1786
1787The ``free'' in ``free software'' refers to liberty, not price. As
1788some GNU project advocates like to point out, think of ``free speech''
1789rather than ``free beer''.  In short, you have the right (freedom) to
1790run and change @command{grep} and distribute it to other people, and---if you
1791want---charge money for doing either.  The important restriction is
1792that you have to grant your recipients the same rights and impose the
1793same restrictions.
1794
1795This general method of licensing software is sometimes called
1796@dfn{open source}.  The GNU project prefers the term ``free software''
1797for reasons outlined at
1798@url{http://www.gnu.org/philosophy/open-source-misses-the-point.html}.
1799
1800This manual is free documentation in the same sense.  The
1801documentation license is included below.  The license for the program
1802is available with the source code, or at
1803@url{http://www.gnu.org/licenses/gpl.html}.
1804
1805@menu
1806* GNU Free Documentation License::
1807@end menu
1808
1809@node GNU Free Documentation License
1810@section GNU Free Documentation License
1811
1812@include fdl.texi
1813
1814
1815@node Index
1816@unnumbered Index
1817
1818@printindex cp
1819
1820@bye
1821