.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" %sccs.include.redist.man%
.\"
.\"     @(#)awk.1	6.2 (Berkeley) 06/11/90
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.\".Op Op Fl \&f Ar file Op Ar prog
.Cx \&[
.Op Fl f Ar file
.Op Ar prog
.Cx \&]
.Cx
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar file .
.Pp
.Tw Fl
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
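.Pp
As an illustration of string subscripts, the following sketch tallies
how often each first field occurs and prints the count for one key.
The three input lines and the array name
.Li count
are invented for the example, and the program is assumed to be run
from a Bourne-style shell.
.Pp
.Ds I
```shell
# Tally first fields in an array indexed by string.
# The input lines are made up for this illustration.
tally=$(awk '{ count[$1] += 1 } END { print "b seen", count["b"], "times" }' <<EOF
a 1
b 2
b 3
EOF
)
echo "$tally"
```
.De
.Pp
Because variables start out as the null string, the first
.Li +=
on a new subscript treats the missing element as zero.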
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Cx Ic substr
.Cx (
.Ar s ,
.Ar \& m ,
.Ar \& n )
.Cx
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The function
.Cx Ic sprintf
.Cx (
.Ar fmt ,
.Ar \& expr ,
.Ar \& expr ,
.Ar \& ... )
.Cx
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
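.Pp
A small sketch of the string functions described above, run from a
shell.
The input words and the shell variable names are chosen only for
illustration; positions in
.Ic substr
count from 1.
.Pp
.Ds I
```shell
# substr: 3 characters of the line starting at position 2.
piece=$(echo "pattern" | awk '{ print substr($0, 2, 3) }')
# sprintf: build a zero-padded string instead of printing directly.
padded=$(echo "7" | awk '{ print sprintf("%03d", $1) }')
# length with no argument measures the whole line.
size=$(echo "pattern" | awk '{ print length }')
echo "$piece $padded $size"
```
.De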
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Ds
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
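.Pp
The following sketch combines a matchop test and a relop test with &&.
The sample input is invented; since no action is given, each matching
line is simply printed.
.Pp
.Ds I
```shell
# Select lines whose first field starts with "a"
# and whose second field is greater than 10.
hits=$(awk '$1 ~ /^a/ && $2 > 10' <<EOF
apple 5
avocado 12
banana 40
EOF
)
echo "$hits"
```
.De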
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
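.Pp
The two ways of setting the field separator can be compared side by
side, as in this sketch.
The colon-separated input line is a made-up sample.
.Pp
.Ds I
```shell
# Set the separator inside the program ...
with_begin=$(echo "root:x:0" | awk 'BEGIN { FS = ":" } { print $3, $1 }')
# ... or on the command line; both should split the line the same way.
with_flag=$(echo "root:x:0" | awk -F: '{ print $3, $1 }')
echo "$with_begin"
echo "$with_flag"
```
.De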
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
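.Pp
As a sketch of how some of these variables interact, the program below
sets
.Li OFS
and prints the record number, the field count and the first field of
each line.
The two input lines are invented for the example.
.Pp
.Ds I
```shell
# OFS replaces the default blank between the arguments of print.
report=$(awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }' <<EOF
a b c
d e
EOF
)
echo "$report"
```
.De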
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Pp
.Ds I
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field differs from that of the previous line:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX.
A much improved, true-to-the-book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
.Dq
to it.
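.Pp
The difference between the two forms can be seen in a comparison, as
in this sketch.
The input values are invented, and the behavior shown is that of a
present-day awk, which is assumed here to agree with this version on
these two idioms.
.Pp
.Ds I
```shell
# Concatenating "" forces a string comparison; adding 0 forces a numeric one.
forced=$(echo "10 9" | awk '{
  if (($1 "") < ($2 "")) print "as strings, 10 sorts before 9"
  if (($1 + 0) > ($2 + 0)) print "as numbers, 10 exceeds 9"
}')
echo "$forced"
```
.De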