.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" %sccs.include.proprietary.roff%
.\"
.\"	@(#)awk.1	6.6 (Berkeley) 08/07/91
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Op Fl F Ar c
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar prog_file .
.Pp
.Bl -tag -width flag
.It Fl F Ns Ar c
Specify a field separator of
.Ar c .
.It Fl f Ar prog_file
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.El
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
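.Pp
As a minimal sketch of this rule, assuming a shell that provides
.Xr echo 1
and a hypothetical one-line input,
a line that matches two patterns triggers both actions,
in program order:
.Bd -literal -offset indent
echo "ab" | awk '/a/ { print "has a" }
/b/ { print "has b" }'
.Ed
.Pp
This should print
.Li "has a"
and then
.Li "has b" .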
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
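.Pp
Field splitting can be illustrated with a short sketch;
the three-word input line is hypothetical and a shell with
.Xr echo 1
is assumed:
.Bd -literal -offset indent
echo "alpha beta gamma" | awk '{ print $2; print $0 }'
.Ed
.Pp
This should print
.Li beta
followed by the whole line,
.Li "alpha beta gamma" .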
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern { action }
.Pp
A missing { action } means print the line;
a missing pattern always matches.
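.Pp
The two defaults can be sketched as follows,
assuming hypothetical input lines supplied by
.Xr echo 1 .
The first command has a pattern and no action,
the second an action and no pattern:
.Bd -literal -offset indent
(echo "keep this"; echo "drop that") | awk '/keep/'
(echo "one"; echo "two") | awk '{ print "line:", $0 }'
.Ed
.Pp
The first command should print only
.Li "keep this" ;
the second should print
.Li "line: one"
and
.Li "line: two" .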
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Bd -unfilled -offset indent
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.Ed
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Li x[i] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
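.Pp
A minimal sketch of string subscripts, assuming hypothetical
one-word input lines; the array name
.Li count
is an illustrative choice.
Each element starts out null and is treated as zero when incremented:
.Bd -literal -offset indent
(echo "a"; echo "b"; echo "a") | awk '{ count[$1]++ }
END { print "a:", count["a"]; print "b:", count["b"] }'
.Ed
.Pp
This should print
.Li "a: 2"
and
.Li "b: 1" .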
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
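.Pp
The difference between the two statements can be sketched as follows,
with a hypothetical input line.
.Li ORS
is passed to
.Ic printf
explicitly because
.Ic printf
does not append the output record separator on its own:
.Bd -literal -offset indent
echo "pi 3.14159" |
awk '{ print $1, $2; printf "%s=%.2f%s", $1, $2, ORS }'
.Ed
.Pp
With the default separators this should print
.Li "pi 3.14159"
and then
.Li "pi=3.14" .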
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Ar n Ns No -character
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr ...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
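.Pp
The built-in functions can be exercised with a short sketch;
the input lines are hypothetical:
.Bd -literal -offset indent
echo "hello world" |
awk '{ print length, length($1), substr($0, 7, 5) }'
echo "16" |
awk '{ print sqrt($1), int($1 / 5), sprintf("[%5.1f]", $1) }'
.Ed
.Pp
The first command should print
.Li "11 5 world" ,
the second
.Li "4 3 [ 16.0]" .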
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
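.Pp
A minimal sketch of a two-pattern range, assuming hypothetical
one-word input lines.
The lines that match the first and second patterns are themselves
part of the range:
.Bd -literal -offset indent
(echo a; echo start; echo b; echo stop; echo c) |
awk '/start/, /stop/'
.Ed
.Pp
This should print
.Li start ,
.Li b
and
.Li stop ,
one per line.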
.Pp
A relational expression is one of the following:
.Bd -unfilled -offset indent
expression matchop regular-expression
expression relop expression
.Ed
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
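.Pp
As an illustration, the following sketch combines a matchop test
and a relop test with &&; the input lines are hypothetical:
.Bd -literal -offset indent
(echo "apple 5"; echo "avocado 12"; echo "banana 20") |
awk '$1 ~ /^a/ && $2 > 10 { print $1 }'
.Ed
.Pp
Only
.Li avocado
satisfies both conditions, so only that word should be printed.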
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
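.Pp
A minimal sketch of both special patterns, assuming two hypothetical
input lines; the variable name
.Li n
is an illustrative choice:
.Bd -literal -offset indent
(echo x; echo y) | awk 'BEGIN { print "start" }
{ n++ }
END { print "lines:", n }'
.Ed
.Pp
This should print
.Li start
before any input is read and
.Li "lines: 2"
after the last line.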
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = \&"c\&" }
.Pp
or by using the
.Fl F Ns Ar c
option.
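.Pp
Both forms can be sketched with a colon as the separator;
the input line is hypothetical:
.Bd -literal -offset indent
echo "root:x:0" | awk -F: '{ print $1, $3 }'
echo "root:x:0" | awk 'BEGIN { FS = ":" } { print $1, $3 }'
.Ed
.Pp
Each command should print
.Li "root 0" .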
.Pp
Other variable names with special meanings
include
.Pp
.Bl -tag -width "file name" -compact
.It Li NF
the number of fields in the current record;
.It Li NR
the ordinal number of the current record;
.It Li FILENAME
the name of the current input file;
.It Li OFS
the output field separator (default blank);
.It Li ORS
the output record separator (default newline);
.It Li OFMT
the output format for numbers (default "%.6g").
.El
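.Pp
A short sketch using three of these variables,
assuming two hypothetical input lines:
.Bd -literal -offset indent
(echo "a b c"; echo "d e") | awk 'BEGIN { OFS = "-" }
{ print NR, NF, $1 }'
.Ed
.Pp
This should print
.Li 1-3-a
and
.Li 2-2-d ,
that is, the record number, the field count and the first field,
joined by the output field separator.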
.Sh EXAMPLES
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print the first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up the first column, print the sum and average:
.Bd -literal -offset indent
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.Ed
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from the previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
.Rs
.%A A. V. Aho
.%A B. W. Kernighan
.%A P. J. Weinberger
.%T "Awk \- a pattern scanning and processing language"
.Re
.Sh HISTORY
The version of
.Nm awk
this manual page describes
appeared in
.At v7 .
A much improved
and true-to-the-book version of
.Nm awk
appeared in the
.Tn AT&T
Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number, add 0 to it;
to force it to be treated as a string, concatenate "" (an empty
string) to it.
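.Pp
Both idioms can be sketched as follows, with a hypothetical input line.
The first test compares the fields as numbers,
the second compares them character by character as strings:
.Bd -literal -offset indent
echo "10 9" | awk '{
if ($1 + 0 > $2 + 0) print "as numbers: 10 > 9"
if (($1 "") < ($2 "")) print "as strings: 10 < 9"
}'
.Ed
.Pp
Both messages should be printed, since 10 exceeds 9 numerically
while the string "10" sorts before the string "9".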