.\" Copyright (c) 1990 Regents of the University of California.
.\" All rights reserved.  The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\"     @(#)awk.1	6.4 (Berkeley) 07/24/90
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar file .
.Pp
.Tw Ds
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
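.Pp
As an illustration of string subscripts, the following sketch counts how
often each first field occurs.
It assumes a POSIX shell and any awk on the search path; the input words
and the array name n are made up for the example.
An element that was never assigned is the null string, so 0 is added to
it to print it as a number.

```shell
# Hypothetical input: one word per line.
counts=$({ echo apple; echo pear; echo apple; } |
    awk '{ n[$1]++ } END { print n["apple"], n["pear"], n["plum"] + 0 }')
echo "$counts"
```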
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr \&...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
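.Pp
A short sketch of the built-in functions described above, assuming a
POSIX shell and any awk on the search path.
The input line and the format string are chosen only for illustration.

```shell
# length with no argument measures the whole line; substr positions start at 1.
out=$(echo 'hello world' |
    awk '{ print length, substr($0, 7, 5), sprintf("%03d", int(7.9)) }')
echo "$out"
```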
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Pp
.Ds I
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
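.Pp
The sketch below combines a matchop and a relop in a single pattern.
It assumes a POSIX shell and any awk on the search path; the names and
numbers in the input are invented.

```shell
# Select lines whose first field starts with "a" and whose second exceeds 20.
out=$({ echo 'alice 12'; echo 'bob 30'; echo 'anna 45'; } |
    awk '$1 ~ /^a/ && $2 > 20 { print $1 }')
echo "$out"
```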
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
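.Pp
Both ways of setting the field separator can be compared side by side.
This is a sketch assuming a POSIX shell and any awk on the search path;
the colon-separated line is a made-up sample.

```shell
line='root:x:0:0'
# Separator given on the command line.
with_flag=$(echo "$line" | awk -F: '{ print $1, $3 }')
# Separator set in a BEGIN action.
with_begin=$(echo "$line" | awk 'BEGIN { FS = ":" } { print $1, $3 }')
echo "$with_flag"
echo "$with_begin"
```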
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
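.Pp
A minimal sketch of these variables in use, assuming a POSIX shell and
any awk on the search path; the two input lines are invented.
Each output line shows the record number and the field count, joined by
the output field separator set in the BEGIN action.

```shell
out=$({ echo 'a b c'; echo 'd e'; } |
    awk 'BEGIN { OFS = "-" } { print NR, NF }')
echo "$out"
```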
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print the first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up the first column, print the sum and average:
.Pp
.Ds I
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from the previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX.  A much improved,
true-to-the-book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number, add 0 to it;
to force it to be treated as a string, concatenate "" (an empty
string) to it.
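.Pp
The two idioms can be seen in a comparison.
This is a sketch assuming a POSIX shell and any awk on the search path;
the variable names num and str are made up.
Compared as numbers, 10 is not less than 9; compared as strings, "10"
sorts before "9".

```shell
out=$(echo '10 9' |
    awk '{ num = (($1 + 0) < ($2 + 0)); str = (($1 "") < ($2 "")); print num, str }')
echo "$out"
```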