.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" %sccs.include.proprietary.roff%
.\"
.\"	@(#)awk.1	6.6 (Berkeley) 08/07/91
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Op Fl F Ar c
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar prog_file .
.Pp
.Bl -tag -width flag
.It Fl F Ns Ar c
Specify a field separator of
.Ar c .
.It Fl f
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.El
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Bd -unfilled -offset indent
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.Ed
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Ar x Ns Op i )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Ar n Ns \-character
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr ...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Bd -unfilled -offset indent
expression matchop regular-expression
expression relop expression
.Ed
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = \(dqc\(dq }
.Pp
or by using the
.Fl F Ns Ar c
option.
.Pp
Other variable names with special meanings
include
.Pp
.Bl -tag -width "file name" -compact
.It Li NF
the number of fields in the current record;
.It Li NR
the ordinal number of the current record;
.It Li FILENAME
the name of the current input file;
.It Li OFS
the output field separator (default blank);
.It Li ORS
the output record separator (default newline);
.It Li OFMT
the output format for numbers (default "%.6g").
.El
.Sh EXAMPLES
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Bd -literal -offset indent
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.Ed
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
.Rs
.%A A. V. Aho
.%A B. W. Kernighan
.%A P. J. Weinberger
.%T "Awk \- a pattern scanning and processing language"
.Re
.Sh HISTORY
The version of
.Nm awk
this man page describes
appeared in
.At v7 .
A much improved
and true to the book version of
.Nm awk
appeared in the
.Tn AT&T
Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate "" (an empty
string) to it.
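.Pp
As a minimal sketch of these two idioms, assume an input line such as
.Dq 10 10.0
(an illustrative value, not taken from this manual).
The program below compares the first two fields once as numbers
and once as strings:
.Bd -literal -offset indent
{
	# adding 0 forces both fields to be compared as numbers
	if ($1 + 0 == $2 + 0) print "equal as numbers"
	# concatenating "" forces both fields to be compared as strings
	if ($1 "" != $2 "") print "different as strings"
}
.Ed
.Pp
With a POSIX-conforming
.Nm awk
both messages should be printed for that input line;
the behavior of other versions has not been checked here.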