.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" %sccs.include.proprietary.roff%
.\"
.\" @(#)awk.1 6.5 (Berkeley) 04/17/91
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar prog_file .
.Pp
.Tw Ds
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next    # skip remaining patterns on this input line
exit    # skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr \&...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
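.Pp
The string functions above can be tried from a shell.
The following is a minimal sketch, assuming a POSIX shell and an
.Nm awk
that accepts the function forms described in this section;
the input text is made up for illustration:
.Pp
.Ds I
echo "hello world" | awk '{
    print length($0), substr($1, 2, 3), sprintf("%d fields", NF)
}'
.De
.Pp
With that input the command should print:
.Pp
.Ds I
11 ell 2 fields
.De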
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Pp
.Ds I
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Pp
.Ds I
    { s += $1 }
END { print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX.
A much improved and true-to-the-book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number, add 0 to it;
to force it to be treated as a string, concatenate "" (an empty
string) to it.
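.Pp
Both idioms can be seen side by side in a small experiment.
This is a minimal sketch, assuming a POSIX shell and an
.Nm awk
that accepts parenthesized comparisons in a conditional;
the sample input is hypothetical:
.Pp
.Ds I
echo "10 10.0" | awk '{
    # adding 0 forces a numeric comparison
    if (($1 + 0) == ($2 + 0)) print "equal as numbers"
    # concatenating "" forces a string comparison
    if (($1 "") != ($2 "")) print "different as strings"
}'
.De
.Pp
On an implementation that behaves as described, both messages are printed:
the two fields have the same numeric value but differ as strings.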