.\" Copyright (c) 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" %sccs.include.redist.man%
.\"
.\" @(#)awk.1 6.2 (Berkeley) 06/11/90
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.\".Op Op Fl \&f Ar file Op Ar prog
.Cx \&[
.Op Fl f Ar file
.Op Ar prog
.Cx \&]
.Cx
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar file .
.Pp
.Tw Fl
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next # skip remaining patterns on this input line
exit # skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Cx Ic substr
.Cx (
.Ar s ,
.Ar \& m ,
.Ar \& n )
.Cx
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The function
.Cx Ic sprintf
.Cx (
.Ar fmt ,
.Ar \& expr ,
.Ar \& expr ,
.Ar \& ... )
.Cx
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Ds
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print the first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up the first column, print the sum and average:
.Pp
.Ds I
    { s += $1 }
END { print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from the previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX.
A much improved
and true to the book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980s.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
the null string ("")
to it.
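.Pp
The two conversion idioms can be sketched as follows.
This is only an illustration, assuming a POSIX-style
awk on the search path; the names a, b, n, m and out are hypothetical,
and the version described on this page may differ in detail.
.Pp
.Ds I
```sh
# Assumption: a POSIX-style awk is available as "awk".
# The names a, b, n, m and out are illustrative only.
out=$(awk 'BEGIN {
    a = "10"; b = "9"
    # Both operands are strings, so "10" sorts before "9".
    if (a < b) print "string compare: 10 < 9"
    # Adding 0 forces a numeric comparison.
    if (a + 0 > b + 0) print "numeric compare: 10 > 9"
    n = 10; m = 9
    # Concatenating "" forces a string comparison of numbers.
    if ((n "") < (m "")) print "forced string compare: 10 < 9"
}')
echo "$out"
```
.De
.Pp
With such an awk all three messages are printed,
since each comparison is made in the forced type.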