.\" Copyright (c) 1990 Regents of the University of California.
.\" All rights reserved.  The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\"     @(#)awk.1	6.4 (Berkeley) 07/24/90
.\"
.Dd
.Dt AWK 1
.Os ATT 7
.Sh NAME
.Nm awk
.Nd pattern scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Oo
.Op Fl \&F Ar \&c
.Oo
.Op Fl f Ar prog_file
.Op Ar prog
.Ar
.Sh DESCRIPTION
.Nm Awk
scans each input
.Ar file
for lines that match any of a set of patterns specified in
.Ar prog .
With each pattern in
.Ar prog
there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
The set of patterns may appear literally as
.Ar prog
or in a file
specified as
.Fl f
.Ar file .
.Pp
.Tw Ds
.Tp Cx Fl F
.Ar c
.Cx
Specify a field separator of
.Ar c .
.Tp Fl f
Use
.Ar prog_file
as an input
.Ar prog
(an awk script).
.Tp
.Pp
Files are read in order;
if there are no files, the standard input is read.
The file name
.Sq Fl
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.Pp
An input line is made up of fields separated by white space.
(This default can be changed by using
.Li FS ,
.Em vide infra . )
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.Pp
A pattern-action statement has the form
.Pp
.Dl pattern {action}
.Pp
A missing { action } means print the line;
a missing pattern always matches.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Ds I
if ( conditional ) statement [ else statement ]
while ( conditional ) statement
for ( expression ; conditional ; expression ) statement
break
continue
{ [ statement ] ... }
variable = expression
print [ expression-list ] [ >expression ]
printf format [, expression-list ] [ >expression ]
next	# skip remaining patterns on this input line
exit	# skip the rest of the input
.De
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Cx x
.Op i
.Cx )
.Cx
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Ar \&>file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.Ic printf
statement formats its expression list according to the format
(see
.Xr printf 3 ) .
.Pp
The built-in function
.Ic length
returns the length of its argument
taken as a string,
or of the whole line if no argument is given.
There are also built-in functions
.Ic exp ,
.Ic log ,
.Ic sqrt
and
.Ic int .
The last truncates its argument to an integer.
The function
.Fn substr s m n
returns the
.Cx Ar n
.Cx \-
.Cx character
.Cx
substring of
.Ar s
that begins at position
.Ar m .
The
.Fn sprintf fmt expr expr \&...
function
formats the expressions
according to the
.Xr printf 3
format given by
.Ar fmt
and returns the resulting string.
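.Pp
As a minimal sketch of the built-in functions described above
(the sample string and the numeric values are illustrative assumptions,
not taken from any particular input):
.Pp
.Ds I
BEGIN {
        s = "pattern scanning"
        print length(s)                   # 16 characters
        print substr(s, 1, 7)             # pattern
        print int(7 / 2)                  # 3
        print sprintf("%05.1f", 3.14159)  # 003.1
}
.De
.Pp
Because the whole program is a
.Li BEGIN
action, it reads no input lines.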
.Pp
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.Xr egrep 1 .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.Pp
A relational expression is one of the following:
.Pp
.Ds I
expression matchop regular-expression
expression relop expression
.De
.Pp
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special patterns
.Li BEGIN
and
.Li END
may be used to capture control before the first input line is read
and after the last.
.Li BEGIN
must be the first pattern,
.Li END
the last.
.Pp
A single character
.Ar c
may be used to separate the fields by starting
the program with
.Pp
.Dl BEGIN { FS = "c" }
.Pp
or by using the
.Cx Fl F
.Ar c
.Cx
option.
.Pp
Other variable names with special meanings
include
.Dp Li NF
the number of fields in the current record;
.Dp Li NR
the ordinal number of the current record;
.Dp Li FILENAME
the name of the current input file;
.Dp Li OFS
the output field separator (default blank);
.Dp Li ORS
the output record separator (default newline);
.Dp Li OFMT
the output format for numbers (default "%.6g").
.Dp
.Pp
.Sh EXAMPLES
.Pp
Print lines longer than 72 characters:
.Pp
.Dl length > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Add up first column, print sum and average:
.Pp
.Ds I
      { s += $1 }
END   { print "sum is", s, " average is", s/NR }
.De
.Pp
Print fields in reverse order:
.Pp
.Dl { for (i = NF; i > 0; \-\-i) print $i }
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Print all lines whose first field is different from the previous one:
.Pp
.Dl $1 != prev { print; prev = $1 }
.Sh SEE ALSO
.Xr lex 1 ,
.Xr sed 1
.Pp
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.Em Awk \- a pattern scanning and processing language
.Sh HISTORY
.Nm Awk
appeared in Version 7 AT&T UNIX.  A much improved
and true to the book version of
.Nm awk
appeared in the AT&T Toolchest in the late 1980's.
The version of
.Nm awk
this manual page describes
is a derivative of the original and not the Toolchest version.
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate "" (an empty
string) to it.
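.Pp
A minimal sketch of both idioms, assuming hypothetical input
whose first field holds a number:
.Pp
.Ds I
($1 + 0) > 10    { print "numeric comparison:", $1 }
($1 "") > "10"   { print "string comparison:", $1 }
.De
.Pp
Under these assumptions an input line whose first field is 9
matches only the second pattern:
9 is not greater than 10 as a number,
but the string "9" sorts after the string "10".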