.\" $OpenBSD: awk.1,v 1.68 2023/09/21 18:16:12 jmc Exp $
.\"
.\" Copyright (C) Lucent Technologies 1997
.\" All Rights Reserved
.\"
.\" Permission to use, copy, modify, and distribute this software and
.\" its documentation for any purpose and without fee is hereby
.\" granted, provided that the above copyright notice appear in all
.\" copies and that both that the copyright notice and this
.\" permission notice and warranty disclaimer appear in supporting
.\" documentation, and that the name Lucent Technologies or any of
.\" its entities not be used in advertising or publicity pertaining
.\" to distribution of the software without specific, written prior
.\" permission.
.\"
.\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
.\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
.\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
.\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
.\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
.\" THIS SOFTWARE.
.\"
.Dd $Mdocdate: September 21 2023 $
.Dt AWK 1
.Os
.Sh NAME
.Nm awk
.Nd pattern-directed scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Op Fl safe
.Op Fl V
.Op Fl d Ns Op Ar n
.Op Fl F Ar fs | Fl -csv
.Op Fl v Ar var Ns = Ns Ar value
.Op Ar prog | Fl f Ar progfile
.Ar
.Sh DESCRIPTION
.Nm
scans each input
.Ar file
for lines that match any of a set of patterns specified literally in
.Ar prog
or in one or more files specified as
.Fl f Ar progfile .
With each pattern there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
The file name
.Sq -
means the standard input.
Any
.Ar file
of the form
.Ar var Ns = Ns Ar value
is treated as an assignment, not a filename,
and is executed at the time it would have been opened if it were a filename.
.Pp
The options are as follows:
.Bl -tag -width "-safe "
.It Fl -csv
Process records using the (more or less) standard comma-separated values
.Pq CSV
format instead of the input field separator.
When the
.Fl -csv
option is specified, attempts to change the input field separator
or record separator are ignored.
.It Fl d Ns Op Ar n
Debug mode.
Set debug level to
.Ar n ,
or 1 if
.Ar n
is not specified.
A value greater than 1 causes
.Nm
to dump core on fatal errors.
.It Fl F Ar fs
Define the input field separator to be the regular expression
.Ar fs .
.It Fl f Ar progfile
Read program code from the specified file
.Ar progfile
instead of from the command line.
.It Fl safe
Disable file output
.Pf ( Ic print No > ,
.Ic print No >> ) ,
process creation
.Po
.Ar cmd | Ic getline ,
.Ic print | ,
.Ic system
.Pc
and access to the environment
.Pf ( Va ENVIRON ;
see the section on variables below).
This is a first
.Pq and not very reliable
approximation to a
.Dq safe
version of
.Nm .
.It Fl V
Print the version number of
.Nm
to standard output and exit.
.It Fl v Ar var Ns = Ns Ar value
Assign
.Ar value
to variable
.Ar var
before
.Ar prog
is executed;
any number of
.Fl v
options may be present.
.El
.Pp
The input is normally made up of input lines
.Pq records
separated by newlines, or by the value of
.Va RS .
If
.Va RS
is null, then any number of blank lines are used as the record separator,
and newlines are used as field separators
(in addition to the value of
.Va FS ) .
This is convenient when working with multi-line records.
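.Pp
As a minimal sketch of the multi-line case
(the file name
.Pa addresses.txt
and its layout of one field per line are hypothetical),
the following prints the first line of every record in a file whose
records are separated by blank lines:
.Bd -literal -offset indent
awk 'BEGIN { RS = ""; FS = "\en" } { print $1 }' addresses.txt
.Ed
.Pp
Setting
.Va FS
to a newline here is a choice made for the illustration:
it makes each line of a record exactly one field.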
.Pp
An input line is normally made up of fields separated by whitespace,
or by the value of the field separator
.Va FS
at the time the line is read.
The fields are denoted
.Va $1 , $2 , ... ,
while
.Va $0
refers to the entire line.
.Va FS
may be set to either a single character or a regular expression.
As a special case, if
.Va FS
is a single space
.Pq the default ,
fields will be split by one or more whitespace characters.
If
.Va FS
is null, the input line is split into one field per character.
.Pp
Normally, any number of blanks separate fields.
In order to set the field separator to a single blank, use the
.Fl F
option with a value of
.Sq [\ \&] .
If a field separator of
.Sq t
is specified,
.Nm
treats it as if
.Sq \et
had been specified and uses
.Aq TAB
as the field separator.
In order to use a literal
.Sq t
as the field separator, use the
.Fl F
option with a value of
.Sq [t] .
The field separator is usually set via the
.Fl F
option or from inside a
.Ic BEGIN
block so that it takes effect before the input is read.
.Pp
A pattern-action statement has the form:
.Pp
.D1 Ar pattern Ic \&{ Ar action Ic \&}
.Pp
A missing
.Ic \&{ Ar action Ic \&}
means print the line;
a missing pattern always matches.
Pattern-action statements are separated by newlines or semicolons.
.Pp
Newlines are permitted after a terminating statement or following a comma
.Pq Sq ,\& ,
an open brace
.Pq Sq { ,
a logical AND
.Pq Sq && ,
a logical OR
.Pq Sq || ,
after the
.Sq do
or
.Sq else
keywords,
or after the closing parenthesis of an
.Sq if ,
.Sq for ,
or
.Sq while
statement.
Additionally, a backslash
.Pq Sq \e
can be used to escape a newline between tokens.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Bl -tag -width Ds -offset indent -compact
.It Ic if Ar ( expression ) Ar statement Op Ic else Ar statement
.It Ic while Ar ( expression ) Ar statement
.It Ic for Ar ( expression ; expression ; expression ) statement
.It Ic for Ar ( var Ic in Ar array ) statement
.It Ic do Ar statement Ic while Ar ( expression )
.It Ic break
.It Ic continue
.It Xo Ic {
.Op Ar statement ...
.Ic }
.Xc
.It Xo Ar expression
.No # commonly
.Ar var No = Ar expression
.Xc
.It Xo Ic print
.Op Ar expression-list
.Op > Ns Ar expression
.Xc
.It Xo Ic printf Ar format
.Op Ar ... , expression-list
.Op > Ns Ar expression
.Xc
.It Ic return Op Ar expression
.It Xo Ic next
.No # skip remaining patterns on this input line
.Xc
.It Xo Ic nextfile
.No # skip rest of this file, open next, start at top
.Xc
.It Xo Ic delete
.Sm off
.Ar array Ic \&[ Ar expression Ic \&]
.Sm on
.No # delete an array element
.Xc
.It Xo Ic delete Ar array
.No # delete all elements of array
.Xc
.It Xo Ic exit
.Op Ar expression
.No # exit processing, and perform
.Ic END
processing; status is
.Ar expression
.Xc
.El
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty
.Ar expression-list
stands for
.Ar $0 .
String constants are quoted
.Li \&"" ,
with the usual C escapes recognized within
(see
.Xr printf 1
for a complete list of these).
Expressions take on string or numeric values as appropriate,
and are built using the operators
.Ic + \- * / % ^
.Pq exponentiation ,
and concatenation
.Pq indicated by whitespace .
The operators
.Ic \&! ++ \-\- += \-= *= /= %= ^=
.Ic > >= < <= == != ?\&:
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Li x[i] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
Multiple subscripts such as
.Li [i,j,k]
are permitted; the constituents are concatenated,
separated by the value of
.Va SUBSEP
.Pq see the section on variables below .
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Pf >\ \& Ar file
or
.Pf >>\ \& Ar file
is present or on a pipe if
.Pf |\ \& Ar cmd
is present), separated by the current output field separator,
and terminated by the output record separator.
.Ar file
and
.Ar cmd
may be literal names or parenthesized expressions;
identical string values in different statements denote
the same open file.
The
.Ic printf
statement formats its expression list according to the
.Ar format
(see
.Xr printf 1 ) .
.Pp
Patterns are arbitrary Boolean combinations
(with
.Ic "\&! || &&" )
of regular expressions and
relational expressions.
.Nm
supports extended regular expressions
.Pq EREs .
See
.Xr re_format 7
for more information on regular expressions.
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions, using the operators
.Ic ~
and
.Ic !~ .
.Pf / Ar re Ns /
is a constant regular expression;
any string (constant or variable) may be used
as a regular expression,
except in the position of an isolated regular expression in a pattern.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
from an occurrence of the first pattern
through an occurrence of the second.
.Pp
A relational expression is one of the following:
.Pp
.Bl -tag -width Ds -offset indent -compact
.It Ar expression matchop regular-expression
.It Ar expression relop expression
.It Ar expression Ic in Ar array-name
.It Xo Ic \&( Ns
.Ar expr , expr , \&... Ns Ic \&) in
.Ar array-name
.Xc
.El
.Pp
where a
.Ar relop
is any of the six relational operators in C, and a
.Ar matchop
is either
.Ic ~
(matches)
or
.Ic !~
(does not match).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special pattern
.Ic BEGIN
may be used to capture control before the first input line is read.
The special pattern
.Ic END
may be used to capture control after processing is finished.
.Ic BEGIN
and
.Ic END
do not combine with other patterns.
They may appear multiple times in a program and execute
in the order they are read by
.Nm .
.Pp
Variable names with special meanings:
.Pp
.Bl -tag -width "FILENAME " -compact
.It Va ARGC
Argument count, assignable.
.It Va ARGV
Argument array, assignable;
non-null members are taken as filenames.
.It Va CONVFMT
Conversion format when converting numbers
(default
.Qq Li %.6g ) .
.It Va ENVIRON
Array of environment variables; subscripts are names.
.It Va FILENAME
The name of the current input file.
.It Va FNR
Ordinal number of the current record in the current file.
.It Va FS
Regular expression used to separate fields (default whitespace);
also settable by option
.Fl F Ar fs .
.It Va NF
Number of fields in the current record.
.Va $NF
can be used to obtain the value of the last field in the current record.
.It Va NR
Ordinal number of the current record.
.It Va OFMT
Output format for numbers (default
.Qq Li %.6g ) .
.It Va OFS
Output field separator (default blank).
.It Va ORS
Output record separator (default newline).
.It Va RLENGTH
The length of the string matched by the
.Fn match
function.
.It Va RS
Input record separator (default newline).
If empty, blank lines separate records.
If more than one character long,
.Va RS
is treated as a regular expression, and records are
separated by text matching the expression.
.It Va RSTART
The starting position of the string matched by the
.Fn match
function.
.It Va SUBSEP
Separates multiple subscripts (default 034).
.El
.Sh FUNCTIONS
The awk language has a variety of built-in functions:
arithmetic, string, input/output, general, and bit-operation.
.Pp
Functions may be defined (at the position of a pattern-action statement)
thusly:
.Pp
.Dl function foo(a, b, c) { ...; return x }
.Pp
Parameters are passed by value if scalar, and by reference if array name;
functions may be called recursively.
Parameters are local to the function; all other variables are global.
Thus local variables may be created by providing excess parameters in
the function definition.
.Ss Arithmetic Functions
.Bl -tag -width "atan2(y, x)"
.It Fn atan2 y x
Return the arctangent of
.Fa y Ns / Ns Fa x
in radians.
.It Fn cos x
Return the cosine of
.Fa x ,
where
.Fa x
is in radians.
.It Fn exp x
Return the exponential of
.Fa x .
.It Fn int x
Return
.Fa x
truncated to an integer value.
.It Fn log x
Return the natural logarithm of
.Fa x .
.It Fn rand
Return a random number,
.Fa n ,
such that
.Sm off
.Pf 0 \*(Le Fa n No \*(Lt 1 .
.Sm on
Random numbers are non-deterministic unless a seed is explicitly set with
.Fn srand .
.It Fn sin x
Return the sine of
.Fa x ,
where
.Fa x
is in radians.
.It Fn sqrt x
Return the square root of
.Fa x .
.It Fn srand expr
Sets seed for
.Fn rand
to
.Fa expr
and returns the previous seed.
If
.Fa expr
is omitted,
.Fn rand
will return non-deterministic random numbers.
.El
.Ss String Functions
.Bl -tag -width "split(s, a, fs)"
.It Fn gensub r s h [t]
Search the target string
.Ar t
for matches of the regular expression
.Ar r .
If
.Ar h
is a string beginning with
.Ic g
or
.Ic G ,
then replace all matches of
.Ar r
with
.Ar s .
Otherwise,
.Ar h
is a number indicating which match of
.Ar r
to replace.
If no
.Ar t
is supplied,
.Va $0
is used instead.
.\"Within the replacement text
.\".Ar s ,
.\"the sequence
.\".Ar \en ,
.\"where
.\".Ar n
.\"is a digit from 1 to 9, may be used to indicate just the text that
.\"matched the
.\".Ar n Ap th
.\"parenthesized subexpression.
.\"The sequence
.\".Ic \e0
.\"represents the entire text, as does the character
.\".Ic & .
Unlike
.Fn sub
and
.Fn gsub ,
the modified string is returned as the result of the function,
and the original target is
.Em not
changed.
Note that
.Ar \en
sequences within the replacement string
.Ar s ,
as supported by GNU
.Nm ,
are
.Em not
supported at this time.
.It Fn gsub r t s
The same as
.Fn sub
except that all occurrences of the regular expression are replaced.
.Fn gsub
returns the number of replacements.
.It Fn index s t
The position in
.Fa s
where the string
.Fa t
occurs, or 0 if it does not.
.It Fn length s
The length of
.Fa s
taken as a string,
number of elements in an array for an array argument,
or length of
.Va $0
if no argument is given.
.It Fn match s r
The position in
.Fa s
where the regular expression
.Fa r
occurs, or 0 if it does not.
The variable
.Va RSTART
is set to the starting position of the matched string
.Pq which is the same as the returned value
or zero if no match is found.
The variable
.Va RLENGTH
is set to the length of the matched string,
or \-1 if no match is found.
.It Fn split s a fs
Splits the string
.Fa s
into array elements
.Va a[1] , a[2] , ... , a[n]
and returns
.Va n .
The separation is done with the regular expression
.Ar fs
or with the field separator
.Va FS
if
.Ar fs
is not given.
An empty string as field separator splits the string
into one array element per character.
.It Fn sprintf fmt expr ...
The string resulting from formatting
.Fa expr , ...
according to the
.Xr printf 1
format
.Fa fmt .
.It Fn sub r t s
Substitutes
.Fa t
for the first occurrence of the regular expression
.Fa r
in the string
.Fa s .
If
.Fa s
is not given,
.Va $0
is used.
An ampersand
.Pq Sq &
in
.Fa t
is replaced with the text in
.Fa s
that matched the regular expression
.Fa r .
A literal ampersand can be specified by preceding it with two backslashes
.Pq Sq \e\e .
A literal backslash can be specified by preceding it with another backslash
.Pq Sq \e\e .
.Fn sub
returns the number of replacements.
.It Fn substr s m n
Return at most the
.Fa n Ns -character
substring of
.Fa s
that begins at position
.Fa m
counted from 1.
If
.Fa n
is omitted, or if
.Fa n
specifies more characters than are left in the string,
the length of the substring is limited by the length of
.Fa s .
.It Fn tolower str
Returns a copy of
.Fa str
with all upper-case characters translated to their
corresponding lower-case equivalents.
.It Fn toupper str
Returns a copy of
.Fa str
with all lower-case characters translated to their
corresponding upper-case equivalents.
.El
.Ss Time Functions
This version of
.Nm
provides the following functions for obtaining and formatting time
stamps.
.Bl -tag -width indent
.It Fn mktime datespec
Converts
.Fa datespec
into a timestamp in the same form as a value returned by
.Fn systime .
The
.Fa datespec
is a string composed of six or seven numbers separated by whitespace:
.Bd -literal -offset indent
YYYY MM DD HH MM SS [DST]
.Ed
.Pp
The fields in
.Fa datespec
are as follows:
.Bl -tag -width "YYYY"
.It YYYY
Year: a four-digit year, including the century.
.It MM
Month: a number from 1 to 12.
.It DD
Day: a number from 1 to 31.
.It HH
Hour: a number from 0 to 23.
.It MM
Minute: a number from 0 to 59.
.It SS
Second: a number from 0 to 60 (permitting a leap second).
.It DST
Daylight Saving Time: a positive or zero value indicates that
DST is or is not in effect, respectively.
If DST is not specified, or is negative,
.Fn mktime
will attempt to determine the correct value.
.El
.It Fn strftime "[format [, timestamp]]"
Formats
.Ar timestamp
according to the string
.Ar format .
The format string may contain any of the conversion specifications described
in the
.Xr strftime 3
manual page, as well as any arbitrary text.
The
.Ar timestamp
must be in the same form as a value returned by
.Fn mktime
and
.Fn systime .
If
.Ar timestamp
is not specified, the current time is used.
If
.Ar format
is not specified, a default format equivalent to the output of
.Xr date 1
is used.
.It Fn systime
Returns the value of time in seconds since 0 hours, 0 minutes,
0 seconds, January 1, 1970, Coordinated Universal Time (UTC).
.El
.Ss Input/Output and General Functions
.Bl -tag -width "getline [var] < file"
.It Fn close expr
Closes the file or pipe
.Fa expr .
.Fa expr
should match the string that was used to open the file or pipe.
.It Ar cmd | Ic getline Op Va var
Read a record of input from a stream piped from the output of
.Ar cmd .
If
.Va var
is omitted, the variables
.Va $0
and
.Va NF
are set.
Otherwise
.Va var
is set.
If the stream is not open, it is opened.
As long as the stream remains open, subsequent calls
will read subsequent records from the stream.
The stream remains open until explicitly closed with a call to
.Fn close .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Fn fflush [expr]
Flushes any buffered output for the file or pipe
.Fa expr ,
or all open files or pipes if
.Fa expr
is omitted.
.Fa expr
should match the string that was used to open the file or pipe.
.It Ic getline
Sets
.Va $0
to the next input record from the current input file.
This form of
.Ic getline
sets the variables
.Va NF ,
.Va NR ,
and
.Va FNR .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Ic getline Va var
Sets
.Va var
to the next input record from the current input file.
This form of
.Ic getline
sets the variables
.Va NR
and
.Va FNR .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Xo
.Ic getline Op Va var
.Pf <\ \& Ar file
.Xc
Reads the next record from
.Ar file .
If
.Va var
is omitted, the variables
.Va $0
and
.Va NF
are set.
Otherwise
.Va var
is set.
If
.Ar file
is not open, it is opened.
As long as the stream remains open, subsequent calls will read subsequent
records from
.Ar file .
.Ar file
remains open until explicitly closed with a call to
.Fn close .
.It Fn system cmd
Executes
.Fa cmd
and returns its exit status.
This will be \-1 upon error,
.Ar cmd Ns 's
exit status upon a normal exit,
256 +
.Em sig
if
.Fa cmd
was terminated by a signal, where
.Em sig
is the number of the signal,
or 512 +
.Em sig
if there was a core dump.
.El
.Ss Bit-Operation Functions
.Bl -tag -width "lshift(a, b)"
.It Fn compl x
Returns the bitwise complement of integer argument x.
.It Fn and x y
Performs a bitwise AND on integer arguments x and y.
.It Fn or x y
Performs a bitwise OR on integer arguments x and y.
.It Fn xor x y
Performs a bitwise Exclusive-OR on integer arguments x and y.
.It Fn lshift x n
Returns integer argument x shifted by n bits to the left.
.It Fn rshift x n
Returns integer argument x shifted by n bits to the right.
.El
.Sh ENVIRONMENT
The following environment variables affect the execution of
.Nm :
.Bl -tag -width POSIXLY_CORRECT
.It Ev LC_CTYPE
The character encoding
.Xr locale 1 .
It decides which byte sequences form characters, which characters are
letters, and how letters are mapped from lower to upper case and vice versa.
If unset or set to
.Qq C ,
.Qq POSIX ,
or an unsupported value, each byte is treated as a character,
and non-ASCII bytes are not regarded as letters.
.It Ev POSIXLY_CORRECT
When set, behave in accordance with the standard, even when it conflicts
with historical behavior.
.El
.Sh EXIT STATUS
.Ex -std awk
.Pp
But note that the
.Ic exit
expression can modify the exit status.
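.Pp
As a small sketch of this, assuming a POSIX shell,
the following session should report the value passed to
.Ic exit :
.Bd -literal -offset indent
$ awk 'BEGIN { exit 3 }'; echo $?
3
.Ed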
.Sh EXAMPLES
Print lines longer than 72 characters:
.Pp
.Dl length($0) > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Same, with input fields separated by comma and/or spaces and tabs:
.Bd -literal -offset indent
BEGIN { FS = ",[ \et]*|[ \et]+" }
      { print $2, $1 }
.Ed
.Pp
Add up first column, print sum and average:
.Bd -literal -offset indent
{ s += $1 }
END { print "sum is", s, " average is", s/NR }
.Ed
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Simulate
.Xr echo 1 :
.Bd -literal -offset indent
BEGIN { # Simulate echo(1)
	for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
	printf "\en"
	exit }
.Ed
.Pp
Print an error message to standard error:
.Bd -literal -offset indent
{ print "error!" > "/dev/stderr" }
.Ed
.Sh UNUSUAL FLOATING-POINT VALUES
.Nm
was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
and Infinity values, which are supported by all modern floating-point
hardware.
.Pp
Because
.Nm
uses
.Xr strtod 3
and
.Xr atof 3
to convert string values to double-precision floating-point values,
modern C libraries also convert strings starting with
.Dv inf
and
.Dv nan
into infinity and NaN values respectively.
This led to strange results,
with something like this:
.Pp
.Li echo nancy | awk '{ print $1 + 0 }'
.Pp
printing
.Dv nan
instead of zero.
.Pp
.Nm
now follows GNU
.Nm ,
and prefilters string values before attempting
to convert them to numbers, as follows:
.Bl -tag -width Ds
.It Hexadecimal values
Hexadecimal values (allowed since C99) convert to zero, as they did
prior to C99.
.It NaN values
The two strings
.Dq +NAN
and
.Dq -NAN
(case independent) convert to NaN.
No others do.
(NaNs can have signs.)
.It Infinity values
The two strings
.Dq +INF
and
.Dq -INF
(case independent) convert to positive and negative infinity, respectively.
No others do.
.El
.Sh SEE ALSO
.Xr cut 1 ,
.Xr date 1 ,
.Xr grep 1 ,
.Xr lex 1 ,
.Xr printf 1 ,
.Xr sed 1 ,
.Xr strftime 3 ,
.Xr re_format 7 ,
.Xr script 7
.Rs
.\" 4.4BSD USD:16
.\".%R Computing Science Technical Report
.\".%N 68
.\".%D July 1978
.%A A. V. Aho
.%A P. J. Weinberger
.%A B. W. Kernighan
.%T AWK \(em A Pattern Scanning and Processing Language
.%J Software \(em Practice and Experience
.%V 9:4
.%P pp. 267-279
.%D April 1979
.Re
.Rs
.%A A. V. Aho
.%A B. W. Kernighan
.%A P. J. Weinberger
.%T The AWK Programming Language
.%I Addison-Wesley
.%D 2024
.%O ISBN 0-13-826972-6
.Re
.Sh STANDARDS
The
.Nm
utility is compliant with the
.St -p1003.1-2008
specification except that consecutive backslashes in the replacement
string argument for
.Fn sub
and
.Fn gsub
are not collapsed and a slash
.Pq Ql /
does not need to be escaped in a bracket expression.
Also, the behaviour of
.Fn rand
and
.Fn srand
has been changed to support non-deterministic random numbers.
.Pp
In
.Ev LC_CTYPE Ns Li =POSIX
mode, treating non-ASCII input bytes as non-letter characters rather
than as input encoding errors intentionally violates the specification.
.Pp
The flags
.Op Fl \&dV ,
.Op Fl -csv ,
and
.Op Fl safe ,
support for regular expressions in
.Va RS ,
as well as the functions
.Fn fflush ,
.Fn gensub ,
.Fn compl ,
.Fn and ,
.Fn or ,
.Fn xor ,
.Fn lshift ,
.Fn rshift ,
.Fn mktime ,
.Fn strftime
and
.Fn systime
are extensions to that specification.
.Sh HISTORY
An
.Nm
utility appeared in
.At v7 .
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number, add 0 to it;
to force it to be treated as a string, concatenate
.Li \&""
to it.
.Pp
The scope rules for variables in functions are a botch;
the syntax is worse.
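.Pp
The excess-parameter idiom described in
.Sx FUNCTIONS
and the add-zero conversion above can be sketched together as follows.
The function name
.Fn sum
and the parameter names are illustrative only,
and the extra spaces before
.Va i
are merely a common convention for marking locals:
.Bd -literal -offset indent
# i and total are excess parameters, so they are local to sum()
function sum(arr, n,    i, total) {
	for (i = 1; i <= n; i++)
		total += arr[i]
	return total + 0	# adding 0 forces a numeric result
}
.Ed
.Pp
A caller would pass only the first two arguments, for example
.Li sum(a, 3) .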