.\"	$OpenBSD: awk.1,v 1.68 2023/09/21 18:16:12 jmc Exp $
.\"
.\" Copyright (C) Lucent Technologies 1997
.\" All Rights Reserved
.\"
.\" Permission to use, copy, modify, and distribute this software and
.\" its documentation for any purpose and without fee is hereby
.\" granted, provided that the above copyright notice appear in all
.\" copies and that both that the copyright notice and this
.\" permission notice and warranty disclaimer appear in supporting
.\" documentation, and that the name Lucent Technologies or any of
.\" its entities not be used in advertising or publicity pertaining
.\" to distribution of the software without specific, written prior
.\" permission.
.\"
.\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
.\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
.\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
.\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
.\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
.\" THIS SOFTWARE.
.\"
.Dd $Mdocdate: September 21 2023 $
.Dt AWK 1
.Os
.Sh NAME
.Nm awk
.Nd pattern-directed scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Op Fl safe
.Op Fl V
.Op Fl d Ns Op Ar n
.Op Fl F Ar fs | Fl -csv
.Op Fl v Ar var Ns = Ns Ar value
.Op Ar prog | Fl f Ar progfile
.Ar
.Sh DESCRIPTION
.Nm
scans each input
.Ar file
for lines that match any of a set of patterns specified literally in
.Ar prog
or in one or more files specified as
.Fl f Ar progfile .
With each pattern there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
The file name
.Sq -
means the standard input.
Any
.Ar file
of the form
.Ar var Ns = Ns Ar value
is treated as an assignment, not a filename,
and is executed at the time it would have been opened if it were a filename.
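.Pp
As a minimal sketch of the timing described above, the following
.Xr sh 1
fragment passes an assignment as an operand ahead of
.Sq - .
The variable name
.Va x
and the bracketed output format are illustrative assumptions only.
.Bd -literal -offset indent
```shell
# x is still unset while BEGIN runs; the operand x=5 is applied when awk
# reaches it in the argument list, just before "-" (standard input) is read.
out=$(echo line | awk 'BEGIN { print "begin:[" x "]" } { print "main:[" x "]" }' x=5 -)
echo "$out"
```
.Ed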
.Pp
The options are as follows:
.Bl -tag -width "-safe "
.It Fl -csv
Process records using the (more or less) standard comma-separated values
.Pq CSV
format instead of the input field separator.
When the
.Fl -csv
option is specified, attempts to change the input field separator
or record separator are ignored.
.It Fl d Ns Op Ar n
Debug mode.
Set debug level to
.Ar n ,
or 1 if
.Ar n
is not specified.
A value greater than 1 causes
.Nm
to dump core on fatal errors.
.It Fl F Ar fs
Define the input field separator to be the regular expression
.Ar fs .
.It Fl f Ar progfile
Read program code from the specified file
.Ar progfile
instead of from the command line.
.It Fl safe
Disable file output
.Pf ( Ic print No > ,
.Ic print No >> ) ,
process creation
.Po
.Ar cmd | Ic getline ,
.Ic print | ,
.Ic system
.Pc
and access to the environment
.Pf ( Va ENVIRON ;
see the section on variables below).
This is a first
.Pq and not very reliable
approximation to a
.Dq safe
version of
.Nm .
.It Fl V
Print the version number of
.Nm
to standard output and exit.
.It Fl v Ar var Ns = Ns Ar value
Assign
.Ar value
to variable
.Ar var
before
.Ar prog
is executed;
any number of
.Fl v
options may be present.
.El
.Pp
The input is normally made up of input lines
.Pq records
separated by newlines, or by the value of
.Va RS .
If
.Va RS
is null, then any number of blank lines are used as the record separator,
and newlines are used as field separators
(in addition to the value of
.Va FS ) .
This is convenient when working with multi-line records.
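.Pp
The multi-line record mode can be sketched as follows.
The input data is made up for illustration, and the default
.Va FS
is assumed, so each line of a record becomes one field.
.Bd -literal -offset indent
```shell
# Two records separated by one blank line; with RS set to the null string
# each line inside a record is a separate field.
out=$({ echo alice; echo london; echo; echo bob; echo paris; } |
    awk 'BEGIN { RS = "" } { print NR ": " $1 " / " $2 " (NF=" NF ")" }')
echo "$out"
```
.Ed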
.Pp
An input line is normally made up of fields separated by whitespace,
or by the value of the field separator
.Va FS
at the time the line is read.
The fields are denoted
.Va $1 , $2 , ... ,
while
.Va $0
refers to the entire line.
.Va FS
may be set to either a single character or a regular expression.
As a special case, if
.Va FS
is a single space
.Pq the default ,
fields will be split by one or more whitespace characters.
If
.Va FS
is null, the input line is split into one field per character.
.Pp
Normally, any number of blanks separate fields.
In order to set the field separator to a single blank, use the
.Fl F
option with a value of
.Sq [\ \&] .
If a field separator of
.Sq t
is specified,
.Nm
treats it as if
.Sq \et
had been specified and uses
.Aq TAB
as the field separator.
In order to use a literal
.Sq t
as the field separator, use the
.Fl F
option with a value of
.Sq [t] .
The field separator is usually set via the
.Fl F
option or from inside a
.Ic BEGIN
block so that it takes effect before the input is read.
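.Pp
A short sketch of the separator rules above, using made-up input strings;
the shell variable names are hypothetical.
.Bd -literal -offset indent
```shell
# Default FS: a run of blanks counts as a single separator.
n_default=$(echo "a  b c" | awk '{ print NF }')
# A bracket expression makes every single blank a separator, so the
# doubled blank yields an empty second field.
n_single=$(echo "a  b c" | awk -F'[ ]' '{ print NF }')
# [t] splits on the literal letter t rather than on TAB.
second=$(echo "atbtc" | awk -F'[t]' '{ print $2 }')
echo "$n_default $n_single $second"
```
.Ed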
.Pp
A pattern-action statement has the form:
.Pp
.D1 Ar pattern Ic \&{ Ar action Ic \&}
.Pp
A missing
.Ic \&{ Ar action Ic \&}
means print the line;
a missing pattern always matches.
Pattern-action statements are separated by newlines or semicolons.
.Pp
Newlines are permitted after a terminating statement or following a comma
.Pq Sq ,\& ,
an open brace
.Pq Sq { ,
a logical AND
.Pq Sq && ,
a logical OR
.Pq Sq || ,
after the
.Sq do
or
.Sq else
keywords,
or after the closing parenthesis of an
.Sq if ,
.Sq for ,
or
.Sq while
statement.
Additionally, a backslash
.Pq Sq \e
can be used to escape a newline between tokens.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Bl -tag -width Ds -offset indent -compact
.It Ic if Ar ( expression ) Ar statement Op Ic else Ar statement
.It Ic while Ar ( expression ) Ar statement
.It Ic for Ar ( expression ; expression ; expression ) statement
.It Ic for Ar ( var Ic in Ar array ) statement
.It Ic do Ar statement Ic while Ar ( expression )
.It Ic break
.It Ic continue
.It Xo Ic {
.Op Ar statement ...
.Ic }
.Xc
.It Xo Ar expression
.No # commonly
.Ar var No = Ar expression
.Xc
.It Xo Ic print
.Op Ar expression-list
.Op > Ns Ar expression
.Xc
.It Xo Ic printf Ar format
.Op Ar ... , expression-list
.Op > Ns Ar expression
.Xc
.It Ic return Op Ar expression
.It Xo Ic next
.No # skip remaining patterns on this input line
.Xc
.It Xo Ic nextfile
.No # skip rest of this file, open next, start at top
.Xc
.It Xo Ic delete
.Sm off
.Ar array Ic \&[ Ar expression Ic \&]
.Sm on
.No # delete an array element
.Xc
.It Xo Ic delete Ar array
.No # delete all elements of array
.Xc
.It Xo Ic exit
.Op Ar expression
.No # exit processing, and perform
.Ic END
processing; status is
.Ar expression
.Xc
.El
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty
.Ar expression-list
stands for
.Ar $0 .
String constants are quoted
.Li \&"" ,
with the usual C escapes recognized within
(see
.Xr printf 1
for a complete list of these).
Expressions take on string or numeric values as appropriate,
and are built using the operators
.Ic + \- * / % ^
.Pq exponentiation ,
and concatenation
.Pq indicated by whitespace .
The operators
.Ic \&! ++ \-\- += \-= *= /= %= ^=
.Ic > >= < <= == != ?\&:
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Li x[i] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
Multiple subscripts such as
.Li [i,j,k]
are permitted; the constituents are concatenated,
separated by the value of
.Va SUBSEP
.Pq see the section on variables below .
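.Pp
The way multiple subscripts collapse into one string key can be sketched
as follows.
The array name
.Va cell
and its subscripts are hypothetical.
.Bd -literal -offset indent
```shell
out=$(awk 'BEGIN {
    cell["x", "y"] = 1          # stored under the key "x" SUBSEP "y"
    for (k in cell) { n = split(k, part, SUBSEP); print n, part[1], part[2] }
    if (("x", "y") in cell) print "found"
    if (!(("y", "x") in cell)) print "order matters"
}')
echo "$out"
```
.Ed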
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Pf >\ \& Ar file
or
.Pf >>\ \& Ar file
is present or on a pipe if
.Pf |\ \& Ar cmd
is present), separated by the current output field separator,
and terminated by the output record separator.
.Ar file
and
.Ar cmd
may be literal names or parenthesized expressions;
identical string values in different statements denote
the same open file.
The
.Ic printf
statement formats its expression list according to the
.Ar format
(see
.Xr printf 1 ) .
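.Pp
A small sketch of output through a pipe, assuming that
.Xr sort 1
is available; the data values are made up.
Because both
.Ic print
statements use the same command string, they write to one shared pipe.
.Bd -literal -offset indent
```shell
out=$(awk 'BEGIN {
    OFS = ":"
    print "b", 2 | "sort"       # fields are joined with OFS
    print "a", 1 | "sort"       # same string, so the same open pipe
    close("sort")               # sort writes its result once the pipe closes
    printf "%s has %d items%s", "list", 2, ORS
}')
echo "$out"
```
.Ed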
.Pp
Patterns are arbitrary Boolean combinations
(with
.Ic "\&! || &&" )
of regular expressions and
relational expressions.
.Nm
supports extended regular expressions
.Pq EREs .
See
.Xr re_format 7
for more information on regular expressions.
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions, using the operators
.Ic ~
and
.Ic !~ .
.Pf / Ar re Ns /
is a constant regular expression;
any string (constant or variable) may be used
as a regular expression,
except in the position of an isolated regular expression in a pattern.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
from an occurrence of the first pattern
through an occurrence of the second.
.Pp
A relational expression is one of the following:
.Pp
.Bl -tag -width Ds -offset indent -compact
.It Ar expression matchop regular-expression
.It Ar expression relop expression
.It Ar expression Ic in Ar array-name
.It Xo Ic \&( Ns
.Ar expr , expr , \&... Ns Ic \&) in
.Ar array-name
.Xc
.El
.Pp
where a
.Ar relop
is any of the six relational operators in C, and a
.Ar matchop
is either
.Ic ~
(matches)
or
.Ic !~
(does not match).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special pattern
.Ic BEGIN
may be used to capture control before the first input line is read.
The special pattern
.Ic END
may be used to capture control after processing is finished.
.Ic BEGIN
and
.Ic END
do not combine with other patterns.
They may appear multiple times in a program and execute
in the order they are read by
.Nm .
.Pp
Variable names with special meanings:
.Pp
.Bl -tag -width "FILENAME " -compact
.It Va ARGC
Argument count, assignable.
.It Va ARGV
Argument array, assignable;
non-null members are taken as filenames.
.It Va CONVFMT
Conversion format when converting numbers
(default
.Qq Li %.6g ) .
.It Va ENVIRON
Array of environment variables; subscripts are names.
.It Va FILENAME
The name of the current input file.
.It Va FNR
Ordinal number of the current record in the current file.
.It Va FS
Regular expression used to separate fields (default whitespace);
also settable by option
.Fl F Ar fs .
.It Va NF
Number of fields in the current record.
.Va $NF
can be used to obtain the value of the last field in the current record.
.It Va NR
Ordinal number of the current record.
.It Va OFMT
Output format for numbers (default
.Qq Li %.6g ) .
.It Va OFS
Output field separator (default blank).
.It Va ORS
Output record separator (default newline).
.It Va RLENGTH
The length of the string matched by the
.Fn match
function.
.It Va RS
Input record separator (default newline).
If empty, blank lines separate records.
If more than one character long,
.Va RS
is treated as a regular expression, and records are
separated by text matching the expression.
.It Va RSTART
The starting position of the string matched by the
.Fn match
function.
.It Va SUBSEP
Separates multiple subscripts (default 034).
.El
.Sh FUNCTIONS
The awk language has a variety of built-in functions:
arithmetic, string, input/output, general, and bit-operation.
.Pp
Functions may be defined (at the position of a pattern-action statement)
thusly:
.Pp
.Dl function foo(a, b, c) { ...; return x }
.Pp
Parameters are passed by value if scalar, and by reference if array name;
functions may be called recursively.
Parameters are local to the function; all other variables are global.
Thus local variables may be created by providing excess parameters in
the function definition.
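.Pp
The parameter rules above can be sketched as follows.
The function names
.Fn fact
and
.Fn fill
are hypothetical, and the extra spacing before the local parameters is only
a common convention, not required syntax.
.Bd -literal -offset indent
```shell
out=$(awk '
function fact(n,    i, r) {     # i and r are excess parameters, hence local
    r = 1
    for (i = 2; i <= n; i++) r *= i
    return r
}
function fill(arr) { arr["k"] = "set" }     # arrays are passed by reference
BEGIN { i = 99; fill(a); print fact(5), i, a["k"] }')
echo "$out"
```
.Ed
.Pp
The global
.Va i
keeps its value because the loop inside
.Fn fact
only touches the local parameter of the same name.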
.Ss Arithmetic Functions
.Bl -tag -width "atan2(y, x)"
.It Fn atan2 y x
Return the arctangent of
.Fa y Ns / Ns Fa x
in radians.
.It Fn cos x
Return the cosine of
.Fa x ,
where
.Fa x
is in radians.
.It Fn exp x
Return the exponential of
.Fa x .
.It Fn int x
Return
.Fa x
truncated to an integer value.
.It Fn log x
Return the natural logarithm of
.Fa x .
.It Fn rand
Return a random number,
.Fa n ,
such that
.Sm off
.Pf 0 \*(Le Fa n No \*(Lt 1 .
.Sm on
Random numbers are non-deterministic unless a seed is explicitly set with
.Fn srand .
.It Fn sin x
Return the sine of
.Fa x ,
where
.Fa x
is in radians.
.It Fn sqrt x
Return the square root of
.Fa x .
.It Fn srand expr
Sets seed for
.Fn rand
to
.Fa expr
and returns the previous seed.
If
.Fa expr
is omitted,
.Fn rand
will return non-deterministic random numbers.
.El
.Ss String Functions
.Bl -tag -width "split(s, a, fs)"
.It Fn gensub r s h [t]
Search the target string
.Ar t
for matches of the regular expression
.Ar r .
If
.Ar h
is a string beginning with
.Ic g
or
.Ic G ,
then replace all matches of
.Ar r
with
.Ar s .
Otherwise,
.Ar h
is a number indicating which match of
.Ar r
to replace.
If no
.Ar t
is supplied,
.Va $0
is used instead.
.\"Within the replacement text
.\".Ar s ,
.\"the sequence
.\".Ar \en ,
.\"where
.\".Ar n
.\"is a digit from 1 to 9, may be used to indicate just the text that
.\"matched the
.\".Ar n Ap th
.\"parenthesized subexpression.
.\"The sequence
.\".Ic \e0
.\"represents the entire text, as does the character
.\".Ic & .
Unlike
.Fn sub
and
.Fn gsub ,
the modified string is returned as the result of the function,
and the original target is
.Em not
changed.
Note that
.Ar \en
sequences within the replacement string
.Ar s ,
as supported by GNU
.Nm ,
are
.Em not
supported at this time.
.It Fn gsub r t s
The same as
.Fn sub
except that all occurrences of the regular expression are replaced.
.Fn gsub
returns the number of replacements.
.It Fn index s t
The position in
.Fa s
where the string
.Fa t
occurs, or 0 if it does not.
.It Fn length s
The length of
.Fa s
taken as a string,
number of elements in an array for an array argument,
or length of
.Va $0
if no argument is given.
.It Fn match s r
The position in
.Fa s
where the regular expression
.Fa r
occurs, or 0 if it does not.
The variable
.Va RSTART
is set to the starting position of the matched string
.Pq which is the same as the returned value
or zero if no match is found.
The variable
.Va RLENGTH
is set to the length of the matched string,
or \-1 if no match is found.
.It Fn split s a fs
Splits the string
.Fa s
into array elements
.Va a[1] , a[2] , ... , a[n]
and returns
.Va n .
The separation is done with the regular expression
.Ar fs
or with the field separator
.Va FS
if
.Ar fs
is not given.
An empty string as field separator splits the string
into one array element per character.
.It Fn sprintf fmt expr ...
The string resulting from formatting
.Fa expr , ...
according to the
.Xr printf 1
format
.Fa fmt .
.It Fn sub r t s
Substitutes
.Fa t
for the first occurrence of the regular expression
.Fa r
in the string
.Fa s .
If
.Fa s
is not given,
.Va $0
is used.
An ampersand
.Pq Sq &
in
.Fa t
is replaced in string
.Fa s
with the text matched by regular expression
.Fa r .
A literal ampersand can be specified by preceding it with two backslashes
.Pq Sq \e\e .
A literal backslash can be specified by preceding it with another backslash
.Pq Sq \e\e .
.Fn sub
returns the number of replacements.
.It Fn substr s m n
Return at most the
.Fa n Ns -character
substring of
.Fa s
that begins at position
.Fa m
counted from 1.
If
.Fa n
is omitted, or if
.Fa n
specifies more characters than are left in the string,
the length of the substring is limited by the length of
.Fa s .
.It Fn tolower str
Returns a copy of
.Fa str
with all upper-case characters translated to their
corresponding lower-case equivalents.
.It Fn toupper str
Returns a copy of
.Fa str
with all lower-case characters translated to their
corresponding upper-case equivalents.
.El
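.Pp
Several of the string functions above can be sketched together as follows.
The sample string and the variable names are illustrative only;
.Fn gensub
is left out because it is an extension.
.Bd -literal -offset indent
```shell
out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "0", s)               # s becomes "hell0 w0rld", n is 2
    sub(/w0rld/, "[&]", s)              # & stands for the matched text
    print n, s
    print index(s, "["), length(s), substr(s, 1, 5), toupper(substr(s, 1, 1))
    if (match(s, /[0-9]+/)) print RSTART, RLENGTH
    c = split("a:b:c", part, ":"); print c, part[3]
}')
echo "$out"
```
.Ed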
.Ss Time Functions
This version of
.Nm
provides the following functions for obtaining and formatting time
stamps.
.Bl -tag -width indent
.It Fn mktime datespec
Converts
.Fa datespec
into a timestamp in the same form as a value returned by
.Fn systime .
The
.Fa datespec
is a string composed of six or seven numbers separated by whitespace:
.Bd -literal -offset indent
YYYY MM DD HH MM SS [DST]
.Ed
.Pp
The fields in
.Fa datespec
are as follows:
.Bl -tag -width "YYYY"
.It YYYY
Year: a four-digit year, including the century.
.It MM
Month: a number from 1 to 12.
.It DD
Day: a number from 1 to 31.
.It HH
Hour: a number from 0 to 23.
.It MM
Minute: a number from 0 to 59.
.It SS
Second: a number from 0 to 60 (permitting a leap second).
.It DST
Daylight Saving Time: a positive or zero value indicates that
DST is or is not in effect.
If DST is not specified, or is negative,
.Fn mktime
will attempt to determine the correct value.
.El
.It Fn strftime "[format [, timestamp]]"
Formats
.Ar timestamp
according to the string
.Ar format .
The format string may contain any of the conversion specifications described
in the
.Xr strftime 3
manual page, as well as any arbitrary text.
The
.Ar timestamp
must be in the same form as a value returned by
.Fn mktime
and
.Fn systime .
If
.Ar timestamp
is not specified, the current time is used.
If
.Ar format
is not specified, a default format equivalent to the output of
.Xr date 1
is used.
.It Fn systime
Returns the value of time in seconds since 0 hours, 0 minutes,
0 seconds, January 1, 1970, Coordinated Universal Time (UTC).
.El
.Ss Input/Output and General Functions
.Bl -tag -width "getline [var] < file"
.It Fn close expr
Closes the file or pipe
.Fa expr .
.Fa expr
should match the string that was used to open the file or pipe.
.It Ar cmd | Ic getline Op Va var
Read a record of input from a stream piped from the output of
.Ar cmd .
If
.Va var
is omitted, the variables
.Va $0
and
.Va NF
are set.
Otherwise
.Va var
is set.
If the stream is not open, it is opened.
As long as the stream remains open, subsequent calls
will read subsequent records from the stream.
The stream remains open until explicitly closed with a call to
.Fn close .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Fn fflush [expr]
Flushes any buffered output for the file or pipe
.Fa expr ,
or all open files or pipes if
.Fa expr
is omitted.
.Fa expr
should match the string that was used to open the file or pipe.
.It Ic getline
Sets
.Va $0
to the next input record from the current input file.
This form of
.Ic getline
sets the variables
.Va NF ,
.Va NR ,
and
.Va FNR .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Ic getline Va var
Sets variable
.Va var
to the next input record from the current input file;
.Va $0
and
.Va NF
are left unchanged.
This form of
.Ic getline
sets the variables
.Va NR
and
.Va FNR .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Xo
.Ic getline Op Va var
.Pf <\ \& Ar file
.Xc
Reads the next record from
.Ar file .
If
.Va var
is omitted, the variables
.Va $0
and
.Va NF
are set.
Otherwise
.Va var
is set.
If
.Ar file
is not open, it is opened.
As long as the stream remains open, subsequent calls will read subsequent
records from
.Ar file .
.Ar file
remains open until explicitly closed with a call to
.Fn close .
.It Fn system cmd
Executes
.Fa cmd
and returns its exit status.
This will be \-1 upon error,
.Ar cmd Ns 's
exit status upon a normal exit,
256 +
.Em sig
if
.Fa cmd
was terminated by a signal, where
.Em sig
is the number of the signal,
or 512 +
.Em sig
if there was a core dump.
.El
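.Pp
Reading from a command and checking an exit status can be sketched as
follows.
The command strings are made up, and the example assumes that
.Fl safe
is not in effect.
.Bd -literal -offset indent
```shell
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0) n++    # > 0 stops on end of file and on error
    close(cmd)                              # after close, the command is run again
    cmd | getline first
    print n, line, first
    status = system("exit 3")
    print status
}')
echo "$out"
```
.Ed
.Pp
Testing the return value against 0 rather than for truth matters here:
a loop written as
.Li while (cmd | getline)
would never end if
.Ic getline
returned \-1.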
.Ss Bit-Operation Functions
.Bl -tag -width "lshift(a, b)"
.It Fn compl x
Returns the bitwise complement of integer argument x.
.It Fn and x y
Performs a bitwise AND on integer arguments x and y.
.It Fn or x y
Performs a bitwise OR on integer arguments x and y.
.It Fn xor x y
Performs a bitwise Exclusive-OR on integer arguments x and y.
.It Fn lshift x n
Returns integer argument x shifted by n bits to the left.
.It Fn rshift x n
Returns integer argument x shifted by n bits to the right.
.El
.Sh ENVIRONMENT
The following environment variables affect the execution of
.Nm :
.Bl -tag -width POSIXLY_CORRECT
.It Ev LC_CTYPE
The character encoding
.Xr locale 1 .
It decides which byte sequences form characters, which characters are
letters, and how letters are mapped from lower to upper case and vice versa.
If unset or set to
.Qq C ,
.Qq POSIX ,
or an unsupported value, each byte is treated as a character,
and non-ASCII bytes are not regarded as letters.
.It Ev POSIXLY_CORRECT
When set, behave in accordance with the standard, even when it conflicts
with historical behavior.
.El
.Sh EXIT STATUS
.Ex -std awk
.Pp
But note that the
.Ic exit
expression can modify the exit status.
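.Pp
A minimal sketch of an
.Ic exit
expression setting the status; the value 3 and the message are arbitrary.
.Bd -literal -offset indent
```shell
# exit in a main rule skips the remaining input but still runs END.
out=$(echo data | awk '{ exit 3 } END { print "END ran" }')
status=$?
echo "$out (status $status)"
```
.Ed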
.Sh EXAMPLES
Print lines longer than 72 characters:
.Pp
.Dl length($0) > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Same, with input fields separated by comma and/or spaces and tabs:
.Bd -literal -offset indent
BEGIN { FS = ",[ \et]*|[ \et]+" }
      { print $2, $1 }
.Ed
.Pp
Add up first column, print sum and average:
.Bd -literal -offset indent
{ s += $1 }
END { print "sum is", s, " average is", s/NR }
.Ed
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Simulate
.Xr echo 1 :
.Bd -literal -offset indent
BEGIN { # Simulate echo(1)
        for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
        printf "\en"
        exit }
.Ed
.Pp
Print an error message to standard error:
.Bd -literal -offset indent
{ print "error!" > "/dev/stderr" }
.Ed
.Sh UNUSUAL FLOATING-POINT VALUES
.Nm
was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
and Infinity values, which are supported by all modern floating-point
hardware.
.Pp
Because
.Nm
uses
.Xr strtod 3
and
.Xr atof 3
to convert string values to double-precision floating-point values,
modern C libraries also convert strings starting with
.Dv inf
and
.Dv nan
into infinity and NaN values respectively.
This led to strange results,
with something like this:
.Pp
.Li echo nancy | awk '{ print $1 + 0 }'
.Pp
printing
.Dv nan
instead of zero.
.Pp
.Nm
now follows GNU
.Nm ,
and prefilters string values before attempting
to convert them to numbers, as follows:
.Bl -tag -width Ds
.It Hexadecimal values
Hexadecimal values (allowed since C99) convert to zero, as they did
prior to C99.
.It NaN values
The two strings
.Dq +NAN
and
.Dq -NAN
(case independent) convert to NaN.
No others do.
(NaNs can have signs.)
.It Infinity values
The two strings
.Dq +INF
and
.Dq -INF
(case independent) convert to positive and negative infinity, respectively.
No others do.
.El
.Sh SEE ALSO
.Xr cut 1 ,
.Xr date 1 ,
.Xr grep 1 ,
.Xr lex 1 ,
.Xr printf 1 ,
.Xr sed 1 ,
.Xr strftime 3 ,
.Xr re_format 7 ,
.Xr script 7
.Rs
.\" 4.4BSD USD:16
.\".%R Computing Science Technical Report
.\".%N 68
.\".%D July 1978
.%A A. V. Aho
.%A P. J. Weinberger
.%A B. W. Kernighan
.%T AWK \(em A Pattern Scanning and Processing Language
.%J Software \(em Practice and Experience
.%V 9:4
.%P pp. 267-279
.%D April 1979
.Re
.Rs
.%A A. V. Aho
.%A B. W. Kernighan
.%A P. J. Weinberger
.%T The AWK Programming Language
.%I Addison-Wesley
.%D 2024
.%O ISBN 0-13-826972-6
.Re
.Sh STANDARDS
The
.Nm
utility is compliant with the
.St -p1003.1-2008
specification except that consecutive backslashes in the replacement
string argument for
.Fn sub
and
.Fn gsub
are not collapsed and a slash
.Pq Ql /
does not need to be escaped in a bracket expression.
Also, the behaviour of
.Fn rand
and
.Fn srand
has been changed to support non-deterministic random numbers.
.Pp
In
.Ev LC_CTYPE Ns Li =POSIX
mode, treating non-ASCII input bytes as non-letter characters rather
than as input encoding errors intentionally violates the specification.
.Pp
The flags
.Op Fl \&dV ,
.Op Fl -csv ,
and
.Op Fl safe ,
support for regular expressions in
.Va RS ,
as well as the functions
.Fn fflush ,
.Fn gensub ,
.Fn compl ,
.Fn and ,
.Fn or ,
.Fn xor ,
.Fn lshift ,
.Fn rshift ,
.Fn mktime ,
.Fn strftime
and
.Fn systime
are extensions to that specification.
.Sh HISTORY
An
.Nm
utility appeared in
.At v7 .
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
.Li \&""
to it.
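.Pp
The two idioms can be sketched as follows; the values are chosen only to
make the difference visible, and the default
.Va CONVFMT
is assumed.
.Bd -literal -offset indent
```shell
out=$(awk 'BEGIN {
    a = "10"; b = "9"
    print (a < b)               # 1: string constants compare as strings
    print (a + 0 < b + 0)       # 0: adding 0 forces a numeric comparison
    n = 0.1 + 0.2
    print (n "" == "0.3")       # 1: concatenating "" converts n via CONVFMT
}')
echo "$out"
```
.Ed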
.Pp
The scope rules for variables in functions are a botch;
the syntax is worse.
1084