.\"	$OpenBSD: awk.1,v 1.63 2021/11/08 06:46:22 jmc Exp $
.\"
.\" Copyright (C) Lucent Technologies 1997
.\" All Rights Reserved
.\"
.\" Permission to use, copy, modify, and distribute this software and
.\" its documentation for any purpose and without fee is hereby
.\" granted, provided that the above copyright notice appear in all
.\" copies and that both that the copyright notice and this
.\" permission notice and warranty disclaimer appear in supporting
.\" documentation, and that the name Lucent Technologies or any of
.\" its entities not be used in advertising or publicity pertaining
.\" to distribution of the software without specific, written prior
.\" permission.
.\"
.\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
.\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
.\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
.\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
.\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
.\" THIS SOFTWARE.
.\"
.Dd $Mdocdate: November 8 2021 $
.Dt AWK 1
.Os
.Sh NAME
.Nm awk
.Nd pattern-directed scanning and processing language
.Sh SYNOPSIS
.Nm awk
.Op Fl safe
.Op Fl V
.Op Fl d Ns Op Ar n
.Op Fl F Ar fs
.Op Fl v Ar var Ns = Ns Ar value
.Op Ar prog | Fl f Ar progfile
.Ar
.Sh DESCRIPTION
.Nm
scans each input
.Ar file
for lines that match any of a set of patterns specified literally in
.Ar prog
or in one or more files specified as
.Fl f Ar progfile .
With each pattern there can be an associated action that will be performed
when a line of a
.Ar file
matches the pattern.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
The file name
.Sq -
means the standard input.
Any
.Ar file
of the form
.Ar var Ns = Ns Ar value
is treated as an assignment, not a filename,
and is executed at the time it would have been opened if it were a filename.
.Pp
The options are as follows:
.Bl -tag -width "-safe "
.It Fl d Ns Op Ar n
Debug mode.
Set debug level to
.Ar n ,
or 1 if
.Ar n
is not specified.
A value greater than 1 causes
.Nm
to dump core on fatal errors.
.It Fl F Ar fs
Define the input field separator to be the regular expression
.Ar fs .
.It Fl f Ar progfile
Read program code from the specified file
.Ar progfile
instead of from the command line.
.It Fl safe
Disable file output
.Pf ( Ic print No > ,
.Ic print No >> ) ,
process creation
.Po
.Ar cmd | Ic getline ,
.Ic print | ,
.Ic system
.Pc
and access to the environment
.Pf ( Va ENVIRON ;
see the section on variables below).
This is a first
.Pq and not very reliable
approximation to a
.Dq safe
version of
.Nm .
.It Fl V
Print the version number of
.Nm
to standard output and exit.
.It Fl v Ar var Ns = Ns Ar value
Assign
.Ar value
to variable
.Ar var
before
.Ar prog
is executed;
any number of
.Fl v
options may be present.
.El
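.Pp
As a brief sketch of the two assignment forms, assuming a POSIX shell
(the variable names
.Va n
and
.Va v
are arbitrary), the first command sets a variable with
.Fl v
and the second uses a
.Ar var Ns = Ns Ar value
operand followed by
.Sq -
for the standard input:
.Bd -literal -offset indent
awk -v n=3 'BEGIN { print n * 2 }'	# prints 6
echo x | awk '{ print v }' v=hello -	# prints hello
.Ed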
.Pp
The input is normally made up of input lines
.Pq records
separated by newlines, or by the value of
.Va RS .
If
.Va RS
is null, then any number of blank lines are used as the record separator,
and newlines are used as field separators
(in addition to the value of
.Va FS ) .
This is convenient when working with multi-line records.
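.Pp
A minimal sketch of this multi-line record mode, assuming a POSIX shell
and made-up input: with
.Va RS
set to the null string, each block separated by a blank line is one record
and each of its lines becomes a field.
.Bd -literal -offset indent
awk 'BEGIN { RS = "" } { print NR, NF, $1 }' <<EOF
alice
admin

bob
user
EOF
.Ed
.Pp
With the input shown, this should print
.Dq 1 2 alice
and
.Dq 2 2 bob .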
.Pp
An input line is normally made up of fields separated by whitespace,
or by the value of the field separator
.Va FS
at the time the line is read.
The fields are denoted
.Va $1 , $2 , ... ,
while
.Va $0
refers to the entire line.
.Va FS
may be set to either a single character or a regular expression.
As a special case, if
.Va FS
is a single space
.Pq the default ,
fields will be split by one or more whitespace characters.
If
.Va FS
is null, the input line is split into one field per character.
.Pp
Normally, any number of blanks separate fields.
In order to set the field separator to a single blank, use the
.Fl F
option with a value of
.Sq [\ \&] .
If a field separator of
.Sq t
is specified,
.Nm
treats it as if
.Sq \et
had been specified and uses
.Aq TAB
as the field separator.
In order to use a literal
.Sq t
as the field separator, use the
.Fl F
option with a value of
.Sq [t] .
The field separator is usually set via the
.Fl F
option or from inside a
.Ic BEGIN
block so that it takes effect before the input is read.
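.Pp
The following sketch, assuming a POSIX shell and made-up input, shows a
single-character separator and a literal
.Sq t
given as a bracket expression; both commands should print
.Dq b :
.Bd -literal -offset indent
echo a:b:c | awk -F: '{ print $2 }'
echo atbtc | awk -F '[t]' '{ print $2 }'
.Ed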
.Pp
A pattern-action statement has the form:
.Pp
.D1 Ar pattern Ic \&{ Ar action Ic \&}
.Pp
A missing
.Ic \&{ Ar action Ic \&}
means print the line;
a missing pattern always matches.
Pattern-action statements are separated by newlines or semicolons.
.Pp
Newlines are permitted after a terminating statement or following a comma
.Pq Sq ,\& ,
an open brace
.Pq Sq { ,
a logical AND
.Pq Sq && ,
a logical OR
.Pq Sq || ,
after the
.Sq do
or
.Sq else
keywords,
or after the closing parenthesis of an
.Sq if ,
.Sq for ,
or
.Sq while
statement.
Additionally, a backslash
.Pq Sq \e
can be used to escape a newline between tokens.
.Pp
An action is a sequence of statements.
A statement can be one of the following:
.Pp
.Bl -tag -width Ds -offset indent -compact
.It Ic if Ar ( expression ) Ar statement Op Ic else Ar statement
.It Ic while Ar ( expression ) Ar statement
.It Ic for Ar ( expression ; expression ; expression ) statement
.It Ic for Ar ( var Ic in Ar array ) statement
.It Ic do Ar statement Ic while Ar ( expression )
.It Ic break
.It Ic continue
.It Xo Ic {
.Op Ar statement ...
.Ic }
.Xc
.It Xo Ar expression
.No # commonly
.Ar var No = Ar expression
.Xc
.It Xo Ic print
.Op Ar expression-list
.Op > Ns Ar expression
.Xc
.It Xo Ic printf Ar format
.Op Ar ... , expression-list
.Op > Ns Ar expression
.Xc
.It Ic return Op Ar expression
.It Xo Ic next
.No # skip remaining patterns on this input line
.Xc
.It Xo Ic nextfile
.No # skip rest of this file, open next, start at top
.Xc
.It Xo Ic delete
.Sm off
.Ar array Ic \&[ Ar expression Ic \&]
.Sm on
.No # delete an array element
.Xc
.It Xo Ic delete Ar array
.No # delete all elements of array
.Xc
.It Xo Ic exit
.Op Ar expression
.No # exit processing, and perform
.Ic END
processing; status is
.Ar expression
.Xc
.El
.Pp
Statements are terminated by
semicolons, newlines or right braces.
An empty
.Ar expression-list
stands for
.Ar $0 .
String constants are quoted
.Li \&"" ,
with the usual C escapes recognized within
(see
.Xr printf 1
for a complete list of these).
Expressions take on string or numeric values as appropriate,
and are built using the operators
.Ic + \- * / % ^
.Pq exponentiation ,
and concatenation
.Pq indicated by whitespace .
The operators
.Ic \&! ++ \-\- += \-= *= /= %= ^=
.Ic > >= < <= == != ?\&:
are also available in expressions.
Variables may be scalars, array elements
(denoted
.Li x[i] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
Multiple subscripts such as
.Li [i,j,k]
are permitted; the constituents are concatenated,
separated by the value of
.Va SUBSEP
.Pq see the section on variables below .
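.Pp
As an illustration of associative arrays (the array names
.Va count
and
.Va seen
are arbitrary), this sketch counts the words on a line and tests a
multiple subscript with
.Ic in :
.Bd -literal -offset indent
echo a b a c a b | awk '{
	for (i = 1; i <= NF; i++)
		count[$i]++
	seen["x", 1] = 1
	if (("x", 1) in seen)
		print count["a"], count["b"], count["c"]
}'
.Ed
.Pp
With the input shown, it should print
.Dq 3 2 1 .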
.Pp
The
.Ic print
statement prints its arguments on the standard output
(or on a file if
.Pf >\ \& Ar file
or
.Pf >>\ \& Ar file
is present or on a pipe if
.Pf |\ \& Ar cmd
is present), separated by the current output field separator,
and terminated by the output record separator.
.Ar file
and
.Ar cmd
may be literal names or parenthesized expressions;
identical string values in different statements denote
the same open file.
The
.Ic printf
statement formats its expression list according to the
.Ar format
(see
.Xr printf 1 ) .
.Pp
Patterns are arbitrary Boolean combinations
(with
.Ic "\&! || &&" )
of regular expressions and
relational expressions.
.Nm
supports extended regular expressions
.Pq EREs .
See
.Xr re_format 7
for more information on regular expressions.
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions, using the operators
.Ic ~
and
.Ic !~ .
.Pf / Ar re Ns /
is a constant regular expression;
any string (constant or variable) may be used
as a regular expression, except in the position of an isolated regular expression
in a pattern.
.Pp
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
from an occurrence of the first pattern
through an occurrence of the second.
.Pp
A relational expression is one of the following:
.Pp
.Bl -tag -width Ds -offset indent -compact
.It Ar expression matchop regular-expression
.It Ar expression relop expression
.It Ar expression Ic in Ar array-name
.It Xo Ic \&( Ns
.Ar expr , expr , \&... Ns Ic \&) in
.Ar array-name
.Xc
.El
.Pp
where a
.Ar relop
is any of the six relational operators in C, and a
.Ar matchop
is either
.Ic ~
(matches)
or
.Ic !~
(does not match).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.Pp
The special pattern
.Ic BEGIN
may be used to capture control before the first input line is read.
The special pattern
.Ic END
may be used to capture control after processing is finished.
.Ic BEGIN
and
.Ic END
do not combine with other patterns.
They may appear multiple times in a program and execute
in the order they are read by
.Nm .
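.Pp
A small sketch of this ordering, assuming a POSIX shell: both
.Ic BEGIN
actions run, in program order, before the single input line is read, and
.Ic END
runs last.
.Bd -literal -offset indent
echo x | awk '
BEGIN { print "first" }
BEGIN { print "second" }
      { print "line:", $0 }
END   { print "records:", NR }'
.Ed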
.Pp
Variable names with special meanings:
.Pp
.Bl -tag -width "FILENAME " -compact
.It Va ARGC
Argument count, assignable.
.It Va ARGV
Argument array, assignable;
non-null members are taken as filenames.
.It Va CONVFMT
Conversion format when converting numbers
(default
.Qq Li %.6g ) .
.It Va ENVIRON
Array of environment variables; subscripts are names.
.It Va FILENAME
The name of the current input file.
.It Va FNR
Ordinal number of the current record in the current file.
.It Va FS
Regular expression used to separate fields (default whitespace);
also settable by option
.Fl F Ar fs .
.It Va NF
Number of fields in the current record.
.Va $NF
can be used to obtain the value of the last field in the current record.
.It Va NR
Ordinal number of the current record.
.It Va OFMT
Output format for numbers (default
.Qq Li %.6g ) .
.It Va OFS
Output field separator (default blank).
.It Va ORS
Output record separator (default newline).
.It Va RLENGTH
The length of the string matched by the
.Fn match
function.
.It Va RS
Input record separator (default newline).
If empty, blank lines separate records.
If more than one character long,
.Va RS
is treated as a regular expression, and records are
separated by text matching the expression.
.It Va RSTART
The starting position of the string matched by the
.Fn match
function.
.It Va SUBSEP
Separates multiple subscripts (default 034).
.El
.Sh FUNCTIONS
The awk language has a variety of built-in functions:
arithmetic, string, time, input/output, general, and bit-operation.
.Pp
Functions may be defined (at the position of a pattern-action statement)
as follows:
.Pp
.Dl function foo(a, b, c) { ...; return x }
.Pp
Parameters are passed by value if scalar, and by reference if array name;
functions may be called recursively.
Parameters are local to the function; all other variables are global.
Thus local variables may be created by providing excess parameters in
the function definition.
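.Pp
For instance, in the sketch below (the names
.Fn fact
and
.Va r
are made up for illustration), the extra parameter
.Va r
serves as a local variable, and the function calls itself recursively:
.Bd -literal -offset indent
awk 'function fact(n,    r) {	# r is local: callers pass only n
	if (n <= 1)
		return 1
	r = n * fact(n - 1)
	return r
}
BEGIN { print fact(5) }'	# prints 120
.Ed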
.Ss Arithmetic Functions
.Bl -tag -width "atan2(y, x)"
.It Fn atan2 y x
Return the arctangent of
.Fa y Ns / Ns Fa x
in radians.
.It Fn cos x
Return the cosine of
.Fa x ,
where
.Fa x
is in radians.
.It Fn exp x
Return the exponential of
.Fa x .
.It Fn int x
Return
.Fa x
truncated to an integer value.
.It Fn log x
Return the natural logarithm of
.Fa x .
.It Fn rand
Return a random number,
.Fa n ,
such that
.Sm off
.Pf 0 \*(Le Fa n No \*(Lt 1 .
.Sm on
Random numbers are non-deterministic unless a seed is explicitly set with
.Fn srand .
.It Fn sin x
Return the sine of
.Fa x ,
where
.Fa x
is in radians.
.It Fn sqrt x
Return the square root of
.Fa x .
.It Fn srand expr
Sets seed for
.Fn rand
to
.Fa expr
and returns the previous seed.
If
.Fa expr
is omitted,
.Fn rand
will return non-deterministic random numbers.
.El
.Ss String Functions
.Bl -tag -width "split(s, a, fs)"
.It Fn gensub r s h [t]
Search the target string
.Ar t
for matches of the regular expression
.Ar r .
If
.Ar h
is a string beginning with
.Ic g
or
.Ic G ,
then replace all matches of
.Ar r
with
.Ar s .
Otherwise,
.Ar h
is a number indicating which match of
.Ar r
to replace.
If no
.Ar t
is supplied,
.Va $0
is used instead.
.\"Within the replacement text
.\".Ar s ,
.\"the sequence
.\".Ar \en ,
.\"where
.\".Ar n
.\"is a digit from 1 to 9, may be used to indicate just the text that
.\"matched the
.\".Ar n Ap th
.\"parenthesized subexpression.
.\"The sequence
.\".Ic \e0
.\"represents the entire text, as does the character
.\".Ic & .
Unlike
.Fn sub
and
.Fn gsub ,
the modified string is returned as the result of the function,
and the original target is
.Em not
changed.
Note that
.Ar \en
sequences within the replacement string
.Ar s ,
as supported by GNU
.Nm ,
are
.Em not
supported at this time.
.It Fn gsub r t s
The same as
.Fn sub
except that all occurrences of the regular expression are replaced.
.Fn gsub
returns the number of replacements.
.It Fn index s t
The position in
.Fa s
where the string
.Fa t
first occurs, or 0 if it does not.
.It Fn length s
The length of
.Fa s
taken as a string,
the number of elements if the argument is an array,
or the length of
.Va $0
if no argument is given.
.It Fn match s r
The position in
.Fa s
where the regular expression
.Fa r
occurs, or 0 if it does not.
The variable
.Va RSTART
is set to the starting position of the matched string
.Pq which is the same as the returned value
or zero if no match is found.
The variable
.Va RLENGTH
is set to the length of the matched string,
or \-1 if no match is found.
.It Fn split s a fs
Splits the string
.Fa s
into array elements
.Va a[1] , a[2] , ... , a[n]
and returns
.Va n .
The separation is done with the regular expression
.Ar fs
or with the field separator
.Va FS
if
.Ar fs
is not given.
An empty string as field separator splits the string
into one array element per character.
.It Fn sprintf fmt expr ...
The string resulting from formatting
.Fa expr , ...
according to the
.Xr printf 1
format
.Fa fmt .
.It Fn sub r t s
Substitutes
.Fa t
for the first occurrence of the regular expression
.Fa r
in the string
.Fa s .
If
.Fa s
is not given,
.Va $0
is used.
An ampersand
.Pq Sq &
in
.Fa t
is replaced with the text in
.Fa s
that matched
.Fa r .
A literal ampersand can be specified by preceding it with two backslashes
.Pq Sq \e\e .
A literal backslash can be specified by preceding it with another backslash
.Pq Sq \e\e .
.Fn sub
returns the number of replacements.
.It Fn substr s m n
Return at most the
.Fa n Ns -character
substring of
.Fa s
that begins at position
.Fa m
counted from 1.
If
.Fa n
is omitted, or if
.Fa n
specifies more characters than are left in the string,
the length of the substring is limited by the length of
.Fa s .
.It Fn tolower str
Returns a copy of
.Fa str
with all upper-case characters translated to their
corresponding lower-case equivalents.
.It Fn toupper str
Returns a copy of
.Fa str
with all lower-case characters translated to their
corresponding upper-case equivalents.
.El
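.Pp
The sketch below exercises several of these functions on a made-up string;
the comments give the values expected from the descriptions above.
.Bd -literal -offset indent
awk 'BEGIN {
	s = "hello world"
	n = gsub(/o/, "0", s)		# s is now "hell0 w0rld"
	print n, s			# 2 hell0 w0rld
	print match(s, /w.*d/), RSTART, RLENGTH	# 7 7 5
	print substr(s, 1, 5), index(s, "w")	# hell0 7
	k = split("a:b:c", parts, ":")
	print k, parts[3]		# 3 c
}'
.Ed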
.Ss Time Functions
This version of
.Nm
provides the following functions for obtaining and formatting
timestamps.
.Bl -tag -width indent
.It Fn mktime datespec
Converts
.Fa datespec
into a timestamp in the same form as a value returned by
.Fn systime .
The
.Fa datespec
is a string composed of six or seven numbers separated by whitespace:
.Bd -literal -offset indent
YYYY MM DD HH MM SS [DST]
.Ed
.Pp
The fields in
.Fa datespec
are as follows:
.Bl -tag -width "YYYY"
.It YYYY
Year: a four-digit year, including the century.
.It MM
Month: a number from 1 to 12.
.It DD
Day: a number from 1 to 31.
.It HH
Hour: a number from 0 to 23.
.It MM
Minute: a number from 0 to 59.
.It SS
Second: a number from 0 to 60 (permitting a leap second).
.It DST
Daylight Saving Time: a positive value indicates that DST is in effect,
and zero indicates that it is not.
If DST is not specified, or is negative,
.Fn mktime
will attempt to determine the correct value.
.El
.It Fn strftime "[format [, timestamp]]"
Formats
.Ar timestamp
according to the string
.Ar format .
The format string may contain any of the conversion specifications described
in the
.Xr strftime 3
manual page, as well as any arbitrary text.
The
.Ar timestamp
must be in the same form as a value returned by
.Fn mktime
and
.Fn systime .
If
.Ar timestamp
is not specified, the current time is used.
If
.Ar format
is not specified, a default format equivalent to the output of
.Xr date 1
is used.
.It Fn systime
Returns the value of time in seconds since 0 hours, 0 minutes,
0 seconds, January 1, 1970, Coordinated Universal Time (UTC).
.El
.Ss Input/Output and General Functions
.Bl -tag -width "getline [var] < file"
.It Fn close expr
Closes the file or pipe
.Fa expr .
.Fa expr
should match the string that was used to open the file or pipe.
.It Ar cmd | Ic getline Op Va var
Read a record of input from a stream piped from the output of
.Ar cmd .
If
.Va var
is omitted, the variables
.Va $0
and
.Va NF
are set.
Otherwise
.Va var
is set.
If the stream is not open, it is opened.
As long as the stream remains open, subsequent calls
will read subsequent records from the stream.
The stream remains open until explicitly closed with a call to
.Fn close .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Fn fflush [expr]
Flushes any buffered output for the file or pipe
.Fa expr ,
or all open files or pipes if
.Fa expr
is omitted.
.Fa expr
should match the string that was used to open the file or pipe.
.It Ic getline
Sets
.Va $0
to the next input record from the current input file.
This form of
.Ic getline
sets the variables
.Va NF ,
.Va NR ,
and
.Va FNR .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Ic getline Va var
Sets variable
.Va var
to the next input record from the current input file.
This form of
.Ic getline
sets the variables
.Va NR
and
.Va FNR .
.Ic getline
returns 1 for a successful input, 0 for end of file, and \-1 for an error.
.It Xo
.Ic getline Op Va var
.Pf <\ \& Ar file
.Xc
Reads the next record from
.Ar file .
If
.Va var
is omitted, the variables
.Va $0
and
.Va NF
are set.
Otherwise
.Va var
is set.
If
.Ar file
is not open, it is opened.
As long as the stream remains open, subsequent calls will read subsequent
records from
.Ar file .
.Ar file
remains open until explicitly closed with a call to
.Fn close .
.It Fn system cmd
Executes
.Fa cmd
and returns its exit status.
This will be \-1 upon error,
.Ar cmd Ns 's
exit status upon a normal exit,
256 +
.Em sig
if
.Fa cmd
was terminated by a signal, where
.Em sig
is the number of the signal,
or 512 +
.Em sig
if there was a core dump.
.El
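.Pp
A common idiom, sketched here with a made-up command string, is to
compare the result of
.Ic getline
with 0 so that the loop also ends on an error (\-1), and to call
.Fn close
when done:
.Bd -literal -offset indent
awk 'BEGIN {
	cmd = "echo one; echo two"
	while ((cmd | getline line) > 0)
		n++
	close(cmd)
	print n, line		# prints 2 two
}'
.Ed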
.Ss Bit-Operation Functions
.Bl -tag -width "lshift(a, b)"
.It Fn compl x
Returns the bitwise complement of integer argument
.Fa x .
.It Fn and x y
Performs a bitwise AND on integer arguments
.Fa x
and
.Fa y .
.It Fn or x y
Performs a bitwise OR on integer arguments
.Fa x
and
.Fa y .
.It Fn xor x y
Performs a bitwise Exclusive-OR on integer arguments
.Fa x
and
.Fa y .
.It Fn lshift x n
Returns integer argument
.Fa x
shifted by
.Fa n
bits to the left.
.It Fn rshift x n
Returns integer argument
.Fa x
shifted by
.Fa n
bits to the right.
.El
.Sh ENVIRONMENT
The following environment variables affect the execution of
.Nm :
.Bl -tag -width POSIXLY_CORRECT
.It Ev POSIXLY_CORRECT
When set, behave in accordance with the standard, even when it conflicts
with historical behavior.
.El
.Sh EXIT STATUS
.Ex -std awk
.Pp
But note that the
.Ic exit
expression can modify the exit status.
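.Pp
For example, assuming a POSIX shell, the value passed to
.Ic exit
becomes the exit status of the command; this sketch should print
.Dq status 3 :
.Bd -literal -offset indent
awk 'BEGIN { exit 3 }' || echo "status $?"
.Ed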
.Sh EXAMPLES
Print lines longer than 72 characters:
.Pp
.Dl length($0) > 72
.Pp
Print first two fields in opposite order:
.Pp
.Dl { print $2, $1 }
.Pp
Same, with input fields separated by comma and/or spaces and tabs:
.Bd -literal -offset indent
BEGIN { FS = ",[ \et]*|[ \et]+" }
      { print $2, $1 }
.Ed
.Pp
Add up first column, print sum and average:
.Bd -literal -offset indent
{ s += $1 }
END { print "sum is", s, " average is", s/NR }
.Ed
.Pp
Print all lines between start/stop pairs:
.Pp
.Dl /start/, /stop/
.Pp
Simulate
.Xr echo 1 :
.Bd -literal -offset indent
BEGIN { # Simulate echo(1)
        for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
        printf "\en"
        exit }
.Ed
.Pp
Print an error message to standard error:
.Bd -literal -offset indent
{ print "error!" > "/dev/stderr" }
.Ed
.Sh UNUSUAL FLOATING-POINT VALUES
.Nm
was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
and Infinity values, which are supported by all modern floating-point
hardware.
.Pp
Because
.Nm
uses
.Xr strtod 3
and
.Xr atof 3
to convert string values to double-precision floating-point values,
modern C libraries also convert strings starting with
.Dv inf
and
.Dv nan
into infinity and NaN values respectively.
This led to strange results,
with something like this:
.Pp
.Li echo nancy | awk '{ print $1 + 0 }'
.Pp
printing
.Dv nan
instead of zero.
.Pp
.Nm
now follows GNU
.Nm ,
and prefilters string values before attempting
to convert them to numbers, as follows:
.Bl -tag -width Ds
.It Hexadecimal values
Hexadecimal values (allowed since C99) convert to zero, as they did
prior to C99.
.It NaN values
The two strings
.Dq +NAN
and
.Dq -NAN
(case independent) convert to NaN.
No others do.
(NaNs can have signs.)
.It Infinity values
The two strings
.Dq +INF
and
.Dq -INF
(case independent) convert to positive and negative infinity, respectively.
No others do.
.El
.Sh SEE ALSO
.Xr cut 1 ,
.Xr date 1 ,
.Xr grep 1 ,
.Xr lex 1 ,
.Xr printf 1 ,
.Xr sed 1 ,
.Xr strftime 3 ,
.Xr re_format 7 ,
.Xr script 7
.Rs
.\" 4.4BSD USD:16
.\".%R Computing Science Technical Report
.\".%N 68
.\".%D July 1978
.%A A. V. Aho
.%A P. J. Weinberger
.%A B. W. Kernighan
.%T AWK \(em A Pattern Scanning and Processing Language
.%J Software \(em Practice and Experience
.%V 9:4
.%P pp. 267-279
.%D April 1979
.Re
.Rs
.%A A. V. Aho
.%A B. W. Kernighan
.%A P. J. Weinberger
.%T The AWK Programming Language
.%I Addison-Wesley
.%D 1988
.%O ISBN 0-201-07981-X
.Re
.Sh STANDARDS
The
.Nm
utility is compliant with the
.St -p1003.1-2008
specification except that consecutive backslashes in the replacement
string argument for
.Fn sub
and
.Fn gsub
are not collapsed and a slash
.Pq Ql /
does not need to be escaped in a bracket expression.
Also, the behaviour of
.Fn rand
and
.Fn srand
has been changed to support non-deterministic random numbers.
.Pp
The flags
.Op Fl \&dV
and
.Op Fl safe ,
support for regular expressions in
.Va RS ,
as well as the functions
.Fn fflush ,
.Fn gensub ,
.Fn compl ,
.Fn and ,
.Fn or ,
.Fn xor ,
.Fn lshift ,
.Fn rshift ,
.Fn mktime ,
.Fn strftime
and
.Fn systime
are extensions to that specification.
.Sh HISTORY
An
.Nm
utility appeared in
.At v7 .
.Sh BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
.Li \&""
to it.
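.Pp
A short sketch of both idioms (the variable names are arbitrary);
all three messages should be printed:
.Bd -literal -offset indent
awk 'BEGIN {
	a = "10"; b = "9"
	if (a < b)		# both are strings: compared as text
		print "as strings, 10 sorts before 9"
	if (a + 0 > b + 0)	# adding 0 forces a numeric comparison
		print "as numbers, 10 is greater than 9"
	n = 10
	if ((n "") < "9")	# concatenating "" forces a string
		print "10 as a string still sorts before 9"
}'
.Ed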
.Pp
The scope rules for variables in functions are a botch;
the syntax is worse.
.Pp
Only eight-bit character sets are handled correctly.
1062