.\" Copyright (c) 1990, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Chris Torek and the American National Standards Committee X3,
.\" on Information Processing Systems.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd September 5, 2023
.Dt SCANF 3
.Os
.Sh NAME
.Nm scanf ,
.Nm fscanf ,
.Nm sscanf ,
.Nm vscanf ,
.Nm vsscanf ,
.Nm vfscanf
.Nd input format conversion
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In stdio.h
.Ft int
.Fn scanf "const char * restrict format" ...
.Ft int
.Fn fscanf "FILE * restrict stream" "const char * restrict format" ...
.Ft int
.Fn sscanf "const char * restrict str" "const char * restrict format" ...
.In stdarg.h
.Ft int
.Fn vscanf "const char * restrict format" "va_list ap"
.Ft int
.Fn vsscanf "const char * restrict str" "const char * restrict format" "va_list ap"
.Ft int
.Fn vfscanf "FILE * restrict stream" "const char * restrict format" "va_list ap"
.Sh DESCRIPTION
The
.Fn scanf
family of functions scans input according to a
.Fa format
as described below.
This format may contain
.Em conversion specifiers ;
the results from such conversions, if any,
are stored through the
.Em pointer
arguments.
The
.Fn scanf
function
reads input from the standard input stream
.Dv stdin ,
.Fn fscanf
reads input from the stream pointer
.Fa stream ,
and
.Fn sscanf
reads its input from the character string pointed to by
.Fa str .
The
.Fn vfscanf
function
is analogous to
.Xr vfprintf 3
and reads input from the stream pointer
.Fa stream
using a variable argument list of pointers (see
.Xr stdarg 3 ) .
The
.Fn vscanf
function scans a variable argument list from the standard input and
the
.Fn vsscanf
function scans it from a string;
these are analogous to
the
.Fn vprintf
and
.Fn vsprintf
functions respectively.
Each successive
.Em pointer
argument must correspond properly with
each successive conversion specifier
(but see the
.Cm *
flag below).
All conversions are introduced by the
.Cm %
(percent sign) character.
The
.Fa format
string
may also contain other characters.
White space (such as blanks, tabs, or newlines) in the
.Fa format
string matches any amount of white space, including none, in the input.
Everything else
matches only itself.
Scanning stops
when an input character does not match such a format character.
Scanning also stops
when an input conversion cannot be made (see below).
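.Pp
As an illustration of the matching rules above, the following sketch parses
a size such as
.Dq 80x24
with a format that mixes conversions, white space, and a literal character.
The helpers
.Fn scan_line
and
.Fn parse_size
are hypothetical names used only for this example;
.Fn scan_line
simply forwards its variable arguments to
.Fn vsscanf .

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * scan_line() is a hypothetical helper, not part of libc: it forwards
 * its variable arguments to vsscanf() and returns the result unchanged.
 */
static int
scan_line(const char *line, const char *format, ...)
{
	va_list ap;
	int n;

	va_start(ap, format);
	n = vsscanf(line, format, ap);
	va_end(ap);
	return (n);
}

/*
 * Parse "<width> x <height>".  The blanks around 'x' in the format match
 * any amount of white space in the input, including none; the 'x' itself
 * must appear literally.  Returns the number of items assigned.
 */
static int
parse_size(const char *s, int *width, int *height)
{
	return (scan_line(s, "%d x %d", width, height));
}
```

Under these assumptions,
.Dq 80x24
and
.Dq "  132  x  43"
both yield 2, while
.Dq 80*24
yields 1 because scanning stops at the mismatched
.Ql * .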
.Sh CONVERSIONS
Following the
.Cm %
character introducing a conversion
there may be a number of
.Em flag
characters, as follows:
.Bl -tag -width ".Cm l No (ell)"
.It Cm *
Suppresses assignment.
The conversion that follows occurs as usual, but no pointer is used;
the result of the conversion is simply discarded.
.It Cm hh
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt char
(rather than
.Vt int ) .
.It Cm h
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "short int"
(rather than
.Vt int ) .
.It Cm l No (ell)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long int"
(rather than
.Vt int ) ,
that the conversion will be one of
.Cm a , e , f ,
or
.Cm g
and the next pointer is a pointer to
.Vt double
(rather than
.Vt float ) ,
or that the conversion will be one of
.Cm c ,
.Cm s
or
.Cm \&[
and the next pointer is a pointer to an array of
.Vt wchar_t
(rather than
.Vt char ) .
.It Cm ll No (ell ell)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.It Cm L
Indicates that the conversion will be one of
.Cm a , e , f ,
or
.Cm g
and the next pointer is a pointer to
.Vt "long double" .
.It Cm j
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt intmax_t
(rather than
.Vt int ) .
.It Cm t
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt ptrdiff_t
(rather than
.Vt int ) .
.It Cm w Ns Ar N
.Po
where
.Ar N
is 8, 16, 32, or 64
.Pc
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt intN_t
(rather than
.Vt int ) .
.It Cm wf Ns Ar N
.Po
where
.Ar N
is 8, 16, 32, or 64
.Pc
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt int_fastN_t
(rather than
.Vt int ) .
.It Cm z
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt size_t
(rather than
.Vt int ) .
.It Cm q
(deprecated.)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.El
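.Pp
The pairing of length modifiers with pointer types listed above can be
sketched as follows.
The structure
.Vt "struct sizes"
and the helper
.Fn parse_sizes
are hypothetical and exist only for this example; each member is annotated
with the conversion that fills it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record; each member notes the conversion that fills it. */
struct sizes {
	signed char	tiny;	/* %hhd */
	short		small;	/* %hd  */
	long		big;	/* %ld  */
	long long	huge;	/* %lld */
	size_t		count;	/* %zu  */
	intmax_t	widest;	/* %jd  */
	double		ratio;	/* %lf (plain %f would expect float *) */
};

/* Returns the number of members assigned; 7 means every field matched. */
static int
parse_sizes(const char *s, struct sizes *out)
{
	return (sscanf(s, "%hhd %hd %ld %lld %zu %jd %lf",
	    &out->tiny, &out->small, &out->big, &out->huge,
	    &out->count, &out->widest, &out->ratio));
}
```

A mismatch between modifier and pointer type is undefined behavior in C,
so it is worth keeping each modifier next to the declaration it serves.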
.Pp
In addition to these flags,
there may be an optional maximum field width,
expressed as a decimal integer,
between the
.Cm %
and the conversion.
If no width is given,
a default of
.Dq infinity
is used (with one exception, below);
otherwise at most this many bytes are scanned
in processing the conversion.
In the case of the
.Cm lc ,
.Cm ls
and
.Cm l[
conversions, the field width specifies the maximum number
of multibyte characters that will be scanned.
Before conversion begins,
most conversions skip white space;
this white space is not counted against the field width.
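.Pp
A short sketch of the field width and of assignment suppression, assuming
two hypothetical helpers,
.Fn parse_second_word
and
.Fn parse_yyyymmdd .
The width given to
.Cm %s
is one less than the buffer size so that the terminating
.Dv NUL
always fits.

```c
#include <stdio.h>

/*
 * Skip the first word with "%*s" (matched but not stored), then read at
 * most 7 bytes of the second word into an 8-byte buffer.
 * Returns 1 if the second word was stored.
 */
static int
parse_second_word(const char *s, char word[8])
{
	return (sscanf(s, "%*s %7s", word));
}

/*
 * Split an undelimited date such as "20230905" by giving each %d a
 * maximum field width.  Returns the number of fields assigned.
 */
static int
parse_yyyymmdd(const char *s, int *year, int *month, int *day)
{
	return (sscanf(s, "%4d%2d%2d", year, month, day));
}
```

Note that a suppressed conversion does not count toward the return value,
which is why
.Fn parse_second_word
returns 1 rather than 2 on success.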
.Pp
The following conversions are available:
.Bl -tag -width XXXX
.It Cm %
Matches a literal
.Ql % .
That is,
.Dq Li %%
in the format string
matches a single input
.Ql %
character.
No conversion is done, and assignment does not occur.
.It Cm b , B
Matches an optionally signed binary integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm d
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt int .
.It Cm i
Matches an optionally signed integer;
the next pointer must be a pointer to
.Vt int .
The integer is read
in base 2 if it begins with
.Ql 0b
or
.Ql 0B ,
in base 16 if it begins
with
.Ql 0x
or
.Ql 0X ,
in base 8 if it begins with
.Ql 0 ,
and in base 10 otherwise.
Only characters that correspond to the base are used.
.It Cm o
Matches an octal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm u
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm x , X
Matches an optionally signed hexadecimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm a , A , e , E , f , F , g , G
Matches a floating-point number in the style of
.Xr strtod 3 .
The next pointer must be a pointer to
.Vt float
(unless
.Cm l
or
.Cm L
is specified).
.It Cm s
Matches a sequence of non-white-space characters;
the next pointer must be a pointer to
.Vt char ,
and the array must be large enough to accept the entire sequence and the
terminating
.Dv NUL
character.
The input string stops at white space
or at the maximum field width, whichever occurs first.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm S
The same as
.Cm ls .
.It Cm c
Matches a sequence of
.Em width
count
characters (default 1);
the next pointer must be a pointer to
.Vt char ,
and there must be enough room for all the characters
(no terminating
.Dv NUL
is added).
The usual skip of leading white space is suppressed.
To skip white space first, use an explicit space in the format.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm C
The same as
.Cm lc .
.It Cm \&[
Matches a nonempty sequence of characters from the specified set
of accepted characters;
the next pointer must be a pointer to
.Vt char ,
and there must be enough room for all the characters in the string,
plus a terminating
.Dv NUL
character.
The usual skip of leading white space is suppressed.
The string is to be made up of characters in
(or not in)
a particular set;
the set is defined by the characters between the open bracket
.Cm \&[
character
and a close bracket
.Cm \&]
character.
The set
.Em excludes
those characters
if the first character after the open bracket is a circumflex
.Cm ^ .
To include a close bracket in the set,
make it the first character after the open bracket
or the circumflex;
any other position will end the set.
The hyphen character
.Cm -
is also special;
when placed between two other characters,
it adds all intervening characters to the set.
To include a hyphen,
make it the last character before the final close bracket.
For instance,
.Ql [^]0-9-]
means the set
.Dq "everything except close bracket, zero through nine, and hyphen" .
The string ends with the appearance of a character not in the
(or, with a circumflex, in) set
or when the field width runs out.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm p
Matches a pointer value (as printed by
.Ql %p
in
.Xr printf 3 ) ;
the next pointer must be a pointer to
.Vt void .
.It Cm n
Nothing is expected;
instead, the number of characters consumed thus far from the input
is stored through the next pointer,
which must be a pointer to
.Vt int .
This is
.Em not
a conversion, although it can be suppressed with the
.Cm *
flag.
.El
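.Pp
The following sketch combines several of the conversions above: a scanset
that stops at a delimiter, base detection with
.Cm %i ,
and
.Cm %n
to detect trailing garbage.
Both
.Fn parse_header
and
.Fn parse_whole_int
are hypothetical helpers written for this example only.

```c
#include <stdio.h>

/*
 * Parse "key: value" where the value may contain blanks.
 * "%15[^:]" reads up to, but not including, the colon; "%63[^\n]" reads
 * the rest of the line.  Scansets do not skip leading white space, so
 * explicit blanks in the format do that job.
 * Returns the number of items assigned (2 on success).
 */
static int
parse_header(const char *line, char key[16], char value[64])
{
	return (sscanf(line, " %15[^:]: %63[^\n]", key, value));
}

/*
 * Accept a string only if it is a single integer, optionally followed
 * by white space.  %i picks base 16, 8, or 10 from the prefix; %n then
 * records how many characters were consumed so the caller can verify
 * that nothing is left over.  Returns 1 on success and 0 otherwise.
 */
static int
parse_whole_int(const char *s, int *out)
{
	int used = -1;

	if (sscanf(s, "%i %n", out, &used) != 1)
		return (0);
	return (used >= 0 && s[used] == '\0');
}
```

Because
.Cm %n
is not counted in the return value, a successful call to
.Fn sscanf
in
.Fn parse_whole_int
returns 1, not 2.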
.Pp
The decimal point
character is defined in the program's locale (category
.Dv LC_NUMERIC ) .
.Pp
For backwards compatibility, a
.Dq conversion
of
.Ql %\e0
causes an immediate return of
.Dv EOF .
.Sh RETURN VALUES
These
functions
return
the number of input items assigned, which can be fewer than provided
for, or even zero, in the event of a matching failure.
Zero
indicates that, while there was input available,
no conversions were assigned;
typically this is due to an invalid input character,
such as an alphabetic character for a
.Ql %d
conversion.
The value
.Dv EOF
is returned if an input failure, such as an end-of-file, occurs before
any conversion.
If an error or end-of-file occurs after conversion
has begun,
the number of conversions which were successfully completed is returned.
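.Pp
One way to act on these return values is sketched below.
The enumeration
.Vt "enum scan_status"
and the helper
.Fn classify_pair
are hypothetical;
the sketch uses
.Fn sscanf
so that each outcome is easy to reproduce, and the same checks apply to
.Fn scanf
and
.Fn fscanf .

```c
#include <stdio.h>

/* Hypothetical outcome codes for a two-item scan. */
enum scan_status {
	SCAN_OK,	/* both items assigned */
	SCAN_PARTIAL,	/* only the first item assigned */
	SCAN_NOMATCH,	/* input present, but nothing assigned */
	SCAN_EOF	/* input ended before any conversion */
};

/* Scan "<x>,<y>" and translate the return value into a status code. */
static enum scan_status
classify_pair(const char *s, int *x, int *y)
{
	int n;

	n = sscanf(s, "%d ,%d", x, y);
	if (n == EOF)
		return (SCAN_EOF);
	if (n == 0)
		return (SCAN_NOMATCH);
	if (n == 1)
		return (SCAN_PARTIAL);
	return (SCAN_OK);
}
```

Comparing the result against the number of expected items, rather than
testing only for
.Dv EOF ,
catches the partial and no-match cases as well.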
.Sh SEE ALSO
.Xr getc 3 ,
.Xr mbrtowc 3 ,
.Xr printf 3 ,
.Xr strtod 3 ,
.Xr strtol 3 ,
.Xr strtoul 3 ,
.Xr wscanf 3
.Sh STANDARDS
The functions
.Fn fscanf ,
.Fn scanf ,
.Fn sscanf ,
.Fn vfscanf ,
.Fn vscanf
and
.Fn vsscanf
conform to
.St -isoC-99 .
.Sh HISTORY
The functions
.Fn scanf ,
.Fn fscanf ,
and
.Fn sscanf
first appeared in
.At v7 ,
and
.Fn vscanf ,
.Fn vsscanf ,
and
.Fn vfscanf
in
.Bx 4.3 Reno .
.Sh BUGS
Earlier implementations of
.Nm
treated
.Cm \&%D , \&%E , \&%F , \&%O
and
.Cm \&%X
as their lowercase equivalents with an
.Cm l
modifier.
In addition,
.Nm
treated an unknown conversion character as
.Cm \&%d
or
.Cm \&%D ,
depending on its case.
This functionality has been removed.
.Pp
Numerical strings are truncated to 512 characters; for example,
.Cm %f
and
.Cm %d
are implicitly
.Cm %512f
and
.Cm %512d .
.Pp
The
.Cm %n$
modifiers for positional arguments are not implemented.
.Pp
The
.Nm
family of functions does not correctly handle multibyte characters in the
.Fa format
argument.