.\" Copyright (c) 1990, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Chris Torek and the American National Standards Committee X3,
.\" on Information Processing Systems.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)scanf.3	8.2 (Berkeley) 12/11/93
.\"
.Dd September 5, 2023
.Dt SCANF 3
.Os
.Sh NAME
.Nm scanf ,
.Nm fscanf ,
.Nm sscanf ,
.Nm vscanf ,
.Nm vsscanf ,
.Nm vfscanf
.Nd input format conversion
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In stdio.h
.Ft int
.Fn scanf "const char * restrict format" ...
.Ft int
.Fn fscanf "FILE * restrict stream" "const char * restrict format" ...
.Ft int
.Fn sscanf "const char * restrict str" "const char * restrict format" ...
.In stdarg.h
.Ft int
.Fn vscanf "const char * restrict format" "va_list ap"
.Ft int
.Fn vsscanf "const char * restrict str" "const char * restrict format" "va_list ap"
.Ft int
.Fn vfscanf "FILE * restrict stream" "const char * restrict format" "va_list ap"
.Sh DESCRIPTION
The
.Fn scanf
family of functions scans input according to a
.Fa format
as described below.
This format may contain
.Em conversion specifiers ;
the results from such conversions, if any,
are stored through the
.Em pointer
arguments.
The
.Fn scanf
function
reads input from the standard input stream
.Dv stdin ,
.Fn fscanf
reads input from the stream pointer
.Fa stream ,
and
.Fn sscanf
reads its input from the character string pointed to by
.Fa str .
The
.Fn vfscanf
function
is analogous to
.Xr vfprintf 3
and reads input from the stream pointer
.Fa stream
using a variable argument list of pointers (see
.Xr stdarg 3 ) .
The
.Fn vscanf
function reads from the standard input using a variable argument list, and
the
.Fn vsscanf
function does the same from a string;
these are analogous to
the
.Fn vprintf
and
.Fn vsprintf
functions respectively.
Each successive
.Em pointer
argument must correspond properly with
each successive conversion specifier
(but see the
.Cm *
flag below).
All conversions are introduced by the
.Cm %
(percent sign) character.
The
.Fa format
string
may also contain other characters.
White space (such as blanks, tabs, or newlines) in the
.Fa format
string matches any amount of white space, including none, in the input.
Everything else
matches only itself.
Scanning stops
when an input character does not match such a format character.
Scanning also stops
when an input conversion cannot be made (see below).
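.Pp
As a minimal sketch of the matching rules above, the following C fragment
parses a coordinate pair from a string and forwards the pointer arguments
through
.Fn vsscanf .
The helper names
.Fn scan_fields
and
.Fn parse_point
and the input layout are assumptions made for illustration;
they are not part of the library.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: collects its pointer arguments in a va_list
 * and hands them to vsscanf(), much as sscanf() itself might.
 */
static int
scan_fields(const char *text, const char *format, ...)
{
	va_list ap;
	int assigned;

	va_start(ap, format);
	assigned = vsscanf(text, format, ap);
	va_end(ap);
	return (assigned);
}

/*
 * Hypothetical helper: parses "x=<int>,y=<int>".
 * Each space in the format matches any amount of input white space,
 * including none; 'x', '=', ',' and 'y' match only themselves.
 */
int
parse_point(const char *text, int *x, int *y)
{
	return (scan_fields(text, " x = %d , y = %d", x, y));
}
```

With this sketch, both "x=3,y=4" and "x = 3 , y = 4" should yield two
assignments, while "x=3;y=4" stops at the semicolon after one assignment.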
.Sh CONVERSIONS
Following the
.Cm %
character introducing a conversion
there may be a number of
.Em flag
characters, as follows:
.Bl -tag -width ".Cm l No (ell)"
.It Cm *
Suppresses assignment.
The conversion that follows occurs as usual, but no pointer is used;
the result of the conversion is simply discarded.
.It Cm hh
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt char
(rather than
.Vt int ) .
.It Cm h
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "short int"
(rather than
.Vt int ) .
.It Cm l No (ell)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long int"
(rather than
.Vt int ) ,
that the conversion will be one of
.Cm a , e , f ,
or
.Cm g
and the next pointer is a pointer to
.Vt double
(rather than
.Vt float ) ,
or that the conversion will be one of
.Cm c ,
.Cm s
or
.Cm \&[
and the next pointer is a pointer to an array of
.Vt wchar_t
(rather than
.Vt char ) .
.It Cm ll No (ell ell)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.It Cm L
Indicates that the conversion will be one of
.Cm a , e , f ,
or
.Cm g
and the next pointer is a pointer to
.Vt "long double" .
.It Cm j
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt intmax_t
(rather than
.Vt int ) .
.It Cm t
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt ptrdiff_t
(rather than
.Vt int ) .
.It Cm w Ns Ar N
.Po
where
.Ar N
is 8, 16, 32, or 64
.Pc
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt intN_t
(rather than
.Vt int ) .
.It Cm wf Ns Ar N
.Po
where
.Ar N
is 8, 16, 32, or 64
.Pc
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt int_fastN_t
(rather than
.Vt int ) .
.It Cm z
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt size_t
(rather than
.Vt int ) .
.It Cm q
(deprecated.)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.El
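.Pp
The following C fragment is a small sketch of how the flag characters above
pair each conversion with a pointer type.
The structure
.Vt "struct fields"
and the helpers
.Fn parse_fields
and
.Fn second_int
are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record; one member per length modifier shown. */
struct fields {
	short		s;	/* %hd  */
	long		l;	/* %ld  */
	long long	ll;	/* %lld */
	size_t		z;	/* %zu  */
	double		d;	/* %lf  */
};

/* Each modifier must agree with the type the pointer refers to. */
int
parse_fields(const char *text, struct fields *out)
{
	return (sscanf(text, "%hd %ld %lld %zu %lf",
	    &out->s, &out->l, &out->ll, &out->z, &out->d));
}

/*
 * "%*d" converts the first integer and discards it, so it takes no
 * pointer argument and is not counted in the return value.
 */
int
second_int(const char *text, int *out)
{
	return (sscanf(text, "%*d %d", out));
}
```

A mismatch between modifier and pointer type, such as passing a pointer to
.Vt long
for a plain
.Ql %d ,
is undefined behaviour in C, which is why the sketch keeps each modifier
next to the member it fills.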
.Pp
In addition to these flags,
there may be an optional maximum field width,
expressed as a decimal integer,
between the
.Cm %
and the conversion.
If no width is given,
a default of
.Dq infinity
is used (with one exception, below);
otherwise at most this many bytes are scanned
in processing the conversion.
In the case of the
.Cm lc ,
.Cm ls
and
.Cm l[
conversions, the field width specifies the maximum number
of multibyte characters that will be scanned.
Before conversion begins,
most conversions skip white space;
this white space is not counted against the field width.
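.Pp
A brief sketch of the field width in use follows.
It bounds a string conversion to the size of its buffer and splits a run of
digits into fixed-width fields.
The helpers
.Fn read_word
and
.Fn parse_date
and the 8-byte buffer are assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Reads at most 7 non-white-space bytes into an 8-byte buffer,
 * leaving room for the terminating NUL.  Leading white space is
 * skipped and is not counted against the width.
 */
int
read_word(const char *text, char word[8])
{
	return (sscanf(text, "%7s", word));
}

/* Splits a run of eight digits, YYYYMMDD, into three integers. */
int
parse_date(const char *text, int *year, int *month, int *day)
{
	return (sscanf(text, "%4d%2d%2d", year, month, day));
}
```

Giving
.Ql %s
a width one less than the buffer size is the usual way to keep a long input
word from overrunning the array.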
.Pp
The following conversions are available:
.Bl -tag -width XXXX
.It Cm %
Matches a literal
.Ql % .
That is,
.Dq Li %%
in the format string
matches a single input
.Ql %
character.
No conversion is done, and assignment does not occur.
.It Cm b , B
Matches an optionally signed binary integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm d
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt int .
.It Cm i
Matches an optionally signed integer;
the next pointer must be a pointer to
.Vt int .
The integer is read
in base 2 if it begins with
.Ql 0b
or
.Ql 0B ,
in base 16 if it begins
with
.Ql 0x
or
.Ql 0X ,
in base 8 if it begins with
.Ql 0 ,
and in base 10 otherwise.
Only characters that correspond to the base are used.
.It Cm o
Matches an optionally signed octal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm u
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm x , X
Matches an optionally signed hexadecimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm a , A , e , E , f , F , g , G
Matches a floating-point number in the style of
.Xr strtod 3 .
The next pointer must be a pointer to
.Vt float
(unless
.Cm l
or
.Cm L
is specified).
.It Cm s
Matches a sequence of non-white-space characters;
the next pointer must be a pointer to
.Vt char ,
and the array must be large enough to hold the entire sequence and the
terminating
.Dv NUL
character.
The input string stops at white space
or at the maximum field width, whichever occurs first.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm S
The same as
.Cm ls .
.It Cm c
Matches a sequence of
.Em width
count
characters (default 1);
the next pointer must be a pointer to
.Vt char ,
and there must be enough room for all the characters
(no terminating
.Dv NUL
is added).
The usual skip of leading white space is suppressed.
To skip white space first, use an explicit space in the format.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm C
The same as
.Cm lc .
.It Cm \&[
Matches a nonempty sequence of characters from the specified set
of accepted characters;
the next pointer must be a pointer to
.Vt char ,
and there must be enough room for all the characters in the string,
plus a terminating
.Dv NUL
character.
The usual skip of leading white space is suppressed.
The string is to be made up of characters in
(or not in)
a particular set;
the set is defined by the characters between the open bracket
.Cm \&[
character
and a close bracket
.Cm \&]
character.
The set
.Em excludes
those characters
if the first character after the open bracket is a circumflex
.Cm ^ .
To include a close bracket in the set,
make it the first character after the open bracket
or the circumflex;
any other position will end the set.
The hyphen character
.Cm -
is also special;
when placed between two other characters,
it adds all intervening characters to the set.
To include a hyphen,
make it the last character before the final close bracket.
For instance,
.Ql [^]0-9-]
means the set
.Dq "everything except close bracket, zero through nine, and hyphen" .
The string ends with the appearance of a character not in the
(or, with a circumflex, in) set
or when the field width runs out.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm p
Matches a pointer value (as printed by
.Ql %p
in
.Xr printf 3 ) ;
the next pointer must be a pointer to
.Vt void .
.It Cm n
Nothing is expected;
instead, the number of characters consumed thus far from the input
is stored through the next pointer,
which must be a pointer to
.Vt int .
This is
.Em not
a conversion, although it can be suppressed with the
.Cm *
flag.
.El
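.Pp
The following C fragment sketches a few of the conversions above working
together: an excluding character set, the consumed-character count, base
detection, and a single character preceded by explicit white space.
The helper names, the
.Ql key=value;
input layout, and the buffer sizes are assumptions made for this example.

```c
#include <stdio.h>

/*
 * Splits "key=value;" input.  "%15[^=]" accepts up to 15 characters
 * other than '=', "%31[^;]" accepts up to 31 characters other than
 * ';', and "%n" records how many input characters were consumed.
 * "%n" is not counted in the return value.
 */
int
split_pair(const char *text, char key[16], char value[32], int *consumed)
{
	*consumed = 0;
	return (sscanf(text, "%15[^=]=%31[^;]%n", key, value, consumed));
}

/* "%i" picks the base from the prefix: 0x for 16, leading 0 for 8. */
int
parse_auto(const char *text, int *out)
{
	return (sscanf(text, "%i", out));
}

/*
 * "%c" does not skip leading white space by itself; the space
 * before it in the format does.
 */
int
first_nonblank(const char *text, char *out)
{
	return (sscanf(text, " %c", out));
}
```

Because a character set must match at least one character, an input that
begins with
.Ql =
makes
.Fn split_pair
return zero without storing anything through the string pointers.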
.Pp
The decimal point
character is defined in the program's locale (category
.Dv LC_NUMERIC ) .
.Pp
For backwards compatibility, a
.Dq conversion
of
.Ql %\e0
causes an immediate return of
.Dv EOF .
.Sh RETURN VALUES
These
functions
return
the number of input items assigned, which can be fewer than provided
for, or even zero, in the event of a matching failure.
Zero
indicates that, while there was input available,
no conversions were assigned;
typically this is due to an invalid input character,
such as an alphabetic character for a
.Ql %d
conversion.
The value
.Dv EOF
is returned if an input failure, such as an end-of-file,
occurs before any conversion.
If an error or end-of-file occurs after conversion
has begun,
the number of conversions which were successfully completed is returned.
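.Pp
One way a caller might tell these outcomes apart is sketched below.
The enumeration
.Vt "enum scan_status"
and the helper
.Fn read_two_ints
are hypothetical names introduced for this example.

```c
#include <stdio.h>

/* Hypothetical outcome codes for a two-integer scan. */
enum scan_status {
	SCAN_OK,	/* both items assigned */
	SCAN_PARTIAL,	/* only the first item assigned */
	SCAN_NO_MATCH,	/* input present, nothing assigned */
	SCAN_NO_INPUT	/* input ran out before any conversion */
};

enum scan_status
read_two_ints(const char *text, int *a, int *b)
{
	int assigned;

	assigned = sscanf(text, "%d %d", a, b);
	if (assigned == EOF)
		return (SCAN_NO_INPUT);
	if (assigned == 0)
		return (SCAN_NO_MATCH);
	if (assigned == 1)
		return (SCAN_PARTIAL);
	return (SCAN_OK);
}
```

Comparing the result against the number of expected items, rather than
testing only for
.Dv EOF ,
also catches the partial case in which the second integer is missing or
malformed.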
.Sh SEE ALSO
.Xr getc 3 ,
.Xr mbrtowc 3 ,
.Xr printf 3 ,
.Xr strtod 3 ,
.Xr strtol 3 ,
.Xr strtoul 3 ,
.Xr wscanf 3
.Sh STANDARDS
The functions
.Fn fscanf ,
.Fn scanf ,
.Fn sscanf ,
.Fn vfscanf ,
.Fn vscanf
and
.Fn vsscanf
conform to
.St -isoC-99 .
.Sh HISTORY
The functions
.Fn scanf ,
.Fn fscanf ,
and
.Fn sscanf
first appeared in
.At v7 ,
and
.Fn vscanf ,
.Fn vsscanf ,
and
.Fn vfscanf
in
.Bx 4.3 Reno .
.Sh BUGS
Earlier implementations of
.Nm
treated
.Cm \&%D , \&%E , \&%F , \&%O
and
.Cm \&%X
as their lowercase equivalents with an
.Cm l
modifier.
In addition,
.Nm
treated an unknown conversion character as
.Cm \&%d
or
.Cm \&%D ,
depending on its case.
This functionality has been removed.
.Pp
Numerical strings are truncated to 512 characters; for example,
.Cm %f
and
.Cm %d
are implicitly
.Cm %512f
and
.Cm %512d .
.Pp
The
.Cm %n$
modifiers for positional arguments are not implemented.
.Pp
The
.Nm
family of functions does not correctly handle multibyte characters in the
.Fa format
argument.
564argument.
565