.\"	$OpenBSD: scanf.3,v 1.21 2013/03/05 17:19:06 otto Exp $
.\"
.\" Copyright (c) 1990, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Chris Torek and the American National Standards Committee X3,
.\" on Information Processing Systems.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd $Mdocdate: March 5 2013 $
.Dt SCANF 3
.Os
.Sh NAME
.Nm scanf ,
.Nm fscanf ,
.Nm sscanf ,
.Nm vscanf ,
.Nm vsscanf ,
.Nm vfscanf
.Nd input format conversion
.Sh SYNOPSIS
.In stdio.h
.Ft int
.Fn scanf "const char *format" ...
.Ft int
.Fn fscanf "FILE *stream" "const char *format" ...
.Ft int
.Fn sscanf "const char *str" "const char *format" ...
.In stdarg.h
.Ft int
.Fn vscanf "const char *format" "va_list ap"
.Ft int
.Fn vsscanf "const char *str" "const char *format" "va_list ap"
.Ft int
.Fn vfscanf "FILE *stream" "const char *format" "va_list ap"
.Sh DESCRIPTION
The
.Fn scanf
family of functions reads input according to the given
.Fa format
as described below.
This format may contain
.Dq conversion specifiers ;
the results of such conversions, if any, are stored through a set of pointer
arguments.
.Pp
The
.Fn scanf
function reads input from the standard input stream
.Em stdin ,
.Fn fscanf
reads input from the supplied stream pointer
.Fa stream ,
and
.Fn sscanf
reads its input from the character string pointed to by
.Fa str .
.Pp
The
.Fn vfscanf
function is analogous to
.Xr vfprintf 3
and reads input from the stream pointer
.Fa stream
using a variable argument list of pointers (see
.Xr stdarg 3 ) .
The
.Fn vscanf
function scans a variable argument list from the standard input and the
.Fn vsscanf
function scans it from a string; these are analogous to the
.Fn vprintf
and
.Fn vsprintf
functions, respectively.
.Pp
Each successive
.Em pointer
argument must correspond properly with each successive conversion specifier
(but see the
.Cm *
conversion below).
All conversions are introduced by the
.Cm %
(percent sign) character.
The
.Fa format
string may also contain other characters.
Whitespace (such as blanks, tabs, or newlines) in the
.Fa format
string matches any amount of whitespace, including none, in the input.
Everything else matches only itself.
Scanning stops when an input character does not match such a format character.
Scanning also stops when an input conversion cannot be made (see below).
.Sh CONVERSIONS
Following the
.Cm %
character, introducing a conversion, there may be a number of
.Em flag
characters, as follows:
.Bl -tag -width "ll (ell ell)"
.It Cm *
Suppresses assignment.
The conversion that follows occurs as usual, but no pointer is used;
the result of the conversion is simply discarded.
.It Cm hh
Indicates that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to a
.Vt char
(rather than
.Vt int ) .
.It Cm h
Indicates that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "short int"
(rather than
.Vt int ) .
.It Cm l No (ell)
Indicates either that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long int"
(rather than
.Vt int ) ,
or that the conversion will be one of
.Cm efg
and the next pointer is a pointer to
.Vt double
(rather than
.Vt float ) ,
or that the conversion will be one of
.Cm sc[ .
.Pp
If the conversion is one of
.Cm sc[
the expected conversion input is a multibyte character sequence.
Each multibyte character in the sequence is converted with a call to the
.Fn mbrtowc
function.
The field width specifies the maximum number of bytes read from the
multibyte character sequence and passed to
.Fn mbrtowc
for conversion.
The next pointer is a pointer to a
.Vt wchar_t
wide-character buffer large enough to accept the converted input sequence
including the terminating NUL wide character which will be added automatically.
.It Cm ll No (ell ell)
Indicates that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.It Cm L
Indicates that the conversion will be one of
.Cm efg
and the next pointer is a pointer to
.Vt "long double" .
.It Cm j
Indicates that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to an
.Vt intmax_t
(rather than
.Vt int ) .
.It Cm t
Indicates that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to a
.Vt ptrdiff_t
(rather than
.Vt int ) .
.It Cm z
Indicates that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to a
.Vt size_t
(rather than
.Vt int ) .
.It Cm q
(deprecated)
Indicates that the conversion will be one of
.Cm dioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.El
.Pp
In addition to these flags, there may be an optional maximum field width,
expressed as a decimal integer, between the
.Cm %
and the conversion.
If no width is given,
a default of
.Dq infinity
is used (with one exception, below);
otherwise at most this many characters are scanned in processing the
conversion.
Before conversion begins, most conversions skip whitespace;
this whitespace is not counted against the field width.
.Pp
The following conversions are available:
.Bl -tag -width XXXX
.It Cm %
Matches a literal
.Ql % .
That is,
.Dq Li %%
in the format string matches a single input
.Ql %
character.
No conversion is done, and assignment does not occur.
.It Cm d
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt int .
.It Cm D
Equivalent to
.Cm ld ;
this exists only for backwards compatibility.
.It Cm i
Matches an optionally signed integer;
the next pointer must be a pointer to
.Vt int .
The integer is read in base 16 if it begins
with
.Ql 0x
or
.Ql 0X ,
in base 8 if it begins with
.Ql 0 ,
and in base 10 otherwise.
Only characters that correspond to the base are used.
.It Cm o
Matches an octal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm O
Equivalent to
.Cm lo ;
this exists for backwards compatibility.
.It Cm u
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm xX
Matches an optionally signed hexadecimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm eE
Equivalent to
.Cm f .
.It Cm fF
Matches an optionally signed floating-point number;
the next pointer must be a pointer to
.Vt float .
.It Cm gG
Equivalent to
.Cm f .
.It Cm aA
Equivalent to
.Cm f .
.It Cm s
Matches a sequence of non-whitespace characters;
the next pointer must be a pointer to
.Vt char ,
or to
.Vt wchar_t
if the
.Vt l
length modifier is present.
The provided array must be large enough to accept and store
the whole sequence and the terminating NUL character.
The input string stops at whitespace
or at the maximum field width, whichever occurs first.
If specified, the maximum field width refers to the sequence
being scanned rather than the storage space, so the provided
array must be one element larger than the field width
to hold the terminating NUL character.
.It Cm c
Matches a sequence of characters consuming the number of bytes
specified by the field width (defaults to 1 if unspecified);
the next pointer must be a pointer to
.Vt char ,
or to
.Vt wchar_t
if the
.Vt l
length modifier is present.
There must be enough room for all the characters
(no terminating NUL is added).
The usual skip of leading whitespace is suppressed.
To skip whitespace first, use an explicit space in the format.
.It Cm \&[
Matches a nonempty sequence of characters from the specified set
of accepted characters;
the next pointer must be a pointer to
.Vt char ,
or to
.Vt wchar_t
if the
.Vt l
length modifier is present.
There must be enough room for all the characters in the string,
plus a terminating NUL character.
The usual skip of leading whitespace is suppressed.
.Pp
The string is to be made up of characters in
(or not in)
a particular set;
the set is defined by the characters between the open bracket
.Cm \&[
character
and a close bracket
.Cm \&]
character.
The set
.Em excludes
those characters
if the first character after the open bracket is a circumflex
.Cm ^ .
To include a close bracket in the set,
make it the first character after the open bracket
or the circumflex;
any other position will end the set.
The hyphen character
.Cm \-
is also special;
when placed between two other characters,
it adds all intervening characters to the set.
To include a hyphen,
make it the last character before the final close bracket.
.Pp
For instance,
.Ql [^]0-9-]
means the set
.Do
everything except close bracket, zero through nine, and hyphen
.Dc .
The string ends with the appearance of a character not in
(or, with a circumflex, in) the set
or when the field width runs out.
.It Cm p
Matches a pointer value (as printed by
.Ql %p
in
.Xr printf 3 ) ;
the next pointer must be a pointer to
.Vt void .
.It Cm n
Nothing is expected;
instead, the number of characters consumed thus far from the input
is stored through the next pointer,
which must be a pointer to
.Vt int .
This is
.Em not
a conversion, although it can be suppressed with the
.Cm *
flag.
.El
.Pp
For backwards compatibility, other conversion characters (except
.Ql \e0 )
are taken as if they were
.Ql %d
or, if uppercase,
.Ql %ld ,
and a `conversion' of
.Ql %\e0
causes an immediate return of
.Dv EOF .
.Sh RETURN VALUES
These functions return the number of input items assigned, which can be fewer
than provided for, or even zero, in the event of a matching failure.
Zero indicates that, while there was input available, no conversions were
assigned; typically this is due to an invalid input character,
such as an alphabetic character for a
.Ql %d
conversion.
The value
.Dv EOF
is returned if an input failure,
such as an end-of-file,
occurs before any conversion.
If an error or end-of-file occurs after conversion has begun,
the number of conversions which were successfully completed is returned.
.Sh SEE ALSO
.Xr getc 3 ,
.Xr mbrtowc 3 ,
.Xr printf 3 ,
.Xr strtod 3 ,
.Xr strtol 3 ,
.Xr strtoul 3
.Sh STANDARDS
The functions
.Fn fscanf ,
.Fn scanf ,
and
.Fn sscanf
conform to
.St -ansiC .
.Sh HISTORY
The functions
.Fn vscanf ,
.Fn vsscanf ,
and
.Fn vfscanf
first appeared in
.Bx 4.3 Reno .
.Sh BUGS
Numerical strings are truncated to 512 characters; for example,
.Cm %f
and
.Cm %d
are implicitly
.Cm %512f
and
.Cm %512d .
462.Cm %512d .
463