.\" $OpenBSD: mbrlen.3,v 1.6 2022/03/29 18:15:52 naddy Exp $
.\" $NetBSD: mbrlen.3,v 1.5 2003/09/08 17:54:31 wiz Exp $
.\"
.\" Copyright (c)2002 Citrus Project,
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd $Mdocdate: March 29 2022 $
.Dt MBRLEN 3
.Os
.\" ----------------------------------------------------------------------
.Sh NAME
.Nm mbrlen
.Nd get number of bytes in a multibyte character (restartable)
.\" ----------------------------------------------------------------------
.Sh SYNOPSIS
.In wchar.h
.Ft size_t
.Fn mbrlen "const char * restrict s" "size_t n" "mbstate_t * restrict ps"
.\" ----------------------------------------------------------------------
.Sh DESCRIPTION
The
.Fn mbrlen
function returns the number of bytes
in the first multibyte character of the multibyte string
.Fa s .
It examines at most the first
.Fa n
bytes of
.Fa s .
.Pp
.Fn mbrlen
is equivalent to the following call, except that
.Fa ps
is evaluated only once:
.Bd -literal -offset indent
mbrtowc(NULL, s, n, (ps != NULL) ? ps : &internal);
.Ed
.Pp
Here,
.Fa internal
is an internal state object that is automatically set
to the initial conversion state at program startup.
.Pp
In state-dependent encodings,
.Fa s
may point to special sequence bytes changing the shift state.
Although such sequence bytes correspond to no wide character,
they affect the conversion state object pointed to by
.Fa ps ,
and
.Fn mbrlen
treats the special sequence bytes
as if they were part of the subsequent multibyte character.
.Pp
Unlike
.Xr mblen 3 ,
.Fn mbrlen
accepts a byte sequence that is not a complete character
but is the initial part of some valid character.
In this case, the function consumes all such bytes
and saves them in the conversion state object pointed to by
.Fa ps .
Subsequent calls use the saved bytes to resume
the suspended conversion.
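.Pp
The restart behaviour described above can be sketched as follows.
This is a minimal illustration, not part of libc:
.Fn count_chunk
is a hypothetical helper, and treating a NUL as a single byte is an
assumption that holds for stateless encodings such as UTF-8.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libc): consume one chunk of input and
 * add the number of characters completed in it to *nchars.  A character
 * split across two chunks is carried over in *ps.
 * Returns 0 on success and -1 on an invalid byte sequence.
 */
static int
count_chunk(const char *buf, size_t len, mbstate_t *ps, size_t *nchars)
{
	while (len > 0) {
		size_t r = mbrlen(buf, len, ps);

		if (r == (size_t)-1)
			return -1;	/* invalid sequence, errno is set */
		if (r == (size_t)-2)
			return 0;	/* rest of chunk saved in *ps */
		if (r == 0)
			r = 1;		/* NUL; assumed to occupy one byte */
		buf += r;
		len -= r;
		(*nchars)++;
	}
	return 0;
}
```

.Pp
A caller would zero an
.Vt mbstate_t
object once and pass the same object with every chunk, so that a
character split across two chunks is counted when its last byte arrives.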
.Pp
The behaviour of
.Fn mbrlen
is affected by the
.Dv LC_CTYPE
category of the current locale.
.Pp
The following special cases apply:
.Bl -tag -width 0123456789
.It "s == NULL"
.Fn mbrlen
sets the conversion state object pointed to by
.Fa ps
to the initial conversion state and always returns 0.
Unlike
.Xr mblen 3 ,
the value returned does not indicate whether the current encoding of
the locale is state-dependent.
.Pp
In this case,
.Fn mbrlen
ignores
.Fa n .
.It "n == 0"
In this case,
the first
.Fa n
bytes of
.Fa s
never form a complete character.
Thus,
.Fn mbrlen
always returns (size_t)-2.
.It "ps == NULL"
.Fn mbrlen
uses its own internal state object to keep the conversion state
instead of the
.Fa ps
argument.
.Pp
Calling any other function in
.Em libc
never changes the internal state of
.Fn mbrlen ,
except for calling
.Xr setlocale 3
with an
.Dv LC_CTYPE
that differs from the current locale.
Such
.Xr setlocale 3
calls cause the internal state of this function to become indeterminate.
.El
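.Pp
As a small sketch of the
.Qq s == NULL
case,
a caller-owned state object can be returned to the initial conversion
state, for example after abandoning a partially read character.
.Fn reset_state
is a hypothetical name used only for illustration.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: return a caller-owned state object to the
 * initial conversion state, discarding any saved partial character.
 */
static void
reset_state(mbstate_t *ps)
{
	/* With s == NULL the n argument is ignored and 0 is returned. */
	(void)mbrlen(NULL, 0, ps);
}
```

.Pp
Afterwards,
.Xr mbsinit 3
is expected to report that the object is in the initial state.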
.\" ----------------------------------------------------------------------
.Sh RETURN VALUES
The
.Fn mbrlen
function returns:
.Bl -tag -width "(size_t)-2"
.It "0"
.Fa s
points to a NUL byte
.Pq Sq \e0 .
.It "positive"
The value returned is
the number of bytes in the valid multibyte character pointed to by
.Fa s .
This value is never greater than
.Fa n
or the value of the
.Dv MB_CUR_MAX
macro.
.It "(size_t)-2"
The first
.Fa n
bytes of
.Fa s
contain an incomplete multibyte character that can potentially be
completed by reading more bytes.
When
.Fa n
is at least
.Dv MB_CUR_MAX ,
this can only occur if
.Fa s
contains a redundant shift sequence.
.It "(size_t)-1"
.Fa s
points to an illegal byte sequence which does not form a valid multibyte
character.
In this case,
.Fn mbrlen
sets
.Va errno
to indicate the error.
.El
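.Pp
The return values listed above might be handled as in the following sketch,
which counts the characters of a NUL-terminated string under the current
.Dv LC_CTYPE
locale.
.Fn mb_count
is a hypothetical helper, and reporting a truncated character as an error
is a design choice made for this example only.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the characters in the NUL-terminated
 * multibyte string s under the current LC_CTYPE locale.
 * Returns (size_t)-1 if s is invalid or ends in a truncated character.
 */
static size_t
mb_count(const char *s)
{
	mbstate_t st;
	size_t left = strlen(s) + 1;	/* include the terminating NUL */
	size_t count = 0;

	memset(&st, 0, sizeof(st));	/* initial conversion state */
	for (;;) {
		size_t r = mbrlen(s, left, &st);

		if (r == 0)
			return count;		/* reached the NUL */
		if (r == (size_t)-1 || r == (size_t)-2)
			return (size_t)-1;	/* invalid or truncated */
		s += r;				/* 0 < r <= left */
		left -= r;
		count++;
	}
}
```

.Pp
Using a local
.Vt mbstate_t
object rather than passing
.Dv NULL
keeps the helper independent of the internal state of
.Fn mbrlen .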
.\" ----------------------------------------------------------------------
.Sh ERRORS
.Fn mbrlen
may cause an error in the following cases:
.Bl -tag -width Er
.It Bq Er EILSEQ
.Fa s
points to an invalid multibyte character.
.It Bq Er EINVAL
.Fa ps
points to an invalid or uninitialized
.Vt mbstate_t
object.
.El
.\" ----------------------------------------------------------------------
.Sh SEE ALSO
.Xr mblen 3 ,
.Xr mbrtowc 3 ,
.Xr setlocale 3
.\" ----------------------------------------------------------------------
.Sh STANDARDS
The
.Fn mbrlen
function conforms to
.\" .St -isoC-amd1 .
ISO/IEC 9899/AMD1:1995
.Pq Dq ISO C90, Amendment 1 .
The restrict qualifier was added in
.\" .St -isoC99 .
ISO/IEC 9899:1999
.Pq Dq ISO C99 .