.\" $OpenBSD: locale.1,v 1.11 2023/03/05 18:55:34 ajacoutot Exp $
.\"
.\" Copyright 2016, 2020 Ingo Schwarze <schwarze@openbsd.org>
.\" Copyright 2013 Stefan Sperling <stsp@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: March 5 2023 $
.Dt LOCALE 1
.Os
.Sh NAME
.Nm locale
.Nd character encoding and localization conventions
.Sh SYNOPSIS
.Nm locale
.Op Fl a | Fl m | Cm charmap
.Sh DESCRIPTION
If the
.Nm
utility is invoked without any arguments, the current locale
configuration is shown.
Values for categories that are not set in the environment or that are
overridden by
.Ev LC_ALL
are displayed between double quotes.
.Pp
The options are as follows:
.Bl -tag -width charmap
.It Fl a
Display a list of supported locales.
.It Fl m
Display a list of supported character encodings.
On
.Ox ,
this always returns UTF-8 only.
.It Cm charmap
Display the currently selected character encoding.
On
.Ox ,
this returns either US-ASCII or UTF-8.
.El
.Pp
A locale is a set of environment variables telling programs which
character encoding, language and cultural conventions the user
prefers.
Programs in the
.Ox
base system ignore the locale except for the character encoding,
and it is not recommended to use any of these variables except that
the following non-default setting is supported as an option:
.Pp
.Dl export LC_CTYPE=en_US.UTF-8
.Pp
Programs installed from
.Xr packages 7
may or may not change behavior according to the locale.
Many programs use the X/Open System Interfaces naming scheme
for the contents of the variables listed below, which is
.Sm off
.Ar language
.Op _ Ar TERRITORY
.Op \&. Ar encoding
.Op @ Ar modifier
.Sm on
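.Pp
As an illustration of this naming scheme, the following
.Xr sh 1
sketch splits a locale name into its four parts.
The function name
.Li split_locale
is hypothetical and not part of
.Nm ;
the sketch assumes a well-formed name and performs no validation.
.Bd -literal -offset indent
# Hypothetical helper: split language[_TERRITORY][.encoding][@modifier].
# The body runs in a subshell so that the variables stay local.
split_locale() (
	name=$1 modifier= encoding= territory=
	# Peel off the optional parts from right to left.
	case $name in
	*@*) modifier=${name#*@}; name=${name%%@*} ;;
	esac
	case $name in
	*.*) encoding=${name#*.}; name=${name%%.*} ;;
	esac
	case $name in
	*_*) territory=${name#*_}; name=${name%%_*} ;;
	esac
	echo "language=$name territory=$territory" \e
	    "encoding=$encoding modifier=$modifier"
)
.Ed
.Pp
For example,
.Ql split_locale de_DE.UTF-8@euro
prints
.Ql language=de territory=DE encoding=UTF-8 modifier=euro .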
.Pp
The behavior of some library functions may also depend on the locale,
and it does on most other operating systems.
The
.Ox
C library tends to avoid locale-dependent behavior except with
respect to character encoding.
See the manual pages of individual functions for details.
.Pp
The character encoding locale
.Ev LC_CTYPE
instructs programs which character encoding to assume for text input
and to use for text output.
A character encoding maps each character of a given character set
to a byte sequence suitable for storing or transmitting the character.
.Pp
The
.Ox
base system supports two locales: the default of
.Li LC_CTYPE=C
selects the US-ASCII character set and encoding, treating the bytes
0x80 to 0xff as non-printable characters of application-specific
meaning.
.Li LC_CTYPE=POSIX
is an alias for
.Li LC_CTYPE=C .
The alternative of
.Li LC_CTYPE=en_US.UTF-8
selects the UTF-8 encoding of the Unicode character set, which is
supported by many parts of the system, but not yet fully supported
by all parts.
.Pp
If the value of
.Ev LC_CTYPE
ends in
.Ql .UTF-8 ,
programs in the
.Ox
base system ignore the beginning of it, treating for example zh_CN.UTF-8
exactly like en_US.UTF-8.
Programs from
.Xr packages 7
may, however, distinguish between such values.
If the value of
.Ev LC_CTYPE
is unsupported, programs and libraries in the
.Ox
base system fall back to
.Li LC_CTYPE=C .
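.Pp
The rules above can be summarized in a short
.Xr sh 1
sketch.
The function name
.Li base_charmap
is hypothetical; it only models the suffix rule and the fallback
described in this manual and does not consult the C library.
.Bd -literal -offset indent
# Hypothetical helper: which encoding would the base system assume
# for a given LC_CTYPE value, according to the rules above?
base_charmap() {
	case $1 in
	*.UTF-8) echo UTF-8 ;;		# anything ending in .UTF-8
	*) echo US-ASCII ;;		# C, POSIX, or unsupported
	esac
}
.Ed
.Pp
Under these assumptions,
.Ql base_charmap zh_CN.UTF-8
prints UTF-8, while
.Ql base_charmap POSIX
prints US-ASCII.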
.Pp
Some programs, for example
.Xr write 1 ,
deliberately ignore the locale and always use US-ASCII only.
See the manual pages of individual programs for details.
.Sh ENVIRONMENT
The locale configuration consists of the following environment variables:
.Bl -tag -width LC_MONETARYX
.It Ev LC_ALL
Overrides all other
.Ev LC_*
variables below.
.It Ev LC_COLLATE
Intended to affect collation order.
It may for example affect alphabetic sorting, regular expressions
including equivalence classes, and the
.Xr strcoll 3
and
.Xr strxfrm 3
functions.
.It Ev LC_CTYPE
Intended to affect character encoding, character classification,
and case conversion.
For example, it is used by
.Xr mbtowc 3 ,
.Xr iswctype 3 ,
.Xr iswalnum 3 ,
.Xr towlower 3 ,
.Xr fgetwc 3 ,
.Xr fputwc 3 ,
.Xr printf 3 ,
and
.Xr scanf 3 .
.It Ev LC_MESSAGES
Intended to affect the output of informative and diagnostic messages
and the interpretation of interactive responses, in particular
regarding the language.
It is used by
.Xr catopen 3 .
.It Ev LC_MONETARY
Intended to affect monetary formatting.
.It Ev LC_NUMERIC
Intended to affect numeric, non-monetary formatting, for example
the radix character and thousands separators.
On other operating systems, it may for example affect
.Xr printf 3 ,
.Xr scanf 3 ,
and
.Xr strtod 3 .
.It Ev LC_TIME
Intended to affect date and time formats.
It may for example affect
.Xr strftime 3 .
.It Ev LANG
Fallback if any of the above is unset.
.It Ev NLSPATH
Used by
.Xr catopen 3
to locate message catalogs.
.El
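.Pp
The precedence among these variables can be sketched in
.Xr sh 1
as follows.
The function name
.Li effective_locale
is hypothetical.
The sketch treats an empty value like an unset one and assumes that
.Li C
applies when none of the three variables is set.
.Bd -literal -offset indent
# Hypothetical helper: print the value that takes effect for one
# category.  Arguments: LC_ALL, the category variable, LANG.
effective_locale() {
	if [ -n "$1" ]; then
		echo "$1"	# LC_ALL overrides everything
	elif [ -n "$2" ]; then
		echo "$2"	# the category variable itself
	elif [ -n "$3" ]; then
		echo "$3"	# LANG as the fallback
	else
		echo C		# assumed default
	fi
}
.Ed
.Pp
It could be called as
.Ql effective_locale \&"$LC_ALL\&" \&"$LC_CTYPE\&" \&"$LANG\&"
to see which value applies to
.Ev LC_CTYPE .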
.Sh FILES
.Bl -tag -width Ds
.It Pa /usr/share/locale/UTF-8/LC_CTYPE
Character classification, case conversion, and character display
width database in
.Xr mklocale 1
binary output format used by
.Xr setlocale 3 .
.It Pa /usr/local/share/locale/
Localization data for
.Xr packages 7 ,
in particular
.Ev LC_MESSAGES
catalogs in GNU gettext format.
.It Pa /usr/local/share/nls/
Localization data for
.Xr packages 7 ,
in particular
.Ev LC_MESSAGES
catalogs in
.Xr catopen 3
format.
.It Pa /usr/src/share/locale/ctype/en_US.UTF-8.src
Character classification, case conversion, and character display
width database in
.Xr mklocale 1
input format.
.It Pa /usr/libdata/perl5/unicore/
Complete Unicode data used for generating the above database.
.It Pa /usr/src/gnu/usr.bin/perl/lib/unicore/UnicodeData.txt
The most important parts of Unicode data in a compact, more easily
human-readable format.
.El
.Sh EXIT STATUS
.Ex -std locale
.Sh SEE ALSO
.Xr mklocale 1 ,
.Xr setlocale 3 ,
.Xr Unicode::UCD 3p
.Pp
Related ports: converters/libiconv, devel/gettext, textproc/icu4c
.Sh STANDARDS
With respect to locale support, most libraries and programs in the
.Ox
base system, including the
.Nm
utility, implement a subset of the
.St -p1003.1-2008
specification.
.Sh HISTORY
The
.Nm
utility was first standardized in the
.St -xpg4 .
.Pp
It was rewritten from scratch for
.Ox 5.4
during the 2013 Toronto hackathon.
.Sh AUTHORS
.An -nosplit
.An Stefan Sperling Aq Mt stsp@openbsd.org
with contributions from
.An Philip Guenther Aq Mt guenther@openbsd.org
and
.An Jeremie Courreges-Anglas Aq Mt jca@openbsd.org .
This manual page was written by
.An Ingo Schwarze Aq Mt schwarze@openbsd.org .
.Sh BUGS
The
.Nm
concept is inadequate for inter-process communication.
Two processes exchanging text, for example over a network, using
sockets, in shared memory, or even using plain text files always
need a protocol-specific way to negotiate the character encoding
used.
.Pp
The list of supported locales is perpetually incomplete.
260The list of supported locales is perpetually incomplete.
261