<HTML><HEAD><TITLE>Tcl Built-In Commands - encoding manual page</TITLE></HEAD><BODY>
<DL>
<DD><A HREF="encoding.htm#M2" NAME="L218">NAME</A>
<DL><DD>encoding - Manipulate encodings</DL>
<DD><A HREF="encoding.htm#M3" NAME="L219">SYNOPSIS</A>
<DL>
<DD><B>encoding </B><I>option</I> ?<I>arg arg ...</I>?
</DL>
<DD><A HREF="encoding.htm#M4" NAME="L220">INTRODUCTION</A>
<DD><A HREF="encoding.htm#M5" NAME="L221">DESCRIPTION</A>
<DL>
<DD><A HREF="encoding.htm#M6" NAME="L222"><B>encoding convertfrom</B> ?<I>encoding</I>? <I>data</I></A>
<DD><A HREF="encoding.htm#M7" NAME="L223"><B>encoding convertto</B> ?<I>encoding</I>? <I>string</I></A>
<DD><A HREF="encoding.htm#M8" NAME="L224"><B>encoding names</B></A>
<DD><A HREF="encoding.htm#M9" NAME="L225"><B>encoding system</B> ?<I>encoding</I>?</A>
</DL>
<DD><A HREF="encoding.htm#M10" NAME="L226">EXAMPLE</A>
<DD><A HREF="encoding.htm#M11" NAME="L227">SEE ALSO</A>
<DD><A HREF="encoding.htm#M12" NAME="L228">KEYWORDS</A>
</DL><HR>
<H3><A NAME="M2">NAME</A></H3>
encoding - Manipulate encodings
<H3><A NAME="M3">SYNOPSIS</A></H3>
<B>encoding </B><I>option</I> ?<I>arg arg ...</I>?<BR>
<H3><A NAME="M4">INTRODUCTION</A></H3>
Strings in Tcl are encoded using 16-bit Unicode characters.  Different
operating system interfaces or applications may generate strings in
other encodings such as Shift-JIS.  The <B>encoding</B> command helps
to bridge the gap between Unicode and these other formats.
<H3><A NAME="M5">DESCRIPTION</A></H3>
Performs one of several encoding-related operations, depending on
<I>option</I>.  The legal <I>option</I>s are:
<P>
<DL>
<DT><A NAME="M6"><B>encoding convertfrom</B> ?<I>encoding</I>? <I>data</I></A><DD>
Convert <I>data</I> to Unicode from the specified <I>encoding</I>.  The
characters in <I>data</I> are treated as binary data: the lower
8 bits of each character are taken as a single byte.  The resulting
sequence of bytes is treated as a string in the specified
<I>encoding</I>.  If <I>encoding</I> is not specified, the current
system encoding is used.
<P><DT><A NAME="M7"><B>encoding convertto</B> ?<I>encoding</I>? <I>string</I></A><DD>
Convert <I>string</I> from Unicode to the specified <I>encoding</I>.
The result is a sequence of bytes that represents the converted
string.  Each byte is stored in the lower 8 bits of a Unicode
character.  If <I>encoding</I> is not specified, the current
system encoding is used.
<P><DT><A NAME="M8"><B>encoding names</B></A><DD>
Returns a list containing the names of all of the encodings that are
currently available.
<P><DT><A NAME="M9"><B>encoding system</B> ?<I>encoding</I>?</A><DD>
Set the system encoding to <I>encoding</I>. If <I>encoding</I> is
omitted then the command returns the current system encoding.  The
system encoding is used whenever Tcl passes strings to system calls.
<P></DL>
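<P>
The options above can be combined in a short round trip.  The following
is a minimal sketch, not part of the original manual: it assumes that the
<B>utf-8</B> encoding is available (it normally ships with Tcl), and the
variable names <I>original</I>, <I>bytes</I>, and <I>decoded</I> are
chosen only for illustration.
<PRE># Encode a Unicode string as UTF-8 bytes, then decode it back.
set original &quot;caf&#92;u00E9&quot;
set bytes [encoding convertto utf-8 $original]
set decoded [encoding convertfrom utf-8 $bytes]

# The accented letter needs two bytes in UTF-8, so the byte
# sequence is one element longer than the character string.
puts &quot;characters: [string length $original]&quot;
puts &quot;bytes: [string length $bytes]&quot;
puts &quot;round trip ok: [string equal $original $decoded]&quot;

# Check that an encoding is available, and report the system encoding.
puts &quot;utf-8 available: [expr {[lsearch -exact [encoding names] utf-8] != -1}]&quot;
puts &quot;system encoding: [encoding system]&quot;</PRE>
Passing the encoding name explicitly, as done here, keeps the result
independent of whatever system encoding happens to be in effect.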
<H3><A NAME="M10">EXAMPLE</A></H3>
It is common practice to write script files using a text editor that
produces output in the euc-jp encoding, which represents ASCII
characters as single bytes and Japanese characters as two bytes.  This
makes it easy to embed literal strings that correspond to non-ASCII
characters by simply typing the strings in place in the script.
However, because the <B><A HREF="../TclCmd/source.htm">source</A></B> command always reads files using the
current system encoding, Tcl will only source such files correctly
when the encoding used to write the file is the same as the system
encoding.  This tends not to be true in an internationalized setting.
For example, if such a file were sourced in North America (where
ISO8859-1 is normally used), each byte in the file would be treated as
a separate character that maps to the 00 page in Unicode.  The
resulting Tcl strings would not contain the expected Japanese
characters.  Instead, they would contain a sequence of Latin-1
characters that correspond to the bytes of the original string.  The
<B>encoding</B> command can be used to convert such a string to the
expected Japanese Unicode characters.  For example,
<PRE>set s [<B>encoding convertfrom</B> euc-jp &quot;&#92;xA4&#92;xCF&quot;]</PRE>
would return the Unicode string &quot;&#92;u306F&quot;, which is the Hiragana
letter HA.
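<P>
One way to sidestep the problem for a whole script file is to read the
file through a channel configured with the file's encoding and then
evaluate the result.  The following is a sketch under stated assumptions
rather than a documented facility: <B>sourceWithEncoding</B> is a
hypothetical helper name, and the caller is assumed to know the encoding
in which the file was written.
<PRE># Hypothetical helper (not a built-in command): evaluate a script
# file that was written in a known encoding such as euc-jp.
proc sourceWithEncoding {path enc} {
    set chan [open $path r]
    # Decode the file's bytes using enc rather than the system encoding.
    fconfigure $chan -encoding $enc
    set script [read $chan]
    close $chan
    # Evaluate the decoded script in the caller's context.
    uplevel 1 $script
}</PRE>
A call such as <B>sourceWithEncoding</B> <I>fileName</I> euc-jp would
then evaluate the file with its Japanese literals decoded as intended,
whatever the current system encoding is.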

<H3><A NAME="M11">SEE ALSO</A></H3>
<B><A HREF="../TclLib/Encoding.htm">Tcl_GetEncoding</A></B>
<H3><A NAME="M12">KEYWORDS</A></H3>
<A href="../Keywords/E.htm#encoding">encoding</A>
<HR><PRE>
<A HREF="../copyright.htm">Copyright</A> &#169; 1998 by Scriptics Corporation.
<A HREF="../copyright.htm">Copyright</A> &#169; 1995-1997 Roger E. Critchlow Jr.</PRE>
</BODY></HTML>