README
Unicode::Map8
-------------

The Unicode::Map8 class implements efficient mapping tables between
8-bit character sets and 16-bit character sets like Unicode. About
170 different mapping tables between various known character sets and
Unicode are distributed with this package. These tables are derived
from the vendor mapping tables provided by Unicode, Inc. and the code
tables in RFC 1345. New maps can easily be installed.

By coincidence, Martin Schwartz created a similar module at the same
time I did. His module is called Unicode::Map and should be available
on CPAN too. Both modules now support a unified interface. Martin's
module will be deprecated in the future.

Since UTF-8 support is coming to Perl soon, there might be good reasons
to move this module in the direction of mapping to/from UTF-8. I will
probably do so once the Unicode support in the Perl core settles.


EXAMPLE OF USE

 require Unicode::Map8;
 # Load the maps for Norwegian ISO 646 and Windows Latin 1.
 my $no_map = Unicode::Map8->new("ISO646-NO") || die;
 my $l1_map = Unicode::Map8->new("WinLatin1") || die;

 # Convert the 7-bit Norwegian text to 16-bit Unicode, then to Latin 1.
 my $ustr = $no_map->to16("V}re norske tegn b|r {res\n");
 my $lstr = $l1_map->to8($ustr);
 print $lstr;

 # Recode the Latin 1 string back to ISO646-NO in one step.
 print $l1_map->recode8($no_map, $lstr);

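To make the expected result of the example above concrete, here is a
minimal standalone sketch that does not use Unicode::Map8 at all. It
assumes only the six national letter positions of ISO646-NO (NS_4551-1)
and rewrites them to their Latin 1 bytes with tr///. The sub name
no646_to_latin1 is made up for this illustration, and the real mapping
tables cover more positions than these six.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::Map8: rewrite the six
# Norwegian letter positions of ISO646-NO to their Latin 1 bytes.
#   [ \ ]  ->  AE, O-slash, A-ring (upper case)
#   { | }  ->  ae, o-slash, a-ring (lower case)
sub no646_to_latin1 {
    my ($str) = @_;
    (my $out = $str) =~ tr/[\\]{|}/\xC6\xD8\xC5\xE6\xF8\xE5/;
    return $out;
}

my $latin1 = no646_to_latin1("V}re norske tegn b|r {res\n");

# Print the bytes as hex so the result is readable on any terminal.
print join(" ", map { sprintf "%02X", ord } split //, $latin1), "\n";
```

All other characters in the string are plain ASCII and pass through
unchanged, which is why the 7-bit text stays mostly readable before
conversion.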

INSTALLATION

I recommend that you first install the Unicode-String Perl
module. Once this is done, just perform the usual steps:

 perl Makefile.PL
 make
 make test
 make install

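After installing, one way to confirm that Perl can find both modules is
a small check script. The following is only a sketch: the helper name
module_version is invented here, and it relies on nothing beyond the
standard require and the universal VERSION method.

```perl
use strict;
use warnings;

# Hypothetical helper: return the version of an installed module,
# "unknown" if it loads but sets no $VERSION, or undef if it cannot
# be loaded at all.
sub module_version {
    my ($name) = @_;
    (my $file = "$name.pm") =~ s{::}{/}g;   # Unicode::Map8 -> Unicode/Map8.pm
    return undef unless eval { require $file; 1 };
    my $version = $name->VERSION;
    return defined $version ? $version : "unknown";
}

for my $module ("Unicode::String", "Unicode::Map8") {
    my $version = module_version($module);
    print "$module: ",
        (defined $version ? "version $version" : "not installed"), "\n";
}
```

If either line reports "not installed", check that "make install" put
the files in a directory listed in @INC.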


SUPPORTED CHARACTER SETS

The following character sets have mapping tables distributed with this
package. Names listed on the same line are aliases for the same table.

 ANSI_X3.110-1983 CSA_T500-1983 NAPLPS iso-ir-99
 ANSI_X3.4-1968 ANSI_X3.4-1986 ASCII IBM367 ISO646-US ISO_646.irv:1991 US-ASCII cp367 iso-ir-6 us
 ASMO_449 ISO_9036 arabic7 iso-ir-89
 Adobe-Standard adobe-standard
 Adobe-Symbol adobe-symbol
 Adobe-Zapf-Dingbats adobe-zapf-dingbats
 BS_4730 ISO646-GB gb iso-ir-4 uk
 BS_viewdata iso-ir-47
 CSA_Z243.4-1985-1 ISO646-CA ca csa7-1 iso-ir-121
 CSA_Z243.4-1985-2 ISO646-CA2 csa7-2 iso-ir-122
 CSA_Z243.4-1985-gr iso-ir-123
 CSN_369103 iso-ir-139
 DEC-MCS dec
 DIN_66003 ISO646-DE de iso-ir-21
 DS_2089 DS2089 ISO646-DK dk
 EBCDIC-AT-DE
 EBCDIC-AT-DE-A
 EBCDIC-CA-FR
 EBCDIC-DK-NO
 EBCDIC-DK-NO-A
 EBCDIC-ES
 EBCDIC-ES-A
 EBCDIC-ES-S
 EBCDIC-FI-SE
 EBCDIC-FI-SE-A
 EBCDIC-FR
 EBCDIC-IT
 EBCDIC-PT
 EBCDIC-UK
 EBCDIC-US
 ECMA-cyrillic iso-ir-111
 ES ISO646-ES iso-ir-17
 ES2 ISO646-ES2 iso-ir-85
 GB_1988-80 ISO646-CN cn iso-ir-57
 GOST_19768-74 ST_SEV_358-88 iso-ir-153
 IBM037 cp037 ebcdic-cp-ca ebcdic-cp-nl ebcdic-cp-us ebcdic-cp-wt
 IBM038 EBCDIC-INT cp038
 IBM1026 CP1026
 IBM273 CP273
 IBM274 CP274 EBCDIC-BE
 IBM275 EBCDIC-BR cp275
 IBM277 EBCDIC-CP-DK EBCDIC-CP-NO
 IBM278 CP278 ebcdic-cp-fi ebcdic-cp-se
 IBM280 CP280 ebcdic-cp-it
 IBM281 EBCDIC-JP-E cp281
 IBM284 CP284 ebcdic-cp-es
 IBM285 CP285 ebcdic-cp-gb
 IBM290 EBCDIC-JP-kana cp290
 IBM297 cp297 ebcdic-cp-fr
 IBM420 cp420 ebcdic-cp-ar1
 IBM424 cp424 ebcdic-cp-he
 IBM437 437 cp437
 IBM500 CP500 ebcdic-cp-be ebcdic-cp-ch
 IBM850 850 cp850
 IBM851 851 cp851
 IBM852 852 cp852
 IBM855 855 cp855
 IBM857 857 cp857
 IBM860 860 cp860
 IBM861 861 cp-is cp861
 IBM862 862 cp862
 IBM863 863 cp863
 IBM864 cp864
 IBM865 865 cp865
 IBM868 CP868 cp-ar
 IBM869 869 cp-gr cp869
 IBM870 CP870 ebcdic-cp-roece ebcdic-cp-yu
 IBM871 CP871 ebcdic-cp-is
 IBM880 EBCDIC-Cyrillic cp880
 IBM891 cp891
 IBM903 cp903
 IBM904 904 cp904
 IBM905 CP905 ebcdic-cp-tr
 IBM918 CP918 ebcdic-cp-ar2
 IEC_P27-1 iso-ir-143
 INIS iso-ir-49
 INIS-8 iso-ir-50
 INIS-cyrillic iso-ir-51
 INVARIANT
 ISO_10367-box iso-ir-155
 ISO_2033-1983 e13b iso-ir-98
 ISO_5427 ISO_5427:1981 iso-ir-37 iso-ir-54
 ISO_5428 ISO_5428:1980 iso-ir-55
 ISO_646.basic ISO_646.basic:1983 ref
 ISO_646.irv ISO_646.irv:1983 irv iso-ir-2
 ISO_6937-2-25 iso-ir-152
 ISO_6937-2-add iso-ir-142
 ISO_8859-1 8859-1 CP819 IBM819 ISO-8859-1 ISO_8859-1:1987 iso-ir-100 iso8859-1 l1 latin1
 ISO_8859-2 8859-2 ISO-8859-2 ISO_8859-2:1987 iso-ir-101 iso8859-2 l2 latin2
 ISO_8859-3 8859-3 ISO-8859-3 ISO_8859-3:1988 iso-ir-109 iso8859-3 l3 latin3
 ISO_8859-4 8859-4 ISO-8859-4 ISO_8859-4:1988 iso-ir-110 iso8859-4 l4 latin4
 ISO_8859-5 8859-5 ISO-8859-5 ISO_8859-5:1988 cyrillic iso-ir-144 iso8859-5
 ISO_8859-6 8859-6 ASMO-708 ECMA-114 ISO-8859-6 ISO_8859-6:1987 arabic iso-ir-127 iso8859-6
 ISO_8859-7 8859-7 ECMA-118 ELOT_928 ISO-8859-7 ISO_8859-7:1987 greek greek8 iso-ir-126 iso8859-7
 ISO_8859-8 8859-8 ISO-8859-8 ISO_8859-8:1988 hebrew iso-ir-138 iso8859-8
 ISO_8859-9 8859-9 ISO-8859-9 ISO_8859-9:1989 iso-ir-148 iso8859-9 l5 latin5
 ISO_8859-supp iso-ir-154 latin1-2-5
 IT ISO646-IT iso-ir-15
 JIS_C6220-1969-jp JIS_C6220-1969 iso-ir-13 katakana x0201-7
 JIS_C6220-1969-ro ISO646-JP iso-ir-14 jp
 JIS_C6229-1984-a iso-ir-91 jp-ocr-a
 JIS_C6229-1984-b ISO646-JP-OCR-B iso-ir-92 jp-ocr-b
 JIS_C6229-1984-b-add iso-ir-93 jp-ocr-b-add
 JIS_C6229-1984-hand iso-ir-94 jp-ocr-hand
 JIS_C6229-1984-hand-add iso-ir-95 jp-ocr-hand-add
 JIS_C6229-1984-kana iso-ir-96
 JIS_X0201 X0201
 JUS_I.B1.002 ISO646-YU iso-ir-141 js yu
 JUS_I.B1.003-mac iso-ir-147 macedonian
 JUS_I.B1.003-serb iso-ir-146 serbian
 KSC5636 ISO646-KR
 Latin-greek-1 iso-ir-27
 MSZ_7795.3 ISO646-HU hu iso-ir-86
 NATS-DANO iso-ir-9-1
 NATS-DANO-ADD iso-ir-9-2
 NATS-SEFI iso-ir-8-1
 NATS-SEFI-ADD iso-ir-8-2
 NC_NC00-10 ISO646-CU NC_NC00-10:81 cuba iso-ir-151
 NF_Z_62-010 ISO646-FR ISO646-FR1 NF_Z_62-010_(1973) fr iso-ir-25 iso-ir-69
 NS_4551-1 ISO646-NO iso-ir-60 no
 NS_4551-2 ISO646-NO2 iso-ir-61 no2
 PT ISO646-PT iso-ir-16
 PT2 ISO646-PT2 iso-ir-84
 SEN_850200_B FI ISO646-FI ISO646-SE iso-ir-10 se
 SEN_850200_C ISO646-SE2 iso-ir-11 se2
 T.101-G2 iso-ir-128
 T.61-7bit iso-ir-102
 T.61-8bit T.61 iso-ir-103
 cp037 IBMUSCanada
 cp10000 MacRoman
 cp10006 MacGreek
 cp10007 MacCyrillic
 cp10029 MacLatin2
 cp10079 MacIcelandic
 cp10081 MacTurkish
 cp1026 IBMLatin5Turkish
 cp1250 WinLatin2
 cp1251 WinCyrillic
 cp1252 WinLatin1
 cp1253 WinGreek
 cp1254 WinTurkish
 cp1255 WinHebrew
 cp1256 WinArabic
 cp1257 WinBaltic
 cp1258 WinVietnamese
 cp437 DOSLatinUS
 cp500 IBMInternational
 cp737 DOSGreek
 cp775 DOSBaltRim
 cp850 DOSLatin1
 cp852 DOSLatin2
 cp855 DOSCyrillic
 cp857 DOSTurkish
 cp860 DOSPortuguese
 cp861 DOSIcelandic
 cp862 DOSHebrew
 cp863 DOSCanadaF
 cp864 DOSArabic
 cp865 DOSNordic
 cp866 DOSCyrillicRussian
 cp866lr DOSCyrillicLatvian
 cp869 DOSGreek2
 cp874 DOSThai
 cp875 IBMGreek
 dk-us
 greek-ccitt iso-ir-150
 greek7 iso-ir-88
 greek7-old iso-ir-18
 hp-roman8 r8 roman8
 iso-ir-90
 koi8-r
 koi8-u
 latin-greek iso-ir-19
 latin-lap iso-ir-158 lap
 latin6 iso-ir-157 l6
 macintosh mac
 us-dk
 videotex-suppl iso-ir-70



COPYRIGHT

 © 1998-1999 Gisle Aas. All rights reserved.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
238