.\" Copyright (c) 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Paul Borman at Krystal Technologies.
.\"
.\" %sccs.include.redist.roff%
.\"
.\" @(#)utf2.4 8.1 (Berkeley) 06/04/93
.\"
.Dd ""
.Dt UTF2 4
.Os
.Sh NAME
.Nm UTF2
.Nd Universal character set Transformation Format encoding of runes
.Sh SYNOPSIS
\fBENCODING "UTF2"\fP
.Sh DESCRIPTION
The
.Nm UTF2
encoding is based on a proposed X-Open multibyte
\s-1FSS-UCS-TF\s+1 (File System Safe Universal Character Set Transformation Format) encoding as used in
.Nm Plan 9 from Bell Labs.
Although it is capable of representing more than 16 bits,
the current implementation is limited to 16 bits as defined by the
Unicode Standard.
.Pp
.Nm UTF2
representation is backwards compatible with ASCII, so 0x00-0x7f refer to the
ASCII character set. The multibyte encoding of runes between 0x0080 and 0xffff
consists entirely of bytes whose high order bit is set. The actual
encoding is represented by the following table:
.Bd -literal
[0x0000 - 0x007f] [00000000.0bbbbbbb] -> 0bbbbbbb
[0x0080 - 0x07ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
[0x0800 - 0xffff] [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb
.Ed
.sp
If more than a single representation of a value exists (for example,
0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always
used (but the longer ones will be correctly decoded).
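.Pp
As an illustration only, the table and the shortest-representation rule
can be sketched in C roughly as follows.
The names utf2_encode and utf2_decode are hypothetical and are not part
of the library; the sketch assumes a 16 bit rune held in an unsigned int.
Because the two-byte form carries 11 payload bits, the sketch switches
to the three-byte form at 0x0800.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the library routine: encode a rune into
 * buf (room for at least 3 bytes) using the shortest form.
 * Returns the number of bytes written.
 */
static size_t
utf2_encode(unsigned int rune, unsigned char *buf)
{
	assert(buf != NULL);
	rune &= 0xffff;			/* assumption: 16 bit runes only */
	if (rune < 0x80) {		/* 0bbbbbbb */
		buf[0] = (unsigned char)rune;
		return 1;
	}
	if (rune < 0x800) {		/* 110bbbbb, 10bbbbbb */
		buf[0] = (unsigned char)(0xc0 | (rune >> 6));
		buf[1] = (unsigned char)(0x80 | (rune & 0x3f));
		return 2;
	}
	/* 1110bbbb, 10bbbbbb, 10bbbbbb */
	buf[0] = (unsigned char)(0xe0 | (rune >> 12));
	buf[1] = (unsigned char)(0x80 | ((rune >> 6) & 0x3f));
	buf[2] = (unsigned char)(0x80 | (rune & 0x3f));
	return 3;
}

/*
 * Hypothetical helper: decode one rune from the n bytes at s.
 * Longer-than-necessary forms such as 0xC0 0x80 are accepted.
 * Returns the bytes consumed, or 0 if the input is invalid or short.
 */
static size_t
utf2_decode(const unsigned char *s, size_t n, unsigned int *rune)
{
	size_t i, len;
	unsigned int r;

	if (n == 0)
		return 0;
	if (s[0] < 0x80) {			/* single byte */
		len = 1;
		r = s[0];
	} else if ((s[0] & 0xe0) == 0xc0) {	/* two byte lead */
		len = 2;
		r = s[0] & 0x1f;
	} else if ((s[0] & 0xf0) == 0xe0) {	/* three byte lead */
		len = 3;
		r = s[0] & 0x0f;
	} else
		return 0;			/* unsupported lead byte */
	if (n < len)
		return 0;
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xc0) != 0x80)	/* must be 10bbbbbb */
			return 0;
		r = (r << 6) | (s[i] & 0x3f);
	}
	*rune = r;
	return len;
}
```
.Ed
.Pp
In this sketch the encoder never emits an overlong sequence, while the
decoder does not reject one, which mirrors the rule stated above.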
.Pp
The final three encodings provided by X-Open:
.Bd -literal
[00000000.000bbbbb.bbbbbbbb.bbbbbbbb] ->
	11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

[000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
	111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

[0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
	1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
.Ed
.sp
which provide for the entire proposed ISO-10646 31 bit standard, are currently
not implemented.
.Sh "SEE ALSO"
.Xr mklocale 1 ,
.Xr setlocale 3