Summary: This document defines all hangul codes supported by
"HCODE ver 2.0". Further details and discussion of hangul codes,
beyond this document, are included in the hcode package
version 2.1 from June-Yub Lee
(jylee@kitty.cims.nyu.edu, jylee@math1.kaist.ac.kr).
Thanks.  Sep. 18, 1994
	June-Yub Lee at Courant Institute


Contents
--------
I. G0/G1 Switching Codes
II. ASCII Code points for 7bit
III. Trigem Combination Code and 3byte modern hangul codes
IV. KSC5601 2byte Precomposed Chars and 8byte Combination Code
V. ISO-2022-KR and SDN Mailing Code
VI. Code points of HAN3 code and 2-Set Keyboard Layout


========================
I. G0/G1 Switching Codes
========================

---------------------------------------------------
             |76543210|76543210|765432 10|765 43210
-------------|--------|--------|------ --|--- -----
ASCII        |        |        |         |0xx xxxxx
-------------|--------|--------|------ --|--- -----
Trigem       |        |        |1xxxxx xx|xxx xxxxx
-------------|--------|--------|------ --|--- -----
KSC5601      |        |        |1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
han3         |10001111|1yyyyyyy|1yyyyy yy|1yy yyyyy
   KSC5601*  |10001111|11111011|1yyyyy yy|1yy yyyyy
---------------------------------------------------
---------------------------------------------------
             |0   8   |16  24  |32  40   |48  56
-------------|--------|--------|------ --|--- -----
KSC5601-8byte|A4  D4  |A4  1YY |A4  1YY  |A4  1YY
---------------------------------------------------

hcode Internal Code
---------------------------------------------------
             |76543210|76543210|765432 10|765 43210
-------------|--------|--------|------ --|--- -----
ASCII        |00000000|00000000|000000 00|0xx xxxxx
-------------|--------|--------|------ --|--- -----
Trigem       |00000000|00000000|1xxxxx xx|xxx xxxxx
-------------|--------|--------|------ --|--- -----
han3         |10001111|1yyyyyyy|1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
KSC5601      |10001111|11111011|1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
KSC5601-8byte|11011000|1yyyyyyy|1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
NOT A CHAR   |11111111|00000000|000000 00|000 00000
---------------------------------------------------
   		x=0/1, yyyyyyy=YY=subset of 94 chars
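As a rough illustration of the internal code table above, the following
C sketch classifies a 32-bit internal code by its leading bytes. The names
code_kind and classify_internal are hypothetical and are not taken from the
hcode sources; only the bit patterns come from the table. KSC5601 is tested
before han3 because its marker byte 11111011 (0xFB) also fits the han3
pattern 1yyyyyyy.

```c
#include <stdint.h>

/* Hypothetical names; the layout follows the "hcode Internal Code" table. */
typedef enum {
    CODE_ASCII, CODE_TRIGEM, CODE_HAN3, CODE_KSC5601,
    CODE_KSC5601_8BYTE, CODE_NOT_A_CHAR, CODE_UNKNOWN
} code_kind;

static code_kind classify_internal(uint32_t c)
{
    uint32_t b3 = c >> 24;            /* leftmost byte in the table */
    uint32_t b2 = (c >> 16) & 0xFFu;  /* second byte from the left */

    if (c == 0xFF000000u) return CODE_NOT_A_CHAR;
    if ((c & 0xFFFFFF80u) == 0) return CODE_ASCII;        /* 0xxxxxxx */
    if ((c & 0xFFFF0000u) == 0 && (c & 0x8000u)) return CODE_TRIGEM;
    if (b3 == 0x8Fu && b2 == 0xFBu) return CODE_KSC5601;  /* before han3 */
    if (b3 == 0x8Fu && (b2 & 0x80u)) return CODE_HAN3;
    if (b3 == 0xD8u && (b2 & 0x80u)) return CODE_KSC5601_8BYTE;
    return CODE_UNKNOWN;              /* not described by the table */
}
```

For example, 0x0000D065 falls in the Trigem row and 0x8FFBC7D1 in the
KSC5601 row.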


==============================
II. ASCII Code points for 7bit
==============================

-----------------------------------------------------------------
| 00 nul| 01 soh| 02 stx| 03 etx| 04 eot| 05 enq| 06 ack| 07 bel|
| 08 bs | 09 ht | 0a nl | 0b vt | 0c np | 0d cr | 0e so | 0f si |
| 10 dle| 11 dc1| 12 dc2| 13 dc3| 14 dc4| 15 nak| 16 syn| 17 etb|
| 18 can| 19 em | 1a sub| 1b esc| 1c fs | 1d gs | 1e rs | 1f us |
| 20 sp | 21  ! | 22  " | 23  # | 24  $ | 25  % | 26  & | 27  ' |
| 28  ( | 29  ) | 2a  * | 2b  + | 2c  , | 2d  - | 2e  . | 2f  / |
| 30  0 | 31  1 | 32  2 | 33  3 | 34  4 | 35  5 | 36  6 | 37  7 |
| 38  8 | 39  9 | 3a  : | 3b  ; | 3c  < | 3d  = | 3e  > | 3f  ? |
| 40  @ | 41  A | 42  B | 43  C | 44  D | 45  E | 46  F | 47  G |
| 48  H | 49  I | 4a  J | 4b  K | 4c  L | 4d  M | 4e  N | 4f  O |
| 50  P | 51  Q | 52  R | 53  S | 54  T | 55  U | 56  V | 57  W |
| 58  X | 59  Y | 5a  Z | 5b  [ | 5c  \ | 5d  ] | 5e  ^ | 5f  _ |
| 60  ` | 61  a | 62  b | 63  c | 64  d | 65  e | 66  f | 67  g |
| 68  h | 69  i | 6a  j | 6b  k | 6c  l | 6d  m | 6e  n | 6f  o |
| 70  p | 71  q | 72  r | 73  s | 74  t | 75  u | 76  v | 77  w |
| 78  x | 79  y | 7a  z | 7b  { | 7c  | | 7d  } | 7e  ~ | 7f del|
-----------------------------------------------------------------


==========================================================
III. Trigem Combination Code and 3byte modern hangul codes
==========================================================
The MSB of the first byte is ON, and the following 3*5 bits = 15 bits
identify a syllable. The tables below give the code points of each
5-bit field (initial, vowel, final), using the standard Romanization
code agreed upon by North and South Korea in 1992.
See the table h3Bcode.h for conversions between 3byte codes.
Note that the 3byte codes for modern hangul are converted
from/to this Trigem code.

  {   "",   FILL, "K",   "Kk",  "N",   "T",   "Tt",  "R",
      "M",  "P",  "Pp",  "S",   "Ss",  "",    "C",   "Cc",
      "Ch", "Kh", "Th",  "Ph",  "H",   "",    "",    "",
      "",   "",   "",    "",    "",    "",    "",    ""    },
  {   "",   "",   FILL,  "a",   "ae",  "ya",  "yae", "eo",
      "",   "",   "e",   "yeo", "ye",  "o",   "wa",  "wae",
      "",   "",   "oe",  "yo",  "u",   "weo", "we",  "wi",
      "",   "",   "yu",  "eu",  "yi",  "i",   "",    ""    },
  {   "",   FILL, "k",   "kk",  "ks",  "n",   "nc",  "nh",
      "t",  "l",  "lk",  "lm",  "lp",  "ls",  "lth", "lph",
      "lh", "m",  "",    "p",   "ps",  "s",   "ss",  "ng",
      "c",  "ch", "kh",  "th",  "ph",  "h",   "",    ""    }
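
The bit layout can be sketched in C as follows. This is only an
illustration, under the assumption that the three 5-bit fields are stored
in the order initial, vowel, final (the order of the three tables above).
The function and constant names are made up for this example and do not
come from h3Bcode.h.

```c
#include <stdint.h>

/* Positions of FILL in the initial, vowel and final tables above. */
enum { FILL_INIT = 1, FILL_VOWEL = 2, FILL_FINAL = 1 };

/* Pack three 5-bit table indices into a 16-bit code with the MSB set. */
static uint16_t trigem_pack(unsigned init, unsigned vowel, unsigned fin)
{
    return (uint16_t)(0x8000u | ((init & 31u) << 10)
                              | ((vowel & 31u) << 5)
                              | (fin & 31u));
}

/* Split a 16-bit code back into its three 5-bit table indices. */
static void trigem_unpack(uint16_t code, unsigned *init,
                          unsigned *vowel, unsigned *fin)
{
    *init  = (code >> 10) & 31u;
    *vowel = (code >> 5) & 31u;
    *fin   = code & 31u;
}
```

With the table indices H=20, a=3, n=5, trigem_pack(20, 3, 5) gives 0xD065
for the syllable "Han". A lone initial K (2, FILL_VOWEL, FILL_FINAL) gives
0x8841, while a lone final k (FILL_INIT, FILL_VOWEL, 2) gives 0x8442.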

Converting between 2-set codes (consonants and vowels) and 3-set
codes (initial consonants, vowels, and final consonants, kept
separate) is seriously troublesome, since you cannot distinguish
an initial K from a final K when the syllable is incomplete.
(If you assume that every syllable has an initial and a vowel,
there is no problem.) To resolve this you need at least one
FILL code, either FILL-Initial or FILL-Vowel.

Traditionally, there has been no clear way to declare final-only
syllables in N-byte codes, keyboard input, or Romanization, since
all of them are based on the 2-set approach. I therefore make my
"own" definitions to guarantee that hcode is "round-trip compatible".
I am introducing a NULL-VOWEL: "a" in the N-byte code (the first
vowel (A) starts from "b"), "L" in the keyboard simulation,
following han3term keyboard input, and "!" for Romanization.
This NULL-VOWEL is added when a final appears without a vowel.

However, the Romanization code has another problem: there is
no code point for "Cho-Seong I-Eung", so it is a good idea to
introduce a NULL-INITIAL (@).

The Romanization code is case insensitive, and '-' separates
two syllables in a word. In fact, any character that is neither
a vowel nor a consonant is regarded as a syllable separator,
which means you may write either {Han-Keul} or {Han.Keul}.
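
A minimal sketch of the separator rule, assuming that the romanization
characters are exactly the letters that occur in the tables above plus the
'!' and '@' markers. That assumption and the function names are mine, and
the sketch only splits at separators; it does not parse the letters inside
a syllable.

```c
#include <ctype.h>
#include <string.h>

/* Assumed character set: letters used in the Section III tables,
   plus '!' (NULL-VOWEL) and '@' (NULL-INITIAL). */
static int is_roman_char(int ch)
{
    ch = tolower((unsigned char)ch);
    return ch != '\0' && strchr("aceghiklmnoprstuwy!@", ch) != NULL;
}

/* Split s at separator characters. Lower-cased pieces are stored in
   out (at most 15 characters each); returns the number of pieces. */
static int split_syllables(const char *s, char out[][16], int max)
{
    int n = 0;
    while (*s && n < max) {
        size_t len = 0;
        while (*s && !is_roman_char(*s))
            s++;                               /* skip separators */
        while (*s && is_roman_char(*s)) {
            if (len < 15)
                out[n][len++] = (char)tolower((unsigned char)*s);
            s++;
        }
        if (len > 0) {
            out[n][len] = '\0';
            n++;
        }
    }
    return n;
}
```

Both "Han-Keul" and "Han.Keul" come out as the two pieces "han" and "keul".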

==============================================================
IV. KSC5601 2byte Precomposed Chars and 8byte Combination Code
==============================================================
2350 Chars = 94 chars/page * 25 pages (B0-C8)
There are also 19 + 21 Modern Jamos in the A4 page.
See the table h2Bcode.h for the code point of each character.
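
The page arithmetic can be written out as below. This is a hypothetical
helper, not the lookup in h2Bcode.h; it assumes the 25 hangul pages are
the first bytes B0 through C8 and that each page holds the 94 second
bytes A1 through FE.

```c
/* Returns 0..2349 for a precomposed hangul code, or -1 if the two
   bytes are outside the hangul pages. Hypothetical helper. */
static int ksc_hangul_index(unsigned hi, unsigned lo)
{
    if (hi < 0xB0 || hi > 0xC8 || lo < 0xA1 || lo > 0xFE)
        return -1;
    return (int)((hi - 0xB0) * 94 + (lo - 0xA1));
}
```

The first code B0A1 maps to index 0 and the last code C8FE to index 2349.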

If there is no precomposed character, then the 8byte sequence
<FILL> <Initial Consonant> <Vowel> <FILL| Final Consonant>
can be used. 51 codes (A4A1-A4D3) are used for modern hangul
(Initial 19 + Vowel 21 + Final-only 11 (27 finals - 16 consonants
shared with the initials)).
However, hcode can read all 94 jamos in the A4 page. For example,
"LK" is not an initial, so its Trigem code point is 0x844A
(FILL initial, FILL vowel, final "lk" = index 10 in Section III).

Strictly speaking, hcode ver 2.1 does not conform to KSC5601-1989,
in the sense that it accepts <FILL><FILL|Init><FILL|Vowel><FILL|Final>.
It also generates <FILL><FILL><FILL><Final> in addition to
<FILL> <Initial Consonant> <Vowel> <FILL| Final Consonant>.
This guarantees round-trip compatibility of modern hangul files
processed by hcode. (However, "hcode -AB given | hcode -BA" might
generate a slightly different sequence if the given file does not
conform to the standard.)
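
The 8byte form can be assembled as in the sketch below. The function name
is invented for this example. The caller passes second bytes of the A4
page (for instance BE, BF, A4 for the syllable "Han"), and a zero argument
stands for a missing part, which is an assumption of this sketch and not
an hcode convention.

```c
#include <stddef.h>

enum { A4_PAGE = 0xA4, A4_FILL = 0xD4 };  /* FILL is the code point A4D4 */

/* Write <FILL> <init> <vowel> <final> as four 2-byte A4 codes.
   A zero argument is replaced by FILL (assumption of this sketch). */
static void ksc_8byte(unsigned init, unsigned vowel, unsigned fin,
                      unsigned char out[8])
{
    unsigned parts[4];
    size_t i;

    parts[0] = A4_FILL;
    parts[1] = init ? init : A4_FILL;
    parts[2] = vowel ? vowel : A4_FILL;
    parts[3] = fin ? fin : A4_FILL;
    for (i = 0; i < 4; i++) {
        out[2 * i] = A4_PAGE;
        out[2 * i + 1] = (unsigned char)parts[i];
    }
}
```

ksc_8byte(0xBE, 0xBF, 0xA4, buf) yields A4D4 A4BE A4BF A4A4, and
ksc_8byte(0, 0, 0xA4, buf) yields the final-only form
<FILL><FILL><FILL><Final>, A4D4 A4D4 A4D4 A4A4.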

The 42 ancient jamos A4D5-A4FE (34 consonants + 8 vowels) are used
only for conversion to and from han3 code.

A4A1 -     ㄱ, ㄲ, ㄱㅅ, ㄴ, ㄴㅈ, ㄴㅎ, ㄷ
A4A8 - ㄸ, ㄹ, ㄹㄱ, ㄹㅁ, ㄹㅂ, ㄹㅅ, ㄹㅌ, ㄹㅍ
A4B0 - ㄹㅎ, ㅁ, ㅂ, ㅃ, ㅂㅅ, ㅅ, ㅆ, ㅇ
A4B8 - ㅈ, ㅉ, ㅊ, ㅋ, ㅌ, ㅍ, ㅎ, ㅏ
A4C0 - ㅐ, ㅑ, ㅒ, ㅓ, ㅔ, ㅕ, ㅖ, ㅗ
A4C8 - ㅗㅏ, ㅗㅐ, ㅗㅣ, ㅛ, ㅜ, ㅜㅓ, ㅜㅔ, ㅜㅣ
A4D0 - ㅠ, ㅡ, ㅡㅣ, ㅣ, FILL, ㄴㄴ, ㄴㄷ, ㄴㅅ
A4D8 - ㄴA, ㄹㄱㅅ, ㄹㄷ, ㄹㅂㅅ, ㄹA, ㄹB, ㅁㅂ, ㅁㅅ
A4E0 - ㅁA, C, ㅂㄱ, ㅂㄷ, ㅂㅅㄱ, ㅂㅅㄷ, ㅂㅈ, ㅂㅌ
A4E8 - D, E, ㅅㄱ, ㅅㄴ, ㅅㄷ, ㅅㅂ, ㅅㅈ, A
A4F0 - ㅇㅇ, ㆁ, ㆁㅅ, ㆁA, F, ㅎㅎ, B, ㅛㅑ
A4F8 - ㅛㅒ, ㅛㅣ, ㅠㅕ, ㅠㅖ, ㅠㅣ, ., .ㅣ

        * A /\   B --  C ___  D |_|  E |_| |_|  F _____
           /__\    /\    | |    |_|    |_| |_|     | |
                   \/    ---                      -----
                         /-\    /-\      /-\       /-\
                         \_/    \_/      \_/       \_/
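
The A4 page ranges mentioned in this section can be summarized in a small
C sketch. The enum and function names are hypothetical, and the split of
the ancient jamos into 34 consonants (D5-F6) followed by 8 vowels (F7-FE)
is my reading of the table above.

```c
typedef enum {
    JAMO_NONE,               /* not a code point of the A4 page */
    JAMO_MODERN,             /* A4A1-A4D3: 51 modern jamos */
    JAMO_FILL,               /* A4D4 */
    JAMO_ANCIENT_CONSONANT,  /* A4D5-A4F6: 34 codes */
    JAMO_ANCIENT_VOWEL       /* A4F7-A4FE: 8 codes */
} jamo_class;

/* Classify the second byte of a 2-byte code whose first byte is A4. */
static jamo_class classify_a4(unsigned lo)
{
    if (lo >= 0xA1 && lo <= 0xD3) return JAMO_MODERN;
    if (lo == 0xD4) return JAMO_FILL;
    if (lo >= 0xD5 && lo <= 0xF6) return JAMO_ANCIENT_CONSONANT;
    if (lo >= 0xF7 && lo <= 0xFE) return JAMO_ANCIENT_VOWEL;
    return JAMO_NONE;
}
```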


===================================
V. ISO-2022-KR and SDN Mailing Code
===================================
The basic code points come from KSC-5601, and SO/SI (^N/^O) are
used to switch between G0 and G1 instead of setting the MSB.

To designate the character set for ISO-2022-KR, hcode puts
"ESC$)C\n" in front of the first Hangul of the text.
This may differ from the original definition by Uhhyung Choi,
but it is a more efficient way to place the designation
sequence. On input, hcode has no problem reading hangul text
as long as the sequence appears before the hangul text.
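
Reading such text can be sketched as below. This is not the hcode
implementation: the function name is made up, the designation sequence is
simply skipped wherever it occurs, and the newline that follows it is
passed through unchanged, since this document does not say how hcode
treats it.

```c
#include <stddef.h>
#include <string.h>

/* Convert 7-bit ISO-2022-KR text to 8-bit KSC5601 bytes.
   out must have room for n bytes; returns the number written. */
static size_t iso2022kr_to_8bit(const unsigned char *in, size_t n,
                                unsigned char *out)
{
    size_t i = 0, o = 0;
    int g1 = 0;                      /* nonzero after SO, until SI */

    while (i < n) {
        unsigned char c = in[i];
        if (c == 0x1B && n - i >= 4 && memcmp(in + i, "\x1B$)C", 4) == 0) {
            i += 4;                  /* designation sequence: no output */
        } else if (c == 0x0E) {      /* SO (^N): switch to KSC5601 */
            g1 = 1;
            i++;
        } else if (c == 0x0F) {      /* SI (^O): back to ASCII */
            g1 = 0;
            i++;
        } else {
            /* Between SO and SI, set the MSB of printable bytes. */
            if (g1 && c >= 0x21 && c <= 0x7E)
                c = (unsigned char)(c | 0x80);
            out[o++] = c;
            i++;
        }
    }
    return o;
}
```

For example, the bytes "GQ" between SO and SI become C7 D1, the KSC5601
code of the syllable "Han".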

SDN Mailing code is an extended version of ISO-2022-KR that
encodes MSB-ON characters in the header fields according to
RFC-1342. The transformation converts 3 chars of 8 bits each
into 4 printable chars of 6 bits (= 64 values) each. See the
SDN documents from uhhyung@daiduk.kaist.ac.kr about the coding.
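
The 3-to-4 transformation is sketched below, assuming the standard Base64
alphabet (A-Z, a-z, 0-9, '+', '/'). The alphabet and the function name are
assumptions of this example, so check the SDN documents for the exact
coding.

```c
/* Assumed alphabet: standard Base64. */
static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Turn 3 bytes of 8 bits into 4 printable characters of 6 bits each.
   out receives the 4 characters and a terminating NUL. */
static void encode3(const unsigned char in[3], char out[5])
{
    unsigned long v = ((unsigned long)in[0] << 16)
                    | ((unsigned long)in[1] << 8)
                    | (unsigned long)in[2];

    out[0] = B64[(v >> 18) & 63];
    out[1] = B64[(v >> 12) & 63];
    out[2] = B64[(v >> 6) & 63];
    out[3] = B64[v & 63];
    out[4] = '\0';
}
```

Input whose length is not a multiple of 3 needs padding, which this sketch
leaves out.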

hcode reads SDN input without problems, but I made hcode not
generate SDN code, since the header fields should be made by
the mailer. In fact, you can't put hangul headers in from your
text. (Well, it's possible, but it would be cheating.) That's
why SDN is allowed only for input.

======================================================
VI. Code points of HAN3 code and 2-Set Keyboard Layout
======================================================

This code was developed by Hyeongkyu Chang (chk@ssp.etri.re.kr)
and JaeKyung Song (jksong@mani.kaist.ac.kr) for han3term.
It is upward compatible with KSC-5601-1987.
The code has a fixed size, so it is clear.
I think the code points given below and Section I of this
document are all you need to understand this code.

The 29 vowels match those in the A4 page of KSC5601-1989, and
the 83 consonants are a superset (19 additional) of the KSC chars.
Note, however, that the sorting order of ancient jamos differs
between HAN3 and KSC 5601-1989.

Consonant codes (83) + vowel codes (29)
--------------------------------------------------------------------------
        0/8     1/9     2/a     3/b     4/c     5/d     6/e     7/f
--------------------------------------------------------------------------
0xa0:           �,   ��,     ��,     ����,   ����,   ������, ��,
0xa8:   ����,   ����,   ����,   ����,   ��.4,   ����,   ����,   ��,
0xb0:   ����,   ��,     ��,     ����,   ������, ����,   ����,   ������,
0xb8:   ����,   ������, ��.6,   ����,   ��.4,   ��.7,   ����,   ����,
0xc0:   ����,   ��,     ����,   ����,   ����,   ����,   ��.4,   .1,
0xc8:   ��,     ����,   ����,   ����,   ��,     ����,   ������, ������,
0xd0:   ����,   ����,   ����,   .6,     .0,     ��,     ����,   ����,
0xd8:   ����,   ����,   ����,   ����,   ������, ��,     ����,   ����,
0xe0:   ����,   ����,   ����,   ����,   .4,     ��,     ����,   ����,
0xe8:   .7,     .8,     .8��,   .8.4,   ��,     ��,     ��,     ��,
0xf0:   ��,     ��,     .9,     ��,     ����
--------------------------------------------------------------------------
0xa0:           �,      ㅏ,     ㅐ,     ㅑ,     ㅒ,     ㅓ,     ㅔ,
0xa8:   ㅕ,     ㅖ,     ㅗ,     ㅗㅏ,   ㅗㅐ,   ㅗㅣ,   ㅛ,     ㅛㅑ,
0xb0:   ㅛㅒ,   ㅛㅣ,   ㅜ,     ㅜㅓ,   ㅜㅔ,   ㅜㅣ,   ㅠ,     ㅠㅕ,
0xb8:   ㅠㅖ,   ㅠㅣ,   ㅡ,     ㅡㅣ,   ㅣ,     .#,     .#ㅣ
--------------------------------------------------------------------------


------------------------------------------------
  Key layout for 3-byte Hangul (han3 for short)
------------------------------------------------

+----+----+----+----+----+----+----+----+----+----+
|Q ㅃ|W ㅉ|E ㄸ|R ㄲ|T ㅆ|Y  Y|U  U|I  I|O ㅒ|P ㅖ|
|  ㅂ|  ㅈ|  ㄷ|  ㄱ|  ㅅ|  ㅛ|  ㅕ|  ㅑ|  ㅐ|  ㅔ|
+-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+
  |A .1|S .2|D .3|F .4|G .5|H  H|J  J|K .#|L .$|
  |  ㅁ|  ㄴ|  ㅇ|  ㄹ|  ㅎ|  ㅗ|  ㅓ|  ㅏ|  ㅣ|
  +-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+----+
    |Z .6|X .7|C .8|V .9|B .0|N  N|M  M|
    |  ㅋ|  ㅌ|  ㅊ|  ㅍ|  ㅠ|  ㅜ|  ㅡ|
    +----+----+----+----+----+----+----+

.1: ____        .2: ㄴㄴ        .3: ㅇㅇ        .4:             .5: ㅎㅎ
    |  |                                              /\
    |__|                                             /  \
     __                                             /____\
    /  \
    \__/

.6:             .7:             .8:             .9:             .0:
    |   |           -----             |             ------         | | | |
    |___|            ___            __|__            |  |          |_| |_|
    |   |           /   \          /     \          ------         | | | |
    |___|           \___/          \_____/           ____          |_| |_|
     ___                                            /    \           ___
    /   \                                           \____/          /   \
    \___/                                                           \___/

.#: . (arae-a)
.$: NULL vowel (used to write a final consonant by itself)