Summary : This document defines all hangul codes
supported by "HCODE ver 2.0". More details and discussion
of hangul codes beyond this document are included
in the hcode package version 2.1 from June-Yub Lee
(jylee@kitty.cims.nyu.edu, jylee@math1.kaist.ac.kr).
Thanks.                         Sep, 18, 1994
                                June-Yub Lee at Courant Institute


Contents
--------
I.   G0/G1 Switching Codes
II.  ASCII Code points for 7bit
III. Trigem Combination Code and 3byte modern hangul codes
IV.  KSC5601 2byte Precomposed Chars and 8byte Combination Code
V.   ISO-2022-KR and SDN Mailing Code
VI.  Code points of HAN3 code and 2-Set Keyboard Layout


========================
I. G0/G1 Switching Codes
========================

---------------------------------------------------
             |76543210|76543210|765432 10|765 43210
-------------|--------|--------|------ --|--- -----
ASCII        |        |        |         |0xx xxxxx
-------------|--------|--------|------ --|--- -----
Trigem       |        |        |1xxxxx xx|xxx xxxxx
-------------|--------|--------|------ --|--- -----
KSC5601      |        |        |1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
han3         |10001111|1yyyyyyy|1yyyyy yy|1yy yyyyy
  KSC5601*   |10001111|11111011|1yyyyy yy|1yy yyyyy
---------------------------------------------------
---------------------------------------------------
             |0   8   |16  24  |32   40  |48   56
-------------|--------|--------|------ --|--- -----
KSC5601-8byte|A4  D4  |A4  1YY |A4   1YY |A4   1YY
---------------------------------------------------

hcode Internal Code
---------------------------------------------------
             |76543210|76543210|765432 10|765 43210
-------------|--------|--------|------ --|--- -----
ASCII        |00000000|00000000|000000 00|0xx xxxxx
-------------|--------|--------|------ --|--- -----
Trigem       |00000000|00000000|1xxxxx xx|xxx xxxxx
-------------|--------|--------|------ --|--- -----
han3         |10001111|1yyyyyyy|1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
KSC5601      |10001111|11111011|1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
KSC5601-8byte|11011000|1yyyyyyy|1yyyyy yy|1yy yyyyy
-------------|--------|--------|------ --|--- -----
NOT A CHAR   |11111111|00000000|000000 00|000 00000
---------------------------------------------------
  x=0/1, yyyyyyy=YY=subset of 94chars


==============================
II. ASCII Code points for 7bit
==============================

-----------------------------------------------------------------
| 00 nul| 01 soh| 02 stx| 03 etx| 04 eot| 05 enq| 06 ack| 07 bel|
| 08 bs | 09 ht | 0a nl | 0b vt | 0c np | 0d cr | 0e so | 0f si |
| 10 dle| 11 dc1| 12 dc2| 13 dc3| 14 dc4| 15 nak| 16 syn| 17 etb|
| 18 can| 19 em | 1a sub| 1b esc| 1c fs | 1d gs | 1e rs | 1f us |
| 20 sp | 21 !  | 22 "  | 23 #  | 24 $  | 25 %  | 26 &  | 27 '  |
| 28 (  | 29 )  | 2a *  | 2b +  | 2c ,  | 2d -  | 2e .  | 2f /  |
| 30 0  | 31 1  | 32 2  | 33 3  | 34 4  | 35 5  | 36 6  | 37 7  |
| 38 8  | 39 9  | 3a :  | 3b ;  | 3c <  | 3d =  | 3e >  | 3f ?  |
| 40 @  | 41 A  | 42 B  | 43 C  | 44 D  | 45 E  | 46 F  | 47 G  |
| 48 H  | 49 I  | 4a J  | 4b K  | 4c L  | 4d M  | 4e N  | 4f O  |
| 50 P  | 51 Q  | 52 R  | 53 S  | 54 T  | 55 U  | 56 V  | 57 W  |
| 58 X  | 59 Y  | 5a Z  | 5b [  | 5c \  | 5d ]  | 5e ^  | 5f _  |
| 60 `  | 61 a  | 62 b  | 63 c  | 64 d  | 65 e  | 66 f  | 67 g  |
| 68 h  | 69 i  | 6a j  | 6b k  | 6c l  | 6d m  | 6e n  | 6f o  |
| 70 p  | 71 q  | 72 r  | 73 s  | 74 t  | 75 u  | 76 v  | 77 w  |
| 78 x  | 79 y  | 7a z  | 7b {  | 7c |  | 7d }  | 7e ~  | 7f del|
-----------------------------------------------------------------


==========================================================
III. Trigem Combination Code and 3byte modern hangul codes
==========================================================
The first byte has its MSB on, and the following 3*5bits=15bits
identify a syllable. The code points for each 5bit field are listed
below with the standard Romanization agreed on by North and South
Korea in 1992.
See the table h3Bcode.h for conversions between 3byte codes.
Note that the 3byte codes for modern hangul are converted
from/to this Trigem code.

    { "",    FILL,  "K",   "Kk",  "N",   "T",   "Tt",  "R",
      "M",   "P",   "Pp",  "S",   "Ss",  "",    "C",   "Cc",
      "Ch",  "Kh",  "Th",  "Ph",  "H",   "",    "",    "",
      "",    "",    "",    "",    "",    "",    "",    "" },
    { "",    "",    FILL,  "a",   "ae",  "ya",  "yae", "eo",
      "",    "",    "e",   "yeo", "ye",  "o",   "wa",  "wae",
      "",    "",    "oe",  "yo",  "u",   "weo", "we",  "wi",
      "",    "",    "yu",  "eu",  "yi",  "i",   "",    "" },
    { "",    FILL,  "k",   "kk",  "ks",  "n",   "nc",  "nh",
      "t",   "l",   "lk",  "lm",  "lp",  "ls",  "lth", "lph",
      "lh",  "m",   "",    "p",   "ps",  "s",   "ss",  "ng",
      "c",   "ch",  "kh",  "th",  "ph",  "h",   "",    "" }

Converting between 2-set codes (consonants and vowels) and
3-set codes (initial consonants, final consonants, and vowels
kept separate) is seriously troublesome, since you can't
distinguish an initial K from a final K if the syllable is not
a normal one. (If you can assume that every syllable has an
initial and a vowel, then it's okay.) To resolve this problem
you need at least one FILL code, either FILL-initial or FILL-vowel.

Traditionally, there is no clear definition for declaring final-only
syllables in N-byte code, keyboard input, or Romanization, since
all of them are based on the 2-set approach. So I make my
"own" definitions to guarantee that hcode is "round-trip compatible".
I introduce a NULL-VOWEL: "a" in the N-byte code (so the first
vowel, A, starts from "b"), "L" in keyboard simulation,
following han3term keyboard input, and "!" in Romanization.
This NULL-VOWEL is added whenever there is a final without a vowel.

However, there is another problem in the Romanization code:
there is no code point for "Cho-Seong I-Eung", so it's a good
idea to introduce a NULL-INITIAL (@).

Romanization code is case insensitive, and '-' separates
two syllables in a word.
However, any character that is neither
a vowel nor a consonant is regarded as a syllable separator,
which means you may use {Han-Keul} or {Han.Keul}.

==============================================================
IV. KSC5601 2byte Precomposed Chars and 8byte Combination Code
==============================================================
2350 Chars = 94 chars/page * 25 pages (B0-C8).
There are also 19 + 21 modern jamos in the A4 page.
See the table h2Bcode.h for the code point of each character.

If there is no precomposed character, then the 8byte sequence
    <FILL> <Initial Consonant> <Vowel> <FILL | Final Consonant>
can be used. 51 codes (A4A1-A4D3) are used for modern hangul
(19 initials + 21 vowels + 11 final-only consonants, that is,
27 finals minus the 16 consonants shared with the initials).
However, hcode can read all 94 jamos in the A4 page; for example,
"LK" is not an initial, so its Trigem codepoint is 0x8449.

Strictly speaking, hcode ver 2.1 does not conform to KSC5601-1989,
in the sense that it accepts <FILL><FILL|Init><FILL|Vowel><FILL|Final>.
It also generates <FILL><FILL><FILL><Final> in addition to
<FILL> <Initial Consonant> <Vowel> <FILL | Final Consonant>.
This is what guarantees round-trip compatibility of modern hangul
files through hcode. (However, "hcode -AB given | hcode -BA" might
generate a slightly different sequence if the given file doesn't
conform to the standard.)

The 42 ancient jamos A4D5-A4FE (34 consonants + 8 vowels) are used
only for transformation with han3 code.
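
The 8byte sequences described in this section can be sketched as below.
This is only an illustration under my own assumptions, not the code inside
hcode: the names make_8byte and is_strict are made up here, and each jamo
is passed as its second byte in the A4 page (0xA1-0xFE), with None meaning
FILL (A4D4) in that slot.

```python
FIRST = 0xA4   # lead byte of the KSC5601 jamo page
FILL = 0xD4    # second byte of the FILL code A4D4


def make_8byte(initial=None, vowel=None, final=None):
    """Build <FILL><Init><Vowel><Final>; None puts FILL in that slot.

    Hypothetical helper: arguments are second bytes of A4-page codes.
    """
    out = bytearray([FIRST, FILL])
    for second in (initial, vowel, final):
        if second is None:
            second = FILL
        if not 0xA1 <= second <= 0xFE:
            raise ValueError("not a code point in the A4 page: %#x" % second)
        out += bytes([FIRST, second])
    return bytes(out)


def is_strict(seq):
    """True for the form <FILL><non-FILL><non-FILL><anything> in page A4.

    This only checks that the initial and vowel slots are not FILL; it does
    not check that the initial slot really holds an initial consonant.
    """
    return (len(seq) == 8
            and seq[0::2] == bytes([FIRST] * 4)
            and seq[1] == FILL
            and seq[3] != FILL
            and seq[5] != FILL)
```

For example, make_8byte(0xA8, 0xC7, 0xB1) gives A4D4 A4A8 A4C7 A4B1, and
make_8byte(final=0xA1) gives the final-only form <FILL><FILL><FILL><Final>,
for which is_strict returns False.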

A4A1 - ㄱ, ㄲ, ㄱㅅ, ㄴ, ㄴㅈ, ㄴㅎ, ㄷ
A4A8 - ㄸ, ㄹ, ㄹㄱ, ㄹㅁ, ㄹㅂ, ㄹㅅ, ㄹㅌ, ㄹㅍ
A4B0 - ㄹㅎ, ㅁ, ㅂ, ㅃ, ㅂㅅ, ㅅ, ㅆ, ㅇ
A4B8 - ㅈ, ㅉ, ㅊ, ㅋ, ㅌ, ㅍ, ㅎ, ㅏ
A4C0 - ㅐ, ㅑ, ㅒ, ㅓ, ㅔ, ㅕ, ㅖ, ㅗ
A4C8 - ㅗㅏ, ㅗㅐ, ㅗㅣ, ㅛ, ㅜ, ㅜㅓ, ㅜㅔ, ㅜㅣ
A4D0 - ㅠ, ㅡ, ㅡㅣ, ㅣ, FILL, ㄴㄴ, ㄴㄷ, ㄴㅅ
A4D8 - ㄴA, ㄹㄱㅅ, ㄹㄷ, ㄹㅂㅅ, ㄹA, ㄹB, ㅁㅂ, ㅁㅅ
A4E0 - ㅁA, C, ㅂㄱ, ㅂㄷ, ㅂㅅㄱ, ㅂㅅㄷ, ㅂㅈ, ㅂㅌ
A4E8 - D, E, ㅅㄱ, ㅅㄴ, ㅅㄷ, ㅅㅂ, ㅅㅈ, A
A4F0 - ㅇㅇ, ㆁ, ㆁㅅ, ㆁA, F, ㅎㅎ, B, ㅛㅑ
A4F8 - ㅛㅒ, ㅛㅣ, ㅠㅕ, ㅠㅖ, ㅠㅣ, ., .ㅣ

 * A  /\    B  --   C  ___   D  |_|   E  |_| |_|   F  _____
     /__\      /\      | |      |_|      |_| |_|       | |
               \/      ---                            -----
                       /-\      /-\        /-\         /-\
                       \_/      \_/        \_/         \_/


====================================
V. ISO-2022-KR and SDN Mailing Code
====================================
The basic code points come from KSC-5601, and we use SO/SI (^N/^O)
to switch G0 from/to G1 instead of setting the MSB.

To specify the designation set, ISO-2022-KR, hcode puts
"ESC$)C\n" in front of the first hangul of the text.
This may differ from the original definition by Uhhyung Choi,
but it is a more efficient way to place the designation
sequence. For input, hcode has no problem reading hangul text
as long as the sequence comes before the hangul text.

SDN Mailing code is an extended version of ISO-2022-KR that
encodes MSB-on characters in the header fields according to
RFC-1342. The transformation turns 3 chars of 8 bits into
4 printable chars of 6 bits (64 values each). See the SDN
documents from uhhyung@daiduk.kaist.ac.kr about the coding.

hcode reads SDN input perfectly well, but I made hcode not
generate SDN code, since the header fields should be made by
the mailer. In fact, you can't put hangul headers in from your
text. (Well, it's possible, but it's cheating.) That's why SDN
is allowed only for input.

======================================================
VI. Code points of HAN3 code and 2-Set Keyboard Layout
======================================================

This code was developed by Hyeongkyu Chang (chk@ssp.etri.re.kr)
and JaeKyung Song (jksong@mani.kaist.ac.kr) for han3term.
It is upward compatible with KSC-5601-1987.
The code has a fixed size, so it is clear.
I think the code points given below and Section I of this document
are all you need to understand this code.

The 29 vowels match those in the A4 page of KSC5601-1989, and
the 83 consonants are a superset (with 19 additions) of the KSC chars.
However, note that the sorting order of HAN3 and
KSC 5601-1989 is different for ancient jamos.

Consonant codes (83) + Vowel codes (29)
--------------------------------------------------------------------------
       0/8   1/9   2/a   3/b   4/c   5/d   6/e   7/f
--------------------------------------------------------------------------
0xa0:        FILL, ��, ��, ����, ����, ������, ��,
0xa8: ����, ����, ����, ����, ��.4, ����, ����, ��,
0xb0: ����, ��, ��, ����, ������, ����, ����, ������,
0xb8: ����, ������, ��.6, ����, ��.4, ��.7, ����, ����,
0xc0: ����, ��, ����, ����, ����, ����, ��.4, .1,
0xc8: ��, ����, ����, ����, ��, ����, ������, ������,
0xd0: ����, ����, ����, .6, .0, ��, ����, ����,
0xd8: ����, ����, ����, ����, ������, ��, ����, ����,
0xe0: ����, ����, ����, ����, .4, ��, ����, ����,
0xe8: .7, .8, .8��, .8.4, ��, ��, ��, ��,
0xf0: ��, ��, .9, ��, ����
--------------------------------------------------------------------------
0xa0:        FILL, ㅏ, ㅐ, ㅑ, ㅒ, ㅓ, ㅔ,
0xa8: ㅕ, ㅖ, ㅗ, ㅗㅏ, ㅗㅐ, ㅗㅣ, ㅛ, ㅛㅑ,
0xb0: ㅛㅒ, ㅛㅣ, ㅜ, ㅜㅓ, ㅜㅔ, ㅜㅣ, ㅠ, ㅠㅕ,
0xb8: ㅠㅖ, ㅠㅣ, ㅡ, ㅡㅣ, ㅣ, .#, .#ㅣ
--------------------------------------------------------------------------


----------------------------------------
 Key layout of 3byte hangul (tentatively named han3)
----------------------------------------

+----+----+----+----+----+----+----+----+----+----+
|Q ㅃ|W ㅉ|E ㄸ|R ㄲ|T ㅆ|Y  Y|U  U|I  I|O ㅒ|P ㅖ|
|  ㅂ|  ㅈ|  ㄷ|  ㄱ|  ㅅ|  ㅛ|  ㅕ|  ㅑ|  ㅐ|  ㅔ|
+-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+
  |A .1|S .2|D .3|F .4|G .5|H  H|J  J|K .#|L .$|
  |  ㅁ|  ㄴ|  ㅇ|  ㄹ|  ㅎ|  ㅗ|  ㅓ|  ㅏ|  ㅣ|
  +-+--+-+--+-+--+-+--+-+--+-+--+-+--+-+--+----+
    |Z .6|X .7|C .8|V .9|B .0|N  N|M  M|
    |  ㅋ|  ㅌ|  ㅊ|  ㅍ|  ㅠ|  ㅜ|  ㅡ|
    +----+----+----+----+----+----+----+

.1:  ____     .2: ����     .3: ����     .4:           .5: ����
    |    |                                  /\
    |____|                                 /  \
      __                                  /____\
     /  \
     \__/

.6:           .7:           .8:           .9:           .0:
    |   |        -----          |          ------       | | | |
    |___|         ___         __|__         |  |        |_| |_|
    |   |        /   \       /     \       ------       | | | |
    |___|        \___/       \_____/        ____        |_| |_|
     ___                                   /    \         ___
    /   \                                  \____/        /   \
    \___/                                                \___/

.#: . (arae-a)
.$: NULL vowel (used for syllables written with only a final consonant)

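
As noted at the top of this section, the han3 code is best understood
together with the bit layouts of Section I. A minimal sketch of those
layouts follows. It is my own illustration, not hcode source: the names
classify and trigem_fields are hypothetical, the "unknown" result is an
assumption for bit patterns the table does not list, and the internal
code is taken to be a 32bit value with the leftmost byte of the table
as the most significant byte.

```python
def classify(code):
    """Name the code family of a 32bit hcode internal code (Section I)."""
    b0, b1, b2, b3 = code.to_bytes(4, "big")
    if code == 0xFF000000:
        return "not-a-char"
    if b0 == 0 and b1 == 0:
        if b2 == 0 and b3 < 0x80:
            return "ascii"            # 0xxxxxxx in the last byte
        if b2 & 0x80:
            return "trigem"           # 1 + 3 * 5 bits in the low 16 bits
    rest_high = all(b & 0x80 for b in (b1, b2, b3))
    if b0 == 0x8F and rest_high:
        # 8F FB xx xx is reserved for KSC5601; other 8F codes are han3.
        return "ksc5601" if b1 == 0xFB else "han3"
    if b0 == 0xD8 and rest_high:
        return "ksc5601-8byte"
    return "unknown"                  # assumption: not listed in the table


def trigem_fields(code):
    """Split a 16bit Trigem code into (initial, vowel, final) 5bit fields."""
    return (code >> 10) & 0x1F, (code >> 5) & 0x1F, code & 0x1F
```

For example, trigem_fields(0x8861) gives (2, 3, 1), which the tables of
Section III read as "K", "a", and FILL.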