Dear translators,

Clarification of certain strings contained herein.

Noah


Date: Mon, 28 Jul 2003 11:38:20 -0400
From: Noah Levitt
To: Telsa Gwynne
Subject: Re: How to translate some of the gucharmap strings?

Hey Telsa,

These are good questions. I'm surprised other translators
haven't brought these up before. The short answer is, most
of these terms are technical Unicode terms. I'll try to put
some comments in the source files based on these notes.

On Mon, Jul 28, 2003 at 9:55:48 +0100, Telsa Gwynne wrote:
>
> gucharmap has a collection of strings which I suspect may cause
> trouble. There are no notes for translators in the po file (which
> some people do). Can I glean some explanations from you, before we
> do Terrible Things?
[...]
>
> These are the strings I don't understand.
>
> #: gucharmap/gucharmap-charmap.c:397
> msgid "Canonical decomposition:"
> msgstr ""
>
> (I have visions of this coming out as "Biblical falling-apart" or something
> at the moment: is this "taking apart" sort of decomposition rather than
> "it's falling apart by itself"?)

It's more like "taking apart". There are characters that can
be split up into base character + accent pairs, or sometimes
even further. For example, č = c + ̌ , that is,
LATIN SMALL LETTER C WITH CARON (U+010D) =
LATIN SMALL LETTER C (U+0063) + COMBINING CARON (U+030C)
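
To see the "taking apart" concretely, here is a minimal Python
sketch. It is only an illustration, not gucharmap code (gucharmap
itself is written in C); it assumes Python 3 and uses the standard
unicodedata module to decompose the example character.

```python
import unicodedata

# U+010D, LATIN SMALL LETTER C WITH CARON
composed = "\u010d"

# NFD applies canonical decomposition: base letter, then combining marks.
decomposed = unicodedata.normalize("NFD", composed)

# Official Unicode names of the pieces.
names = [unicodedata.name(ch) for ch in decomposed]

# The raw decomposition mapping from the Unicode character database.
mapping = unicodedata.decomposition(composed)

print(names)    # names of the base letter and the combining mark
print(mapping)  # space-separated hexadecimal code points
```

Normalizing back with "NFC" puts the pieces together again, which
is why "taking apart" is the better mental picture.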

>
> #: gucharmap/gucharmap-unicode-info.c:157
> msgid "<Non Private Use High Surrogate>"
> msgstr ""
>
> #: gucharmap/gucharmap-unicode-info.c:159
> msgid "<Private Use High Surrogate>"
> msgstr ""
>
> #: gucharmap/gucharmap-unicode-info.c:161
> msgid "<Low Surrogate, Last>"
> msgstr ""
>
> For the above three, is surrogate something we must translate
> exactly, or can we use something that means "something in its
> place" (if we can find a way to say that :))

It's rather unfortunate that these terms exist, let alone
have to be in gucharmap. I don't know how they should be
translated, so I'll tell you what they mean.

 Unicode originally implied that the encoding was UCS-2
 and it initially didn't make any provisions for characters
 outside the BMP (U+0000 to U+FFFF). When it became clear
 that more than 64k characters would be needed for certain
 special applications (historic alphabets and ideographs,
 mathematical and musical typesetting, etc.), Unicode was
 turned into a sort of 21-bit character set with possible
 code points in the range U-00000000 to U-0010FFFF. The
 2×1024 surrogate characters (U+D800 to U+DFFF) were
 introduced into the BMP to allow 1024×1024 non-BMP
 characters to be represented as a sequence of two 16-bit
 surrogate characters. This way UTF-16 was born, which
 represents the extended "21-bit" Unicode in a way
 backwards compatible with UCS-2.

from http://www.cl.cam.ac.uk/~mgk25/unicode.html
(BMP = Basic Multilingual Plane, 0000-FFFF) Notice the use
of the word surrogate in this paragraph. A high surrogate
(U+D800 to U+DBFF) is the first half of a 2 * 16-bit
character, and a low surrogate (U+DC00 to U+DFFF) is the
second half (in UTF-16 only, which we hate-- UTF-8 is God's
encoding). Private Use Surrogate just means that these
particular high surrogates (U+DB80 to U+DBFF) map into one
of the Private Use Areas in planes 15 and 16.
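
The arithmetic behind surrogate pairs can be sketched in a few
lines of Python. This is an illustration only, not gucharmap's
code; the function names to_surrogates and classify_surrogate are
made up for the example, and the labels simply reuse the msgid
wording quoted above.

```python
def to_surrogates(code_point):
    """Split a non-BMP code point into its UTF-16 (high, low) pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP use surrogates")
    offset = code_point - 0x10000        # a 20-bit number
    high = 0xD800 + (offset >> 10)       # top 10 bits, comes first
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits, comes second
    return high, low


def classify_surrogate(unit):
    """Name the surrogate range a 16-bit code unit falls in (hypothetical helper)."""
    if 0xD800 <= unit <= 0xDB7F:
        return "Non Private Use High Surrogate"
    if 0xDB80 <= unit <= 0xDBFF:
        return "Private Use High Surrogate"
    if 0xDC00 <= unit <= 0xDFFF:
        return "Low Surrogate"
    return None  # not a surrogate at all


# U+1D11E MUSICAL SYMBOL G CLEF lives outside the BMP.
high, low = to_surrogates(0x1D11E)
print(hex(high), hex(low))
print(classify_surrogate(high), "/", classify_surrogate(low))
```

A code point in plane 15 or 16, such as U+F0000, yields a high
surrogate in the "Private Use" range, which is all that label
means.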

>
> #: gucharmap/gucharmap-unicode-info.c:165
> msgid "<Plane 15 Private Use>"
> msgstr ""
>
> #: gucharmap/gucharmap-unicode-info.c:167
> msgid "<Plane 16 Private Use>"
> msgstr ""
>
> For the above two, I take it these are not planes in the air? :)
> surfaces, perhaps?
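
In Unicode, each block of 65,536 (2^16) code points is
called a plane, for some reason: 0000-FFFF, 10000-1FFFF,
..., 100000-10FFFF. There are 17 planes, numbered 0 to 16.
Planes 15 (F0000-FFFFF) and 16 (100000-10FFFF) are the ones
set aside for private use.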
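
Put as arithmetic, the plane number is just the code point with
its low 16 bits dropped. A minimal Python sketch (illustrative
only; plane_of and plane_range are names invented for this
example):

```python
def plane_of(code_point):
    """Return the plane number (0 to 16) that a code point belongs to."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return code_point >> 16


def plane_range(plane):
    """Return the first and last code point of a plane."""
    if not 0 <= plane <= 16:
        raise ValueError("Unicode has planes 0 through 16 only")
    first = plane << 16
    return first, first | 0xFFFF


for cp in (0x0041, 0x1D11E, 0xF0000, 0x10FFFF):
    print(f"U+{cp:04X} is in plane {plane_of(cp)}")
```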

>
> #: gucharmap/gucharmap-unicode-info.c:186
> msgid "Other, Control"
> msgstr ""
>
> Control as a noun, not a verb?

Yeah, a noun. This is the same idea as iscntrl() in ctype.h.
Newline, carriage return, delete, etc. are control
characters.
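
"Other, Control" is the long name of the Unicode general category
abbreviated "Cc". As a rough check, assuming Python 3 and its
standard unicodedata module (this is not how gucharmap itself
looks the category up), you can ask for the category directly:

```python
import unicodedata


def is_control(ch):
    """True if the character's general category is Cc ("Other, Control")."""
    return unicodedata.category(ch) == "Cc"


samples = {"newline": "\n", "carriage return": "\r",
           "delete": "\x7f", "letter a": "a"}
for label, ch in samples.items():
    print(f"{label}: {unicodedata.category(ch)}")
```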

>
> #: gucharmap/gucharmap-window.c:142
> msgid "Jump to Unicode Code Point"
> msgstr ""
>
> (and its neighbours in the po file: this is code as in.. well, as _not_
> in source code, I take it)

Code point just means a number, basically. "Unicode assigns
a number to every character". That number is the code point.
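
A small Python sketch of that character-to-number mapping, for
illustration. The helper parse_code_point is hypothetical; it only
shows the kind of input a "jump to code point" dialog might
accept, and is not taken from gucharmap.

```python
def parse_code_point(text):
    """Turn text like 'U+010D' or '10d' into the integer code point."""
    text = text.strip().upper()
    if text.startswith("U+"):
        text = text[2:]
    return int(text, 16)  # code points are conventionally written in hex


ch = "\u010d"                    # the character č
number = ord(ch)                 # character -> code point
label = f"U+{number:04X}"        # conventional notation
back = chr(parse_code_point(label))  # code point -> character

print(number, label, back == ch)
```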

>
> #: gucharmap/gucharmap-window.c:577
> msgid "Snap Columns to Power of Two"
> msgstr ""
>
> snap? As in the way metacity has some sort of "snapping"?

Yes, I think it's the same idea.
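
One plausible reading of "snapping" here, sketched in Python. This
is an assumption made for illustration, not a description of
gucharmap's actual code: the number of columns that would fit in
the window is rounded down to the nearest power of two, so that
rows line up on tidy boundaries. The function name is invented.

```python
def snap_to_power_of_two(columns):
    """Round a column count down to the nearest power of two (assumed behaviour)."""
    if columns < 1:
        raise ValueError("need at least one column")
    # bit_length() - 1 is the position of the highest set bit.
    return 1 << (columns.bit_length() - 1)


for fits in (1, 7, 16, 20, 31, 32):
    print(fits, "->", snap_to_power_of_two(fits))
```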

>
> And these are strings I want to check we give the right sense
> for:
>
> #: gucharmap/gucharmap-charmap.c:437 gucharmap/gucharmap-table.c:332
> msgid "[not a printable character]"
> msgstr ""
> -- can we say "character which cannot be printed"? And is this
> printed on a printer, or displayed on a monitor as well?

Yes, you can say that. Displayed on a monitor as well.

>
> #: gucharmap/gucharmap-charmap.c:529
> msgid "Approximate equivalents:"
> msgstr ""
>
> #: gucharmap/gucharmap-charmap.c:538
> msgid "Equivalents:"
> msgstr ""
> -- For these, is equivalent the "mathematical" "exactly the same"
> sort of concept, or something different?

Yeah, I'm pretty sure "exactly the same" works here.

>
> #: gucharmap/unicode/unicode_blocks.cI:101
> msgid "Private Use Area"
> msgstr ""
> -- This is the part of unicode reserved for people to do what they
> want with?

Exactly.

>
> #: gucharmap/unicode/unicode_blocks.cI:110
> msgid "Halfwidth and Fullwidth Forms"
> msgstr ""
> -- forms of... characters?

Yes.
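
For the curious, the "forms" are width variants of ordinary
characters, mostly used in East Asian text. A minimal Python
sketch, assuming the standard unicodedata module and using two
sample characters picked for this example, shows how each form
relates to its ordinary counterpart:

```python
import unicodedata

fullwidth_a = "\uff21"   # FULLWIDTH LATIN CAPITAL LETTER A
halfwidth_ka = "\uff71"  # HALFWIDTH KATAKANA LETTER A

# East Asian Width property: "F" = fullwidth, "H" = halfwidth.
widths = [unicodedata.east_asian_width(ch)
          for ch in (fullwidth_a, halfwidth_ka)]

# Compatibility normalization (NFKC) maps each form to the ordinary character.
ordinary = [unicodedata.normalize("NFKC", ch)
            for ch in (fullwidth_a, halfwidth_ka)]

print(widths, ordinary)
```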

Noah


Date: Mon, 28 Jul 2003 12:12:54 -0400
From: Noah Levitt
To: Telsa Gwynne
Subject: Re: How to translate some of the gucharmap strings?

On Mon, Jul 28, 2003 at 16:50:02 +0100, Telsa Gwynne wrote:
>
> > > #: gucharmap/gucharmap-charmap.c:466
> > > msgid "Various Useful Representations"
> > > msgstr ""
> >

Ah. Yes, of characters. For example, this section lists the
numeric entity reference for use in html and xml, e.g.
&#3278; (for ೎, U+0CCE), and stuff along those lines.
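
A numeric entity reference is just the code point written between
"&#" and ";". A short Python sketch, for illustration only (the
helper names are invented, and gucharmap's own formatting code is
not shown here):

```python
import html


def decimal_entity(ch):
    """Decimal numeric character reference, e.g. for HTML or XML."""
    return f"&#{ord(ch)};"


def hex_entity(ch):
    """Hexadecimal form of the same reference."""
    return f"&#x{ord(ch):X};"


ch = "\u0cce"  # the example character above
print(decimal_entity(ch), hex_entity(ch))

# html.unescape() turns either reference back into the character.
print(html.unescape(decimal_entity(ch)) == ch)
```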

Noah