The current version of AS supports the concept of loadable language
modules, i.e. the language AS speaks to you is not set at compile time.
Instead, AS tries to detect the language environment at startup and then
loads the appropriate set of messages dynamically.  The process of
detection differs depending on the platform: On MS-DOS and OS/2 systems,
AS queries the COUNTRY setting made in CONFIG.SYS.  On Unix systems, AS
looks for the environment variables

LC_MESSAGES
LC_ALL
LANG

and takes the first two letters from the variable that is found first.
These two letters are interpreted as the code of the language to
use.
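A minimal sketch of the Unix detection described above.  The function
names are made up for illustration and are not taken from the AS sources;
upper-casing the two letters and skipping empty variables are assumptions
of this sketch as well.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the AS sources): copy the first two
   letters of a locale value such as "de_DE" into code[], upper-cased.
   Returns 1 on success, 0 if the value is missing or too short. */
static int lang_code_from_value(const char *val, char code[3])
{
  if (!val || strlen(val) < 2)
    return 0;
  code[0] = (char)toupper((unsigned char)val[0]);
  code[1] = (char)toupper((unsigned char)val[1]);
  code[2] = '\0';
  return 1;
}

/* Check the variables in the order listed in the text; the first one
   that yields a code wins.  Treating an empty variable as "not found"
   is an assumption of this sketch. */
static int detect_lang_code(char code[3])
{
  static const char *const vars[] = { "LC_MESSAGES", "LC_ALL", "LANG" };
  size_t i;

  for (i = 0; i < sizeof(vars) / sizeof(vars[0]); i++)
    if (lang_code_from_value(getenv(vars[i]), code))
      return 1;
  return 0;
}
```

A caller would fall back to the default language whenever
detect_lang_code() returns 0 or delivers a code that no message file
knows.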

Currently, AS knows the languages German (code 049 or DE) and
English (code 001 or EN).  Any other setting leads to the default,
English.  Sorry, but I do not know any other languages well enough
to do further translations.  You may now ask if you could add more
languages to AS, and this is just what I hoped for when I wrote these
lines ;-)

Messages are stored in text files with the extension '.res'.  Since
parsing text files at every startup of the assembler would be quite
inefficient, the '.res' files are transformed into a binary, indexed
format that can be read with a few block read statements.  The
translation is done during the build process with a special tool
called 'rescomp' (you might have seen rescomp being executed while
you built the C version of AS).  rescomp parses the input file(s),
assigns a number to each message, packs the messages into a single
array of chars with an index table, and creates an additional header
file that contains the numbers assigned to each message.  A run-time
library then allows looking up the messages via their numbers.
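To illustrate the idea of a packed character array with an index table,
here is a small sketch.  All names (msg_pool, msg_index, get_message and
the Num_... constants) are hypothetical stand-ins; the data rescomp
really generates and the file format it writes may look different, and
the sketch keeps everything in memory instead of reading a binary file.

```c
#include <stddef.h>

/* Hypothetical message numbers, as a generated header might define them */
enum { Num_TestMessage = 0, Num_OtherMessage = 1, Num_MsgCount = 2 };

/* All messages of one language packed back to back, each one
   terminated by a NUL character */
static const char msg_pool[] = "This is a test\0Another message";

/* Index table: start offset of each message inside msg_pool
   ("This is a test" has 14 characters plus the NUL, hence 15) */
static const unsigned msg_index[Num_MsgCount] = { 0, 15 };

/* Look up a message by its number; returns NULL for unknown numbers */
static const char *get_message(unsigned num)
{
  return (num < (unsigned)Num_MsgCount) ? msg_pool + msg_index[num] : NULL;
}
```

With such a layout, a lookup is a single array access, and the whole
pool can be loaded in one go.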

A message source file consists of a handful of control statements.
Empty lines are ignored; lines that start with a semicolon are
treated as comments (i.e. they are also ignored).  The first
control statement a message file contains is the 'Langs' statement,
which indicates the languages the messages in this file will support.
This is a *GLOBAL* setting, i.e. you cannot omit languages for single
messages!  The command has the following form:

Langs <Code>(<Country-Code(s),...>) ....

'Code' is the two-letter abbreviation for a language, e.g. 'DE' for
German.  Please use only UPPERcase!  The code is followed by a
comma-separated list of DOS-style country codes for DOS and OS/2
environments.  As you see, several country codes may point to a
single language this way.  For example, if you want to assign the
English language to both Americans and British people, write

Langs EN(001,061) <further languages>

In case AS finds a language environment that was not explicitly
handled in the message file, the first language given to the 'Langs'
command is used.  You may override this via the 'Default' statement,
e.g.

Default DE
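The selection rule can be sketched as follows.  The table mirrors a
statement like "Langs EN(001,061) DE(049)"; the struct, the table and
select_lang() are invented for this illustration and say nothing about
how AS stores this information internally.

```c
#include <stddef.h>

/* Hypothetical in-memory form of one entry of the 'Langs' statement */
struct lang_entry
{
  const char *code;   /* two-letter language code */
  int countries[4];   /* DOS country codes, terminated by 0 */
};

/* Country codes are written without leading zeros here, since a
   literal like 061 would be read as an octal number in C */
static const struct lang_entry langs[] = {
  { "EN", { 1, 61, 0 } },
  { "DE", { 49, 0 } },
};

/* Return the index into langs[] for a DOS country code.  Unknown codes
   yield default_idx: 0 (the first language) unless a 'Default'
   statement selected another one. */
static size_t select_lang(int country, size_t default_idx)
{
  size_t i, j;

  for (i = 0; i < sizeof(langs) / sizeof(langs[0]); i++)
    for (j = 0; langs[i].countries[j] != 0; j++)
      if (langs[i].countries[j] == country)
        return i;
  return default_idx;
}
```

In this sketch, select_lang(33, 0) picks English for an unknown country
code, while passing 1 as default_idx models the effect of 'Default DE'.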

Once the languages are specified, the 'Message' command is the
only one left to be explained.  This command starts the definition of
a message.  The message file compiler reads the next 'n' lines, with
'n' being the number of languages defined by the 'Langs' command.  A
sample message definition would look like

Message TestMessage
 "Dies ist ein Test"
 "This is a test"

given that you specified German and English, in this order, with the
'Langs' command.

In case the messages become longer than a single line (messages may
contain newline characters, more about this later), the use of a
backslash (\) as a line continuation character is allowed:

Message TestMessage2
 "Dies ist eine" \
 "zweizeilige Nachricht"
 "This is a" \
 "two-line message"
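A rough sketch of how continuation lines could be folded into one
message.  append_piece() is a hypothetical helper, not rescomp's code,
and it assumes that the quoted pieces are simply concatenated without
any separator; the text above does not say whether that is what the
message file compiler really does.

```c
#include <string.h>

/* Append the text between the first and the last double quote of 'line'
   to the NUL-terminated string in 'dest' (capacity 'size').
   Returns 1 if the line ends in a backslash (the message continues on
   the next line), 0 if the message is complete, and -1 on malformed
   input or lack of space. */
static int append_piece(char *dest, size_t size, const char *line)
{
  const char *first = strchr(line, '"');
  const char *last = strrchr(line, '"');
  size_t used = strlen(dest), len, end;

  if (!first || last == first)
    return -1;
  len = (size_t)(last - first - 1);
  if (used + len + 1 > size)
    return -1;
  memcpy(dest + used, first + 1, len);
  dest[used + len] = '\0';

  /* find the last character that is not trailing white space */
  end = strlen(line);
  while (end > 0 && strchr(" \t\r\n", line[end - 1]))
    end--;
  return (end > 0 && line[end - 1] == '\\') ? 1 : 0;
}
```

A reader loop would call this once per input line and move on to the
next language as soon as 0 is returned.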

Since we deal with non-English languages, we also have to deal with
characters that are not part of the standard ASCII character set - a
point where UNIX systems are traditionally weak.  Since we cannot
assume that all terminals are able to enter all language-specific
characters directly, there must be an 'escape mechanism' to write
them as a sequence of standard ASCII characters.  The message file
compiler uses a subset of the sequences used in SGML and HTML:

 &auml; &euml; &iuml; &ouml; &uuml;
   --> lowercase umlauted characters
 &Auml; &Euml; &Iuml; &Ouml; &Uuml;
   --> uppercase umlauted characters
 &szlig;
   --> German sharp s
 &sup2;
   --> superscript 2
 &micro;
   --> micro sign
 &agrave; &egrave; &igrave; &ograve; &ugrave;
   --> lowercase accent grave characters
 &Agrave; &Egrave; &Igrave; &Ograve; &Ugrave;
   --> uppercase accent grave characters
 &aacute; &eacute; &iacute; &oacute; &uacute;
   --> lowercase accent acute characters
 &Aacute; &Eacute; &Iacute; &Oacute; &Uacute;
   --> uppercase accent acute characters
 &acirc; &ecirc; &icirc; &ocirc; &ucirc;
   --> lowercase accent circumflex characters
 &Acirc; &Ecirc; &Icirc; &Ocirc; &Ucirc;
   --> uppercase accent circumflex characters
 &ccedil; &Ccedil;
   --> lowercase / uppercase cedilla
 &ntilde; &Ntilde;
   --> lowercase / uppercase tilded n
 &aring; &Aring;
   --> lowercase / uppercase ringed a
 &aelig; &Aelig;
   --> lowercase / uppercase ae diphthong
 &iquest; &iexcl;
   --> inverted question / exclamation mark
 \n
   --> newline character

Upon translation of a message file, the message file compiler will
replace these sequences with the correct character encodings for the
target platform.  In the extreme case of a bare 7-bit ASCII system,
this may imply translation to a sequence of ASCII characters that
'emulate' the non-ASCII character.  *NEVER* use the special characters
directly in the message source files, as this would destroy their
portability!!!
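The replacement step might look roughly like the following sketch.  It
covers only four of the sequences, the struct and function names are
hypothetical, and both target encodings are assumptions: ISO 8859-1
bytes for an 8-bit system and the usual German transcriptions ('ae',
'ss', ...) as the 7-bit emulation.  The tables rescomp really uses may
differ.

```c
#include <string.h>

/* One escape sequence with two possible target encodings (assumed) */
struct entity
{
  const char *name;    /* sequence as written in the '.res' file */
  const char *latin1;  /* ISO 8859-1 byte */
  const char *ascii;   /* 7-bit emulation */
};

static const struct entity entities[] = {
  { "&auml;",  "\xe4", "ae" },
  { "&ouml;",  "\xf6", "oe" },
  { "&uuml;",  "\xfc", "ue" },
  { "&szlig;", "\xdf", "ss" },
};

/* Copy src to dest (capacity 'size'), replacing known sequences.
   Returns 0 on success, -1 if dest is too small. */
static int expand_entities(const char *src, char *dest, size_t size,
                           int ascii_only)
{
  size_t out = 0;

  if (size == 0)
    return -1;
  while (*src)
  {
    const char *repl = NULL;
    size_t skip = 1, len, i;

    for (i = 0; i < sizeof(entities) / sizeof(entities[0]); i++)
    {
      size_t nlen = strlen(entities[i].name);

      if (!strncmp(src, entities[i].name, nlen))
      {
        repl = ascii_only ? entities[i].ascii : entities[i].latin1;
        skip = nlen;
        break;
      }
    }
    len = repl ? strlen(repl) : 1;
    if (out + len + 1 > size)
      return -1;
    memcpy(dest + out, repl ? repl : src, len);
    out += len;
    src += skip;
  }
  dest[out] = '\0';
  return 0;
}
```

Under these assumptions, "Gr&ouml;&szlig;e" becomes "Groesse" on a
7-bit system; unknown sequences are copied unchanged.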

The set of supported language-specific characters used to be
strongly biased towards the German language.  The reason for this is
simple: German is the only non-English language AS currently
supports...sorry, but English and German are the only languages
I am sufficiently fluent in to make a translation...help from others
to extend the range is most welcome, and this is the primary reason
why I explained the whole stuff ;-)

So, if you feel brave enough to add a language (don't forget that
there's also an almost-300-page user's manual that waits for
translation ;-), the following steps have to be taken:

  1. Find out which non-ASCII characters you additionally need.
     I can then extend the message file compiler appropriately.
  2. Add your language to the 'Langs' statement in 'header.res'.
     This file is included into all other message files, so you
     only have to do this once :-)
  3. Go through all other '.res' files and add a line for your
     language to every message........
  4. Recompile AS.
  5. You're done!

That's about everything to be said about the technical side.
Let's go to the political side.  I'm prepared to be confronted
with two opinions after you read this:

  "Gee, that's far too much effort for such a tool.  And anyway, who
   needs anything else than English on a Unix system?  Unix is
   something that was born to be English, and you better accept that!"

  "Hey, why did you reinvent the wheel?  There's catgets(), there's
   GNU-gettext, and..."

Well, I'll try to stay polite ;-)

First, the fact that Unix is so biased towards the English language is
in no way god-given, it's just the way it evolved.  Unix was developed
in the USA, and the typical Unix users were up to now people who had
no problems with English - university students, developers etc.  But
times have changed: Linux and *BSD have made Unix cheap, and we are
facing more and more Unix users from other circles - people who
previously only knew MS-LOSS and MS-Windog, and who were told by their
nearest freak that Unix is a great thing.  Such users typically will not
accept a system that only speaks English, given that every 500-dollar
Windows PC speaks to them in their native language, so why not this
Unix system that claims to be sooo great ?!

Furthermore, do not forget that AS is not a Unix-only tool: It runs
on MS-DOS and OS/2 too, and some people are trying to make it run on
Macs (though this seems to be a much harder piece of work...).  On
these systems, localization is the standard!

The portability to non-Unix platforms is the reason why I did not choose
an existing package to manage message catalogs.  catgets() seems to be
Unix-specific (and it is not even available on all Unix systems!), and
as for gettext...well, I just did not look into it...it might have
worked, but most of the GNU tools ported to DOS I have seen so far needed
32-bit extenders, which I wanted to avoid.  So I quickly hacked up my own
library, but I promise that I will at least reuse it for my own projects!