The current version of AS supports the concept of loadable language
modules, i.e. the language AS speaks to you is not set at compile time.
Instead, AS tries to detect the language environment at startup and
then loads the appropriate set of messages dynamically. The process of
detection differs depending on the platform: On MS-DOS and OS/2 systems,
AS queries the COUNTRY setting made in CONFIG.SYS. On Unix systems, AS
looks for the environment variables

   LC_MESSAGES
   LC_ALL
   LANG

and takes the first two letters from the variable that is found first.
These two letters are interpreted as a code for the country you live
in.

Currently, AS knows the languages German (code 049 resp. DE) and
English (code 001 resp. EN). Any other setting leads to the default,
English. Sorry, but I do not know any other languages well enough
to do further translations. You may now ask if you could add more
languages to AS, and this is just what I hoped for when I wrote these
lines ;-)

Messages are stored in text files with the extension '.res'. Since
parsing text files at every startup of the assembler would be quite
inefficient, the '.res' files are transformed into a binary, indexed
format that can be read with a few block read statements. The
translation is done during the build process with a special tool
called 'rescomp' (you might have seen the execution of rescomp while
you built the C version of AS). rescomp parses the input file(s),
assigns a number to each message, packs the messages into a single
array of chars with an index table, and creates an additional header
file that contains the numbers assigned to each message. A run-time
library then allows looking up the messages via their numbers.

A message source file consists of a couple of control statements.
Empty lines are ignored; lines that start with a semicolon are
treated as comments (i.e. they are also ignored).
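
The two line rules just stated (skip empty lines, skip lines that start
with a semicolon) can be pictured with a small C sketch. This is not
taken from rescomp; the names LineKind and classify_line are made up
for illustration. Treating a whitespace-only line as empty is an
assumption, and since it is not stated here whether whitespace may
precede the semicolon, the sketch only looks at the very first
character.

```c
#include <ctype.h>

/* Hypothetical names: none of these identifiers come from the AS sources. */
typedef enum { LINE_EMPTY, LINE_COMMENT, LINE_STATEMENT } LineKind;

/* Classify one line of a '.res' file; a trailing newline is tolerated. */
LineKind classify_line(const char *line)
{
  const char *p;

  /* A line that starts with a semicolon is a comment. */
  if (line[0] == ';')
    return LINE_COMMENT;

  /* Assumption: a line holding nothing but whitespace counts as empty. */
  for (p = line; *p != '\0'; p++)
    if (!isspace((unsigned char)*p))
      return LINE_STATEMENT;

  return LINE_EMPTY;
}
```

A reader loop would call classify_line() on every line and hand only
the LINE_STATEMENT lines to the actual statement parser.
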
The first control statement a message file contains is the 'Langs'
statement, which indicates the languages the messages in this file
will support. This is a *GLOBAL* setting, i.e. you cannot omit
languages for single messages! The command has the following form:

   Langs <Code>(<Country-Code(s),...>) ....

'Code' is the two-letter abbreviation for a language, e.g. 'DE' for
German. Please use only UPPERcase! The code is followed by a
comma-separated list of DOS-style country codes for DOS and OS/2
environments. As you see, several country codes may point to a
single language this way. For example, if you want to assign the
English language to both Americans and British people, write

   Langs EN(001,044) <further languages>

In case AS finds a language environment that was not explicitly
handled in the message file, the first language given to the 'Langs'
command is used. You may override this via the 'Default' statement,
e.g.

   Default DE

Once the languages are specified, the 'Message' command is the
only one left to be explained. This command starts the definition of
a message. The message file compiler reads the next 'n' lines, with
'n' being the number of languages defined by the 'Langs' command. A
sample message definition would look like

   Message TestMessage
    "Dies ist ein Test"
    "This is a test"

given that you specified German and English with the 'Langs'
command.

In case the messages become longer than a single line (messages may
contain newline characters, more about this later), the use of a
backslash (\) as a line continuation character is allowed:

   Message TestMessage2
    "Dies ist eine" \
    "zweizeilige Nachricht"
    "This is a" \
    "two-line message"

Since we deal with non-English languages, we also have to deal with
characters that are not part of the standard ASCII character set - a
point where UNIX systems are traditionally weak.
Since we cannot assume that all terminals have the capability to enter
all language-specific characters directly, there must be an 'escape
mechanism' to write them as a sequence of standard ASCII characters.
The message file compiler uses a subset of the sequences used in SGML
and HTML:

   &auml; &euml; &iuml; &ouml; &uuml;
      --> lowercase umlauted characters
   &Auml; &Euml; &Iuml; &Ouml; &Uuml;
      --> uppercase umlauted characters
   &szlig;
      --> German sharp s
   &sup2;
      --> superscript 2
   &micro;
      --> micro character
   &agrave; &egrave; &igrave; &ograve; &ugrave;
      --> lowercase accent grave characters
   &Agrave; &Egrave; &Igrave; &Ograve; &Ugrave;
      --> uppercase accent grave characters
   &aacute; &eacute; &iacute; &oacute; &uacute;
      --> lowercase accent acute characters
   &Aacute; &Eacute; &Iacute; &Oacute; &Uacute;
      --> uppercase accent acute characters
   &acirc; &ecirc; &icirc; &ocirc; &ucirc;
      --> lowercase accent circumflex characters
   &Acirc; &Ecirc; &Icirc; &Ocirc; &Ucirc;
      --> uppercase accent circumflex characters
   &ccedil; &Ccedil;
      --> lowercase / uppercase cedilla
   &ntilde; &Ntilde;
      --> lowercase / uppercase tilded n
   &aring; &Aring;
      --> lowercase / uppercase ringed a
   &aelig; &Aelig;
      --> lowercase / uppercase ae diphthong
   &iquest; &iexcl;
      --> inverted question / exclamation mark
   \n
      --> newline character

Upon translation of a message file, the message file compiler will
replace these sequences with the correct character encodings for the
target platform. In the extreme case of a bare 7-bit-ASCII system,
this may imply the translation to a sequence of ASCII characters that
'emulate' the non-ASCII character. *NEVER* use the special characters
directly in the message source files, as this would destroy their
portability!!!

The number of supported language-specific characters used to be
strongly biased to the German language.
The reason for this is simple: German is the only non-English language
AS currently supports...sorry, but English and German are the only
languages I am sufficiently fluent in to make a translation...help from
others to extend the range is most welcome, and this is the primary
reason why I explained the whole stuff ;-)

So, if you feel brave enough to add a language (don't forget that
there's also an almost-300-page user's manual that waits for
translation ;-), the following steps have to be taken:

   1. Find out which non-ASCII characters you additionally need.
      I can then extend the message file compiler appropriately.
   2. Add your language to the 'Langs' statement in 'header.res'.
      This file is included into all other message files, so you
      only have to do this once :-)
   3. Go through all other '.res' files and add a line for your
      language to every message........
   4. Recompile AS.
   5. You're done!

That's about everything to be said about the technical side.
Let's go to the political side. I'm prepared to get confronted
with two opinions after you read this:

   "Gee, that's far too much effort for such a tool. And anyway, who
    needs anything else than English on a Unix system? Unix is
    something that was born to be English, and you better accept that!"

   "Hey, why did you reinvent the wheel? There's catgets(), there's
    GNU-gettext, and..."

Well, I'll try to stay polite ;-)

First, the fact that Unix is so biased towards the English language is
in no way god-given, it's just the way it evolved. Unix was developed
in the USA, and the typical Unix users were up to now people who had
no problems with English - university students, developers etc.
But the times have changed: Linux and *BSD have made Unix cheap, and we
are facing more and more Unix users from other circles - people who
previously only knew MS-LOSS and MS-Windog, and who were told by their
nearest freak that Unix is a great thing. Such users typically will not
accept a system that only speaks English, given that every
500-Dollar-Windows PC speaks to them in their native language, so why
not this Unix system that claims to be sooo great ?!

Furthermore, do not forget that AS is not a Unix-only tool: It runs
on MS-DOS and OS/2 too, and some people try to make it go on Macs
(though this seems to be a much harder piece of work...). On these
systems, localization is the standard!

The portability to non-Unix platforms is the reason why I did not choose
an existing package to manage message catalogs. catgets() seems to be
Unix-specific (and it is not even available on all Unix systems!), and
about gettext...well, I just did not look into it...it might have worked,
but most of the GNU tools ported to DOS I have seen so far needed
32-bit extenders, which I wanted to avoid. So I quickly hacked up my own
library, but I promise that I will at least reuse it for my own projects!

chardefs.h