@c Copyright 2001, 2002 Free Software Foundation, Inc.
@c This is part of the GAS manual.
@c For copying conditions, see the file as.texinfo.
@c MMIX description by Hans-Peter Nilsson, hp@bitrange.com
@ifset GENERIC
@page
@node MMIX-Dependent
@chapter MMIX Dependent Features
@end ifset
@ifclear GENERIC
@node Machine Dependencies
@chapter MMIX Dependent Features
@end ifclear

@cindex MMIX support
@menu
* MMIX-Opts::              Command-line Options
* MMIX-Expand::            Instruction expansion
* MMIX-Syntax::            Syntax
* MMIX-mmixal::		   Differences to @code{mmixal} syntax and semantics
@end menu

@node MMIX-Opts
@section Command-line Options

@cindex options, MMIX
@cindex MMIX options
The MMIX version of @code{@value{AS}} has some machine-dependent options.

@cindex @samp{--fixed-special-register-names} command line option, MMIX
When @samp{--fixed-special-register-names} is specified, only the register
names specified in @ref{MMIX-Regs} are recognized in the instructions
@code{PUT} and @code{GET}.

@cindex @samp{--globalize-symbols} command line option, MMIX
You can use the @samp{--globalize-symbols} option to make all symbols
global.  This option is useful when splitting up a @code{mmixal} program
into several files.

@cindex @samp{--gnu-syntax} command line option, MMIX
The @samp{--gnu-syntax} option turns off most syntax compatibility with
@code{mmixal}.  Its usefulness is currently doubtful.

@cindex @samp{--relax} command line option, MMIX
The @samp{--relax} option is not fully supported, but is intended
eventually to prepare the object file for linker relaxation.

@cindex @samp{--no-predefined-syms} command line option, MMIX
If you want to avoid inadvertently calling a predefined symbol and would
rather get an error, for example when using @code{@value{AS}} with a
compiler or other machine-generated code, specify
@samp{--no-predefined-syms}.  This turns off built-in predefined
definitions of all such symbols, including rounding-mode symbols, segment
symbols, @samp{BIT} symbols, and @code{TRAP} symbols used in @code{mmix}
``system calls''.  It also turns off predefined special-register names,
except when used in @code{PUT} and @code{GET} instructions.

@cindex @samp{--no-expand} command line option, MMIX
By default, some instructions are expanded to fit the size of the operand
or an external symbol (@pxref{MMIX-Expand}).  By passing
@samp{--no-expand}, no such expansion will be done, instead causing errors
at link time if the operand does not fit.

@cindex @samp{--no-merge-gregs} command line option, MMIX
The @code{mmixal} documentation (@pxref{mmixsite}) specifies that global
registers allocated with the @samp{GREG} directive (@pxref{MMIX-greg}) and
initialized to the same non-zero value will refer to the same global
register.  This isn't strictly enforceable in @code{@value{AS}} since the
final addresses aren't known until link-time, but it will make an effort
unless the @samp{--no-merge-gregs} option is specified.  (Register merging
isn't yet implemented in @code{@value{LD}}.)

@cindex @samp{-x} command line option, MMIX
@code{@value{AS}} will warn every time it expands an instruction to fit an
operand unless the option @samp{-x} is specified.  It is believed that
this behaviour is more useful than just mimicking @code{mmixal}'s
behaviour, in which instructions are only expanded if the @samp{-x} option
is specified, and assembly fails otherwise, when an instruction needs to
be expanded.  It needs to be kept in mind that @code{mmixal} is both an
assembler and linker, while @code{@value{AS}} will expand instructions
that at link stage can be contracted.  (Though linker relaxation isn't yet
implemented in @code{@value{LD}}.)  The option @samp{-x} also implies
@samp{--linker-allocated-gregs}.

@cindex @samp{--no-pushj-stubs} command line option, MMIX
@cindex @samp{--no-stubs} command line option, MMIX
If instruction expansion is enabled, @code{@value{AS}} can expand a
@samp{PUSHJ} instruction into a series of instructions.  The shortest
expansion is to not expand it, but just mark the call as redirectable to a
stub, which @code{@value{LD}} creates at link-time, but only if the
original @samp{PUSHJ} instruction is found not to reach the target.  The
stub consists of the necessary instructions to form a jump to the target.
This happens if @code{@value{AS}} can assert that the @samp{PUSHJ}
instruction can reach such a stub.  The option @samp{--no-pushj-stubs}
disables this shorter expansion, and the longer series of instructions is
then created at assembly-time.  The option @samp{--no-stubs} is a synonym,
intended for compatibility with future releases, where generation of stubs
for other instructions may be implemented.

@cindex @samp{--linker-allocated-gregs} command line option, MMIX
Usually a two-operand-expression (@pxref{GREG-base}) without a matching
@samp{GREG} directive is treated as an error by @code{@value{AS}}.  When
the option @samp{--linker-allocated-gregs} is in effect, such expressions
are instead passed through to the linker, which will allocate as many
global registers as are needed.

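As a sketch of how these options combine on the command line (the file
names @file{prog.mms} and @file{gen.s} are hypothetical), the first
invocation below assembles hand-written @code{mmixal}-style code without
expansion warnings, and the second assembles machine-generated code with
expansion and predefined symbols turned off:
@smallexample
@value{AS} -x -o prog.o prog.mms
@value{AS} --no-expand --no-predefined-syms -o gen.o gen.s
@end smallexample
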
@node MMIX-Expand
@section Instruction expansion

@cindex instruction expansion, MMIX
When @code{@value{AS}} encounters an instruction with an operand that is
either not known or does not fit the operand size of the instruction,
@code{@value{AS}} (and @code{@value{LD}}) will expand the instruction into
a sequence of instructions semantically equivalent to the operand fitting
the instruction.  Expansion will take place for the following
instructions:

@table @asis
@item @samp{GETA}
Expands to a sequence of four instructions: @code{SETL}, @code{INCML},
@code{INCMH} and @code{INCH}.  The operand must be a multiple of four.
@item Conditional branches
A branch instruction is turned into a branch with the complemented
condition and prediction bit over five instructions; four instructions
setting @code{$255} to the operand value, which, as with @code{GETA}, must
be a multiple of four, and a final @code{GO $255,$255,0}.
@item @samp{PUSHJ}
Similar to expansion for conditional branches; four instructions set
@code{$255} to the operand value, followed by a @code{PUSHGO $255,$255,0}.
@item @samp{JMP}
Similar to conditional branches and @code{PUSHJ}.  The final instruction
is @code{GO $255,$255,0}.
@end table

The linker @code{@value{LD}} is expected to shrink these expansions for
code assembled with @samp{--relax} (though not currently implemented).

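The expansions listed above can be sketched as ordinary source code.  This
is an illustration only: it assumes a hypothetical target address of
@code{#0123456789abcdf0}, whereas the assembler emits the sequence itself
and leaves the actual operand values to be filled in later.  The first
sequence corresponds to an expanded @code{JMP}, the second to an expanded
@code{BZ $1,target}:
@smallexample
* Equivalent of an expanded JMP to #0123456789abcdf0
 SETL  $255,#cdf0
 INCML $255,#89ab
 INCMH $255,#4567
 INCH  $255,#0123
 GO    $255,$255,0
* Equivalent of an expanded BZ $1 to the same address
 PBNZ  $1,1F
 SETL  $255,#cdf0
 INCML $255,#89ab
 INCMH $255,#4567
 INCH  $255,#0123
 GO    $255,$255,0
1H ADDU $2,$2,1
@end smallexample
In the second sequence the complemented branch skips the five inserted
instructions when the original condition does not hold.
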
@node MMIX-Syntax
@section Syntax

The assembly syntax is supposed to be upward compatible with that
described in Sections 1.3 and 1.4 of @samp{The Art of Computer
Programming, Volume 1}.  Draft versions of those chapters as well as other
MMIX information are located at
@anchor{mmixsite}@url{http://www-cs-faculty.stanford.edu/~knuth/mmix-news.html}.
Most code examples from the mmixal package located there should work
unmodified when assembled and linked as single files, with a few
noteworthy exceptions (@pxref{MMIX-mmixal}).

Before an instruction is emitted, the current location is aligned to the
next four-byte boundary.  If a label is defined at the beginning of the
line, its value will be the aligned value.

In addition to the traditional hex-prefix @samp{0x}, a hexadecimal number
can also be specified by the prefix character @samp{#}.

After all operands to an MMIX instruction or directive have been
specified, the rest of the line is ignored and treated as a comment.

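A small sketch of the last two rules (the choice of instruction and
register is arbitrary):
@smallexample
 SETL $0,0x2a    both lines set register 0 to 42
 SETL $0,#2a     text after the last operand is ignored
@end smallexample
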
@menu
* MMIX-Chars::		        Special Characters
* MMIX-Symbols::		Symbols
* MMIX-Regs::			Register Names
* MMIX-Pseudos::		Assembler Directives
@end menu

@node MMIX-Chars
@subsection Special Characters
@cindex line comment characters, MMIX
@cindex MMIX line comment characters

The characters @samp{*} and @samp{#} are line comment characters; each
starts a comment at the beginning of a line, but only at the beginning of a
line.  A @samp{#} prefixes a hexadecimal number if found elsewhere on a
line.

Two other characters, @samp{%} and @samp{!}, each start a comment anywhere
on the line.  Thus you can't use the @samp{modulus} and @samp{not}
operators, normally associated with these two characters, in expressions.

A @samp{;} is a line separator, treated as a new-line, so separate
instructions can be specified on a single line.

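The following fragment is a sketch of these rules in use; the instructions
themselves are arbitrary:
@smallexample
* A whole-line comment (the asterisk must be the first character)
 SETL $0,#ff; INCL $0,1
 ADDU $1,$0,2 % a comment can start anywhere with a percent sign
@end smallexample
The second line holds two instructions separated by @samp{;}, and the
@samp{#} in it prefixes a hexadecimal number because it is not at the
beginning of the line.
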
@node MMIX-Symbols
@subsection Symbols
The character @samp{:} is permitted in identifiers.  There are two
exceptions to it being treated as any other symbol character: if a symbol
begins with @samp{:}, it means that the symbol is in the global namespace
and that the current prefix should not be prepended to that symbol
(@pxref{MMIX-prefix}).  The @samp{:} is then not considered part of the
symbol.  For a symbol in the label position (first on a line), a @samp{:}
at the end of a symbol is silently stripped off.  A label is permitted,
but not required, to be followed by a @samp{:}, as with many other
assembly formats.

The character @samp{@@} in an expression is a synonym for @samp{.}, the
current location.

In addition to the common forward and backward local symbol formats
(@pxref{Symbol Names}), they can be specified with upper-case @samp{B} and
@samp{F}, as in @samp{8B} and @samp{9F}.  A local label defined for the
current position is written with an @samp{H} appended to the number:
@smallexample
3H LDB $0,$1,2
@end smallexample
This and traditional local-label formats cannot be mixed: a label must be
defined and referred to using the same format.

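As a sketch of how the three forms work together (the loop itself is only
illustrative), @samp{2B} below refers to the nearest preceding @samp{2H}
and @samp{2F} to the next following one:
@smallexample
 SETL $0,10
2H SUB $0,$0,1
 PBNZ $0,2B
 JMP 2F
 ADDU $1,$1,1
2H ADDU $2,$2,1
@end smallexample
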
There's a minor caveat: just as for the ordinary local symbols, the local
symbols are translated into ordinary symbols using control characters
to hide the ordinal number of the symbol.  Unfortunately, these symbols
are not translated back in error messages.  Thus you may see confusing
error messages when local symbols are used.  Control characters
@samp{\003} (control-C) and @samp{\004} (control-D) are used for the
MMIX-specific local-symbol syntax.

The symbol @samp{Main} is handled specially; it is always global.

By defining the symbols @samp{__.MMIX.start..text} and
@samp{__.MMIX.start..data}, the addresses of the @samp{.text} and
@samp{.data} segments of the final program, respectively, can be defined.
When linking more than one object file, however, the code or data in the
object file containing the symbol is not guaranteed to start at that
position; only the final executable is.  @xref{MMIX-loc}.

@node MMIX-Regs
@subsection Register Names
@cindex register names, MMIX
@cindex MMIX register names

Local and global registers are specified as @samp{$0} to @samp{$255}.
The recognized special register names are @samp{rJ}, @samp{rA}, @samp{rB},
@samp{rC}, @samp{rD}, @samp{rE}, @samp{rF}, @samp{rG}, @samp{rH},
@samp{rI}, @samp{rK}, @samp{rL}, @samp{rM}, @samp{rN}, @samp{rO},
@samp{rP}, @samp{rQ}, @samp{rR}, @samp{rS}, @samp{rT}, @samp{rU},
@samp{rV}, @samp{rW}, @samp{rX}, @samp{rY}, @samp{rZ}, @samp{rBB},
@samp{rTT}, @samp{rWW}, @samp{rXX}, @samp{rYY} and @samp{rZZ}.  A leading
@samp{:} is optional for special register names.

Local and global symbols can be equated to register names and used in
place of ordinary registers.

Similarly for special registers, local and global symbols can be used.
Also, symbols equated from numbers and constant expressions are allowed in
place of a special register, except when either of the options
@code{--no-predefined-syms} and @code{--fixed-special-register-names} is
specified.  Then only the special register names above are allowed for the
instructions having a special register operand: @code{GET} and @code{PUT}.

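A brief sketch, in which the symbol name @samp{count} is hypothetical: a
symbol is equated to an ordinary register with @code{IS} and then used as
an operand, and a special register is read and written by name:
@smallexample
count IS $5
 ADDU count,count,1
 GET $0,rJ
 PUT rJ,$0
@end smallexample
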
@node MMIX-Pseudos
@subsection Assembler Directives
@cindex assembler directives, MMIX
@cindex pseudo-ops, MMIX
@cindex MMIX assembler directives
@cindex MMIX pseudo-ops

@table @code
@item LOC
@cindex assembler directive LOC, MMIX
@cindex pseudo-op LOC, MMIX
@cindex MMIX assembler directive LOC
@cindex MMIX pseudo-op LOC

@anchor{MMIX-loc}
The @code{LOC} directive sets the current location to the value of the
operand field, which may include changing sections.  If the operand is a
constant, the section is set to either @code{.data} if the value is
@code{0x2000000000000000} or larger, else it is set to @code{.text}.
Within a section, the current location may only be changed to
monotonically higher addresses.  A LOC expression must be a previously
defined symbol or a ``pure'' constant.

An example, which sets the label @var{prev} to the current location, and
updates the current location to eight bytes forward:
@smallexample
prev LOC @@+8
@end smallexample

When a LOC has a constant as its operand, a symbol
@code{__.MMIX.start..text} or @code{__.MMIX.start..data} is defined
depending on the address as mentioned above.  Each such symbol is
interpreted as special by the linker, locating the section at that
address.  Note that if multiple files are linked, the first object file
with that section will be mapped to that address (not necessarily the file
with the LOC definition).

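A sketch of constant operands selecting sections (the addresses and the
symbol @samp{value} are chosen for illustration): the first @code{LOC} is
below @code{0x2000000000000000} and so selects @code{.text}, while the
second selects @code{.data}:
@smallexample
 LOC #100
Main SETL $0,1
 LOC #2000000000000000
value OCTA 42
@end smallexample
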
@item LOCAL
@cindex assembler directive LOCAL, MMIX
@cindex pseudo-op LOCAL, MMIX
@cindex MMIX assembler directive LOCAL
@cindex MMIX pseudo-op LOCAL

@anchor{MMIX-local}
Example:
@smallexample
 LOCAL external_symbol
 LOCAL 42
 .local asymbol
@end smallexample

This directive generates a link-time assertion that the operand
does not correspond to a global register.  The operand is an expression
that at link-time resolves to a register symbol or a number.  A number is
treated as the register having that number.  There is one restriction on
the use of this directive: it must be placed in a section with contents,
code or data.

@item IS
@cindex assembler directive IS, MMIX
@cindex pseudo-op IS, MMIX
@cindex MMIX assembler directive IS
@cindex MMIX pseudo-op IS

@anchor{MMIX-is}
The @code{IS} directive:
@smallexample
asymbol IS an_expression
@end smallexample
sets the symbol @samp{asymbol} to @samp{an_expression}.  A symbol may not
be set more than once using this directive.  Local labels may be set using
this directive, for example:
@smallexample
5H IS @@+4
@end smallexample

@item GREG
@cindex assembler directive GREG, MMIX
@cindex pseudo-op GREG, MMIX
@cindex MMIX assembler directive GREG
@cindex MMIX pseudo-op GREG

@anchor{MMIX-greg}
This directive reserves a global register, gives it an initial value and
optionally gives it a symbolic name.  Some examples:

@smallexample
areg GREG
breg GREG data_value
     GREG data_buffer
     .greg creg, another_data_value
@end smallexample

The symbolic register name can be used in place of a (non-special)
register.  If a value isn't provided, it defaults to zero.  Unless the
option @samp{--no-merge-gregs} is specified, non-zero registers allocated
with this directive may be eliminated by @code{@value{AS}}, with another
register with the same value used in its place.
Any of the instructions
@samp{CSWAP},
@samp{GO},
@samp{LDA},
@samp{LDBU},
@samp{LDB},
@samp{LDHT},
@samp{LDOU},
@samp{LDO},
@samp{LDSF},
@samp{LDTU},
@samp{LDT},
@samp{LDUNC},
@samp{LDVTS},
@samp{LDWU},
@samp{LDW},
@samp{PREGO},
@samp{PRELD},
@samp{PREST},
@samp{PUSHGO},
@samp{STBU},
@samp{STB},
@samp{STCO},
@samp{STHT},
@samp{STOU},
@samp{STSF},
@samp{STTU},
@samp{STT},
@samp{STUNC},
@samp{SYNCD},
@samp{SYNCID},
can have a value nearby @anchor{GREG-base}an initial value in place of its
second and third operands.  Here, ``nearby'' is defined as within the
range 0@dots{}255 from the initial value of such an allocated register.

@smallexample
buffer1 BYTE 0,0,0,0,0
buffer2 BYTE 0,0,0,0,0
 @dots{}
 GREG buffer1
 LDOU $42,buffer2
@end smallexample
In the example above, the @samp{Y} field of the @code{LDOUI} instruction
(LDOU with a constant Z) will be replaced with the global register
allocated for @samp{buffer1}, and the @samp{Z} field will have the value
5, the offset from @samp{buffer1} to @samp{buffer2}.  The result is
equivalent to this code:
@smallexample
buffer1 BYTE 0,0,0,0,0
buffer2 BYTE 0,0,0,0,0
 @dots{}
tmpreg GREG buffer1
 LDOU $42,tmpreg,(buffer2-buffer1)
@end smallexample

Global registers allocated with this directive are allocated in order
higher-to-lower within a file.  Other than that, the exact order of
register allocation and elimination is undefined.  For example, the order
is undefined when more than one file with such directives is linked
together.  With the options @samp{-x} and @samp{--linker-allocated-gregs},
@samp{GREG} directives for two-operand cases like the one mentioned above
can be omitted.  Sufficient global registers will then be allocated by the
linker.

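As a sketch of that last case (the symbol @samp{buffer} is hypothetical),
the following fragment has no @samp{GREG} directive and is therefore
expected to assemble only when @samp{-x} or
@samp{--linker-allocated-gregs} is in effect, leaving the base register to
be allocated by the linker:
@smallexample
buffer BYTE 0,0,0,0,0,0,0,0
 LDOU $42,buffer
@end smallexample
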
@item BYTE
@cindex assembler directive BYTE, MMIX
@cindex pseudo-op BYTE, MMIX
@cindex MMIX assembler directive BYTE
@cindex MMIX pseudo-op BYTE

@anchor{MMIX-byte}
The @samp{BYTE} directive takes a series of operands separated by commas.
If an operand is a string (@pxref{Strings}), each character of that string
is emitted as a byte.  Other operands must be constant expressions without
forward references, in the range 0@dots{}255.  If you need operands having
expressions with forward references, use @samp{.byte} (@pxref{Byte}).  An
operand can be omitted, defaulting to a zero value.

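For instance, the following line (the label @samp{msg} is hypothetical)
mixes a string with constant operands and emits seven bytes: the five
characters of the string, a newline character and a terminating zero:
@smallexample
msg BYTE "Hello",#a,0
@end smallexample
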
@item WYDE
@itemx TETRA
@itemx OCTA
@cindex assembler directive WYDE, MMIX
@cindex pseudo-op WYDE, MMIX
@cindex MMIX assembler directive WYDE
@cindex MMIX pseudo-op WYDE
@cindex assembler directive TETRA, MMIX
@cindex pseudo-op TETRA, MMIX
@cindex MMIX assembler directive TETRA
@cindex MMIX pseudo-op TETRA
@cindex assembler directive OCTA, MMIX
@cindex pseudo-op OCTA, MMIX
@cindex MMIX assembler directive OCTA
@cindex MMIX pseudo-op OCTA

@anchor{MMIX-constants}
The directives @samp{WYDE}, @samp{TETRA} and @samp{OCTA} emit constants
two, four and eight bytes in size respectively.  Before anything else
happens for the directive, the current location is aligned to the
respective constant-size boundary.  If a label is defined at the beginning
of the line, its value will be that after the alignment.  A single operand
can be omitted, defaulting to a zero value emitted for the directive.
Operands can be expressed as strings (@pxref{Strings}), in which case each
character in the string is emitted as a separate constant of the size
indicated by the directive.

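A sketch of the alignment and string rules, with hypothetical labels
@samp{w} and @samp{t}, and assuming the @samp{BYTE} lands on an even
address:
@smallexample
 BYTE 1
w WYDE #1234
t TETRA "ab"
@end smallexample
Here @samp{w} is the address after alignment to a two-byte boundary, so
one byte of padding separates the @samp{BYTE} from the @samp{WYDE}.  The
@samp{TETRA} line is first aligned to a four-byte boundary and then emits
two four-byte constants, one for each character of the string.
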
@item PREFIX
@cindex assembler directive PREFIX, MMIX
@cindex pseudo-op PREFIX, MMIX
@cindex MMIX assembler directive PREFIX
@cindex MMIX pseudo-op PREFIX

@anchor{MMIX-prefix}
The @samp{PREFIX} directive sets a symbol name prefix to be prepended to
all symbols (except local symbols, @pxref{MMIX-Symbols}), that are not
prefixed with @samp{:}, until the next @samp{PREFIX} directive.  Such
prefixes accumulate.  For example,
@smallexample
 PREFIX a
 PREFIX b
c IS 0
@end smallexample
defines a symbol @samp{abc} with the value 0.

@item BSPEC
@itemx ESPEC
@cindex assembler directive BSPEC, MMIX
@cindex pseudo-op BSPEC, MMIX
@cindex MMIX assembler directive BSPEC
@cindex MMIX pseudo-op BSPEC
@cindex assembler directive ESPEC, MMIX
@cindex pseudo-op ESPEC, MMIX
@cindex MMIX assembler directive ESPEC
@cindex MMIX pseudo-op ESPEC

@anchor{MMIX-spec}
A pair of @samp{BSPEC} and @samp{ESPEC} directives delimit a section of
special contents (without specified semantics).  Example:
@smallexample
 BSPEC 42
 TETRA 1,2,3
 ESPEC
@end smallexample
The single operand to @samp{BSPEC} must be a number in the range
0@dots{}255.  The @samp{BSPEC} number 80 is used by the GNU binutils
implementation.
@end table

@node MMIX-mmixal
@section Differences to @code{mmixal}
@cindex mmixal differences
@cindex differences, mmixal

The binutils @code{@value{AS}} and @code{@value{LD}} combination has a few
differences in function compared to @code{mmixal} (@pxref{mmixsite}).

The replacement of a symbol with a GREG-allocated register
(@pxref{GREG-base}) is not handled exactly the same way in
@code{@value{AS}} as in @code{mmixal}.  This is apparent in the
@code{mmixal} example file @code{inout.mms}, where different registers
with different offsets, eventually yielding the same address, are used in
the first instruction.  This type of difference should however not affect
the function of any program unless it has specific assumptions about the
allocated register number.

Line numbers (in the @samp{mmo} object format) are currently not
supported.

Expression operator precedence is not that of mmixal: operator precedence
is that of the C programming language.  It's recommended to use
parentheses to explicitly specify the wanted operator precedence whenever
more than one type of operator is used.

The serialize unary operator @code{&}, the fractional division operator
@samp{//}, the logical not operator @code{!} and the modulus operator
@samp{%} are not available.

Symbols are not global by default, unless the option
@samp{--globalize-symbols} is passed.  Use the @samp{.global} directive to
globalize symbols (@pxref{Global}).

Operand syntax is a bit stricter with @code{@value{AS}} than
@code{mmixal}.  For example, you can't say @code{addu 1,2,3}; instead you
must write @code{addu $1,$2,3}.

You can't LOC to a lower address than those already visited
(i.e. ``backwards'').

A LOC directive must come before any emitted code.

Predefined symbols are visible as file-local symbols after use.  (This
applies to the ELF file; the linked mmo file has no notion of a file-local
symbol.)

Some mapping of constant expressions to sections in LOC expressions is
attempted, but that functionality is easily confused and should be avoided
unless compatibility with @code{mmixal} is required.  A LOC expression to
@samp{0x2000000000000000} or higher maps to the @samp{.data} section, and
lower addresses map to the @samp{.text} section (@pxref{MMIX-loc}).

The code and data areas are each contiguous.  Sparse programs with
far-away LOC directives will take up the same amount of space as a
contiguous program with zeros filled in the gaps between the LOC
directives.  If you need sparse programs, you might try to get the desired
effect with a linker script and by splitting up the code parts into
sections (@pxref{Section}).  Assembly code for this, to be compatible with
@code{mmixal}, would look something like:
@smallexample
 .if 0
 LOC away_expression
 .else
 .section away,"ax"
 .endif
@end smallexample
@code{@value{AS}} will not execute the LOC directive and @code{mmixal}
ignores the lines beginning with @code{.}.  This construct can be used
generally to help compatibility.

Symbols can't be defined twice, not even to the same value.

Instruction mnemonics are recognized case-insensitively, though the
@samp{IS} and @samp{GREG} pseudo-operations must be specified in
upper-case characters.

There's no Unicode support.

The following is a list of programs in @samp{mmix.tar.gz}, available at
@url{http://www-cs-faculty.stanford.edu/~knuth/mmix-news.html}, last
checked with the version dated 2001-08-25 (md5sum
c393470cfc86fac040487d22d2bf0172) that assemble with @code{mmixal} but do
not assemble with @code{@value{AS}}:

@table @code
@item silly.mms
LOC to a previous address.
@item sim.mms
Redefines symbol @samp{Done}.
@item test.mms
Uses the serial operator @samp{&}.
@end table
