1<!doctype linuxdoc system>      <!-- -*- text-mode -*- -->
2
3<article>
4<title>ld65 Users Guide
5<author><url url="mailto:uz@cc65.org" name="Ullrich von Bassewitz">
6
7<abstract>
8The ld65 linker combines object files into an executable file. ld65 is highly
9configurable and uses configuration files for high flexibility.
10</abstract>
11
12<!-- Table of contents -->
13<toc>
14
15<!-- Begin the document -->
16
17<sect>Overview<p>
18
19The ld65 linker combines several object modules created by the ca65
20assembler, producing an executable file. The object modules may be read
21from a library created by the ar65 archiver (this is somewhat faster and
22more convenient). The linker was designed to be as flexible as possible.
23It complements the features that are built into the ca65 macroassembler:
24
25<itemize>
26
27<item>  Accept any number of segments to form an executable module.
28
29<item>  Resolve arbitrary expressions stored in the object files.
30
31<item>  In case of errors, use the meta information stored in the object files
32        to produce helpful error messages. In case of undefined symbols,
33        expression range errors, or symbol type mismatches, ld65 is able to
34        tell you the exact location in the original assembler source, where
35        the symbol was referenced.
36
37<item>  Flexible output. The output of ld65 is highly configurable by a config
38        file. Some more-common platforms are supported by default configurations
39        that may be activated by naming the target system. The output
40        generation was designed with different output formats in mind, so
41        adding other formats shouldn't be a great problem.
42
43</itemize>
44
45
46<sect>Usage<p>
47
48
49<sect1>Command-line option overview<p>
50
51The linker is called as follows:
52
53<tscreen><verb>
54---------------------------------------------------------------------------
55Usage: ld65 [options] module ...
56Short options:
57  -(                    Start a library group
58  -)                    End a library group
59  -C name               Use linker config file
60  -D sym=val            Define a symbol
61  -L path               Specify a library search path
62  -Ln name              Create a VICE label file
63  -S addr               Set the default start address
64  -V                    Print the linker version
65  -h                    Help (this text)
66  -m name               Create a map file
67  -o name               Name the default output file
68  -t sys                Set the target system
69  -u sym                Force an import of symbol 'sym'
70  -v                    Verbose mode
71  -vm                   Verbose map file
72
73Long options:
74  --allow-multiple-definition   Allow multiple definitions
75  --cfg-path path               Specify a config file search path
76  --config name                 Use linker config file
77  --dbgfile name                Generate debug information
78  --define sym=val              Define a symbol
79  --end-group                   End a library group
80  --force-import sym            Force an import of symbol 'sym'
81  --help                        Help (this text)
82  --large-alignment             Don't warn about large alignments
83  --lib file                    Link this library
84  --lib-path path               Specify a library search path
85  --mapfile name                Create a map file
86  --module-id id                Specify a module id
87  --obj file                    Link this object file
88  --obj-path path               Specify an object file search path
89  --start-addr addr             Set the default start address
90  --start-group                 Start a library group
91  --target sys                  Set the target system
92  --version                     Print the linker version
93---------------------------------------------------------------------------
94</verb></tscreen>
95
96
97<sect1>Command-line options in detail<p>
98
99Here is a description of all of the command-line options:
100
101<descrip>
102
103  <tag><tt>--allow-multiple-definition</tt></tag>
104
105  Normally when a global symbol is defined multiple times, ld65 will
106  issue an error and not create the output file. This option lets it
107  silently ignore this fact and continue. The first definition of a
108  symbol will be used.
109
110
111  <label id="option--start-group">
112  <tag><tt>-(, --start-group</tt></tag>
113
114  Start a library group. The libraries specified within a group are searched
115  multiple times to resolve crossreferences within the libraries. Normally,
116  crossreferences are resolved only within a library, that is the library is
117  searched multiple times. Libraries specified later on the command line
118  cannot reference otherwise unreferenced symbols in libraries specified
119  earlier, because the linker has already handled them. Library groups are
120  a solution for this problem, because the linker will search repeatedly
121  through all libraries specified in the group, until all possible open
122  symbol references have been satisfied.
123
124
125  <tag><tt>-), --end-group</tt></tag>
126
127  End a library group. See the explanation of the <tt><ref
128  id="option--start-group" name="--start-group"></tt> option.
129
130
131  <tag><tt>-h, --help</tt></tag>
132
133  Print the short option summary shown above.
134
135
136  <label id="option-m">
137  <tag><tt>-m name, --mapfile name</tt></tag>
138
  This option (which needs an argument that will be used as a filename for
  the generated map file) will cause the linker to generate a map file.
  The map file contains a detailed overview of the modules used, the
  sizes of the different segments, and a table containing exported
  symbols.
144
145
146  <label id="option-o">
147  <tag><tt>-o name</tt></tag>
148
149  The -o switch is used to give the name of the default output file.
150  Depending on your output configuration, this name <em/might not/ be used as the
151  name for the output file. However, for the default configurations, this
152  name is used for the output file name.
153
154
155  <label id="option-t">
156  <tag><tt>-t sys, --target sys</tt></tag>
157
158  The argument for the -t switch is the name of the target system. Since this
159  switch will activate a default configuration, it may not be used together
160  with the <tt><ref id="option-C" name="-C"></tt> option. The following target
161  systems are currently supported:
162
163  <itemize>
164  <item>none
165  <item>module
166  <item>apple2
167  <item>apple2enh
168  <item>atari2600
169  <item>atari
170  <item>atarixl
171  <item>atmos
172  <item>c16 (works also for the c116 with memory up to 32K)
173  <item>c64
174  <item>c128
175  <item>cbm510 (CBM-II series with 40-column video)
176  <item>cbm610 (all CBM series-II computers with 80-column video)
177  <item>geos-apple
178  <item>geos-cbm
179  <item>lunix
180  <item>lynx
181  <item>nes
182  <item>pet (all CBM PET systems except the 2001)
183  <item>plus4
184  <item>sim6502
185  <item>sim65c02
186  <item>supervision
187  <item>telestrat
188  <item>vic20
189  </itemize>
190
  There are a few more targets defined, but none of them is actually
  supported.
193
194
195  <tag><tt>-u sym[:addrsize], --force-import sym[:addrsize]</tt></tag>
196
  Force an import of a symbol. While object files are always linked to the
  output file, regardless of whether there are any references, object modules
  from libraries are linked in only if the module can satisfy an import.
  The <tt/--force-import/ option may be used to add a reference to a symbol and
  as a result force linkage of the module that exports the identifier.
202
203  The name of the symbol may optionally be followed by a colon and an address-size
204  specifier. If no address size is specified, the default address size
205  for the target machine is used.
206
207  Please note that the symbol name needs to have the internal representation,
208  meaning you have to prepend an underscore for C identifiers.
209
210
211  <label id="option-v">
212  <tag><tt>-v, --verbose</tt></tag>
213
214  Using the -v option, you may enable more output that may help you to
215  locate problems. If an undefined symbol is encountered, -v causes the
216  linker to print a detailed list of the references (that is, source file
217  and line) for this symbol.
218
219
220  <tag><tt>-vm</tt></tag>
221
222  Must be used in conjunction with <tt><ref id="option-m" name="-m"></tt>
223  (generate map file). Normally the map file will not include empty segments
224  and sections, or unreferenced symbols. Using this option, you can force the
225  linker to include all that information into the map file.  Also, it will
226  include a second <tt/Exports/ list.  The first list is sorted by name;
227  the second one is sorted by value.
228
229
230  <label id="option-C">
231  <tag><tt>-C</tt></tag>
232
  This gives the name of an output config file to use. See <ref
  id="config-files" name="Configuration files"> for more information about
  config files. -C may not be used together with <tt><ref
  id="option-t" name="-t"></tt>.
236
237
238  <label id="option-D">
239  <tag><tt>-D sym=value, --define sym=value</tt></tag>
240
  This option allows you to define an external symbol on the command line. The
  value may start with a '&dollar;' sign or with <tt/0x/ for hexadecimal values;
  otherwise, a leading zero denotes octal values. See also <ref
  id="SYMBOLS" name="the SYMBOLS section"> in the configuration file.
245
246
247  <label id="option--lib-path">
248  <tag><tt>-L path, --lib-path path</tt></tag>
249
250  Specify a library search path. This option may be used more than once. It
251  adds a directory to the search path for library files. Libraries specified
252  without a path are searched in the current directory, in the list of
253  directories specified using <tt/--lib-path/, in directories given by
254  environment variables, and in a built-in default directory.
255
256
257  <tag><tt>-Ln</tt></tag>
258
259  This option allows you to create a file that contains all global labels and
260  may be loaded into the VICE emulator using the <tt/ll/ (load label) command
261  or into the Oricutron emulator using the <tt/sl/ (symbols load) command. You
262  may use this to debug your code with VICE. Note: Older versions had some
263  bugs in the label code. If you have problems, please get the latest <url
264  url="http://vice-emu.sourceforge.net" name="VICE"> version.
265
266
267  <label id="option-S">
268  <tag><tt>-S addr, --start-addr addr</tt></tag>
269
  Using -S you may define the default starting address. Whether and how this
  address is used depends on the config file in use. For the default
  configurations, only the "none", "apple2" and "apple2enh" systems honor an
  explicit start address; all other default configs provide their own.
274
275
276  <tag><tt>-V, --version</tt></tag>
277
278  This option prints the version number of the linker. If you send any
279  suggestions or bugfixes, please include this number.
280
281
282  <label id="option--cfg-path">
283  <tag><tt>--cfg-path path</tt></tag>
284
285  Specify a config file search path. This option may be used more than once.
286  It adds a directory to the search path for config files. A config file given
287  with the <tt><ref id="option-C" name="-C"></tt> option that has no path in
288  its name is searched in the current directory, in the list of directories
289  specified using <tt/--cfg-path/, in directories given by environment variables,
290  and in a built-in default directory.
291
292
293  <label id="option--dbgfile">
294  <tag><tt>--dbgfile name</tt></tag>
295
296  Specify an output file for debug information. Available information will be
297  written to this file. Using the <tt/-g/ option for the compiler and assembler
298  will increase the amount of information available. Please note that debug
299  information generation is currently being developed, so the format of the
300  file and its contents are subject to change without further notice.
301
302  <label id="option--large-alignment">
303  <tag><tt>--large-alignment</tt></tag>
304
305  Disable warnings about a large combined alignment. See the discussion of the
306  <tt/.ALIGN/ directive in the ca65 Users Guide for further information.
307
308
309  <tag><tt>--lib file</tt></tag>
310
311  Links a library to the output. Use this command-line option instead of just
312  naming the library file, if the linker is not able to determine the file
313  type because of an unusual extension.
314
315
316  <tag><tt>--obj file</tt></tag>
317
318  Links an object file to the output. Use this command-line option instead
319  of just naming the object file, if the linker is not able to determine the
320  file type because of an unusual extension.
321
322
323  <label id="option--obj-path">
324  <tag><tt>--obj-path path</tt></tag>
325
326  Specify an object file search path. This option may be used more than once.
327  It adds a directory to the search path for object files. An object file
328  passed to the linker that has no path in its name is searched in the current
329  directory, in the list of directories specified using <tt/--obj-path/, in
330  directories given by environment variables, and in a built-in default directory.
331
332</descrip>
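
As a rough sketch of how several of these options combine, consider the two
command lines below. All file names and the symbol <tt/_hook/ are hypothetical
and serve only as an illustration:

<tscreen><verb>
        # Custom config, named output file, and a verbose map file
        ld65 -C my.cfg -o prog -m prog.map -vm crt0.o main.o my.lib

        # Force linkage of the library module that exports _hook, and let
        # a.lib and b.lib resolve references to each other
        ld65 -C my.cfg -o prog -u _hook main.o --start-group a.lib b.lib --end-group
</verb></tscreen>

In the second command, <tt/_hook/ stands for a C identifier <tt/hook/, written
with the leading underscore of its internal representation. The library group
is only needed if the two libraries are assumed to reference each other.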
333
334
335
336<sect>Search paths<p>
337
338Starting with version 2.10, there are now several search-path lists for files needed
339by the linker: one for libraries, one for object files, and one for config
340files.
341
342
343<sect1>Library search path<p>
344
The library search-path list contains, in this order:
346
347<enum>
348<item>The current directory.
349<item>Any directory added with the <tt><ref id="option--lib-path"
350      name="--lib-path"></tt> option on the command line.
351<item>The value of the environment variable <tt/LD65_LIB/ if it is defined.
352<item>A subdirectory named <tt/lib/ of the directory defined in the environment
353      variable <tt/CC65_HOME/, if it is defined.
354<item>An optionally compiled-in library path.
355</enum>
356
357
358<sect1>Object file search path<p>
359
The object file search-path list contains, in this order:
361
362<enum>
363<item>The current directory.
364<item>Any directory added with the <tt><ref id="option--obj-path"
365      name="--obj-path"></tt> option on the command line.
366<item>The value of the environment variable <tt/LD65_OBJ/ if it is defined.
367<item>A subdirectory named <tt/obj/ of the directory defined in the environment
368      variable <tt/CC65_HOME/, if it is defined.
369<item>An optionally compiled-in directory.
370</enum>
371
372
373<sect1>Config file search path<p>
374
The config file search-path list contains, in this order:
376
377<enum>
378<item>The current directory.
379<item>Any directory added with the <tt><ref id="option--cfg-path"
380      name="--cfg-path"></tt> option on the command line.
381<item>The value of the environment variable <tt/LD65_CFG/ if it is defined.
382<item>A subdirectory named <tt/cfg/ of the directory defined in the environment
383      variable <tt/CC65_HOME/, if it is defined.
384<item>An optionally compiled-in directory.
385</enum>
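
For illustration, assume a Bourne-style shell, a cc65 installation under the
hypothetical directory <tt>/opt/cc65</tt>, and a C64 library named
<tt/c64.lib/. The environment variable could then be set like this:

<tscreen><verb>
        # Hypothetical installation directory
        export CC65_HOME=/opt/cc65

        # c64.lib is given without a path, so the library search path applies
        ld65 -t c64 -o hello hello.o c64.lib
</verb></tscreen>

Following the lists above, <tt/c64.lib/ would be looked up in the current
directory first and, if it is not found there and no other directories are
configured, in <tt>/opt/cc65/lib</tt> before any compiled-in directory.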
386
387
388
389<sect>Detailed workings<p>
390
391The linker does several things when combining object modules:
392
393First, the command line is parsed from left to right. For each object file
394encountered (object files are recognized by a magic word in the header, so
395the linker does not care about the name), imported and exported
396identifiers are read from the file and inserted in a table. If a library
397name is given (libraries are also recognized by a magic word, there are no
398special naming conventions), all modules in the library are checked if an
399export from this module would satisfy an import from other modules. All
400modules where this is the case are marked. If duplicate identifiers are
401found, the linker issues warnings.
402
403That procedure (parsing and reading from left to right) does mean that a
404library may only satisfy references for object modules (given directly or from
405a library) named <em/before/ that library. With the command line
406
407<tscreen><verb>
408        ld65 crt0.o clib.lib test.o
409</verb></tscreen>
410
the module <tt/test.o/ must not contain references to modules in the library
<tt/clib.lib/.  If it does, you have to change the order of the modules
on the command line:
414
415<tscreen><verb>
416        ld65 crt0.o test.o clib.lib
417</verb></tscreen>
418
Step two is to read the configuration file, assign start addresses
for the segments, and define any linker symbols (see <ref id="config-files"
name="Configuration files">).
422
423After that, the linker is ready to produce an output file. Before doing that,
424it checks its data for consistency. That is, it checks for unresolved
425externals (if the output format is not relocatable) and for symbol type
426mismatches (for example a zero-page symbol is imported by a module as an absolute
427symbol).
428
Step four is to write the actual target files. In this step, the linker will
resolve any expressions contained in the segment data. Circular references are
also detected in this step (a symbol may have a circular reference that goes
unnoticed if the symbol is not used).
433
434Step five is to output a map file with a detailed list of all modules,
435segments and symbols encountered.
436
437And, last step, if you give the <tt><ref id="option-v" name="-v"></tt> switch
438twice, you get a dump of the segment data. However, this may be quite
439unreadable if you're not a developer. :-)
440
441
442
443<sect>Configuration files<label id="config-files"><p>
444
445Configuration files are used to describe the layout of the output file(s). Two
446major topics are covered in a config file: The memory layout of the target
447architecture, and the assignment of segments to memory areas. In addition,
448several other attributes may be specified.
449
450Case is ignored for keywords, that is, section or attribute names, but it is
451<em/not/ ignored for names and strings.
452
453
454
455<sect1>Memory areas<p>
456
457Memory areas are specified in a <tt/MEMORY/ section. Let's have a look at an
458example (this one describes the usable memory layout of the C64):
459
460<tscreen><verb>
461        MEMORY {
462            RAM1:  start = $0800, size = $9800;
463            ROM1:  start = $A000, size = $2000;
464            RAM2:  start = $C000, size = $1000;
465            ROM2:  start = $E000, size = $2000;
466        }
467</verb></tscreen>
468
469As you can see, there are two RAM areas and two ROM areas. The names
470(before the colon) are arbitrary names that must start with a letter, with
471the remaining characters being letters or digits. The names of the memory
472areas are used when assigning segments. As mentioned above, case is
473significant for those names.
474
The syntax above is used in all sections of the config file. The name
(<tt/ROM1/ etc.) is said to be an identifier; the remaining tokens up to the
semicolon specify attributes for this identifier. You may use the equal sign
to assign values to attributes, and you may use a comma to separate
attributes, but you may also leave both out. You <em/must/, however, use a
semicolon to mark the end of the attributes for one identifier. The section
above could also have looked like this:
482
483<tscreen><verb>
484        # Start of memory section
485        MEMORY
486        {
487            RAM1:
488                start $0800
489                size $9800;
490            ROM1:
491                start $A000
492                size $2000;
493            RAM2:
494                start $C000
495                size $1000;
496            ROM2:
497                start $E000
498                size $2000;
499        }
500</verb></tscreen>
501
There are of course more attributes for a memory section than just start and
size. Start and size are mandatory attributes, which means that each memory area
defined <em/must/ have these attributes given (the linker will check that). I
will cover other attributes later. As you may have noticed, I've used a
comment in the example above. Comments start with a hash mark ('#'); the
remainder of the line is ignored if this character is found.
508
509
510<sect1>Segments<p>
511
512Let's assume you have written a program for your trusty old C64, and you would
513like to run it. For testing purposes, it should run in the <tt/RAM/ area. So
514we will start to assign segments to memory sections in the <tt/SEGMENTS/
515section:
516
517<tscreen><verb>
518        SEGMENTS {
519            CODE:   load = RAM1, type = ro;
520            RODATA: load = RAM1, type = ro;
521            DATA:   load = RAM1, type = rw;
522            BSS:    load = RAM1, type = bss, define = yes;
523        }
524</verb></tscreen>
525
What we are doing here is telling the linker that all segments go into the
<tt/RAM1/ memory area in the order specified in the <tt/SEGMENTS/ section. So
the linker will first write the <tt/CODE/ segment, then the <tt/RODATA/
segment, then the <tt/DATA/ segment - but it will not write the <tt/BSS/
segment. Why? This is where the segment type comes in: For each segment
specified, you may also specify a segment type. There are five possible
segment types:
532
533<tscreen><verb>
534        ro          means readonly
535        rw          means read/write
536        bss         means that this is an uninitialized segment
537        zp          a zeropage segment
538        overwrite   a segment that overwrites (parts of) another one
539
540</verb></tscreen>
541
So, because we specified that the segment with the name BSS is of type bss,
the linker knows that this is uninitialized data, and will not write it to an
output file. This is an important point: For the assembler, the <tt/BSS/
segment has no special meaning. You specify which segments have the bss
attribute when linking. This approach is much more flexible than having one
fixed bss segment, and is a result of the design decision to support an
arbitrary segment count.
549
If you specify "<tt/type = bss/" for a segment, the linker will make sure that
this segment contains only uninitialized data (that is, zeroes), and issue
a warning if this is not the case.
553
For a <tt/bss/ type segment to be useful, it must be cleared somehow by your
program (this usually happens in the startup code - for example the startup
code for cc65-generated programs takes care of that). But how does your
code know where the segment starts, and how big it is? The linker is able to
provide that information, but you must request it. This is what we're doing with
the "<tt/define = yes/" attribute in the <tt/BSS/ definition. For each
segment where this attribute is true, the linker will export three symbols:
561
562<tscreen><verb>
563        __NAME_LOAD__   This is set to the address where the
564                        segment is loaded.
565        __NAME_RUN__    This is set to the run address of the
566                        segment. We will cover run addresses
567                        later.
568        __NAME_SIZE__   This is set to the segment size.
569</verb></tscreen>
570
Replace <tt/NAME/ by the name of the segment; in the example above, this would
be <tt/BSS/. These symbols may be accessed by your code when imported using
the <tt>.IMPORT</tt> directive.
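
The following ca65 fragment is a minimal sketch of how startup code might use
these symbols to clear the <tt/BSS/ segment. It is not the routine used by the
cc65 runtime. The names <tt/ptr/ and <tt/clearbss/ are hypothetical, and the
sketch assumes that the linker config also places a <tt/ZEROPAGE/ segment in
zero page:

<tscreen><verb>
        .import __BSS_RUN__, __BSS_SIZE__

        .zeropage
ptr:    .res    2                       ; hypothetical zero-page pointer

        .code
clearbss:
        lda     #.lobyte(__BSS_RUN__)   ; ptr = start of BSS
        sta     ptr
        lda     #.hibyte(__BSS_RUN__)
        sta     ptr+1
        lda     #0                      ; fill value
        tay                             ; Y = 0
        ldx     #.hibyte(__BSS_SIZE__)  ; number of full 256-byte pages
        beq     tail
page:   sta     (ptr),y                 ; clear one full page
        iny
        bne     page
        inc     ptr+1                   ; advance to the next page
        dex
        bne     page
tail:   cpy     #.lobyte(__BSS_SIZE__)  ; clear the remaining bytes
        beq     done
        sta     (ptr),y
        iny
        bne     tail
done:   rts
</verb></tscreen>

The run address is used here because that is where the program accesses the
segment. For a segment without a separate run area, load and run address are
the same.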
574
575Now, as we've configured the linker to write the first three segments and
576create symbols for the last one, there's only one question left: Where does
577the linker put the data? It would be very convenient to have the data in a
578file, wouldn't it?
579
580<sect1>Output files<p>
581
582We don't have any files specified above, and indeed, this is not needed in a
583simple configuration like the one above. There is an additional attribute
584"file" that may be specified for a memory area, that gives a file name to
585write the area data into. If there is no file name given, the linker will
586assign the default file name. This is "a.out" or the one given with the
587<tt><ref id="option-o" name="-o"></tt> option on the command line. Since the
588default behaviour is OK for our purposes, I did not use the attribute in the
589example above. Let's have a look at it now.
590
The "file" attribute (the keyword may also be written as "FILE" if you like
that better) takes a string enclosed in double quotes ('&dquot;') that specifies the
file where the data is written. You may specify the same file several times;
in that case the data for all memory areas having this file name is written
into this file, in the order of the memory areas defined in the <tt/MEMORY/
section. Let's specify some file names in the <tt/MEMORY/ section used above:
597
598<tscreen><verb>
599        MEMORY {
600            RAM1:  start = $0800, size = $9800, file = %O;
601            ROM1:  start = $A000, size = $2000, file = "rom1.bin";
602            RAM2:  start = $C000, size = $1000, file = %O;
603            ROM2:  start = $E000, size = $2000, file = "rom2.bin";
604        }
605</verb></tscreen>
606
The <tt/%O/ used here is a way to specify the default behaviour explicitly:
<tt/%O/ is replaced by a string (including the quotes) that contains the
default output name, that is, "a.out" or the name specified with the <tt><ref
id="option-o" name="-o"></tt> option on the command line. Into this file, the
linker will first write any segments that go into <tt/RAM1/, and will then
append the segments for <tt/RAM2/, because the memory areas are given in this
order. So, for the RAM areas, nothing has really changed.
614
615We've not used the ROM areas, but we will do that below, so we give the file
616names here. Segments that go into <tt/ROM1/ will be written to a file named
617"rom1.bin", and segments that go into <tt/ROM2/ will be written to a file
618named "rom2.bin". The name given on the command line is ignored in both cases.
619
Assigning an empty file name for a memory area will discard the data written
to it. This is useful if the memory area has segments assigned that are empty
(for example because they are of type bss). In that case, the linker would
otherwise create an empty output file. Assigning an empty file name to that
memory area suppresses it.
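
As a small sketch, assume a hypothetical zero-page area <tt/ZP/ that only
holds segments without initialized data. An empty file name keeps the linker
from producing a file for it, while <tt/RAM1/ still goes to the default output
file:

<tscreen><verb>
        MEMORY {
            # Hypothetical zero-page area, no output file wanted
            ZP:    start = $0002, size = $001A, file = "";
            RAM1:  start = $0800, size = $9800, file = %O;
        }
</verb></tscreen>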
625
626The <tt/%O/ sequence is also allowed inside a string. So using
627
628<tscreen><verb>
629        MEMORY {
630            ROM1:  start = $A000, size = $2000, file = "%O-1.bin";
631            ROM2:  start = $E000, size = $2000, file = "%O-2.bin";
632        }
633</verb></tscreen>
634
635would write two files that start with the name of the output file specified on
636the command line, with "-1.bin" and "-2.bin" appended respectively. Because
637'%' is used as an escape char, the sequence "%%" has to be used if a single
638percent sign is required.
639
640<sect1>OVERWRITE segments<p>
641
642There are situations when you may wish to overwrite some part (or parts) of a
643segment with another one. Perhaps you are modifying an OS ROM that has its
644public subroutines at fixed, well-known addresses, and you want to prevent them
645from shifting to other locations in memory if your changed code takes less
646space. Or you are updating a block of code available in binary-only form with
647fixes that are scattered in various places. Generally, whenever you want to
648minimize disturbance to an existing code brought on by your updates, OVERWRITE
649segments are worth considering.
650
651Here is an example:
652
653<tscreen><verb>
654MEMORY {
655    RAM: file = "", start = $6000, size = $2000, type=rw;
656    ROM: file = %O, start = $8000, size = $8000, type=ro;
657}
658</verb></tscreen>
659
660Nothing unusual so far, just two memory blocks - one RAM, one ROM. Now let's
661look at the segment configuration:
662
663<tscreen><verb>
664SEGMENTS {
665    RAM:       load = RAM, type = bss;
666    ORIGINAL:  load = ROM, type = ro;
667    FASTCOPY:  load = ROM, start=$9000, type = overwrite;
668    JMPPATCH1: load = ROM, start=$f7e8, type = overwrite;
669    DEBUG:     load = ROM, start=$8000, type = overwrite;
670    VERSION:   load = ROM, start=$e5b7, type = overwrite;
671}
672</verb></tscreen>
673
The segment named ORIGINAL contains the original code, disassembled or provided
in binary form (i.e. using the <tt/.INCBIN/ directive; see the <tt/ca65/ assembler
document).  The four subsequent segments will be relocated to the addresses
specified by their "start" attributes ("offset" can also be used) and will then
overwrite whatever was at these locations in the ORIGINAL segment. In the end,
the resulting binary output file will thus contain the original data with the
exception of four sequences starting at $9000, $f7e8, $8000 and $e5b7, which
will contain the code from their respective segments. How long these sequences
will be depends on the lengths of the corresponding segments - they can even
overlap, so think about what you're doing.
684
685Finally, note that OVERWRITE segments should be the final segments loaded to a
686particular memory area, and that they need at least one of "start" or "offset"
687attributes specified.
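
On the assembler side, the sources for such a configuration might look roughly
like the following sketch. The file name <tt/original.rom/ and the contents of
the patch segments are hypothetical, and only some of the segments from the
config above are shown:

<tscreen><verb>
        .segment "ORIGINAL"
        .incbin "original.rom"          ; hypothetical ROM image

        .segment "FASTCOPY"             ; placed at $9000 by the config
fastcopy:
        rts                             ; placeholder for the new routine

        .segment "JMPPATCH1"            ; placed at $f7e8 by the config
        jmp     fastcopy                ; redirect the old entry point

        .segment "VERSION"              ; placed at $e5b7 by the config
        .byte   "V2.1"                  ; hypothetical version string
</verb></tscreen>

Each patch overwrites exactly as many bytes as its segment contains, so the
<tt/jmp/ above would replace three bytes of the original code at $f7e8.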
688
689<sect1>LOAD and RUN addresses (ROMable code)<p>
690
Let us now look at a more complex example. Say you've successfully tested
your new "Super Operating System" (SOS for short) for the C64, and you
will now go and replace the ROMs by your own code. When doing that, you
face a new problem: If the code runs in RAM, we need not care about
read/write data. But now, if the code is in ROM, we must care about it.
Remember the default segments (you may of course specify your own):
697
698<tscreen><verb>
699        CODE            read-only code
700        RODATA          read-only data
701        DATA            read/write data
702        BSS             uninitialized data, read/write
703</verb></tscreen>
704
Since <tt/BSS/ is not initialized, we need not care about it now, but what
about <tt/DATA/? <tt/DATA/ contains initialized data, that is, data that was
explicitly assigned a value. And your program will rely on these values on
startup. Since there's no way to remember the contents of the data segment
other than storing it in one of the ROMs, we have to put it there. But
unfortunately, ROM is not writable, so we have to copy it into RAM before
running the actual code.
712
713The linker won't copy the data from ROM into RAM for you (this must be done by
714the startup code of your program), but it has some features that will help you
715in this process.
716
717First, you may not only specify a "<tt/load/" attribute for a segment, but
718also a "<tt/run/" attribute. The "<tt/load/" attribute is mandatory, and, if
719you don't specify a "<tt/run/" attribute, the linker assumes that load area
720and run area are the same. We will use this feature for our data area:
721
722<tscreen><verb>
723        SEGMENTS {
724            CODE:   load = ROM1, type = ro;
725            RODATA: load = ROM2, type = ro;
726            DATA:   load = ROM2, run = RAM2, type = rw, define = yes;
727            BSS:    load = RAM2, type = bss, define = yes;
728        }
729</verb></tscreen>
730
731Let's have a closer look at this <tt/SEGMENTS/ section. We specify that the
732<tt/CODE/ segment goes into <tt/ROM1/ (the one at $A000). The readonly data
733goes into <tt/ROM2/. Read/write data will be loaded into <tt/ROM2/ but is run
734in <tt/RAM2/. That means that all references to labels in the <tt/DATA/
735segment are relocated to be in <tt/RAM2/, but the segment is written to
<tt/ROM2/. All your startup code has to do is to copy the data from its
location in <tt/ROM2/ to the final location in <tt/RAM2/.
738
So, how do you know where the data is located? This is the second point
where you get help from the linker. Remember the "<tt/define/" attribute?
Since we have set this attribute to <tt/yes/, the linker will define three
external symbols for the data segment that may be accessed from your code:
743
744<tscreen><verb>
745        __DATA_LOAD__   This is set to the address where the segment
746                        is loaded, in this case, it is an address in
747                        ROM2.
748        __DATA_RUN__    This is set to the run address of the segment,
749                        in this case, it is an address in RAM2.
750        __DATA_SIZE__   This is set to the segment size.
751</verb></tscreen>
752
So, what your startup code must do is copy <tt/__DATA_SIZE__/ bytes from
<tt/__DATA_LOAD__/ to <tt/__DATA_RUN__/ before any other routines are called.
All references to labels in the <tt/DATA/ segment are relocated to <tt/RAM2/
by the linker, so things will work properly.
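
As a rough illustration, the copy could be written in ca65 assembler as
follows. This is only a minimal sketch, not a library routine: the names
<tt/copy_data/, <tt/src/ and <tt/dst/ are made up for this example, and the
code assumes that the configuration also provides a <tt/ZEROPAGE/ segment for
the two pointers.

<tscreen><verb>
        .import __DATA_LOAD__, __DATA_RUN__, __DATA_SIZE__

        .zeropage
src:    .res    2               ; hypothetical source pointer (ROM2)
dst:    .res    2               ; hypothetical destination pointer (RAM2)

        .code
copy_data:
        lda     #<__DATA_LOAD__
        sta     src
        lda     #>__DATA_LOAD__
        sta     src+1
        lda     #<__DATA_RUN__
        sta     dst
        lda     #>__DATA_RUN__
        sta     dst+1

        ldx     #>__DATA_SIZE__ ; number of full 256-byte pages
        beq     partial
        ldy     #0
page:   lda     (src),y
        sta     (dst),y
        iny
        bne     page            ; copy one full page
        inc     src+1
        inc     dst+1
        dex
        bne     page            ; next page (Y is zero again here)

partial:
        ldx     #<__DATA_SIZE__ ; bytes left in the last, partial page
        beq     done
        ldy     #0
rest:   lda     (src),y
        sta     (dst),y
        iny
        dex
        bne     rest
done:   rts
</verb></tscreen>

The routine has to run before any code that reads or writes variables in the
<tt/DATA/ segment.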
757
There's a library subroutine called <tt/copydata/ (in a module named
<tt/copydata.s/) that may be used to do the actual copying. Be sure to have a
look at its inner workings before using it!
761
762
763<sect1>Other MEMORY area attributes<p>
764
765There are some other attributes not covered above. Before starting the
766reference section, I will discuss the remaining things here.
767
You may also request symbol definitions for memory areas. This may be
useful for things like a software stack or an I/O area.
770
771<tscreen><verb>
772        MEMORY {
773            STACK:  start = $C000, size = $1000, define = yes;
774        }
775</verb></tscreen>
776
777This will define some external symbols that may be used in your code when
778imported using the <tt>.IMPORT</tt> directive:
779
780<tscreen><verb>
781        __STACK_START__         This is set to the start of the memory
782                                area, $C000 in this example.
783        __STACK_SIZE__          The size of the area, here $1000.
784        __STACK_LAST__          This is NOT the same as START+SIZE.
785                                Instead, it is defined as the first
786                                address that is not used by data. If we
787                                don't define any segments for this area,
788                                the value will be the same as START.
789        __STACK_FILEOFFS__      The binary offset in the output file. This
790                                is not defined for relocatable output file
791                                formats (o65).
792</verb></tscreen>
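
For instance, startup code might use these symbols to initialize a software
stack pointer to the top of the area. The following is only a sketch: the
names <tt/init_stack/ and <tt/stackptr/ are hypothetical, and a
<tt/ZEROPAGE/ segment is assumed to exist in the configuration.

<tscreen><verb>
        .import __STACK_START__, __STACK_SIZE__

        .zeropage
stackptr:
        .res    2               ; hypothetical software stack pointer

        .code
init_stack:
        ; Point to the first address above the area ($D000 in the example)
        lda     #<(__STACK_START__ + __STACK_SIZE__)
        sta     stackptr
        lda     #>(__STACK_START__ + __STACK_SIZE__)
        sta     stackptr+1
        rts
</verb></tscreen>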
793
A memory area may also have a type. Valid types are
795
796<tscreen><verb>
797        ro      for readonly memory
798        rw      for read/write memory.
799</verb></tscreen>
800
The linker will ensure that no segment marked as read/write or bss is put
into a memory area that is marked as read-only.
803
804Unused memory in a memory area may be filled. Use the "<tt/fill = yes/"
805attribute to request this. The default value to fill unused space is zero. If
806you don't like this, you may specify a byte value that is used to fill these
807areas with the "<tt/fillval/" attribute. If there is no "<tt/fillval/"
808attribute for the segment, the "<tt/fillval/" attribute of the memory area (or
809its default) is used instead. This means that the value may also be used to
810fill unfilled areas generated by the assembler's <tt/.ALIGN/ and <tt/.RES/
811directives.
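
A memory area that combines these attributes might look like the following
sketch. The start address and size of <tt/ROM2/ are assumptions chosen for
illustration, as is the fill value.

<tscreen><verb>
        MEMORY {
            # Read-only area, padded to its full size with $FF
            ROM2:   start = $E000, size = $2000, type = ro,
                    fill = yes, fillval = $FF;
        }
</verb></tscreen>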
812
813The symbol <tt/%S/ may be used to access the default start address (that is,
814the one defined in <ref id="FEATURES" name="the FEATURES section">, or the
815value given on the command line with the <tt><ref id="option-S" name="-S"></tt>
816option).
817
818To support systems with banked memory, a special attribute named <tt/bank/ is
819available. The attribute value is an arbitrary 32-bit integer. The assembler
820has a builtin function named <tt/.BANK/ which may be used with an argument
821that has a segment reference (for example a symbol). The result of this
822function is the value of the bank attribute for the run memory area of the
823segment.
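
The following sketch shows how the two sides might fit together. The memory
area names, the segment name <tt/BANK1/ and the label <tt/banked_func/ are
hypothetical; see the ca65 manual for the exact description of <tt/.BANK/.

<tscreen><verb>
        MEMORY {
            # Two ROM banks mapped to the same address window
            ROMBANK1: start = $8000, size = $4000, bank = 1;
            ROMBANK2: start = $8000, size = $4000, bank = 2;
        }
        SEGMENTS {
            BANK1:  load = ROMBANK1, type = ro;
            BANK2:  load = ROMBANK2, type = ro;
        }
</verb></tscreen>

<tscreen><verb>
        .segment "BANK1"
banked_func:
        rts

        .code
        ; Bank attribute of the run memory area of banked_func's segment;
        ; with the config above this would be 1
        lda     #<.bank(banked_func)
</verb></tscreen>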
824
825
826<sect1>Other SEGMENT attributes<p>
827
828Segments may be aligned to some memory boundary. Specify "<tt/align = num/" to
829request this feature. To align all segments on a page boundary, use
830
831<tscreen><verb>
832        SEGMENTS {
833            CODE:   load = ROM1, type = ro, align = $100;
834            RODATA: load = ROM2, type = ro, align = $100;
835            DATA:   load = ROM2, run = RAM2, type = rw, define = yes,
836                    align = $100;
837            BSS:    load = RAM2, type = bss, define = yes, align = $100;
838        }
839</verb></tscreen>
840
If an alignment is requested, the linker will add enough space to the output
file so that the new segment starts at an address that is divisible by the
given number without a remainder. All addresses are adjusted accordingly. To
fill the unused space, bytes of zero are used, or, if the memory area has a
"<tt/fillval/" attribute, that value. Alignment is always needed if you have
used the <tt/.ALIGN/ command in the assembler. The alignment of a segment
must be equal to or greater than the alignment used in the <tt/.ALIGN/ command.
The linker will check that, and issue a warning if the alignment of a segment
is lower than the alignment requested in an <tt/.ALIGN/ command of one of the
modules making up this segment.
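
On the assembler side, a module that relies on the page alignment configured
above might look like this sketch (the label <tt/table/ is made up for the
example):

<tscreen><verb>
        .segment "RODATA"
        .align  $100            ; needs align = $100 or more on RODATA
table:  .res    256             ; page-aligned, so indexing never crosses a page
</verb></tscreen>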
851
For a given segment you may also specify a fixed offset into a memory area or
a fixed start address. Use this if you want the code to run at a specific
address (a prominent case is the interrupt vector table, which must go at
address $FFFA). Only one of <tt/ALIGN/, <tt/OFFSET/ or <tt/START/ may be
specified. If the directive creates empty space, it will be filled with zero,
or with the value specified with the "<tt/fillval/" attribute if one is given.
The linker will warn you if it is not possible to put the code at the
specified offset (this may happen if other segments in this area are too
large). Here's an example:
861
862<tscreen><verb>
863        SEGMENTS {
864            VECTORS: load = ROM2, type = ro, start = $FFFA;
865        }
866</verb></tscreen>
867
868or (for the segment definitions from above)
869
870<tscreen><verb>
871        SEGMENTS {
872            VECTORS: load = ROM2, type = ro, offset = $1FFA;
873        }
874</verb></tscreen>
875
876The "<tt/align/", "<tt/start/" and "<tt/offset/" attributes change placement
877of the segment in the run memory area, because this is what is usually
878desired. If load and run memory areas are equal (which is the case if only the
879load memory area has been specified), the attributes will also work. There is
880also an "<tt/align_load/" attribute that may be used to align the start of the
881segment in the load memory area, in case different load and run areas have
882been specified. There are no special attributes to set start or offset for
883just the load memory area.
884
A "<tt/fillval/" attribute may not only be specified for a memory area, but
also for a segment. The value must be an integer between 0 and 255. It is used
as the fill value for space reserved by the assembler's <tt/.ALIGN/ and <tt/.RES/
commands. It is also used as the fill value for space between sections (a
section is the part of a segment that comes from one object file) caused by
alignment, but not for space that precedes the first section.
891
To suppress the warning that the linker issues when it encounters a segment
that is not found in any of the input files, use "<tt/optional=yes/" as an
additional segment attribute. Be careful when using this attribute, because a
missing segment may be a sign of a problem, and if you suppress the warning,
nothing is left to tell you about it.
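
As a short sketch, both attributes could be used like this. The segment name
<tt/EXTRA/ and the fill value are assumptions made for the example.

<tscreen><verb>
        SEGMENTS {
            # Gaps from .ALIGN and .RES inside CODE are filled with $EA
            CODE:   load = ROM1, type = ro, fillval = $EA;
            # No warning if none of the modules contributes to EXTRA
            EXTRA:  load = ROM1, type = ro, optional = yes;
        }
</verb></tscreen>
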
897
898<sect1>The FILES section<p>
899
900The <tt/FILES/ section is used to support other formats than straight binary
901(which is the default, so binary output files do not need an explicit entry
902in the <tt/FILES/ section).
903
The <tt/FILES/ section lists the output files. The only attribute of each
file is its format. Assigning binary format to the default output file would
look like this:
907
908<tscreen><verb>
909        FILES {
910            %O: format = bin;
911        }
912</verb></tscreen>
913
There are two other available formats. One is the o65 format specified by Andre
Fachat (see the <url url="http://www.6502.org/users/andre/o65/fileformat.html"
name="6502 binary relocation format specification">). It is defined like this:
917
918<tscreen><verb>
919        FILES {
920            %O: format = o65;
921        }
922</verb></tscreen>
923
The other available format is the Atari (xex) segmented file format. This is
the standard format used by the file managers of Atari DOS 2.0 and later on
the Atari 8-bit computers. It is defined like this:
927
928<tscreen><verb>
929        FILES {
930            %O: format = atari;
931        }
932</verb></tscreen>
933
934In the Atari segmented file format, the linker will write each <tt/MEMORY/ area
935as a new segment, including a header with the start and end address.
936
The necessary o65 or Atari attributes are defined in a special section labeled
<ref id="FORMAT" name="FORMATS">.
939
940
941
<sect1>The FORMATS section<label id="FORMAT"><p>

The <tt/FORMATS/ section is used to describe file formats. The default (binary)
format currently has no attributes, so, while it may be listed in this
section, the attribute list is empty. The second supported format,
<url url="http://www.6502.org/users/andre/o65/fileformat.html" name="o65">,
has several attributes that may be defined here.
949
950<tscreen><verb>
951    FORMATS {
952        o65: os = lunix, version = 0, type = small,
953             import = LUNIXKERNEL,
954             export = _main;
955    }
956</verb></tscreen>
957
958The Atari file format has two attributes:
959
960<descrip>
961
962  <tag><tt>RUNAD = symbol</tt></tag>
963
  Specify a symbol as the run address of the binary. The loader will call this
  address after the whole file has been loaded into memory. If the attribute is
  omitted, no run address is included in the file.
967
968  <tag><tt>INITAD = memory_area : symbol</tt></tag>
969
970  Specify a symbol as the initialization address for the given memory area.
971  The binary loader will call this address just after the memory area is loaded
972  into memory, before continuing loading the rest of the file.
973
974</descrip>
975
976
977<tscreen><verb>
978    FORMATS {
979        atari: runad = _start;
980    }
981</verb></tscreen>
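
Both attributes might be combined as in the following sketch. It assumes that
<tt/INITAD/ is written exactly in the <tt/memory_area : symbol/ form given
above; the memory area <tt/LOADER/ and the symbol <tt/_init/ are hypothetical.

<tscreen><verb>
    FORMATS {
        # Call _init once LOADER is in memory, run _start after the full load
        atari: runad  = _start,
               initad = LOADER: _init;
    }
</verb></tscreen>
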
982
983
984<sect1>The FEATURES section<label id="FEATURES"><p>
985
986In addition to the <tt/MEMORY/ and <tt/SEGMENTS/ sections described above, the
987linker has features that may be enabled by an additional section labeled
988<tt/FEATURES/.
989
990
991<sect2>The CONDES feature<p>
992
993<tt/CONDES/ is used to tell the linker to emit module constructor/destructor
994tables.
995
996<tscreen><verb>
997        FEATURES {
998            CONDES: segment = RODATA,
999                    type = constructor,
1000                    label = __CONSTRUCTOR_TABLE__,
1001                    count = __CONSTRUCTOR_COUNT__;
1002        }
1003</verb></tscreen>
1004
1005The <tt/CONDES/ feature has several attributes:
1006
1007<descrip>
1008
1009  <tag><tt>segment</tt></tag>
1010
1011  This attribute tells the linker into which segment the table should be
1012  placed. If the segment does not exist, it is created.
1013
1014
1015  <tag><tt>type</tt></tag>
1016
1017  Describes the type of the routines to place in the table. Type may be one of
1018  the predefined types <tt/constructor/, <tt/destructor/, <tt/interruptor/, or
1019  a numeric value between 0 and 6.
1020
1021
1022  <tag><tt>label</tt></tag>
1023
1024  This specifies the label to use for the table. The label points to the start
1025  of the table in memory and may be used from within user-written code.
1026
1027
1028  <tag><tt>count</tt></tag>
1029
1030  This is an optional attribute. If specified, an additional symbol is defined
1031  by the linker using the given name. The value of this symbol is the number
1032  of entries (<em/not/ bytes) in the table. While this attribute is optional,
1033  it is often useful to define it.
1034
1035
1036  <tag><tt>order</tt></tag>
1037
  An optional attribute that takes one of the keywords <tt/increasing/ or
  <tt/decreasing/ as an argument. Specifies the sorting order of the entries
  within the table. The default is <tt/increasing/, which means that the
  entries are sorted with increasing priority (the first entry has the lowest
  priority). "Priority" is the priority specified when declaring a symbol as
  <tt/.CONDES/ with the assembler; higher values mean higher priority. You may
  change this behaviour by specifying <tt/decreasing/ as the argument, in
  which case the order of entries is reversed.
1046
1047  Please note that the order of entries with equal priority is undefined.
1048
1049  <tag><tt>import</tt></tag>
1050
1051  This attribute defines a valid symbol name, that is added as an import
1052  to the modules defining a constructor/destructor of the given type.
1053  This can be used to force linkage of a module if this module exports the
1054  requested symbol.
1055
1056</descrip>
1057
1058Without specifying the <tt/CONDES/ feature, the linker will not create any
1059tables, even if there are <tt/condes/ entries in the object files.
1060
1061For more information see the <tt/.CONDES/ command in the <url
1062url="ca65.html" name="ca65 manual">.
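
On the assembler side, a module might register a routine for the constructor
table as in the following sketch. The routine name <tt/init_module/ and the
priority value are made up for the example; the ca65 manual has the
authoritative syntax.

<tscreen><verb>
        ; Declare init_module as a constructor with priority 10
        .constructor    init_module, 10

        .code
init_module:
        ; module initialization would go here
        rts
</verb></tscreen>

With the <tt/CONDES/ configuration shown above, the address of
<tt/init_module/ would then become one entry of the table at
<tt/__CONSTRUCTOR_TABLE__/.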
1063
1064
1065<sect2>The STARTADDRESS feature<p>
1066
1067<tt/STARTADDRESS/ is used to set the default value for the start address,
1068which can be referenced by the <tt/%S/ symbol. The builtin default for the
1069linker is &dollar;200.
1070
1071<tscreen><verb>
1072        FEATURES {
1073            # Default start address is $1000
1074            STARTADDRESS:       default = $1000;
1075        }
1076</verb></tscreen>
1077
Please note that order is important: The default start address must be defined
<em/before/ the <tt/%S/ symbol is used in the config file. This usually
means that the <tt/FEATURES/ section has to go at the top of the config file.
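
A config file that respects this order might be laid out like the following
sketch. The memory area name <tt/RAM/ and its size are assumptions made for
illustration.

<tscreen><verb>
        FEATURES {
            STARTADDRESS:   default = $1000;
        }
        MEMORY {
            # %S is the default set above, or the value given with -S
            RAM:    start = %S, size = $8000;
        }
</verb></tscreen>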
1081
1082
1083
1084<sect1>The SYMBOLS section<label id="SYMBOLS"><p>
1085
The configuration file may also be used to define symbols used in the link
stage or to force symbol imports. This is done in the <tt/SYMBOLS/ section. The
symbol name is followed by a colon and symbol attributes.
1089
1090The following symbol attributes are supported:
1091
1092<descrip>
1093
1094  <tag><tt>addrsize</tt></tag>
1095
1096  The <tt/addrsize/ attribute specifies the address size of the symbol and
1097  may be one of
1098<itemize>
1099    <item><tt/zp/, <tt/zeropage/ or <tt/direct/
1100    <item><tt/abs/, <tt/absolute/ or <tt/near/
1101    <item><tt/far/
1102    <item><tt/long/ or <tt/dword/.
1103</itemize>
1104
1105Without this attribute, the default address size is <tt/abs/.
1106
1107  <tag><tt>type</tt></tag>
1108
  This attribute is mandatory. Its value is one of <tt/export/, <tt/import/ or
  <tt/weak/. <tt/export/ means that the symbol is defined and exported from
  the linker config. <tt/import/ means that an import is generated for this
  symbol, possibly forcing a module that exports this symbol to be included
  in the output. <tt/weak/ is similar to <tt/export/. However, the symbol is
  only defined if it is not defined elsewhere.
1115
1116  <tag><tt>value</tt></tag>
1117
1118  This must only be given for symbols of type <tt/export/ or <tt/weak/. It
1119  defines the value of the symbol and may be an expression.
1120
1121</descrip>
1122
1123The following example defines the stack size for an application, but allows
1124the programmer to override the value by specifying <tt/--define
1125__STACKSIZE__=xxx/ on the command line.
1126
1127<tscreen><verb>
1128        SYMBOLS {
1129            # Define the stack size for the application
1130            __STACKSIZE__:  type = weak, value = $800;
1131        }
1132</verb></tscreen>
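
The other two symbol types could be used as in the next sketch. Both symbol
names and the value are purely illustrative.

<tscreen><verb>
        SYMBOLS {
            # Pull in whichever module exports this symbol
            __EXEHDR__:     type = import;
            # Always defined by the config, visible to all modules
            __HIMEM__:      type = export, value = $C000, addrsize = abs;
        }
</verb></tscreen>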
1133
1134
1135
1136<sect>Special segments<p>
1137
The builtin config files contain segments that have a special meaning for
the compiler and the libraries that come with it. If you replace the builtin
config files, you will need the following information.
1141
1142<sect1>ONCE<p>
1143
The ONCE segment is used for initialization code that runs only once, before
execution reaches main(), provided that the program runs in RAM. On severely
memory-constrained systems you may, for example, add the ONCE segment to the
heap.
1148
1149<sect1>LOWCODE<p>
1150
1151For the LOWCODE segment, it is guaranteed that it won't be banked out, so it
1152is reachable at any time by interrupt handlers or similar.
1153
1154<sect1>STARTUP<p>
1155
1156This segment contains the startup code which initializes the C software stack
1157and the libraries. It is placed in its own segment because it needs to be
1158loaded at the lowest possible program address on several platforms.
1159
1160<sect1>ZPSAVE<p>
1161
1162The ZPSAVE segment contains the original values of the zeropage locations used
1163by the ZEROPAGE segment. It is placed in its own segment because it must not be
1164initialized.
1165
1166
1167
1168<sect>Copyright<p>
1169
1170ld65 (and all cc65 binutils) are (C) Copyright 1998-2005 Ullrich von
1171Bassewitz. For usage of the binaries and/or sources the following
1172conditions do apply:
1173
1174This software is provided 'as-is', without any expressed or implied
1175warranty.  In no event will the authors be held liable for any damages
1176arising from the use of this software.
1177
1178Permission is granted to anyone to use this software for any purpose,
1179including commercial applications, and to alter it and redistribute it
1180freely, subject to the following restrictions:
1181
1182<enum>
1183<item>  The origin of this software must not be misrepresented; you must not
1184        claim that you wrote the original software. If you use this software
1185        in a product, an acknowledgment in the product documentation would be
1186        appreciated but is not required.
1187<item>  Altered source versions must be plainly marked as such, and must not
1188        be misrepresented as being the original software.
1189<item>  This notice may not be removed or altered from any source
1190        distribution.
1191</enum>
1192
1193
1194
1195</article>
1196