Linker Script implementation notes and policy
=============================================

LLD implements a large subset of the GNU ld linker script notation. The LLD
implementation policy is to implement linker script features as they are
documented in the ld `manual <https://sourceware.org/binutils/docs/ld/Scripts.html>`_.
We consider it a bug if the LLD implementation does not agree with the manual
and the difference is not mentioned in the exceptions below.

The ld manual is not a complete specification and is not sufficient to build
an implementation. In particular, some features are only defined by the
implementation and have changed over time.

The LLD implementation policy for properties of linker scripts that are not
defined by the documentation is to follow the GNU ld implementation wherever
possible. We reserve the right to make different implementation choices where
it is appropriate for LLD. Intentional deviations will be documented in this
file.

Symbol assignment
~~~~~~~~~~~~~~~~~

A symbol assignment looks like:

::

  symbol = expression;
  symbol += expression;

The first form defines ``symbol``. If ``symbol`` is already defined, it will be
overridden. The second form requires ``symbol`` to be already defined.

For a simple assignment like ``alias = aliasee;``, the ``st_type`` field is
copied from the original symbol. Any arithmetic operation (e.g. ``+ 0``) will
reset ``st_type`` to ``STT_NOTYPE``.

The ``st_size`` field is set to 0.
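
As a minimal sketch of the rules above, assuming ``foo`` is a function
(``STT_FUNC``) defined in an input object file; all symbol names here are
hypothetical:

::

  /* foo is assumed to be an STT_FUNC symbol from an input file. */
  foo_alias = foo;        /* simple assignment: st_type is copied from foo */
  foo_plus0 = foo + 0;    /* arithmetic: st_type is reset to STT_NOTYPE */
  heap_limit = 0x20000;   /* defines (or overrides) heap_limit */
  heap_limit += 0x1000;   /* allowed because heap_limit is already defined */

In every case the resulting symbol has ``st_size`` equal to 0.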

SECTIONS command
~~~~~~~~~~~~~~~~

A ``SECTIONS`` command looks like:

::

  SECTIONS {
    section-command
    section-command
    ...
  } [INSERT [AFTER|BEFORE] anchor_section;]

Each section-command can be a symbol assignment, an output section description,
or an overlay description.

When the ``INSERT`` keyword is present, the ``SECTIONS`` command describes some
output sections which should be inserted after or before the specified anchor
section. The insertion occurs after input sections have been mapped to output
sections but before orphan sections have been processed.
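
For illustration, a script consisting only of the following command places a
hypothetical ``.mydata`` output section after ``.data``. This sketch assumes
that a ``.data`` output section exists to serve as the anchor:

::

  SECTIONS {
    /* .mydata is a made-up section name used only for this example. */
    .mydata : { *(.mydata) }
  } INSERT AFTER .data;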

In the case where no linker script has been provided or every ``SECTIONS``
command is followed by ``INSERT``, LLD applies built-in rules which are similar
to GNU ld's internal linker scripts:

- Align the first section in a ``PT_LOAD`` segment according to
  ``-z noseparate-code``, ``-z separate-code``, or
  ``-z separate-loadable-segments``
- Define ``__bss_start``, ``end``, ``_end``, ``etext``, ``_etext``, ``edata``,
  ``_edata``
- Sort ``.ctors.*``/``.dtors.*``/``.init_array.*``/``.fini_array.*`` and
  PowerPC64 specific ``.toc``
- Place input ``.text.*`` into output ``.text``, and handle certain variants
  (``.text.hot.``, ``.text.unknown.``, ``.text.unlikely.``, etc.) in the
  presence of ``-z keep-text-section-prefix``.

Output section description
~~~~~~~~~~~~~~~~~~~~~~~~~~

The description of an output section looks like:

::

  section [address] [(type)] : [AT(lma)] [ALIGN(section_align)] [SUBALIGN(subsection_align)] {
    output-section-command
    ...
  } [>region] [AT>lma_region] [:phdr ...] [=fillexp] [,]

Output section address
----------------------

When an *OutputSection* *S* has ``address``, LLD will set ``sh_addr`` to
``address``.

The ELF specification says:

  The value of sh_addr must be congruent to 0, modulo the value of sh_addralign.

The presence of ``address`` can leave this condition unsatisfied, in which case
LLD will warn. GNU ld from Binutils 2.35 onwards will reduce ``sh_addralign``
so that ``sh_addr`` is congruent to 0 modulo ``sh_addralign``.
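
A small sketch of how the condition can be violated. The address and the
16-byte input alignment are assumptions chosen for the arithmetic:

::

  SECTIONS {
    /* 0x10008 is a multiple of 8 but not of 16. If some input .text
       section requires 16-byte alignment, then
       sh_addr % sh_addralign = 0x10008 % 16 = 8, which is not 0. */
    .text 0x10008 : { *(.text) }
  }

Under these assumptions LLD would keep ``sh_addr`` at ``0x10008`` and warn,
while GNU ld 2.35 or later would lower ``sh_addralign`` until the congruence
holds.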

When an output section has no input section, GNU ld will eliminate it if it
only contains symbol assignments (e.g. ``.foo : { symbol = 42; }``). LLD will
retain such sections unless all the symbol assignments are unreferenced
``PROVIDE`` assignments.

When an output section has no input section but advances the location counter,
GNU ld sets the ``SHF_WRITE`` flag. LLD sets the ``SHF_WRITE`` flag only if the
preceding output section with non-empty input sections also has the
``SHF_WRITE`` flag.
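
The two cases above can be sketched as follows. The section and symbol names
are hypothetical, and the comments assume that ``.text`` receives input
sections and that nothing references ``bar_marker``:

::

  SECTIONS {
    .text : { *(.text) }

    /* No input sections, only a plain symbol assignment:
       GNU ld eliminates .foo, LLD retains it. */
    .foo : { foo_marker = 42; }

    /* Only an unreferenced PROVIDE assignment: LLD may drop .bar too. */
    .bar : { PROVIDE(bar_marker = 42); }

    /* No input sections, but the location counter advances:
       GNU ld sets SHF_WRITE on .pad. LLD does not here, because the
       preceding output section with input sections (.text) is not
       writable. */
    .pad : { . += 0x100; }
  }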

Output section type
-------------------

When an *OutputSection* *S* has ``(type)``, LLD will set ``sh_type`` or
``sh_flags`` of *S*. ``type`` is one of:

- ``NOLOAD``: set ``sh_type`` to ``SHT_NOBITS``.
- ``COPY``, ``INFO``, ``OVERLAY``: clear the ``SHF_ALLOC`` bit in ``sh_flags``.
- ``TYPE=<value>``: set ``sh_type`` to the specified value. ``<value>`` must be
  an integer or one of ``SHT_PROGBITS``, ``SHT_NOTE``, ``SHT_NOBITS``,
  ``SHT_INIT_ARRAY``, ``SHT_FINI_ARRAY``, ``SHT_PREINIT_ARRAY``.

When ``sh_type`` is specified, it is an error if an input section in *S* has a
different type.
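
A brief sketch of the three kinds of ``(type)``; the section names are
hypothetical and chosen only for illustration:

::

  SECTIONS {
    /* sh_type becomes SHT_NOBITS. */
    .noinit (NOLOAD) : { *(.noinit) }

    /* SHF_ALLOC is cleared in sh_flags. */
    .build_info (INFO) : { *(.build_info) }

    /* sh_type is set explicitly. Every input section matched here is
       assumed to be SHT_NOTE as well; otherwise the link is an error. */
    .note.custom (TYPE=SHT_NOTE) : { *(.note.custom) }
  }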

Output section alignment
------------------------

``sh_addralign`` of an *OutputSection* *S* is the maximum of
``ALIGN(section_align)`` and the maximum alignment of the input sections in
*S*.

When an *OutputSection* *S* has both ``address`` and ``ALIGN(section_align)``,
GNU ld will set ``sh_addralign`` to ``ALIGN(section_align)``.
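
As a worked example of the maximum rule, with alignment values assumed purely
for the arithmetic:

::

  SECTIONS {
    /* If the largest input alignment is 16:
         sh_addralign = max(64, 16) = 64.
       If some input section requires 128:
         sh_addralign = max(64, 128) = 128. */
    .data : ALIGN(64) { *(.data) }
  }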

Output section LMA
------------------

A load address (LMA) can be specified by ``AT(lma)`` or ``AT>lma_region``.

- ``AT(lma)`` specifies the exact load address. If the linker script does not
  have a ``PHDRS`` command, then a new loadable segment will be generated.
- ``AT>lma_region`` specifies the LMA region. The lack of ``AT>lma_region``
  means the default region is used. Note, GNU ld propagates the previous LMA
  memory region when ``address`` is not specified. The LMA is set to the
  current location of the memory region aligned to the section alignment.
  If the linker script does not have a ``PHDRS`` command, a new loadable
  segment will be generated when ``lma_region`` is different from the
  ``lma_region`` of the previous OutputSection.

The two keywords cannot be specified at the same time.

If neither ``AT(lma)`` nor ``AT>lma_region`` is specified:

- If the previous section is also in the default LMA region, and the two
  sections have the same memory region, the difference between the LMA and the
  VMA is computed to be the same as the previous difference.
- Otherwise, the LMA is set to the VMA.
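
The following sketch shows the common case of data that runs from RAM but is
stored in flash. The region names, origins and lengths are assumptions made up
for this example:

::

  MEMORY {
    FLASH (rx)  : ORIGIN = 0x08000000, LENGTH = 256K
    RAM   (rwx) : ORIGIN = 0x20000000, LENGTH = 64K
  }

  SECTIONS {
    /* No AT(lma) or AT>lma_region and no previous section:
       the LMA is set to the VMA. */
    .text   : { *(.text*) } >FLASH

    /* Same memory region as .text and still no explicit LMA:
       the LMA - VMA difference of .text (0) is reused. */
    .rodata : { *(.rodata*) } >FLASH

    /* VMA in RAM, LMA at the current location of FLASH. */
    .data   : { *(.data*) } >RAM AT>FLASH
  }

Because the script has no ``PHDRS`` command and the LMA region of ``.data``
differs from that of ``.rodata``, ``.data`` starts a new loadable segment.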

Overwrite sections
~~~~~~~~~~~~~~~~~~

An ``OVERWRITE_SECTIONS`` command looks like:

::

  OVERWRITE_SECTIONS {
    output-section-description
    output-section-description
    ...
  }

Unlike a ``SECTIONS`` command, ``OVERWRITE_SECTIONS`` does not specify a
section order or suppress the built-in rules.

If an output section described in ``OVERWRITE_SECTIONS`` also appears in a
``SECTIONS`` command, the ``OVERWRITE_SECTIONS`` description wins; otherwise,
the output section will be placed following the usual orphan section placement
rules.

If an output section described in ``OVERWRITE_SECTIONS`` also appears in an
``INSERT [AFTER|BEFORE]`` command, the description will be provided by the
``OVERWRITE_SECTIONS`` command while the insert command still applies
(possibly after orphan section placement). It is recommended to leave the
braces empty (i.e. ``section : {}``) in the insert command, because its
description will be ignored anyway.
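
A short sketch combining the two commands; ``.mydata`` and the
``mydata_start``/``mydata_end`` symbols are hypothetical names:

::

  OVERWRITE_SECTIONS {
    /* This description is the one that takes effect for .mydata. */
    .mydata : { mydata_start = .; *(.mydata) mydata_end = .; }
  }

  /* The braces are left empty because the description above is used;
     the INSERT command still decides where .mydata is placed. */
  SECTIONS { .mydata : {} } INSERT AFTER .data;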

Built-in functions
~~~~~~~~~~~~~~~~~~

``DATA_SEGMENT_RELRO_END(offset, exp)`` defines the end of the ``PT_GNU_RELRO``
segment when ``-z relro`` (default) is in effect. Sections between
``DATA_SEGMENT_ALIGN`` and ``DATA_SEGMENT_RELRO_END`` are considered RELRO.

The typical use case is ``. = DATA_SEGMENT_RELRO_END(0, .);`` followed by
writable but non-RELRO sections. LLD ignores ``offset`` and ``exp`` and aligns
the current location to a max-page-size boundary, ensuring that the next
``PT_LOAD`` segment will not overlap with the ``PT_GNU_RELRO`` segment.

LLD will insert ``.relro_padding`` immediately before the symbol assignment
using ``DATA_SEGMENT_RELRO_END``.
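
A sketch of the typical layout, modelled on the pattern used by GNU ld's
default scripts. The exact set of sections is an assumption and is shortened
for illustration:

::

  SECTIONS {
    . = DATA_SEGMENT_ALIGN(CONSTANT(MAXPAGESIZE), CONSTANT(COMMONPAGESIZE));

    /* Sections from here up to DATA_SEGMENT_RELRO_END are RELRO. */
    .data.rel.ro : { *(.data.rel.ro .data.rel.ro.*) }
    .got         : { *(.got) }

    /* LLD ignores both arguments and aligns the location counter to a
       max-page-size boundary; .relro_padding is inserted just before
       this assignment. */
    . = DATA_SEGMENT_RELRO_END(0, .);

    /* Writable but non-RELRO sections follow. */
    .data : { *(.data .data.*) }
    .bss  : { *(.bss .bss.*) }

    . = DATA_SEGMENT_END(.);
  }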
200