===============
LLVM Extensions
===============

.. contents::
   :local:

.. toctree::
   :hidden:

Introduction
============

This document describes extensions to tools and formats LLVM seeks compatibility
with.

General Assembly Syntax
=======================

C99-style Hexadecimal Floating-point Constants
----------------------------------------------

LLVM's assemblers allow floating-point constants to be written in C99's
hexadecimal format instead of decimal if desired.

.. code-block:: gas

  .section .data
  .float 0x1c2.2ap3

Machine-specific Assembly Syntax
================================

X86/COFF-Dependent
------------------

Relocations
^^^^^^^^^^^

The following additional relocation types are supported:

**@IMGREL** (AT&T syntax only) generates an image-relative relocation that
corresponds to the COFF relocation types ``IMAGE_REL_I386_DIR32NB`` (32-bit) or
``IMAGE_REL_AMD64_ADDR32NB`` (64-bit).

.. code-block:: text

  .text
  fun:
    mov foo@IMGREL(%ebx, %ecx, 4), %eax

  .section .pdata
    .long fun@IMGREL
    .long (fun@imgrel + 0x3F)
    .long $unwind$fun@imgrel

**.secrel32** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_I386_SECREL`` (32-bit) or ``IMAGE_REL_AMD64_SECREL`` (64-bit).

**.secidx** generates a relocation that holds the index of the section that
contains the target. It corresponds to the COFF relocation types
``IMAGE_REL_I386_SECTION`` (32-bit) or ``IMAGE_REL_AMD64_SECTION`` (64-bit).

.. code-block:: none

  .section .debug$S,"rn"
    .long 4
    .long 242
    .long 40
    .secrel32 _function_name + 0
    .secidx   _function_name
    ...

``.linkonce`` Directive
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:

   ``.linkonce [ comdat type ]``

Supported COMDAT types:

``discard``
   Discards duplicate sections with the same COMDAT symbol.
   This is the default if no type is specified.

``one_only``
   If the symbol is defined multiple times, the linker issues an error.

``same_size``
   Duplicates are discarded, but the linker issues an error if any have
   different sizes.

``same_contents``
   Duplicates are discarded, but the linker issues an error if any duplicates
   do not have exactly the same content.

``largest``
   Links the largest section from among the duplicates.

``newest``
   Links the newest section from among the duplicates.

.. code-block:: gas

  .section .text$foo
  .linkonce
    ...

``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

MC supports passing the information in ``.linkonce`` at the end of
``.section``. For example, these two snippets are equivalent:

.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

.. code-block:: gas

  .section secName, "dr"
  .linkonce discard
  .globl Symbol1
  Symbol1:
  .long 1

Note that in the combined form the COMDAT symbol is explicit. This
extension exists to support multiple sections with the same name in
different COMDATs:

.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

  .section secName, "dr", discard, "Symbol2"
  .globl Symbol2
  Symbol2:
  .long 1

In addition to the types allowed with ``.linkonce``, ``.section`` also accepts
``associative``. An associative section is linked only if a certain other
COMDAT section is linked. This other section is indicated by the COMDAT symbol
in this directive. It can be any symbol defined in the associated section, but
is usually the associated section's COMDAT symbol.

   The following restrictions apply to the associated section:

   1. It must be a COMDAT section.
   2. It cannot be another associative COMDAT section.

In the following example the symbol ``sym`` is the COMDAT symbol of ``.foo``
and ``.bar`` is associated with ``.foo``.

.. code-block:: gas

  .section .foo,"bw",discard, "sym"
  .section .bar,"rd",associative, "sym"

MC supports these flags in the COFF ``.section`` directive:

  - ``b``: BSS section (``IMAGE_SCN_CNT_UNINITIALIZED_DATA``)
  - ``d``: Data section (``IMAGE_SCN_CNT_INITIALIZED_DATA``)
  - ``n``: Section is not loaded (``IMAGE_SCN_LNK_REMOVE``)
  - ``r``: Read-only
  - ``s``: Shared section
  - ``w``: Writable
  - ``x``: Executable section
  - ``y``: Not readable
  - ``D``: Discardable (``IMAGE_SCN_MEM_DISCARDABLE``)

These flags are all compatible with gas, with the exception of the ``D`` flag,
which GNU as does not support. For gas compatibility, sections with a name
starting with ".debug" are implicitly discardable.


ARM64/COFF-Dependent
--------------------

Relocations
^^^^^^^^^^^

The following additional symbol variants are supported:

**:secrel_lo12:** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_ARM64_SECREL_LOW12A`` or ``IMAGE_REL_ARM64_SECREL_LOW12L``.

**:secrel_hi12:** generates a relocation that corresponds to the COFF relocation
type ``IMAGE_REL_ARM64_SECREL_HIGH12A``.

.. code-block:: gas

    add x0, x0, :secrel_hi12:symbol
    ldr x0, [x0, :secrel_lo12:symbol]

    add x1, x1, :secrel_hi12:symbol
    add x1, x1, :secrel_lo12:symbol
    ...


ELF-Dependent
-------------

``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

In order to support creating multiple sections with the same name and comdat,
it is possible to add a unique number at the end of the ``.section`` directive.
For example, the following code creates two sections named ``.text``.

.. code-block:: gas

  .section .text,"ax",@progbits,unique,1
  nop

  .section .text,"ax",@progbits,unique,2
  nop


The unique number is not present in the resulting object at all. It is just used
in the assembler to differentiate the sections.

The 'o' flag is mapped to ``SHF_LINK_ORDER``. If it is present, a symbol
must be given that identifies the section to be placed in the
``sh_link`` field.

.. code-block:: gas

  .section .foo,"a",@progbits
  .Ltmp:
  .section .bar,"ao",@progbits,.Ltmp

which is equivalent to just

.. code-block:: gas

  .section .foo,"a",@progbits
  .section .bar,"ao",@progbits,.foo

``.linker-options`` Section (linker options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In order to support passing linker options from the frontend to the linker, a
special section of type ``SHT_LLVM_LINKER_OPTIONS`` is used (usually named
``.linker-options``, though the name is not significant as the section is
identified by its type). The contents of this section are a simple pair-wise
encoding of directives for consideration by the linker. The strings are encoded
as standard null-terminated UTF-8 strings. They are emitted inline to avoid
having the linker traverse the object file to retrieve the value. The linker is
permitted to not honour the option and instead provide a warning/error to the
user that the requested option was not honoured.

The section has type ``SHT_LLVM_LINKER_OPTIONS`` and has the ``SHF_EXCLUDE``
flag to ensure that the section is treated as opaque by linkers which do not
support the feature and is not emitted into the final linked binary.

This would be equivalent to the following raw assembly:

.. code-block:: gas

  .section ".linker-options","e",@llvm_linker_options
  .asciz "option 1"
  .asciz "value 1"
  .asciz "option 2"
  .asciz "value 2"

The following directives are specified:

  - lib

    The parameter identifies a library to be linked against. The library will
    be looked up in the default and any specified library search paths
    (specified to this point).

  - libpath

    The parameter identifies an additional library search path to be considered
    when looking up libraries after the inclusion of this option.

``SHT_LLVM_CALL_GRAPH_PROFILE`` Section (Call Graph Profile)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to pass a call graph profile to the linker which can be
used to optimize the placement of sections. It contains a sequence of
(from symbol, to symbol, weight) tuples.

It shall have a type of ``SHT_LLVM_CALL_GRAPH_PROFILE`` (0x6fff4c02), shall
have the ``SHF_EXCLUDE`` flag set, the ``sh_link`` member shall hold the section
header index of the associated symbol table, and shall have a ``sh_entsize`` of
16. It should be named ``.llvm.call-graph-profile``.

The contents of the section shall be a sequence of ``Elf_CGProfile`` entries.

.. code-block:: c

  typedef struct {
    Elf_Word cgp_from;
    Elf_Word cgp_to;
    Elf_Xword cgp_weight;
  } Elf_CGProfile;

cgp_from
  The symbol index of the source of the edge.

cgp_to
  The symbol index of the destination of the edge.

cgp_weight
  The weight of the edge.

This is represented in assembly as:

.. code-block:: gas

  .cg_profile from, to, 42

``.cg_profile`` directives are processed at the end of the file. It is an error
if either ``from`` or ``to`` is an undefined temporary symbol. If either symbol
is a temporary symbol, then the section symbol is used instead.
If either symbol is undefined, then that symbol is defined as if
``.weak symbol`` has been written at the end of the file. This forces the symbol
to show up in the symbol table.

``SHT_LLVM_ADDRSIG`` Section (address-significance table)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to mark symbols as address-significant, i.e. the address
of the symbol is used in a comparison or leaks outside the translation unit. It
has the same meaning as the absence of the LLVM attributes ``unnamed_addr``
and ``local_unnamed_addr``.

Any sections referred to by symbols that are not marked as address-significant
in any object file may be safely merged by a linker without breaking the
address uniqueness guarantee provided by the C and C++ language standards.

The contents of the section are a sequence of ULEB128-encoded integers
referring to the symbol table indexes of the address-significant symbols.

There are two associated assembly directives:

.. code-block:: gas

  .addrsig

This instructs the assembler to emit an address-significance table. Without
this directive, all symbols are considered address-significant.

.. code-block:: gas

  .addrsig_sym sym

This marks ``sym`` as address-significant.

CodeView-Dependent
------------------

``.cv_file`` Directive
^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_file`` *FileNumber FileName* [ *checksum* ] [ *checksumkind* ]

``.cv_func_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^
Introduces a function ID that can be used with ``.cv_loc``.

Syntax:
  ``.cv_func_id`` *FunctionId*

``.cv_inline_site_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Introduces a function ID that can be used with ``.cv_loc``.
Includes ``inlined at`` source location information for use in the line table
of the caller, whether the caller is a real function or another inlined call
site.

Syntax:
  ``.cv_inline_site_id`` *FunctionId* ``within`` *Function* ``inlined_at`` *FileNumber Line* [ *Column* ]

``.cv_loc`` Directive
^^^^^^^^^^^^^^^^^^^^^
The first number is a file number, which must have been previously assigned
with a ``.file`` directive; the second number is the line number; and the
optional third number is a column position (zero if not specified). The
remaining optional items are ``.loc`` sub-directives.

Syntax:
  ``.cv_loc`` *FunctionId FileNumber* [ *Line* ] [ *Column* ] [ *prologue_end* ] [ ``is_stmt`` *value* ]

``.cv_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_linetable`` *FunctionId* ``,`` *FunctionStart* ``,`` *FunctionEnd*

``.cv_inline_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_inline_linetable`` *PrimaryFunctionId* ``,`` *FileNumber Line FunctionStart FunctionEnd*

``.cv_def_range`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The *GapStart* and *GapEnd* options may be repeated as needed.

Syntax:
  ``.cv_def_range`` *RangeStart RangeEnd* [ *GapStart GapEnd* ] ``,`` *bytes*

``.cv_stringtable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksums`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksumoffset`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_filechecksumoffset`` *FileNumber*

``.cv_fpo_data`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_fpo_data`` *procsym*

Target Specific Behaviour
=========================

X86
---

Relocations
^^^^^^^^^^^

``@ABS8`` can be applied to symbols which appear as immediate operands to
instructions that have an 8-bit immediate form for that operand.
It causes the assembler to use the 8-bit form and an 8-bit relocation (e.g.
``R_386_8`` or ``R_X86_64_8``) for the symbol.

For example:

.. code-block:: gas

  cmpq $foo@ABS8, %rdi

This causes the assembler to select the form of the 64-bit ``cmpq`` instruction
that takes an 8-bit immediate operand that is sign extended to 64 bits, as
opposed to ``cmpq $foo, %rdi`` which takes a 32-bit immediate operand. This
is also not the same as ``cmpb $foo, %dil``, which is an 8-bit comparison.

Windows on ARM
--------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) emits stack probes
in the following fashion:

.. code-block:: gas

  movw r4, #constant
  bl __chkstk
  sub.w sp, sp, r4

However, the direct ``bl`` limits the reach of the call to a 32 MiB window
(±16 MiB). In order to accommodate larger binaries, LLVM supports the use of
``-mcode-model=large`` to allow a 4 GiB range via a slight deviation. It will
generate an indirect jump as follows:

.. code-block:: gas

  movw r4, #constant
  movw r12, :lower16:__chkstk
  movt r12, :upper16:__chkstk
  blx r12
  sub.w sp, sp, r4

Variable Length Arrays
^^^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) does not permit the
emission of Variable Length Arrays (VLAs).

The Windows ARM Itanium ABI extends the base ABI by adding support for emitting
a dynamic stack allocation. When emitting a variable stack allocation, a call
to ``__chkstk`` is emitted unconditionally to ensure that guard pages are set up
properly. The emission of this stack probe is handled similarly to the standard
stack probe emission.

The MSVC environment does not emit code for VLAs currently.
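
As an illustration of the source construct involved, the following C function
allocates a variable length array. This is only a sketch: ``sum_bytes`` is a
hypothetical name and is not part of LLVM. The comment marks the dynamic stack
allocation for which, per the description above, a ``__chkstk`` call would be
expected when targeting the Windows ARM Itanium environment.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <string.h>

  /* Hypothetical example: copies n bytes into a VLA and sums them. */
  size_t sum_bytes(const unsigned char *src, size_t n) {
    assert(src != NULL || n == 0);
    /* Dynamic stack allocation: the size is only known at run time, so this
       is the allocation that would need a stack probe. A VLA must have a
       non-zero length, hence the guard. */
    unsigned char scratch[n ? n : 1];
    if (n != 0)
      memcpy(scratch, src, n);
    size_t total = 0;
    for (size_t i = 0; i < n; ++i)
      total += scratch[i];
    return total;
  }
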

Windows on ARM64
----------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2017) emits stack probes
in the following fashion:

.. code-block:: gas

  mov x15, #constant
  bl __chkstk
  sub sp, sp, x15, lsl #4

However, the direct ``bl`` limits the reach of the call to a 256 MiB window
(±128 MiB). In order to accommodate larger binaries, LLVM supports the use of
``-mcode-model=large`` to allow an 8 GiB (±4 GiB) range via a slight deviation.
It will generate an indirect jump as follows:

.. code-block:: gas

  mov x15, #constant
  adrp x16, __chkstk
  add x16, x16, :lo12:__chkstk
  blr x16
  sub sp, sp, x15, lsl #4
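
The ARM64 sequences above can be related to concrete numbers with a small C
sketch. It assumes, based on the ``lsl #4`` in the final ``sub``, that the
constant placed in ``x15`` is the allocation size in 16-byte units, and it uses
the ±128 MiB figure quoted above for the direct ``bl``. The helper names
``chkstk_units`` and ``fits_direct_bl`` are hypothetical and do not correspond
to any LLVM API.

.. code-block:: c

  #include <assert.h>
  #include <stdbool.h>
  #include <stdint.h>

  /* Assumption: x15 holds the allocation size in 16-byte units (from lsl #4).
     The frame size is rounded up to the next multiple of 16 bytes. */
  uint64_t chkstk_units(uint64_t frame_bytes) {
    assert(frame_bytes <= UINT64_MAX - 15); /* avoid overflow when rounding */
    return (frame_bytes + 15) / 16;
  }

  /* True if a byte displacement lies within the +/-128 MiB window that a
     direct bl can reach; otherwise the indirect sequence would be needed. */
  bool fits_direct_bl(int64_t displacement) {
    const int64_t limit = INT64_C(128) * 1024 * 1024;
    return displacement >= -limit && displacement < limit;
  }

Under these assumptions, a 4096-byte frame gives a constant of 256, and
``sub sp, sp, x15, lsl #4`` then subtracts 256 * 16 = 4096 bytes from ``sp``.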