===============
LLVM Extensions
===============

.. contents::
   :local:

.. toctree::
   :hidden:

Introduction
============

This document describes extensions to tools and formats LLVM seeks compatibility
with.

General Assembly Syntax
=======================

C99-style Hexadecimal Floating-point Constants
----------------------------------------------

LLVM's assemblers allow floating-point constants to be written in C99's
hexadecimal format instead of decimal if desired.

.. code-block:: gas

  .section .data
  .float 0x1c2.2ap3

Machine-specific Assembly Syntax
================================

X86/COFF-Dependent
------------------

Relocations
^^^^^^^^^^^

The following additional relocation types are supported:

**@IMGREL** (AT&T syntax only) generates an image-relative relocation that
corresponds to the COFF relocation types ``IMAGE_REL_I386_DIR32NB`` (32-bit) or
``IMAGE_REL_AMD64_ADDR32NB`` (64-bit).

.. code-block:: text

  .text
  fun:
    mov foo@IMGREL(%ebx, %ecx, 4), %eax

  .section .pdata
    .long fun@IMGREL
    .long (fun@imgrel + 0x3F)
    .long $unwind$fun@imgrel

**.secrel32** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_I386_SECREL`` (32-bit) or ``IMAGE_REL_AMD64_SECREL`` (64-bit).

**.secidx** relocation generates an index of the section that contains
the target. It corresponds to the COFF relocation types
``IMAGE_REL_I386_SECTION`` (32-bit) or ``IMAGE_REL_AMD64_SECTION`` (64-bit).

.. code-block:: none

  .section .debug$S,"rn"
    .long 4
    .long 242
    .long 40
    .secrel32 _function_name + 0
    .secidx   _function_name
    ...

``.linkonce`` Directive
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:

   ``.linkonce [ comdat type ]``

Supported COMDAT types:

``discard``
   Discards duplicate sections with the same COMDAT symbol.
   This is the default if no type is specified.

``one_only``
   If the symbol is defined multiple times, the linker issues an error.

``same_size``
   Duplicates are discarded, but the linker issues an error if any have
   different sizes.

``same_contents``
   Duplicates are discarded, but the linker issues an error if any duplicates
   do not have exactly the same content.

``largest``
   Links the largest section from among the duplicates.

``newest``
   Links the newest section from among the duplicates.


.. code-block:: gas

  .section .text$foo
  .linkonce
  ...

``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

MC supports passing the information in ``.linkonce`` at the end of
``.section``. For example, the following two snippets are equivalent:

.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

.. code-block:: gas

  .section secName, "dr"
  .linkonce discard
  .globl Symbol1
  Symbol1:
  .long 1

Note that in the combined form the COMDAT symbol is explicit. This
extension exists to support multiple sections with the same name in
different COMDATs:


.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

  .section secName, "dr", discard, "Symbol2"
  .globl Symbol2
  Symbol2:
  .long 1

In addition to the types allowed with ``.linkonce``, ``.section`` also accepts
``associative``. The meaning is that the section is linked if a certain other
COMDAT section is linked. This other section is indicated by the comdat symbol
in this directive. It can be any symbol defined in the associated section, but
is usually the associated section's comdat.

  The following restrictions apply to the associated section:

  1. It must be a COMDAT section.
  2. It cannot be another associative COMDAT section.

In the following example the symbol ``sym`` is the comdat symbol of ``.foo``
and ``.bar`` is associated to ``.foo``.

.. code-block:: gas

  .section .foo,"bw",discard, "sym"
  .section .bar,"rd",associative, "sym"

MC supports these flags in the COFF ``.section`` directive:

  - ``b``: BSS section (``IMAGE_SCN_CNT_UNINITIALIZED_DATA``)
  - ``d``: Data section (``IMAGE_SCN_CNT_INITIALIZED_DATA``)
  - ``n``: Section is not loaded (``IMAGE_SCN_LNK_REMOVE``)
  - ``r``: Read-only
  - ``s``: Shared section
  - ``w``: Writable
  - ``x``: Executable section
  - ``y``: Not readable
  - ``D``: Discardable (``IMAGE_SCN_MEM_DISCARDABLE``)

These flags are all compatible with gas, with the exception of the ``D`` flag,
which GNU as does not support. For gas compatibility, sections with a name
starting with ".debug" are implicitly discardable.


ARM64/COFF-Dependent
--------------------

Relocations
^^^^^^^^^^^

The following additional symbol variants are supported:

**:secrel_lo12:** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_ARM64_SECREL_LOW12A`` or ``IMAGE_REL_ARM64_SECREL_LOW12L``.

**:secrel_hi12:** generates a relocation that corresponds to the COFF relocation
type ``IMAGE_REL_ARM64_SECREL_HIGH12A``.

.. code-block:: gas

    add x0, x0, :secrel_hi12:symbol
    ldr x0, [x0, :secrel_lo12:symbol]

    add x1, x1, :secrel_hi12:symbol
    add x1, x1, :secrel_lo12:symbol
    ...


ELF-Dependent
-------------

``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

In order to support creating multiple sections with the same name and comdat,
it is possible to add a unique number at the end of the ``.section`` directive.
For example, the following code creates two sections named ``.text``.

.. code-block:: gas

  .section .text,"ax",@progbits,unique,1
  nop

  .section .text,"ax",@progbits,unique,2
  nop


The unique number is not present in the resulting object at all. It is just used
in the assembler to differentiate the sections.

The 'o' flag is mapped to ``SHF_LINK_ORDER``. If it is present, a symbol
must be given that identifies the section to be placed in the
``sh_link`` field.

.. code-block:: gas

  .section .foo,"a",@progbits
  .Ltmp:
  .section .bar,"ao",@progbits,.Ltmp

which is equivalent to just

.. code-block:: gas

  .section .foo,"a",@progbits
  .section .bar,"ao",@progbits,.foo

``.linker-options`` Section (linker options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In order to support passing linker options from the frontend to the linker, a
special section of type ``SHT_LLVM_LINKER_OPTIONS`` is used (usually named
``.linker-options``, though the name is not significant as the section is
identified by its type). The contents of this section are a simple pair-wise
encoding of directives for consideration by the linker. The strings are encoded
as standard null-terminated UTF-8 strings. They are emitted inline to avoid
having the linker traverse the object file for retrieving the value. The linker
is permitted to not honour the option and instead provide a warning/error to the
user that the requested option was not honoured.

The section has type ``SHT_LLVM_LINKER_OPTIONS`` and has the ``SHF_EXCLUDE``
flag to ensure that the section is treated as opaque by linkers which do not
support the feature and will not be emitted into the final linked binary.

This would be equivalent to the following raw assembly:

.. code-block:: gas

  .section ".linker-options","e",@llvm_linker_options
  .asciz "option 1"
  .asciz "value 1"
  .asciz "option 2"
  .asciz "value 2"

The following directives are specified:

  - lib

    The parameter identifies a library to be linked against. The library will
    be looked up in the default and any specified library search paths
    (specified to this point).

  - libpath

    The parameter identifies an additional library search path to be considered
    when looking up libraries after the inclusion of this option.

``SHT_LLVM_DEPENDENT_LIBRARIES`` Section (Dependent Libraries)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section contains strings specifying libraries to be added to the link by
the linker.

The section should be consumed by the linker and not written to the output.

The strings are encoded as standard null-terminated UTF-8 strings.

For example:

.. code-block:: gas

  .section ".deplibs","MS",@llvm_dependent_libraries,1
  .asciz "library specifier 1"
  .asciz "library specifier 2"

The interpretation of the library specifiers is defined by the consuming linker.

``SHT_LLVM_CALL_GRAPH_PROFILE`` Section (Call Graph Profile)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to pass a call graph profile to the linker which can be
used to optimize the placement of sections. It contains a sequence of
(from symbol, to symbol, weight) tuples.

It shall have a type of ``SHT_LLVM_CALL_GRAPH_PROFILE`` (0x6fff4c02), shall
have the ``SHF_EXCLUDE`` flag set, the ``sh_link`` member shall hold the section
header index of the associated symbol table, and shall have a ``sh_entsize`` of
16. It should be named ``.llvm.call-graph-profile``.

The contents of the section shall be a sequence of ``Elf_CGProfile`` entries.

.. code-block:: c

  typedef struct {
    Elf_Word cgp_from;
    Elf_Word cgp_to;
    Elf_Xword cgp_weight;
  } Elf_CGProfile;

cgp_from
  The symbol index of the source of the edge.

cgp_to
  The symbol index of the destination of the edge.

cgp_weight
  The weight of the edge.

This is represented in assembly as:

.. code-block:: gas

  .cg_profile from, to, 42

``.cg_profile`` directives are processed at the end of the file. It is an error
if either ``from`` or ``to`` is an undefined temporary symbol. If either symbol
is a temporary symbol, then the section symbol is used instead. If either
symbol is undefined, then that symbol is defined as if ``.weak symbol`` has been
written at the end of the file. This forces the symbol to show up in the symbol
table.

``SHT_LLVM_ADDRSIG`` Section (address-significance table)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to mark symbols as address-significant, i.e. the address
of the symbol is used in a comparison or leaks outside the translation unit. It
has the same meaning as the absence of the LLVM attributes ``unnamed_addr``
and ``local_unnamed_addr``.

Any sections referred to by symbols that are not marked as address-significant
in any object file may be safely merged by a linker without breaking the
address uniqueness guarantee provided by the C and C++ language standards.

The contents of the section are a sequence of ULEB128-encoded integers
referring to the symbol table indexes of the address-significant symbols.

There are two associated assembly directives:

.. code-block:: gas

  .addrsig

This instructs the assembler to emit an address-significance table. Without
this directive, all symbols are considered address-significant.

.. code-block:: gas

  .addrsig_sym sym

This marks ``sym`` as address-significant.
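
The ULEB128 encoding used for the address-significance table contents can be
sketched in C as follows. This is only an illustration of the encoding, not
LLVM's implementation; the function names ``encode_uleb128`` and
``decode_uleb128`` are hypothetical.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical helper: encode `value` as ULEB128 into `out`, which must
     have room for at least 10 bytes. Returns the number of bytes written. */
  size_t encode_uleb128(uint64_t value, uint8_t *out) {
    assert(out != NULL);
    size_t n = 0;
    do {
      uint8_t byte = (uint8_t)(value & 0x7f); /* take the low 7 bits */
      value >>= 7;
      if (value != 0)
        byte |= 0x80;                         /* more bytes follow */
      out[n++] = byte;
    } while (value != 0);
    return n;
  }

  /* Hypothetical helper: decode one ULEB128 integer starting at `in` and
     store the number of bytes consumed in `*len`. */
  uint64_t decode_uleb128(const uint8_t *in, size_t *len) {
    assert(in != NULL && len != NULL);
    uint64_t result = 0;
    unsigned shift = 0;
    size_t n = 0;
    uint8_t byte;
    do {
      byte = in[n++];
      result |= (uint64_t)(byte & 0x7f) << shift;
      shift += 7;
    } while ((byte & 0x80) != 0 && shift < 64);
    *len = n;
    return result;
  }

Under this sketch, a symbol table index of 128 would be stored as the two bytes
``0x80 0x01``, while any index below 128 occupies a single byte.
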

``SHT_LLVM_SYMPART`` Section (symbol partition specification)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to mark symbols with the `partition`_ that they
belong to. An ``.llvm_sympart`` section consists of a null-terminated string
specifying the name of the partition followed by a relocation referring to
the symbol that belongs to the partition. It may be constructed as follows:

.. code-block:: gas

  .section ".llvm_sympart","",@llvm_sympart
  .asciz "libpartition.so"
  .word symbol_in_partition

.. _partition: https://lld.llvm.org/Partitions.html

CodeView-Dependent
------------------

``.cv_file`` Directive
^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_file`` *FileNumber FileName* [ *checksum* ] [ *checksumkind* ]

``.cv_func_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^
Introduces a function ID that can be used with ``.cv_loc``.

Syntax:
  ``.cv_func_id`` *FunctionId*

``.cv_inline_site_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Introduces a function ID that can be used with ``.cv_loc``. Includes
``inlined at`` source location information for use in the line table of the
caller, whether the caller is a real function or another inlined call site.

Syntax:
  ``.cv_inline_site_id`` *FunctionId* ``within`` *Function* ``inlined_at`` *FileNumber Line* [ *Column* ]

``.cv_loc`` Directive
^^^^^^^^^^^^^^^^^^^^^
The first number is a file number, which must have been previously assigned
with a ``.file`` directive; the second number is the line number; and,
optionally, the third number is a column position (zero if not specified). The
remaining optional items are ``.loc`` sub-directives.

Syntax:
  ``.cv_loc`` *FunctionId FileNumber* [ *Line* ] [ *Column* ] [ *prologue_end* ] [ ``is_stmt`` *value* ]

``.cv_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_linetable`` *FunctionId* ``,`` *FunctionStart* ``,`` *FunctionEnd*

``.cv_inline_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_inline_linetable`` *PrimaryFunctionId* ``,`` *FileNumber Line FunctionStart FunctionEnd*

``.cv_def_range`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The *GapStart* and *GapEnd* options may be repeated as needed.

Syntax:
  ``.cv_def_range`` *RangeStart RangeEnd* [ *GapStart GapEnd* ] ``,`` *bytes*

``.cv_stringtable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksums`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksumoffset`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_filechecksumoffset`` *FileNumber*

``.cv_fpo_data`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_fpo_data`` *procsym*

Target Specific Behaviour
=========================

X86
---

Relocations
^^^^^^^^^^^

``@ABS8`` can be applied to symbols which appear as immediate operands to
instructions that have an 8-bit immediate form for that operand. It causes
the assembler to use the 8-bit form and an 8-bit relocation (e.g. ``R_386_8``
or ``R_X86_64_8``) for the symbol.

For example:

.. code-block:: gas

  cmpq $foo@ABS8, %rdi

This causes the assembler to select the form of the 64-bit ``cmpq`` instruction
that takes an 8-bit immediate operand that is sign extended to 64 bits, as
opposed to ``cmpq $foo, %rdi`` which takes a 32-bit immediate operand. This
is also not the same as ``cmpb $foo, %dil``, which is an 8-bit comparison.
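
The sign extension performed by the 8-bit immediate form can be modelled with a
small C sketch. The helper names ``sign_extend_imm8`` and ``fits_in_imm8`` are
hypothetical and not part of LLVM; they only illustrate how an encoded byte is
widened to 64 bits and which values survive that round trip unchanged.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Hypothetical helper: widen an encoded 8-bit immediate to 64 bits the way
     a sign-extending imm8 instruction form interprets it. */
  int64_t sign_extend_imm8(uint8_t imm) {
    return (imm & 0x80) ? (int64_t)imm - 256 : (int64_t)imm;
  }

  /* Hypothetical helper: true if `value` is representable as a
     sign-extended 8-bit immediate. */
  bool fits_in_imm8(int64_t value) {
    return value >= INT8_MIN && value <= INT8_MAX;
  }

In this model the byte ``0xff`` compares as -1 and not as 255, which is why a
symbol used with ``@ABS8`` is expected to resolve to a value in the range
-128 to 127.
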

Windows on ARM
--------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) emits stack probes
in the following fashion:

.. code-block:: gas

  movw r4, #constant
  bl __chkstk
  sub.w sp, sp, r4

However, this sequence is limited to a range of 32 MiB (±16 MiB). In order to
accommodate larger binaries, LLVM supports the use of ``-mcmodel=large`` to
allow a 4 GiB range via a slight deviation. It will generate an indirect jump
as follows:

.. code-block:: gas

  movw r4, #constant
  movw r12, :lower16:__chkstk
  movt r12, :upper16:__chkstk
  blx r12
  sub.w sp, sp, r4

Variable Length Arrays
^^^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) does not permit the
emission of Variable Length Arrays (VLAs).

The Windows ARM Itanium ABI extends the base ABI by adding support for emitting
a dynamic stack allocation. When emitting a variable stack allocation, a call
to ``__chkstk`` is emitted unconditionally to ensure that guard pages are set up
properly. This stack probe is emitted in a manner similar to the standard stack
probe emission.

The MSVC environment does not emit code for VLAs currently.

Windows on ARM64
----------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2017) emits stack probes
in the following fashion:

.. code-block:: gas

  mov x15, #constant
  bl __chkstk
  sub sp, sp, x15, lsl #4

However, this sequence is limited to a range of 256 MiB (±128 MiB). In order to
accommodate larger binaries, LLVM supports the use of ``-mcmodel=large`` to
allow an 8 GiB (±4 GiB) range via a slight deviation. It will generate an
indirect jump as follows:

.. code-block:: gas

  mov x15, #constant
  adrp x16, __chkstk
  add x16, x16, :lo12:__chkstk
  blr x16
  sub sp, sp, x15, lsl #4
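
The ±128 MiB limit of the direct ``bl`` form can be checked with a short C
sketch, assuming the standard A64 encoding in which ``bl`` carries a signed
26-bit immediate counted in 4-byte instructions. The function name
``bl_can_reach`` is hypothetical and is not an LLVM API.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Hypothetical helper: can a direct AArch64 `bl` reach a target that lies
     `displacement` bytes away from the branch instruction? */
  bool bl_can_reach(int64_t displacement) {
    const int64_t range = INT64_C(1) << 27; /* 2^27 bytes = 128 MiB */
    if (displacement % 4 != 0)
      return false;                         /* targets are 4-byte aligned */
    return displacement >= -range && displacement < range;
  }

When a check like this could fail for ``__chkstk``, the indirect sequence shown
for ``-mcmodel=large`` is the form to expect.
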