===============
LLVM Extensions
===============

.. contents::
   :local:

.. toctree::
   :hidden:

Introduction
============

This document describes extensions to tools and formats LLVM seeks compatibility
with.

General Assembly Syntax
=======================

C99-style Hexadecimal Floating-point Constants
----------------------------------------------

LLVM's assemblers allow floating-point constants to be written in C99's
hexadecimal format instead of decimal if desired.

.. code-block:: gas

  .section .data
  .float 0x1c2.2ap3

Machine-specific Assembly Syntax
================================

X86/COFF-Dependent
------------------

Relocations
^^^^^^^^^^^

The following additional relocation types are supported:

**@IMGREL** (AT&T syntax only) generates an image-relative relocation that
corresponds to the COFF relocation types ``IMAGE_REL_I386_DIR32NB`` (32-bit) or
``IMAGE_REL_AMD64_ADDR32NB`` (64-bit).

.. code-block:: text

  .text
  fun:
    mov foo@IMGREL(%ebx, %ecx, 4), %eax

  .section .pdata
    .long fun@IMGREL
    .long (fun@imgrel + 0x3F)
    .long $unwind$fun@imgrel

**.secrel32** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_I386_SECREL`` (32-bit) or ``IMAGE_REL_AMD64_SECREL`` (64-bit).

**.secidx** generates a relocation that holds the index of the section that
contains the target. It corresponds to the COFF relocation types
``IMAGE_REL_I386_SECTION`` (32-bit) or ``IMAGE_REL_AMD64_SECTION`` (64-bit).

.. code-block:: none

  .section .debug$S,"rn"
    .long 4
    .long 242
    .long 40
    .secrel32 _function_name + 0
    .secidx   _function_name
    ...

``.linkonce`` Directive
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:

   ``.linkonce [ comdat type ]``

Supported COMDAT types:

``discard``
   Discards duplicate sections with the same COMDAT symbol. This is the default
   if no type is specified.

``one_only``
   If the symbol is defined multiple times, the linker issues an error.

``same_size``
   Duplicates are discarded, but the linker issues an error if any have
   different sizes.

``same_contents``
   Duplicates are discarded, but the linker issues an error if any duplicates
   do not have exactly the same content.

``largest``
   Links the largest section from among the duplicates.

``newest``
   Links the newest section from among the duplicates.


.. code-block:: gas

  .section .text$foo
  .linkonce
    ...

``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

MC supports passing the information in ``.linkonce`` at the end of
``.section``. For example, the following two fragments are equivalent:

.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

.. code-block:: gas

  .section secName, "dr"
  .linkonce discard
  .globl Symbol1
  Symbol1:
  .long 1

Note that in the combined form the COMDAT symbol is explicit. This
extension exists to support multiple sections with the same name in
different COMDATs:


.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

  .section secName, "dr", discard, "Symbol2"
  .globl Symbol2
  Symbol2:
  .long 1

In addition to the types allowed with ``.linkonce``, ``.section`` also accepts
``associative``. An associative section is linked only if a certain other
COMDAT section is linked. This other section is indicated by the COMDAT symbol
in this directive. It can be any symbol defined in the associated section, but
is usually the associated section's COMDAT symbol.

The following restrictions apply to the associated section:

1. It must be a COMDAT section.
2. It cannot be another associative COMDAT section.

In the following example the symbol ``sym`` is the COMDAT symbol of ``.foo``
and ``.bar`` is associated with ``.foo``.

.. code-block:: gas

  .section .foo,"bw",discard, "sym"
  .section .bar,"rd",associative, "sym"

MC supports these flags in the COFF ``.section`` directive:

- ``b``: BSS section (``IMAGE_SCN_CNT_UNINITIALIZED_DATA``)
- ``d``: Data section (``IMAGE_SCN_CNT_INITIALIZED_DATA``)
- ``n``: Section is not loaded (``IMAGE_SCN_LNK_REMOVE``)
- ``r``: Read-only
- ``s``: Shared section
- ``w``: Writable
- ``x``: Executable section
- ``y``: Not readable
- ``D``: Discardable (``IMAGE_SCN_MEM_DISCARDABLE``)

These flags are all compatible with gas, with the exception of the ``D`` flag,
which GNU as does not support. For gas compatibility, sections with a name
starting with ``.debug`` are implicitly discardable.


ARM64/COFF-Dependent
--------------------

Relocations
^^^^^^^^^^^

The following additional symbol variants are supported:

**:secrel_lo12:** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_ARM64_SECREL_LOW12A`` or ``IMAGE_REL_ARM64_SECREL_LOW12L``.

**:secrel_hi12:** generates a relocation that corresponds to the COFF relocation
type ``IMAGE_REL_ARM64_SECREL_HIGH12A``.

.. code-block:: gas

  add x0, x0, :secrel_hi12:symbol
  ldr x0, [x0, :secrel_lo12:symbol]

  add x1, x1, :secrel_hi12:symbol
  add x1, x1, :secrel_lo12:symbol
  ...


ELF-Dependent
-------------

``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

In order to support creating multiple sections with the same name and comdat,
it is possible to add a unique number at the end of the ``.section`` directive.
For example, the following code creates two sections named ``.text``.

.. code-block:: gas

  .section .text,"ax",@progbits,unique,1
  nop

  .section .text,"ax",@progbits,unique,2
  nop


The unique number is not present in the resulting object at all. It is just used
in the assembler to differentiate the sections.

The ``o`` flag is mapped to ``SHF_LINK_ORDER``. If it is present, a symbol
must be given that identifies the section to be placed in the
``sh_link`` field.

.. code-block:: gas

  .section .foo,"a",@progbits
  .Ltmp:
  .section .bar,"ao",@progbits,.Ltmp

which is equivalent to just

.. code-block:: gas

  .section .foo,"a",@progbits
  .section .bar,"ao",@progbits,.foo

``.linker-options`` Section (linker options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In order to support passing linker options from the frontend to the linker, a
special section of type ``SHT_LLVM_LINKER_OPTIONS`` is used (usually named
``.linker-options``, though the name is not significant because the section is
identified by its type). The contents of this section are a simple pair-wise
encoding of directives for consideration by the linker. The strings are encoded
as standard null-terminated UTF-8 strings. They are emitted inline to avoid
having the linker traverse the object file to retrieve the value. The linker is
permitted to not honour the option and instead provide a warning/error to the
user that the requested option was not honoured.

The section has type ``SHT_LLVM_LINKER_OPTIONS`` and has the ``SHF_EXCLUDE``
flag to ensure that the section is treated as opaque by linkers which do not
support the feature and will not be emitted into the final linked binary.

This would be equivalent to the following raw assembly:

.. code-block:: gas

  .section ".linker-options","e",@llvm_linker_options
  .asciz "option 1"
  .asciz "value 1"
  .asciz "option 2"
  .asciz "value 2"

The following directives are specified:

- lib

  The parameter identifies a library to be linked against. The library will
  be looked up in the default and any specified library search paths
  (specified to this point).

- libpath

  The parameter identifies an additional library search path to be considered
  when looking up libraries after the inclusion of this option.

``SHT_LLVM_DEPENDENT_LIBRARIES`` Section (Dependent Libraries)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section contains strings specifying libraries to be added to the link by
the linker.

The section should be consumed by the linker and not written to the output.

The strings are encoded as standard null-terminated UTF-8 strings.

For example:

.. code-block:: gas

  .section ".deplibs","MS",@llvm_dependent_libraries,1
  .asciz "library specifier 1"
  .asciz "library specifier 2"

The interpretation of the library specifiers is defined by the consuming linker.

``SHT_LLVM_CALL_GRAPH_PROFILE`` Section (Call Graph Profile)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to pass a call graph profile to the linker which can be
used to optimize the placement of sections. It contains a sequence of
(from symbol, to symbol, weight) tuples.

It shall have a type of ``SHT_LLVM_CALL_GRAPH_PROFILE`` (0x6fff4c02), shall
have the ``SHF_EXCLUDE`` flag set, the ``sh_link`` member shall hold the section
header index of the associated symbol table, and shall have a ``sh_entsize`` of
16. It should be named ``.llvm.call-graph-profile``.

The contents of the section shall be a sequence of ``Elf_CGProfile`` entries.

.. code-block:: c

  typedef struct {
    Elf_Word cgp_from;
    Elf_Word cgp_to;
    Elf_Xword cgp_weight;
  } Elf_CGProfile;

cgp_from
  The symbol index of the source of the edge.

cgp_to
  The symbol index of the destination of the edge.

cgp_weight
  The weight of the edge.

This is represented in assembly as:

.. code-block:: gas

  .cg_profile from, to, 42

``.cg_profile`` directives are processed at the end of the file. It is an error
if either ``from`` or ``to`` is an undefined temporary symbol. If either symbol
is a temporary symbol, then the section symbol is used instead. If either
symbol is undefined, then that symbol is defined as if ``.weak symbol`` had been
written at the end of the file. This forces the symbol to show up in the symbol
table.

``SHT_LLVM_ADDRSIG`` Section (address-significance table)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to mark symbols as address-significant, i.e. the address
of the symbol is used in a comparison or leaks outside the translation unit. It
has the same meaning as the absence of the LLVM attributes ``unnamed_addr``
and ``local_unnamed_addr``.

Any sections referred to by symbols that are not marked as address-significant
in any object file may be safely merged by a linker without breaking the
address uniqueness guarantee provided by the C and C++ language standards.

The contents of the section are a sequence of ULEB128-encoded integers
referring to the symbol table indexes of the address-significant symbols.

There are two associated assembly directives:

.. code-block:: gas

  .addrsig

This instructs the assembler to emit an address-significance table. Without
this directive, all symbols are considered address-significant.

.. code-block:: gas

  .addrsig_sym sym

This marks ``sym`` as address-significant.
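
As a rough illustration of the ULEB128 encoding mentioned above, the following
C sketch serializes a list of symbol table indexes into the byte sequence that
an address-significance table would contain. The helper names
``encode_uleb128`` and ``encode_addrsig`` are hypothetical and are not part of
LLVM; the sketch only shows the encoding, not how the assembler builds the
section.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical helper (not an LLVM API): writes `value` as ULEB128 into
     `out` and returns the number of bytes written (at most 10). */
  static size_t encode_uleb128(uint64_t value, uint8_t *out) {
    size_t n = 0;
    assert(out != NULL);
    do {
      uint8_t byte = (uint8_t)(value & 0x7f); /* low 7 bits of the value */
      value >>= 7;
      if (value != 0)
        byte |= 0x80;                         /* continuation bit */
      out[n++] = byte;
    } while (value != 0);
    return n;
  }

  /* Hypothetical helper: encodes `count` symbol table indexes back to back.
     `out` must have room for 5 bytes per 32-bit index. */
  static size_t encode_addrsig(const uint32_t *indexes, size_t count,
                               uint8_t *out) {
    size_t n = 0;
    for (size_t i = 0; i < count; ++i)
      n += encode_uleb128(indexes[i], out + n);
    return n;
  }

Under these assumptions, the indexes 1 and 300 encode to the three bytes
``0x01 0xac 0x02``: small indexes take one byte each, which keeps the table
compact.
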

``SHT_LLVM_SYMPART`` Section (symbol partition specification)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to mark symbols with the `partition`_ that they
belong to. An ``.llvm_sympart`` section consists of a null-terminated string
specifying the name of the partition followed by a relocation referring to
the symbol that belongs to the partition. It may be constructed as follows:

.. code-block:: gas

  .section ".llvm_sympart","",@llvm_sympart
  .asciz "libpartition.so"
  .word symbol_in_partition

.. _partition: https://lld.llvm.org/Partitions.html

``SHT_LLVM_BB_ADDR_MAP`` Section (basic block address map)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section stores the binary address of basic blocks along with other related
metadata. This information can be used to map binary profiles (like perf
profiles) directly to machine basic blocks.
This section is emitted with ``-basic-block-sections=labels`` and will contain
a BB address map table for every function, which may be constructed as follows:

.. code-block:: gas

  .section ".llvm_bb_addr_map","",@llvm_bb_addr_map
  .quad .Lfunc_begin0                   # address of the function
  .byte 2                               # number of basic blocks
  # BB record for BB_0
  .uleb128 .Lfunc_begin0-.Lfunc_begin0  # BB_0 offset relative to function entry (always zero)
  .uleb128 .LBB_END0_0-.Lfunc_begin0    # BB_0 size
  .byte x                               # BB_0 metadata
  # BB record for BB_1
  .uleb128 .LBB0_1-.Lfunc_begin0        # BB_1 offset relative to function entry
  .uleb128 .LBB_END0_1-.LBB0_1          # BB_1 size
  .byte y                               # BB_1 metadata

This creates a BB address map table for a function with two basic blocks.
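
To make the record layout above more concrete, here is a minimal C sketch of a
reader for one function's table. It assumes exactly the layout shown in the
example (an 8-byte function address, a 1-byte block count, then a ULEB128
offset, a ULEB128 size and a 1-byte metadata field per block), little-endian
data, and an address that has already been resolved by the linker. The type
``BBEntry`` and the functions ``read_uleb128`` and ``parse_bb_addr_map`` are
hypothetical names, not LLVM APIs.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical record type for illustration; not an LLVM structure. */
  typedef struct {
    uint64_t offset;  /* offset of the block from the function entry */
    uint64_t size;    /* size of the block in bytes */
    uint8_t metadata; /* opaque metadata byte */
  } BBEntry;

  /* Reads one ULEB128 value at *pos and advances *pos.
     Returns 1 on success, 0 on truncated or overlong input. */
  static int read_uleb128(const uint8_t *data, size_t len, size_t *pos,
                          uint64_t *value) {
    uint64_t result = 0;
    unsigned shift = 0;
    while (*pos < len && shift < 64) {
      uint8_t byte = data[(*pos)++];
      result |= (uint64_t)(byte & 0x7f) << shift;
      if ((byte & 0x80) == 0) {
        *value = result;
        return 1;
      }
      shift += 7;
    }
    return 0;
  }

  /* Parses one function's table. Returns the number of blocks decoded,
     or -1 if the input is malformed or has more than max_blocks blocks. */
  static int parse_bb_addr_map(const uint8_t *data, size_t len,
                               uint64_t *func_addr, BBEntry *blocks,
                               size_t max_blocks) {
    assert(data != NULL && func_addr != NULL && blocks != NULL);
    if (len < 9)
      return -1;
    uint64_t addr = 0;
    for (int i = 7; i >= 0; --i) /* .quad, little-endian assumed */
      addr = (addr << 8) | data[i];
    *func_addr = addr;
    size_t pos = 8;
    uint8_t count = data[pos++]; /* .byte: number of basic blocks */
    if (count > max_blocks)
      return -1;
    for (uint8_t i = 0; i < count; ++i) {
      if (!read_uleb128(data, len, &pos, &blocks[i].offset) ||
          !read_uleb128(data, len, &pos, &blocks[i].size) || pos >= len)
        return -1;
      blocks[i].metadata = data[pos++];
    }
    return count;
  }

A consumer such as a profile-mapping tool could then add each block's offset to
the function address to obtain the block's address range. Error handling here
is deliberately simple; a production reader would also need to handle
relocatable objects, where the function address is supplied by a relocation.
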

CodeView-Dependent
------------------

``.cv_file`` Directive
^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_file`` *FileNumber FileName* [ *checksum* ] [ *checksumkind* ]

``.cv_func_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^
Introduces a function ID that can be used with ``.cv_loc``.

Syntax:
  ``.cv_func_id`` *FunctionId*

``.cv_inline_site_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Introduces a function ID that can be used with ``.cv_loc``. Includes
``inlined at`` source location information for use in the line table of the
caller, whether the caller is a real function or another inlined call site.

Syntax:
  ``.cv_inline_site_id`` *FunctionId* ``within`` *Function* ``inlined_at`` *FileNumber Line* [ *Column* ]

``.cv_loc`` Directive
^^^^^^^^^^^^^^^^^^^^^
The first number is a file number, which must have been previously assigned
with a ``.file`` directive. The second number is the line number, and the
optional third number is a column position (zero if not specified). The
remaining optional items are ``.loc`` sub-directives.

Syntax:
  ``.cv_loc`` *FunctionId FileNumber* [ *Line* ] [ *Column* ] [ *prologue_end* ] [ ``is_stmt`` *value* ]

``.cv_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_linetable`` *FunctionId* ``,`` *FunctionStart* ``,`` *FunctionEnd*

``.cv_inline_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_inline_linetable`` *PrimaryFunctionId* ``,`` *FileNumber Line FunctionStart FunctionEnd*

``.cv_def_range`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The *GapStart* and *GapEnd* options may be repeated as needed.

Syntax:
  ``.cv_def_range`` *RangeStart RangeEnd* [ *GapStart GapEnd* ] ``,`` *bytes*

``.cv_stringtable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksums`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksumoffset`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_filechecksumoffset`` *FileNumber*

``.cv_fpo_data`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
  ``.cv_fpo_data`` *procsym*

Target Specific Behaviour
=========================

X86
---

Relocations
^^^^^^^^^^^

``@ABS8`` can be applied to symbols which appear as immediate operands to
instructions that have an 8-bit immediate form for that operand. It causes
the assembler to use the 8-bit form and an 8-bit relocation (e.g. ``R_386_8``
or ``R_X86_64_8``) for the symbol.

For example:

.. code-block:: gas

  cmpq $foo@ABS8, %rdi

This causes the assembler to select the form of the 64-bit ``cmpq`` instruction
that takes an 8-bit immediate operand that is sign extended to 64 bits, as
opposed to ``cmpq $foo, %rdi`` which takes a 32-bit immediate operand. This
is also not the same as ``cmpb $foo, %dil``, which is an 8-bit comparison.

Windows on ARM
--------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) emits stack probes
in the following fashion:

.. code-block:: gas

  movw r4, #constant
  bl __chkstk
  sub.w sp, sp, r4

However, this has the limitation of 32 MiB (±16 MiB). In order to accommodate
larger binaries, LLVM supports the use of ``-mcmodel=large`` to allow a 4 GiB
range via a slight deviation. It will generate an indirect jump as follows:

.. code-block:: gas

  movw r4, #constant
  movw r12, :lower16:__chkstk
  movt r12, :upper16:__chkstk
  blx r12
  sub.w sp, sp, r4

Variable Length Arrays
^^^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) does not permit the
emission of Variable Length Arrays (VLAs).

The Windows ARM Itanium ABI extends the base ABI by adding support for emitting
a dynamic stack allocation. When emitting a variable stack allocation, a call
to ``__chkstk`` is emitted unconditionally to ensure that guard pages are set up
properly. This stack probe is emitted in the same way as the standard stack
probe.

The MSVC environment does not emit code for VLAs currently.

Windows on ARM64
----------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2017) emits stack probes
in the following fashion:

.. code-block:: gas

  mov x15, #constant
  bl __chkstk
  sub sp, sp, x15, lsl #4

However, this has the limitation of 256 MiB (±128 MiB). In order to accommodate
larger binaries, LLVM supports the use of ``-mcmodel=large`` to allow an 8 GiB
(±4 GiB) range via a slight deviation. It will generate an indirect jump as
follows:

.. code-block:: gas

  mov x15, #constant
  adrp x16, __chkstk
  add x16, x16, :lo12:__chkstk
  blr x16
  sub sp, sp, x15, lsl #4