==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language.
There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
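
As a small illustration of the named, quoted, and unnamed identifier forms,
here is a minimal sketch. The names ``@"my global"``, ``@f`` and
``%"tmp value"`` are hypothetical and chosen only for this example:

.. code-block:: llvm

    @"my global" = global i32 7     ; quoted global name containing a space

    define i32 @f(i32 %x) {
      ; The unlabeled entry block takes number 0, so the first unnamed
      ; instruction result below is %1.
      %"tmp value" = load i32, i32* @"my global"   ; quoted local name
      %1 = add i32 %x, %"tmp value"                ; unnamed value
      ret i32 %1
    }

Quoting is only needed when a name contains characters outside the regular
expression given above; ``%x`` and ``@f`` need no quotes.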

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document.
When demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded.
    Note that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but they
    are used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which llvm
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers).
    This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    - On *X86-32* only supports up to 4 bit type parameters. No
      floating-point types are supported.
    - On *X86-64* only supports up to 10 bit type parameters and 6
      floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible.
    This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and executed frequently. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too.
    The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run at the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use AAPCS-VFP calling convention.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#id80>`_ This
    calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on under which circumstances the different models may
be used.
The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect in the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC compatible emulated TLS
code.

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.

.. _nointptrtype:

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` instructions converting integers to non-integral pointer types are
ill-typed, and so are ``ptrtoint`` instructions converting values of
non-integral pointer types to integers. Vector versions of said instructions
are ill-typed as well.

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Global variables can optionally specify a :ref:`linkage type <linkage>`.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified. If there
is a mismatch between the explicit or inferred section information for the
variable declaration and its definition the resulting behavior is undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.
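
The difference between a ``constant`` and a plain ``global`` can be sketched
as follows. The names ``@limit``, ``@counter`` and ``@bump`` are hypothetical
and used only for illustration:

.. code-block:: llvm

    @limit   = constant i32 100    ; never modified; may be placed in read-only data
    @counter = global i32 0        ; written at run time, so not marked constant

    define void @bump() {
      %old = load i32, i32* @counter
      %new = add i32 %old, 1
      ; A store to @limit here would contradict its ``constant`` marking;
      ; storing to the non-constant @counter is fine.
      store i32 %new, i32* @counter
      ret void
    }
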

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

External declarations may have an explicit section specified.
Section 667information is retained in LLVM IR for targets that make use of this 668information. Attaching section information to an external declaration is an 669assertion that its definition is located in the specified section. If the 670definition is located in a different section, the behavior is undefined. 671 672By default, global initializers are optimized by assuming that global 673variables defined within the module are not modified from their 674initial values before the start of the global initializer. This is 675true even for variables potentially accessible from outside the 676module, including those with external linkage or appearing in 677``@llvm.used`` or dllexported variables. This assumption may be suppressed 678by marking the variable with ``externally_initialized``. 679 680An explicit alignment may be specified for a global, which must be a 681power of 2. If not present, or if the alignment is set to zero, the 682alignment of the global is set by the target to whatever it feels 683convenient. If an explicit alignment is specified, the global is forced 684to have exactly that alignment. Targets and optimizers are not allowed 685to over-align the global if the global has an assigned section. In this 686case, the extra alignment could be observable: for example, code could 687assume that the globals are densely packed in their section and try to 688iterate over them as an array, alignment padding would break this 689iteration. The maximum alignment is ``1 << 29``. 690 691For global variables declarations, as well as definitions that may be 692replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common`` 693linkage types), LLVM makes no assumptions about the allocation size of the 694variables, except that they may not overlap. The alignment of a global variable 695declaration or replaceable definition must not be greater than the alignment of 696the definition it resolves to. 
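
The attributes discussed above can be combined as in the following sketch. All
global names and the section name ``"my_section"`` are hypothetical; whether a
section is honored depends on the target.

.. code-block:: llvm

    ; Address not significant: may be merged with an identical constant.
    @str = private unnamed_addr constant [6 x i8] c"hello\00", align 1

    ; Suppresses the assumption that the initial value is still intact
    ; when the global initializer runs.
    @hooked = externally_initialized global i32 0

    ; Explicit section and alignment; the alignment must be a power of 2.
    @packed = global i32 7, section "my_section", align 4

    ; Declaration only: its alignment must not exceed that of the
    ; definition it resolves to.
    @elsewhere = external global i32, align 4
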

Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

:ref:`Scalable vectors <t_vector>` cannot be global variables or members of
structs or arrays because their size is unknown at compile time.

Syntax::

      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                         [DLLStorageClass] [ThreadLocal]
                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                         [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, comdat [($name)]]
                         [, align <Alignment>] (, !name !N)*

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

   @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional address space, an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
or ``local_unnamed_addr`` attribute, an optional address space, a return type,
an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
empty list of arguments, an optional alignment, an optional :ref:`garbage
collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
:ref:`prologue <prologuedata>`.

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function. Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label name is not provided, a block is assigned
an implicit numbered label, using the next value from the same counter as used
for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
function entry block does not have an explicit label, it will be assigned label
"%0", then the first unnamed temporary in that block will be "%1", etc. If a
numeric label is explicitly specified, it must match the numeric label that
would be used implicitly.

The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it feels convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.
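
The implicit block numbering and the ``unnamed_addr`` and alignment attributes
described above can be illustrated with a small sketch. The function name
``@inc`` and the alignment value are hypothetical.

.. code-block:: llvm

    define i32 @inc(i32 %x) unnamed_addr align 16 {
      ; No explicit label is given, so the entry block is implicitly %0.
      %1 = add i32 %x, 1        ; first unnamed temporary in the block
      ret i32 %1
    }

Because the parameter is named, the unnamed-value counter starts with the entry
block; an unnamed parameter would take a number from the same counter first.
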

If an explicit address space is not given, it will default to the program
address space from the :ref:`datalayout string<langref_datalayout>`.

Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
           [section "name"] [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }

The argument list is a comma separated sequence of arguments where each
argument is of the following form:

Syntax::

       <type> [parameter Attrs] [name]


.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.
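
A minimal sketch of the alias syntax above, using hypothetical names: the first
alias is guaranteed to share the address of ``@var``, while the second is only
guaranteed to refer to the same content.

.. code-block:: llvm

    @var = global i32 42

    ; Same address as @var.
    @var_alias = alias i32, i32* @var

    ; Only guaranteed to point to the same content as @var.
    @var_content = unnamed_addr alias i32, i32* @var
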

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* No global value in the expression can be a declaration, since that
  would require a relocation, which is not possible.

.. _langref_ifunc:

IFuncs
-------

IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>


.. _langref_comdats:

Comdats
-------

Comdat IR provides access to COFF and ELF object file COMDAT functionality.

Comdats have a name which represents the COMDAT key. All global objects that
specify this key will only end up in the final object file if the linker chooses
that key over some other key. Aliases are placed in the same COMDAT that their
aliasee computes to, if any.

Comdats have a selection kind to provide input on how the linker should
choose between keys in two different object files.

Syntax::

    $<Name> = comdat SelectionKind

The selection kind must be one of the following:

``any``
    The linker may choose any COMDAT key, the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

Note that XCOFF and the Mach-O platform don't support COMDATs, and ELF and
WebAssembly only support ``any`` as a selection kind.

Here is an example of a COMDAT group where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

   $foo = comdat largest
   @foo = global i32 2, comdat($foo)

   define void @bar() comdat($foo) {
     ret void
   }

As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: text

  $foo = comdat any
  @foo = global i32 2, comdat


In a COFF object file, the first example above (selection kind ``largest``)
will create a COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: text

   $foo = comdat any
   $bar = comdat any
   @g1 = global i32 42, section "sec", comdat($foo)
   @g2 = global i32 42, section "sec", comdat($bar)

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

#. Named metadata are represented as a string of characters with the
   metadata prefix. The rules for metadata names are the same as for
   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
   are still valid, which allows any character to be part of a name.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that any attributes for the function itself (``nounwind``,
``readonly``) come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval`` or ``byval(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The byval attribute also supports an optional type argument, which
    must be the same as the pointee type of the argument.

    The byval attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_byref:

``byref(<ty>)``

    The ``byref`` argument attribute allows specifying the pointee
    memory type of an argument. This is similar to ``byval``, but does
    not imply a copy is made anywhere, or that the argument is passed
    on the stack. This implies the pointer is dereferenceable up to
    the storage size of the type.

    It is not generally permissible to introduce a write to a
    ``byref`` pointer. The pointer may have any address space and may
    be read only.

    This is not a valid attribute for return values.

    The alignment for a ``byref`` parameter can be explicitly
    specified by combining it with the ``align`` attribute, similar to
    ``byval``. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

    This is intended for representing ABI constraints, and is not
    intended to be inferred for optimization use.

.. _attr_preallocated:

``preallocated(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function, and that the pointer parameter's pointee has
    already been initialized before the call instruction. This attribute
    is only valid on LLVM pointer arguments. The argument must be the value
    returned by the appropriate
    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
    ``musttail`` calls, or the corresponding caller parameter in ``musttail``
    calls, although it is ignored during codegen.

    A non ``musttail`` function call with a ``preallocated`` attribute in
    any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
    function call cannot have a ``"preallocated"`` operand bundle.

    The preallocated attribute requires a type argument, which must be
    the same as the pointee type of the argument.

    The preallocated attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_inalloca:

``inalloca``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    inalloca keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it. The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    behavior is undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret`` or ``sret(<ty>)``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee not
    to trap and to be properly aligned. This is not a valid attribute
    for return values.

    The sret attribute also supports an optional type argument, which
    must be the same as the pointee type of the argument. In the
    future this will be required.

.. _attr_align:

``align <n>`` or ``align(<n>)``
    This indicates that the pointer value may be assumed by the optimizer to
    have the specified alignment. If the pointer value does not have the
    specified alignment, behavior is undefined. ``align 1`` has no effect on
    non-byval, non-preallocated arguments.

    Note that this attribute has additional semantics when combined with the
    ``byval`` or ``preallocated`` attribute, which are documented there.

.. _noalias:

``noalias``
    This indicates that memory locations accessed via pointer values
    :ref:`based <pointeraliasing>` on the argument or return value are not also
    accessed, during the execution of the function, via pointer values not
    *based* on the argument or return value. This guarantee only holds for
    memory locations that are *modified*, by any means, during the execution of
    the function. The attribute on a return value also has additional semantics
    described below. The caller shares the responsibility with the callee for
    ensuring that these requirements are met. For further details, please see
    the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
    or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
    attribute on return values are stronger than the semantics of the attribute
    when used on function arguments. On function return values, the ``noalias``
    attribute indicates that the function acts like a system memory allocation
    function, returning a pointer to allocated storage disjoint from the
    storage for any other object accessible to the caller.

``nocapture``
    This indicates that the callee does not make any copies of the
    pointer that outlive the callee itself. This is not a valid
    attribute for return values. Addresses used in volatile operations
    are considered to be captured.

``nofree``
    This indicates that the callee does not free the pointer argument. This is
    not a valid attribute for return values.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

``returned``
    This indicates that the function always returns the argument as its return
    value. This is a hint to the optimizer and code generator used when
    generating the caller, allowing value propagation, tail call optimization,
    and omission of register saves and restores in some cases; it is not
    checked or enforced when generating the callee. The parameter and the
    function return type must be valid operands for the
    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
    return values and can only be applied to one parameter.

``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters. This is not
    checked or enforced by LLVM; if the parameter or return pointer is null,
    the behavior is undefined.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space), except if the
    ``null_pointer_is_valid`` function attribute is present.

``dereferenceable_or_null(<n>)``
    This indicates that the parameter or return value isn't both
    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
    time. All non-null pointers tagged with
    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
    and in other address spaces ``dereferenceable_or_null(<n>)``
    implies that a pointer is at least one of ``dereferenceable(<n>)``
    or ``null`` (i.e. it may be both ``null`` and
    ``dereferenceable(<n>)``). This attribute may only be applied to
    pointer typed parameters.

``swiftself``
    This indicates that the parameter is the self/context parameter. This is not
    a valid attribute for return values and can only be applied to one
    parameter.

``swifterror``
    This attribute is motivated by the need to model and optimize Swift error
    handling. It can be applied to a parameter with pointer to pointer type or
    a pointer-sized alloca. At the call site, the actual argument that
    corresponds to a ``swifterror`` parameter has to come from a ``swifterror``
    alloca or the ``swifterror`` parameter of the caller. A ``swifterror`` value
    (either the parameter or the alloca) can only be loaded from and stored to,
    or used as a ``swifterror`` argument. This is not a valid attribute for
    return values and can only be applied to one parameter.

    These constraints allow the calling convention to optimize access to
    ``swifterror`` variables by associating them with a specific register at
    call boundaries rather than placing them in memory. Since this does change
    the calling convention, a function which uses the ``swifterror`` attribute
    on a parameter is not ABI-compatible with one which does not.

    These constraints also allow LLVM to assume that a ``swifterror`` argument
    does not alias any other memory visible within a function and that a
    ``swifterror`` alloca passed as an argument does not escape.

``immarg``
    This indicates the parameter is required to be an immediate
    value. This must be a trivial immediate integer or floating-point
    constant. Undef or constant expressions are not valid. This is
    only valid on intrinsic declarations and cannot be applied to a
    call site or arbitrary function.

``noundef``
    This attribute applies to parameters and return values. If the value
    representation contains any undefined or poison bits, the behavior is
    undefined. Note that this does not refer to padding introduced by the
    type's storage representation.

.. _gc:

Garbage Collector Strategy Names
--------------------------------

Each function may specify a garbage collector strategy name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.

.. _prefixdata:

Prefix Data
-----------

Prefix data is data associated with a function which the code
generator will emit immediately before the function's entrypoint.
The purpose of this feature is to allow frontends to associate
language-specific runtime metadata with specific functions and make it
available through the function pointer while still allowing the
function pointer to be called.

To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1. This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``,

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as,

.. code-block:: llvm

    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a

Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned. If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.

A function may have prefix data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _prologuedata:

Prologue Data
-------------

The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
be inserted prior to the function body. This can be used for enabling
function hot-patching and instrumentation.

To maintain the semantics of ordinary function calls, the prologue data must
have a particular format.
Specifically, it must begin with a sequence of
bytes which decode to a sequence of machine instructions, valid for the
module's target, which transfer control to the point immediately succeeding
the prologue data, without performing any other visible action. This allows
the inliner and other passes to reason about the semantics of the function
definition without needing to reason about the prologue data. Obviously this
makes the format of the prologue data highly target dependent.

A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
which encodes the ``nop`` instruction:

.. code-block:: text

    define void @f() prologue i8 144 { ... }

Generally prologue data can be formed by encoding a relative branch instruction
which skips the metadata, as in this example of valid prologue data for the
x86_64 architecture, where the first two bytes encode ``jmp .+10``:

.. code-block:: text

    %0 = type <{ i8, i8, i8* }>

    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }

A function may have prologue data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _personalityfn:

Personality Function
--------------------

The ``personality`` attribute permits functions to specify what function
to use for exception handling.

.. _attrgrp:

Attribute Groups
----------------

Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes.
In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.

An attribute group is a module-level object. To use an attribute group, an
object references the attribute group's ID (e.g. ``#37``). An object may refer
to more than one attribute group. In that situation, the attributes from the
different groups are merged.

Here is an example of attribute groups for a function that should always be
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:

.. code-block:: llvm

   ; Target-independent attributes:
   attributes #0 = { alwaysinline alignstack=4 }

   ; Target-dependent attributes:
   attributes #1 = { "no-sse" }

   ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
   define void @f() #0 #1 { ... }

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.

Function attributes are simple keywords that follow the type specified.
If multiple attributes are needed, they are space separated. For
example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }

``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
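
    As a minimal sketch of the parenthesized form, assuming a hypothetical
    function ``@aligned`` that wants a 16-byte-aligned stack:

    .. code-block:: llvm

        ; @aligned is a hypothetical name used only for illustration.
        ; 16 is a power of two, as the attribute requires.
        define void @aligned() alignstack(16) {
          ret void
        }

    Note that the attribute group example earlier in this document spells the
    same attribute as ``alignstack=4`` inside ``attributes #0 = { ... }``.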
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer. If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available. The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold; and, thus, given low
    weight.
``convergent``
    In some parallel execution models, there exist operations that cannot be
    made control-dependent on any additional values. We call such operations
    ``convergent``, and mark them with this attribute.

    The ``convergent`` attribute may appear on functions or call/invoke
    instructions. When it appears on a function, it indicates that calls to
    this function should not be made control-dependent on additional values.
    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
    calls to this intrinsic cannot be made control-dependent on additional
    values.

    When it appears on a call/invoke, the ``convergent`` attribute indicates
    that we should treat the call as though we're calling a convergent
    function. This is particularly useful on indirect calls; without this we
    may treat such calls as though the target is non-convergent.

    The optimizer may remove the ``convergent`` attribute on functions when it
    can prove that the function does not execute any convergent operations.
    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
    can prove that the call/invoke cannot call a convergent function.
``inaccessiblememonly``
    This attribute indicates that the function may only access memory that
    is not accessible by the module being compiled. This is a weaker form
    of ``readnone``. If the function reads or writes other memory, the
    behavior is undefined.
``inaccessiblemem_or_argmemonly``
    This attribute indicates that the function may only access memory that is
    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``. If the
    function reads or writes other memory, the behavior is undefined.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer.
    Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
``"no-inline-line-tables"``
    When this attribute is set to true, the inliner discards source locations
    when inlining code and instead uses the source location of the call site.
    Breakpoints set on code that was inlined into the current function will
    not fire during the execution of the inlined call sites. If the debugger
    stops inside an inlined call site, it will appear to be stopped at the
    outermost inlined call site.
``"no-jump-tables"``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``nofree``
    This function attribute indicates that the function does not, directly or
    indirectly, call a memory-deallocation function (``free``, for example). As a
    result, uncaptured pointers that are known to be dereferenceable prior to a
    call to a function with the ``nofree`` attribute are still known to be
    dereferenceable after the call (the capturing condition is necessary in
    environments where the function might communicate the pointer to another thread
    which then deallocates the memory).
``noimplicitfloat``
    This attribute disables implicit floating-point instructions.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nomerge``
    This attribute indicates that calls to this function should never be merged
    during optimization. For example, it will prevent tail merging otherwise
    identical code sequences that raise an exception or terminate the program.
    Tail merging normally reduces the precision of source location information,
    making stack traces less useful for debugging. This attribute gives the
    user control over the tradeoff between code size and debug information
    precision.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
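
    A minimal sketch of how the attribute might appear on a declaration. The
    names ``@callee`` and ``@caller`` are hypothetical and stand in for an
    externally defined function and one of its users:

    .. code-block:: llvm

        ; Hypothetical external function whose symbol should be bound eagerly.
        declare void @callee() nonlazybind

        define void @caller() {
          call void @callee()
          ret void
        }

    The attribute sits on the declaration of the called function; the call
    instruction itself is unchanged.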
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``"indirect-tls-seg-refs"``
    This attribute indicates that the code generator should not use
    direct TLS access through segment registers, even if the
    target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally, that is, through a return instruction. This produces undefined
    behavior at runtime if the function ever does dynamically return. Annotated
    functions may still raise an exception, i.e., ``nounwind`` is not implied.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.
``willreturn``
    This function attribute indicates that a call of this function will
    either exhibit undefined behavior or come back and continue execution
    at a point in the existing call stack that includes the current invocation.
    Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
    If an invocation of an annotated function does not return control back
    to a point in the call stack, the behavior is undefined.
``nosync``
    This function attribute indicates that the function does not communicate
    (synchronize) with another thread through memory or other well-defined means.
    Synchronization is considered possible in the presence of ``atomic`` accesses
    that enforce an order (that is, not "unordered" or "monotonic"), ``volatile``
    accesses, and ``convergent`` function calls. Note that through ``convergent``
    function calls non-memory communication, e.g., cross-lane operations, is
    possible and is also considered synchronization.
    However, ``convergent`` does not contradict ``nosync``.
    If an annotated function does ever synchronize with another thread,
    the behavior is undefined.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked ``nounwind`` may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation defined semantics.
``null_pointer_is_valid``
    If ``null_pointer_is_valid`` is set, then the ``null`` address
    in address-space 0 is considered to be a valid address for memory loads and
    stores. Analyses and optimizations should not treat dereferencing a
    pointer to ``null`` as undefined behavior in this function.
    Note: Comparing the address of a global variable to ``null`` may still
    evaluate to false because of a limitation in querying this attribute inside
    constant expressions.
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
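
    A minimal sketch of the pairing described above, assuming a hypothetical
    function ``@debug_me``:

    .. code-block:: llvm

        ; @debug_me is a hypothetical name used only for illustration.
        ; optnone is written together with noinline, as required above.
        define i32 @debug_me(i32 %x) noinline optnone {
          %r = add i32 %x, 1
          ret i32 %r
        }

    Adding ``alwaysinline``, ``minsize`` or ``optsize`` to this definition would
    conflict with the restrictions listed for ``optnone``.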
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

    * ``"prologue-short-redirect"`` - This style of patchable
      function is intended to support patching a function prologue to
      redirect control away from the function in a thread-safe
      manner. It guarantees that the first instruction of the
      function will be large enough to accommodate a short jump
      instruction, and will be sufficiently aligned to allow being
      fully changed via an atomic compare-and-swap instruction.
      While the first requirement can be satisfied by inserting a large
      enough NOP, LLVM can and will try to re-purpose an existing
      instruction (i.e. one that would have to be emitted anyway) as
      the patchable instruction larger than a short jump.

      ``"prologue-short-redirect"`` is currently only supported on
      x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. All of the semantic effects the
    patching may have must be separately conveyed via the linkage type.
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that each access to the stack is
    no further from a previous stack access than the size of the guard
    region.
    It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``readnone``
    On a function, this attribute indicates that the function computes its
    result (or decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc) visible to
    caller functions. It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means while it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods (since they write to memory), there may
    be non-``C++`` mechanisms that throw exceptions without writing to LLVM
    visible memory.

    On an argument, this attribute indicates that the function does not
    dereference that pointer argument, even though it may read or write the
    memory that the pointer points to if accessed through other pointers.

    If a readnone function reads or writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads from
    or writes to a readnone pointer argument, the behavior is undefined.
``readonly``
    On a function, this attribute indicates that the function does not write
    through any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc) visible to
    caller functions.
    It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state. This means while it
    cannot unwind exceptions by calling the ``C++`` exception throwing methods
    (since they write to memory), there may be non-``C++`` mechanisms that throw
    exceptions without writing to LLVM visible memory.

    On an argument, this attribute indicates that the function does not write
    through this pointer argument, even though it may write to the memory that
    the pointer points to.

    If a readonly function writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function writes to
    a readonly pointer argument, the behavior is undefined.
``"stack-probe-size"``
    This attribute controls the behavior of stack probes: either
    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.

    If a function that has a ``"stack-probe-size"`` attribute is inlined into
    a function with another ``"stack-probe-size"`` attribute, the resulting
    function has the ``"stack-probe-size"`` attribute that has the lower
    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
    inlined into a function that has no ``"stack-probe-size"`` attribute
    at all, the resulting function has the ``"stack-probe-size"`` attribute
    of the callee.
``"no-stack-arg-probe"``
    This attribute disables ABI-required stack probes, if any.
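
    The stack-probing string attributes described above are written as
    ``"key"="value"`` pairs. A minimal sketch, assuming a hypothetical probe
    routine named ``__probestack`` and a hypothetical function ``@big_frame``:

    .. code-block:: llvm

        ; Both names are assumptions made for illustration only.
        ; The guard region size is set to 8192 bytes instead of the default 4096.
        define void @big_frame() "probe-stack"="__probestack" "stack-probe-size"="8192" {
          ; A 16384-byte local buffer, larger than the chosen guard region size.
          %buf = alloca [16384 x i8]
          ret void
        }

    Because the frame may use more stack than the guard region size, a probing
    sequence would be expected here; how it is emitted depends on the target.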
``writeonly``
    On a function, this attribute indicates that the function may write to but
    does not read from memory.

    On an argument, this attribute indicates that the function may write to but
    does not read through this pointer argument (even though it may read from
    the memory that the pointer points to).

    If a writeonly function reads memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads
    from a writeonly pointer argument, the behavior is undefined.
``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed arguments,
    with arbitrary offsets. In other words, all memory operations in the
    function can refer to memory only using pointers based on its function
    arguments.

    Note that ``argmemonly`` can be used together with the ``readonly`` attribute
    in order to specify that the function reads only from its arguments.

    If an argmemonly function reads or writes memory other than the pointer
    arguments, or has other side-effects, the behavior is undefined.
``returns_twice``
    This attribute indicates that this function can return twice. The C
    ``setjmp`` is an example of such a function. The compiler disables
    some optimizations (like tail calls) in the caller of these
    functions.
``safestack``
    This attribute indicates that
    `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
    protection is enabled for this function.

    If a function that has a ``safestack`` attribute is inlined into a
    function that doesn't have a ``safestack`` attribute or which has an
    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
    function will have a ``safestack`` attribute.
``sanitize_address``
    This attribute indicates that AddressSanitizer checks
    (dynamic address safety analysis) are enabled for this function.
``sanitize_memory``
    This attribute indicates that MemorySanitizer checks (dynamic detection
    of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``
    This attribute indicates that ThreadSanitizer checks
    (dynamic thread safety analysis) are enabled for this function.
``sanitize_hwaddress``
    This attribute indicates that HWAddressSanitizer checks
    (dynamic address safety analysis based on tagged pointers) are enabled for
    this function.
``sanitize_memtag``
    This attribute indicates that MemTagSanitizer checks
    (dynamic address safety analysis based on Armv8 MTE) are enabled for
    this function.
``speculative_load_hardening``
    This attribute indicates that
    `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
    should be enabled for the function body.

    Speculative Load Hardening is a best-effort mitigation against
    information leak attacks that make use of control flow
    misspeculation - specifically misspeculation of whether a branch
    is taken or not. Typically vulnerabilities enabling such attacks are
    classified as "Spectre variant #1". Notably, this does not attempt to
    mitigate against misspeculation of branch target, classified as
    "Spectre variant #2" vulnerabilities.

    When inlining, the attribute is sticky. Inlining a function that carries
    this attribute will cause the caller to gain the attribute. This is intended
    to provide a maximally conservative model where the code in a function
    annotated with this attribute will always (even after inlining) end up
    hardened.
``speculatable``
    This function attribute indicates that the function does not have any
    effects besides calculating its result and does not have undefined behavior.
    Note that ``speculatable`` is not enough to conclude that along any
    particular execution path the number of calls to this function will not be
    externally observable. This attribute is only valid on functions
    and declarations, not on individual call sites. If a function is
    incorrectly marked as speculatable and really does exhibit
    undefined behavior, the undefined behavior may be observed even
    if the call site is dead code.

``nossp``
    This attribute indicates the function should not emit a stack smashing
    protector. This is useful for code that intentionally manipulates the stack
    canary, such as operating system kernel code that must save/restore such
    canary values on context switch.

    If a function with the ``nossp`` attribute calls a callee function that has
    a stack protector function attribute, such as ``ssp``, ``sspreq``, or
    ``sspstrong`` (or vice-versa), then the callee will not be inline
    substituted into the caller. Even when the callee is ``alwaysinline``, the
    above holds.

    Such inlining might break assumptions in the function that was built
    without stack protection. This permits the functions that would have stack
    protection to retain their stack protector.

``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary" --- a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    If a function that has an ``ssp`` attribute is inlined into a
    function that doesn't have an ``ssp`` attribute, then the resulting
    function will have an ``ssp`` attribute.
``sspreq``
    This attribute indicates that the function should *always* emit a
    stack smashing protector. This overrides the ``ssp`` function
    attribute.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    If a function that has an ``sspreq`` attribute is inlined into a
    function that doesn't have an ``sspreq`` attribute or which has an
    ``ssp`` or ``sspstrong`` attribute, then the resulting function will have
    an ``sspreq`` attribute.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors. The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type.
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.

    If a function that has an ``sspstrong`` attribute is inlined into a
    function that doesn't have an ``sspstrong`` attribute, then the
    resulting function will have an ``sspstrong`` attribute.
``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating-point semantics. LLVM will not attempt any
    optimizations that require assumptions about the floating-point rounding
    mode or that might alter the state of floating-point status flags that
    might otherwise be set or cleared by calling this function. LLVM will
    not introduce any new floating-point instructions that may trap.

``"denormal-fp-math"``
    This indicates the denormal (subnormal) handling that may be
    assumed for the default floating-point environment. This is a
    comma separated pair. The elements may be one of ``"ieee"``,
    ``"preserve-sign"``, or ``"positive-zero"``. The first entry
    indicates the flushing mode for the result of floating point
    operations. The second indicates the handling of denormal inputs
    to floating point instructions. For compatibility with older
    bitcode, if the second value is omitted, both input and output
    modes will assume the same mode.

    If this attribute is not specified, the default is
    ``"ieee,ieee"``.

    If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
    denormal outputs may be flushed to zero by standard floating-point
    operations. It is not mandated that flushing to zero occurs, but if
    a denormal output is flushed to zero, it must respect the sign
    mode. Not all targets support all modes. While this indicates the
    expected floating point mode the function will be executed with,
    this does not make any attempt to ensure the mode is
    consistent. User or platform code is expected to set the floating
    point mode appropriately before function entry.

    If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a
    floating-point operation must treat any input denormal value as
    zero. In some situations, if an instruction does not respect this
    mode, the input may need to be converted to 0 as if by
    ``@llvm.canonicalize`` during lowering for correctness.

``"denormal-fp-math-f32"``
    Same as ``"denormal-fp-math"``, but only controls the behavior of
    the 32-bit float type (or vectors of 32-bit floats). If both
    are present, this overrides ``"denormal-fp-math"``. Not all targets
    support separately setting the denormal mode per type, and no
    attempt is made to diagnose unsupported uses. Currently this
    attribute is respected by the AMDGPU and NVPTX backends.

``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exception passes through it.
    This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity, allowing fine-grained control of the hardware control-flow
    protection mechanism. The flag is target independent and currently applies
    to a function or function pointer.
``shadowcallstack``
    This attribute indicates that the ShadowCallStack checks are enabled for
    the function. The instrumentation checks that the return address for the
    function has not changed between the function prolog and epilog. It is
    currently x86_64-specific.
``mustprogress``
    This attribute indicates that the function is required to return, unwind,
    or interact with the environment in an observable way, e.g. via a volatile
    memory access, I/O, or other synchronization. The ``mustprogress``
    attribute is intended to model the requirements of the first section of
    [intro.progress] of the C++ Standard. As a consequence, a loop in a
    function with the ``mustprogress`` attribute can be assumed to terminate if
    it does not interact with the environment in an observable way, and
    terminating loops without side-effects can be removed. If a ``mustprogress``
    function does not satisfy this contract, the behavior is undefined. This
    attribute does not apply transitively to callees, but does apply to call
    sites within the function. Note that ``willreturn`` implies ``mustprogress``.

Call Site Attributes
--------------------

In addition to function attributes the following call site only
attributes are supported:

``vector-function-abi-variant``
    This attribute can be attached to a :ref:`call <i_call>` to list
    the vector functions associated to the function.
Notice that the 2000 attribute cannot be attached to a :ref:`invoke <i_invoke>` or a 2001 :ref:`callbr <i_callbr>` instruction. The attribute consists of a 2002 comma separated list of mangled names. The order of the list does 2003 not imply preference (it is logically a set). The compiler is free 2004 to pick any listed vector function of its choosing. 2005 2006 The syntax for the mangled names is as follows::: 2007 2008 _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)] 2009 2010 When present, the attribute informs the compiler that the function 2011 ``<scalar_name>`` has a corresponding vector variant that can be 2012 used to perform the concurrent invocation of ``<scalar_name>`` on 2013 vectors. The shape of the vector function is described by the 2014 tokens between the prefix ``_ZGV`` and the ``<scalar_name>`` 2015 token. The standard name of the vector function is 2016 ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present, 2017 the optional token ``(<vector_redirection>)`` informs the compiler 2018 that a custom name is provided in addition to the standard one 2019 (custom names can be provided for example via the use of ``declare 2020 variant`` in OpenMP 5.0). The declaration of the variant must be 2021 present in the IR Module. The signature of the vector variant is 2022 determined by the rules of the Vector Function ABI (VFABI) 2023 specifications of the target. For Arm and X86, the VFABI can be 2024 found at https://github.com/ARM-software/abi-aa and 2025 https://software.intel.com/en-us/articles/vector-simd-function-abi, 2026 respectively. 2027 2028 For X86 and Arm targets, the values of the tokens in the standard 2029 name are those that are defined in the VFABI. LLVM has an internal 2030 ``<isa>`` token that can be used to create scalar-to-vector 2031 mappings for functions that are not directly associated to any of 2032 the target ISAs (for example, some of the mappings stored in the 2033 TargetLibraryInfo). 
Valid values for the ``<isa>`` token are::: 2034 2035 <isa>:= b | c | d | e -> X86 SSE, AVX, AVX2, AVX512 2036 | n | s -> Armv8 Advanced SIMD, SVE 2037 | __LLVM__ -> Internal LLVM Vector ISA 2038 2039 For all targets currently supported (x86, Arm and Internal LLVM), 2040 the remaining tokens can have the following values::: 2041 2042 <mask>:= M | N -> mask | no mask 2043 2044 <vlen>:= number -> number of lanes 2045 | x -> VLA (Vector Length Agnostic) 2046 2047 <parameters>:= v -> vector 2048 | l | l <number> -> linear 2049 | R | R <number> -> linear with ref modifier 2050 | L | L <number> -> linear with val modifier 2051 | U | U <number> -> linear with uval modifier 2052 | ls <pos> -> runtime linear 2053 | Rs <pos> -> runtime linear with ref modifier 2054 | Ls <pos> -> runtime linear with val modifier 2055 | Us <pos> -> runtime linear with uval modifier 2056 | u -> uniform 2057 2058 <scalar_name>:= name of the scalar function 2059 2060 <vector_redirection>:= optional, custom name of the vector function 2061 2062``preallocated(<ty>)`` 2063 This attribute is required on calls to ``llvm.call.preallocated.arg`` 2064 and cannot be used on any other call. See 2065 :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more 2066 details. 2067 2068.. _glattrs: 2069 2070Global Attributes 2071----------------- 2072 2073Attributes may be set to communicate additional information about a global variable. 2074Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable 2075are grouped into a single :ref:`attribute group <attrgrp>`. 2076 2077.. _opbundles: 2078 2079Operand Bundles 2080--------------- 2081 2082Operand bundles are tagged sets of SSA values that can be associated 2083with certain LLVM instructions (currently only ``call`` s and 2084``invoke`` s). In a way they are like metadata, but dropping them is 2085incorrect and will change program semantics. 
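
As an illustrative sketch only, a call carrying two operand bundles could
be written as shown here. The functions ``@callee`` and ``@caller`` and the
tags ``"example-tag"`` and ``"other-tag"`` are hypothetical; LLVM attaches no
special meaning to these tags:

.. code-block:: llvm

      declare void @callee(i32)

      define void @caller(i32 %x, i8* %state) {
        ; Bundles follow the argument list; each one is a tag plus
        ; zero or more SSA values.
        call void @callee(i32 %x) [ "example-tag"(i8* %state), "other-tag"(i32 0, i32 %x) ]
        ret void
      }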

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles. This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.

Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages. While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program. These restrictions
are described as the semantics of an "unknown" operand bundle. As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target is
  ``readnone`` or ``readonly``), unless they're overridden with
  callsite specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function. Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. Exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``. They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap. Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles. Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation. The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

      define void @f() {
        call void @x()  ;; no deopt state
        call void @y() [ "deopt"(i32 10) ]
        call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
        ret void
      }

      define void @g() {
        call void @f() [ "deopt"(i32 20) ]
        ret void
      }

will result in

.. code-block:: llvm

      define void @g() {
        call void @x()  ;; still no deopt state
        call void @y() [ "deopt"(i32 20, i32 10) ]
        call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
        ret void
      }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag. These operand bundles indicate that a call site
is within a particular funclet. There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
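
A minimal sketch of a call made from inside a cleanup funclet is shown
here. The functions ``@may_throw``, ``@cleanup_helper`` and ``@uses_funclet``
are hypothetical, and the MSVC C++ personality routine is chosen only for
illustration:

.. code-block:: llvm

      declare i32 @__CxxFrameHandler3(...)
      declare void @may_throw()
      declare void @cleanup_helper()

      define void @uses_funclet() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        ; No funclet pad has been entered here, so no "funclet" bundle is used.
        invoke void @may_throw()
                to label %exit unwind label %cleanup

      cleanup:
        %cp = cleanuppad within none []
        ; %cp is the most-recently-entered, not-yet-exited funclet pad,
        ; so the call names it in its "funclet" bundle.
        call void @cleanup_helper() [ "funclet"(token %cp) ]
        cleanupret from %cp unwind to caller

      exit:
        ret void
      }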

GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition between a function with one GC strategy to a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code. For more details, see :ref:`GC Transitions
<gc_transition_args>`.

The bundle contains an arbitrary list of Values which need to be passed
to GC transition code. They will be lowered and passed as operands to
the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
that these arguments must be available before and after (but not
necessarily during) the execution of the callee.

.. _assume_opbundles:

Assume Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^

Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
assumptions that a :ref:`parameter attribute <paramattrs>` or a
:ref:`function attribute <fnattrs>` holds for a certain value at a certain
location. Operand bundles enable assumptions that are either hard or impossible
to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.

An assume operand bundle has the form:

::

      "<tag>"([ <holds for value> [, <attribute argument>] ])

* The tag of the operand bundle is usually the name of the attribute that can
  be assumed to hold. It can also be ``ignore``; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.

If there are no arguments the attribute is a property of the call location.

If the represented attribute expects a constant argument, the argument provided
to the operand bundle should be a constant as well.

For example:

.. code-block:: llvm

      call void @llvm.assume(i1 true) ["align"(i32* %val, i32 8)]

allows the optimizer to assume that at the location of the call to
:ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.

.. code-block:: llvm

      call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(i64* %val)]

allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
call location is cold and that ``%val`` may not be null.

Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
provided guarantees are violated at runtime the behavior is undefined.

Even if the assumed property can be encoded as a boolean value, like
``nonnull``, using operand bundles to express the property can still have
benefits:

* Attributes that can be expressed via operand bundles are directly the
  property that the optimizer uses and cares about. Encoding attributes as
  operand bundles removes the need for an instruction sequence that represents
  the property (e.g., ``icmp ne i32* %p, null`` for ``nonnull``) and for the
  optimizer to deduce the property from that instruction sequence.
* Expressing the property using operand bundles makes it easy to identify the
  use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
  simplifies and improves heuristics, e.g., for "use-sensitive"
  optimizations.

.. _ob_preallocated:

Preallocated Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Preallocated operand bundles are characterized by the ``"preallocated"``
operand bundle tag. These operand bundles allow separation of the allocation
of the call argument memory from the call site. This is necessary to pass
non-trivially copyable objects by value in a way that is compatible with MSVC
on some targets. There can be at most one ``"preallocated"`` operand bundle
attached to a call site and it must have exactly one bundle operand, which is
a token generated by ``@llvm.call.preallocated.setup``. A call with this
operand bundle should not adjust the stack before entering the function, as
that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.

.. code-block:: llvm

      %foo = type { i64, i32 }

      ...

      %t = call token @llvm.call.preallocated.setup(i32 1)
      %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
      %b = bitcast i8* %a to %foo*
      ; initialize %b
      call void @bar(i32 42, %foo* preallocated(%foo) %b) ["preallocated"(token %t)]

.. _ob_gc_live:

GC Live Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
intrinsic. The operand bundle must contain every pointer to a garbage collected
object which potentially needs to be updated by the garbage collector.

When lowered, any relocated value will be recorded in the corresponding
:ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description
for further details.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.

.. _langref_datalayout:

Data Layout
-----------

A module may specify a target specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

    target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits. Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8 bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``P<address space>``
    Specifies the address space that corresponds to program memory.
    Harvard architectures can use this to specify what space LLVM
    should place things such as functions into. If omitted, the
    program memory space defaults to the default address space of 0,
    which corresponds to a Von Neumann architecture that has code
    and data in the same space.
``A<address space>``
    Specifies the address space of objects created by '``alloca``'.
    Defaults to the default address space of 0.
``p[n]:<size>:<abi>:<pref>:<idx>``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index used for address calculation. If not
    specified, the default index size is equal to the pointer size. All sizes
    are in bits. The address space, ``n``, is optional, and if not specified,
    denotes the default address space 0. The value of ``n`` must be
    in the range [1,2^23).
``i<size>:<abi>:<pref>``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
``v<size>:<abi>:<pref>``
    This specifies the alignment for a vector type of a given bit
    ``<size>``.
``f<size>:<abi>:<pref>``
    This specifies the alignment for a floating-point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
``a:<abi>:<pref>``
    This specifies the alignment for an object of aggregate type.
``F<type><abi>``
    This specifies the alignment for function pointers.
    The options for ``<type>`` are:

    * ``i``: The alignment of function pointers is independent of the alignment
      of functions, and is a multiple of ``<abi>``.
    * ``n``: The alignment of function pointers is a multiple of the explicit
      alignment specified on the function, and is a multiple of ``<abi>``.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character. The mangling style
    options are

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
      starting with ``?`` are not mangled in any way.
    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
      symbols do not receive a ``_`` prefix.
    * ``a``: XCOFF mangling: Private symbols get a ``L..`` prefix.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Type <nointptrtype>`\ s. The ``0``
    address space cannot be specified as non-integral.

On every specification that takes a ``<abi>:<pref>``, specifying the
``<pref>`` alignment is optional. If omitted, the preceding ``:``
should be omitted too and ``<pref>`` will be equal to ``<abi>``.

When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

-  ``E`` - big endian
-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
   same as the default address space.
-  ``S0`` - natural stack alignment is unspecified
-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
-  ``i8:8:8`` - i8 is 8-bit (byte) aligned
-  ``i16:16:16`` - i16 is 16-bit aligned
-  ``i32:32:32`` - i32 is 32-bit aligned
-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
   alignment of 64-bits
-  ``f16:16:16`` - half is 16-bit aligned
-  ``f32:32:32`` - float is 32-bit aligned
-  ``f64:64:64`` - double is 64-bit aligned
-  ``f128:128:128`` - quad is 128-bit aligned
-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
-  ``a:0:64`` - aggregates are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,
   that specification is used.
#. If no match is found, and the type sought is an integer type, then
   the smallest integer type that is larger than the bitwidth of the
   sought type is used. If none of the specifications are larger than
   the bitwidth then the largest integer type is used. For example,
   given the default specifications above, the i7 type will use the
   alignment of i8 (next largest) while both i65 and i256 will use the
   alignment of i64 (largest specified).
#. If no match is found, and the type sought is a vector type, then the
   largest vector type that is smaller than the sought vector type will
   be used as a fall back. This happens because <128 x double> can be
   implemented in terms of 64 <2 x double>, for example.

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects. This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. There is no way to generate IR
that does not embed this target-specific detail into the IR. If you
don't specify the string, the default specifications will be used to
generate a Data Layout and the optimization phases will operate
accordingly and introduce target specificity into the IR with respect to
these default specifications.

.. _langref_triple:

Target Triple
-------------

A module may specify a target triple string that describes the target
host. The syntax for the target triple is simply:

.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` command line option.

.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access, otherwise the behavior is
undefined. Pointer values are associated with address ranges according
to the following rules:

-  A pointer value is associated with the addresses associated with any
   value it is *based* on.
-  An address of a global variable is associated with the address range
   of the variable's storage.
-  The result value of an allocation instruction is associated with the
   address range of the allocated storage.
-  A null pointer in the default address-space is associated with no
   address.
-  An :ref:`undef value <undefvalues>` in *any* address-space is
   associated with no address.
-  An integer constant other than zero or a pointer value returned from
   a function not defined within LLVM may be associated with address
   ranges allocated through mechanisms other than those provided by
   LLVM. Such ranges shall not overlap with any ranges of addresses
   allocated by mechanisms provided by LLVM.

A pointer value is *based* on another pointer value according to the
following rules:

-  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
   the pointer-typed operand of the ``getelementptr``.
-  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
   is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
   of the ``getelementptr``.
-  The result value of a ``bitcast`` is *based* on the operand of the
   ``bitcast``.
-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
   values that contribute (directly or indirectly) to the computation of
   the pointer's value.
-  The "*based* on" relationship is transitive.
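
The rules above can be illustrated with a small sketch. The global ``@g``
and the function ``@based_on_example`` are hypothetical names used only for
this example:

.. code-block:: llvm

      @g = global [4 x i32] zeroinitializer

      define i32 @based_on_example() {
        ; %p is formed by a getelementptr on @g, so it is based on @g and
        ; is associated with the address range of @g's storage.
        %p = getelementptr inbounds [4 x i32], [4 x i32]* @g, i64 0, i64 1
        ; %q is a bitcast of %p, so it is based on %p and, transitively, on @g.
        %q = bitcast i32* %p to i8*
        ; %q contributes to the integer that feeds the inttoptr, so %r is
        ; based on %q (and therefore on %p and @g).
        %i = ptrtoint i8* %q to i64
        %r.int = add i64 %i, 4
        %r = inttoptr i64 %r.int to i32*
        ; The load stays within @g's storage (it reads element 2).
        %v = load i32, i32* %r
        ret i32 %v
      }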

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.

.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

A volatile load or store may have additional target-specific semantics.
Any volatile operation can have side effects, and any volatile operation
can read and/or modify state which is not accessible via a regular load
or store in this module. Volatile operations may use addresses which do
not point to memory (like MMIO registers). This means the compiler may
not use a volatile operation to prove a non-volatile access to that
address has defined behavior.

The allowed side-effects for volatile accesses are limited. If a
non-volatile store to a given address would be legal, a volatile
operation may modify the memory at that address. A volatile operation
may not modify any other memory accessible by the module being compiled.
A volatile operation may not call any code in the current module.

The compiler may assume execution will continue after a volatile operation,
so operations which modify memory or may have undefined behavior can be
hoisted past a volatile operation.

IR-level volatile loads and stores cannot safely be optimized into
``llvm.memcpy`` or ``llvm.memmove`` intrinsics even when those intrinsics are
flagged volatile. Likewise, the backend should never split or merge
target-legal volatile load/store instructions. Similarly, IR-level volatile
loads and stores cannot change from integer to floating-point or vice versa.

.. admonition:: Rationale

 Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
 this holds for an l-value of volatile primitive type with native
 hardware support, but not necessarily for aggregate types. The
 frontend upholds these expectations, which are intentionally
 unspecified in the IR. The rules above ensure that IR transformations
 do not violate the frontend's contract with the language.

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.
2648 2649We define a *happens-before* partial order as the least partial order 2650that 2651 2652- Is a superset of single-thread program order, and 2653- When a *synchronizes-with* ``b``, includes an edge from ``a`` to 2654 ``b``. *Synchronizes-with* pairs are introduced by platform-specific 2655 techniques, like pthread locks, thread creation, thread joining, 2656 etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering 2657 Constraints <ordering>`). 2658 2659Note that program order does not introduce *happens-before* edges 2660between a thread and signals executing inside that thread. 2661 2662Every (defined) read operation (load instructions, memcpy, atomic 2663loads/read-modify-writes, etc.) R reads a series of bytes written by 2664(defined) write operations (store instructions, atomic 2665stores/read-modify-writes, memcpy, etc.). For the purposes of this 2666section, initialized globals are considered to have a write of the 2667initializer which is atomic and happens before any other read or write 2668of the memory in question. For each byte of a read R, R\ :sub:`byte` 2669may see any write to the same byte, except: 2670 2671- If write\ :sub:`1` happens before write\ :sub:`2`, and 2672 write\ :sub:`2` happens before R\ :sub:`byte`, then 2673 R\ :sub:`byte` does not see write\ :sub:`1`. 2674- If R\ :sub:`byte` happens before write\ :sub:`3`, then 2675 R\ :sub:`byte` does not see write\ :sub:`3`. 2676 2677Given that definition, R\ :sub:`byte` is defined as follows: 2678 2679- If R is volatile, the result is target-dependent. (Volatile is 2680 supposed to give guarantees which can support ``sig_atomic_t`` in 2681 C/C++, and may be used for accesses to addresses that do not behave 2682 like normal memory. It does not generally provide cross-thread 2683 synchronization.) 2684- Otherwise, if there is no write to the same byte that happens before 2685 R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte. 
- Otherwise, if R\ :sub:`byte` may see exactly one write,
  R\ :sub:`byte` returns the value written by that write.
- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
  see are atomic, it chooses one of the values written. See the :ref:`Atomic
  Memory Ordering Constraints <ordering>` section for additional
  constraints on how the choice is made.
- Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial. If these
descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C++0x/C1x
    ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. (This isn't a
    complete description; see the C++0x definition of a release
    sequence.) This corresponds to the C++0x/C1x
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation that only reads, ``release`` for an operation that only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses, which is
    consistent with the *happens-before* partial order and with the
    modification orders of all the affected addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C++0x/C1x
    ``memory_order_seq_cst`` and Java volatile.

.. _syncscope:

If an atomic operation is marked ``syncscope("singlethread")``, it only
*synchronizes with* and only participates in the seq\_cst total orderings of
other operations running in the same thread (for example, in signal handlers).

If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target-specific synchronization scope, then it is
target-dependent whether it *synchronizes with* and participates in the
seq\_cst total orderings of other operations.

Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
seq\_cst total orderings of other operations that are not marked
``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

.. _floatenv:

Floating-Point Environment
--------------------------

The default LLVM floating-point environment assumes that floating-point
instructions do not have side effects. Results assume the round-to-nearest
rounding mode. No floating-point exception state is maintained in this
environment. Therefore, there is no attempt to create or preserve invalid
operation (SNaN) or division-by-zero exceptions.

The benefit of this exception-free assumption is that floating-point
operations may be speculated freely without any other fast-math relaxations
to the floating-point model.

Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
:ref:`select <i_select>` and :ref:`call <i_call>`
may use the following flags to enable otherwise unsafe
floating-point transformations.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
   produces a :ref:`poison value <poisonvalues>` instead.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant. This does not imply that -0.0
   is poison and/or guaranteed to not exist in the operation.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add). This does not enable reassociating
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change results in floating-point.

``fast``
   This flag implies all of the others.

.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:


The void type does not represent any value and has no size.

:Syntax:


::

    void


.. _t_function:

Function Type
-------------

:Overview:


The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or first class type --- except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.

:Syntax:

::

    <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - A function taking an ``i32``, returning an ``i32``.

   * - ``float (i16, i32 *) *``
     - :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a
       :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.

   * - ``i32 (i8*, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       to ``i8`` (char in C), which returns an integer. This is the signature
       for ``printf`` in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values.

.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a very simple type that simply specifies an
arbitrary bit width for the integer type desired. Any bit width from 1
bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.

:Syntax:

::

    iN

The number of bits the integer will occupy is specified by the ``N``
value.

Examples:
*********

+----------------+------------------------------------------------+
| ``i1``         | a single-bit integer.                          |
+----------------+------------------------------------------------+
| ``i32``        | a 32-bit integer.                              |
+----------------+------------------------------------------------+
| ``i1942652``   | a really big integer of over 1 million bits.   |
+----------------+------------------------------------------------+

.. _t_floating:

Floating-Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating-point value

   * - ``bfloat``
     - 16-bit "brain" floating-point value (7-bit significand). Provides the
       same number of exponent bits as ``float``, so that it matches its dynamic
       range, but with greatly reduced precision. Used in Intel's AVX-512 BF16
       extensions and Arm's ARMv8.6-A extensions, among others.

   * - ``float``
     - 32-bit floating-point value

   * - ``double``
     - 64-bit floating-point value

   * - ``fp128``
     - 128-bit floating-point value (112-bit significand)

   * - ``x86_fp80``
     - 80-bit floating-point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bits)

The binary format of half, float, double, and fp128 corresponds to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
respectively.

X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

    x86_mmx


.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

:Syntax:

::

    <type> *

:Examples:

.. list-table::

   * - ``[4 x i32]*``
     - A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32``
       values.

   * - ``i32 (i32*) *``
     - A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32*``, returning an ``i32``.

   * - ``i32 addrspace(5)*``
     - A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in
       address space #5.

.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD).
A vector type
requires a size (number of elements), an underlying primitive data type,
and a scalable property to represent vectors where the exact hardware
vector length is unknown at compile time. Vector types are considered
:ref:`first class <t_firstclass>`.

:Syntax:

::

    < <# elements> x <elementtype> >          ; Fixed-length vector
    < vscale x <# elements> x <elementtype> > ; Scalable vector

The number of elements is a constant integer value larger than 0;
elementtype may be any integer, floating-point or pointer type. Vectors
of size zero are not allowed. For scalable vectors, the total number of
elements is a constant multiple (called vscale) of the specified number
of elements; vscale is a positive integer that is unknown at compile time
and the same hardware-dependent constant for all scalable vectors at run
time. The size of a specific scalable vector type is thus constant within
IR, even if the exact size in bytes cannot be determined until run time.

:Examples:

+------------------------+----------------------------------------------------+
| ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
+------------------------+----------------------------------------------------+
| ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<4 x i64*>``         | Vector of 4 pointers to 64-bit integer values.     |
+------------------------+----------------------------------------------------+
| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
+------------------------+----------------------------------------------------+

.. _t_label:

Label Type
^^^^^^^^^^

:Overview:

The label type represents code labels.

:Syntax:

::

    label

.. _t_token:

Token Type
^^^^^^^^^^

:Overview:

The token type is used when a value is associated with an instruction
but all uses of the value must not attempt to introspect or obscure it.
As such, it is not appropriate to have a :ref:`phi <i_phi>` or
:ref:`select <i_select>` of type token.

:Syntax:

::

    token



.. _t_metadata:

Metadata Type
^^^^^^^^^^^^^

:Overview:

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

:Syntax:

::

    metadata

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

    [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

+------------------+--------------------------------------+
| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
+------------------+--------------------------------------+

Here are some examples of multidimensional arrays:

+-----------------------------+----------------------------------------------------------+
| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
+-----------------------------+----------------------------------------------------------+
| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
+-----------------------------+----------------------------------------------------------+
| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
+-----------------------------+----------------------------------------------------------+

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.

.. _t_struct:

Structure Type
""""""""""""""

:Overview:

The structure type is used to represent a collection of data members
together in memory. The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.
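
The two access paths described above can be sketched as follows. This is
only an illustration: the type ``%pair`` and the function names
``@second_from_memory`` and ``@bump_first`` are hypothetical, and the
typed-pointer syntax follows the rest of this document.

.. code-block:: llvm

    %pair = type { i32, float }

    ; Structure in memory: compute the field address, then load from it.
    define float @second_from_memory(%pair* %p) {
    entry:
      %fptr = getelementptr inbounds %pair, %pair* %p, i32 0, i32 1
      %f = load float, float* %fptr, align 4
      ret float %f
    }

    ; Structure in a register: read and replace a field by index.
    define %pair @bump_first(%pair %v, i32 %x) {
    entry:
      %old = extractvalue %pair %v, 0
      %sum = add i32 %old, %x
      %new = insertvalue %pair %v, i32 %sum, 0
      ret %pair %new
    }

In the first function the structure never becomes an SSA value; only the
selected field is loaded. In the second, the whole structure is an SSA
value and no memory access is involved.
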

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.

:Syntax:

::

    %T1 = type { <type list> }     ; Identified normal struct type
    %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

.. list-table::

   * - ``{ i32, i32, i32 }``
     - A triple of three ``i32`` values.

   * - ``{ float, i32 (i32) * }``
     - A pair, where the first element is a ``float`` and the second element
       is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>`
       that takes an ``i32``, returning an ``i32``.

   * - ``<{ i8, i32 }>``
     - A packed struct known to be 5 bytes in size.

.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent named structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure.

:Syntax:

::

    %X = type opaque
    %52 = type opaque

:Examples:

+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+

.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating-point constants**
    Floating-point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating-point constants must have a
    :ref:`floating-point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating-point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating-point
constants are required (and the only time that they are generated by the
disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types bfloat, half, float, and
double are represented using the 16-digit form shown above (which matches the
IEEE 754 representation for double); bfloat, half and float values must, however,
be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
precision respectively. Hexadecimal format is always used for long double, and
there are three forms of long double. The 80-bit format used by x86 is
represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
by 32 hexadecimal digits. Long doubles will only work if they match the long
double format on your target.
The IEEE 16-bit format (half precision) is
represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
format is represented by ``0xR`` followed by 4 hexadecimal digits. All
hexadecimal formats are big-endian (sign bit at the left).

There are no constants of type x86_mmx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma-separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma-separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma-separated list of elements, surrounded by
    less-than/greater-than signs (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to initialize a value
    of *any* type to zero, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a constant tuple without types. For example:
    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
    Unlike other typed constants that are meant to be interpreted as part of
    the instruction stream, metadata is a place to attach additional
    information such as debug info.

Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type (other than '``label``'
or '``void``') and be used anywhere a constant is permitted.

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize.
Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Safe:
      %A = %X  ;; By choosing undef as 0
      %B = %X  ;; By choosing undef as -1
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.

.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit.
However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.

.. code-block:: text

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (the behavior also
matches C semantics). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

To ensure all uses of a given register observe the same value (even if
'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.

.. code-block:: llvm

      %A = sdiv undef, %X
      %B = sdiv %X, undef
    Safe:
      %A = 0
    b: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``0``', because the '``undef``'
could be zero, and zero divided by any value is zero.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: text

    a:  store undef -> %X
    b:  store %X -> undef
    Safe:
    a: <deleted>
    b: unreachable

A store *of* an undefined value can be assumed to not have any effect;
we can assume that the value is overwritten with bits that happen to
match what was already there. However, a store *to* an undefined
location could clobber arbitrary memory; therefore, it has undefined
behavior.

Branching on an undefined value is undefined behavior.
This explains optimizations that depend on branch conditions to construct
predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the branch condition should be frozen;
otherwise it is undefined behavior.

.. code-block:: text

    Unsafe:
      br undef, BB1, BB2 ; UB

      %X = and i32 undef, 255
      switch %X, label %ret [ .. ] ; UB

      store undef, i8* %ptr
      %X = load i8* %ptr ; %X is undef
      switch i8 %X, label %ret [ .. ] ; UB

    Safe:
      %X = or i8 undef, 255 ; always 255
      switch i8 %X, label %ret [ .. ] ; Well-defined

      %X = freeze i1 undef
      br %X, BB1, BB2 ; Well-defined (non-deterministic jump)


This is also consistent with the behavior of MemorySanitizer.
MemorySanitizer, a detector of uses of uninitialized memory,
defines a branch with a condition that depends on an undef value (or
certain other values, e.g.
a result of a load from heap-allocated
memory that has never been stored to) to have an externally visible
side effect. For this reason functions with the *sanitize_memory*
attribute are not allowed to produce such branches "out of thin
air". More strictly, an optimization that inserts a conditional branch
is only valid if in all executions where the branch condition has at
least one undefined bit, the same branch condition is evaluated in the
input IR as well.

.. _poisonvalues:

Poison Values
-------------

In order to facilitate speculative execution, many instructions do not
invoke immediate undefined behavior when provided with illegal operands,
and return a poison value instead.

There is currently no way of representing a poison value in the IR; they
only exist when produced by operations such as :ref:`add <i_add>` with
the ``nsw`` flag.

Poison value behavior is defined in terms of value *dependence*:

-  Values other than :ref:`phi <i_phi>` nodes and :ref:`select <i_select>`
   instructions depend on their operands.
-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
   their dynamic predecessor basic block.
-  Select instructions depend on their condition operand and their
   selected operand.
-  Function arguments depend on the corresponding actual argument values
   in the dynamic callers of their functions.
-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
   instructions that dynamically transfer control back to them.
-  :ref:`Invoke <i_invoke>` instructions depend on the
   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
   call instructions that dynamically transfer control back to them.
-  Non-volatile loads and stores depend on the most recent stores to all
   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`.)
-  An instruction with externally visible side effects depends on the
   most recent preceding instruction with externally visible side
   effects, following the order in the IR. (This includes :ref:`volatile
   operations <volatile>`.)
-  An instruction *control-depends* on a :ref:`terminator
   instruction <terminators>` if the terminator instruction has
   multiple successors and the instruction is always executed when
   control transfers to one of the successors, and may not be executed
   when control is transferred to another.
-  Additionally, an instruction also *control-depends* on a terminator
   instruction if the set of instructions it otherwise depends on would
   be different if the terminator had transferred control to a different
   successor.
-  Dependence is transitive.
-  Vector elements may be independently poisoned. Therefore, transforms
   on instructions such as shufflevector must be careful to propagate
   poison across values or elements only as allowed by the original code.

An instruction that *depends* on a poison value produces a poison value
itself. A poison value may be relaxed into an
:ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern.
Propagation of poison can be stopped with the
:ref:`freeze instruction <i_freeze>`.

This means that immediate undefined behavior occurs if a poison value is
used as an instruction operand that has any values that trigger undefined
behavior. Notably this includes (but is not limited to):

-  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
   any other pointer dereferencing instruction (independent of address
   space).
-  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
   instruction.
-  The condition operand of a :ref:`br <i_br>` instruction.
-  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
   instruction.
-  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
   instruction, when the function or invoking call site has a ``noundef``
   attribute in the corresponding position.
-  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
   call site has a ``noundef`` attribute in the return value position.

Here are some examples:

.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
      store i32 0, i32* %poison_yet_again  ; Undefined behavior due to
                                           ; store to poison.

      store i32 %poison, i32* @g           ; Poison value stored to memory.
      %poison2 = load i32, i32* @g         ; Poison value loaded back from memory.

      %narrowaddr = bitcast i32* @g to i16*
      %wideaddr = bitcast i32* @g to i64*
      %poison3 = load i16, i16* %narrowaddr ; Returns a poison value.
      %poison4 = load i64, i64* %wideaddr  ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %end, label %end   ; undefined behavior

    end:

.. _welldefinedvalues:

Well-Defined Values
-------------------

Given a program execution, a value is *well defined* if the value does not
have an undef bit and is not poison in the execution.
An aggregate value or vector is well defined if its elements are well defined.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.
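
As a minimal sketch of this definition (the function name ``@well_defined``
is hypothetical, and the comments restate the rules above rather than any
verifier output), the following values differ in whether they are well
defined in a given execution. It relies on the
:ref:`freeze instruction <i_freeze>` to obtain a value that has no undef
bits and is not poison:

.. code-block:: llvm

    define i8 @well_defined(i8 %x) {
    entry:
      %v = add nsw i8 %x, 1   ; Poison when %x is 127, so %v is not well
                              ; defined in that execution.
      %u = or i8 undef, 1     ; Bit 0 is 1, but the other bits are undef:
                              ; not well defined.
      %w = freeze i8 %v       ; No undef bits and not poison.
      %s = add i8 %w, 2       ; Well defined, since %w and 2 are.
      ret i8 %s
    }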

A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is a non-undef constant. Note that there is no poison constant
in LLVM.
The result of the :ref:`freeze instruction <i_freeze>` is well defined
regardless of its operand.

.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function, and always has an ``i8*`` type.
Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
for comparisons against null. Pointer equality tests between label addresses
result in undefined behavior --- though, again, comparison against null is ok,
and no label is equal to the null pointer. This may be passed around as an
opaque pointer sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr`` or
``callbr`` instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.

.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Perform the :ref:`trunc operation <i_trunc>` on constants.
``zext (CST to TYPE)``
    Perform the :ref:`zext operation <i_zext>` on constants.
``sext (CST to TYPE)``
    Perform the :ref:`sext operation <i_sext>` on constants.
``fptrunc (CST to TYPE)``
    Truncate a floating-point constant to another floating-point type.
    The size of CST must be larger than the size of TYPE. Both types
    must be floating-point.
``fpext (CST to TYPE)``
    Floating-point extend a constant to another type. The size of CST
    must be smaller or equal to the size of TYPE. Both types must be
    floating-point.
``fptoui (CST to TYPE)``
    Convert a floating-point constant to the corresponding unsigned
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``fptosi (CST to TYPE)``
    Convert a floating-point constant to the corresponding signed
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``uitofp (CST to TYPE)``
    Convert an unsigned integer constant to the corresponding
    floating-point constant. TYPE must be a scalar or vector floating-point
    type. CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``sitofp (CST to TYPE)``
    Convert a signed integer constant to the corresponding floating-point
    constant. TYPE must be a scalar or vector floating-point type.
    CST must be of scalar or vector integer type.
    Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``ptrtoint (CST to TYPE)``
    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
``inttoptr (CST to TYPE)``
    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
    This one is *really* dangerous!
``bitcast (CST to TYPE)``
    Convert a constant, CST, to another TYPE.
    The constraints of the operands are the same as those for the
    :ref:`bitcast instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointer, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have one or more indexes, which are
    required to make sense for the type of "pointer to TY".
``select (COND, VAL1, VAL2)``
    Perform the :ref:`select operation <i_select>` on constants.
``icmp COND (VAL1, VAL2)``
    Perform the :ref:`icmp operation <i_icmp>` on constants.
``fcmp COND (VAL1, VAL2)``
    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``extractvalue (VAL, IDX0, IDX1, ...)``
    Perform the :ref:`extractvalue operation <i_extractvalue>` on
    constants.
    The index list is interpreted in a similar manner as
    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
    least one index value must be specified.
``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
    The index list is interpreted in a similar manner as indices in a
    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
    value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation of the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating-point values are allowed).

Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.

The template string supports argument substitution of the operands using "``$``"
followed by a number, to indicate substitution of the given register/memory
location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (See :ref:`inline-asm-modifiers`).

A literal "``$``" may be included by using "``$$``" in the template.
To include
other special characters into the output, the usual "``\XX``" escapes may be
used, just as in other strings. Note that after template substitution, the
resulting assembly string is parsed by LLVM's integrated assembler unless it is
disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
syntax known to LLVM.

LLVM also supports a few more substitutions useful for writing inline assembly:

- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
  This substitution is useful when declaring a local label. Many standard
  compiler optimizations, such as inlining, may duplicate an inline asm blob.
  Adding a blob-unique identifier ensures that the two labels will not conflict
  during assembly. This is used to implement `GCC's %= special format
  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
- ``${:comment}``: Expands to the comment character of the current target's
  assembly dialect. This is usually ``#``, but many targets use other strings,
  such as ``;``, ``//``, or ``!``.
- ``${:private}``: Expands to the assembler private label prefix. Labels with
  this prefix will not appear in the symbol table of the assembled object.
  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
  relatively popular.

LLVM's support for inline asm is modeled closely on the requirements of Clang's
GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
modifier codes listed here are similar or identical to those in GCC's inline asm
support. However, to be clear, the syntax of the template and constraint strings
described here is *not* the same as the syntax accepted by GCC and Clang, and,
while most constraint letters are passed through as-is by Clang, some get
translated to other codes when converting from the C source to the LLVM
assembly.
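
To illustrate the difference in syntax, here is a hedged sketch of how a C
statement such as ``asm("addl $1, %0" : "+r"(x));`` might appear after
conversion. It assumes an x86 target; the function name ``@increment`` is
hypothetical, and the exact constraint string (in particular the trailing
clobbers) depends on the Clang version and target. GCC's ``%0`` becomes
``$0``, the literal ``$`` of the immediate becomes ``$$``, and the
read-write ``+r`` operand becomes an ``=r`` output plus an input tied to it:

.. code-block:: llvm

    define i32 @increment(i32 %x) {
    entry:
      ; "$$1" emits the literal text "$1"; "$0" is replaced by the register
      ; chosen for operand 0. The input "0" is tied to output operand 0.
      %r = call i32 asm "addl $$1, $0", "=r,0,~{dirflag},~{fpsr},~{flags}"(i32 %x)
      ret i32 %r
    }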

An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"

Inline assembler expressions may **only** be used as the callee operand
of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
Thus, typically we have:

.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)

Inline asms with side effects not visible in the constraint list must be
marked as having side effects. This is done through the use of the
'``sideeffect``' keyword, like so:

.. code-block:: llvm

    call void asm sideeffect "eieio", ""()

In some cases inline asms will contain code that will not work unless
the stack is aligned in some way, such as calls or SSE instructions on
x86, yet will not contain code that does that alignment within the asm.
The compiler should make conservative assumptions about what the asm
might contain and should generate its usual stack alignment code in the
prologue if the '``alignstack``' keyword is present:

.. code-block:: llvm

    call void asm alignstack "eieio", ""()

Inline asms also support using non-standard assembly dialects. The
assumed dialect is ATT. When the '``inteldialect``' keyword is present,
the inline asm is using the Intel dialect. Currently, ATT and Intel are
the only supported dialects. An example is:

.. code-block:: llvm

    call void asm inteldialect "eieio", ""()

If multiple keywords appear the '``sideeffect``' keyword must come
first, the '``alignstack``' keyword second and the '``inteldialect``'
keyword last.

Inline Asm Constraint String
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The constraint list is a comma-separated string, each element containing one or
more constraint codes.

For each element in the constraint list an appropriate register or memory
operand will be chosen, and it will be made available to assembly template
string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
second, etc.

There are three different types of constraints, which are distinguished by a
prefix symbol in front of the constraint code: Output, Input, and Clobber. The
constraints must always be given in that order: outputs first, then inputs, then
clobbers. They cannot be intermingled.

There are also three different categories of constraint codes:

- Register constraint. This is either a register class, or a fixed physical
  register. This kind of constraint will allocate a register, and if necessary,
  bitcast the argument or result to the appropriate type.
- Memory constraint. This kind of constraint is for use with an instruction
  taking a memory operand. Different constraints allow for different addressing
  modes used by the target.
- Immediate value constraint. This kind of constraint is for an integer or other
  immediate value which can be rendered directly into an instruction. The
  various target-specific constraints allow the selection of a value in the
  proper range for the instruction you wish to use it with.

Output constraints
""""""""""""""""""

Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
indicates that the assembly will write to this operand, and the operand will
then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction. (Except, see
below about indirect outputs).

Normally, it is expected that no output locations are written to by the assembly
expression until *all* of the inputs have been read. As such, LLVM may assign
the same register to an output and an input.
If this is not safe (e.g. if the
assembly contains two instructions, where the first writes to one output, and
the second reads an input and writes to a second output), then the "``&``"
modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).

Input constraints
"""""""""""""""""

Input constraints do not have a prefix -- just the constraint codes. Each input
constraint will consume one argument from the call instruction. It is not
permitted for the asm to write to any input register or memory location (unless
that input is tied to an output). Note also that multiple inputs may all be
assigned to the same register, if LLVM can determine that they necessarily all
contain the same value.

Instead of providing a Constraint Code, input constraints may also "tie"
themselves to an output constraint, by providing an integer as the constraint
string. Tied inputs still consume an argument from the call instruction, and
take up a position in the asm template numbering as is usual -- they will simply
be constrained to always use the same register as the output they've been tied
to. For example, a constraint string of "``=r,0``" says to assign a register for
output, and use that register as an input as well (it being the 0'th
constraint).

It is permitted to tie an input to an "early-clobber" output. In that case, no
*other* input may share the same register as the input tied to the early-clobber
(even when the other input has the same value).

You may only tie an input to an output which has a register constraint, not a
memory constraint. Only a single input may be tied to an output.
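
The tied-input and early-clobber rules above can be sketched as follows.
This is an illustrative example only: it assumes an x86 target with AT&T
syntax, and the function names ``@tied`` and ``@early_clobber`` are
hypothetical.

.. code-block:: llvm

    define i32 @tied(i32 %x) {
    entry:
      ; "=r,0": operand 0 is a register output, and the only input is tied
      ; to it, so %x arrives in the register that will hold the result.
      %r = call i32 asm "bswap $0", "=r,0"(i32 %x)
      ret i32 %r
    }

    define i32 @early_clobber(i32 %a, i32 %b) {
    entry:
      ; The first instruction writes $0 before $2 has been read, so the
      ; output is marked early-clobber ("=&r") to keep it from sharing a
      ; register with either input.
      %r = call i32 asm "movl $1, $0\0Aaddl $2, $0", "=&r,r,r"(i32 %a, i32 %b)
      ret i32 %r
    }

Without the "``&``" in the second function, LLVM would be free to assign
``$0`` and ``$2`` to the same register, and the ``movl`` would overwrite
``%b`` before the ``addl`` reads it.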

There is also an "interesting" feature which deserves a bit of explanation: if a
register class constraint allocates a register which is too small for the value
type operand provided as input, the input value will be split into multiple
registers, and all of them passed to the inline asm.

However, this feature is often not as useful as you might think.

Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction which takes a single 32-bit register. The
hardware then loads into both the named register, and the next register. This
feature of inline asm would not be useful to support that.)

A few of the targets provide a template string modifier allowing explicit access
to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
``D``). On such an architecture, you can actually access the second allocated
register (yet, still, not any subsequent ones). But, in that case, you're still
probably better off simply splitting the value into two separate operands, for
clarity. (e.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use)

Indirect inputs and outputs
"""""""""""""""""""""""""""

Indirect output or input constraints can be specified by the "``*``" modifier
(which goes after the "``=``" in case of an output). This indicates that the asm
will write to or read from the contents of an *address* provided as an input
argument. (Note that in this way, indirect outputs act more like an *input* than
an output: just like an input, they consume an argument of the call expression,
rather than producing a return value.
An indirect output constraint is an
"output" only in that the asm is expected to write to the contents of the input
memory location, instead of just read from it).

This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.

It is also possible to use an indirect *register* constraint, but only on output
(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Using it is not recommended.)


Clobber constraints
"""""""""""""""""""

A clobber constraint is indicated by a "``~``" prefix. A clobber does not
consume an input operand, nor generate an output. Clobbers cannot use any of the
general constraint code letters -- they may use only explicit register
constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
memory locations -- not only the memory pointed to by a declared indirect
output.

Note that clobbering named registers that are also present in output
constraints is not legal.


Constraint Codes
""""""""""""""""
After a potential prefix comes the constraint code, or codes.

A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
(e.g. "``{eax}``").

The one and two letter constraint codes are typically chosen to be the same as
GCC's constraint codes.
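
The prefixes and code forms described so far can be combined as in the
following sketch. It is illustrative only: it assumes an x86 target with
AT&T syntax and the typed-pointer notation used elsewhere in this document,
and the function name ``@forms`` is hypothetical.

.. code-block:: llvm

    define i32 @forms(i32* %p) {
    entry:
      ; "=*m" is an indirect memory output: it consumes the pointer argument
      ; %p and does not produce a return value.
      call void asm sideeffect "movl $$42, $0", "=*m"(i32* %p)

      ; "={eax}" requests an explicit physical register for the output, and
      ; "~{edx}" records that the instruction also overwrites edx.
      %lo = call i32 asm sideeffect "rdtsc", "={eax},~{edx}"()
      ret i32 %lo
    }

Note that the clobbered register (``edx``) is different from the output
register (``eax``), in keeping with the rule that a named register may not
be both an output and a clobber.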

A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.

There are two ways to specify alternatives, and either or both may be used in an
inline asm constraint list:

1) Append the codes to each other, making a constraint code set. E.g. "``im``"
   or "``{eax}m``". This means "choose any of the options in the set". The
   choice of constraint is made independently for each constraint in the
   constraint list.

2) Use "``|``" between constraint code sets, creating alternatives. Every
   constraint in the constraint list must have the same number of alternative
   sets. With this syntax, the same alternative in *all* of the items in the
   constraint list will be chosen together.

Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.

However, the use of either of the alternatives features is *NOT* recommended, as
LLVM is not able to make an intelligent choice about which one to use. (At the
point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (e.g., given "``rm``",
it'll always choose to use memory, not registers). And, if given multiple
registers, or multiple register classes, it will simply choose the first one.
(In fact, it
doesn't currently even ensure explicitly specified physical registers are
unique, so specifying multiple physical registers as alternatives, like
``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
intended.)

Supported Constraint Code List
""""""""""""""""""""""""""""""

The constraint codes are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Some constraint codes are typically supported by all targets:

- ``r``: A register in the target's general purpose register class.
- ``m``: A memory address operand. It is target-specific what addressing modes
  are supported; typical examples are register, or register + register offset,
  or register + immediate offset (of some target-specific size).
- ``i``: An integer constant (of target-specific width). Allows either a simple
  immediate, or a relocatable value.
- ``n``: An integer constant -- *not* including relocatable values.
- ``s``: An integer constant, but allowing *only* relocatable values.
- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
  useful to pass a label for an asm branch or call.

  .. FIXME: but that surely isn't actually okay to jump out of an asm
     block without telling llvm about the control transfer???)

- ``{register-name}``: Requires exactly the named physical register.

Other constraints are target-specific:

AArch64:

- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
  i.e. 0 to 4095 with optional left shift by 12.
- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
  64-bit register. This is a superset of ``L``.
- ``Q``: Memory address operand must be in a single register (no
  offsets). (However, LLVM currently does this for the ``m`` constraint as
  well.)
- ``r``: A 32 or 64-bit integer register (W* or X*).
- ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
- ``x``: Like ``w``, but restricted to registers 0 to 15 inclusive.
- ``y``: Like ``w``, but restricted to SVE vector registers Z0 to Z7 inclusive.
- ``Upl``: One of the low eight SVE predicate registers (P0 to P7).
- ``Upa``: Any of the SVE predicate registers (P0 to P15).

AMDGPU:

- ``r``: A 32 or 64-bit integer register.
- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
- ``I``: An integer inline constant in the range from -16 to 64.
- ``J``: A 16-bit signed integer constant.
- ``A``: An integer or a floating-point inline constant.
- ``B``: A 32-bit signed integer constant.
- ``C``: A 32-bit unsigned integer constant or an integer inline constant in
  the range from -16 to 64.
- ``DA``: A 64-bit constant that can be split into two "A" constants.
- ``DB``: A 64-bit constant that can be split into two "B" constants.

All ARM modes:

- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
  operand. Treated the same as operand ``m``, at the moment.
- ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
- ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``

ARM and ARM's Thumb2 mode:

- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
- ``I``: An immediate integer valid for a data-processing instruction.
- ``J``: An immediate integer between -4095 and 4095.
- ``K``: An immediate integer whose bitwise inverse is valid for a
  data-processing instruction. (Can be used with template modifier "``B``" to
  print the inverted value).
- ``L``: An immediate integer whose negation is valid for a data-processing
  instruction. (Can be used with template modifier "``n``" to print the negated
  value).
- ``M``: A power of two or an integer between 0 and 32.
- ``N``: Invalid immediate constraint.
- ``O``: Invalid immediate constraint.
- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
  as ``r``.
- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
  invalid.
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.

ARM's Thumb1 mode:

- ``I``: An immediate integer between 0 and 255.
- ``J``: An immediate integer between -255 and -1.
- ``K``: An immediate integer between 0 and 255, with optional left-shift by
  some amount.
- ``L``: An immediate integer between -7 and 7.
- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
- ``N``: An immediate integer between 0 and 31.
- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
- ``r``: A low 32-bit GPR register (``r0-r7``).
- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.


Hexagon:

- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
  at the moment.
- ``r``: A 32 or 64-bit register.

MSP430:

- ``r``: An 8 or 16-bit register.

MIPS:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate integer zero.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
- ``N``: An immediate integer between -65535 and -1.
- ``O``: An immediate signed 15-bit integer.
- ``P``: An immediate integer between 1 and 65535.
- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus 16-bit immediate offset. In MIPS mode, just a base register.
- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
  ``m``.
- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
  ``sc`` instruction on the given subtarget (details vary).
- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
- ``l``: The ``lo`` register, 32 or 64-bit.
- ``x``: Invalid.

NVPTX:

- ``b``: A 1-bit integer register.
- ``c`` or ``h``: A 16-bit integer register.
- ``r``: A 32-bit integer register.
- ``l`` or ``N``: A 64-bit integer register.
- ``f``: A 32-bit float register.
- ``d``: A 64-bit float register.


PowerPC:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
- ``M``: An immediate integer greater than 31.
- ``N``: An immediate integer that is an exact power of 2.
- ``O``: The immediate integer constant 0.
- ``P``: An immediate integer constant whose negation is a signed 16-bit
  constant.
- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
  treated the same as ``m``.
- ``r``: A 32 or 64-bit integer register.
- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).
- ``y``: Condition register (``CR0-CR7``).
- ``wc``: An individual CR bit in a CR register.
- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
  register set (overlapping both the floating-point and vector register files).
- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
  set.

RISC-V:

- ``A``: An address operand (using a general-purpose register, without an
  offset).
- ``I``: A 12-bit signed integer immediate operand.
- ``J``: A zero integer immediate operand.
- ``K``: A 5-bit unsigned integer immediate operand.
- ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
  ``XLEN``).

Sparc:

- ``I``: An immediate 13-bit signed integer.
- ``r``: A 32-bit integer register.
- ``f``: Any floating-point register on SparcV8, or a floating-point
  register in the "low" half of the registers on SparcV9.
- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)

SystemZ:

- ``I``: An immediate unsigned 8-bit integer.
- ``J``: An immediate unsigned 12-bit integer.
- ``K``: An immediate signed 16-bit integer.
- ``L``: An immediate signed 20-bit integer.
- ``M``: An immediate integer 0x7fffffff.
- ``Q``: A memory address operand with a base address and a 12-bit immediate
  unsigned displacement.
- ``R``: A memory address operand with a base address, a 12-bit immediate
  unsigned displacement, and an index register.
- ``S``: A memory address operand with a base address and a 20-bit immediate
  signed displacement.
- ``T``: A memory address operand with a base address, a 20-bit immediate
  signed displacement, and an index register.
- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific)
- ``f``: A 32, 64, or 128-bit floating-point register.

X86:

- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 64.
- ``K``: An immediate signed 8-bit integer.
- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
  0xffffffff.
- ``M``: An immediate integer between 0 and 3.
- ``N``: An immediate unsigned 8-bit integer.
- ``O``: An immediate integer between 0 and 127.
- ``e``: An immediate 32-bit signed integer.
- ``Z``: An immediate 32-bit unsigned integer.
- ``o``, ``v``: Treated the same as ``m``, at the moment.
- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
  registers, and on X86-64, it is all of the integer registers.
- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
  existed since i386, and can be accessed without the REX prefix.
- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
  32-bit mode, a 64-bit integer operand will get split into two registers). It
  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
  operand will get allocated only to RAX -- if two 32-bit operands are needed,
  you're better off splitting it yourself, before passing it to the asm
  statement.

XCore:

- ``r``: A 32-bit integer register.


.. _inline-asm-modifiers:

Asm template argument modifiers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the asm template string, modifiers can be used on the operand reference, like
"``${0:n}``".

The modifiers are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Target-independent:

- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``n``: Negate and print immediate integer constant unadorned, without the
  target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``l``: Print as an unadorned label, without the target-specific label
  punctuation (e.g. no ``$`` prefix).

AArch64:

- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
  instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with a ``x*`` name. (This is the default, anyhow.)
- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
  ``v*``.

AMDGPU:

- ``r``: No effect.

ARM:

- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
  register).
- ``P``: No effect.
- ``q``: No effect.
- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
  as ``d4[1]`` instead of ``s9``)
- ``B``: Bitwise invert and print an immediate integer constant without ``#``
  prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
  register operands subsequent to the specified one (!), so use carefully.
- ``Q``: Print the low-order register of a register-pair, or the low-order
  register of a two-register operand.
- ``R``: Print the high-order register of a register-pair, or the high-order
  register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
  to ``R``.)

  .. FIXME: H doesn't currently support printing the second register
     of a two-register operand.

- ``e``: Print the low doubleword register of a NEON quad register.
- ``f``: Print the high doubleword register of a NEON quad register.
- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
  adornment.

Hexagon:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.

MSP430:

No additional modifiers.

MIPS:

- ``X``: Print an immediate integer as hexadecimal.
- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
- ``d``: Print an immediate integer as decimal.
- ``m``: Subtract one and print an immediate integer as decimal.
- ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.

  .. FIXME: L seems to be missing memory operand support.

- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.

  .. FIXME: M seems to be missing memory operand support.

- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.

NVPTX:

- ``r``: No effect.

PowerPC:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, prints formatter for a two-register X-form
  instruction. (Currently always prints ``r0,OPERAND``).
- ``U``: Prints 'u' if the memory operand is an update form, and nothing
  otherwise. (NOTE: LLVM does not support update form, so this will currently
  always print nothing)
- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
  not support indexed form, so this will currently always print nothing)

RISC-V:

- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
  nothing. Used to print 'addi' vs 'add' instructions, etc.
- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
  normally.

Sparc:

- ``r``: No effect.

SystemZ:

SystemZ implements only ``n``, and does *not* support any of the other
target-independent modifiers.

X86:

- ``c``: Print an unadorned integer or symbol name. (The latter is
  target-specific behavior for this typically target-independent modifier).
- ``A``: Print a register name with a '``*``' before it.
- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
  operand.
- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
  memory operand.
- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
  operand.
- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
  operand.
- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
  available, otherwise the 32-bit register name; do nothing on a memory operand.
- ``n``: Negate and print an unadorned integer, or, for operands other than an
  immediate integer (e.g. a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier)
- ``H``: Print a memory reference with additional offset +8.
- ``P``: Print a memory reference or operand for use as the argument of a call
  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)

XCore:

No additional modifiers.


Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions in the program
that can convey extra information about the code to the optimizers and
code generator. One example application of metadata is source-level
debug information. There are two metadata primitives: strings and nodes.

Metadata does not have a type, and is not a value. If referenced from a
``call`` instruction, it uses the ``metadata`` type.

All metadata are identified in syntax by an exclamation point ('``!``').

.. _metadata-string:

Metadata Nodes and Metadata Strings
-----------------------------------

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two digit hex code. For example:
"``!"test\00"``".
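
As a small illustration of the escaping rule, the strings below are wrapped in
metadata nodes so that they can appear at module level. The node numbers and
the ``!example`` name are hypothetical and chosen only for this sketch.

.. code-block:: llvm

    !0 = !{!"plain text"}        ; printable characters need no escaping
    !1 = !{!"tab\09separated"}   ; \09 is the hex code of a horizontal tab
    !2 = !{!"test\00"}           ; \00 embeds a NUL byte
    !example = !{!0, !1, !2}     ; named metadata referring to the nodes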

Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:

.. code-block:: llvm

    !{ !"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

    !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
intrinsic is using three metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)

Metadata can be attached to an instruction. Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:

.. code-block:: llvm

    declare !dbg !22 void @f1()
    define void @f2() !dbg !22 {
      ret void
    }

    @g1 = global i32 0, !dbg !22
    @g2 = external global i32, !dbg !22

A transformation is required to drop any metadata attachment that it does not
know or knows it can't preserve.
Currently there is an exception for metadata
attachment to globals for ``!type`` and ``!absolute_symbol`` which can't be
unconditionally dropped unless the global is itself deleted.

Metadata attached to a module using named metadata may not be dropped, with
the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
containing the debug info to be emitted along with the compile unit, regardless
of code optimizations (some nodes are only emitted if there are references to
them from instructions). The ``debugInfoForProfiling:`` field is a boolean
indicating whether or not line-table discriminators are updated to provide
more-accurate debug info for profiling results.

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
                        macros: !6, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope.
These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).

.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: none

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
                 checksumkind: CSK_MD5,
                 checksum: "000102030405060708090a0b0c0d0e0f")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.
Valid values for the ``checksumkind:`` field are: {CSK_None, CSK_MD5, CSK_SHA1,
CSK_SHA256}.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``. ``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32,
                      encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
are used to qualify the ``baseType:``.

Note that the ``void *`` type is expressed as a type derived from NULL.

.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_array_type       = 1
    DW_TAG_class_type       = 2
    DW_TAG_enumeration_type = 4
    DW_TAG_structure_type   = 19
    DW_TAG_union_type       = 23

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector. The optional ``dataLocation`` is a
DIExpression that describes how to get from an object's address to the actual
raw data, if they aren't equivalent.
This is only supported for array types,
particularly to describe Fortran arrays, which have an array descriptor in
addition to the array data. Alternatively, it can be a DIVariable which
has the address of the actual raw data. The Fortran language supports pointer
arrays which can be attached to actual arrays; this attachment between pointer
and pointee is called association. The optional ``associated`` is a
DIExpression that describes whether the pointer array is currently associated.
The optional ``allocated`` is a DIExpression that describes whether the
allocatable array is currently allocated. The optional ``rank`` is a
DIExpression that describes the rank (number of dimensions) of a Fortran
assumed-rank array (the rank is known at runtime).

For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.

.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`.

- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.

.. code-block:: text

    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
    !2 = !DISubrange(count: -1) ; empty array.

    ; Scopes used in rest of example
    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
    !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)

    ; Use of local variable as count value
    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
    !11 = !DISubrange(count: !10, lowerBound: 0)

    ; Use of global variable as count value
    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
    !13 = !DISubrange(count: !12, lowerBound: 0)

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs.
``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: text

    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

.. _DIGlobalVariable:

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: text

    @foo = global i32, !dbg !0
    !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
    !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
                           file: !3, line: 7, type: !4, isLocal: true,
                           isDefinition: false, declaration: !5)

DIGlobalVariableExpression
""""""""""""""""""""""""""

``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
with a :ref:`DIExpression`.

.. code-block:: text

    @lower = global i32, !dbg !0
    @upper = global i32, !dbg !1
    !0 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
             )
    !1 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
             )
    !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
                           file: !4, line: 8, type: !5, declaration: !6)

All global variable expressions should be referenced by the ``globals:`` field of
a :ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A distinct
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. A unique ``DISubprogram`` may be attached to a function declaration
used for call site debug info. The ``retainedNodes:`` field is a list of
:ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function. If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.

.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, retainedNodes: !8,
                                thrownTypes: !9)

.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

.. _DILexicalBlockFile:

DILexicalBlockFile
""""""""""""""""""

``DILexicalBlockFile`` nodes are used to discriminate between sections of a
:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
indicate textual inclusion, or the ``discriminator:`` field can be used to
discriminate between control flow within a single block in the source language.

.. code-block:: text

    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)

.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``retainedNodes:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                          type: !3, flags: DIFlagArtificial)
    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                          type: !3)
    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

.. _DIExpression:

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable. Debug
intrinsics are interpreted left-to-right: start by pushing the value/address
operand of the intrinsic onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.

The currently supported opcode vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
  them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that, contrary to ``DW_OP_bit_piece``, the offset describes the location
  within the described source variable.
- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
  (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
  expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
  that references a base type constructed from the supplied values.
- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
  optionally applied to the pointer. The memory tag is derived from the
  given tag offset in an implementation-defined manner.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
- ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` can only appear at the beginning of a
  ``DIExpression``, and it specifies that all register and memory read
  operations for the debug value instruction's value/address operand and for
  the ``(N - 1)`` operations immediately following the
  ``DW_OP_LLVM_entry_value`` refer to their respective values at function
  entry. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
  DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
  the entry value of the debug value instruction's value/address operand is
  pushed to the stack, and 123 is added to it. Due to framework limitations
  ``N`` can currently only be 1.

  ``DW_OP_LLVM_entry_value`` is only legal in MIR. The operation is introduced
  by the ``LiveDebugValues`` pass; currently only for function parameters that
  are unmodified throughout a function. Support is limited to function
  parameters that are described as simple register location descriptions, or as
  indirect locations (e.g. when a struct is passed-by-value to a callee via a
  pointer to a temporary copy made in the caller). The entry value op is also
  introduced by the ``AsmPrinter`` pass when a call site parameter value
  (``DW_AT_call_site_parameter_value``) is represented as the entry value of
  the parameter.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the provided
  signed offset from the specified register. The opcode is only generated by
  the ``AsmPrinter`` pass to describe a call site parameter value which
  requires an expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array which has an array
  descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.

DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
descriptions describe the *concrete location* of a source variable (in the
sense that a debugger might modify its value), whereas *implicit locations*
describe merely the actual *value* of a source variable which might not exist
in registers or in memory (see ``DW_OP_stack_value``).

An ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
value (the address) of a source variable. The first operand of the intrinsic
must be an address of some kind. A DIExpression attached to the intrinsic
refines this address to produce a concrete location for the source variable.

An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
The first operand of the intrinsic may be a direct or indirect value. A
DIExpression attached to the intrinsic refines the first operand to produce a
direct value. For example, if the first operand is an indirect value, it may be
necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
valid debug intrinsic.

.. note::

   A DIExpression is interpreted in the same way regardless of which kind of
   debug intrinsic it's attached to.

.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)

DIFlags
"""""""

These flags encode various properties of DINodes.

The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether the ``DW_AT_export_symbols``
attribute can be used for the structure type.

DIObjCProperty
""""""""""""""

``DIObjCProperty`` nodes represent Objective-C property nodes.

.. code-block:: text

    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
                         getter: "getFoo", attributes: 7, type: !2)

DIImportedEntity
""""""""""""""""

``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit.

.. code-block:: text

   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
                          entity: !1, line: 7)

DIMacro
"""""""

``DIMacro`` nodes represent the definition or undefinition of macro identifiers.
The ``name:`` field is the macro identifier, followed by macro parameters when
defining a function-like macro, and the ``value:`` field is the token-string
used to expand the macro identifier.

.. code-block:: text

   !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
                 value: "((x) + 1)")
   !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")

DIMacroFile
"""""""""""

``DIMacroFile`` nodes represent inclusion of source files.
The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
appear in the included source file.

.. code-block:: text

   !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !1,
                     nodes: !3)

.. _DILabel:

DILabel
"""""""

``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
a ``DILabel`` are mandatory. The ``scope:`` field must be one of either a
:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
The ``name:`` field is the label identifier. The ``file:`` field is the
:ref:`DIFile` the label is present in. The ``line:`` field is the source line
within the file where the label is declared.

.. code-block:: text

  !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)

'``tbaa``' Metadata
^^^^^^^^^^^^^^^^^^^

In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing type based alias analysis (TBAA). Instead, metadata is
added to the IR to describe a type system of a higher level language. This
can be used to implement C/C++ strict type aliasing rules, but it can also
be used to implement custom alias analysis behavior for other languages.

This description of LLVM's TBAA system is broken into two parts:
:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
:ref:`Representation<tbaa_node_representation>` talks about the metadata
encoding of various entities.

It is always possible to trace any TBAA node to a "root" TBAA node (details
in the :ref:`Representation<tbaa_node_representation>` section). TBAA
nodes with different roots have an unknown aliasing relationship, and LLVM
conservatively infers ``MayAlias`` between them. The rules mentioned in
this section only pertain to TBAA nodes living under the same root.

.. _tbaa_node_semantics:

Semantics
"""""""""

The TBAA metadata system, referred to as "struct path TBAA" (not to be
confused with ``tbaa.struct``), consists of the following high level
concepts: *Type Descriptors*, further subdivided into scalar type
descriptors and struct type descriptors; and *Access Tags*.

**Type descriptors** describe the type system of the higher level language
being compiled. **Scalar type descriptors** describe types that do not
contain other types. Each scalar type has a parent type, which must also
be a scalar type or the TBAA root. Via this parent relation, scalar types
within a TBAA root form a tree. **Struct type descriptors** denote types
that contain a sequence of other type descriptors, at known offsets. These
contained type descriptors can either be struct type descriptors themselves
or scalar type descriptors.

**Access tags** are metadata nodes attached to load and store instructions.
Access tags use type descriptors to describe the *location* being accessed
in terms of the type system of the higher level language. Access tags are
tuples consisting of a base type, an access type and an offset. The base
type is a scalar type descriptor or a struct type descriptor, the access
type is a scalar type descriptor, and the offset is a constant integer.
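
As a rough illustration of these concepts (a sketch only, using the encoding
described in the :ref:`Representation<tbaa_node_representation>` section), the
descriptors and access tags for a hypothetical ``struct S { int i; float f; }``
might look as follows. The node numbers, the root name, and the pointer names
``%s.f`` and ``%p`` are assumptions made for this example:

.. code-block:: llvm

    !0 = !{!"Example TBAA root"}          ; TBAA root (hypothetical name)
    !1 = !{!"char", !0, i64 0}            ; scalar type descriptor, parent is the root
    !2 = !{!"int", !1, i64 0}             ; scalar type descriptor, parent is "char"
    !3 = !{!"float", !1, i64 0}           ; scalar type descriptor, parent is "char"
    !4 = !{!"S", !2, i64 0, !3, i64 4}    ; struct type descriptor: int at 0, float at 4
    !5 = !{!4, !3, i64 4}                 ; access tag (S, float, 4)
    !6 = !{!2, !2, i64 0}                 ; access tag (int, int, 0)

    ; Hypothetical accesses carrying the tags above.
    store float 0.0, float* %s.f, align 4, !tbaa !5   ; store to the f member of an S
    store i32 0, i32* %p, align 4, !tbaa !6           ; plain int store

Here ``!5`` names the struct type as its base type, while ``!6`` is a scalar
access whose base type and access type coincide.
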

The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
   or store) of a value of type ``AccessTy`` contained in the struct type
   ``BaseTy`` at offset ``Offset``.

 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
   ``AccessTy`` must be the same; and the access tag describes a scalar
   access with scalar type ``AccessTy``.

We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples this way:

 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
   described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
   undefined if ``Offset`` is non-zero.

 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
   to be relative within that inner type.

A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.

As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d; // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

is (note that in C and C++, ``char`` can be used to access any arbitrary
type):

.. code-block:: text

    Root = "TBAA Root"
    CharScalarTy = ("char", Root, 0)
    FloatScalarTy = ("float", CharScalarTy, 0)
    DoubleScalarTy = ("double", CharScalarTy, 0)
    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
                     (InnerStructTy, 12)}

with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.

.. _tbaa_node_representation:

Representation
""""""""""""""

The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
with exactly one ``MDString`` operand.

Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand, it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third argument, but that must be the
constant integer zero.

Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1. The first operand is an ``MDString`` denoting
the name of the struct type. Like in scalar type descriptors the actual
value of this name operand is irrelevant to LLVM. After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
an ``MDNode``, denotes a contained field, and the 2N th operand, a
``ConstantInt``, is the offset of the said contained field. The offsets
must be in non-decreasing order.

Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
The first operand is an ``MDNode`` pointing to the node representing the
base type. The second operand is an ``MDNode`` pointing to the node
representing the access type. The third operand is a ``ConstantInt`` that
states the offset of the access. If a fourth field is present, it must be
a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
that the location being accessed is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
the access type and the base type of an access tag must be the same, and
that is the TBAA root of the access tag.

'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding.
Also, it doesn't contain any TBAA information about the fields
of the aggregate.

``!tbaa.struct`` metadata can describe which memory subregions in a
memcpy are padding and what the TBAA tags of the struct are.

The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its tbaa tag. For example:

.. code-block:: llvm

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
and has size 4 bytes and has tbaa tag !2.

Note that the fields need not be contiguous. In this example, there is a
4 byte gap between the two fields. This gap represents padding which
does not carry useful data and need not be preserved.

'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.
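
As a minimal sketch of this rule, consider two hypothetical pointers ``%a`` and
``%b`` that are assumed never to overlap. The names, the node numbers, and the
descriptive strings below are illustrative assumptions; the encoding of domains
and scopes is described in the paragraphs that follow.

.. code-block:: llvm

    !0 = !{!0, !"example domain"}        ; a domain (hypothetical description)
    !1 = !{!1, !0, !"accesses via %a"}   ; a scope in domain !0
    !2 = !{!2, !0, !"accesses via %b"}   ; another scope in domain !0
    !3 = !{!1}                           ; scope list containing only !1
    !4 = !{!2}                           ; scope list containing only !2

    ; Each access names its own scope and lists the other scope as noalias.
    %v = load float, float* %a, align 4, !alias.scope !3, !noalias !4
    store float %v, float* %b, align 4, !alias.scope !4, !noalias !3

For domain ``!0``, the load's ``alias.scope`` list ``{!1}`` is a subset of the
store's ``noalias`` list ``{!1}``, so the two accesses are assumed not to alias.
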

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used for example during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, float* %c, align 4, !alias.scope !5
    store float %1, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float, float* %c, align 4, !alias.scope !6
    store float %2, float* %arrayidx.i, align 4, !noalias !7

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating-point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

    If ``x`` is a real number that lies between two finite consecutive
    floating-point numbers ``a`` and ``b``, without being equal to one
    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
    distance between the two non-equal finite floating-point numbers
    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.

The metadata node shall consist of a single positive float type number
representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs

.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges the loaded value or the value
returned by the called function at this call site is in. If the loaded or
returned value is not in the specified range, the behavior is undefined. The
ranges are represented with a flattened list of integers. The loaded value or
the value returned is known to be in the union of the ranges defined by each
consecutive pair. Each pair has the following properties:

-  The type must match the type loaded by the instruction.
-  The pair ``a,b`` represents the range ``[a,b)``.
-  Both ``a`` and ``b`` are constants.
-  The range is allowed to wrap.
-  The range should not represent the full or empty set. That is,
   ``a!=b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

Examples:

.. code-block:: llvm

      %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
      %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
      %d = invoke i8 @bar() to label %cont
             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

'``absolute_symbol``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``absolute_symbol`` metadata may be attached to a global variable
declaration.
It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code, and expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.

Example (assuming 64-bit pointers):

.. code-block:: llvm

      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)

    ...
    !0 = !{ i64 0, i64 256 }
    !1 = !{ i64 -1, i64 -1 }

'``callees``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``callees`` metadata may be attached to indirect call sites. If ``callees``
metadata is attached to a call site, and any callee is not among the set of
functions provided by the metadata, the behavior is undefined. The intent of
this metadata is to facilitate optimizations such as indirect-call promotion.
For example, in the code below, the call instruction may only target the
``add`` or ``sub`` functions:

.. code-block:: llvm

    %result = call i64 %binop(i64 %x, i64 %y), !callees !0

    ...
    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}

'``callback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with this metadata as a
broker function. The metadata describes how the arguments of a call to the
broker are in turn passed to the callback function specified by the metadata.
Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
semantic restriction on the broker function itself is that it is not allowed to
inspect or modify arguments referenced in the ``callback`` metadata as
pass-through to the callback function.

The broker is not required to actually invoke the callback function at runtime.
However, the assumptions about not inspecting or modifying arguments that would
be passed to the specified callback function still hold, even if the callback
function is not dynamically invoked. The broker is allowed to invoke the
callback function more than once per invocation of the broker. The broker is
also allowed to invoke (directly or indirectly) the function passed as a
callback through another use. Finally, the broker is also allowed to relay the
callback callee invocation to a different thread.

The metadata is structured as follows: At the outer level, ``callback``
metadata is a list of ``callback`` encodings. Each encoding starts with a
constant ``i64`` which describes the argument position of the callback function
in the call to the broker. The following elements, except the last, describe
what arguments are passed to the callback function. Each element is again an
``i64`` constant identifying the argument of the broker that is passed through,
or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
they are listed must match the order in which they are passed to the callback
callee. The last element of the encoding is a boolean which specifies how
variadic arguments of the broker are handled. If it is true, all variadic
arguments of the broker are passed through to the callback function *after* the
arguments encoded explicitly before.

In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. This encoding
identifies the callback function as the second argument of the broker
(``i64 2``) and the sole argument of the callback function as the third one of
the broker function (``i64 3``).

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)

    ...
    !2 = !{i64 2, i64 3, i1 false}
    !1 = !{!2}

Another example is shown below. The callback callee is the second argument of
the ``__kmpc_fork_call`` function (``i64 2``). The callee is given two unknown
values (each identified by an ``i64 -1``) and afterwards all
variadic arguments that are passed to the ``__kmpc_fork_call`` call (due to the
final ``i1 true``).

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...)

    ...
    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
    !0 = !{!1}


'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions.
The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.

.. _md_dereferenceable:

'``dereferenceable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.

.. _md_dereferenceable_or_null:

'``dereferenceable_or_null``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the
``dereferenceable_or_null`` attribute on parameters and return values.

.. _llvm.loop:

'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. The loop metadata node is a list of
other metadata nodes, each representing a property of the loop. Usually,
the first item of the property node is a string. For example,
``llvm.loop.unroll.count`` suggests an unroll factor to the loop
unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll.enable"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}

For legacy reasons, the first item of a loop metadata node must be a
reference to itself. Before the advent of the 'distinct' keyword, this
forced the preservation of otherwise identical metadata nodes. Since
the loop-metadata node can be attached to multiple nodes, the 'distinct'
keyword has become unnecessary.

Prior to the property nodes, one or two ``DILocation`` (debug location)
nodes can be present in the list. The first, if present, identifies the
source-code location where the loop begins. The second, if present,
identifies the source-code location where the loop ends.

Loop metadata nodes cannot be used as unique identifiers. They are
neither persistent for the same loop through transformations nor
necessarily unique to just one loop.

'``llvm.loop.disable_nonforced``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables all optional loop transformations unless
explicitly instructed using other transformation metadata such as
``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion can
make other transformations impossible. Mandatory loop canonicalizations such
as loop rotation are still applied.

It is recommended to use this metadata in addition to any ``llvm.loop.*``
transformation directive. Also, any loop should have at most one
directive applied to it (and a sequence of transformations built using
followup-attributes). Otherwise, which transformation will be applied
depends on implementation details such as the pass pipeline order.

See :ref:`transformation-metadata` for details.
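
The recommendation above can be sketched as follows. This is a minimal,
illustrative combination that assumes the property-node layout described for
``llvm.loop``; the block labels, ``%exitcond`` and the unroll factor of 4 are
hypothetical and not taken from any particular program:

.. code-block:: llvm

      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    ; Suppress heuristic-driven transformations, but still request the
    ; explicitly forced unrolling by a factor of 4.
    !0 = distinct !{!0, !1, !2}
    !1 = !{!"llvm.loop.disable_nonforced"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}

Only one transformation directive is attached here, in line with the advice
that a loop should carry at most one such directive.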

'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata,
which contains information about loop-carried memory dependencies, can be
helpful in determining the safety of these transformations.

'``llvm.loop.interleave.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an interleave count to the loop interleaver.
The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.interleave.count", i32 4}

Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
then the interleave count will be determined automatically.

'``llvm.loop.vectorize.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
0 disables vectorization:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}

'``llvm.loop.vectorize.predicate.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables creating predicated instructions
for the loop, which can enable folding of the scalar epilogue loop into the
main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, predication is enabled. A value of 0 disables
predication:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}

'``llvm.loop.vectorize.width``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata sets the target width of the vectorizer. The first
operand is the string ``llvm.loop.vectorize.width`` and the second
operand is an integer specifying the width. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.width", i32 4}

Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0 or if the loop does not have this metadata the width will be
determined automatically.

'``llvm.loop.vectorize.followup_vectorized``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the vectorized loop will
have. See :ref:`transformation-metadata` for details.

'``llvm.loop.vectorize.followup_epilogue``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the epilogue will have.
The
epilogue is not vectorized and is executed when either the vectorized
loop is not known to preserve semantics (because e.g., it processes two
arrays that are found to alias by a runtime check) or for the last
iterations that do not fill a complete set of vector lanes. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.vectorize.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes in the metadata will be added to both the vectorized and
epilogue loop.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller. The
first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time. The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

'``llvm.loop.unroll.followup``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the unrolled loop will have.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll.followup_remainder``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the remainder loop after
partial/runtime unrolling will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass.
In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).

The metadata for unroll and jam otherwise is the same as for ``unroll``.
``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
and the normal safety checks will still be performed.

'``llvm.loop.unroll_and_jam.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll and jam factor to use, similarly to
``llvm.loop.unroll.count``. The first operand is the string
``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
specifying the unroll factor. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}

If the trip count of the loop is less than the unroll count the loop
will be partially unrolled and jammed.

'``llvm.loop.unroll_and_jam.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unroll and jamming. The metadata has a single
operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.disable"}

'``llvm.loop.unroll_and_jam.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
not known at compile time.
The metadata has a single operand which is the
string ``llvm.loop.unroll_and_jam.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.enable"}

'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the epilogue of the outer loop
will have. This loop is usually unrolled, meaning there is no such
loop. This attribute will be ignored in this case. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the inner loop of the epilogue
will have. The outer epilogue will usually be unrolled, meaning there
can be multiple inner remainder loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes specified in the metadata are added to all
``llvm.loop.unroll_and_jam.*`` loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
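
As a rough sketch only: the authoritative description of followup attributes
is the Transformation Metadata document, and the layout below is an
assumption, namely that each operand after the attribute name is a property
node to attach to the resulting loop. The factor of 4 and the choice of
``llvm.loop.unroll.disable`` as the followup property are hypothetical:

.. code-block:: llvm

    ; Request unroll and jam by 4, and ask that the resulting inner (jammed)
    ; loop is not unrolled any further.
    !0 = distinct !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.unroll.disable"}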

'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that the loop should not be versioned for the purpose
of enabling loop-invariant code motion (LICM). The metadata has a single operand
which is the string ``llvm.loop.licm_versioning.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.licm_versioning.disable"}

'``llvm.loop.distribute.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Loop distribution allows splitting a loop into multiple loops. Currently,
this is only performed if the entire loop cannot be vectorized due to unsafe
memory dependencies. The transformation will attempt to isolate the unsafe
dependencies into their own loop.

This metadata can be used to selectively enable or disable distribution of the
loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:

.. code-block:: llvm

    !0 = !{!"llvm.loop.distribute.enable", i1 0}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}

This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.

'``llvm.loop.distribute.followup_coincident``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes extracted loops with no cyclic
dependencies (i.e. loops that can be vectorized) will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_sequential``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the isolated loops with unsafe
memory dependencies will have.
See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_fallback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If loop versioning is necessary, this metadata defines the attributes
the non-distributed fallback version will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The attributes in this metadata are added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.licm.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that loop-invariant code motion (LICM) should not be
performed on this loop. The metadata has a single operand which is the string
``llvm.licm.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.licm.disable"}

Note that although it operates per loop it isn't given the ``llvm.loop``
prefix, as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.

'``llvm.access.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, illustrated by the following example.

.. code-block:: llvm

    %val = load i32, i32* %arrayidx, !llvm.access.group !0
    ...
    !0 = !{!1, !2}
    !1 = distinct !{}
    !2 = distinct !{}

It is illegal for the list node to be empty since it might be confused
with an access group.

The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty avoids the
situation that the content must be updated which, because metadata is
immutable by design, would require finding and updating all references
to the access group node.

The access group can be used to refer to a memory access instruction
without pointing to it directly (which is not possible in global
metadata). Currently, the only metadata making use of it is
``llvm.loop.parallel_accesses``.

'``llvm.loop.parallel_accesses``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between it and other instructions
in the loop with this metadata.

Let ``m1`` and ``m2`` be two instructions that both have the
``llvm.access.group`` metadata to the access group ``g1``, respectively
``g2`` (which might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered having this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.
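
A minimal sketch of that last rule, with hypothetical value names and
metadata numbering: the load belongs to two access groups, only one of which
is listed by the loop, and that is enough for the load to count as covered.

.. code-block:: llvm

    for.body:
      ...
      %val = load i32, i32* %arrayidx, !llvm.access.group !3
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
    !1 = distinct !{} ; access group listed by the loop
    !2 = distinct !{} ; access group not listed by this loop
    !3 = !{!1, !2}    ; the load is a member of both groups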

If all memory-accessing instructions in a loop have
``llvm.access.group`` metadata that each refer to one of the access
groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependences and is considered to be a
parallel loop.

Note that if not all memory access instructions belong to an access
group referred to by ``llvm.loop.parallel_accesses``, then the loop must
not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be
considered sequential (if optimization passes that are unaware of the parallel
semantics insert new memory instructions into the loop body).

Example of a loop that is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata types.

.. code-block:: llvm

    for.body:
      ...
      %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
      ...
      store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
    !1 = distinct !{}

It is also possible to have nested parallel loops:

.. code-block:: llvm

    outer.for.body:
      ...
      %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4
      ...
      br label %inner.for.body

    inner.for.body:
      ...
      %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3
      ...
      store i32 %val0, i32* %arrayidx2, !llvm.access.group !3
      ...
      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

    inner.for.end:
      ...
      store i32 %val1, i32* %arrayidx4, !llvm.access.group !4
      ...
      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

    outer.for.end:                                          ; preds = %for.body
    ...
    !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
    !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
    !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
    !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop

'``llvm.loop.mustprogress``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
terminate, unwind, or interact with the environment in an observable way, e.g.
via a volatile memory access, I/O, or other synchronization. If such a loop is
not found to interact with the environment in an observable way, the loop may
be removed. This corresponds to the ``mustprogress`` function attribute.

'``irr_loop``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:

.. code-block:: llvm

    header0:
    ...
    br i1 %cmp, label %t1, label %t2, !irr_loop !0

    ...
    !0 = !{!"loop_header_weight", i64 100}

Irreducible loop header weights are typically based on profile data.

.. _md_invariant.group:

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The experimental ``invariant.group`` metadata may be attached to
``load``/``store`` instructions referencing a single metadata with no entries.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
can be assumed to load or store the same
value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
when two pointers are considered the same). Pointers returned by bitcast or
getelementptr with only zero indices are considered the same.

Examples:

.. code-block:: llvm

    @unknownPtr = external global i8
    ...
    %ptr = alloca i8
    store i8 42, i8* %ptr, !invariant.group !0
    call void @foo(i8* %ptr)

    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
    call void @foo(i8* %ptr)

    %newPtr = call i8* @getPointer(i8* %ptr)
    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

    %unknownValue = load i8, i8* @unknownPtr
    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

    call void @foo(i8* %ptr)
    %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
    %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr

    ...
    declare void @foo(i8*)
    declare i8* @getPointer(i8*)
    declare i8* @llvm.launder.invariant.group(i8*)

    !0 = !{}

The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is
tied to the SSA value of the pointer operand.

.. code-block:: llvm

    %v = load i8, i8* %x, !invariant.group !0
    ; if %x mustalias %y then we can replace the above instruction with
    %v = load i8, i8* %y

Note that this is an experimental feature, which means that its semantics might
change in the future.

'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

'``associated``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``associated`` metadata may be attached to a global object
declaration with a single argument that references another global object.

This metadata prevents discarding of the global object in linker GC
unless the referenced object is also discarded. The linker support for
this feature is spotty. For best compatibility, globals carrying this
metadata may also:

- Be in a comdat with the referenced global.
- Be in ``@llvm.compiler.used``.
- Have an explicit section with a name which is a valid C identifier.

It does not have any effect on non-ELF targets.

Example:

.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{i32* @a}


'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
The first operand of the metadata node indicates the profile metadata
type. There are currently 3 types:
:ref:`branch_weights<prof_node_branch_weights>`,
:ref:`function_entry_count<prof_node_function_entry_count>`, and
:ref:`VP<prof_node_VP>`.

.. _prof_node_branch_weights:

branch_weights
""""""""""""""

Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.

..
_prof_node_function_entry_count: 6470 6471function_entry_count 6472"""""""""""""""""""" 6473 6474Function entry count metadata can be attached to function definitions 6475to record the number of times the function is called. Used with BFI 6476information, it is also used to derive the basic block profile count. 6477For more information, see :doc:`BranchWeightMetadata`. 6478 6479.. _prof_node_VP: 6480 6481VP 6482"" 6483 6484VP (value profile) metadata can be attached to instructions that have 6485value profile information. Currently this is indirect calls (where it 6486records the hottest callees) and calls to memory intrinsics such as memcpy, 6487memmove, and memset (where it records the hottest byte lengths). 6488 6489Each VP metadata node contains "VP" string, then a uint32_t value for the value 6490profiling kind, a uint64_t value for the total number of times the instruction 6491is executed, followed by uint64_t value and execution count pairs. 6492The value profiling kind is 0 for indirect call targets and 1 for memory 6493operations. For indirect call targets, each profile value is a hash 6494of the callee function name, and for memory operations each value is the 6495byte length. 6496 6497Note that the value counts do not need to add up to the total count 6498listed in the third operand (in practice only the top hottest values 6499are tracked and reported). 6500 6501Indirect call example: 6502 6503.. code-block:: llvm 6504 6505 call void %f(), !prof !1 6506 !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410} 6507 6508Note that the VP type is 0 (the second operand), which indicates this is 6509an indirect call value profile data. The third operand indicates that the 6510indirect call executed 1600 times. 
The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective preceding target
functions was called.

Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

-  The first element is a *behavior* flag, which specifies the behavior
   when two (or more) modules are merged together, and it encounters two
   (or more) metadata with the same ID. The supported behaviors are
   described below.
-  The second element is a metadata string that is a unique ID for the
   metadata. Each module may only have one flag entry for each unique ID (not
   including entries with the **Require** behavior).
-  The third element is the value of the flag.

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked, or the max
           if the other module uses **Max** (in which case the resulting flag
           will be **Max**).

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to. This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

   * - 7
     - **Max**
           Takes the max of the two values, which are required to be integers.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
   if two or more ``!"foo"`` flags are seen is to emit an error if their
   values are not equal.

-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
   behavior if two or more ``!"bar"`` flags are seen is to use the value
   '37'.

-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
   behavior if two or more ``!"qux"`` flags are seen is to emit a
   warning if their values are not equal.

-  Metadata ``!3`` has the ID ``!"qux"`` and the value:

   ::

       !{ !"foo", i32 1 }

   The behavior is to emit an error if the ``llvm.module.flags`` does not
   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
   performed.

Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info". The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together, their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section to place the metadata. Valid values are
       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6. This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

-  If a module with ``Objective-C Garbage Collection`` set to 0 is
   merged with a module with ``Objective-C Garbage Collection`` set to
   2, then the resulting module has the
   ``Objective-C Garbage Collection`` flag set to 0.
-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
   merged with a module with ``Objective-C GC Only`` set to 6.

C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely ``wchar_t`` width and
enum width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of its values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 2 bytes (``short_wchar`` is 1), and that
enums are at least as large as an ``int`` (``short_enum`` is 0)::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 1}
    !1 = !{i32 1, !"short_enum", i32 0}

LTO Post-Link Module Flags Metadata
-----------------------------------

Some optimisations can only be performed when the entire LTO unit is present in
the current module. This is represented by the ``LTOPostLink`` module flags
metadata, which will be created with a value of ``1`` when LTO linking occurs.

Automatic Linker Flags Named Metadata
=====================================

Some targets support embedding of flags to the linker inside individual object
files. Typically this is used in conjunction with language extensions which
allow source files to contain linker command line options, and have these
automatically be transmitted to the linker via object files.

These flags are encoded in the IR using named metadata with the name
``!llvm.linker.options``. Each operand is expected to be a metadata node
which should be a list of other metadata nodes, each of which should be a
list of metadata strings defining linker options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

Dependent Libs Named Metadata
=============================

Some targets support embedding of strings into object files to indicate
a set of libraries to add to the link. Typically this is used in conjunction
with language extensions which allow source files to explicitly declare the
libraries they depend on, and have these automatically be transmitted to the
linker via object files.

The list is encoded in the IR using named metadata with the name
``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
which should contain a single string operand.

For example, the following metadata section contains two library specifiers::

    !0 = !{!"a library specifier"}
    !1 = !{!"another library specifier"}
    !llvm.dependent-libraries = !{ !0, !1 }

Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.

.. _summary:

ThinLTO Summary
===============

Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
causes the building of a compact summary of the module that is emitted into
the bitcode. The summary is emitted into the LLVM assembly and identified
in syntax by a caret ('``^``').

The summary is parsed into a bitcode output, along with the Module
IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
of optimization (e.g. "``clang -x ir``" and "``opt``") will ignore the
summary entries (just as they currently ignore summary entries in a bitcode
input file).

Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where the summary index is currently built from bitcode.
Specifically, this applies to tools that test the Thin Link portion of a
ThinLTO compile (i.e. ``llvm-lto`` and ``llvm-lto2``), and to parsing a
combined index for a distributed ThinLTO backend via clang's
"``-fthinlto-index=<>``" flag. This part is not yet implemented; for now, use
``llvm-as`` to create a bitcode object before feeding it into thin link tools.

There are currently 3 types of summary entries in the LLVM assembly:
:ref:`module paths<module_path_summary>`,
:ref:`global values<gv_summary>`, and
:ref:`type identifiers<typeid_summary>`.

.. _module_path_summary:

Module Path Summary Entry
-------------------------

Each module path summary entry lists a module containing global values included
in the summary. For a single IR module there will be one such entry, but
in a combined summary index produced during the thin link, there will be
one module path entry per linked module with summary.

Example:

.. code-block:: text

    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))

The ``path`` field is a string path to the bitcode file, and the ``hash``
field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
incremental builds and caching.

.. _gv_summary:

Global Value Summary Entry
--------------------------

Each global value summary entry corresponds to a global value defined or
referenced by a summarized module.

Example:

.. code-block:: text

    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831

For declarations, there will not be a summary list. For definitions, a
global value will contain a list of summaries, one per module containing
a definition. There can be multiple entries in a combined summary index
for symbols with weak linkage.

Each ``Summary`` format will depend on whether the global value is a
:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
:ref:`alias<alias_summary>`.

.. _function_summary:

Function Summary
^^^^^^^^^^^^^^^^

If the global value is a function, the ``Summary`` entry will look like:

.. code-block:: text

    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)

The ``module`` field includes the summary entry id for the module containing
this definition, and the ``flags`` field contains information such as
the linkage type, a flag indicating whether it is legal to import the
definition, whether it is globally live and whether the linker resolved it
to a local definition (the latter two are populated during the thin link).
The ``insts`` field contains the number of IR instructions in the function.
Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
:ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.

.. _variable_summary:

Global Variable Summary
^^^^^^^^^^^^^^^^^^^^^^^

If the global value is a variable, the ``Summary`` entry will look like:

.. code-block:: text

    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)

The variable entry contains a subset of the fields in a
:ref:`function summary <function_summary>`; see the descriptions there.

.. _alias_summary:

Alias Summary
^^^^^^^^^^^^^

If the global value is an alias, the ``Summary`` entry will look like:

.. code-block:: text

    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)

The ``module`` and ``flags`` fields are as described for a
:ref:`function summary <function_summary>`. The ``aliasee`` field
contains a reference to the global value summary entry of the aliasee.

.. _funcflags_summary:

Function Flags
^^^^^^^^^^^^^^

The optional ``FuncFlags`` field looks like:

.. code-block:: text

    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0)

If unspecified, flags are assumed to hold the conservative ``false`` value of
``0``.

.. _calls_summary:

Calls
^^^^^

The optional ``Calls`` field looks like:

.. code-block:: text

    calls: ((Callee)[, (Callee)]*)

where each ``Callee`` looks like:

.. code-block:: text

    callee: ^1[, hotness: None]?[, relbf: 0]?

The ``callee`` refers to the summary entry id of the callee.
At most one
of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
``Hot``, and ``Critical``) and ``relbf`` (which holds the integer
branch frequency relative to the entry frequency, scaled down by 2^8)
may be specified. The defaults are ``Unknown`` and ``0``, respectively.

.. _params_summary:

Params
^^^^^^

The optional ``Params`` field is used by ``StackSafety`` and looks like:

.. code-block:: text

    params: ((Param)[, (Param)]*)

where each ``Param`` describes pointer parameter access inside of the
function and looks like:

.. code-block:: text

    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?

where ``param`` is the number of the parameter it describes, and
``offset`` is the inclusive range of offsets from the pointer parameter to bytes
which can be accessed by the function. This range does not include accesses by
function calls from the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded into other
functions and looks like:

.. code-block:: text

    callee: ^3, param: 5, offset: [-3, 3]

The ``callee`` refers to the summary entry id of the callee, and ``param`` is
the number of the callee parameter which points into the caller's parameter
with an offset known to be inside of the ``offset`` range. ``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe, and
we assume that access with any offset is possible.

Example:

If we have the following function:

.. code-block:: text

    define i64 @foo(i64* %0, i32* %1, i8* %2, i8 %3) {
      store i32* %1, i32** @x
      %5 = getelementptr inbounds i8, i8* %2, i64 5
      %6 = load i8, i8* %5
      %7 = getelementptr inbounds i8, i8* %2, i8 %3
      tail call void @bar(i8 %3, i8* %7)
      %8 = load i64, i64* %0
      ret i64 %8
    }

We can expect a record like this:

.. code-block:: text

    params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))

The function may access just 8 bytes of the parameter %0. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter %1 escapes, so access is unknown.
The function itself can access just a single byte of the parameter %2. Additional
access is possible inside ``@bar``, referred to as ``^3``. The function adds a
signed offset to the pointer and passes the result as the argument %1 into ``^3``.
This record itself does not tell us how ``^3`` will access the parameter.
Parameter %3 is not a pointer.

.. _refs_summary:

Refs
^^^^

The optional ``Refs`` field looks like:

.. code-block:: text

    refs: ((Ref)[, (Ref)]*)

where each ``Ref`` contains a reference to the summary id of the referenced
value (e.g. ``^1``).

.. _typeidinfo_summary:

TypeIdInfo
^^^^^^^^^^

The optional ``TypeIdInfo`` field, used for
`Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
looks like:

.. code-block:: text

    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?

These optional fields have the following forms:

TypeTests
"""""""""

.. code-block:: text

    typeTests: (TypeIdRef[, TypeIdRef]*)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID``.

TypeTestAssumeVCalls
""""""""""""""""""""

.. code-block:: text

    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format:

.. code-block:: text

    vFuncId: (TypeIdRef, offset: 16)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID`` preceded by a ``guid:`` tag.

TypeCheckedLoadVCalls
"""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.

TypeTestAssumeConstVCalls
"""""""""""""""""""""""""

.. code-block:: text

    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format:

.. code-block:: text

    (VFuncId, args: (Arg[, Arg]*))

and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
and each Arg is an integer argument number.

TypeCheckedLoadConstVCalls
""""""""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format described for
``TypeTestAssumeConstVCalls``.

.. _typeid_summary:

Type ID Summary Entry
---------------------

Each type id summary entry corresponds to a type identifier resolution
which is generated during the LTO link portion of the compile when building
with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
so these are only present in a combined summary index.

Example:

.. code-block:: text

    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778

The ``typeTestRes`` gives the type test resolution ``kind`` (which may
be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
the ``size-1`` bit width. It is followed by optional flags, which default to 0,
and an optional WpdResolutions (whole program devirtualization resolution)
field that looks like:

.. code-block:: text

    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)

where each entry is a mapping from the given byte offset to the whole-program
devirtualization resolution WpdRes, which has one of the following formats:

.. code-block:: text

    wpdRes: (kind: branchFunnel)
    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
    wpdRes: (kind: indir)

Additionally, each wpdRes has an optional ``resByArg`` field, which
describes the resolutions for calls with all constant integer arguments:

.. code-block:: text

    resByArg: (ResByArg[, ResByArg]*)

where ResByArg is:

.. code-block:: text

    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])

Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
or ``VirtualConstProp``. The ``info`` field is only used if the kind
is ``UniformRetVal`` (indicates the uniform return value), or
``UniqueRetVal`` (holds the return value associated with the unique vtable
(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
not support the use of absolute symbols to store constants.

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affect code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x i8*] [
       i8* @X,
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that it cannot see (which is why they have to be named). For example, if
a variable has internal linkage and no references other than that from the
``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
removing the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This construct should only be used in rare circumstances,
and should not be exposed to source languages.

.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.

.. _llvmglobaldtors:

The '``llvm.global_dtors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]

The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in descending
order of priority (i.e.
highest first) when the module is unloaded. The
order of functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the destructor function will only run if the associated
data from the current module is not discarded.

Instruction Reference
=====================

The LLVM instruction set consists of several different classifications
of instructions: :ref:`terminator instructions <terminators>`, :ref:`unary
instructions <unaryops>`, :ref:`binary instructions <binaryops>`,
:ref:`bitwise binary instructions <bitwiseops>`,
:ref:`memory instructions <memoryops>`, and
:ref:`other instructions <otherops>`.

.. _terminators:

Terminator Instructions
-----------------------

As mentioned :ref:`previously <functionstructure>`, every basic block in a
program ends with a "Terminator" instruction, which indicates which
block should be executed after the current block is finished. These
terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the one exception being the
':ref:`invoke <i_invoke>`' instruction).

The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.

.. _i_ret:

'``ret``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      ret <type> <value>       ; Return a value from a non-void function
      ret void                 ; Return from void function

Overview:
"""""""""

The '``ret``' instruction is used to return control flow (and optionally
a value) from a function back to the caller.
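
As a minimal illustrative sketch (the function names ``@add_one`` and
``@do_nothing`` are hypothetical and not part of LLVM), a value-returning
function and a ``void`` function could end as follows:

.. code-block:: llvm

      ; Returns its argument plus one; the operand type of 'ret' matches
      ; the function's declared return type (i32).
      define i32 @add_one(i32 %x) {
      entry:
        %sum = add i32 %x, 1
        ret i32 %sum
      }

      ; A function with a void return type uses the operand-less form.
      define void @do_nothing() {
      entry:
        ret void
      }
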

There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.

Arguments:
""""""""""

The '``ret``' instruction optionally accepts a single argument, the
return value. The type of the return value must be a ':ref:`first
class <t_firstclass>`' type.

A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value whose type does not match the function's return type, or
if it has a void return type and contains a '``ret``' instruction with a
return value.

Semantics:
""""""""""

When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller was an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.

Example:
""""""""

.. code-block:: llvm

      ret i32 5                       ; Return an integer value of 5
      ret void                        ; Return from a void function
      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

.. _i_br:

'``br``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      br i1 <cond>, label <iftrue>, label <iffalse>
      br label <dest>          ; Unconditional branch

Overview:
"""""""""

The '``br``' instruction is used to cause control flow to transfer to a
different basic block in the current function. There are two forms of
this instruction, corresponding to a conditional branch and an
unconditional branch.
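
A minimal sketch showing both forms inside one function may help here; the
function ``@max`` and its block names are made up for illustration only:

.. code-block:: llvm

      define i32 @max(i32 %a, i32 %b) {
      entry:
        %a_gt_b = icmp sgt i32 %a, %b
        ; Conditional form: one i1 condition and two candidate successors.
        br i1 %a_gt_b, label %pick_a, label %pick_b

      pick_a:
        ; Unconditional form: a single successor.
        br label %done

      pick_b:
        br label %done

      done:
        %result = phi i32 [ %a, %pick_a ], [ %b, %pick_b ]
        ret i32 %result
      }
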

Arguments:
""""""""""

The conditional branch form of the '``br``' instruction takes a single
'``i1``' value and two '``label``' values. The unconditional form of the
'``br``' instruction takes a single '``label``' value as a target.

Semantics:
""""""""""

Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
to the '``iffalse``' ``label`` argument.
If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Example:
""""""""

.. code-block:: llvm

    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

.. _i_switch:

'``switch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]

Overview:
"""""""""

The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.

Arguments:
""""""""""

The '``switch``' instruction uses three parameters: an integer
comparison value '``value``', a default '``label``' destination, and an
array of pairs of comparison value constants and '``label``'s. The table
is not allowed to contain duplicate constant entries.

Semantics:
""""""""""

The ``switch`` instruction specifies a table of values and destinations.
When the '``switch``' instruction is executed, this table is searched
for the given value.
If the value is found, control flow is transferred
to the corresponding destination; otherwise, control flow is transferred
to the default destination.
If '``value``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

Depending on properties of the target machine and the particular
``switch`` instruction, this instruction may be code generated in
different ways. For example, it could be generated as a series of
chained conditional branches or with a lookup table.

Example:
""""""""

.. code-block:: llvm

      ; Emulate a conditional br instruction
      %Val = zext i1 %value to i32
      switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

      ; Emulate an unconditional br instruction
      switch i32 0, label %dest [ ]

      ; Implement a jump table:
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone
                                          i32 2, label %ontwo ]

.. _i_indirectbr:

'``indirectbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]

Overview:
"""""""""

The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". Address must be derived from a
:ref:`blockaddress <blockaddress>` constant.

Arguments:
""""""""""

The '``address``' argument is the address of the label to jump to. The
rest of the arguments indicate the full set of possible destinations
that the address may point to. Blocks are allowed to occur multiple
times in the destination list, though this isn't particularly useful.

This destination list is required so that dataflow analysis has an
accurate understanding of the CFG.
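
The following sketch, which assumes a hypothetical function ``@dispatch``,
illustrates how the destination list lines up with the
:ref:`blockaddress <blockaddress>` constants that may flow into
'``address``':

.. code-block:: llvm

      define i32 @dispatch(i1 %flag) {
      entry:
        ; Pick one of two block addresses taken within this function.
        %addr = select i1 %flag, i8* blockaddress(@dispatch, %one),
                                 i8* blockaddress(@dispatch, %two)
        ; Every block that %addr may refer to is listed as a destination.
        indirectbr i8* %addr, [ label %one, label %two ]

      one:
        ret i32 1

      two:
        ret i32 2
      }
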

Semantics:
""""""""""

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list, otherwise
this instruction has undefined behavior. This implies that jumps to
labels defined in other functions have undefined behavior as well.
If '``address``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

This is typically implemented with a jump through a register.

Example:
""""""""

.. code-block:: llvm

      indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]

.. _i_invoke:

'``invoke``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                    [operand bundles] to label <normal label> unwind label <exception label>

Overview:
"""""""""

The '``invoke``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``normal``' label or the '``exception``' label. If the callee function
returns with the "``ret``" instruction, control flow will return to the
"normal" label. If the callee (or any indirect callees) returns via the
":ref:`resume <i_resume>`" instruction or other exception handling
mechanism, control is interrupted and continued at the dynamically
nearest "exception" label.

The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction.
The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being invoked. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with a label, which is used by the runtime
library to unwind the stack.

This instruction is used in languages with destructors to ensure that
proper cleanup is performed in the case of either a ``longjmp`` or a
thrown exception. Additionally, this is important for implementation of
'``catch``' clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned
by the '``invoke``' instruction is deemed to occur on the edge from the
current block to the "normal" label. If the callee unwinds then no
return value is available.

Example:
""""""""

.. code-block:: llvm

      %retval = invoke i32 @Test(i32 15) to label %Continue
                  unwind label %TestCleanup              ; i32:retval set
      %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                  unwind label %TestCleanup              ; i32:retval set

.. _i_callbr:

'``callbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                    [operand bundles] to label <fallthrough label> [indirect labels]

Overview:
"""""""""

The '``callbr``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``fallthrough``' label or one of the '``indirect``' labels.

This instruction should only be used to implement the "goto" feature of gcc
style inline assembly. Any other usage is an error in the IR verifier.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called.
   In most cases, this is a direct function call, but
   other ``callbr``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``fallthrough label``': the label reached when the inline assembly's
   execution exits the bottom.
#. '``indirect labels``': the labels reached when a callee transfers control
   to a location other than the '``fallthrough label``'. The blockaddress
   constant for these should also be in the list of '``function args``'.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with additional labels to define where control
flow goes after the call.

The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).

The only use of this today is to implement the "goto" feature of gcc inline
assembly where additional labels can be provided as locations for the inline
assembly to jump to.

Example:
""""""""

.. code-block:: llvm

      ; "asm goto" without output constraints.
      callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
                  to label %fallthrough [label %indirect]

      ; "asm goto" with output constraints.
      <result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
                  to label %fallthrough [label %indirect]

.. _i_resume:

'``resume``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      resume <type> <value>

Overview:
"""""""""

The '``resume``' instruction is a terminator instruction that has no
successors.

Arguments:
""""""""""

The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.

Semantics:
""""""""""

The '``resume``' instruction resumes propagation of an existing
(in-flight) exception whose unwinding was interrupted with a
:ref:`landingpad <i_landingpad>` instruction.

Example:
""""""""

.. code-block:: llvm

      resume { i8*, i32 } %exn

.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>

Overview:
"""""""""

The '``catchswitch``' instruction is used by `LLVM's exception handling system
<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
that may be executed by the :ref:`EH personality routine <personalityfn>`.

Arguments:
""""""""""

The ``parent`` argument is the token of the funclet that contains the
``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
this operand may be the token ``none``.

The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction.
This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.

Semantics:
""""""""""

Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.

The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
it must be both the first non-phi instruction and last instruction in the basic
block. Therefore, it must be the only non-phi instruction in the block.

Example:
""""""""

.. code-block:: text

    dispatch1:
      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
    dispatch2:
      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup

.. _i_catchret:

'``catchret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      catchret from <token> to label <normal>

Overview:
"""""""""

The '``catchret``' instruction is a terminator instruction that has a
single successor.


Arguments:
""""""""""

The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.

Semantics:
""""""""""

The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception.
Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.

Example:
""""""""

.. code-block:: text

      catchret from %catch to label %continue

.. _i_cleanupret:

'``cleanupret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cleanupret from <value> unwind label <continue>
      cleanupret from <value> unwind to caller

Overview:
"""""""""

The '``cleanupret``' instruction is a terminator instruction that has
an optional successor.


Arguments:
""""""""""

The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.

The '``cleanupret``' instruction also has an optional successor, ``continue``,
which must be the label of another basic block beginning with either a
``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
be a legal target with respect to the ``parent`` links, as described in the
`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

Semantics:
""""""""""

The '``cleanupret``' instruction indicates to the
:ref:`personality function <personalityfn>` that one
:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
It transfers control to ``continue`` or unwinds out of the function.
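
To put the instruction in context, here is a hedged sketch of a function
whose cleanup funclet ends in a '``cleanupret``'. The names ``@f``,
``@may_throw`` and ``@release`` are hypothetical, and the choice of
``@__CxxFrameHandler3`` as the personality routine is an assumption made
for illustration, not a requirement of the instruction:

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %cleanup

      cleanup:
        ; The cleanuppad token identifies this funclet.
        %cp = cleanuppad within none []
        call void @release() [ "funclet"(token %cp) ]
        ; Exit the funclet and keep unwinding into the caller.
        cleanupret from %cp unwind to caller

      done:
        ret void
      }
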

Example:
""""""""

.. code-block:: text

      cleanupret from %cleanup unwind to caller
      cleanupret from %cleanup unwind label %continue

.. _i_unreachable:

'``unreachable``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      unreachable

Overview:
"""""""""

The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used to indicate that the code
after a no-return function cannot be reached, and other facts.

Semantics:
""""""""""

The '``unreachable``' instruction has no defined semantics.

.. _unaryops:

Unary Operations
----------------

Unary operators require a single operand, execute an operation on
it, and produce a single value. The operand might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operand.

.. _i_fneg:

'``fneg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result

Overview:
"""""""""

The '``fneg``' instruction returns the negation of its operand.

Arguments:
""""""""""

The argument to the '``fneg``' instruction must be a
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values.

Semantics:
""""""""""

The value produced is a copy of the operand with its sign bit flipped.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fneg float %val          ; yields float:result = -%val

.. _binaryops:

Binary Operations
-----------------

Binary operators are used to do most of the computation in a program.
They require two operands of the same type, execute an operation on
them, and produce a single value. The operands might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operands.

There are several different binary operators:

.. _i_add:

'``add``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = add <ty> <op1>, <op2>          ; yields ty:result
      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``add``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``add``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer sum of the two operands.

If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = add i32 4, %var          ; yields i32:result = 4 + %var

.. _i_fadd:

'``fadd``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fadd``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``fadd``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

.. _i_sub:

'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``sub``' instruction returns the difference of its two operands.

Note that the '``sub``' instruction is used to represent the '``neg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``sub``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values.
Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val

.. _i_fsub:

'``fsub``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fsub``' instruction returns the difference of its two operands.

Arguments:
""""""""""

The two arguments to the '``fsub``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val

.. _i_mul:

'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``mul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``mul``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer product of the two operands.

If the result of the multiplication has unsigned overflow, the result
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
bit width of the result.

Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

.. _i_fmul:

'``fmul``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fmul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``fmul``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

.. _i_udiv:

'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``udiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``udiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are
distinct operations; for signed integer division, use '``sdiv``'.

Division by zero is undefined behavior.
For vectors, if any element
of the divisor is zero, the operation has undefined behavior.

If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, "((a udiv exact b) mul b) == a").

Example:
""""""""

.. code-block:: text

      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_sdiv:

'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.

Example:
""""""""

.. code-block:: text

      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

.. _i_urem:

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.
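
As a minimal sketch of why the two remainders are distinct, the hypothetical
functions below (``@urem_neg7`` and ``@srem_neg7`` are illustrative names, not
part of LLVM) apply both instructions to the same ``i8`` bit pattern. The
constant ``-7`` is the bit pattern ``0xF9``, which ``urem`` reads as 249 and
``srem`` reads as -7:

.. code-block:: llvm

      ; Hypothetical function names, for illustration only.
      define i8 @urem_neg7() {
        ; 0xF9 is 249 when treated as unsigned; 249 = 83 * 3, so the remainder is 0.
        %r = urem i8 -7, 3
        ret i8 %r
      }

      define i8 @srem_neg7() {
        ; -7 = (-2 * 3) + (-1); the remainder takes the sign of the dividend, so -1.
        %r = srem i8 -7, 3
        ret i8 %r
      }
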

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.

Example:
""""""""

.. code-block:: text

      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_srem:

'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)

Example:
""""""""

.. code-block:: text

      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program.
They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

.. _i_shl:

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
If the arguments are vectors, each vector element of ``op1`` is shifted
by the corresponding shift amount in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a poison
value if it shifts out any non-zero bits.
If the ``nsw`` keyword is present, then the shift produces a poison
value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison value (shift amount equals the bit width)
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>

.. _i_lshr:

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison value (shift amount equals the bit width)
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

.. _i_ashr:

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison value (shift amount equals the bit width)
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>

.. _i_and:

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0

.. _i_or:

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12

.. _i_xor:

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
      <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val`` for a fixed-length vector, the result is a
:ref:`poison value <poisonvalues>`. For a scalable vector, if the value
of ``idx`` exceeds the runtime length of the vector, the result is a
:ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
      <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
if the value of ``idx`` exceeds the runtime length of the vector, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask vector constant
whose element type is ``i32``. The mask vector elements must be constant
integers or ``undef`` values. The result of the instruction is a vector
whose length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. For each element of the result vector, the
shuffle mask selects an element from one of the input vectors to copy
to the result. Non-negative elements in the mask represent an index
into the concatenated pair of input vectors.

If the shuffle mask is undefined, the result vector is undefined. If
the shuffle mask selects an undefined element from one of the input
vectors, the resulting element is undefined. An undefined element
in the mask vector specifies that the resulting element is undefined.
An undefined element in the mask vector prevents a poisoned vector
element from propagating.
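
The numbering across the concatenated inputs can be sketched as follows. This
is only an illustration: the function names ``@interleave_low`` and
``@with_undef_lane`` are hypothetical. With two ``<4 x i32>`` inputs, mask
values 0 to 3 name the elements of ``%v1`` and mask values 4 to 7 name the
elements of ``%v2``:

.. code-block:: llvm

      ; Hypothetical function names, for illustration only.
      define <4 x i32> @interleave_low(<4 x i32> %v1, <4 x i32> %v2) {
        ; Mask 0 -> %v1[0], 4 -> %v2[0], 1 -> %v1[1], 5 -> %v2[1].
        %r = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                           <4 x i32> <i32 0, i32 4, i32 1, i32 5>
        ret <4 x i32> %r
      }

      define <4 x i32> @with_undef_lane(<4 x i32> %v1, <4 x i32> %v2) {
        ; The undef mask element leaves lane 1 of the result undefined;
        ; the other lanes are %v1[3], %v2[3] and %v1[0].
        %r = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                           <4 x i32> <i32 3, i32 undef, i32 7, i32 0>
        ret <4 x i32> %r
      }
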

For scalable vectors, the only valid mask values at present are
``zeroinitializer`` and ``undef``, since we cannot write all indices as
literals for a vector with a length unknown at compile time.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences to ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.
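
The differences listed above can be sketched by reaching the same nested
element both ways. This is an illustration under assumptions: the function
name ``@third_float`` and its parameters are hypothetical, and the typed
pointer syntax follows the rest of this document.

.. code-block:: llvm

      ; Hypothetical function, for illustration only.
      define float @third_float({ i32, [4 x float] } %agg,
                                { i32, [4 x float] }* %p) {
        ; extractvalue indexes the value directly: field 1, then array element 2.
        ; There is no leading zero index, and both indices are constants in bounds.
        %a = extractvalue { i32, [4 x float] } %agg, 1, 2

        ; getelementptr needs the extra leading index (i64 0) to step through
        ; the pointer before selecting field 1 and array element 2.
        %addr = getelementptr { i32, [4 x float] }, { i32, [4 x float] }* %p,
                              i64 0, i32 1, i64 2
        %b = load float, float* %addr

        %sum = fadd float %a, %b
        ret float %sum
      }
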

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory.
In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. The object is always allocated in the
address space for allocas indicated in the datalayout.

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one. If a constant alignment is specified, the value result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 29``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.

'``type``' may be any sized type.

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The allocated memory is
uninitialized, and loading from uninitialized memory produces an undefined
value. The operation itself is undefined if there is insufficient stack
space for the allocation. '``alloca``'d memory is automatically released
when the function returns. The '``alloca``' instruction is commonly used
to represent automatic variables that must have an address available.
When the function returns (either with the ``ret`` or ``resume``
instructions), the memory is reclaimed. Allocating zero bytes is legal, but
the returned pointer may not be unique. The order in which memory is
allocated (i.e., which way the stack grows) is not specified.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                             ; yields i32*:ptr
      %ptr = alloca i32, i32 4                      ; yields i32*:ptr
      %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
      %ptr = alloca i32, align 1024                 ; yields i32*:ptr

.. _i_load:

'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}
      !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
      !<align_node> = !{ i64 <value_alignment> }

Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.
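
As a minimal sketch of the ``volatile`` rule, consider a hypothetical function
``@poll_twice`` that reads the same address twice (the function and parameter
names are assumptions made for illustration). Because both loads are
``volatile``, the optimizer must keep two loads in this order rather than
folding them into one:

.. code-block:: llvm

      ; Hypothetical function, for illustration only.
      define i32 @poll_twice(i32* %reg) {
        ; Two volatile loads of the same address: neither may be removed,
        ; merged with the other, or reordered relative to the other.
        %a = load volatile i32, i32* %reg
        %b = load volatile i32, i32* %reg
        %sum = add i32 %a, %b
        ret i32 %sum
      }

Without the ``volatile`` marker, the optimizer would be free to reuse ``%a``
for ``%b``, provided nothing in between may write to ``%reg``.
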

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic loads, and the load has undefined behavior if the
alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 29``. An alignment value higher
than the size of the loaded type implies memory up to the alignment
value bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<nontemp_node>`` corresponding to a metadata node with one
``i32`` entry of value 1.
The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. If a load instruction tagged with the ``!invariant.load``
metadata is executed, the optimizer may assume the memory location
referenced by the load contains the same value at all points in the
program where the memory location is known to be dereferenceable;
otherwise, the behavior is undefined.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. If the value is null at runtime, the behavior is undefined.
This is analogous to the ``nonnull`` attribute on parameters and return
values. This metadata can only be applied to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
<md_dereferenceable_or_null>`.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type. If the returned
value is not appropriately aligned at runtime, the behavior is undefined.

The optional ``!noundef`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a node with no entries. The existence of
``!noundef`` metadata on the instruction tells the optimizer that the value
loaded is known to be :ref:`well defined <welldefinedvalues>`.
If the value isn't well defined, the behavior is undefined.

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.
If the value being loaded is of aggregate type, the bytes that correspond to
padding may be accessed but are ignored, because it is impossible to observe
padding from the loaded aggregate value.

If the pointer is not a well-defined value, all of its possible representations
should be dereferenceable.
For example, loading a byte from a pointer to an
array of type ``[16 x i8]`` with offset ``undef & 31`` is undefined behavior.
Loading a byte at offset ``undef & 15`` nondeterministically reads one of the
bytes.

Examples:
"""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store and an
address at which to store it. The type of the ``<pointer>`` operand must be a
pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``store`` with other
:ref:`volatile operations <volatile>`. Only values of :ref:`first class
<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
structural type <t_opaque>`) can be stored.

If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic stores, and the store has undefined behavior if
the alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe. The maximum possible alignment is ``1 << 29``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be stored to without trapping in the default
address space. Storing to the higher bytes however may result in data
races if another thread can access the same address. Introducing a
data race is not allowed. Storing to the extra bytes is not allowed
even in situations where a data race is known to not exist if the
function has the ``sanitize_address`` attribute.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
of value 1.
The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

The optional ``!invariant.group`` metadata must reference a
single metadata name ``<empty_node>``. See ``invariant.group`` metadata.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand. If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.
If ``<value>`` is of aggregate type, padding is filled with
:ref:`undef <undefvalues>`.

If ``<pointer>`` is not a well-defined value, all of its possible
representations should be dereferenceable. For example, storing a byte to a
pointer to an array of type ``[16 x i8]`` with offset ``undef & 31`` is
undefined behavior. Storing a byte to an offset ``undef & 15``
nondeterministically stores to one of the offsets from 0 to 15.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      fence [syncscope("<target-scope>")] <ordering>  ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B. Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.

A ``fence`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Example:
""""""""

.. code-block:: text

      fence acquire                                        ; yields void
      fence syncscope("singlethread") seq_cst              ; yields void
      fence syncscope("agent") seq_cst                     ; yields void

.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering> ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '<cmp>' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '<cmp>' and '<new>' must
have the same type, and the type of '<pointer>' must be a pointer to
that type. If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, the ordering constraint on failure must be no
stronger than that on success, and the failure ordering cannot be either
``release`` or ``acq_rel``.

A ``cmpxchg`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

The pointer passed into cmpxchg must have alignment greater than or
equal to the size in memory of the operand.
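
As an illustrative sketch of the operand rules above, the function below
tries to change a flag from 0 to 1. The names ``@try_claim`` and ``%flag`` are
hypothetical. The success ordering is ``acquire`` and the failure ordering is
the weaker ``monotonic``, which satisfies the constraints on the two ordering
arguments.

.. code-block:: llvm

    ; Hypothetical example: %flag is assumed to be at least 4-byte aligned,
    ; matching the size in memory of the i32 operand.
    define i1 @try_claim(i32* %flag) {
    entry:
      ; Compare the value at %flag with 0 and, if equal, try to store 1.
      ; 'weak' permits a spurious failure, so callers usually retry in a loop.
      %pair = cmpxchg weak i32* %flag, i32 0, i32 1 acquire monotonic
      ; Field 1 of the { i32, i1 } result is the success flag.
      %ok = extractvalue { i32, i1 } %pair, 1
      ret i1 %ok
    }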

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
written to the location. The original value at the location is returned,
together with a flag indicating success (true) or failure (false).

If the cmpxchg operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

Example:
""""""""

.. code-block:: llvm

    entry:
      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...

.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>                   ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin
-  fadd
-  fsub

For most of these operations, the type of '<value>' must be an integer
type whose bit width is a power of two greater than or equal to eight
and less than or equal to a target-specific size limit. For xchg, this
may also be a floating point type with the same size constraints as
integers. For fadd/fsub, this must be a floating point type. The
type of the '``<pointer>``' operand must be a pointer to that type. If
the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this
``atomicrmw`` with other :ref:`volatile operations <volatile>`.

An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
   comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
   comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)

Example:
""""""""

.. code-block:: llvm

      %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32

.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory. The instruction can also
be used to calculate a vector of such addresses.

Arguments:
""""""""""

The first argument is always a type used as the basis for the calculations.
The second argument is always a pointer or a vector of pointers, and is the
base address to start from. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
second argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value, subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.
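
A minimal sketch of how the first and second indices differ is shown below.
The type ``%pair`` and the function ``@second_field_of_third`` are
hypothetical names used only for illustration.

.. code-block:: llvm

    %pair = type { i32, i32 }

    ; Hypothetical example: %base is assumed to point into an array of %pair.
    define i32* @second_field_of_third(%pair* %base) {
    entry:
      ; The first index (i64 2) indexes the pointer itself: it steps over
      ; two whole %pair objects starting from %base.
      ; The second index (i32 1) then selects field 1 inside that %pair.
      %p = getelementptr %pair, %pair* %base, i64 2, i32 1
      ret i32* %p
    }

In C terms this computes ``&base[2].second`` for a struct with two ``int``
fields, without reading any memory.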

The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant). When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }

The LLVM code generated by Clang is:

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.
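
To make the address arithmetic concrete, the sketch below computes the same
address with a single byte offset. It assumes a data layout in which ``i32``
is 4-byte aligned and ``double`` is 8-byte aligned (as on typical x86-64
targets); other layouts give different numbers. Under that assumption
``%struct.RT`` is 808 bytes and ``%struct.ST`` is 824 bytes, so the offset is
``1*824 + 16 + 4 + 5*80 + 13*4 = 1296``. The function name ``@foo_bytes`` is
hypothetical.

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    ; Hypothetical equivalent of @foo, valid only under the layout assumed above.
    define i32* @foo_bytes(%struct.ST* %s) {
    entry:
      %raw = bitcast %struct.ST* %s to i8*
      ; 824 (skip s[0]) + 16 (field Z) + 4 (field B) + 400 (row 5) + 52 (column 13)
      %byte = getelementptr inbounds i8, i8* %raw, i64 1296
      %elt = bitcast i8* %byte to i32*
      ret i32* %elt
    }

Writing the typed form instead lets the target's data layout determine these
offsets, which is why front ends normally emit it.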

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
      ret i32* %t5
    }

If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
pointer is not an *in bounds* address of an allocated object, or if any
of the addresses that would be formed by successive addition of the
offsets implied by the indices to the base address with infinitely
precise signed arithmetic are not an *in bounds* address of that
allocated object. The *in bounds* addresses for an allocated object are
all the addresses that point into the object, plus the address one byte
past the end. The only *in bounds* address for a null pointer in the
default address-space is the null pointer itself. In cases where the
base is a vector of pointers the ``inbounds`` keyword applies to each
of the computations element-wise.

If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer.
The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. The result value may not necessarily be used to access memory
though, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.

If the ``inrange`` keyword is present before any index, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as ``inrange``. The result of a
pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons
in the case where both operands are in the range of the element selected
by the ``inrange`` keyword, inclusive of the address one past the end of
that element. Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.

The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

        ; yields [12 x i8]*:aptr
        %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
        ; yields i8*:vptr
        %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
        ; yields i8*:eptr
        %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
        ; yields i32*:iptr
        %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0

Vector of pointers:
"""""""""""""""""""

The ``getelementptr`` returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector.
In such cases, all vector
arguments should have the same number of elements, and every scalar argument
will be effectively broadcast into a vector during address calculation.

.. code-block:: llvm

     ; All arguments are vectors:
     ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
     %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

     ; Add the same scalar offset to each pointer of a vector:
     ;   A[i] = ptrs[i] + offset*sizeof(i8)
     %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

     ; Add distinct offsets to the same pointer:
     ;   A[i] = ptr + offsets[i]*sizeof(i8)
     %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets

     ; In all cases described above the type of the result is <4 x i8*>

The two following instructions are equivalent:

.. code-block:: llvm

     getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
       <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
       <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
       <4 x i32> %ind4,
       <4 x i64> <i64 13, i64 13, i64 13, i64 13>

     getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
       i32 2, i32 1, <4 x i32> %ind4, i64 13

Let's look at the C code, where the vector version of ``getelementptr``
makes sense:

.. code-block:: c

    // Let's assume that we vectorize the following loop:
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      A[i] = B[C[i]];
    }

.. code-block:: llvm

    ; get pointers for 8 elements from array B
    %ptrs = getelementptr double, double* %B, <8 x i32> %C
    ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
         i32 8, <8 x i1> %mask, <8 x double> %passthru)

Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type.
They perform
various bit conversions on the operand.

.. _i_trunc:

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

Example:
""""""""

.. code-block:: llvm

      %X = trunc i32 257 to i8                        ; yields i8:1
      %Y = trunc i32 123 to i1                        ; yields i1:true
      %Z = trunc i32 122 to i1                        ; yields i1:false
      %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

.. _i_zext:

'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from i1, the result will always be either 0 or 1.

Example:
""""""""

.. code-block:: llvm

      %X = zext i32 257 to i64              ; yields i64:257
      %Y = zext i1 true to i32              ; yields i32:1
      %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

.. _i_sext:

'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sext``' instruction sign extends ``value`` to the type ``ty2``.

Arguments:
""""""""""

The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.

When sign extending from i1, the extension always results in -1 or 0.

Example:
""""""""

.. code-block:: llvm

      %X = sext i8  -1 to i16              ; yields i16:65535
      %Y = sext i1 true to i32             ; yields i32:-1
      %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
The size of ``value`` must be larger than the size of ``ty2``. This
implies that ``fptrunc`` cannot be used to make a *no-op cast*.

Semantics:
""""""""""

The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.

Example:
""""""""

.. code-block:: llvm

      %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
      %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity

'``fpext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fpext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fpext``' instruction extends a floating-point ``value`` to a larger
floating-point value.

Arguments:
""""""""""

The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
to. The source type must be smaller than the destination type.

Semantics:
""""""""""

The '``fpext``' instruction extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The ``fpext`` cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.

Example:
""""""""

.. code-block:: llvm

      %X = fpext float 3.125 to double         ; yields double:3.125000e+00
      %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000

'``fptoui .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptoui <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptoui``' instruction converts a floating-point ``value`` to its
unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptoui``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

      %X = fptoui double 123.0 to i32      ; yields i32:123
      %Y = fptoui float 1.0E+300 to i1     ; yields undefined:1
      %Z = fptoui float 1.04E+17 to i8     ; yields undefined:1

'``fptosi .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptosi <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptosi``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

      %X = fptosi double -123.0 to i32      ; yields i32:-123
      %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
      %Z = fptosi float 1.04E+17 to i8      ; yields poison (value does not fit in i8)

'``uitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = uitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``uitofp``' instruction regards ``value`` as an unsigned integer
and converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating-point
value. If the value cannot be exactly represented, it is rounded using
the default rounding mode.

Example:
""""""""

.. code-block:: llvm

      %X = uitofp i32 257 to float         ; yields float:257.0
      %Y = uitofp i8 -1 to double          ; yields double:255.0

'``sitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sitofp``' instruction regards ``value`` as a signed integer and
converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.

Example:
""""""""

.. code-block:: llvm

      %X = sitofp i32 257 to float         ; yields float:257.0
      %Y = sitofp i8 -1 to double          ; yields double:-1.0

.. _i_ptrtoint:

'``ptrtoint .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``ptrtoint``' instruction converts the pointer or a vector of
pointers ``value`` to the integer (or vector of integers) type ``ty2``.

Arguments:
""""""""""

The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.

Semantics:
""""""""""

The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done. If
``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

Example:
""""""""

.. code-block:: llvm

      %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
      %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x i32*> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture

.. _i_inttoptr:

'``inttoptr .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2

Overview:
"""""""""

The '``inttoptr``' instruction converts an integer ``value`` to a
pointer type, ``ty2``.

Arguments:
""""""""""

The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata.

Semantics:
""""""""""

The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
applying either a zero extension or a truncation depending on the size
of the integer ``value``. If ``value`` is larger than the size of a
pointer then a truncation is done. If ``value`` is smaller than the size
of a pointer then a zero extension is done. If they are the same size,
nothing is done (*no-op cast*).

Example:
""""""""

.. code-block:: llvm

      %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
      %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
      %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x i8*>; yields truncation of vector G to four pointers

.. _i_bitcast:

'``bitcast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.

Arguments:
""""""""""

The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

Example:
""""""""

.. code-block:: text

      %X = bitcast i8 255 to i8                 ; yields i8:-1
      %Y = bitcast i32* %x to i16*              ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64          ; yields i64:%V
      %W = bitcast <2 x i32*> %V to <2 x i64*>  ; yields <2 x i64*>

.. _i_addrspacecast:

'``addrspacecast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2

Overview:
"""""""""

The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
address space ``n`` to type ``pty2`` in address space ``m``.

Arguments:
""""""""""

The '``addrspacecast``' instruction takes a pointer or vector of pointer value
to cast and a pointer type to cast it to, which must have a different
address space.

Semantics:
""""""""""

The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address space
conversion is legal then both result and operand refer to the same memory
location.

Example:
""""""""

.. code-block:: llvm

      %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
      %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
      %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z

.. _otherops:

Other Operations
----------------

The instructions in this category are the "miscellaneous" instructions,
which defy better classification.

.. _i_icmp:

'``icmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``icmp``' instruction returns a boolean value or a vector of
boolean values based on comparison of its two integer, integer vector,
pointer, or pointer vector operands.

Arguments:
""""""""""

The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal

The remaining two arguments must be :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
must also be identical types.

Semantics:
""""""""""

The '``icmp``' compares ``op1`` and ``op2`` according to the condition
code given as ``cond``. The comparison performed always yields either an
:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:

#. ``eq``: yields ``true`` if the operands are equal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ugt``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than ``op2``.
#. ``uge``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than or equal to ``op2``.
#. ``ult``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than ``op2``.
#. ``ule``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than or equal to ``op2``.
#. ``sgt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than ``op2``.
#. ``sge``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than or equal to ``op2``.
#. ``slt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than ``op2``.
#. ``sle``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than or equal to ``op2``.
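
The signed and unsigned condition codes give different answers only when
the sign bit of an operand is set. As a minimal sketch (the function names
``@above_unsigned`` and ``@above_signed`` are hypothetical and chosen only
for illustration), the two functions below compare the same ``i8`` operands
under each interpretation:

.. code-block:: llvm

      ; Hypothetical helper: treats %a and %b as unsigned values in [0, 255].
      define i1 @above_unsigned(i8 %a, i8 %b) {
        %r = icmp ugt i8 %a, %b
        ret i1 %r
      }

      ; Hypothetical helper: treats %a and %b as two's complement values
      ; in [-128, 127].
      define i1 @above_signed(i8 %a, i8 %b) {
        %r = icmp sgt i8 %a, %b
        ret i1 %r
      }

With ``%a = -1`` (bit pattern ``0xFF``) and ``%b = 1``, ``@above_unsigned``
returns ``true`` because 255 is greater than 1, while ``@above_signed``
returns ``false`` because -1 is less than 1.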

If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
are compared as if they were integers.

If the operands are integer vectors, then they are compared element by
element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.

Example:
""""""""

.. code-block:: text

      <result> = icmp eq i32 4, 5          ; yields: result=false
      <result> = icmp ne float* %X, %X     ; yields: result=false
      <result> = icmp ult i16  4, 5        ; yields: result=true
      <result> = icmp sgt i16  4, 5        ; yields: result=false
      <result> = icmp ule i16 -4, 5        ; yields: result=false
      <result> = icmp sge i16  4, 5        ; yields: result=false

.. _i_fcmp:

'``fcmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``fcmp``' instruction returns a boolean value or vector of boolean
values based on comparison of its operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

Arguments:
""""""""""

The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no nans)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either nans)
#. ``true``: no comparison, always returns true

*Ordered* means that neither operand is a QNAN while *unordered* means
that either operand may be a QNAN.

The ``op1`` and ``op2`` arguments must each be of either a
:ref:`floating-point <t_floating>` type or a :ref:`vector <t_vector>` of
floating-point type. They must have identical types.

Semantics:
""""""""""

The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

#. ``false``: always yields ``false``, regardless of operands.
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
   is equal to ``op2``.
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than ``op2``.
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than or equal to ``op2``.
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than ``op2``.
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than or equal to ``op2``.
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
   is not equal to ``op2``.
#. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than ``op2``.
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than or equal to ``op2``.
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
   not equal to ``op2``.
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.

The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating-point optimizations.

Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.

Example:
""""""""

.. code-block:: text

      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

.. _i_phi:

'``phi``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...

Overview:
"""""""""

The '``phi``' instruction is used to implement the φ node in the SSA
graph representing the function.

Arguments:
""""""""""

The type of the incoming values is specified with the first type field.
After this, the '``phi``' instruction takes a list of pairs as
arguments, with one pair for each predecessor basic block of the current
block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.

There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.

For the purposes of the SSA form, the use of each incoming value is
deemed to occur on the edge from the corresponding predecessor block to
the current block (but after any definition of an '``invoke``'
instruction's return value on the same edge).

The optional ``fast-math-flags`` marker indicates that the phi has one
or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
to enable otherwise unsafe floating-point optimizations. Fast-math-flags
are only valid for phis that return a floating-point scalar or vector
type, or an array (nested to any depth) of floating-point scalar or vector
types.

Semantics:
""""""""""

At runtime, the '``phi``' instruction logically takes on the value
specified by the pair corresponding to the predecessor basic block that
executed just prior to the current block.

Example:
""""""""

.. code-block:: llvm

    Loop:       ; Infinite loop that counts from 0 on up...
      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
      %nextindvar = add i32 %indvar, 1
      br label %Loop

.. _i_select:

'``select``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty

      selty is either i1 or {<N x i1>}

Overview:
"""""""""

The '``select``' instruction is used to choose one value based on a
condition, without IR-level branching.

Arguments:
""""""""""

The '``select``' instruction requires an 'i1' value or a vector of 'i1'
values indicating the condition, and two values of the same :ref:`first
class <t_firstclass>` type.

#. The optional ``fast-math flags`` marker indicates that the select has one or more
   :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for selects that return a floating-point scalar or vector type, or an array
   (nested to any depth) of floating-point scalar or vector types.

Semantics:
""""""""""

If the condition is an i1 and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.

If the condition is a vector of i1, then the value arguments must be
vectors of the same size, and the selection is done element by element.

If the condition is an i1 and the value arguments are vectors of the
same size, then an entire vector is selected.

Example:
""""""""

.. code-block:: llvm

      %X = select i1 true, i8 17, i8 42          ; yields i8:17


.. _i_freeze:

'``freeze``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = freeze ty <val>    ; yields ty:result

Overview:
"""""""""

The '``freeze``' instruction is used to stop propagation of
:ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.

Arguments:
""""""""""

The '``freeze``' instruction takes a single argument.

Semantics:
""""""""""

If the argument is ``undef`` or ``poison``, '``freeze``' returns an
arbitrary, but fixed, value of type '``ty``'.
Otherwise, this instruction is a no-op and returns the input argument.
All uses of a value returned by the same '``freeze``' instruction are
guaranteed to always observe the same value, while different '``freeze``'
instructions may yield different values.

While ``undef`` and ``poison`` pointers can be frozen, the result is a
non-dereferenceable pointer. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
If an aggregate value or vector is frozen, the operand is frozen element-wise.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.


Example:
""""""""

.. code-block:: text

      %w = i32 undef
      %x = freeze i32 %w
      %y = add i32 %w, %w         ; undef
      %z = add i32 %x, %x         ; even number because all uses of %x observe
                                  ; the same value
      %x2 = freeze i32 %w
      %cmp = icmp eq i32 %x, %x2  ; can be true or false

      ; example with vectors
      %v = <2 x i32> <i32 undef, i32 poison>
      %a = extractelement <2 x i32> %v, i32 0    ; undef
      %b = extractelement <2 x i32> %v, i32 1    ; poison
      %add = add i32 %a, %a                      ; undef

      %v.fr = freeze <2 x i32> %v                ; element-wise freeze
      %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
      %add.f = add i32 %d, %d                    ; even number

      ; branching on frozen value
      %poison = add nsw i1 %k, undef   ; poison
      %c = freeze i1 %poison
      br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar


.. _i_call:

'``call``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
                 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]

Overview:
"""""""""

The '``call``' instruction represents a simple function call.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
   means that the call must be tail call optimized in order for the program to
   be correct. The ``musttail`` marker provides these guarantees:

   #. The call will not cause unbounded stack growth if it is part of a
      recursive cycle in the call graph.
   #. Arguments with the :ref:`inalloca <attr_inalloca>` or
      :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
   #. If the musttail call appears in a function with the ``"thunk"`` attribute
      and the caller and callee both have varargs, then any unprototyped
      arguments in register or memory are forwarded to the callee. Similarly,
      the return value of the callee is returned to the caller's caller, even
      if a void return type is in use.

   Both markers imply that the callee does not access allocas from the caller.
   The ``tail`` marker additionally implies that the callee does not access
   varargs from the caller. Calls marked ``musttail`` must obey the following
   additional rules:

   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
     or a pointer bitcast followed by a ret instruction.
   - The ret instruction must return the (possibly bitcasted) value
     produced by the call or void.
   - The caller and callee prototypes must match. Pointer types of
     parameters or return types may differ in pointee type, but not
     in address space.
   - The calling conventions of the caller and callee must match.
   - All ABI-impacting function attributes, such as sret, byval, inreg,
     returned, and inalloca, must match.
   - The callee must be varargs iff the caller is varargs. Bitcasting a
     non-varargs function to the appropriate varargs type is legal so
     long as the non-varargs prefixes obey the other rules.

   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
   the following conditions are met:

   - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
   - The call is in tail position (ret immediately follows call and ret
     uses value of call or is void).
   - Option ``-tailcallopt`` is enabled,
     ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
     is ``tailcc``
   - `Platform-specific constraints are
     met. <CodeGenerator.html#tailcallopt>`_

#. The optional ``notail`` marker indicates that the optimizers should not add
   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
   call optimization from being performed on the call.

#. The optional ``fast-math flags`` marker indicates that the call has one or more
   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for calls that return a floating-point scalar or vector type, or an array
   (nested to any depth) of floating-point scalar or vector types.

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions. The
   calling convention of the call must match the calling convention of
   the target function, or else the behavior is undefined.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect calls are equally possible, calling an arbitrary
   pointer-to-function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values. Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

Example:
""""""""

.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)   ; yields i32
      %X = tail call i32 @foo()                              ; yields i32
      %Y = tail call fastcc i32 @foo()                       ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn       ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()   ; return value is zero extended

LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument.
For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = landingpad <resultty> <clause>+
      <resultval> = landingpad <resultty> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad --- one where the exception lands, and corresponds to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

The optional
``cleanup`` flag indicates that the landing pad block is a cleanup.

A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
contains the global variable representing the "type" that may be caught
or filtered respectively.
Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.

Example:
""""""""

.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { i8*, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
               filter [1 x i8**] [i8** @_ZTId]

.. _i_catchpad:

'``catchpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchpad within <catchswitch> [<args>*]

Overview:
"""""""""

The '``catchpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
begins a catch handler --- one where a personality routine attempts to transfer
control to catch an exception.

Arguments:
""""""""""

The ``catchswitch`` operand must always be a token produced by a
:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
ensures that each ``catchpad`` has exactly one predecessor block, and it always
terminates in a ``catchswitch``.

The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown, the
exception is compared against the ``args``. If it doesn't match, control will
not reach the ``catchpad`` instruction. The representation of ``args`` is
entirely target and personality function-specific.

Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-PHI instruction of its parent basic block.

The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
instructions is described in the
`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.

When a ``catchpad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [i8** @_ZTIi]

.. _i_cleanuppad:

'``cleanuppad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""

The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction.
If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""

The instruction takes a list of arbitrary values which are interpreted
by the :ref:`personality function <personalityfn>`.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:

-  A cleanup block is a basic block which is the unwind destination of
   an exceptional instruction.
-  A cleanup block must have a '``cleanuppad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``cleanuppad``' instruction within the
   cleanup block.
-  A basic block that is not a cleanup block may not include a
   '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

      %tok = cleanuppad within %cs []

.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions.
Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc.).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any newly added
intrinsic must be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types. Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

Overloaded intrinsics have the names of their overloaded argument
types encoded into their function names, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not.
For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.

For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
integer or floating point types should not be relied upon for correct
code generation. In such cases, the recommended approach for target
maintainers when defining intrinsics is to create separate integer and
FP intrinsics rather than rely on overloading. For example, if different
codegen is required for ``llvm.target.foo(<4 x i32>)`` and
``llvm.target.foo(<4 x float>)`` then these should be split into
different intrinsics.

To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions operate on arguments that use a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end.
      ; The destination is a full va_list so the copy fits on every platform.
      %aq = alloca %struct.va_list
      %aq2 = bitcast %struct.va_list* %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C.
In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C.
In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.

Experimental Statepoint Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM provides a second experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second argument (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.

.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).
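
As a rough illustration of the argument order described above, the following
sketch registers a stack root and then reads a reference through
``llvm.gcread``. It assumes the built-in ``shadow-stack`` GC strategy; the
function name ``@load_field`` and its parameters are hypothetical and chosen
only for this example.

.. code-block:: llvm

    declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
    declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

    ; Hypothetical function: %obj is the start of a heap object and
    ; %field is the address of a reference-typed field inside it.
    define i8* @load_field(i8* %obj, i8** %field) gc "shadow-stack" {
    entry:
      ; The root is a stack slot (an alloca) declared in the entry block.
      %root = alloca i8*
      call void @llvm.gcroot(i8** %root, i8* null)
      store i8* %obj, i8** %root

      ; First argument: start of the object. Second: address to read from.
      %o = load i8*, i8** %root
      %ref = call i8* @llvm.gcread(i8* %o, i8** %field)
      ret i8* %ref
    }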

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
``Obj`` to store to. If the runtime does not require a pointer to the
object, ``Obj`` may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.addressofreturnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.addressofreturnaddress()

Overview:
"""""""""

The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
pointer to the place in the stack frame where the return address of the
current function is stored.

Semantics:
""""""""""

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.
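
A minimal sketch of how the intrinsic might be used, assuming a target on
which the return address is kept in the stack frame. The function name
``@read_own_return_address`` is hypothetical, and ``noinline`` is added only
so that the slot inspected belongs to this function's own frame rather than
to a function it was inlined into.

.. code-block:: llvm

    declare i8* @llvm.addressofreturnaddress()

    define i8* @read_own_return_address() noinline {
    entry:
      ; Pointer to the stack slot that holds this function's return address.
      %slot = call i8* @llvm.addressofreturnaddress()
      ; Reinterpret the slot and load the return address stored in it.
      %slot.typed = bitcast i8* %slot to i8**
      %ra = load i8*, i8** %slot.typed
      ret i8* %ra
    }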

This intrinsic is only implemented for x86 and AArch64.

'``llvm.sponentry``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.sponentry()

Overview:
"""""""""

The '``llvm.sponentry``' intrinsic returns the stack pointer value at
the entry of the current function calling this intrinsic.

Semantics:
""""""""""

Note this intrinsic is only verified on AArch64.

'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)

Overview:
"""""""""

The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
live frame pointer to recover the address of the allocation. The offset is
computed during frame layout of the caller of ``llvm.localescape``.

Arguments:
""""""""""

All arguments to '``llvm.localescape``' must be pointers to static allocas or
casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.

The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.

The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
call frame that is currently live. The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.

.. _int_read_register:
.. _int_read_volatile_register:
.. _int_write_register:

'``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.read_register.i32(metadata)
      declare i64 @llvm.read_register.i64(metadata)
      declare i32 @llvm.read_volatile_register.i32(metadata)
      declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
      !0 = !{!"sp\00"}

Overview:
"""""""""

The '``llvm.read_register``', '``llvm.read_volatile_register``', and
'``llvm.write_register``' intrinsics provide access to the named register.
The register must be valid on the architecture being compiled to. The type
needs to be compatible with the register being read.

Semantics:
""""""""""

The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
return the current value of the register, where possible. The
'``llvm.write_register``' intrinsic sets the current value of the register,
where possible.

A call to '``llvm.read_volatile_register``' is assumed to have side-effects
and possibly return a different value each time (e.g. for a timer register).

This is useful to implement named register global variables that need
to always be mapped to a specific register, as is common practice on
bare-metal programs including OS kernels.

The compiler doesn't check for register availability or use of the used
register in surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.
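
A minimal sketch of reading the stack pointer, assuming an AArch64 target on
which the register is named ``sp``. The function name ``@get_sp`` is
hypothetical, and other targets may use a different register name.

.. code-block:: llvm

    declare i64 @llvm.read_register.i64(metadata)

    define i64 @get_sp() {
    entry:
      ; The metadata operand names the register to read.
      %sp = call i64 @llvm.read_register.i64(metadata !0)
      ret i64 %sp
    }

    ; Register name as a metadata string.
    !0 = !{!"sp\00"}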

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and even more so, allocatable
registers.

.. _int_stacksave:

'``llvm.stacksave``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stacksave()

Overview:
"""""""""

The '``llvm.stacksave``' intrinsic is used to remember the current state
of the function stack, for use with
:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
implementing language features like scoped automatic variable sized
arrays in C99.

Semantics:
""""""""""

This intrinsic returns an opaque pointer value that can be passed to
:ref:`llvm.stackrestore <int_stackrestore>`. When an
``llvm.stackrestore`` intrinsic is executed with a value saved from
``llvm.stacksave``, it effectively restores the state of the stack to
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
were allocated after the ``llvm.stacksave`` was executed.

.. _int_stackrestore:

'``llvm.stackrestore``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackrestore(i8* %ptr)

Overview:
"""""""""

The '``llvm.stackrestore``' intrinsic is used to restore the state of
the function stack to the state it was in when the corresponding
:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
useful for implementing language features like scoped automatic variable
sized arrays in C99.

Semantics:
""""""""""

See the description for :ref:`llvm.stacksave <int_stacksave>`.
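
The pairing of the two intrinsics can be sketched as follows. This is an
illustration only, roughly what a front-end might emit for a scoped
variable-length array; the names ``@scoped_vla`` and ``@use`` are hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)
      declare void @use(i32*)                  ; hypothetical consumer

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state before the dynamic allocation.
        %saved = call i8* @llvm.stacksave()
        %buf = alloca i32, i32 %n
        call void @use(i32* %buf)
        ; Pop %buf (and anything else allocated since the save).
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }

After the ``llvm.stackrestore`` call, ``%buf`` must no longer be accessed,
since its storage has been released.
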

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.dynamic.area.offset.i32()
      declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer would get the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer would get the address
one past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
must match the target's default address space's (address space 0) pointer type.
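
A rough sketch of the combination with ``llvm.stacksave`` mentioned above.
It assumes a target with 64-bit pointers whose stack grows downwards, and it
assumes that the pointer returned by ``llvm.stacksave`` can stand in for the
native stack pointer. The names ``@locate_dynamic_alloca`` and ``@consume`` are
hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()
      declare void @consume(i8*)               ; hypothetical consumer

      define void @locate_dynamic_alloca(i64 %n) {
      entry:
        %buf = alloca i8, i64 %n               ; most recent dynamic alloca
        %sp = call i8* @llvm.stacksave()
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Downward-growing stack: stack pointer + offset.
        %addr = getelementptr i8, i8* %sp, i64 %off
        call void @consume(i8* %addr)
        ret void
      }
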

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)

Overview:
"""""""""

The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a no-op.
Prefetches have no effect on the behavior of the program but can change
its performance characteristics.

Arguments:
""""""""""

``address`` is the address to be prefetched, ``rw`` is the specifier
determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from (0) - no
locality, to (3) - extremely local, keep in cache. The ``cache type``
specifies whether the prefetch is performed on the data (1) or
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
arguments must be constant integers.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. In
particular, prefetches cannot trap and do not produce a value. On
targets that support this intrinsic, the prefetch can provide hints to
the processor cache for better performance.

'``llvm.pcmarker``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.pcmarker(i32 <id>)

Overview:
"""""""""

The '``llvm.pcmarker``' intrinsic is a method to export a Program
Counter (PC) in a region of code to simulators and other tools. The
method is target specific, but it is expected that the marker will use
exported symbols to transmit the PC of the marker. The marker makes no
guarantees that it will remain with any specific instruction after
optimizations.
It is possible that the presence of a marker will inhibit
optimizations. The intended use is to be inserted after optimizations to
allow correlations of simulation runs.

Arguments:
""""""""""

``id`` is a numerical id identifying the marker.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. Backends
that do not support this intrinsic may ignore it.

'``llvm.readcyclecounter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64 @llvm.readcyclecounter()

Overview:
"""""""""

The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.

Semantics:
""""""""""

When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to return either an
application-specific value or a system-wide value. On backends without
support, this is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the
code is running at and on the host platform.

'``llvm.clear_cache``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.clear_cache(i8*, i8*)

Overview:
"""""""""

The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
in the specified range to the execution unit of the processor. On
targets with non-unified instruction and data cache, the implementation
flushes the instruction cache.
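
A minimal sketch of a call, assuming the two pointer arguments delimit the
start and the end of the modified range (mirroring the ``__clear_cache``
run-time library routine). The function and parameter names are hypothetical.

.. code-block:: llvm

      declare void @llvm.clear_cache(i8*, i8*)

      ; Hypothetical helper: make freshly written code in [%begin, %end)
      ; visible to instruction fetch before it is executed.
      define void @publish_code(i8* %begin, i8* %end) {
      entry:
        call void @llvm.clear_cache(i8* %begin, i8* %end)
        ret void
      }
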

Semantics:
""""""""""

On platforms with coherent instruction and data caches (e.g. x86), this
intrinsic is a nop. On platforms with non-coherent instruction and data
cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
instructions or a system call, if cache flushing requires special
privileges.

The default behavior is to emit a call to ``__clear_cache`` from the run
time library.

This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.

'``llvm.instrprof.increment``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
                                             i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.increment``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. These will be
lowered by the ``-instrprof`` pass to generate execution counts of a
program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source, and
the third is the number of counters associated with ``name``. It is an
error if ``hash`` or ``num-counters`` differ between two instances of
``instrprof.increment`` that refer to the same name.

The last argument refers to which of the counters for ``name`` should
be incremented. It should be a value between 0 and ``num-counters``.
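
The arguments above can be sketched in a call as follows. This is an
illustration under assumptions: the global ``@__profn_foo`` and the function
``@foo`` are hypothetical names, the hash value ``0`` is a placeholder, and a
single counter is used.

.. code-block:: llvm

      ; Name of the instrumented entity (hypothetical global).
      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo() {
      entry:
        ; name = "foo", hash = 0 (placeholder), num-counters = 1, index = 0
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 0, i32 1, i32 0)
        ret void
      }
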

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.

'``llvm.instrprof.increment.step``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
                                                  i32 <num-counters>,
                                                  i32 <index>, i64 <step>)

Overview:
"""""""""

The '``llvm.instrprof.increment.step``' intrinsic is an extension to
the '``llvm.instrprof.increment``' intrinsic with an additional fifth
argument to specify the step of the increment.

Arguments:
""""""""""

The first four arguments are the same as for the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.


'``llvm.instrprof.value.profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
                                                 i64 <value>, i32 <value_kind>,
                                                 i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. ``name`` should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source. It
is an error if ``hash`` differs between two instances of
``llvm.instrprof.*`` that refer to the same name.

The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value. The
fourth argument represents the kind of value profiling that is being done. The
supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
index of the instrumented expression within ``name``. It should be >= 0.

Semantics:
""""""""""

This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and replace
the ``llvm.instrprof.value.profile`` intrinsic with a call to the profile
runtime library with the proper arguments.

'``llvm.thread.pointer``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.thread.pointer()

Overview:
"""""""""

The '``llvm.thread.pointer``' intrinsic returns the value of the thread
pointer.

Semantics:
""""""""""

The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
for the current thread.
The exact semantics of this value are target
specific: it may point to the start of TLS area, to the end, or somewhere
in the middle. Depending on the target, this intrinsic may read a register,
call a helper function, read from an alternate memory space, or perform
other operations necessary to locate the TLS area. Not all targets support
this intrinsic.

'``llvm.call.preallocated.setup``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token @llvm.call.preallocated.setup(i32 %num_args)

Overview:
"""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
be used with a call's ``"preallocated"`` operand bundle to indicate that
certain arguments are allocated and initialized before the call.

Semantics:
""""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
``"preallocated"`` operand bundle for the corresponding call.

Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested, e.g.

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]

.. _int_call_preallocated_arg:

'``llvm.call.preallocated.arg``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)

Overview:
"""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
corresponding preallocated argument for the preallocated call.

Semantics:
""""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
argument at index ``%arg_index`` with the ``preallocated`` attribute for
the call associated with the ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.

A call to '``llvm.call.preallocated.arg``' must have a call site
``preallocated`` attribute. The type of the ``preallocated`` attribute must
match the type used by the ``preallocated`` attribute of the corresponding
argument at the preallocated call. The type is used in the case that an
``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
to DCE), where otherwise we cannot know how large the arguments are.

It is undefined behavior if this is called with a token from an
'``llvm.call.preallocated.setup``' if another
'``llvm.call.preallocated.setup``' has already been called or if the
preallocated call corresponding to the '``llvm.call.preallocated.setup``'
has already been called.

.. _int_call_preallocated_teardown:

'``llvm.call.preallocated.teardown``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.call.preallocated.teardown(token %setup_token)

Overview:
"""""""""

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
created by a '``llvm.call.preallocated.setup``'.

Semantics:
""""""""""

The token argument must be a token returned by
'``llvm.call.preallocated.setup``'.

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
one of this or the preallocated call must be called to prevent stack leaks.
It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
and the preallocated call for a given '``llvm.call.preallocated.setup``'.

For example, if the stack is allocated for a preallocated call by a
'``llvm.call.preallocated.setup``', then an initializer function called on an
allocated argument throws an exception, there should be a
'``llvm.call.preallocated.teardown``' in the exception handler to prevent
stack leaks.

Following the nesting rules in '``llvm.call.preallocated.setup``', nested
calls to '``llvm.call.preallocated.setup``' and
'``llvm.call.preallocated.teardown``' are allowed but must be properly
nested.

Example:
""""""""

.. code-block:: llvm

        %cs = call token @llvm.call.preallocated.setup(i32 1)
        %x = call i8* @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
        %y = bitcast i8* %x to i32*
        invoke void @constructor(i32* %y) to label %conta unwind label %contb
    conta:
        call void @foo1(i32* preallocated(i32) %y) ["preallocated"(token %cs)]
        ret void
    contb:
        %s = catchswitch within none [label %catch] unwind to caller
    catch:
        %p = catchpad within %s []
        call void @llvm.call.preallocated.teardown(token %cs)
        ret void

Standard C/C++ Library Intrinsics
---------------------------------

LLVM provides intrinsics for a few important standard C/C++ library
functions.
These intrinsics allow source-language front-ends to pass
information about the alignment of the pointer arguments to the code
generator, providing opportunity for more efficient code generation.


'``llvm.abs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.abs`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
      declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)

Overview:
"""""""""

The '``llvm.abs``' family of intrinsic functions returns the absolute value
of an argument.

Arguments:
""""""""""

The first argument is the value for which the absolute value is to be returned.
This argument may be of any integer type or a vector with integer element type.
The return type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether the
result value of the '``llvm.abs``' intrinsic is a
:ref:`poison value <poisonvalues>` if the argument is statically or dynamically
an ``INT_MIN`` value.

Semantics:
""""""""""

The '``llvm.abs``' intrinsic returns the magnitude (always positive) of the
argument or each element of a vector argument. If the argument is ``INT_MIN``,
then the result is also ``INT_MIN`` if ``is_int_min_poison == 0`` and
``poison`` otherwise.


'``llvm.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.smax.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
Vector intrinsics operate on a per-element basis. The larger element of ``%a``
and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


'``llvm.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.smin.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


'``llvm.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.umax.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
integers. Vector intrinsics operate on a per-element basis. The larger element
of ``%a`` and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


'``llvm.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.umin.i32(i32 %a, i32 %b)
      declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
integers. Vector intrinsics operate on a per-element basis. The smaller element
of ``%a`` and ``%b`` at a given index is returned for that index.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
integer element type. The argument types must match each other, and the return
type must match the argument type.


.. _int_memcpy:

'``llvm.memcpy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                              i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location.

Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
location to the destination location, which must either be equal or
non-overlapping. It copies "len" bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on the
argument.

If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
pointers. However, they must still be appropriately aligned.
If "len" isn't a well-defined value, all of its possible representations should
make the behavior of this ``llvm.memcpy`` defined, otherwise the behavior is
undefined.

.. _int_memcpy_inline:

'``llvm.memcpy.inline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                                     i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                                     i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is a constant integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. It copies "len" bytes of memory over.
If the argument is known
to be aligned to some boundary, this can be specified as an attribute on
the argument.

If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
pointers. However, they must still be appropriately aligned.

The generated code is guaranteed not to call any external functions.

.. _int_memmove:

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.

::

      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                               i32 <len>, i1 <isvolatile>)
      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                               i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`.
The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. It
copies "len" bytes of memory over. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If "len" is 0, the pointers may be NULL, dangling, ``undef``, or ``poison``
pointers. However, they must still be appropriately aligned.
If "len" isn't a well-defined value, all of its possible representations should
make the behavior of this ``llvm.memmove`` defined, otherwise the behavior is
undefined.

.. _int_memset:

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
                                         i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
                                         i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra volatile
argument. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.
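
The four arguments can be illustrated with a small sketch that zero-fills a
64-byte buffer. The function name ``@zero_buffer`` and the buffer size are
assumptions made for this example only.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      ; Hypothetical helper: fill 64 bytes starting at %buf with zero.
      define void @zero_buffer(i8* %buf) {
      entry:
        ; dest = %buf, val = 0, len = 64, isvolatile = false
        call void @llvm.memset.p0i8.i64(i8* %buf, i8 0, i64 64, i1 false)
        ret void
      }
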

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If "len" is 0, the pointer may be a NULL, dangling, ``undef``, or ``poison``
pointer. However, it must still be appropriately aligned.
If "len" isn't a well-defined value, all of its possible representations should
make the behavior of this ``llvm.memset`` defined; otherwise the behavior is
undefined.

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sqrt.f32(float %Val)
      declare double    @llvm.sqrt.f64(double %Val)
      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128     @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the square root of the specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sqrt``' function but without
trapping or setting ``errno``.
For types specified by IEEE-754, the result
matches a conforming libm implementation.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.powi.f32(float %Val, i32 %power)
      declare double    @llvm.powi.f64(double %Val, i32 %power)
      declare x86_fp80  @llvm.powi.f80(x86_fp80 %Val, i32 %power)
      declare fp128     @llvm.powi.f128(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating-point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.

'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sin.f32(float %Val)
      declare double    @llvm.sin.f64(double %Val)
      declare x86_fp80  @llvm.sin.f80(x86_fp80 %Val)
      declare fp128     @llvm.sin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sin.*``' intrinsics return the sine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sin``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.cos.f32(float %Val)
      declare double    @llvm.cos.f64(double %Val)
      declare x86_fp80  @llvm.cos.f80(x86_fp80 %Val)
      declare fp128     @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
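
Examples:
"""""""""

The following is a minimal sketch of how the ``llvm.sin`` and ``llvm.cos``
intrinsics might be called, with the 'afn' fast-math flag attached to only one
of the two calls. The function name ``@sin_plus_cos`` is hypothetical and
chosen purely for illustration.

.. code-block:: llvm

      declare float @llvm.sin.f32(float)
      declare float @llvm.cos.f32(float)

      define float @sin_plus_cos(float %x) {
        ; No fast-math flags: the result matches the libm 'sin' value.
        %s = call float @llvm.sin.f32(float %x)
        ; 'afn' permits a less accurate approximation for this call only.
        %c = call afn float @llvm.cos.f32(float %x)
        %sum = fadd float %s, %c
        ret float %sum
      }

Because fast-math flags are attached per call, the first call keeps the
libm-equivalent result while the second may be approximated.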

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.pow.f32(float %Val, float %Power)
      declare double    @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80  @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``pow``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.exp.f32(float %Val)
      declare double    @llvm.exp.f64(double %Val)
      declare x86_fp80  @llvm.exp.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.exp2.f32(float %Val)
      declare double    @llvm.exp2.f64(double %Val)
      declare x86_fp80  @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log.f32(float %Val)
      declare double    @llvm.log.f64(double %Val)
      declare x86_fp80  @llvm.log.f80(x86_fp80 %Val)
      declare fp128     @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log10.f32(float %Val)
      declare double    @llvm.log10.f64(double %Val)
      declare x86_fp80  @llvm.log10.f80(x86_fp80 %Val)
      declare fp128     @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
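
Examples:
"""""""""

The syntax blocks list only scalar overloads. As a hedged sketch of the vector
form mentioned in the text, the declaration below uses a ``<4 x float>``
operand; the ``v4f32`` suffix follows the usual overload naming, and the
function name ``@log10_lanes`` is hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.log10.v4f32(<4 x float>)

      define <4 x float> @log10_lanes(<4 x float> %v) {
        ; Computes the base-10 logarithm of each of the four elements.
        %r = call <4 x float> @llvm.log10.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }

Whether a given target supports a particular vector type is not guaranteed,
as noted in the syntax description.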

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log2.f32(float %Val)
      declare double    @llvm.log2.f64(double %Val)
      declare x86_fp80  @llvm.log2.f80(x86_fp80 %Val)
      declare fp128     @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_fma:

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.fma.f32(float %a, float %b, float %c)
      declare double    @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``fma``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.fabs.f32(float %Val)
      declare double    @llvm.fabs.f64(double %Val)
      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128     @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs. This matches the behavior of libm's fmin.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. The returned NaN is always
quiet. If the operands compare equal, returns a value that compares
equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
return either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
signaling NaNs. This matches the behavior of libm's fmax.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. The returned NaN is always
quiet. If the operands compare equal, returns a value that compares
equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
return either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

'``llvm.minimum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.minimum.f32(float %Val0, float %Val1)
      declare double    @llvm.minimum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minimum.*``' intrinsics return the minimum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2018.

'``llvm.maximum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.maximum.f32(float %Val0, float %Val1)
      declare double    @llvm.maximum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maximum.*``' intrinsics return the maximum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2018.

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80  @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.floor.f32(float %Val)
      declare double    @llvm.floor.f64(double %Val)
      declare x86_fp80  @llvm.floor.f80(x86_fp80 %Val)
      declare fp128     @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.ceil.f32(float %Val)
      declare double    @llvm.ceil.f64(double %Val)
      declare x86_fp80  @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128     @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.trunc.f32(float %Val)
      declare double    @llvm.trunc.f64(double %Val)
      declare x86_fp80  @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128     @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.rint.f32(float %Val)
      declare double    @llvm.rint.f64(double %Val)
      declare x86_fp80  @llvm.rint.f80(x86_fp80 %Val)
      declare fp128     @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. It may raise an inexact floating-point exception if the
operand isn't an integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.
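
Examples:
"""""""""

To contrast the two rounding intrinsics described above, here is a small
sketch. The function names ``@toward_zero`` and ``@to_nearest`` are
hypothetical, and the values in the comments follow from the libm ``trunc``
definition rather than from any output recorded in this document.

.. code-block:: llvm

      declare double @llvm.trunc.f64(double)
      declare double @llvm.rint.f64(double)

      define double @toward_zero(double %x) {
        ; Drops the fractional part: 2.7 gives 2.0 and -2.7 gives -2.0.
        %t = call double @llvm.trunc.f64(double %x)
        ret double %t
      }

      define double @to_nearest(double %x) {
        ; Rounds to an integral value as libm 'rint' would; this may raise
        ; the inexact exception when %x is not already an integer.
        %r = call double @llvm.rint.f64(double %x)
        ret double %r
      }

Both functions return a floating-point value of the same type as the operand;
no conversion to an integer type takes place.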

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.nearbyint.f32(float %Val)
      declare double    @llvm.nearbyint.f64(double %Val)
      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.round.f32(float %Val)
      declare double    @llvm.round.f64(double %Val)
      declare x86_fp80  @llvm.round.f80(x86_fp80 %Val)
      declare fp128     @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.

'``llvm.roundeven.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.roundeven.f32(float %Val)
      declare double    @llvm.roundeven.f64(double %Val)
      declare x86_fp80  @llvm.roundeven.f80(x86_fp80 %Val)
      declare fp128     @llvm.roundeven.f128(fp128 %Val)
      declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to the
nearest value that is an even integer).

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as C standard function ``roundeven``, except that
it does not raise floating-point exceptions.

'``llvm.lround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lround`` on any
floating-point type. Not all targets support all types however.

::

      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lround``
functions would, but without setting ``errno``.

'``llvm.llround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llround`` on any
floating-point type. Not all targets support all types however.

::

      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``llround``
functions would, but without setting ``errno``.

'``llvm.lrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
floating-point type. Not all targets support all types however.

::

      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
integer.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint``
functions would, but without setting ``errno``.

'``llvm.llrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
floating-point type. Not all targets support all types however.
13816 13817:: 13818 13819 declare i64 @llvm.llrint.i64.f32(float %Val) 13820 declare i64 @llvm.llrint.i64.f64(double %Val) 13821 declare i64 @llvm.llrint.i64.f80(float %Val) 13822 declare i64 @llvm.llrint.i64.f128(double %Val) 13823 declare i64 @llvm.llrint.i64.ppcf128(double %Val) 13824 13825Overview: 13826""""""""" 13827 13828The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest 13829integer. 13830 13831Arguments: 13832"""""""""" 13833 13834The argument is a floating-point number and the return value is an integer 13835type. 13836 13837Semantics: 13838"""""""""" 13839 13840This function returns the same values as the libm ``llrint`` 13841functions would, but without setting errno. 13842 13843Bit Manipulation Intrinsics 13844--------------------------- 13845 13846LLVM provides intrinsics for a few important bit manipulation 13847operations. These allow efficient code generation for some algorithms. 13848 13849'``llvm.bitreverse.*``' Intrinsics 13850^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13851 13852Syntax: 13853""""""" 13854 13855This is an overloaded intrinsic function. You can use bitreverse on any 13856integer type. 13857 13858:: 13859 13860 declare i16 @llvm.bitreverse.i16(i16 <id>) 13861 declare i32 @llvm.bitreverse.i32(i32 <id>) 13862 declare i64 @llvm.bitreverse.i64(i64 <id>) 13863 declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>) 13864 13865Overview: 13866""""""""" 13867 13868The '``llvm.bitreverse``' family of intrinsics is used to reverse the 13869bitpattern of an integer value or vector of integer values; for example 13870``0b10110110`` becomes ``0b01101101``. 13871 13872Semantics: 13873"""""""""" 13874 13875The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit 13876``M`` in the input moved to bit ``N-M`` in the output. The vector 13877intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element 13878basis and the element order is not affected. 
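
The bit mapping can be sketched outside of LLVM. The following Python model is
only an illustration: the helper name ``bitreverse`` and its explicit ``width``
parameter are assumptions made for this sketch and are not part of LLVM. Bit 0
of the input trades places with the most significant bit, bit 1 with the next
one down, and so on.

.. code-block:: python

   def bitreverse(value: int, width: int) -> int:
       """Reverse the low `width` bits of `value`."""
       result = 0
       for m in range(width):
           if (value >> m) & 1:
               # Input bit m lands at output bit width - 1 - m.
               result |= 1 << (width - 1 - m)
       return result


   # The 8-bit example from the overview, printed as a zero-padded bit string.
   print(format(bitreverse(0b10110110, 8), "08b"))

Applying the model twice with the same width returns the original value, which
is a quick sanity check for any lowering of this intrinsic.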

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use bswap on any
integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)
      declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
value or vector of integer values with an even number of bytes (positive
multiple of 16 bits).

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order. The
``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
operate on a per-element basis and the element order is not affected.

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.ctpop.i8(i8 <src>)
      declare i16 @llvm.ctpop.i16(i16 <src>)
      declare i32 @llvm.ctpop.i32(i32 <src>)
      declare i64 @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
each element of a vector.

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.

::

      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of a vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(2) = 1``.

.. _int_overflow:

'``llvm.fshl.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted left, and the most
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate left operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.
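
The funnel shift left described above can be modeled on plain integers. This is
a sketch, not LLVM API: the Python function ``fshl`` and its explicit ``width``
parameter are assumptions introduced here for scalar operands.

.. code-block:: python

   def fshl(a: int, b: int, c: int, width: int) -> int:
       """Model a funnel shift left on `width`-bit scalar operands."""
       mask = (1 << width) - 1
       shift = c % width                          # shift amount is taken modulo the width
       wide = ((a & mask) << width) | (b & mask)  # concatenate { a : b }
       # Shift left, then keep the most significant `width` bits of the
       # original double-width value.
       return ((wide << shift) >> width) & mask


   print(fshl(255, 0, 15, 8))
   print(fshl(15, 15, 11, 8))

Passing the same value for both ``a`` and ``b`` turns the model into a rotate
left, which mirrors the equivalence stated in the overview.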

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)

'``llvm.fshr.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted right, and the least
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate right operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.
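
The funnel shift right can be modeled in the same spirit. As before, this is an
illustrative Python sketch for scalar operands; the function name ``fshr`` and
the ``width`` parameter are assumptions of the sketch rather than LLVM API.

.. code-block:: python

   def fshr(a: int, b: int, c: int, width: int) -> int:
       """Model a funnel shift right on `width`-bit scalar operands."""
       mask = (1 << width) - 1
       shift = c % width                          # shift amount is taken modulo the width
       wide = ((a & mask) << width) | (b & mask)  # concatenate { a : b }
       # Shift right and keep the least significant `width` bits.
       return (wide >> shift) & mask


   print(fshr(255, 0, 15, 8))
   print(fshr(15, 15, 11, 8))

With identical first and second operands the model behaves as a rotate right.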

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct. The first
element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.
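
The overflow condition above can be modeled with unbounded integers, which
stand in for "any ``N`` larger than the operands' width". The sketch below is a
Python model and not LLVM API: ``to_signed`` and ``with_overflow`` are
hypothetical helper names, and operands are passed as Python integers together
with an explicit bit width.

.. code-block:: python

   import operator


   def to_signed(value: int, width: int) -> int:
       """Interpret the low `width` bits of `value` as two's complement."""
       value &= (1 << width) - 1
       return value - (1 << width) if value >> (width - 1) else value


   def with_overflow(op, a: int, b: int, width: int, signed: bool):
       """Return (result, overflow) for `op` applied to `width`-bit operands."""
       mask = (1 << width) - 1
       # ext(A) and ext(B): sign- or zero-extend into unbounded integers.
       ext_a = to_signed(a, width) if signed else a & mask
       ext_b = to_signed(b, width) if signed else b & mask
       exact = op(ext_a, ext_b)   # (ext(A)) op (ext(B)), computed without wrapping
       wrapped = exact & mask     # A op B modulo 2**width
       # ext(A op B): read the wrapped bits back as signed or unsigned.
       result = to_signed(wrapped, width) if signed else wrapped
       return result, result != exact


   print(with_overflow(operator.add, 2**31 - 1, 1, 32, signed=True))
   print(with_overflow(operator.sub, 0, 1, 8, signed=False))

The first element of each returned pair is the wrapped result; the second is
the overflow bit, set exactly when the wrapped result differs from the exact
one.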

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments and indicates whether an overflow
occurred during the signed summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments. These intrinsics return a structure
whose first element is the signed sum and whose second element is a bit
specifying whether the signed addition resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments and indicates whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments. These intrinsics return a
structure whose first element is the sum and whose second element is a bit
specifying whether the unsigned summation resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal

'``llvm.ssub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments and indicates whether an
overflow occurred during the signed subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
subtraction.

Semantics:
""""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments. These intrinsics return a
structure whose first element is the difference and whose second element is
a bit specifying whether the signed subtraction resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments and indicates whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments. These intrinsics return a
structure whose first element is the difference and whose second element is
a bit specifying whether the unsigned subtraction resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments and indicates whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments. These intrinsics return a
structure whose first element is the product and whose second element is a
bit specifying whether the signed multiplication resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments and indicates whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments. These intrinsics return a
structure whose first element is the product and whose second element is a
bit specifying whether the unsigned multiplication resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

Saturation Arithmetic Intrinsics
---------------------------------

Saturation arithmetic is a version of arithmetic in which operations are
limited to a fixed range between a minimum and maximum value. If the result of
an operation is greater than the maximum value, the result is set (or
"clamped") to this maximum. If it is below the minimum, it is clamped to this
minimum.


'``llvm.sadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.sadd.sat``' family of intrinsic functions performs signed
saturating addition on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of
any bit width, but they must have the same bit width. ``%a`` and ``%b`` are
the two values that will undergo signed addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8


'``llvm.uadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.uadd.sat``' family of intrinsic functions performs unsigned
saturating addition on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of
any bit width, but they must have the same bit width. ``%a`` and ``%b`` are
the two values that will undergo unsigned addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments. Because this is an unsigned
operation, the result will never saturate towards zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15


'``llvm.ssub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ssub.sat``' family of intrinsic functions performs signed
saturating subtraction on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of
any bit width, but they must have the same bit width. ``%a`` and ``%b`` are
the two values that will undergo signed subtraction.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7


'``llvm.usub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.usub.sat``' family of intrinsic functions performs unsigned
saturating subtraction on the two arguments.
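
The saturating add and subtract intrinsics differ only in the range they clamp
to. The Python sketch below models them on scalar operands; the helper names
(``saturate``, ``sadd_sat``, ``uadd_sat``, ``ssub_sat``, ``usub_sat``) and the
explicit ``width`` parameter are assumptions of this sketch, and operands are
expected to already lie in the representable range for that width.

.. code-block:: python

   def saturate(value: int, lo: int, hi: int) -> int:
       """Clamp `value` into the inclusive range [lo, hi]."""
       return max(lo, min(hi, value))


   def signed_range(width: int):
       return -(1 << (width - 1)), (1 << (width - 1)) - 1


   def unsigned_range(width: int):
       return 0, (1 << width) - 1


   def sadd_sat(a: int, b: int, width: int) -> int:
       return saturate(a + b, *signed_range(width))


   def uadd_sat(a: int, b: int, width: int) -> int:
       return saturate(a + b, *unsigned_range(width))


   def ssub_sat(a: int, b: int, width: int) -> int:
       return saturate(a - b, *signed_range(width))


   def usub_sat(a: int, b: int, width: int) -> int:
       return saturate(a - b, *unsigned_range(width))


   print(sadd_sat(5, 6, 4), uadd_sat(8, 8, 4), ssub_sat(-4, 5, 4), usub_sat(2, 6, 4))

Computing the exact result first and clamping afterwards is convenient in a
model with unbounded integers; a real lowering would typically detect the
overflow instead of widening.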

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of
any bit width, but they must have the same bit width. ``%a`` and ``%b`` are
the two values that will undergo unsigned subtraction.

Semantics:
""""""""""

The minimum value this operation can clamp to is 0, which is the smallest
unsigned value representable by the bit width of the unsigned arguments.
Because this is an unsigned operation, the result will never saturate towards
the largest possible value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0


'``llvm.sshl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.sshl.sat``' family of intrinsic functions performs signed
saturating left shift on the first argument.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.


Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2


'``llvm.ushl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ushl.sat``' family of intrinsic functions performs unsigned
saturating left shift on the first argument.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
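
A saturating left shift can be modeled by shifting an unbounded integer and
clamping the result. The sketch below is a Python illustration for scalar
operands; ``sshl_sat`` and ``ushl_sat`` are hypothetical helper names, and
raising ``ValueError`` for an out-of-range shift amount is a modeling choice of
this sketch that stands in for the poison result.

.. code-block:: python

   def _check_shift(b: int, width: int) -> None:
       # A shift amount >= the bit width yields poison in LLVM; the model
       # refuses to compute a value for it.
       if not 0 <= b < width:
           raise ValueError("shift amount out of range: result would be poison")


   def sshl_sat(a: int, b: int, width: int) -> int:
       """Signed saturating left shift of `a` by `b` at the given bit width."""
       _check_shift(b, width)
       lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
       return max(lo, min(hi, a << b))


   def ushl_sat(a: int, b: int, width: int) -> int:
       """Unsigned saturating left shift of `a` by `b` at the given bit width."""
       _check_shift(b, width)
       return min((1 << width) - 1, a << b)


   print(sshl_sat(2, 2, 4), sshl_sat(-5, 1, 4), ushl_sat(3, 3, 4))

The unsigned variant only needs an upper clamp because shifting a non-negative
value left can never produce a negative result.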

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15


Fixed Point Arithmetic Intrinsics
---------------------------------

A fixed point number represents a real number that has a fixed
number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. These
are useful for representing fractional values to a specific precision. The
following intrinsics perform fixed point arithmetic operations on 2 operands
of the same scale, specified as the third argument.

The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
of fixed point numbers through scaled integers. Therefore, fixed point
multiplication can be represented as:

.. code-block:: llvm

      %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %mul = mul nsw i8 %a2, %b2
      %scale2 = trunc i32 %scale to i8
      %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
      %result = trunc i8 %r to i4

The ``llvm.*div.fix`` family of intrinsic functions represents a division of
fixed point numbers through scaled integers. Fixed point division can be
represented as:

..
.. code-block:: llvm

      %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %scale2 = trunc i32 %scale to i8
      %a3 = shl i8 %a2, %scale2
      %r = sdiv i8 %a3, %b2  ; this is for a target rounding towards zero
      %result = trunc i8 %r to i4

For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is not fixed
by this document, since the preferred rounding may vary between targets; it is
specified through a target hook. Different pipelines should legalize or optimize
this using the rounding specified by this hook if it is provided. Operations like
constant folding, instruction combining, KnownBits, and ValueTracking should
also use this hook, if provided, and not assume the direction of rounding. A
rounded result must always be within one unit of precision from the true
result. That is, the error between the returned result and the true result must
be less than 1/2^(scale).


'``llvm.smul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix``' family of intrinsic functions perform signed
fixed point multiplication on 2 arguments of the same scale.
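
A minimal sketch of how this might be used, assuming a Q16.16 format (32-bit
values with a scale of 16, chosen here only for illustration) and a
hypothetical wrapper name ``@mul_q16_16``. The behavior is undefined if the
product does not fit in the 32-bit result.

.. code-block:: llvm

      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)

      define i32 @mul_q16_16(i32 %x, i32 %y) {
        ; Both operands and the result use 16 fractional bits.
        %p = call i32 @llvm.smul.fix.i32(i32 %x, i32 %y, i32 16)
        ret i32 %p
      }

Under this assumed encoding, 1.5 is represented as 98304 (1.5 x 65536) and 2.0
as 131072, and their product 3.0 is represented as 196608.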

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (3 x 2 = 6)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)


'``llvm.umul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
fixed point multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs unsigned fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

..
.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (3 x 2 = 6)
      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 3.5 or up to 4
      %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)


'``llvm.smul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
fixed point saturating multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (3 x 2 = 6)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)

      ; Saturation
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7

      ; Scale can affect the saturation result
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)


'``llvm.umul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturating multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest unsigned value representable by this bit width (zero).


Examples
"""""""""

..
.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (3 x 2 = 6)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 2 or up to 2.5
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)

      ; Saturation
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)

      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)


'``llvm.sdiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
fixed point division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division.
The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.udiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
fixed point division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4)  ; %res = 2 (0.0625 / 0.5 = 0.125)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.sdiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest signed value representable by this bit width.

It is undefined behavior if the second argument is zero.


Examples
"""""""""

..
.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)


'``llvm.udiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
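
The following sketch shows one way a caller might guard the divisor, since
division by zero is undefined behavior for this intrinsic. The Q16.16 scale,
the function name ``@udiv_q16_16_or_max``, and the choice to return the largest
unsigned value for a zero divisor are all assumptions made for illustration.

.. code-block:: llvm

      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)

      define i32 @udiv_q16_16_or_max(i32 %x, i32 %y) {
      entry:
        %is_zero = icmp eq i32 %y, 0
        br i1 %is_zero, label %done, label %divide

      divide:
        ; Only reached when %y is non-zero; an oversized quotient saturates.
        %q = call i32 @llvm.udiv.fix.sat.i32(i32 %x, i32 %y, i32 16)
        br label %done

      done:
        ; -1 is the all-ones bit pattern, i.e. the largest unsigned i32 value.
        %r = phi i32 [ -1, %entry ], [ %q, %divide ]
        ret i32 %r
      }

A branch is used rather than a ``select`` so that the call is never executed
with a zero divisor.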

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is the
smallest unsigned value representable by this bit width (zero).

It is undefined behavior if the second argument is zero.

Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)

      ; The result in the following could be rounded down to 0.5 or up to 1
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)


Specialised Arithmetic Intrinsics
---------------------------------

.. _i_intr_llvm_canonicalize:

'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as frexp.
The canonical encoding is
defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division by
1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
-0.0 is also sufficient provided that the rounding mode is not -Infinity.

``@llvm.canonicalize`` must preserve the equality relation.
That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.

The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating-point value are not examined.

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, a and b, and an addend c.
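
As an illustration of the argument order, the hypothetical function ``@dot2``
below accumulates a two-element dot product, passing the running sum as the
addend ``c``. Whether the multiply and add are actually fused is left to the
code generator.

.. code-block:: llvm

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)

      define float @dot2(float %x0, float %y0, float %x1, float %y1) {
        %p0 = fmul float %x0, %y0
        ; Computes (%x1 * %y1) + %p0, possibly as a single fused operation.
        %sum = call float @llvm.fmuladd.f32(float %x1, float %y1, float %p0)
        ret float %sum
      }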

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression a \* b + c, except that it is unspecified
whether rounding will be performed between the multiplication and addition
steps. Fusion is not guaranteed, even if the target platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this never sets errno.

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c


Hardware-Loop Intrinsics
------------------------

LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required to either lower these intrinsics to
target-specific instructions or revert the hardware-loop to a normal loop if
target-specific restrictions are not met and a hardware-loop can't be generated.

These intrinsics may be modified in the future and are not intended to be used
outside the backend. Thus, front-end and mid-level optimizations should not be
generating these intrinsics.


'``llvm.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare void @llvm.set.loop.iterations.i32(i32)
      declare void @llvm.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
hardware-loop trip count. They are placed in the loop preheader basic block and
are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not, for example, the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use them to set up
the hardware-loop count with a target-specific instruction, usually a move of
this value to a special register or a hardware-loop instruction.


'``llvm.start.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i64 @llvm.start.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the
hardware-loop trip count, but they also produce a value identical to the input
that can be used as the input to the loop. They are placed in the loop
preheader basic block, and the output is expected to be the input to the
phi for the induction variable of the loop, decremented by the
'``llvm.loop.decrement.reg.*``'.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not, for example, the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use them to set up
the hardware-loop count with a target-specific instruction, usually a move of
this value to a special register or a hardware-loop instruction.
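
The intended shape of the resulting loop can be sketched as follows. This is an
illustrative, hand-written skeleton (the function name ``@hwloop_sketch`` is
hypothetical), not output copied from any pass; it assumes a trip count ``%n``
of at least 1 and a decrement of 1 per iteration.

.. code-block:: llvm

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define void @hwloop_sketch(i32 %n) {
      entry:
        ; Preheader: record the trip count and feed it to the counter phi.
        %start = call i32 @llvm.start.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %count = phi i32 [ %start, %entry ], [ %next, %loop ]
        ; ... loop body ...
        ; Remaining iterations after this one.
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
        %continue = icmp ne i32 %next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }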

'``llvm.test.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.test.set.loop.iterations.i32(i32)
      declare i1 @llvm.test.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
them to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not, for example, the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use them
to set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
The result is the conditional value of whether the given count is not zero.

'``llvm.loop.decrement.reg.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
      declare i64 @llvm.loop.decrement.reg.i64(i64, i64)

Overview:
"""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
iteration counter and return an updated value that will be used in the next
loop test check.

Arguments:
""""""""""

Both arguments must have identical integer types.
The first operand is the
loop iteration counter. The second operand is the maximum number of elements
processed in an iteration.

Semantics:
""""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisations are allowed to treat it as a ``SUB``, and it is supported by
SCEV, so it's the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.


'``llvm.loop.decrement.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.loop.decrement.i32(i32)
      declare i1 @llvm.loop.decrement.i64(i64)

Overview:
"""""""""

The HardwareLoops pass allows the loop decrement value to be specified with an
option. It defaults to a loop decrement value of 1, but it can be an unsigned
integer value provided by this option. The '``llvm.loop.decrement.*``'
intrinsics decrement the loop iteration counter with this value, and return a
false predicate if the loop should exit, and true otherwise.
This is emitted if the loop counter is not updated via a ``PHI`` node, which
can also be controlled with an option.

Arguments:
""""""""""

The integer argument is the loop decrement value used to decrement the loop
iteration counter.

Semantics:
""""""""""

The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.


Vector Reduction Intrinsics
---------------------------

Horizontal reductions of vectors can be expressed using the following
intrinsics. Each one takes a vector operand as an input and applies its
respective operation across all elements of the vector, returning a single
scalar result of the same element type.


'``llvm.vector.reduce.add.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.fadd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
``ADD`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart. Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction. That is, the reduction begins with
the start value and performs an fadd operation with consecutively increasing
vector element indices. See the following pseudocode:

::

    float sequential_fadd(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector)
        result = result + input_vector[i]
      return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating point addition.

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction


'``llvm.vector.reduce.mul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.
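
As an illustrative sketch (the function name ``@reduce_constants`` is an
assumption made for this example), the integer ``ADD`` and ``MUL`` reductions
can be applied to the same constant vector. The comments give the values that
follow from the definitions above.

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)
      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32>)

      define i32 @reduce_constants() {
        ; 1 + 2 + 3 + 4 = 10
        %sum = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>)
        ; 1 * 2 * 3 * 4 = 24
        %prod = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>)
        ; 10 + 24 = 34
        %both = add i32 %sum, %prod
        ret i32 %both
      }
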

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.fmul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
``MUL`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart. Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction. That is, the reduction begins with
the start value and performs an fmul operation with consecutively increasing
vector element indices. See the following pseudocode:

::

    float sequential_fmul(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector)
        result = result * input_vector[i]
      return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating point multiplication.
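
The sequential ordering can also be sketched in scalar IR. Assuming a
``<4 x float>`` input, a call without the ``reassoc`` flag is expected to
produce the same value as the hypothetical function below; its name and value
names are chosen only for illustration.

.. code-block:: llvm

      define float @sequential_fmul_v4f32(float %start_value, <4 x float> %input) {
        %e0 = extractelement <4 x float> %input, i32 0
        %e1 = extractelement <4 x float> %input, i32 1
        %e2 = extractelement <4 x float> %input, i32 2
        %e3 = extractelement <4 x float> %input, i32 3
        ; Multiply in order of increasing element index, starting from %start_value.
        %r0 = fmul float %start_value, %e0
        %r1 = fmul float %r0, %e1
        %r2 = fmul float %r1, %e2
        %r3 = fmul float %r2, %e3
        ret float %r3
      }
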

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction

'``llvm.vector.reduce.and.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.or.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
of a vector, returning the result as a scalar. The return type matches the
element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.xor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.
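
One possible use of the bitwise reductions is to collapse a vector of ``i1``
comparison results into a single flag. The sketch below assumes the ``v4i1``
overloads of the ``AND`` and ``OR`` reductions; the function names
``@all_equal`` and ``@any_equal`` are hypothetical.

.. code-block:: llvm

      declare i1 @llvm.vector.reduce.and.v4i1(<4 x i1>)
      declare i1 @llvm.vector.reduce.or.v4i1(<4 x i1>)

      ; True only if every lane of %a equals the matching lane of %b.
      define i1 @all_equal(<4 x i32> %a, <4 x i32> %b) {
        %cmp = icmp eq <4 x i32> %a, %b
        %all = call i1 @llvm.vector.reduce.and.v4i1(<4 x i1> %cmp)
        ret i1 %all
      }

      ; True if at least one lane of %a equals the matching lane of %b.
      define i1 @any_equal(<4 x i32> %a, <4 x i32> %b) {
        %cmp = icmp eq <4 x i32> %a, %b
        %any = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %cmp)
        ret i1 %any
      }
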

'``llvm.vector.reduce.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
integer ``MAX`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.
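
The signed and unsigned variants can disagree on the same bit patterns. The
following sketch, whose function name ``@signedness_matters`` is an assumption
made for this example, reduces one constant vector both ways.

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32>)
      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32>)

      define i1 @signedness_matters() {
        ; Signed view: the elements are -1, 0, 1, 2, so the maximum is 2.
        %s = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>)
        ; Unsigned view: -1 is the all-ones bit pattern, the largest i32 value.
        %u = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>)
        ; The two results differ, so this compares as true.
        %differ = icmp ne i32 %s, %u
        ret i1 %differ
      }
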

'``llvm.vector.reduce.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
integer ``MIN`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.vector.reduce.fmax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with maximum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

'``llvm.vector.reduce.fmin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.minnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with minimum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

Matrix Intrinsics
-----------------

Operations on matrixes requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrixes are passed and returned as vectors. This means that for a ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently column-major layout
is assumed. The intrinsics support both integer and floating point matrixes.


'``llvm.matrix.transpose.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
<Cols>`` matrix and return the transposed matrix in the result vector.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
number of rows and columns, respectively, and must be positive, constant
integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
the same float or integer element type as ``%In``.

'``llvm.matrix.multiply.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)

Overview:
"""""""""

The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix and ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.

Arguments:
""""""""""

The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
<Inner>`` elements, and the second argument ``%B`` to a matrix with
``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
Vectors ``%A``, ``%B``, and the returned vector all have the same float or
integer element type.
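
As a sketch of how the dimension arguments relate to the vector types, the
following multiplies a 2 x 3 matrix by a 3 x 2 matrix and reads one element of
the 2 x 2 result using the column-major index ``j * R + i``. The mangled suffix
``.v4f32.v6f32.v6f32`` and the function name ``@mul_2x3_3x2`` are assumptions
made for this example.

.. code-block:: llvm

      declare <4 x float> @llvm.matrix.multiply.v4f32.v6f32.v6f32(<6 x float>, <6 x float>, i32, i32, i32)

      define float @mul_2x3_3x2(<6 x float> %A, <6 x float> %B) {
        ; %A is 2 x 3 (OuterRows = 2, Inner = 3); %B is 3 x 2 (OuterColumns = 2).
        %C = call <4 x float> @llvm.matrix.multiply.v4f32.v6f32.v6f32(<6 x float> %A, <6 x float> %B, i32 2, i32 3, i32 2)
        ; %C is 2 x 2 (R = 2): element i = 1 of column j = 1 is at index 1 * 2 + 1 = 3.
        %c11 = extractelement <4 x float> %C, i32 3
        ret float %c11
      }
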


'``llvm.matrix.column.major.load.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.column.major.load.*(
          ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
matrix using a stride of ``%Stride`` to compute the start address of the
different columns. This allows for convenient loading of sub matrixes. If
``<IsVolatile>`` is true, the intrinsic is considered a :ref:`volatile memory
access <volatile>`. The result matrix is returned in the result vector. If the
``%Ptr`` argument is known to be aligned to some boundary, this can be
specified as an attribute on the argument.

Arguments:
""""""""""

The first argument ``%Ptr`` is a pointer type to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value. The fourth and fifth arguments,
``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
respectively, and must be positive, constant integers. The returned vector must
have ``<Rows> * <Cols>`` elements.

The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.


'``llvm.matrix.column.major.store.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.matrix.column.major.store.*(
          vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
<Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
columns. If ``<IsVolatile>`` is true, the intrinsic is considered a
:ref:`volatile memory access <volatile>`.

If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
specified as an attribute on the argument.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
pointer to the vector type of ``%In``, and is the start address of the matrix
in memory. The third argument ``%Stride`` is a positive, constant integer with
``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean
value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
and columns, respectively, and must be positive, constant integers.

The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.


Half Precision Floating-Point Intrinsics
----------------------------------------

For most target platforms, half precision floating-point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

This means that code must first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, then stored as an
i16 value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point type to half precision floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument - the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point format to half precision floating-point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single precision
floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument - the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

.. _dbg_intrinsics:

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with ``llvm.dbg.``
prefix), are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with
``llvm.eh.`` prefix), are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function.
The result is a
callable function pointer lacking the nest parameter - the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic.
Note that the size and the alignment are target-specific -
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with target
dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored.
This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.


.. _int_vp:

Vector Predication Intrinsics
-----------------------------
VP intrinsics are intended for predicated SIMD/vector code. A typical VP
operation takes a vector mask and an explicit vector length parameter as in:

::

      <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)

The vector mask parameter (``%mask``) is always a vector of ``i1`` type, for
example ``<32 x i1>``. The explicit vector length parameter (``%evl``) always
has the type ``i32`` and is an unsigned integer value in the range:

::

      0 <= %evl <= W,  where W is the number of vector elements

Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
length of the vector.

The VP intrinsic has undefined behavior if ``%evl > W``. The explicit vector
length (``%evl``) creates a mask, ``%EVLmask``, with all elements
``0 <= i < %evl`` set to True, and all other lanes ``%evl <= i < W`` to False.
A new mask ``%M`` is calculated with an element-wise AND from ``%mask`` and
``%EVLmask``:

::

      M = %mask AND %EVLmask

A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:

::

       A <opcode> B =  {  A[i] <opcode> B[i]   if M[i] = True, and
                       {  undef                otherwise

Optimization Hint
^^^^^^^^^^^^^^^^^

Some targets, such as AVX512, do not support the %evl parameter in hardware.
The use of an effective %evl is discouraged for those targets. The function
``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
has native support for %evl.


.. _int_vp_add:

'``llvm.vp.add.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer addition of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = add <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef

.. _int_vp_sub:

'``llvm.vp.sub.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer subtraction of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sub``' intrinsic performs integer subtraction
(:ref:`sub <i_sub>`)  of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sub <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef



.. _int_vp_mul:

'``llvm.vp.mul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer multiplication of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.mul``' intrinsic performs integer multiplication
(:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = mul <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_sdiv:

'``llvm.vp.sdiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, signed division of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sdiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_udiv:

'``llvm.vp.udiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, unsigned division of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.udiv``' intrinsic performs unsigned division
(:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = udiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_srem:

'``llvm.vp.srem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated computation of the signed remainder of two integer vectors.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
(:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = srem <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_urem:

'``llvm.vp.urem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated computation of the unsigned remainder of two integer vectors.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
(:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = urem <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_ashr:

'``llvm.vp.ashr.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated arithmetic right-shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
(:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
enabled lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = ashr <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_lshr:

'``llvm.vp.lshr.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated logical right-shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.lshr``' intrinsic computes the logical right shift
(:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
enabled lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = lshr <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_shl:

'``llvm.vp.shl.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated left shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
the first operand by the second operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = shl <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_or:

'``llvm.vp.or.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated or.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
first two operands on each enabled lane. The result on disabled lanes is
undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = or <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_and:

'``llvm.vp.and.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated and.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two operands on each enabled lane. The result on disabled lanes is
undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = and <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_xor:

'``llvm.vp.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated, bitwise xor.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
the first two operands on each enabled lane.
The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = xor <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_get_active_lane_mask:

'``llvm.get.active.lane.mask.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)


Overview:
"""""""""

Create a mask representing active and inactive vector lanes.


Arguments:
""""""""""

Both operands have the same scalar integer type. The result is a vector with
the i1 element type.

Semantics:
""""""""""

The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
to:

::

      %m[i] = icmp ult (%base + i), %n

where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
the unsigned less-than comparison operator. Overflow cannot occur in
``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
poison value. The above is equivalent to:

::

      %m = @llvm.get.active.lane.mask(%base, %n)

This can, for example, be emitted by the loop vectorizer in which case
``%base`` is the first element of the vector induction variable (VIV) and
``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
less-than comparison of VIV with the loop tripcount, producing a mask of
true/false values representing active/inactive vector lanes, except if the VIV
overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
the type of a step vector that enumerates the lanes without overflow is not
known.

This mask ``%m`` can e.g. be used in masked load/store instructions. These
intrinsics provide a hint to the backend. I.e., for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.


Examples:
"""""""""

.. code-block:: llvm

      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)


.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.

::

      declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers to double
      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
      ;; The data is a vector of function pointers
      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets.
Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.


::

       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; The result of the two following instructions is identical aside from potential memory access exception
       %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru

.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.

::

       declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>)
       ;; The data is a vector of pointers to double
       declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       ;; The data is a vector of function pointers
       declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
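
As an informal illustration only, the lane-wise behaviour of a masked store can be modelled by a scalar loop. The sketch below is not LLVM IR and is not part of the intrinsic's definition. The helper name ``masked_store``, the use of a Python list as "memory", and the integer ``base`` index standing in for the pointer operand are all assumptions made for this example.

.. code-block:: python

      def masked_store(memory, base, value, mask):
          """Hypothetical scalar model: write value[i] to memory[base + i]
          for every lane i whose mask bit is set."""
          if len(value) != len(mask):
              raise ValueError("value and mask must have the same number of lanes")
          for lane, (elem, enabled) in enumerate(zip(value, mask)):
              if enabled:
                  memory[base + lane] = elem
              # A masked-off lane is skipped entirely: its slot is neither
              # read nor written, so it may even lie outside the buffer.
          return memory


      mem = [0, 0, 0, 0, 0, 0]
      masked_store(mem, 1, [10, 20, 30, 40], [True, False, True, False])
      print(mem)  # [0, 10, 0, 30, 0, 0]

Only lanes 0 and 2 are enabled here, so only ``mem[1]`` and ``mem[3]`` change. Because disabled lanes are never touched, the model also accepts a store whose masked-off lanes would fall past the end of the list.
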

Arguments:
""""""""""

The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.

::

       call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask)

       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
       %oldval = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
       store <16 x float> %res, <16 x float>* %ptr, align 4


Masked Vector Gather and Scatter Intrinsics
-------------------------------------------

LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses.
Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.

.. _int_mgather:

'``llvm.masked.gather.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.

::

      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>)

Overview:
"""""""""

Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result.
The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.


::

       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

       ;; The gather with all-true mask is equivalent to the following instruction sequence
       %ptr0 = extractelement <4 x double*> %ptrs, i32 0
       %ptr1 = extractelement <4 x double*> %ptrs, i32 1
       %ptr2 = extractelement <4 x double*> %ptrs, i32 2
       %ptr3 = extractelement <4 x double*> %ptrs, i32 3

       %val0 = load double, double* %ptr0, align 8
       %val1 = load double, double* %ptr1, align 8
       %val2 = load double, double* %ptr2, align 8
       %val3 = load double, double* %ptr3, align 8

       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3

.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.

::

       declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
       declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.

Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation.
The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.

::

    ;; This instruction unconditionally stores the data vector to multiple addresses
    call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>)

    ;; It is equivalent to a list of scalar stores
    %val0 = extractelement <8 x i32> %value, i32 0
    %val1 = extractelement <8 x i32> %value, i32 1
    ..
    %val7 = extractelement <8 x i32> %value, i32 7
    %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
    %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
    ..
    %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
    ;; Note: the order of the following stores is important when they overlap:
    store i32 %val0, i32* %ptr0, align 4
    store i32 %val1, i32* %ptr1, align 4
    ..
    store i32 %val7, i32* %ptr7, align 4


Masked Vector Expanding Load and Compressing Store Intrinsics
-------------------------------------------------------------

LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.

..
.. _int_expandload:

'``llvm.masked.expandload.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.

::

    declare <16 x float> @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
    declare <2 x i64> @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)

Overview:
"""""""""

Reads a number of scalar values sequentially from memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.

Semantics:
""""""""""

The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes.
It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies like in the following example:

.. code-block:: c

    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C; int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }


.. code-block:: llvm

    ; Load several elements from array B and expand them in a vector.
    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
    ; Store the result in A
    call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.

.. _int_compressstore:

'``llvm.masked.compressstore.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.

::

    declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, i32* <ptr>, <8 x i1> <mask>)
    declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)

Overview:
"""""""""

Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.

Arguments:
""""""""""

The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences like in the following example:

.. code-block:: c

    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C; int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }


.. code-block:: llvm

    ; Load elements from A.
    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.


Memory Use Markers
------------------

This class of intrinsics provides information about the lifetime of
memory objects and ranges where variables are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that before this point in the code, the value
of the memory pointed to by ``ptr`` is dead. This means that it is known
to never be used and has an undefined value. A load from the pointer
that precedes this intrinsic can be replaced with ``'undef'``.

..
.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that after this point in the code, the value of
the memory pointed to by ``ptr`` is dead. This means that it is known to
never be used and has an undefined value. Any stores into the memory
object following this intrinsic may be removed as dead.

'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

    declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.
The memory object can belong to any address space.

::

    declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized and the third argument is a
pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.

'``llvm.launder.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

    declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new
pointer value that carries fresh invariant group information. It is an
experimental intrinsic, which means that its semantics might change in the
future.


Arguments:
""""""""""

The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
It does not read any accessible memory and the execution can be speculated.

'``llvm.strip.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

    declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
value that does not carry the invariant information. It is an experimental
intrinsic, which means that its semantics might change in the future.


Arguments:
""""""""""

The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which has no associated
``invariant.group`` metadata.
It does not read any memory and can be speculated.



.. _constrainedfp:

Constrained Floating-Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating-point
operations when specific rounding mode or floating-point exception behavior is
required. By default, LLVM optimization passes assume that the rounding mode is
round-to-nearest and that floating-point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR.
Optimizations that
move code around can create miscompiles if mixing of constrained and normal
operations is done. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.

Each of these intrinsics corresponds to a normal floating-point operation. The
data arguments and the return value are the same as the corresponding FP
operation.

The rounding mode argument is a metadata string specifying what
assumptions, if any, the optimizer can make when transforming constant
values. Some constrained FP intrinsics omit this argument. If required
by the intrinsic, this argument must be one of the following strings:

::

    "round.dynamic"
    "round.tonearest"
    "round.downward"
    "round.upward"
    "round.towardzero"
    "round.tonearestaway"

If this argument is "round.dynamic" optimization passes must assume that the
rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes. If the argument is any of these values
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.

For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
'x-0' should evaluate to '-0' when rounding downward. However, this
transformation is legal for all other rounding modes.
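
The 'x-0' case above can be observed from source code. The following C sketch
is only an illustration and is not part of LLVM: the helper names
``sub_zero_with_rounding`` and ``is_negative_zero`` are hypothetical, and the
code assumes a C99 ``<fenv.h>`` implementation that supports ``FE_DOWNWARD``.
The ``volatile`` qualifiers are there to discourage the compiler from folding
the subtraction, which is the very transformation being discussed.

.. code-block:: c

    #include <fenv.h>
    #include <math.h>

    /* Hypothetical helper: evaluate x - 0.0 while `mode` is the active
       rounding mode, then restore the previously active mode. */
    double sub_zero_with_rounding(double x, int mode)
    {
        volatile double lhs = x;    /* volatile: keep the operation at run time */
        volatile double rhs = 0.0;
        volatile double result;
        int saved = fegetround();

        fesetround(mode);
        result = lhs - rhs;
        fesetround(saved);
        return result;
    }

    /* Hypothetical helper: nonzero if v is a zero with its sign bit set. */
    int is_negative_zero(double v)
    {
        return v == 0.0 && signbit(v) != 0;
    }

Under these assumptions, ``sub_zero_with_rounding(0.0, FE_DOWNWARD)`` is
expected to yield ``-0.0`` while the same call with ``FE_TONEAREST`` yields
``+0.0``, so rewriting 'x-0' as 'x' would change the result in the first case.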

For values other than "round.dynamic" optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed. Using a specific
non-dynamic rounding mode which does not match the actual rounding mode at
runtime results in undefined behavior.

The exception behavior argument is a metadata string describing the floating
point exception semantics that are required for the intrinsic. This argument
must be one of the following strings:

::

    "fpexcept.ignore"
    "fpexcept.maytrap"
    "fpexcept.strict"

If this argument is "fpexcept.ignore" optimization passes may assume that the
exception status flags will not be read and that floating-point exceptions will
be masked. This allows transformations to be performed that may change the
exception semantics of the original code. For example, FP operations may be
speculatively executed in this case whereas they must not be for either of the
other possible values of this argument.

If the exception behavior argument is "fpexcept.maytrap" optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code. For example, exceptions may be potentially hidden by constant
folding.

If the exception behavior argument is "fpexcept.strict" all transformations must
strictly preserve the floating-point exception semantics of the original code.
Any FP exception that would have been raised by the original code must be raised
by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code.
This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.

The number and order of floating-point exceptions is NOT guaranteed. For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.

Proper :ref:`function attributes <fnattrs>` usage is required for the
constrained intrinsics to function correctly.

All function *calls* done in a function that uses constrained floating
point intrinsics must have the ``strictfp`` attribute.

All function *definitions* that use constrained floating point intrinsics
must have the ``strictfp`` attribute.

'``llvm.experimental.constrained.fadd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <type>
    @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
                                        metadata <rounding mode>,
                                        metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fadd``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point sum of the two value operands and has
the same type as the operands.
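
The exception behavior argument matters most for source code that inspects the
floating-point status flags after an operation such as an addition. As a rough
illustration in plain C (not LLVM IR), the sketch below adds two values and
then checks the inexact flag. The helper name ``add_raises_inexact`` is
hypothetical, and the code assumes a C99 ``<fenv.h>`` implementation that
supports ``FE_INEXACT``; the ``volatile`` qualifiers keep the addition from
being folded away at compile time.

.. code-block:: c

    #include <fenv.h>

    /* Hypothetical helper: returns 1 if computing a + b raised the inexact
       floating-point exception, 0 otherwise. */
    int add_raises_inexact(double a, double b)
    {
        volatile double lhs = a;    /* volatile: perform the add at run time */
        volatile double rhs = b;
        volatile double sum;

        feclearexcept(FE_ALL_EXCEPT);
        sum = lhs + rhs;
        (void)sum;                  /* only the status flag is of interest */
        return fetestexcept(FE_INEXACT) != 0;
    }

Because this function reads the status flags, the addition is the kind of
operation for which the text above calls for "fpexcept.strict": removing or
speculating the add would change what ``fetestexcept`` observes.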


'``llvm.experimental.constrained.fsub``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <type>
    @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
                                        metadata <rounding mode>,
                                        metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fsub``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point difference of the two value operands
and has the same type as the operands.


'``llvm.experimental.constrained.fmul``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <type>
    @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
                                        metadata <rounding mode>,
                                        metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fmul``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point product of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.fdiv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <type>
    @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
                                        metadata <rounding mode>,
                                        metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fdiv``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.frem``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <type>
    @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
                                        metadata <rounding mode>,
                                        metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
from the division of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.frem``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values.
Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above. The rounding mode argument has no effect, since
the result of frem is never rounded, but the argument is included for
consistency with the other constrained floating-point intrinsics.

Semantics:
""""""""""

The value produced is the floating-point remainder from the division of the two
value operands and has the same type as the operands. The remainder has the
same sign as the dividend.

'``llvm.experimental.constrained.fma``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <type>
    @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                       metadata <rounding mode>,
                                       metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
fused-multiply-add operation on its operands.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fma``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. All arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The result produced is the product of the first two operands added to the third
operand computed with infinite precision, and then rounded to the target
precision.
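
The difference between one rounding and two can be seen with the C library's
``fma`` function. The sketch below is only an illustration in C, not a
description of how LLVM lowers the intrinsic: the helper names
``fused_multiply_add`` and ``separate_multiply_add`` are hypothetical, and the
code assumes IEEE 754 binary64 ``double`` arithmetic with round-to-nearest.

.. code-block:: c

    #include <math.h>

    /* Hypothetical helper: a * b + c with a single rounding at the end. */
    double fused_multiply_add(double a, double b, double c)
    {
        return fma(a, b, c);
    }

    /* Hypothetical helper: a * b is rounded to double first, then c is
       added and the sum is rounded a second time. */
    double separate_multiply_add(double a, double b, double c)
    {
        volatile double product = a * b;  /* volatile: force the first rounding */
        return product + c;
    }

With ``a = 1.0 + 0x1p-27``, ``b = 1.0 - 0x1p-27`` and ``c = -1.0`` the exact
product is ``1 - 2^-54``, which is not representable as a ``double``. Under the
stated assumptions the fused form is expected to return ``-0x1p-54``, while the
separate form rounds the product to ``1.0`` and returns ``0.0``.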

'``llvm.experimental.constrained.fptoui``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <ty2>
    @llvm.experimental.constrained.fptoui(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptoui``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is an unsigned integer converted from the floating
point operand. The value is truncated, so it is rounded towards zero.

'``llvm.experimental.constrained.fptosi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <ty2>
    @llvm.experimental.constrained.fptosi(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptosi``' intrinsic converts
:ref:`floating-point <t_floating>` ``value`` to type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptosi``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a signed integer converted from the floating
point operand. The value is truncated, so it is rounded towards zero.
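
Truncation toward zero is also what a C floating-point to integer cast does,
which gives a convenient way to see why these two conversions take no rounding
mode argument. The sketch below is an illustration only: the helper name
``to_i32_under_mode`` is hypothetical, and the code assumes a C99 ``<fenv.h>``
implementation that supports the rounding modes passed in.

.. code-block:: c

    #include <fenv.h>
    #include <stdint.h>

    /* Hypothetical helper: convert v to a signed 32-bit integer while `mode`
       is the active rounding mode, then restore the previous mode. */
    int32_t to_i32_under_mode(double v, int mode)
    {
        volatile double input = v;  /* volatile: convert at run time */
        int saved = fegetround();
        int32_t result;

        fesetround(mode);
        result = (int32_t)input;    /* the fractional part is discarded */
        fesetround(saved);
        return result;
    }

Under these assumptions ``to_i32_under_mode(2.9, FE_UPWARD)`` is expected to
return 2 and ``to_i32_under_mode(-2.9, FE_DOWNWARD)`` to return -2, since the
conversion truncates regardless of the dynamic rounding mode.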

'``llvm.experimental.constrained.uitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <ty2>
    @llvm.experimental.constrained.uitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.uitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer operand.

'``llvm.experimental.constrained.sitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <ty2>
    @llvm.experimental.constrained.sitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.sitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer operand.

'``llvm.experimental.constrained.fptrunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <ty2>
    @llvm.experimental.constrained.fptrunc(<type> <value>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
to type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptrunc``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values. This argument must be larger in size
than the result.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The result produced is a floating point value truncated to be smaller in size
than the operand.

'``llvm.experimental.constrained.fpext``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    declare <ty2>
    @llvm.experimental.constrained.fpext(<type> <value>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fpext``' intrinsic extends a
floating-point ``value`` to a larger floating-point value.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fpext``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.
This argument must be smaller in size
than the result.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a floating point value extended to be larger in size
than the operand. All restrictions that apply to the fpext instruction also
apply to this intrinsic.

'``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
                                          metadata <condition code>,
                                          metadata <exception behavior>)
      declare <ty2>
      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
                                           metadata <condition code>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on a comparison of their operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
comparison operation while the '``llvm.experimental.constrained.fcmps``'
intrinsic performs a signaling comparison operation.

Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fcmp``'
and '``llvm.experimental.constrained.fcmps``' intrinsics must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third argument is the condition code indicating the kind of comparison
to perform. It must be a metadata string with one of the following values:

- "``oeq``": ordered and equal
- "``ogt``": ordered and greater than
- "``oge``": ordered and greater than or equal
- "``olt``": ordered and less than
- "``ole``": ordered and less than or equal
- "``one``": ordered and not equal
- "``ord``": ordered (no nans)
- "``ueq``": unordered or equal
- "``ugt``": unordered or greater than
- "``uge``": unordered or greater than or equal
- "``ult``": unordered or less than
- "``ule``": unordered or less than or equal
- "``une``": unordered or not equal
- "``uno``": unordered (either nans)

*Ordered* means that neither operand is a NAN while *unordered* means
that either operand may be a NAN.

The fourth argument specifies the exception behavior as described above.

Semantics:
""""""""""

``op1`` and ``op2`` are compared according to the condition code given
as the third argument. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

- "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
  is equal to ``op2``.
- "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
  is greater than ``op2``.
- "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
  is greater than or equal to ``op2``.
- "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
  is less than ``op2``.
- "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
  is less than or equal to ``op2``.
- "``one``": yields ``true`` if both operands are not a NAN and ``op1``
  is not equal to ``op2``.
- "``ord``": yields ``true`` if both operands are not a NAN.
- "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
  equal to ``op2``.
- "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
  greater than ``op2``.
- "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
  greater than or equal to ``op2``.
- "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
  less than ``op2``.
- "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
  less than or equal to ``op2``.
- "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
  not equal to ``op2``.
- "``uno``": yields ``true`` if either operand is a NAN.

The quiet comparison operation performed by
'``llvm.experimental.constrained.fcmp``' will only raise an exception
if either operand is a SNAN. The signaling comparison operation
performed by '``llvm.experimental.constrained.fcmps``' will raise an
exception if either operand is a NAN (QNAN or SNAN).
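
The difference between the quiet and signaling forms can be sketched in IR
as follows. This is an illustrative example only, not a normative one: the
function names ``@lt_quiet`` and ``@lt_signaling`` are hypothetical, and the
``.f64`` name suffix, the ``!"fpexcept.strict"`` metadata string and the
``strictfp`` attribute are assumed from the general conventions for
constrained intrinsics rather than stated in this section.

::

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double, metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)

      ; Quiet comparison: raises an exception only if an operand is a SNAN.
      define i1 @lt_quiet(double %a, double %b) #0 {
        %r = call i1 @llvm.experimental.constrained.fcmp.f64(
                          double %a, double %b,
                          metadata !"olt",
                          metadata !"fpexcept.strict") #0
        ret i1 %r
      }

      ; Signaling comparison: raises an exception if an operand is any NAN.
      define i1 @lt_signaling(double %a, double %b) #0 {
        %r = call i1 @llvm.experimental.constrained.fcmps.f64(
                          double %a, double %b,
                          metadata !"olt",
                          metadata !"fpexcept.strict") #0
        ret i1 %r
      }

      attributes #0 = { strictfp }

Both functions yield ``false`` when either operand is a NAN, because ``olt``
is an ordered condition code. They differ only in which NAN operands raise
an exception.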

'``llvm.experimental.constrained.fmuladd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
                                             <type> <op3>,
                                             metadata <rounding mode>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
multiply-add expressions that can be fused if the code generator determines
that (a) the target instruction set has support for a fused operation,
and (b) that the fused operation is more efficient than the equivalent,
separate pair of mul and add instructions.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
intrinsic must be floating-point or vector of floating-point values.
All three arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
                                                                 metadata <rounding mode>,
                                                                 metadata <exception behavior>)

is equivalent to the expression:

::

      %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)
      %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)

except that it is unspecified whether rounding will be performed between the
multiplication and addition steps. Fusion is not guaranteed, even if the target
platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
used instead.
This intrinsic never sets ``errno``, just like
'``llvm.experimental.constrained.fma.*``'.

Constrained libm-equivalent Intrinsics
--------------------------------------

In addition to the basic floating-point operations for which constrained
intrinsics are described above, there are constrained versions of various
operations which provide equivalent behavior to a corresponding libm function.
These intrinsics allow the precise behavior of these operations with respect to
rounding mode and exception behavior to be controlled.

As with the basic constrained floating-point intrinsics, the rounding mode
and exception behavior arguments only control the behavior of the optimizer.
They do not change the runtime floating-point environment.


'``llvm.experimental.constrained.sqrt``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sqrt(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
of the specified value, returning the same value as the libm '``sqrt``'
functions would, but without setting ``errno``.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the nonnegative square root of the specified value.
If the value is less than negative zero, a floating-point exception occurs
and the return value is architecture specific.


'``llvm.experimental.constrained.pow``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power,
returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.


'``llvm.experimental.constrained.powi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand. The
order of evaluation of multiplications is not defined. When a vector of
floating-point type is used, the second argument remains a scalar integer value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type. The second argument is a 32-bit signed integer specifying the power to
which the first argument should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.


'``llvm.experimental.constrained.sin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sin(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the sine of the specified operand, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.cos``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.cos(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the cosine of the specified operand, returning the
same values as the libm ``cos`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.exp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
exponential of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.exp2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
exponential of the specified value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.


Semantics:
""""""""""

This function returns the same values as the libm ``log`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log10``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log10(<type> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log10`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.rint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.rint(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.rint``' intrinsic returns the first
operand rounded to the nearest integer. It may raise an inexact floating-point
exception if the operand is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.lrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lrint(<fptype> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
operand rounded to the nearest integer. An inexact floating-point exception
will be raised if the operand is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument.
The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.lrint`` intrinsic.


'``llvm.experimental.constrained.llrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.llrint(<fptype> <op1>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
operand rounded to the nearest integer. An inexact floating-point exception
will be raised if the operand is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``llrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument. The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.llrint`` intrinsic.


'``llvm.experimental.constrained.nearbyint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.nearbyint(<type> <op1>,
                                               metadata <rounding mode>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
operand rounded to the nearest integer. It will not raise an inexact
floating-point exception if the operand is not an integer.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.maxnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for maxNum.


'``llvm.experimental.constrained.minnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for minNum.


'``llvm.experimental.constrained.maximum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2018.
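
Because the minimum and maximum intrinsics take no rounding mode argument, a
call carries only the exception behavior metadata. The following is a minimal
sketch: the function name ``@clamp_low`` is hypothetical, and the ``.f32``
name suffix, the ``!"fpexcept.strict"`` metadata string and the ``strictfp``
attribute are assumed from the general conventions for constrained
intrinsics rather than stated in this section.

::

      declare float @llvm.experimental.constrained.maximum.f32(float, float, metadata)

      ; Returns the larger of %x and %lo. A NaN in either operand propagates
      ; to the result, and -0.0 is treated as less than +0.0.
      define float @clamp_low(float %x, float %lo) #0 {
        %r = call float @llvm.experimental.constrained.maximum.f32(
                             float %x, float %lo,
                             metadata !"fpexcept.strict") #0
        ret float %r
      }

      attributes #0 = { strictfp }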


'``llvm.experimental.constrained.minimum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2018.


'``llvm.experimental.constrained.ceil``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.ceil(<type> <op1>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.floor``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.floor(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.round``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.round(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.round``' intrinsic returns the first
operand rounded to the nearest integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``round`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.roundeven``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.roundeven(<type> <op1>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
operand rounded to the nearest integer in floating-point format, rounding
halfway cases to even (that is, to the nearest value that is an even integer),
regardless of the current rounding direction.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as C standard function ``roundeven`` and can signal
the invalid operation exception for a SNAN operand.


'``llvm.experimental.constrained.lround``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lround(<fptype> <op1>,
                                            metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lround``' intrinsic returns the first
operand rounded to the nearest integer with ties away from zero. It will
raise an inexact floating-point exception if the operand is not an integer.
An invalid exception is raised if the result is too large to fit into a
supported integer type, and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number.
The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lround`` intrinsic and the ``lround``
libm functions.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``lround`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.llround``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.llround(<fptype> <op1>,
                                             metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.llround``' intrinsic returns the first
operand rounded to the nearest integer with ties away from zero. It will
raise an inexact floating-point exception if the operand is not an integer.
An invalid exception is raised if the result is too large to fit into a
supported integer type, and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llround`` intrinsic and the ``llround``
libm functions.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``llround`` functions
would and handles error conditions in the same way.
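
The rounding intrinsics that return an integer are overloaded on both the
result and the operand type. The following is a minimal sketch of a
``double`` to ``i64`` conversion: the function name ``@round_to_i64`` is
hypothetical, and the ``.i64.f64`` name suffix, the ``!"fpexcept.strict"``
metadata string and the ``strictfp`` attribute are assumed from the general
conventions for constrained intrinsics rather than stated in this section.

::

      declare i64 @llvm.experimental.constrained.llround.i64.f64(double, metadata)

      ; Rounds %x to the nearest integer, with ties away from zero, so an
      ; input of 2.5 yields 3 and an input of -2.5 yields -3.
      define i64 @round_to_i64(double %x) #0 {
        %r = call i64 @llvm.experimental.constrained.llround.i64.f64(
                           double %x,
                           metadata !"fpexcept.strict") #0
        ret i64 %r
      }

      attributes #0 = { strictfp }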


'``llvm.experimental.constrained.trunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.trunc(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
operand rounded to the nearest integer not larger in magnitude than the
operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would and handles error conditions in the same way.


Floating Point Environment Manipulation intrinsics
--------------------------------------------------

These functions read or write the floating point environment, such as the
rounding mode or the state of floating point exceptions. Altering the floating
point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.

'``llvm.flt.rounds``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.flt.rounds()

Overview:
"""""""""

The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.

Semantics:
""""""""""

The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of
``FLT_ROUNDS``, specified by the C standard:

::

    0  - toward zero
    1  - to nearest, ties to even
    2  - toward positive infinity
    3  - toward negative infinity
    4  - to nearest, ties away from zero

Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.

General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE* you must specify an address space for
the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.codeview.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This annotation emits a label at its program point and an associated
``S_ANNOTATION`` codeview record with some additional string metadata. This is
used to implement MSVC's ``__annotation`` intrinsic. It is marked
``noduplicate``, so calls to this intrinsic prevent inlining and should be
considered expensive.

::

      declare void @llvm.codeview.annotation(metadata)

Arguments:
""""""""""

The argument should be an MDTuple containing any number of MDStrings.
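
A minimal sketch of a call to ``llvm.codeview.annotation`` is shown below. The
function name ``@annotated`` and the two strings are hypothetical; the only
requirement taken from the description above is that the argument is a
metadata tuple of strings.

.. code-block:: llvm

      declare void @llvm.codeview.annotation(metadata)

      ; Hypothetical function carrying one annotation at its entry point.
      define void @annotated() {
      entry:
        call void @llvm.codeview.annotation(metadata !0)
        ret void
      }

      ; An MDTuple of MDStrings; the contents are placeholders.
      !0 = !{!"tag", !"message"}
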

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() cold noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal use.
This intrinsic exists because the IR form of Stack Protector is still
supported in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently, some platforms (e.g. X86 Linux) have IR-level customized stack
guard loading that is not handled by ``llvm.stackguard()``, although it should
be in the future.
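
For illustration only, the following sketch shows how the two stack protector
intrinsics described above fit together in a function prologue. It is meant to
resemble IR inserted internally by the compiler, not code a frontend should
emit. The names ``@protected``, ``%slot`` and ``%buf`` are hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stackguard()
      declare void @llvm.stackprotector(i8*, i8**)

      ; Hypothetical function with a local buffer that needs protection.
      define void @protected() sspstrong {
      entry:
        %slot = alloca i8*          ; stack slot that will hold the guard
        %buf = alloca [64 x i8]     ; local variable placed after the guard
        %guard = call i8* @llvm.stackguard()
        call void @llvm.stackprotector(i8* %guard, i8** %slot)
        ; ... function body using %buf would go here ...
        ret void
      }
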

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to the
optimizer to determine whether a) an operation (like memcpy) will overflow a
buffer that corresponds to an object, or b) that a runtime check for overflow
isn't necessary. An object in this context means an allocation of a specific
class, structure, array, or other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
pointer to or into the ``object``. The second argument determines whether
``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
in address space 0 is used as its pointer argument. If it's ``false``,
``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
the ``null`` is in a non-zero address space or if ``true`` is given for the
third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
argument to ``llvm.objectsize`` determines if the value should be evaluated at
runtime.

The second, third, and fourth arguments only accept constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
the object concerned. If the size cannot be determined, ``llvm.objectsize``
returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
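
The following is a minimal sketch of ``llvm.objectsize`` applied to a pointer
into a local array. The function name ``@remaining_bytes`` is hypothetical, and
the declaration uses the mangled name ``llvm.objectsize.i64.p0i8``, which
assumes the usual overload suffix for an ``i8*`` argument in address space 0.

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1, i1)

      ; Hypothetical function: asks how many bytes remain in a 16-byte
      ; buffer when starting 4 bytes into it.
      define i64 @remaining_bytes() {
      entry:
        %buf = alloca [16 x i8]
        %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
        ; min = false: an unknown size would be reported as -1.
        ; nullunknown = true, dynamic = false.
        %n = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false, i1 true, i1 false)
        ret i64 %n
      }

Because the allocation size and the offset are both known here, an optimizer
would typically be able to replace ``%n`` with the constant 12.
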

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is an expected value.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

'``llvm.expect.with.probability``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
You can use ``llvm.expect.with.probability`` on any integer bit width.

::

      declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
      declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
      declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)

Overview:
"""""""""

The ``llvm.expect.with.probability`` intrinsic provides information about the
expected value of ``val`` with probability (or confidence) ``prob``, which can
be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
argument is a value. The second argument is an expected value. The third
argument is a probability.
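
As a sketch of how the three arguments are typically combined, the example
below marks a comparison result as most likely ``false`` with probability 0.75
before branching on it. The function name ``@check`` and the block names are
hypothetical.

.. code-block:: llvm

      declare i1 @llvm.expect.with.probability.i1(i1, i1, double)

      ; Hypothetical function: the zero case is assumed to be uncommon.
      define i32 @check(i32 %x) {
      entry:
        %is_zero = icmp eq i32 %x, 0
        ; Expected value of %is_zero is false, with probability 0.75.
        %hinted = call i1 @llvm.expect.with.probability.i1(
                            i1 %is_zero, i1 false, double 7.500000e-01)
        br i1 %hinted, label %zero, label %nonzero

      zero:
        ret i32 -1

      nonzero:
        ret i32 %x
      }
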

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

More complex assumptions can be encoded as
:ref:`assume operand bundles <assume_opbundles>`.

Arguments:
""""""""""

The argument of the call is the condition which the optimizer may assume is
always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.

.. _int_ssa_copy:

'``llvm.ssa_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.ssa_copy(type %operand) returned(1) readnone

Arguments:
""""""""""

The first argument is an operand which is used as the returned value.

Overview:
"""""""""

The ``llvm.ssa_copy`` intrinsic can be used to attach information to
operations by copying them and giving them new names. For example,
the PredicateInfo utility uses it to build Extended SSA form, and
attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.

.. _type.checked.load:

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer. The
third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.
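
The rules above suggest the following usage pattern, shown here as a minimal
sketch: the second element of the result is checked before the loaded function
pointer is used. The function name ``@call_virtual``, the type identifier
``!"_ZTS1A"`` and the callee signature ``void (i8*)`` are hypothetical.

.. code-block:: llvm

      declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
      declare void @llvm.trap()

      ; Hypothetical virtual call through the first slot of %vtable.
      define void @call_virtual(i8* %vtable, i8* %obj) {
      entry:
        %pair = call {i8*, i1} @llvm.type.checked.load(
                                   i8* %vtable, i32 0, metadata !"_ZTS1A")
        %ok = extractvalue {i8*, i1} %pair, 1
        br i1 %ok, label %cont, label %fail

      cont:
        ; Only use the loaded pointer when the check succeeded.
        %fptr.raw = extractvalue {i8*, i1} %pair, 0
        %fptr = bitcast i8* %fptr.raw to void (i8*)*
        call void %fptr(i8* %obj)
        ret void

      fail:
        call void @llvm.trap()
        unreachable
      }
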


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (besides ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined). The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` are defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.

In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``. Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize a guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); and this allows for "check widening" type optimizations.

``@llvm.experimental.guard`` cannot be invoked.

After ``@llvm.experimental.guard`` was first added, a more general
formulation was found in ``@llvm.experimental.widenable.condition``.
Support for ``@llvm.experimental.guard`` is slowly being rephrased in
terms of this alternate.

'``llvm.experimental.widenable.condition``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.experimental.widenable.condition()

Overview:
"""""""""

This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is ``true`` or ``false``, the program is correct and
well-defined.

Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
``@llvm.experimental.widenable.condition`` allows frontends to
express guards or checks on optimistic assumptions made during
compilation and represent them as branch instructions on special
conditions.

While this may appear similar in semantics to ``undef``, it is very
different in that an invocation produces a particular, singular
value. It is also intended to be lowered late, and remain available
for specific optimizations and transforms that can benefit from its
special properties.

Arguments:
""""""""""

None.

Semantics:
""""""""""

The intrinsic ``@llvm.experimental.widenable.condition()``
returns either ``true`` or ``false``. For each evaluation of a call
to this intrinsic, the program must be valid and correct both if
it returns ``true`` and if it returns ``false``. This allows
transformation passes to replace evaluations of this intrinsic
with either value whenever one is beneficial.

When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as in the
example below:

.. code-block:: text

    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.

Whether the result of the intrinsic's call is ``true`` or ``false``,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
``i1`` expressions.

This is how it can be used to represent guards as widenable branches:

.. code-block:: text

  block:
    ; Unguarded instructions
    call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
    ; Guarded instructions

This can be expressed in an equivalent alternative form, as an explicit branch
using ``@llvm.experimental.widenable.condition``:

.. code-block:: text

  block:
    ; Unguarded instructions
    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
    %guard_condition = and i1 %cond, %widenable_condition
    br i1 %guard_condition, label %guarded, label %deopt

  guarded:
    ; Guarded instructions

  deopt:
    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]

So the block ``guarded`` is only reachable when ``%cond`` is ``true``,
and it should be valid to go to the block ``deopt`` whenever ``%cond``
is ``true`` or ``false``.

``@llvm.experimental.widenable.condition`` will never throw, thus
it cannot be invoked.

Guard widening:
"""""""""""""""

When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.

Guard widening looks like the replacement of

.. code-block:: text

    %widenable_cond = call i1 @llvm.experimental.widenable.condition()
    %guard_cond = and i1 %cond, %widenable_cond
    br i1 %guard_cond, label %guarded, label %deopt

with

.. code-block:: text

    %widenable_cond = call i1 @llvm.experimental.widenable.condition()
    %new_cond = and i1 %any_other_cond, %widenable_cond
    %new_guard_cond = and i1 %cond, %new_cond
    br i1 %new_guard_cond, label %guarded, label %deopt

for this branch. Here ``%any_other_cond`` is an arbitrarily chosen
well-defined ``i1`` value. By widening the guard, we may
impose stricter conditions on the ``guarded`` block and bail to the
``deopt`` block when the new condition is not met.

Lowering:
"""""""""

The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant ``true``. However, it is always correct to replace
it with any other ``i1`` value. Any pass can
freely do so if it can benefit from non-default lowering.


'``llvm.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly

Overview:
"""""""""

This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
adds ``%ptr`` to that value and returns it. The constant folder specifically
recognizes the form of this intrinsic and the constant initializers it may
load from; if a loaded constant initializer is known to have the form
``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function.
However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

.. _llvm_sideeffect:

'``llvm.sideeffect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.sideeffect() inaccessiblememonly nounwind

Overview:
"""""""""

The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
treat it as having side effects, so it can be inserted into a loop to
indicate that the loop shouldn't be assumed to terminate (which could
potentially lead to the loop being optimized away entirely), even if it's
an infinite loop with no other side effects.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic actually does nothing, but optimizers must assume that it
has externally observable side effects.

'``llvm.is.constant.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any
argument type.

::

      declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
      declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
      declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone

Overview:
"""""""""

The '``llvm.is.constant``' intrinsic will return true if the argument
is known to be a manifest compile-time constant. It is guaranteed to
fold to either true or false before generating machine code.

Semantics:
""""""""""

This intrinsic generates no code.
If its argument is known to be a
manifest compile-time constant value, then the intrinsic will be
converted to a constant true value. Otherwise, it will be converted to
a constant false value.

In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during compilation), then the intrinsic evaluates to
false.

The result also intentionally depends on the result of optimization
passes -- e.g., the result can change depending on whether a
function gets inlined or not. A function's parameters are
obviously not constant. However, a call like
``llvm.is.constant.i32(i32 %param)`` *can* return true after the
function is inlined, if the value passed to the function parameter was
a constant.

On the other hand, if constant folding is not run, it will never
evaluate to true, even in simple cases.

.. _int_ptrmask:

'``llvm.ptrmask``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable

Arguments:
""""""""""

The first argument is a pointer. The second argument is an integer.

Overview:
"""""""""

The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
This allows stripping data from tagged pointers without converting them to an
integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
to facilitate alias analysis and underlying-object detection.

Semantics:
""""""""""

The result of ``ptrmask(ptr, mask)`` is equivalent to
``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``.
Both the returned
pointer and the first argument are based on the same underlying object (for more
information on the *based on* terminology see
:ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
mask argument does not match the pointer size of the target, the mask is
zero-extended or truncated accordingly.

.. _int_vscale:

'``llvm.vscale``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()

Overview:
"""""""""

The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
vectors such as ``<vscale x 16 x i8>``.

Semantics:
""""""""""

``vscale`` is a positive value that is constant throughout program
execution, but is unknown at compile time.
If the result value does not fit in the result type, then the result is
a :ref:`poison value <poisonvalues>`.


Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.

Element Wise Atomic Memory Intrinsics
-------------------------------------

These intrinsics are similar to the standard library memory intrinsics except
that they perform memory transfer as a sequence of atomic memory accesses.

.. _int_memcpy_element_unordered_atomic:

'``llvm.memcpy.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
any integer bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                       i8* <src>,
                                                                       i32 <len>,
                                                                       i32 <element_size>)
      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                       i8* <src>,
                                                                       i64 <len>,
                                                                       i32 <element_size>)

Overview:
"""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
as arrays with elements that are exactly ``element_size`` bytes, and the copy between
buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap.
The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memcpy_element_unordered_atomic_*``, where ``*`` is replaced with the
actual element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

'``llvm.memmove.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. However, not all targets support all bit widths.

::

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                        i8* <src>,
                                                                        i32 <len>,
                                                                        i32 <element_size>)
      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                        i8* <src>,
                                                                        i64 <len>,
                                                                        i32 <element_size>)

Overview:
"""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
``src`` are treated as arrays with elements that are exactly ``element_size``
bytes, and the copy between buffers uses a sequence of
:ref:`unordered atomic <ordering>` load/store operations that are a positive
integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that both the source and destination pointers are aligned to
that boundary.

Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap.
The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where ``*`` is
replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
                                                                  i8 <value>,
                                                                  i32 <len>,
                                                                  i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
                                                                  i8 <value>,
                                                                  i64 <len>,
                                                                  i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
memory starting at the destination location to the given ``value``. The memory is
set with a sequence of store operations where each access is guaranteed to be a
multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.

The order of the assignment is unspecified.
Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memset_element_unordered_atomic_*``, where ``*`` is replaced with the
actual element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.

Objective-C ARC Runtime Intrinsics
----------------------------------

LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
LLVM is aware of the semantics of these functions, and optimizes based on that
knowledge. You can read more about the details of Objective-C ARC `here
<https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.

'``llvm.objc.autorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.autorelease(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.

'``llvm.objc.autoreleasePoolPop``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.objc.autoreleasePoolPop(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
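
As an illustration, the following sketch brackets a call with an autorelease
pool. It assumes, following the runtime entry points these intrinsics lower
to, that the token returned by ``@llvm.objc.autoreleasePoolPush`` is the value
later passed to ``@llvm.objc.autoreleasePoolPop``. The names ``@do_work`` and
``@with_autorelease_pool`` are hypothetical and not part of LLVM.

.. code-block:: llvm

  declare i8* @llvm.objc.autoreleasePoolPush()
  declare void @llvm.objc.autoreleasePoolPop(i8*)

  ; Hypothetical external function standing in for code that autoreleases objects.
  declare void @do_work()

  define void @with_autorelease_pool() {
  entry:
    ; Push a new pool and keep the opaque token it returns.
    %pool = call i8* @llvm.objc.autoreleasePoolPush()
    call void @do_work()
    ; Pop the pool identified by the token.
    call void @llvm.objc.autoreleasePoolPop(i8* %pool)
    ret void
  }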

'``llvm.objc.autoreleasePoolPush``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.autoreleasePoolPush()

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.

'``llvm.objc.autoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.autoreleaseReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.

'``llvm.objc.copyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.objc.copyWeak(i8**, i8**)

Lowering:
"""""""""

Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.

'``llvm.objc.destroyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.objc.destroyWeak(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.

'``llvm.objc.initWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.initWeak(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
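
The weak-reference intrinsics above operate on ``i8**`` slots. The sketch below
shows one plausible sequence: initialize a slot, copy it to a second slot, then
destroy both. The operand order for ``@llvm.objc.copyWeak`` (destination first,
then source) is taken from the linked ``objc_copyWeak`` entry point; the
function name ``@weak_slots`` and the value names are hypothetical.

.. code-block:: llvm

  declare i8* @llvm.objc.initWeak(i8**, i8*)
  declare void @llvm.objc.copyWeak(i8**, i8**)
  declare void @llvm.objc.destroyWeak(i8**)

  define void @weak_slots(i8* %obj) {
  entry:
    %a = alloca i8*
    %b = alloca i8*
    ; Initialize the slot %a as a weak reference to %obj.
    %init = call i8* @llvm.objc.initWeak(i8** %a, i8* %obj)
    ; Copy the weak reference from %a (source) into %b (destination).
    call void @llvm.objc.copyWeak(i8** %b, i8** %a)
    ; Destroy both weak slots before their storage goes away.
    call void @llvm.objc.destroyWeak(i8** %b)
    call void @llvm.objc.destroyWeak(i8** %a)
    ret void
  }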

'``llvm.objc.loadWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.loadWeak(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.

'``llvm.objc.loadWeakRetained``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.loadWeakRetained(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.

'``llvm.objc.moveWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.objc.moveWeak(i8**, i8**)

Lowering:
"""""""""

Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.

'``llvm.objc.release``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.objc.release(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.

'``llvm.objc.retain``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.retain(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
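
A minimal sketch of a balanced retain/release pair follows. It relies on the
declarations given above; ``@use_object`` and ``@retain_use_release`` are
hypothetical names introduced only for illustration.

.. code-block:: llvm

  declare i8* @llvm.objc.retain(i8*)
  declare void @llvm.objc.release(i8*)

  ; Hypothetical external function that consumes the object pointer.
  declare void @use_object(i8*)

  define void @retain_use_release(i8* %obj) {
  entry:
    ; Retain the object for the duration of the use.
    %retained = call i8* @llvm.objc.retain(i8* %obj)
    call void @use_object(i8* %retained)
    ; Balance the retain once the object is no longer needed here.
    call void @llvm.objc.release(i8* %retained)
    ret void
  }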

'``llvm.objc.retainAutorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.retainAutorelease(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.

'``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.

'``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.

'``llvm.objc.retainBlock``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.retainBlock(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.

'``llvm.objc.storeStrong``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.objc.storeStrong(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.

'``llvm.objc.storeWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.objc.storeWeak(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.

Preserving Debug Information Intrinsics
---------------------------------------

These intrinsics are used to carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. Such information is lost in the IR ``getelementptr`` instruction,
since IR types differ from debuginfo types and unions
are converted to structs in IR.

'``llvm.preserve.array.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ret_type>
      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
                                                                           i32 dim,
                                                                           i32 index)

Overview:
"""""""""

The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
based on array base ``base``, array dimension ``dim`` and the last access index ``index``
into the array. The return type ``ret_type`` is a pointer type to the array element.
The array ``dim`` and ``index`` are preserved, which is more robust than a
``getelementptr`` instruction that may be subject to compiler transformation.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the array or pointer debuginfo type.
The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
debuginfo version of ``type``.

Arguments:
""""""""""

The ``base`` is the array base address. The ``dim`` is the array dimension.
The ``base`` is a pointer if ``dim`` equals 0.
The ``index`` is the last access index into the array or pointer.

Semantics:
""""""""""

The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a ``getelementptr`` with base ``base`` and access operands ``{dim's 0's, index}``,
that is, ``dim`` zero indices followed by ``index``.

'``llvm.preserve.union.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
                                                                        i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
``di_index`` and returns the ``base`` address.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the union debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``type`` is the same as the ``base`` type.

Arguments:
""""""""""

The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.

Semantics:
""""""""""

The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
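
The following sketch shows the shape of a call, using the typed-pointer syntax
of this document. The type ``%union.u``, the function ``@second_member`` and the
mangled name suffix are assumptions made for illustration, and the
``!llvm.preserve.access.index`` metadata attachment described above is omitted
for brevity.

.. code-block:: llvm

  ; Hypothetical IR type produced for a source-level union.
  %union.u = type { i32 }

  declare %union.u* @llvm.preserve.union.access.index.p0s_union.us.p0s_union.us(%union.u*, i32)

  define %union.u* @second_member(%union.u* %arg) {
  entry:
    ; Records debuginfo field index 1; the returned pointer is the base address.
    %p = call %union.u* @llvm.preserve.union.access.index.p0s_union.us.p0s_union.us(%union.u* %arg, i32 1)
    ret %union.u* %p
  }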

'``llvm.preserve.struct.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ret_type>
      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
                                                                 i32 gep_index,
                                                                 i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
based on struct base ``base`` and IR struct member index ``gep_index``.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the struct debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``ret_type`` is a pointer type to the structure member.

Arguments:
""""""""""

The ``base`` is the structure base address. The ``gep_index`` is the struct member index
based on IR structures. The ``di_index`` is the struct member index based on debuginfo.

Semantics:
""""""""""

The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
as a ``getelementptr`` with base ``base`` and access operands ``{0, gep_index}``.

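
To make the stated equivalence concrete, the sketch below places a call next to
the ``getelementptr`` it corresponds to. It uses the typed-pointer syntax of
this document. The type ``%struct.s``, both function names and the mangled name
suffix are assumptions made for illustration, and the
``!llvm.preserve.access.index`` metadata attachment is omitted for brevity.

.. code-block:: llvm

  ; Hypothetical IR struct with two i32 members.
  %struct.s = type { i32, i32 }

  declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(%struct.s*, i32, i32)

  define i32* @member_via_intrinsic(%struct.s* %arg) {
  entry:
    ; gep_index = 1 (IR member index), di_index = 1 (debuginfo member index).
    %p = call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(%struct.s* %arg, i32 1, i32 1)
    ret i32* %p
  }

  define i32* @member_via_gep(%struct.s* %arg) {
  entry:
    ; Access operands {0, gep_index} yield the same address.
    %p = getelementptr %struct.s, %struct.s* %arg, i32 0, i32 1
    ret i32* %p
  }

In this example the two indices coincide. They can differ when the IR struct
layout does not match the debuginfo member order, which is why both are passed.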