==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language.
There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
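
As a minimal sketch of the named and unnamed formats listed above, the
following fragment uses a named global, a named parameter, a quoted local name
and an unnamed temporary. The function ``@scale`` and the quoted name are
hypothetical and chosen only for illustration; the global is defined but not
used.

.. code-block:: llvm

    ; Named global value (prefix '@').
    @DivisionByZero = global i32 0

    define i32 @scale(i32 %a.really.long.identifier) {
      ; A quoted name may contain characters outside the regular
      ; expression above, here a space.
      %"sum plus one" = add i32 %a.really.long.identifier, 1
      ; Unnamed value. The unlabeled entry block takes number 0, so the
      ; first unnamed temporary in this function is %1.
      %1 = mul i32 %"sum plus one", 2
      ret i32 %1
    }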

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document.
When demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded.
    Note that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but they
    are used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak symbols``, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which llvm
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked, if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers).
    This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    - On *X86-32* only supports up to 4 bit type parameters. No
      floating-point types are supported.
    - On *X86-64* only supports up to 10 bit type parameters and 6
      floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires both the
    caller and callee are using it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    llvm.experimental.patchpoint because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible.
    This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the ObjectiveC
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the ObjectiveC runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too.
    The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the ObjectiveC runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#id80>`_ This
    calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on under which circumstances the different models may
be used.
The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect in the aliasee.

For platforms without linker support of ELF TLS model, the -femulated-tls
flag can be used to generate GCC compatible emulated TLS code.

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.

.. _nointptrtype:

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
integral (i.e. normal) pointers in that they convert integers to and from
corresponding pointer types, but there are additional implications to be
aware of. Because the bit-representation of a non-integral pointer may
not be stable, two identical casts of the same operand may or may not
return the same value. Said differently, the conversion to or from the
non-integral type depends on environmental state in an implementation
defined manner.

If the frontend wishes to observe a *particular* value following a cast, the
generated IR must fence with the underlying environment in an implementation
defined manner. (In practice, this tends to require ``noinline`` routines for
such operations.)

From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
non-integral types are analogous to ones on integral types with one
key exception: the optimizer may not, in general, insert new dynamic
occurrences of such casts. If a new cast is inserted, the optimizer would
need to either ensure that a) all possible values are valid, or b)
appropriate fencing is inserted. Since the appropriate fencing is
implementation defined, the optimizer can't do the latter.
The former is
challenging as many commonly expected properties, such as
``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Global variables can optionally specify a :ref:`linkage type <linkage>`.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified. If there
is a mismatch between the explicit or inferred section information for the
variable declaration and its definition the resulting behavior is undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program.
Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

External declarations may have an explicit section specified. Section
information is retained in LLVM IR for targets that make use of this
information. Attaching section information to an external declaration is an
assertion that its definition is located in the specified section. If the
definition is located in a different section, the behavior is undefined.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer.
This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used`` or dllexported variables. This assumption may be suppressed
by marking the variable with ``externally_initialized``.

An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it finds
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. The maximum alignment is ``1 << 32``.

For global variable declarations, as well as definitions that may be
replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
linkage types), LLVM makes no assumptions about the allocation size of the
variables, except that they may not overlap. The alignment of a global variable
declaration or replaceable definition must not be greater than the alignment of
the definition it resolves to.

Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

:ref:`Scalable vectors <t_vector>` cannot be global variables or members of
arrays because their size is unknown at compile time.
They are allowed in
structs to facilitate intrinsics returning multiple values. Structs containing
scalable vectors cannot be used in loads, stores, allocas, or GEPs.

Syntax::

    @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                       [DLLStorageClass] [ThreadLocal]
                       [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                       [ExternallyInitialized]
                       <global | constant> <Type> [<InitializerConstant>]
                       [, section "name"] [, comdat [($name)]]
                       [, align <Alignment>] (, !name !N)*

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional address space, an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
or ``local_unnamed_addr`` attribute, an optional address space, a return type,
an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
empty list of arguments, an optional alignment, an optional :ref:`garbage
collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
:ref:`prologue <prologuedata>`.

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function.
Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label name is not provided, a block is assigned
an implicit numbered label, using the next value from the same counter as used
for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
function entry block does not have an explicit label, it will be assigned label
"%0", then the first unnamed temporary in that block will be "%1", etc. If a
numeric label is explicitly specified, it must match the numeric label that
would be used implicitly.

The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it finds convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.
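
As an illustrative sketch of the rules above (all names, including the section
name ``"hot_text"``, are hypothetical), the following module shows a
declaration, a definition with an explicit entry label, section, alignment and
``unnamed_addr``, and a definition that relies on implicit numbering:

.. code-block:: llvm

    ; A declaration: no body, so no basic blocks.
    declare i32 @helper(i32)

    ; A definition with an explicit label on its entry block.
    define internal i32 @add_one(i32 %x) unnamed_addr section "hot_text" align 16 {
    entry:
      %sum = add i32 %x, 1
      %r = call i32 @helper(i32 %sum)
      ret i32 %r
    }

    ; The argument is named, so the unlabeled entry block takes the implicit
    ; label %0 and the first unnamed temporary is %1.
    define i32 @numbered(i32 %x) {
      %1 = add i32 %x, 1
      ret i32 %1
    }

In ``@numbered``, writing ``%0 = add ...`` instead would be rejected, because
``%0`` is already taken by the entry block.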

If an explicit address space is not given, it will default to the program
address space from the :ref:`datalayout string<langref_datalayout>`.

Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
           [section "name"] [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }

The argument list is a comma separated sequence of arguments where each
argument is of the following form:

Syntax::

       <type> [parameter Attrs] [name]


.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.
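
A minimal sketch of the alias syntax, using the typed-pointer notation of this
document (the names ``@counter``, ``@impl`` and the alias names are
hypothetical):

.. code-block:: llvm

    @counter = global i32 0

    ; Not unnamed_addr: @counter_alias has the same address as @counter.
    @counter_alias = alias i32, i32* @counter

    define void @impl() {
      ret void
    }

    ; A weak, unnamed_addr alias to a function: only guaranteed to point
    ; to the same content as @impl.
    @api = weak unnamed_addr alias void (), void ()* @impl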

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* No global value in the expression can be a declaration, since that
  would require a relocation, which is not possible.

.. _langref_ifunc:

IFuncs
-------

IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

An IFunc may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>


.. _langref_comdats:

Comdats
-------

Comdat IR provides access to object file COMDAT/section group functionality
which represents interrelated sections.

Comdats have a name which represents the COMDAT key and a selection kind to
provide input on how the linker deduplicates comdats with the same key in two
different object files. A comdat must be included or omitted as a unit.
Discarding the whole comdat is allowed but discarding a subset is not.

A global object may be a member of at most one comdat. Aliases are placed in the
same COMDAT that their aliasee computes to, if any.
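
As a rough sketch of these membership rules (the names ``$grp``, ``@data``,
``@reader`` and ``@data_alias`` are hypothetical, and the ``any`` selection
kind is described below), a variable and a function can share one comdat, and
an alias follows its aliasee:

.. code-block:: llvm

    $grp = comdat any

    ; Both objects are members of $grp, so the linker keeps or discards
    ; them together.
    @data = global i32 1, comdat($grp)

    define void @reader() comdat($grp) {
      ret void
    }

    ; No comdat is written on the alias; it is placed in $grp because its
    ; aliasee @data is a member of $grp.
    @data_alias = alias i32, i32* @data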

Syntax::

    $<Name> = comdat SelectionKind

For selection kinds other than ``nodeduplicate``, only one of the duplicate
comdats may be retained by the linker and the members of the remaining comdats
must be discarded. The following selection kinds are supported:

``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``nodeduplicate``
    No deduplication is performed.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

- XCOFF and Mach-O don't support COMDATs.
- COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
  a non-local linkage COMDAT symbol.
- ELF supports ``any`` and ``nodeduplicate``.
- WebAssembly only supports ``any``.

Here is an example of a COFF COMDAT where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

    $foo = comdat largest
    @foo = global i32 2, comdat($foo)

    define void @bar() comdat($foo) {
      ret void
    }

In a COFF object file, this will create a COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.

As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: llvm

    $foo = comdat any
    @foo = global i32 2, comdat
    @bar = global i32 3, comdat($foo)

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: llvm

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat($foo)
    @g2 = global i32 42, section "sec", comdat($bar)

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

#. Named metadata are represented as a string of characters with the
   metadata prefix. The rules for metadata names are the same as for
   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
   are still valid, which allows any character to be part of a name.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that any attributes for the function result (``nounwind``,
``readonly``) come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The byval type argument indicates the in-memory value type, and
    must be the same as the pointee type of the argument.

    The byval attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_byref:

``byref(<ty>)``

    The ``byref`` argument attribute allows specifying the pointee
    memory type of an argument. This is similar to ``byval``, but does
    not imply a copy is made anywhere, or that the argument is passed
    on the stack. This implies the pointer is dereferenceable up to
    the storage size of the type.

    It is not generally permissible to introduce a write to a
    ``byref`` pointer.
    The pointer may have any address space and may
    be read only.

    This is not a valid attribute for return values.

    The alignment for a ``byref`` parameter can be explicitly
    specified by combining it with the ``align`` attribute, similar to
    ``byval``. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

    This is intended for representing ABI constraints, and is not
    intended to be inferred for optimization use.

.. _attr_preallocated:

``preallocated(<ty>)``
    This indicates that the pointer parameter should really be passed by
    value to the function, and that the pointer parameter's pointee has
    already been initialized before the call instruction. This attribute
    is only valid on LLVM pointer arguments. The argument must be the value
    returned by the appropriate
    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
    ``musttail`` calls, or the corresponding caller parameter in ``musttail``
    calls, although it is ignored during codegen.

    A non ``musttail`` function call with a ``preallocated`` attribute in
    any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
    function call cannot have a ``"preallocated"`` operand bundle.

    The preallocated attribute requires a type argument, which must be
    the same as the pointee type of the argument.

    The preallocated attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_inalloca:

``inalloca(<ty>)``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments.
    An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    inalloca keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it. The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    behavior is undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    The inalloca attribute requires a type argument, which must be the
    same as the pointee type of the argument.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret(<ty>)``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee not
    to trap and to be properly aligned. This is not a valid attribute
    for return values.

    The sret type argument specifies the in-memory type, which must be
    the same as the pointee type of the argument.

.. _attr_elementtype:

``elementtype(<ty>)``

    The ``elementtype`` argument attribute can be used to specify a pointer
    element type in a way that is compatible with `opaque pointers
    <OpaquePointers.html>`__.

    The ``elementtype`` attribute by itself does not carry any specific
    semantics. However, certain intrinsics may require this attribute to be
    present and assign it particular semantics. This will be documented on
    individual intrinsics.

    The attribute may only be applied to pointer typed arguments of intrinsic
    calls. It cannot be applied to non-intrinsic calls, and cannot be applied
    to parameters on function declarations. For non-opaque pointers, the type
    passed to ``elementtype`` must match the pointer element type.

.. _attr_align:

``align <n>`` or ``align(<n>)``
    This indicates that the pointer value has the specified alignment.
    If the pointer value does not have the specified alignment,
    :ref:`poison value <poisonvalues>` is returned or passed instead. The
    ``align`` attribute should be combined with the ``noundef`` attribute to
    ensure a pointer is aligned, or otherwise the behavior is undefined. Note
    that ``align 1`` has no effect on non-byval, non-preallocated arguments.

    Note that this attribute has additional semantics when combined with the
    ``byval`` or ``preallocated`` attribute, which are documented there.

.. _noalias:

``noalias``
    This indicates that memory locations accessed via pointer values
    :ref:`based <pointeraliasing>` on the argument or return value are not also
    accessed, during the execution of the function, via pointer values not
    *based* on the argument or return value. This guarantee only holds for
    memory locations that are *modified*, by any means, during the execution of
    the function. The attribute on a return value also has additional semantics
    described below.
    The caller shares the responsibility with the callee for
    ensuring that these requirements are met. For further details, please see
    the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
    or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
    attribute on return values are stronger than the semantics of the attribute
    when used on function arguments. On function return values, the ``noalias``
    attribute indicates that the function acts like a system memory allocation
    function, returning a pointer to allocated storage disjoint from the
    storage for any other object accessible to the caller.

.. _nocapture:

``nocapture``
    This indicates that the callee does not :ref:`capture <pointercapture>` the
    pointer. This is not a valid attribute for return values.
    This attribute applies only to the particular copy of the pointer passed in
    this argument. A caller could pass two copies of the same pointer with one
    being annotated nocapture and the other not, and the callee could validly
    capture through the non-annotated parameter.

.. code-block:: llvm

    define void @f(i8* nocapture %a, i8* %b) {
      ; (capture %b)
    }

    call void @f(i8* @glb, i8* @glb) ; well-defined

``nofree``
    This indicates that the callee does not free the pointer argument. This is
    not a valid attribute for return values.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

``returned``
    This indicates that the function always returns the argument as its return
    value. This is a hint to the optimizer and code generator used when
    generating the caller, allowing value propagation, tail call optimization,
    and omission of register saves and restores in some cases; it is not
    checked or enforced when generating the callee. The parameter and the
    function return type must be valid operands for the
    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
    return values and can only be applied to one parameter.

``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters. This is not
    checked or enforced by LLVM; if the parameter or return pointer is null,
    :ref:`poison value <poisonvalues>` is returned or passed instead.
    The ``nonnull`` attribute should be combined with the ``noundef`` attribute
    to ensure a pointer is not null or otherwise the behavior is undefined.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space), except if the
    ``null_pointer_is_valid`` function attribute is present.
    ``n`` should be a positive number. The pointer should be well defined,
    otherwise it is undefined behavior.
    This means ``dereferenceable(<n>)``
    implies ``noundef``.

``dereferenceable_or_null(<n>)``
    This indicates that the parameter or return value isn't both
    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
    time. All non-null pointers tagged with
    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
    and in other address spaces ``dereferenceable_or_null(<n>)``
    implies that a pointer is at least one of ``dereferenceable(<n>)``
    or ``null`` (i.e. it may be both ``null`` and
    ``dereferenceable(<n>)``). This attribute may only be applied to
    pointer typed parameters.

``swiftself``
    This indicates that the parameter is the self/context parameter. This is not
    a valid attribute for return values and can only be applied to one
    parameter.

``swiftasync``
    This indicates that the parameter is the asynchronous context parameter and
    triggers the creation of a target-specific extended frame record to store
    this pointer. This is not a valid attribute for return values and can only
    be applied to one parameter.

``swifterror``
    This attribute is intended to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from, stored to, or used as
    a ``swifterror`` argument. This is not a valid attribute for return values
    and can only be applied to one parameter.

    These constraints allow the calling convention to optimize access to
    ``swifterror`` variables by associating them with a specific register at
    call boundaries rather than placing them in memory. Since this does change
    the calling convention, a function which uses the ``swifterror`` attribute
    on a parameter is not ABI-compatible with one which does not.

    These constraints also allow LLVM to assume that a ``swifterror`` argument
    does not alias any other memory visible within a function and that a
    ``swifterror`` alloca passed as an argument does not escape.

``immarg``
    This indicates the parameter is required to be an immediate
    value. This must be a trivial immediate integer or floating-point
    constant. Undef or constant expressions are not valid. This is
    only valid on intrinsic declarations and cannot be applied to a
    call site or arbitrary function.

``noundef``
    This attribute applies to parameters and return values. If the value
    representation contains any undefined or poison bits, the behavior is
    undefined. Note that this does not refer to padding introduced by the
    type's storage representation.

``alignstack(<n>)``
    This indicates the alignment that should be considered by the backend when
    assigning this parameter to a stack slot during calling convention
    lowering. The enforcement of the specified alignment is target-dependent,
    as target-specific calling convention rules may override this value. This
    attribute serves the purpose of carrying language specific alignment
    information that is not mapped to base types in the backend (for example,
    over-alignment specification through language attributes).

.. _gc:

Garbage Collector Strategy Names
--------------------------------

Each function may specify a garbage collector strategy name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.

.. _prefixdata:

Prefix Data
-----------

Prefix data is data associated with a function which the code
generator will emit immediately before the function's entrypoint.
The purpose of this feature is to allow frontends to associate
language-specific runtime metadata with specific functions and make it
available through the function pointer while still allowing the
function pointer to be called.

To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1. This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``,

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as,

.. code-block:: llvm

    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a

Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned.
If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.

A function may have prefix data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _prologuedata:

Prologue Data
-------------

The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
be inserted prior to the function body. This can be used for enabling
function hot-patching and instrumentation.

To maintain the semantics of ordinary function calls, the prologue data must
have a particular format. Specifically, it must begin with a sequence of
bytes which decode to a sequence of machine instructions, valid for the
module's target, which transfer control to the point immediately succeeding
the prologue data, without performing any other visible action. This allows
the inliner and other passes to reason about the semantics of the function
definition without needing to reason about the prologue data. Obviously this
makes the format of the prologue data highly target dependent.

A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
which encodes the ``nop`` instruction:

.. code-block:: text

    define void @f() prologue i8 144 { ... }

Generally prologue data can be formed by encoding a relative branch instruction
which skips the metadata, as in this example of valid prologue data for the
x86_64 architecture, where the first two bytes encode ``jmp .+10``:

.. code-block:: text

    %0 = type <{ i8, i8, i8* }>

    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }

A function may have prologue data but no body.
This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _personalityfn:

Personality Function
--------------------

The ``personality`` attribute permits functions to specify what function
to use for exception handling.

.. _attrgrp:

Attribute Groups
----------------

Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.

An attribute group is a module-level object. To use an attribute group, an
object references the attribute group's ID (e.g. ``#37``). An object may refer
to more than one attribute group. In that situation, the attributes from the
different groups are merged.

Here is an example of attribute groups for a function that should always be
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:

.. code-block:: llvm

    ; Target-independent attributes:
    attributes #0 = { alwaysinline alignstack=4 }

    ; Target-dependent attributes:
    attributes #1 = { "no-sse" }

    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
    define void @f() #0 #1 { ... }

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.
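
As an illustrative sketch of this point (the names ``@fast``, ``@slow`` and
``@table`` are hypothetical, and typed-pointer syntax is assumed), the two
functions below carry different attributes yet share the function type
``void ()``, so pointers to both fit in the same array:

.. code-block:: llvm

    ; Different function attributes, identical function type: void ()
    define void @fast() alwaysinline {
      ret void
    }

    define void @slow() noinline optsize {
      ret void
    }

    ; Both addresses have type void ()*, regardless of the attributes.
    @table = global [2 x void ()*] [void ()* @fast, void ()* @slow]

Because the attributes belong to the functions rather than to the type, an
indirect call through ``@table`` sees only ``void ()``.
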

Function attributes are simple keywords that follow the type specified.
If multiple attributes are needed, they are space separated. For
example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }

``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer. If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available. The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
``cold``
    This attribute indicates that this function is rarely called.
    When computing edge weights, basic blocks post-dominated by a cold
    function call are also considered cold and are therefore given low
    weight.
``convergent``
    In some parallel execution models, there exist operations that cannot be
    made control-dependent on any additional values. We call such operations
    ``convergent``, and mark them with this attribute.

    The ``convergent`` attribute may appear on functions or call/invoke
    instructions. When it appears on a function, it indicates that calls to
    this function should not be made control-dependent on additional values.
    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
    calls to this intrinsic cannot be made control-dependent on additional
    values.

    When it appears on a call/invoke, the ``convergent`` attribute indicates
    that we should treat the call as though we're calling a convergent
    function. This is particularly useful on indirect calls; without this we
    may treat such calls as though the target is non-convergent.

    The optimizer may remove the ``convergent`` attribute on functions when it
    can prove that the function does not execute any convergent operations.
    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
    can prove that the call/invoke cannot call a convergent function.
``disable_sanitizer_instrumentation``
    When instrumenting code with sanitizers, it can be important to skip certain
    functions to ensure no instrumentation is applied to them.

    This attribute is not always equivalent to the absence of
    ``sanitize_<name>`` attributes: depending on the specific sanitizer, code
    can be inserted into functions regardless of the ``sanitize_<name>``
    attribute to prevent false positive reports.

    ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
    taking precedence over the ``sanitize_<name>`` attributes and other compiler
    flags.
``"dontcall-error"``
    This attribute denotes that an error diagnostic should be emitted when a
    call of a function with this attribute is not eliminated via optimization.
    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
    such callees to attach information about where in the source language such a
    call came from. A string value can be provided as a note.
``"dontcall-warn"``
    This attribute denotes that a warning diagnostic should be emitted when a
    call of a function with this attribute is not eliminated via optimization.
    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
    such callees to attach information about where in the source language such a
    call came from. A string value can be provided as a note.
``"frame-pointer"``
    This attribute tells the code generator whether the function
    should keep the frame pointer. The code generator may emit the frame pointer
    even if this attribute says the frame pointer can be eliminated.
    The allowed string values are:

    * ``"none"`` (default) - the frame pointer can be eliminated.
    * ``"non-leaf"`` - the frame pointer should be kept if the function calls
      other functions.
    * ``"all"`` - the frame pointer should be kept.
``hot``
    This attribute indicates that this function is a hot spot of the program
    execution. The function will be optimized more aggressively and will be
    placed into a special subsection of the text section to improve locality.

    When profile feedback is enabled, this attribute takes precedence over
    the profile information.
    By marking a function ``hot``, users can work
    around the cases where the training input does not have good coverage
    on all the hot functions.
``inaccessiblememonly``
    This attribute indicates that the function may only access memory that
    is not accessible by the module being compiled. This is a weaker form
    of ``readnone``. If the function reads or writes other memory, the
    behavior is undefined.
``inaccessiblemem_or_argmemonly``
    This attribute indicates that the function may only access memory that is
    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``. If the
    function reads or writes other memory, the behavior is undefined.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer. Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
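
    As a hedged sketch only: because no prologue or epilogue is emitted, the
    body of a ``naked`` function is commonly written entirely in inline
    assembly that performs its own return. The function name ``@bare`` is
    hypothetical, and the ``ret`` mnemonic assumes an x86 target:

    .. code-block:: llvm

        ; No frame setup or teardown is generated around this body, so the
        ; inline assembly is responsible for returning to the caller.
        define void @bare() naked {
          call void asm sideeffect "ret", ""()
          unreachable
        }

    The trailing ``unreachable`` reflects that control never falls out of the
    assembly block; on other targets the assembly string would differ.
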
``"no-inline-line-tables"``
    When this attribute is set to true, the inliner discards source locations
    when inlining code and instead uses the source location of the call site.
    Breakpoints set on code that was inlined into the current function will
    not fire during the execution of the inlined call sites. If the debugger
    stops inside an inlined call site, it will appear to be stopped at the
    outermost inlined call site.
``"no-jump-tables"``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``nofree``
    This function attribute indicates that the function does not, directly or
    transitively, call a memory-deallocation function (``free``, for example)
    on a memory allocation which existed before the call.

    As a result, uncaptured pointers that are known to be dereferenceable
    prior to a call to a function with the ``nofree`` attribute are still
    known to be dereferenceable after the call.
    The capturing condition is
    necessary in environments where the function might communicate the
    pointer to another thread which then deallocates the memory. Alternatively,
    ``nosync`` would ensure such communication cannot happen and even captured
    pointers cannot be freed by the function.

    A ``nofree`` function is explicitly allowed to free memory which it
    allocated or (if not ``nosync``) arrange for another thread to free
    memory on its behalf. As a result, perhaps surprisingly, a ``nofree``
    function can return a pointer to a previously deallocated memory object.
``noimplicitfloat``
    Disallows implicit floating-point code. This inhibits optimizations that
    use floating-point code and floating-point/SIMD/vector registers for
    operations that are not nominally floating-point. LLVM instructions that
    perform floating-point operations or require access to floating-point
    registers may still cause floating-point code to be generated.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nomerge``
    This attribute indicates that calls to this function should never be merged
    during optimization. For example, it will prevent tail merging otherwise
    identical code sequences that raise an exception or terminate the program.
    Tail merging normally reduces the precision of source location information,
    making stack traces less useful for debugging. This attribute gives the
    user control over the tradeoff between code size and debug information
    precision.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``noprofile``
    This function attribute prevents instrumentation based profiling, used for
    coverage or profile based optimization, from being added to a function,
    even when inlined.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``"indirect-tls-seg-refs"``
    This attribute indicates that the code generator should not use
    direct TLS access through segment registers, even if the
    target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally, that is, through a return instruction. This produces undefined
    behavior at runtime if the function ever does dynamically return. Annotated
    functions may still raise an exception, i.e., ``nounwind`` is not implied.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.
``willreturn``
    This function attribute indicates that a call of this function will
    either exhibit undefined behavior or come back and continue execution
    at a point in the existing call stack that includes the current invocation.
    Annotated functions may still raise an exception, i.e., ``nounwind`` is not
    implied. If an invocation of an annotated function does not return control
    back to a point in the call stack, the behavior is undefined.
``nosync``
    This function attribute indicates that the function does not communicate
    (synchronize) with another thread through memory or other well-defined means.
    Synchronization is considered possible in the presence of ``atomic``
    accesses that enforce an order (that is, accesses that are not "unordered"
    or "monotonic"), ``volatile`` accesses, and ``convergent`` function calls.
    Note that through ``convergent`` function calls, non-memory communication
    (e.g., cross-lane operations) is possible and is also considered
    synchronization. However, ``convergent`` does not contradict ``nosync``.
    If an annotated function does ever synchronize with another thread,
    the behavior is undefined.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked ``nounwind`` may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation defined semantics.
``nosanitize_coverage``
    This attribute indicates that SanitizerCoverage instrumentation is disabled
    for this function.
``null_pointer_is_valid``
    If ``null_pointer_is_valid`` is set, then the ``null`` address
    in address-space 0 is considered to be a valid address for memory loads and
    stores. Any analysis or optimization should not treat dereferencing a
    pointer to ``null`` as undefined behavior in this function.
    Note: Comparing address of a global variable to ``null`` may still
    evaluate to false because of a limitation in querying this attribute inside
    constant expressions.
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

    * ``"prologue-short-redirect"`` - This style of patchable
      function is intended to support patching a function prologue to
      redirect control away from the function in a thread safe
      manner. It guarantees that the first instruction of the
      function will be large enough to accommodate a short jump
      instruction, and will be sufficiently aligned to allow being
      fully changed via an atomic compare-and-swap instruction.
      While the first requirement can be satisfied by inserting a large
      enough NOP, LLVM can and will try to re-purpose an existing
      instruction (i.e. one that would have to be emitted anyway) as
      the patchable instruction larger than a short jump.

      ``"prologue-short-redirect"`` is currently only supported on
      x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. Any semantic effects that the
    patching may have must be conveyed separately via the linkage type.
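
    A minimal sketch of how the attribute might be written, assuming an
    x86-64 target; the function name ``@patch_me`` is hypothetical:

    .. code-block:: llvm

        ; The string value selects the patching convention.
        define void @patch_me() "patchable-function"="prologue-short-redirect" {
          ret void
        }

    The attribute only shapes the generated prologue; how freely other passes
    may reason about the body still follows from the function's linkage.
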
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that accesses to the stack must be
    no further apart than the size of the guard region to a previous
    access of the stack. It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``readnone``
    On a function, this attribute indicates that the function computes its
    result (or decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc) visible to
    caller functions. It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means while it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods (since they write to memory), there may
    be non-``C++`` mechanisms that throw exceptions without writing to LLVM
    visible memory.

    On an argument, this attribute indicates that the function does not
    dereference that pointer argument, even though it may read or write the
    memory that the pointer points to if accessed through other pointers.

    If a readnone function reads or writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads from
    or writes to a readnone pointer argument, the behavior is undefined.
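
    The two uses described above can be sketched as follows. The function
    names are hypothetical, and typed-pointer syntax is assumed:

    .. code-block:: llvm

        ; On a function: the result depends only on the argument value, and
        ; no memory is read or written.
        define i32 @square(i32 %x) readnone nounwind {
          %r = mul i32 %x, %x
          ret i32 %r
        }

        ; On an argument: the pointer is compared but never dereferenced.
        define i1 @is_null(i8* readnone %p) readnone nounwind {
          %c = icmp eq i8* %p, null
          ret i1 %c
        }

    Adding a ``load`` or ``store`` through ``%p`` in ``@is_null`` would violate
    both attributes and make the behavior undefined.
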
``readonly``
    On a function, this attribute indicates that the function does not write
    through any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc) visible to
    caller functions. It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state. This means while it
    cannot unwind exceptions by calling the ``C++`` exception throwing methods
    (since they write to memory), there may be non-``C++`` mechanisms that throw
    exceptions without writing to LLVM visible memory.

    On an argument, this attribute indicates that the function does not write
    through this pointer argument, even though it may write to the memory that
    the pointer points to.

    If a readonly function writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function writes to
    a readonly pointer argument, the behavior is undefined.
``"stack-probe-size"``
    This attribute controls the behavior of stack probes: either
    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.

    If a function that has a ``"stack-probe-size"`` attribute is inlined into
    a function with another ``"stack-probe-size"`` attribute, the resulting
    function has the ``"stack-probe-size"`` attribute that has the lower
    numeric value.
    If a function that has a ``"stack-probe-size"`` attribute is
    inlined into a function that has no ``"stack-probe-size"`` attribute
    at all, the resulting function has the ``"stack-probe-size"`` attribute
    of the callee.
``"no-stack-arg-probe"``
    This attribute disables ABI-required stack probes, if any.
``writeonly``
    On a function, this attribute indicates that the function may write to but
    does not read from memory.

    On an argument, this attribute indicates that the function may write to but
    does not read through this pointer argument (even though it may read from
    the memory that the pointer points to).

    If a writeonly function reads memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads
    from a writeonly pointer argument, the behavior is undefined.
``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed arguments,
    with arbitrary offsets. In other words, all memory operations in the
    function can refer to memory only using pointers based on its function
    arguments.

    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute in order to specify that the function reads only from its
    arguments.

    If an argmemonly function reads or writes memory other than the pointer
    arguments, or has other side-effects, the behavior is undefined.
``returns_twice``
    This attribute indicates that this function can return twice. The C
    ``setjmp`` is an example of such a function. The compiler disables
    some optimizations (like tail calls) in the caller of these
    functions.
``safestack``
    This attribute indicates that
    `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
    protection is enabled for this function.

    If a function that has a ``safestack`` attribute is inlined into a
    function that doesn't have a ``safestack`` attribute or which has an
    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
    function will have a ``safestack`` attribute.
``sanitize_address``
    This attribute indicates that AddressSanitizer checks
    (dynamic address safety analysis) are enabled for this function.
``sanitize_memory``
    This attribute indicates that MemorySanitizer checks (dynamic detection
    of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``
    This attribute indicates that ThreadSanitizer checks
    (dynamic thread safety analysis) are enabled for this function.
``sanitize_hwaddress``
    This attribute indicates that HWAddressSanitizer checks
    (dynamic address safety analysis based on tagged pointers) are enabled for
    this function.
``sanitize_memtag``
    This attribute indicates that MemTagSanitizer checks
    (dynamic address safety analysis based on Armv8 MTE) are enabled for
    this function.
``speculative_load_hardening``
    This attribute indicates that
    `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
    should be enabled for the function body.

    Speculative Load Hardening is a best-effort mitigation against
    information leak attacks that make use of control flow
    misspeculation, specifically misspeculation of whether a branch
    is taken or not. Typically vulnerabilities enabling such attacks are
    classified as "Spectre variant #1". Notably, this does not attempt to
    mitigate against misspeculation of branch target, classified as
    "Spectre variant #2" vulnerabilities.

    When inlining, the attribute is sticky. Inlining a function that carries
    this attribute will cause the caller to gain the attribute.
    This is intended
    to provide a maximally conservative model where the code in a function
    annotated with this attribute will always (even after inlining) end up
    hardened.
``speculatable``
    This function attribute indicates that the function does not have any
    effects besides calculating its result and does not have undefined behavior.
    Note that ``speculatable`` is not enough to conclude that along any
    particular execution path the number of calls to this function will not be
    externally observable. This attribute is only valid on functions
    and declarations, not on individual call sites. If a function is
    incorrectly marked as speculatable and really does exhibit
    undefined behavior, the undefined behavior may be observed even
    if the call site is dead code.

``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary": a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    A function with the ``ssp`` attribute but without the ``alwaysinline``
    attribute cannot be inlined into a function without a
    ``ssp/sspreq/sspstrong`` attribute. If inlined, the caller will get the
    ``ssp`` attribute.
    ``call``, ``invoke``, and ``callbr`` instructions with
    the ``alwaysinline`` attribute force inlining.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors. The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type.
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.

    A function with the ``sspstrong`` attribute but without the
    ``alwaysinline`` attribute cannot be inlined into a function without a
    ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
    ``sspstrong`` attribute unless the ``sspreq`` attribute exists. ``call``,
    ``invoke``, and ``callbr`` instructions with the ``alwaysinline`` attribute
    force inlining.
``sspreq``
    This attribute indicates that the function should *always* emit a stack
    smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
    attributes.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #.
       Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    A function with the ``sspreq`` attribute but without the ``alwaysinline``
    attribute cannot be inlined into a function without a
    ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
    ``sspreq`` attribute. ``call``, ``invoke``, and ``callbr`` instructions
    with the ``alwaysinline`` attribute force inlining.

``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating-point semantics. LLVM will not attempt any
    optimizations that require assumptions about the floating-point rounding
    mode or that might alter the state of floating-point status flags that
    might otherwise be set or cleared by calling this function. LLVM will
    not introduce any new floating-point instructions that may trap.

``"denormal-fp-math"``
    This indicates the denormal (subnormal) handling that may be
    assumed for the default floating-point environment. This is a
    comma-separated pair. The elements may be one of ``"ieee"``,
    ``"preserve-sign"``, or ``"positive-zero"``. The first entry
    indicates the flushing mode for the result of floating-point
    operations. The second indicates the handling of denormal inputs
    to floating-point instructions. For compatibility with older
    bitcode, if the second value is omitted, both input and output
    modes will assume the same mode.

    If this attribute is not specified, the default is
    ``"ieee,ieee"``.
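
    As a minimal sketch of how the pair is written, the following
    attaches the attribute to a function through an attribute group.
    The function name ``@halve`` and the chosen modes are hypothetical
    and only illustrate the syntax: the first element
    (``preserve-sign``) is the output mode and the second (``ieee``)
    is the input mode.

    .. code-block:: llvm

        ; Hypothetical function, used only to carry the attribute.
        define float @halve(float %x) #0 {
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }

        ; Output mode "preserve-sign", input mode "ieee".
        attributes #0 = { "denormal-fp-math"="preserve-sign,ieee" }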

    If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
    denormal outputs may be flushed to zero by standard floating-point
    operations. It is not mandated that flushing to zero occurs, but if
    a denormal output is flushed to zero, it must respect the sign
    mode. Not all targets support all modes. While this indicates the
    expected floating-point mode the function will be executed with,
    this does not make any attempt to ensure the mode is
    consistent. User or platform code is expected to set the
    floating-point mode appropriately before function entry.

    If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a
    floating-point operation must treat any input denormal value as
    zero. In some situations, if an instruction does not respect this
    mode, the input may need to be converted to 0 as if by
    ``@llvm.canonicalize`` during lowering for correctness.

``"denormal-fp-math-f32"``
    Same as ``"denormal-fp-math"``, but only controls the behavior of
    the 32-bit float type (or vectors of 32-bit floats). If both
    are present, this overrides ``"denormal-fp-math"``. Not all targets
    support separately setting the denormal mode per type, and no
    attempt is made to diagnose unsupported uses. Currently this
    attribute is respected by the AMDGPU and NVPTX backends.

``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exception passes through it.
    This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity to fine-tune the hardware control-flow protection mechanism. The
    flag is target independent and currently applies to a function or function
    pointer.
``shadowcallstack``
    This attribute indicates that the ShadowCallStack checks are enabled for
    the function. The instrumentation checks that the return address for the
    function has not changed between the function prolog and epilog. It is
    currently x86_64-specific.
``mustprogress``
    This attribute indicates that the function is required to return, unwind,
    or interact with the environment in an observable way, e.g., via a volatile
    memory access, I/O, or other synchronization. The ``mustprogress``
    attribute is intended to model the requirements of the first section of
    [intro.progress] of the C++ Standard. As a consequence, a loop in a
    function with the ``mustprogress`` attribute can be assumed to terminate if
    it does not interact with the environment in an observable way, and
    terminating loops without side effects can be removed. If a ``mustprogress``
    function does not satisfy this contract, the behavior is undefined. This
    attribute does not apply transitively to callees, but does apply to call
    sites within the function. Note that ``willreturn`` implies ``mustprogress``.
``"warn-stack-size"="<threshold>"``
    This attribute sets a threshold to emit diagnostics once the frame size is
    known should the frame size exceed the specified value. It takes one
    required integer value, which should be a non-negative integer, and less
    than ``UINT_MAX``.
    It's unspecified which threshold will be used when
    duplicate definitions are linked together with differing values.
``vscale_range(<min>[, <max>])``
    This attribute indicates the minimum and maximum vscale value for the given
    function. A value of 0 means unbounded. If the optional max value is omitted
    then max is set to the value of min. If the attribute is not present, no
    assumptions are made about the range of vscale.

Call Site Attributes
--------------------

In addition to function attributes, the following call-site-only
attributes are supported:

``vector-function-abi-variant``
    This attribute can be attached to a :ref:`call <i_call>` to list
    the vector functions associated with the function. Notice that the
    attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
    :ref:`callbr <i_callbr>` instruction. The attribute consists of a
    comma-separated list of mangled names. The order of the list does
    not imply preference (it is logically a set). The compiler is free
    to pick any listed vector function of its choosing.

    The syntax for the mangled names is as follows::

        _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]

    When present, the attribute informs the compiler that the function
    ``<scalar_name>`` has a corresponding vector variant that can be
    used to perform the concurrent invocation of ``<scalar_name>`` on
    vectors. The shape of the vector function is described by the
    tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
    token. The standard name of the vector function is
    ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
    the optional token ``(<vector_redirection>)`` informs the compiler
    that a custom name is provided in addition to the standard one
    (custom names can be provided for example via the use of
    ``declare variant`` in OpenMP 5.0).
    The declaration of the variant must be
    present in the IR Module. The signature of the vector variant is
    determined by the rules of the Vector Function ABI (VFABI)
    specifications of the target. For Arm and X86, the VFABI can be
    found at https://github.com/ARM-software/abi-aa and
    https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
    respectively.

    For X86 and Arm targets, the values of the tokens in the standard
    name are those that are defined in the VFABI. LLVM has an internal
    ``<isa>`` token that can be used to create scalar-to-vector
    mappings for functions that are not directly associated to any of
    the target ISAs (for example, some of the mappings stored in the
    TargetLibraryInfo). Valid values for the ``<isa>`` token are::

        <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
              | n | s          -> Armv8 Advanced SIMD, SVE
              | __LLVM__       -> Internal LLVM Vector ISA

    For all targets currently supported (x86, Arm and Internal LLVM),
    the remaining tokens can have the following values::

        <mask>:= M | N         -> mask | no mask

        <vlen>:= number        -> number of lanes
               | x             -> VLA (Vector Length Agnostic)

        <parameters>:= v              -> vector
                     | l | l <number> -> linear
                     | R | R <number> -> linear with ref modifier
                     | L | L <number> -> linear with val modifier
                     | U | U <number> -> linear with uval modifier
                     | ls <pos>       -> runtime linear
                     | Rs <pos>       -> runtime linear with ref modifier
                     | Ls <pos>       -> runtime linear with val modifier
                     | Us <pos>       -> runtime linear with uval modifier
                     | u              -> uniform

        <scalar_name>:= name of the scalar function

        <vector_redirection>:= optional, custom name of the vector function

``preallocated(<ty>)``
    This attribute is required on calls to ``llvm.call.preallocated.arg``
    and cannot be used on any other call.
    See
    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
    details.

.. _glattrs:

Global Attributes
-----------------

Attributes may be set to communicate additional information about a global variable.
Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
are grouped into a single :ref:`attribute group <attrgrp>`.

.. _opbundles:

Operand Bundles
---------------

Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call``\ s and
``invoke``\ s). In a way they are like metadata, but dropping them is
incorrect and will change program semantics.

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles. This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.

Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages. While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program. These restrictions
are described as the semantics of an "unknown" operand bundle. As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target is
  ``readnone`` or ``readonly``), unless they're overridden with
  call-site-specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function. Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. The exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``. They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap. Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles.
Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation. The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

    define void @f() {
      call void @x()  ;; no deopt state
      call void @y() [ "deopt"(i32 10) ]
      call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
      ret void
    }

    define void @g() {
      call void @f() [ "deopt"(i32 20) ]
      ret void
    }

will result in

.. code-block:: llvm

    define void @g() {
      call void @x()  ;; still no deopt state
      call void @y() [ "deopt"(i32 20, i32 10) ]
      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
      ret void
    }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag. These operand bundles indicate that a call site
is within a particular funclet. There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.
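
As a minimal sketch, assuming a Windows-style personality routine, a call
made from inside a cleanup funclet names the enclosing pad token as its
single bundle operand. The function names ``@may_throw``,
``@release_resources`` and ``@f`` are hypothetical and chosen only for
illustration.

.. code-block:: llvm

    declare i32 @__CxxFrameHandler3(...)
    declare void @may_throw()
    declare void @release_resources()

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %pad = cleanuppad within none []
      ; This call executes inside the cleanup funclet, so its "funclet"
      ; bundle carries the token of that pad.
      call void @release_resources() [ "funclet"(token %pad) ]
      cleanupret from %pad unwind to caller

    done:
      ret void
    }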

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.

GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition from a function with one GC strategy to a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code. For more details, see :ref:`GC Transitions
<gc_transition_args>`.

The bundle contains an arbitrary list of Values which need to be passed
to GC transition code. They will be lowered and passed as operands to
the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
that these arguments must be available before and after (but not
necessarily during) the execution of the callee.

.. _assume_opbundles:

Assume Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^

Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
assumptions that a :ref:`parameter attribute <paramattrs>` or a
:ref:`function attribute <fnattrs>` holds for a certain value at a certain
location.
Operand bundles enable assumptions that are either hard or impossible
to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.

An assume operand bundle has the form:

::

    "<tag>"([ <holds for value> [, <attribute argument>] ])

* The tag of the operand bundle is usually the name of the attribute that can
  be assumed to hold. It can also be ``ignore``; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.

If there are no arguments the attribute is a property of the call location.

If the represented attribute expects a constant argument, the argument provided
to the operand bundle should be a constant as well.

For example:

.. code-block:: llvm

    call void @llvm.assume(i1 true) ["align"(i32* %val, i32 8)]

allows the optimizer to assume that at the location of the call to
:ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.

.. code-block:: llvm

    call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(i64* %val)]

allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
call location is cold and that ``%val`` may not be null.

Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
provided guarantees are violated at runtime the behavior is undefined.

Even if the assumed property can be encoded as a boolean value, like
``nonnull``, using operand bundles to express the property can still have
benefits:

* Attributes that can be expressed via operand bundles are directly the
  property that the optimizer uses and cares about.
  Encoding attributes as
  operand bundles removes the need for an instruction sequence that represents
  the property (e.g., ``icmp ne i32* %p, null`` for ``nonnull``) and for the
  optimizer to deduce the property from that instruction sequence.
* Expressing the property using operand bundles makes it easy to identify the
  use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
  simplifies and improves heuristics, e.g., for "use-sensitive"
  optimizations.

.. _ob_preallocated:

Preallocated Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Preallocated operand bundles are characterized by the ``"preallocated"``
operand bundle tag. These operand bundles allow separation of the allocation
of the call argument memory from the call site. This is necessary to pass
non-trivially copyable objects by value in a way that is compatible with MSVC
on some targets. There can be at most one ``"preallocated"`` operand bundle
attached to a call site and it must have exactly one bundle operand, which is
a token generated by ``@llvm.call.preallocated.setup``. A call with this
operand bundle should not adjust the stack before entering the function, as
that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.

.. code-block:: llvm

    %foo = type { i64, i32 }

    ...

    %t = call token @llvm.call.preallocated.setup(i32 1)
    %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
    %b = bitcast i8* %a to %foo*
    ; initialize %b
    call void @bar(i32 42, %foo* preallocated(%foo) %b) ["preallocated"(token %t)]

.. _ob_gc_live:

GC Live Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

A ``"gc-live"`` operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
intrinsic.
The operand bundle must contain every pointer to a garbage collected
object which potentially needs to be updated by the garbage collector.

When lowered, any relocated value will be recorded in the corresponding
:ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description
for further details.

ObjC ARC Attached Call Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
implicitly followed by a marker instruction and a call to an ObjC runtime
function that uses the result of the call. The operand bundle takes either the
pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
``@objc_unsafeClaimAutoreleasedReturnValue``) or no arguments. If the bundle
doesn't take any arguments, only the marker instruction has to be emitted after
the call; the runtime function calls don't have to be emitted since they already
have been emitted. The return value of a call with this bundle is used by a call
to ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
void, in which case the operand bundle is ignored.

.. code-block:: llvm

    ; The marker instruction and a runtime function call are inserted after the call
    ; to @foo.
    call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_retainAutoreleasedReturnValue) ]
    call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_unsafeClaimAutoreleasedReturnValue) ]

    ; Only the marker instruction is inserted after the call to @foo.
    call i8* @foo() [ "clang.arc.attachedcall"() ]

The operand bundle is needed to ensure the call is immediately followed by the
marker instruction or the ObjC runtime call in the final output.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two-digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.

.. _langref_datalayout:

Data Layout
-----------

A module may specify a target-specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

    target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits.
    Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8 bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``P<address space>``
    Specifies the address space that corresponds to program memory.
    Harvard architectures can use this to specify what space LLVM
    should place things such as functions into. If omitted, the
    program memory space defaults to the default address space of 0,
    which corresponds to a Von Neumann architecture that has code
    and data in the same space.
``G<address space>``
    Specifies the address space to be used by default when creating global
    variables. If omitted, the globals address space defaults to the default
    address space 0.
    Note: variable declarations without an address space are always created in
    address space 0; this property only affects the default value to be used
    when creating globals without additional contextual information (e.g. in
    LLVM passes).
``A<address space>``
    Specifies the address space of objects created by '``alloca``'.
    Defaults to the default address space of 0.
``p[n]:<size>:<abi>:<pref>:<idx>``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index used for address calculation. If not
    specified, the default index size is equal to the pointer size. All sizes
    are in bits. The address space, ``n``, is optional, and if not specified,
    denotes the default address space 0. The value of ``n`` must be
    in the range [1,2^23).
``i<size>:<abi>:<pref>``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
``v<size>:<abi>:<pref>``
    This specifies the alignment for a vector type of a given bit
    ``<size>``.
``f<size>:<abi>:<pref>``
    This specifies the alignment for a floating-point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
``a:<abi>:<pref>``
    This specifies the alignment for an object of aggregate type.
``F<type><abi>``
    This specifies the alignment for function pointers.
    The options for ``<type>`` are:

    * ``i``: The alignment of function pointers is independent of the alignment
      of functions, and is a multiple of ``<abi>``.
    * ``n``: The alignment of function pointers is a multiple of the explicit
      alignment specified on the function, and is a multiple of ``<abi>``.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character. The mangling style
    options are:

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
      starting with ``?`` are not mangled in any way.
    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
      symbols do not receive a ``_`` prefix.
    * ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Type <nointptrtype>`\ s. The ``0``
    address space cannot be specified as non-integral.

On every specification that takes a ``<abi>:<pref>``, specifying the
``<pref>`` alignment is optional. If omitted, the preceding ``:``
should be omitted too and ``<pref>`` will be equal to ``<abi>``.

When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

- ``E`` - big endian
- ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
- ``p[n]:64:64:64`` - Other address spaces are assumed to be the
  same as the default address space.
- ``S0`` - natural stack alignment is unspecified
- ``i1:8:8`` - i1 is 8-bit (byte) aligned
- ``i8:8:8`` - i8 is 8-bit (byte) aligned
- ``i16:16:16`` - i16 is 16-bit aligned
- ``i32:32:32`` - i32 is 32-bit aligned
- ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
  alignment of 64-bits
- ``f16:16:16`` - half is 16-bit aligned
- ``f32:32:32`` - float is 32-bit aligned
- ``f64:64:64`` - double is 64-bit aligned
- ``f128:128:128`` - quad is 128-bit aligned
- ``v64:64:64`` - 64-bit vector is 64-bit aligned
- ``v128:128:128`` - 128-bit vector is 128-bit aligned
- ``a:0:64`` - aggregates are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,
   that specification is used.
#. If no match is found, and the type sought is an integer type, then
   the smallest integer type that is larger than the bitwidth of the
   sought type is used. If none of the specifications are larger than
   the bitwidth then the largest integer type is used. For example,
   given the default specifications above, the i7 type will use the
   alignment of i8 (next largest) while both i65 and i256 will use the
   alignment of i64 (largest specified).
#. If no match is found, and the type sought is a vector type, then the
   largest vector type that is smaller than the sought vector type will
   be used as a fallback. This happens because ``<128 x double>`` can be
   implemented in terms of 64 ``<2 x double>``, for example.

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects.
This string is used by the 2688mid-level optimizers to improve code, and this only works if it matches 2689what the ultimate code generator uses. There is no way to generate IR 2690that does not embed this target-specific detail into the IR. If you 2691don't specify the string, the default specifications will be used to 2692generate a Data Layout and the optimization phases will operate 2693accordingly and introduce target specificity into the IR with respect to 2694these default specifications. 2695 2696.. _langref_triple: 2697 2698Target Triple 2699------------- 2700 2701A module may specify a target triple string that describes the target 2702host. The syntax for the target triple is simply: 2703 2704.. code-block:: llvm 2705 2706 target triple = "x86_64-apple-macosx10.7.0" 2707 2708The *target triple* string consists of a series of identifiers delimited 2709by the minus sign character ('-'). The canonical forms are: 2710 2711:: 2712 2713 ARCHITECTURE-VENDOR-OPERATING_SYSTEM 2714 ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT 2715 2716This information is passed along to the backend so that it generates 2717code for the proper architecture. It's possible to override this on the 2718command line with the ``-mtriple`` command line option. 2719 2720.. _objectlifetime: 2721 2722Object Lifetime 2723---------------------- 2724 2725A memory object, or simply object, is a region of a memory space that is 2726reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap 2727allocation calls, and global variable definitions. 2728Once it is allocated, the bytes stored in the region can only be read or written 2729through a pointer that is :ref:`based on <pointeraliasing>` the allocation 2730value. 2731If a pointer that is not based on the object tries to read or write to the 2732object, it is undefined behavior. 2733 2734A lifetime of a memory object is a property that decides its accessibility. 
2735Unless stated otherwise, a memory object is alive since its allocation, and 2736dead after its deallocation. 2737It is undefined behavior to access a memory object that isn't alive, but 2738operations that don't dereference it such as 2739:ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and 2740:ref:`icmp <i_icmp>` return a valid result. 2741This explains code motion of these instructions across operations that 2742impact the object's lifetime. 2743A stack object's lifetime can be explicitly specified using 2744:ref:`llvm.lifetime.start <int_lifestart>` and 2745:ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls. 2746 2747.. _pointeraliasing: 2748 2749Pointer Aliasing Rules 2750---------------------- 2751 2752Any memory access must be done through a pointer value associated with 2753an address range of the memory access, otherwise the behavior is 2754undefined. Pointer values are associated with address ranges according 2755to the following rules: 2756 2757- A pointer value is associated with the addresses associated with any 2758 value it is *based* on. 2759- An address of a global variable is associated with the address range 2760 of the variable's storage. 2761- The result value of an allocation instruction is associated with the 2762 address range of the allocated storage. 2763- A null pointer in the default address-space is associated with no 2764 address. 2765- An :ref:`undef value <undefvalues>` in *any* address-space is 2766 associated with no address. 2767- An integer constant other than zero or a pointer value returned from 2768 a function not defined within LLVM may be associated with address 2769 ranges allocated through mechanisms other than those provided by 2770 LLVM. Such ranges shall not overlap with any ranges of addresses 2771 allocated by mechanisms provided by LLVM. 
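
As a rough illustration of the association rules above, the following sketch
accesses memory through a global, an ``alloca`` result, and a pointer derived
from that ``alloca``. The names ``@counter``, ``@sketch``, ``%buf`` and
``%elt`` are hypothetical and chosen only for this example.

.. code-block:: llvm

    @counter = global i32 0

    define i32 @sketch() {
    entry:
      ; %buf is associated with the address range of the allocated storage.
      %buf = alloca [4 x i32]

      ; %elt is based on %buf, so it is associated with the same addresses.
      %elt = getelementptr inbounds [4 x i32], [4 x i32]* %buf, i64 0, i64 2
      store i32 7, i32* %elt
      %a = load i32, i32* %elt

      ; @counter is associated with the address range of the global's storage.
      %b = load i32, i32* @counter

      %sum = add i32 %a, %b
      ret i32 %sum
    }

By contrast, a load through ``i32* null`` in the default address space would
have undefined behavior, because a null pointer there is associated with no
address.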

A pointer value is *based* on another pointer value according to the
following rules:

- A pointer value formed from a scalar ``getelementptr`` operation is *based* on
  the pointer-typed operand of the ``getelementptr``.
- The pointer in lane *l* of the result of a vector ``getelementptr`` operation
  is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
  of the ``getelementptr``.
- The result value of a ``bitcast`` is *based* on the operand of the
  ``bitcast``.
- A pointer value formed by an ``inttoptr`` is *based* on all pointer
  values that contribute (directly or indirectly) to the computation of
  the pointer's value.
- The "*based* on" relationship is transitive.

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.

.. _pointercapture:

Pointer Capture
---------------

Given a function call and a pointer that is passed as an argument or stored in
the memory before the call, a pointer is *captured* by the call if it makes a
copy of any part of the pointer that outlives the call.
To be precise, a pointer is captured if one or more of the following conditions
hold:

1. The call stores any bit of the pointer carrying information into a place,
   and the stored bits can be read from the place by the caller after this call
   exits.

.. code-block:: llvm

    @glb  = global i8* null
    @glb2 = global i8* null
    @glb3 = global i8* null
    @glbi = global i32 0

    define i8* @f(i8* %a, i8* %b, i8* %c, i8* %d, i8* %e) {
      store i8* %a, i8** @glb ; %a is captured by this call

      store i8* %b,   i8** @glb2 ; %b isn't captured because the stored value is overwritten by the store below
      store i8* null, i8** @glb2

      store i8* %c,   i8** @glb3
      call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
      store i8* null, i8** @glb3

      %i = ptrtoint i8* %d to i64
      %j = trunc i64 %i to i32
      store i32 %j, i32* @glbi ; %d is captured

      ret i8* %e ; %e is captured
    }

2. The call stores any bit of the pointer carrying information into a place,
   and the stored bits can be safely read from the place by another thread via
   synchronization.

.. code-block:: llvm

    @lock = global i1 true

    define void @f(i8* %a) {
      store i8* %a, i8** @glb
      store atomic i1 false, i1* @lock release ; %a is captured because another thread can safely read @glb
      store i8* null, i8** @glb
      ret void
    }

3. The call's behavior depends on any bit of the pointer carrying information.

.. code-block:: llvm

    @glb = global i8 0

    define void @f(i8* %a) {
      %c = icmp eq i8* %a, @glb
      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; escapes %a
    BB_EXIT:
      call void @exit()
      unreachable
    BB_CONTINUE:
      ret void
    }

4. The pointer is used in a volatile access as its address.


.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

A volatile load or store may have additional target-specific semantics.
Any volatile operation can have side effects, and any volatile operation
can read and/or modify state which is not accessible via a regular load
or store in this module. Volatile operations may use addresses which do
not point to memory (like MMIO registers). This means the compiler may
not use a volatile operation to prove a non-volatile access to that
address has defined behavior.

The allowed side-effects for volatile accesses are limited. If a
non-volatile store to a given address would be legal, a volatile
operation may modify the memory at that address. A volatile operation
may not modify any other memory accessible by the module being compiled.
A volatile operation may not call any code in the current module.

The compiler may assume execution will continue after a volatile operation,
so operations which modify memory or may have undefined behavior can be
hoisted past a volatile operation.

As an exception to the preceding rule, the compiler may not assume execution
will continue after a volatile store operation. This restriction is necessary
to support the somewhat common pattern in C of intentionally storing to an
invalid pointer to crash the program. In the future, it might make sense to
allow frontends to control this behavior.

IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
Likewise, the backend should never split or merge target-legal volatile
load/store instructions. Similarly, IR-level volatile loads and stores cannot
change from integer to floating-point or vice versa.

.. admonition:: Rationale

 Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as single instruction. For example, in C
 this holds for an l-value of volatile primitive type with native
 hardware support, but not necessarily for aggregate types. The
 frontend upholds these expectations, which are intentionally
 unspecified in the IR. The rules above ensure that IR transformations
 do not violate the frontend's contract with the language.

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

- Is a superset of single-thread program order, and
- When a *synchronizes-with* ``b``, includes an edge from ``a`` to
  ``b``. *Synchronizes-with* pairs are introduced by platform-specific
  techniques, like pthread locks, thread creation, thread joining,
  etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
  Constraints <ordering>`).
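
The following is a minimal sketch of how an atomic pair can contribute a
*synchronizes-with* edge to the *happens-before* order. It assumes that
``@producer`` and ``@consumer`` run on different threads created by some
platform-specific mechanism; all names are hypothetical and used only for
illustration.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
    entry:
      ; Program order: this plain store precedes the release store below.
      store i32 42, i32* @data
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %read, label %notready

    read:
      ; Reached only if the acquire load read the value written by the
      ; release store, so that store synchronizes-with the load and the
      ; plain store to @data happens before this plain load.
      %v = load i32, i32* @data
      ret i32 %v

    notready:
      ret i32 -1
    }

If the ``acquire`` load does not observe the ``release`` store, no edge is
formed, which is why the sketch does not touch ``@data`` on that path.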

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.). For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

- If write\ :sub:`1` happens before write\ :sub:`2`, and
  write\ :sub:`2` happens before R\ :sub:`byte`, then
  R\ :sub:`byte` does not see write\ :sub:`1`.
- If R\ :sub:`byte` happens before write\ :sub:`3`, then
  R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

- If R is volatile, the result is target-dependent. (Volatile is
  supposed to give guarantees which can support ``sig_atomic_t`` in
  C/C++, and may be used for accesses to addresses that do not behave
  like normal memory. It does not generally provide cross-thread
  synchronization.)
- Otherwise, if there is no write to the same byte that happens before
  R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
- Otherwise, if R\ :sub:`byte` may see exactly one write,
  R\ :sub:`byte` returns the value written by that write.
- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
  see are atomic, it chooses one of the values written. See the :ref:`Atomic
  Memory Ordering Constraints <ordering>` section for additional
  constraints on how the choice is made.
- Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial. If these
descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C++0x/C1x
    ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. (This isn't a
    complete description; see the C++0x definition of a release
    sequence.) This corresponds to the C++0x/C1x
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation that only reads, ``release`` for an operation that only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses, which is
    consistent with the *happens-before* partial order and with the
    modification orders of all the affected addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C++0x/C1x
    ``memory_order_seq_cst`` and Java volatile.

.. _syncscope:

If an atomic operation is marked ``syncscope("singlethread")``, it only
*synchronizes with* and only participates in the seq\_cst total orderings of
other operations running in the same thread (for example, in signal handlers).

If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target specific synchronization scope, then it is target
dependent if it *synchronizes with* and participates in the seq\_cst total
orderings of other operations.

Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
seq\_cst total orderings of other operations that are not marked
``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

.. _floatenv:

Floating-Point Environment
--------------------------

The default LLVM floating-point environment assumes that floating-point
instructions do not have side effects. Results assume the round-to-nearest
rounding mode. No floating-point exception state is maintained in this
environment. Therefore, there is no attempt to create or preserve invalid
operation (SNaN) or division-by-zero exceptions.

The benefit of this exception-free assumption is that floating-point
operations may be speculated freely without any other fast-math relaxations
to the floating-point model.

Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
:ref:`select <i_select>` and :ref:`call <i_call>`
may use the following flags to enable otherwise unsafe
floating-point transformations.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a nan, or the result would be a nan, it produces
   a :ref:`poison value <poisonvalues>` instead.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
   produces a :ref:`poison value <poisonvalues>` instead.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant. This does not imply that -0.0
   is poison and/or guaranteed to not exist in the operation.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add). This does not enable reassociating
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` can not
   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change results in floating-point.

``fast``
   This flag implies all of the others.

.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:


The void type does not represent any value and has no size.

:Syntax:


::

      void


.. _t_function:

Function Type
-------------

:Overview:


The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or first class type --- except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.
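
As a small sketch of how function types show up in practice, the module
fragment below declares two functions and calls through a function pointer.
The names ``@twice`` and ``@apply`` are hypothetical; ``@printf`` is declared
with the vararg signature commonly used for the C library function.

.. code-block:: llvm

    ; Function type i32 (i32): one i32 parameter, i32 result.
    declare i32 @twice(i32)

    ; Function type i32 (i8*, ...): at least one i8* argument, then varargs.
    declare i32 @printf(i8*, ...)

    ; %fn is a pointer to a function of type i32 (i32).
    define i32 @apply(i32 (i32)* %fn, i8* %fmt, i32 %x) {
    entry:
      %r = call i32 %fn(i32 %x)
      ; Calls to vararg functions spell out the full function type.
      %n = call i32 (i8*, ...) @printf(i8* %fmt, i32 %r)
      ret i32 %r
    }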

:Syntax:

::

      <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - function taking an ``i32``, returning an ``i32``

   * - ``float (i16, i32 *) *``
     - :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a
       :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.

   * - ``i32 (i8*, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       to ``i8`` (char in C), which returns an integer. This is the signature
       for ``printf`` in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values

.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a very simple type that simply specifies an
arbitrary bit width for the integer type desired. Any bit width from 1
bit to 2\ :sup:`23`\ (about 8 million) can be specified.

:Syntax:

::

      iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

.. list-table::

   * - ``i1``
     - a single-bit integer.

   * - ``i32``
     - a 32-bit integer.

   * - ``i1942652``
     - a really big integer of over 1 million bits.

.. _t_floating:

Floating-Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating-point value

   * - ``bfloat``
     - 16-bit "brain" floating-point value (7-bit significand). Provides the
       same number of exponent bits as ``float``, so that it matches its dynamic
       range, but with greatly reduced precision. Used in Intel's AVX-512 BF16
       extensions and Arm's ARMv8.6-A extensions, among others.

   * - ``float``
     - 32-bit floating-point value

   * - ``double``
     - 64-bit floating-point value

   * - ``fp128``
     - 128-bit floating-point value (113-bit significand)

   * - ``x86_fp80``
     - 80-bit floating-point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bits)

The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
respectively.

X86_amx Type
""""""""""""

:Overview:

The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
are allowed: stride load and store, zero and dot product. No instruction is
allowed for this type. There are no arguments, arrays, pointers, vectors
or constants of this type.

:Syntax:

::

      x86_amx


X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

      x86_mmx


.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

LLVM is in the process of transitioning to
`opaque pointers <OpaquePointers.html#opaque-pointers>`_.
Opaque pointers do not have a pointee type. Rather, instructions
interacting through pointers specify the type of the underlying memory
they are interacting with. Opaque pointers are still in the process of
being worked on and are not complete.

:Syntax:

::

      <type> *
      ptr

:Examples:

.. list-table::

   * - ``[4 x i32]*``
     - A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32``
       values.

   * - ``i32 (i32*) *``
     - A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32*``, returning an ``i32``.

   * - ``i32 addrspace(5)*``
     - A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in
       address space 5.

   * - ``ptr``
     - An opaque pointer type to a value that resides in address space 0.

   * - ``ptr addrspace(5)``
     - An opaque pointer type to a value that resides in address space 5.

.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data values are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements), an underlying primitive data type,
and a scalable property to represent vectors where the exact hardware
vector length is unknown at compile time. Vector types are considered
:ref:`first class <t_firstclass>`.

:Memory Layout:

In general vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
integer type with N*M bits, and then following the rules for storing such an
integer to memory.

A bitcast from a vector type to a scalar integer type will see the elements
being packed together (without padding). The order in which elements are
inserted in the integer depends on endianness. For little endian element zero
is put in the least significant bits of the integer, and for big endian
element zero is put in the most significant bits.

Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
with the analogy that we can replace a vector store by a bitcast followed by
an integer store, we get this for big endian:

.. code-block:: llvm

      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

      ; Bitcasting from a vector to an integral type can be seen as
      ; concatenating the values:
      ;   %val now has the hexadecimal value 0x1235.

      store i16 %val, i16* %ptr

      ; In memory the content will be (8-bit addressing):
      ;
      ;    [%ptr + 0]: 00010010  (0x12)
      ;    [%ptr + 1]: 00110101  (0x35)

The same example for little endian:

.. code-block:: llvm

      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

      ; Bitcasting from a vector to an integral type can be seen as
      ; concatenating the values:
      ;   %val now has the hexadecimal value 0x5321.

      store i16 %val, i16* %ptr

      ; In memory the content will be (8-bit addressing):
      ;
      ;    [%ptr + 0]: 01010011  (0x53)
      ;    [%ptr + 1]: 00100001  (0x21)

When ``N*M`` isn't evenly divisible by the byte size the exact memory layout
is unspecified (just like it is for an integral type of the same size). This
is because different targets could put the padding at different positions when
the type size is smaller than the type's store size.

:Syntax:

::

      < <# elements> x <elementtype> >          ; Fixed-length vector
      < vscale x <# elements> x <elementtype> > ; Scalable vector

The number of elements is a constant integer value larger than 0;
elementtype may be any integer, floating-point or pointer type. Vectors
of size zero are not allowed.
For scalable vectors, the total number of 3539elements is a constant multiple (called vscale) of the specified number 3540of elements; vscale is a positive integer that is unknown at compile time 3541and the same hardware-dependent constant for all scalable vectors at run 3542time. The size of a specific scalable vector type is thus constant within 3543IR, even if the exact size in bytes cannot be determined until run time. 3544 3545:Examples: 3546 3547+------------------------+----------------------------------------------------+ 3548| ``<4 x i32>`` | Vector of 4 32-bit integer values. | 3549+------------------------+----------------------------------------------------+ 3550| ``<8 x float>`` | Vector of 8 32-bit floating-point values. | 3551+------------------------+----------------------------------------------------+ 3552| ``<2 x i64>`` | Vector of 2 64-bit integer values. | 3553+------------------------+----------------------------------------------------+ 3554| ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. | 3555+------------------------+----------------------------------------------------+ 3556| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. | 3557+------------------------+----------------------------------------------------+ 3558 3559.. _t_label: 3560 3561Label Type 3562^^^^^^^^^^ 3563 3564:Overview: 3565 3566The label type represents code labels. 3567 3568:Syntax: 3569 3570:: 3571 3572 label 3573 3574.. _t_token: 3575 3576Token Type 3577^^^^^^^^^^ 3578 3579:Overview: 3580 3581The token type is used when a value is associated with an instruction 3582but all uses of the value must not attempt to introspect or obscure it. 3583As such, it is not appropriate to have a :ref:`phi <i_phi>` or 3584:ref:`select <i_select>` of type token. 3585 3586:Syntax: 3587 3588:: 3589 3590 token 3591 3592 3593 3594.. 
_t_metadata: 3595 3596Metadata Type 3597^^^^^^^^^^^^^ 3598 3599:Overview: 3600 3601The metadata type represents embedded metadata. No derived types may be 3602created from metadata except for :ref:`function <t_function>` arguments. 3603 3604:Syntax: 3605 3606:: 3607 3608 metadata 3609 3610.. _t_aggregate: 3611 3612Aggregate Types 3613^^^^^^^^^^^^^^^ 3614 3615Aggregate Types are a subset of derived types that can contain multiple 3616member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are 3617aggregate types. :ref:`Vectors <t_vector>` are not considered to be 3618aggregate types. 3619 3620.. _t_array: 3621 3622Array Type 3623"""""""""" 3624 3625:Overview: 3626 3627The array type is a very simple derived type that arranges elements 3628sequentially in memory. The array type requires a size (number of 3629elements) and an underlying data type. 3630 3631:Syntax: 3632 3633:: 3634 3635 [<# elements> x <elementtype>] 3636 3637The number of elements is a constant integer value; ``elementtype`` may 3638be any type with a size. 3639 3640:Examples: 3641 3642+------------------+--------------------------------------+ 3643| ``[40 x i32]`` | Array of 40 32-bit integer values. | 3644+------------------+--------------------------------------+ 3645| ``[41 x i32]`` | Array of 41 32-bit integer values. | 3646+------------------+--------------------------------------+ 3647| ``[4 x i8]`` | Array of 4 8-bit integer values. | 3648+------------------+--------------------------------------+ 3649 3650Here are some examples of multidimensional arrays: 3651 3652+-----------------------------+----------------------------------------------------------+ 3653| ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. | 3654+-----------------------------+----------------------------------------------------------+ 3655| ``[12 x [10 x float]]`` | 12x10 array of single precision floating-point values. 
| 3656+-----------------------------+----------------------------------------------------------+ 3657| ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. | 3658+-----------------------------+----------------------------------------------------------+ 3659 3660There is no restriction on indexing beyond the end of the array implied 3661by a static type (though there are restrictions on indexing beyond the 3662bounds of an allocated object in some cases). This means that 3663single-dimension 'variable sized array' addressing can be implemented in 3664LLVM with a zero length array type. An implementation of 'pascal style 3665arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for 3666example. 3667 3668.. _t_struct: 3669 3670Structure Type 3671"""""""""""""" 3672 3673:Overview: 3674 3675The structure type is used to represent a collection of data members 3676together in memory. The elements of a structure may be any type that has 3677a size. 3678 3679Structures in memory are accessed using '``load``' and '``store``' by 3680getting a pointer to a field with the '``getelementptr``' instruction. 3681Structures in registers are accessed using the '``extractvalue``' and 3682'``insertvalue``' instructions. 3683 3684Structures may optionally be "packed" structures, which indicate that 3685the alignment of the struct is one byte, and that there is no padding 3686between the elements. In non-packed structs, padding between field types 3687is inserted as defined by the DataLayout string in the module, which is 3688required to match what the underlying code generator expects. 3689 3690Structures can either be "literal" or "identified". A literal structure 3691is defined inline with other types (e.g. ``{i32, i32}*``) whereas 3692identified types are always defined at the top level with a name. 3693Literal types are uniqued by their contents and can never be recursive 3694or opaque since there is no way to write one. 
Identified types can be 3695recursive, can be opaqued, and are never uniqued. 3696 3697:Syntax: 3698 3699:: 3700 3701 %T1 = type { <type list> } ; Identified normal struct type 3702 %T2 = type <{ <type list> }> ; Identified packed struct type 3703 3704:Examples: 3705 3706+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 3707| ``{ i32, i32, i32 }`` | A triple of three ``i32`` values | 3708+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 3709| ``{ float, i32 (i32) * }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``. | 3710+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 3711| ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. | 3712+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 3713 3714.. _t_opaque: 3715 3716Opaque Structure Types 3717"""""""""""""""""""""" 3718 3719:Overview: 3720 3721Opaque structure types are used to represent structure types that 3722do not have a body specified. This corresponds (for example) to the C 3723notion of a forward declared structure. They can be named (``%X``) or 3724unnamed (``%52``). 
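
As a rough illustration of the forward-declaration analogy, the following
sketch declares an opaque type and only ever handles it through pointers, much
as a C translation unit would after seeing ``struct Handle;`` without a
definition. The names ``%Handle``, ``@make_handle``, ``@use_handle`` and
``@example`` are hypothetical and chosen only for this example; the typed
pointer syntax is assumed.

.. code-block:: llvm

      ; Hypothetical names, for illustration only.
      %Handle = type opaque

      declare %Handle* @make_handle()
      declare void @use_handle(%Handle*)

      define void @example() {
        ; The body of %Handle is unknown here, so the value is only
        ; passed around by pointer.
        %h = call %Handle* @make_handle()
        call void @use_handle(%Handle* %h)
        ret void
      }

Because an opaque type has no size, a value of that type cannot be allocated
or loaded directly in such a module; only pointers to it can be passed around.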

:Syntax:

::

      %X = type opaque
      %52 = type opaque

:Examples:

.. list-table::

   * - ``opaque``
     - An opaque type.

.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating-point constants**
    Floating-point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating-point constants must have a
    :ref:`floating-point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating-point constants. For example, the form
'``double    0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating-point
constants are required (and the only time that they are generated by the
disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaN's, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types bfloat, half, float, and
double are represented using the 16-digit form shown above (which matches the
IEEE 754 representation for double); bfloat, half and float values must, however,
be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
precision respectively. Hexadecimal format is always used for long double, and
there are three forms of long double. The 80-bit format used by x86 is
represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
by 32 hexadecimal digits. Long doubles will only work if they match the long
double format on your target. The IEEE 16-bit format (half precision) is
represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
format is represented by ``0xR`` followed by 4 hexadecimal digits. All
hexadecimal formats are big-endian (sign bit at the left).

There are no constants of type x86_mmx and x86_amx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.
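
To make the recursive nesting concrete, here is a small sketch, assuming two
hypothetical globals named ``@counted`` and ``@rows``: the first is initialized
with a structure constant whose second element is itself an array constant, and
the second with an array constant whose elements are vector constants.

.. code-block:: llvm

      ; Hypothetical globals, for illustration only.

      ; A structure constant containing a simple constant and an array constant.
      @counted = global { i32, [3 x i8] } { i32 3, [3 x i8] [ i8 97, i8 98, i8 99 ] }

      ; An array constant whose two elements are vector constants.
      @rows = global [2 x <2 x float>] [ <2 x float> <float 1.0, float 2.0>,
                                         <2 x float> <float 3.0, float 4.0> ]

In each case every nested element repeats its own type, and the element count
of each constant matches the count given by its type.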
3807 3808**Structure constants** 3809 Structure constants are represented with notation similar to 3810 structure type definitions (a comma separated list of elements, 3811 surrounded by braces (``{}``)). For example: 3812 "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as 3813 "``@G = external global i32``". Structure constants must have 3814 :ref:`structure type <t_struct>`, and the number and types of elements 3815 must match those specified by the type. 3816**Array constants** 3817 Array constants are represented with notation similar to array type 3818 definitions (a comma separated list of elements, surrounded by 3819 square brackets (``[]``)). For example: 3820 "``[ i32 42, i32 11, i32 74 ]``". Array constants must have 3821 :ref:`array type <t_array>`, and the number and types of elements must 3822 match those specified by the type. As a special case, character array 3823 constants may also be represented as a double-quoted string using the ``c`` 3824 prefix. For example: "``c"Hello World\0A\00"``". 3825**Vector constants** 3826 Vector constants are represented with notation similar to vector 3827 type definitions (a comma separated list of elements, surrounded by 3828 less-than/greater-than's (``<>``)). For example: 3829 "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants 3830 must have :ref:`vector type <t_vector>`, and the number and types of 3831 elements must match those specified by the type. 3832**Zero initialization** 3833 The string '``zeroinitializer``' can be used to zero initialize a 3834 value to zero of *any* type, including scalar and 3835 :ref:`aggregate <t_aggregate>` types. This is often used to avoid 3836 having to print large zero initializers (e.g. for large arrays) and 3837 is always exactly equivalent to using explicit zero initializers. 3838**Metadata node** 3839 A metadata node is a constant tuple without types. For example: 3840 "``!{!0, !{!2, !0}, !"test"}``". 
Metadata can reference constant values, 3841 for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``". 3842 Unlike other typed constants that are meant to be interpreted as part of 3843 the instruction stream, metadata is a place to attach additional 3844 information such as debug info. 3845 3846Global Variable and Function Addresses 3847-------------------------------------- 3848 3849The addresses of :ref:`global variables <globalvars>` and 3850:ref:`functions <functionstructure>` are always implicitly valid 3851(link-time) constants. These constants are explicitly referenced when 3852the :ref:`identifier for the global <identifiers>` is used and always have 3853:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM 3854file: 3855 3856.. code-block:: llvm 3857 3858 @X = global i32 17 3859 @Y = global i32 42 3860 @Z = global [2 x i32*] [ i32* @X, i32* @Y ] 3861 3862.. _undefvalues: 3863 3864Undefined Values 3865---------------- 3866 3867The string '``undef``' can be used anywhere a constant is expected, and 3868indicates that the user of the value may receive an unspecified 3869bit-pattern. Undefined values may be of any type (other than '``label``' 3870or '``void``') and be used anywhere a constant is permitted. 3871 3872Undefined values are useful because they indicate to the compiler that 3873the program is well defined no matter what value is used. This gives the 3874compiler more freedom to optimize. Here are some examples of 3875(potentially surprising) transformations that are valid (in pseudo IR): 3876 3877.. code-block:: llvm 3878 3879 %A = add %X, undef 3880 %B = sub %X, undef 3881 %C = xor %X, undef 3882 Safe: 3883 %A = undef 3884 %B = undef 3885 %C = undef 3886 3887This is safe because all of the output bits are affected by the undef 3888bits. Any output bit can have a zero or one depending on the input bits. 3889 3890.. 
code-block:: llvm 3891 3892 %A = or %X, undef 3893 %B = and %X, undef 3894 Safe: 3895 %A = -1 3896 %B = 0 3897 Safe: 3898 %A = %X ;; By choosing undef as 0 3899 %B = %X ;; By choosing undef as -1 3900 Unsafe: 3901 %A = undef 3902 %B = undef 3903 3904These logical operations have bits that are not always affected by the 3905input. For example, if ``%X`` has a zero bit, then the output of the 3906'``and``' operation will always be a zero for that bit, no matter what 3907the corresponding bit from the '``undef``' is. As such, it is unsafe to 3908optimize or assume that the result of the '``and``' is '``undef``'. 3909However, it is safe to assume that all bits of the '``undef``' could be 39100, and optimize the '``and``' to 0. Likewise, it is safe to assume that 3911all the bits of the '``undef``' operand to the '``or``' could be set, 3912allowing the '``or``' to be folded to -1. 3913 3914.. code-block:: llvm 3915 3916 %A = select undef, %X, %Y 3917 %B = select undef, 42, %Y 3918 %C = select %X, %Y, undef 3919 Safe: 3920 %A = %X (or %Y) 3921 %B = 42 (or %Y) 3922 %C = %Y 3923 Unsafe: 3924 %A = undef 3925 %B = undef 3926 %C = undef 3927 3928This set of examples shows that undefined '``select``' (and conditional 3929branch) conditions can go *either way*, but they have to come from one 3930of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were 3931both known to have a clear low bit, then ``%A`` would have to have a 3932cleared low bit. However, in the ``%C`` example, the optimizer is 3933allowed to assume that the '``undef``' operand could be the same as 3934``%Y``, allowing the whole '``select``' to be eliminated. 3935 3936.. 
code-block:: llvm 3937 3938 %A = xor undef, undef 3939 3940 %B = undef 3941 %C = xor %B, %B 3942 3943 %D = undef 3944 %E = icmp slt %D, 4 3945 %F = icmp gte %D, 4 3946 3947 Safe: 3948 %A = undef 3949 %B = undef 3950 %C = undef 3951 %D = undef 3952 %E = undef 3953 %F = undef 3954 3955This example points out that two '``undef``' operands are not 3956necessarily the same. This can be surprising to people (and also matches 3957C semantics) where they assume that "``X^X``" is always zero, even if 3958``X`` is undefined. This isn't true for a number of reasons, but the 3959short answer is that an '``undef``' "variable" can arbitrarily change 3960its value over its "live range". This is true because the variable 3961doesn't actually *have a live range*. Instead, the value is logically 3962read from arbitrary registers that happen to be around when needed, so 3963the value is not necessarily consistent over time. In fact, ``%A`` and 3964``%C`` need to have the same semantics or the core LLVM "replace all 3965uses with" concept would not hold. 3966 3967To ensure all uses of a given register observe the same value (even if 3968'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used. 3969 3970.. code-block:: llvm 3971 3972 %A = sdiv undef, %X 3973 %B = sdiv %X, undef 3974 Safe: 3975 %A = 0 3976 b: unreachable 3977 3978These examples show the crucial difference between an *undefined value* 3979and *undefined behavior*. An undefined value (like '``undef``') is 3980allowed to have an arbitrary bit-pattern. This means that the ``%A`` 3981operation can be constant folded to '``0``', because the '``undef``' 3982could be zero, and zero divided by any value is zero. 3983However, in the second example, we can make a more aggressive 3984assumption: because the ``undef`` is allowed to be an arbitrary value, 3985we are allowed to assume that it could be zero. 
Since a divide by zero 3986has *undefined behavior*, we are allowed to assume that the operation 3987does not execute at all. This allows us to delete the divide and all 3988code after it. Because the undefined operation "can't happen", the 3989optimizer can assume that it occurs in dead code. 3990 3991.. code-block:: text 3992 3993 a: store undef -> %X 3994 b: store %X -> undef 3995 Safe: 3996 a: <deleted> 3997 b: unreachable 3998 3999A store *of* an undefined value can be assumed to not have any effect; 4000we can assume that the value is overwritten with bits that happen to 4001match what was already there. However, a store *to* an undefined 4002location could clobber arbitrary memory, therefore, it has undefined 4003behavior. 4004 4005Branching on an undefined value is undefined behavior. 4006This explains optimizations that depend on branch conditions to construct 4007predicates, such as Correlated Value Propagation and Global Value Numbering. 4008In case of switch instruction, the branch condition should be frozen, otherwise 4009it is undefined behavior. 4010 4011.. code-block:: llvm 4012 4013 Unsafe: 4014 br undef, BB1, BB2 ; UB 4015 4016 %X = and i32 undef, 255 4017 switch %X, label %ret [ .. ] ; UB 4018 4019 store undef, i8* %ptr 4020 %X = load i8* %ptr ; %X is undef 4021 switch i8 %X, label %ret [ .. ] ; UB 4022 4023 Safe: 4024 %X = or i8 undef, 255 ; always 255 4025 switch i8 %X, label %ret [ .. ] ; Well-defined 4026 4027 %X = freeze i1 undef 4028 br %X, BB1, BB2 ; Well-defined (non-deterministic jump) 4029 4030 4031This is also consistent with the behavior of MemorySanitizer. 4032MemorySanitizer, detector of uses of uninitialized memory, 4033defines a branch with condition that depends on an undef value (or 4034certain other values, like e.g. a result of a load from heap-allocated 4035memory that has never been stored to) to have an externally visible 4036side effect. 
For this reason functions with *sanitize_memory* 4037attribute are not allowed to produce such branches "out of thin 4038air". More strictly, an optimization that inserts a conditional branch 4039is only valid if in all executions where the branch condition has at 4040least one undefined bit, the same branch condition is evaluated in the 4041input IR as well. 4042 4043.. _poisonvalues: 4044 4045Poison Values 4046------------- 4047 4048A poison value is a result of an erroneous operation. 4049In order to facilitate speculative execution, many instructions do not 4050invoke immediate undefined behavior when provided with illegal operands, 4051and return a poison value instead. 4052The string '``poison``' can be used anywhere a constant is expected, and 4053operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce 4054a poison value. 4055 4056Poison value behavior is defined in terms of value *dependence*: 4057 4058- Values other than :ref:`phi <i_phi>` nodes, :ref:`select <i_select>`, and 4059 :ref:`freeze <i_freeze>` instructions depend on their operands. 4060- :ref:`Phi <i_phi>` nodes depend on the operand corresponding to 4061 their dynamic predecessor basic block. 4062- :ref:`Select <i_select>` instructions depend on their condition operand and 4063 their selected operand. 4064- Function arguments depend on the corresponding actual argument values 4065 in the dynamic callers of their functions. 4066- :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>` 4067 instructions that dynamically transfer control back to them. 4068- :ref:`Invoke <i_invoke>` instructions depend on the 4069 :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing 4070 call instructions that dynamically transfer control back to them. 
4071- Non-volatile loads and stores depend on the most recent stores to all 4072 of the referenced memory addresses, following the order in the IR 4073 (including loads and stores implied by intrinsics such as 4074 :ref:`@llvm.memcpy <int_memcpy>`.) 4075- An instruction with externally visible side effects depends on the 4076 most recent preceding instruction with externally visible side 4077 effects, following the order in the IR. (This includes :ref:`volatile 4078 operations <volatile>`.) 4079- An instruction *control-depends* on a :ref:`terminator 4080 instruction <terminators>` if the terminator instruction has 4081 multiple successors and the instruction is always executed when 4082 control transfers to one of the successors, and may not be executed 4083 when control is transferred to another. 4084- Additionally, an instruction also *control-depends* on a terminator 4085 instruction if the set of instructions it otherwise depends on would 4086 be different if the terminator had transferred control to a different 4087 successor. 4088- Dependence is transitive. 4089- Vector elements may be independently poisoned. Therefore, transforms 4090 on instructions such as shufflevector must be careful to propagate 4091 poison across values or elements only as allowed by the original code. 4092 4093An instruction that *depends* on a poison value, produces a poison value 4094itself. A poison value may be relaxed into an 4095:ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern. 4096Propagation of poison can be stopped with the 4097:ref:`freeze instruction <i_freeze>`. 4098 4099This means that immediate undefined behavior occurs if a poison value is 4100used as an instruction operand that has any values that trigger undefined 4101behavior. Notably this includes (but is not limited to): 4102 4103- The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or 4104 any other pointer dereferencing instruction (independent of address 4105 space). 
4106- The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem`` 4107 instruction. 4108- The condition operand of a :ref:`br <i_br>` instruction. 4109- The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>` 4110 instruction. 4111- The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>` 4112 instruction, when the function or invoking call site has a ``noundef`` 4113 attribute in the corresponding position. 4114- The operand of a :ref:`ret <i_ret>` instruction if the function or invoking 4115 call site has a `noundef` attribute in the return value position. 4116 4117Here are some examples: 4118 4119.. code-block:: llvm 4120 4121 entry: 4122 %poison = sub nuw i32 0, 1 ; Results in a poison value. 4123 %poison2 = sub i32 poison, 1 ; Also results in a poison value. 4124 %still_poison = and i32 %poison, 0 ; 0, but also poison. 4125 %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison 4126 store i32 0, i32* %poison_yet_again ; Undefined behavior due to 4127 ; store to poison. 4128 4129 store i32 %poison, i32* @g ; Poison value stored to memory. 4130 %poison3 = load i32, i32* @g ; Poison value loaded back from memory. 4131 4132 %narrowaddr = bitcast i32* @g to i16* 4133 %wideaddr = bitcast i32* @g to i64* 4134 %poison4 = load i16, i16* %narrowaddr ; Returns a poison value. 4135 %poison5 = load i64, i64* %wideaddr ; Returns a poison value. 4136 4137 %cmp = icmp slt i32 %poison, 0 ; Returns a poison value. 4138 br i1 %cmp, label %end, label %end ; undefined behavior 4139 4140 end: 4141 4142.. _welldefinedvalues: 4143 4144Well-Defined Values 4145------------------- 4146 4147Given a program execution, a value is *well defined* if the value does not 4148have an undef bit and is not poison in the execution. 4149An aggregate value or vector is well defined if its elements are well defined. 
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.

A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of the :ref:`freeze instruction <i_freeze>` is well defined
regardless of its operand.

.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function.

It always has an ``i8 addrspace(P)*`` type, where ``P`` is the address space
of the function containing ``%block`` (usually ``addrspace(0)``).

Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
for comparisons against null. Pointer equality tests between label addresses
result in undefined behavior; comparison against null is ok, and no label is
equal to the null pointer. This value may be passed around as an opaque
pointer-sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr`` or
``callbr`` instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.

.. _dso_local_equivalent:

DSO Local Equivalent
--------------------

``dso_local_equivalent @func``

A '``dso_local_equivalent``' constant represents a function which is
functionally equivalent to a given function, but is always defined in the
current linkage unit. The resulting pointer has the same type as the underlying
function. The resulting pointer is permitted, but not required, to be different
from a pointer to the function, and it may have different values in different
translation units.

The target function may not have ``extern_weak`` linkage.

``dso_local_equivalent`` can be implemented as follows:

- If the function has local linkage, hidden visibility, or is
  ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
  to the function.
- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined
  within the linkage unit; LLVM will use this when available. (This is commonly
  called a "PLT stub".) On other targets, the stub may need to be emitted
  explicitly.

This can be used wherever a ``dso_local`` instance of a function is needed without
needing to explicitly make the original function ``dso_local``. An instance where
this can be used is for static offset calculations between a function and some other
``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
where dynamic relocations for function pointers in VTables can be replaced with
static relocations for offsets between the VTable and virtual functions which
may not be ``dso_local``.

This is currently only supported for ELF binary formats.

.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Perform the :ref:`trunc operation <i_trunc>` on constants.
``zext (CST to TYPE)``
    Perform the :ref:`zext operation <i_zext>` on constants.
``sext (CST to TYPE)``
    Perform the :ref:`sext operation <i_sext>` on constants.
``fptrunc (CST to TYPE)``
    Truncate a floating-point constant to another floating-point type.
    The size of CST must be larger than the size of TYPE. Both types
    must be floating-point.
``fpext (CST to TYPE)``
    Floating-point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating-point.
``fptoui (CST to TYPE)``
    Convert a floating-point constant to the corresponding unsigned
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``fptosi (CST to TYPE)``
    Convert a floating-point constant to the corresponding signed
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``uitofp (CST to TYPE)``
    Convert an unsigned integer constant to the corresponding
    floating-point constant. TYPE must be a scalar or vector floating-point
    type. CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``sitofp (CST to TYPE)``
    Convert a signed integer constant to the corresponding floating-point
    constant. TYPE must be a scalar or vector floating-point type.
    CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``ptrtoint (CST to TYPE)``
    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
``inttoptr (CST to TYPE)``
    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
    This one is *really* dangerous!
``bitcast (CST to TYPE)``
    Convert a constant, CST, to another TYPE.
    The constraints of the operands are the same as those for the
    :ref:`bitcast instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have one or more indexes, which are
    required to make sense for the type of "pointer to TY".
``select (COND, VAL1, VAL2)``
    Perform the :ref:`select operation <i_select>` on constants.
``icmp COND (VAL1, VAL2)``
    Perform the :ref:`icmp operation <i_icmp>` on constants.
``fcmp COND (VAL1, VAL2)``
    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``extractvalue (VAL, IDX0, IDX1, ...)``
    Perform the :ref:`extractvalue operation <i_extractvalue>` on
    constants. The index list is interpreted in a similar manner as
    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
    least one index value must be specified.
``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
    The index list is interpreted in a similar manner as indices in a
    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
    value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating-point values are allowed).

Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.

The template string supports argument substitution of the operands using "``$``"
followed by a number, to indicate substitution of the given register/memory
location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (see :ref:`inline-asm-modifiers`).

A literal "``$``" may be included by using "``$$``" in the template. To include
other special characters in the output, the usual "``\XX``" escapes may be
used, just as in other strings. Note that after template substitution, the
resulting assembly string is parsed by LLVM's integrated assembler unless it is
disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
syntax known to LLVM.

LLVM also supports a few more substitutions useful for writing inline assembly:

- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
  This substitution is useful when declaring a local label. Many standard
  compiler optimizations, such as inlining, may duplicate an inline asm blob.
  Adding a blob-unique identifier ensures that the two labels will not conflict
  during assembly. This is used to implement `GCC's %= special format
  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
- ``${:comment}``: Expands to the comment character of the current target's
  assembly dialect. This is usually ``#``, but many targets use other strings,
  such as ``;``, ``//``, or ``!``.
- ``${:private}``: Expands to the assembler private label prefix. Labels with
  this prefix will not appear in the symbol table of the assembled object.
  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
  relatively popular.

LLVM's support for inline asm is modeled closely on the requirements of Clang's
GCC-compatible inline-asm support.
Thus, the feature-set and the constraint and
modifier codes listed here are similar or identical to those in GCC's inline asm
support. However, to be clear, the syntax of the template and constraint strings
described here is *not* the same as the syntax accepted by GCC and Clang, and,
while most constraint letters are passed through as-is by Clang, some get
translated to other codes when converting from the C source to the LLVM
assembly.

An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"

Inline assembler expressions may **only** be used as the callee operand
of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
Thus, typically we have:

.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)

Inline asms with side effects not visible in the constraint list must be
marked as having side effects. This is done through the use of the
'``sideeffect``' keyword, like so:

.. code-block:: llvm

    call void asm sideeffect "eieio", ""()

In some cases inline asms will contain code that will not work unless
the stack is aligned in some way, such as calls or SSE instructions on
x86, yet will not contain code that does that alignment within the asm.
The compiler should make conservative assumptions about what the asm
might contain and should generate its usual stack alignment code in the
prologue if the '``alignstack``' keyword is present:

.. code-block:: llvm

    call void asm alignstack "eieio", ""()

Inline asms also support using non-standard assembly dialects. The
assumed dialect is ATT. When the '``inteldialect``' keyword is present,
the inline asm is using the Intel dialect. Currently, ATT and Intel are
the only supported dialects. An example is:

.. code-block:: llvm

    call void asm inteldialect "eieio", ""()

In the case that the inline asm might unwind the stack,
the '``unwind``' keyword must be used, so that the compiler emits
unwinding information:

.. code-block:: llvm

    call void asm unwind "call func", ""()

If the inline asm unwinds the stack and isn't marked with
the '``unwind``' keyword, the behavior is undefined.

If multiple keywords appear, the '``sideeffect``' keyword must come
first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
third and the '``unwind``' keyword last.

Inline Asm Constraint String
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The constraint list is a comma-separated string, each element containing one or
more constraint codes.

For each element in the constraint list an appropriate register or memory
operand will be chosen, and it will be made available to assembly template
string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
second, etc.

There are three different types of constraints, which are distinguished by a
prefix symbol in front of the constraint code: Output, Input, and Clobber. The
constraints must always be given in that order: outputs first, then inputs, then
clobbers. They cannot be intermingled.

There are also three different categories of constraint codes:

- Register constraint. This is either a register class, or a fixed physical
  register. This kind of constraint will allocate a register, and if necessary,
  bitcast the argument or result to the appropriate type.
- Memory constraint. This kind of constraint is for use with an instruction
  taking a memory operand. Different constraints allow for different addressing
  modes used by the target.
- Immediate value constraint. This kind of constraint is for an integer or other
  immediate value which can be rendered directly into an instruction. The
  various target-specific constraints allow the selection of a value in the
  proper range for the instruction you wish to use it with.

Output constraints
""""""""""""""""""

Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
indicates that the assembly will write to this operand, and the operand will
then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction. (The
exception is indirect outputs, described below.)

Normally, it is expected that no output locations are written to by the assembly
expression until *all* of the inputs have been read. As such, LLVM may assign
the same register to an output and an input. If this is not safe (e.g. if the
assembly contains two instructions, where the first writes to one output, and
the second reads an input and writes to a second output), then the "``&``"
modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).

Input constraints
"""""""""""""""""

Input constraints do not have a prefix -- just the constraint codes. Each input
constraint will consume one argument from the call instruction. It is not
permitted for the asm to write to any input register or memory location (unless
that input is tied to an output). Note also that multiple inputs may all be
assigned to the same register, if LLVM can determine that they necessarily all
contain the same value.
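
As a minimal sketch of how outputs and inputs are numbered, consider the
following function. It assumes an x86-64 target and the default ATT dialect;
the function name ``@add_inputs`` and the instruction sequence are illustrative
choices and are not taken from LLVM itself. The output is ``$0`` and the two
inputs are ``$1`` and ``$2``. Because ``movl`` writes ``$0`` before ``addl``
reads ``$2``, the output is marked early-clobber with "``=&r``". The trailing
"``~{flags}``" entry is a clobber recording that ``addl`` modifies the x86
flags register.

.. code-block:: llvm

    ; Illustrative only: assumes an x86-64 target and the default ATT dialect.
    define i32 @add_inputs(i32 %a, i32 %b) {
      ; $0 is the "=&r" output; $1 and $2 are the two "r" inputs (%a and %b).
      ; The output does not consume a call argument; each input consumes one.
      %sum = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %sum
    }

Without the "``&``" modifier, LLVM would be free to assign ``$0`` and ``$2`` to
the same register, in which case the ``movl`` would overwrite ``%b`` before it
is read.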

Instead of providing a Constraint Code, input constraints may also "tie"
themselves to an output constraint, by providing an integer as the constraint
string. Tied inputs still consume an argument from the call instruction, and
take up a position in the asm template numbering as is usual -- they will simply
be constrained to always use the same register as the output they've been tied
to. For example, a constraint string of "``=r,0``" says to assign a register for
output, and use that register as an input as well (it being the 0'th
constraint).

It is permitted to tie an input to an "early-clobber" output. In that case, no
*other* input may share the same register as the input tied to the early-clobber
(even when the other input has the same value).

You may only tie an input to an output which has a register constraint, not a
memory constraint. Only a single input may be tied to an output.

There is also an "interesting" feature which deserves a bit of explanation: if a
register class constraint allocates a register which is too small for the value
type operand provided as input, the input value will be split into multiple
registers, and all of them passed to the inline asm.

However, this feature is often not as useful as you might think.

Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction that takes a single 32-bit register. The
hardware then loads into both the named register and the next register. This
feature of inline asm would not be useful to support that.)

A few of the targets provide a template string modifier allowing explicit access
to the second register of a two-register operand (e.g.
MIPS ``L``, ``M``, and
``D``). On such an architecture, you can actually access the second allocated
register (yet, still, not any subsequent ones). But, in that case, you're still
probably better off simply splitting the value into two separate operands, for
clarity. (e.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use.)

Indirect inputs and outputs
"""""""""""""""""""""""""""

Indirect output or input constraints can be specified by the "``*``" modifier
(which goes after the "``=``" in case of an output). This indicates that the asm
will write to or read from the contents of an *address* provided as an input
argument. (Note that in this way, indirect outputs act more like an *input* than
an output: just like an input, they consume an argument of the call expression,
rather than producing a return value. An indirect output constraint is an
"output" only in that the asm is expected to write to the contents of the input
memory location, instead of just read from it).

This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.

It is also possible to use an indirect *register* constraint, but only on output
(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Using it is not recommended.)


Clobber constraints
"""""""""""""""""""

A clobber constraint is indicated by a "``~``" prefix.
A clobber does not
consume an input operand, nor generate an output. Clobbers cannot use any of the
general constraint code letters -- they may use only explicit register
constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
memory locations -- not only the memory pointed to by a declared indirect
output.

Note that clobbering named registers that are also present in output
constraints is not legal.


Constraint Codes
""""""""""""""""
After a potential prefix comes the constraint code, or codes.

A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
(e.g. "``{eax}``").

The one and two letter constraint codes are typically chosen to be the same as
GCC's constraint codes.

A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.

There are two ways to specify alternatives, and either or both may be used in an
inline asm constraint list:

1) Append the codes to each other, making a constraint code set. E.g. "``im``"
   or "``{eax}m``". This means "choose any of the options in the set". The
   choice of constraint is made independently for each constraint in the
   constraint list.

2) Use "``|``" between constraint code sets, creating alternatives. Every
   constraint in the constraint list must have the same number of alternative
   sets. With this syntax, the same alternative in *all* of the items in the
   constraint list will be chosen together.

Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``.
This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operands 0 and 1 cannot both be of type ``m``.

However, the use of either of the alternatives features is *NOT* recommended, as
LLVM is not able to make an intelligent choice about which one to use. (At the
point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (e.g., given "``rm``", it'll
always choose to use memory, not registers). And, if given multiple registers,
or multiple register classes, it will simply choose the first one. (In fact, it
doesn't currently even ensure explicitly specified physical registers are
unique, so specifying multiple physical registers as alternatives, like
``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
intended.)

Supported Constraint Code List
""""""""""""""""""""""""""""""

The constraint codes are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Some constraint codes are typically supported by all targets:

- ``r``: A register in the target's general purpose register class.
- ``m``: A memory address operand. It is target-specific what addressing modes
  are supported; typical examples are register, or register + register offset,
  or register + immediate offset (of some target-specific size).
- ``i``: An integer constant (of target-specific width). Allows either a simple
  immediate, or a relocatable value.
- ``n``: An integer constant -- *not* including relocatable values.
- ``s``: An integer constant, but allowing *only* relocatable values.
- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
  useful to pass a label for an asm branch or call.

  .. FIXME: but that surely isn't actually okay to jump out of an asm
     block without telling llvm about the control transfer???

- ``{register-name}``: Requires exactly the named physical register.

Other constraints are target-specific:

AArch64:

- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
  i.e. 0 to 4095 with optional shift by 12.
- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
  64-bit register. This is a superset of ``L``.
- ``Q``: Memory address operand must be in a single register (no
  offsets). (However, LLVM currently does this for the ``m`` constraint as
  well.)
- ``r``: A 32 or 64-bit integer register (W* or X*).
- ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
- ``x``: Like ``w``, but restricted to registers 0 to 15 inclusive.
- ``y``: Like ``w``, but restricted to SVE vector registers Z0 to Z7 inclusive.
- ``Upl``: One of the low eight SVE predicate registers (P0 to P7).
- ``Upa``: Any of the SVE predicate registers (P0 to P15).

AMDGPU:

- ``r``: A 32 or 64-bit integer register.
- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
- ``I``: An integer inline constant in the range from -16 to 64.
- ``J``: A 16-bit signed integer constant.
- ``A``: An integer or a floating-point inline constant.
- ``B``: A 32-bit signed integer constant.
- ``C``: A 32-bit unsigned integer constant or an integer inline constant in the
  range from -16 to 64.
- ``DA``: A 64-bit constant that can be split into two "A" constants.
- ``DB``: A 64-bit constant that can be split into two "B" constants.

All ARM modes:

- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
  operand. Treated the same as operand ``m``, at the moment.
- ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
- ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``

ARM and ARM's Thumb2 mode:

- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
- ``I``: An immediate integer valid for a data-processing instruction.
- ``J``: An immediate integer between -4095 and 4095.
- ``K``: An immediate integer whose bitwise inverse is valid for a
  data-processing instruction. (Can be used with template modifier "``B``" to
  print the inverted value).
- ``L``: An immediate integer whose negation is valid for a data-processing
  instruction. (Can be used with template modifier "``n``" to print the negated
  value).
- ``M``: A power of two or an integer between 0 and 32.
- ``N``: Invalid immediate constraint.
- ``O``: Invalid immediate constraint.
- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
  as ``r``.
- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
  invalid.
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.

ARM's Thumb1 mode:

- ``I``: An immediate integer between 0 and 255.
- ``J``: An immediate integer between -255 and -1.
- ``K``: An immediate integer between 0 and 255, with optional left-shift by
  some amount.
- ``L``: An immediate integer between -7 and 7.
- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
- ``N``: An immediate integer between 0 and 31.
- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
- ``r``: A low 32-bit GPR register (``r0-r7``).
- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.


Hexagon:

- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
  at the moment.
- ``r``: A 32 or 64-bit register.

MSP430:

- ``r``: An 8 or 16-bit register.

MIPS:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate integer zero.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
- ``N``: An immediate integer between -65535 and -1.
- ``O``: An immediate signed 15-bit integer.
- ``P``: An immediate integer between 1 and 65535.
- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus 16-bit immediate offset. In MIPS mode, just a base register.
- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
  ``m``.
- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
  ``sc`` instruction on the given subtarget (details vary).
- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
- ``l``: The ``lo`` register, 32 or 64-bit.
- ``x``: Invalid.

NVPTX:

- ``b``: A 1-bit integer register.
- ``c`` or ``h``: A 16-bit integer register.
- ``r``: A 32-bit integer register.
- ``l`` or ``N``: A 64-bit integer register.
- ``f``: A 32-bit float register.
- ``d``: A 64-bit float register.


PowerPC:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
- ``M``: An immediate integer greater than 31.
- ``N``: An immediate integer that is an exact power of 2.
- ``O``: The immediate integer constant 0.
- ``P``: An immediate integer constant whose negation is a signed 16-bit
  constant.
- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
  treated the same as ``m``.
- ``r``: A 32 or 64-bit integer register.
- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).
- ``y``: Condition register (``CR0-CR7``).
- ``wc``: An individual CR bit in a CR register.
- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
  register set (overlapping both the floating-point and vector register files).
- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
  set.

RISC-V:

- ``A``: An address operand (using a general-purpose register, without an
  offset).
- ``I``: A 12-bit signed integer immediate operand.
- ``J``: A zero integer immediate operand.
- ``K``: A 5-bit unsigned integer immediate operand.
- ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
  ``XLEN``).
- ``vr``: A vector register (requires V extension).
- ``vm``: A vector mask register (requires V extension).

Sparc:

- ``I``: An immediate 13-bit signed integer.
- ``r``: A 32-bit integer register.
- ``f``: Any floating-point register on SparcV8, or a floating-point
  register in the "low" half of the registers on SparcV9.
- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)

SystemZ:

- ``I``: An immediate unsigned 8-bit integer.
- ``J``: An immediate unsigned 12-bit integer.
- ``K``: An immediate signed 16-bit integer.
- ``L``: An immediate signed 20-bit integer.
- ``M``: An immediate integer 0x7fffffff.
- ``Q``: A memory address operand with a base address and a 12-bit immediate
  unsigned displacement.
- ``R``: A memory address operand with a base address, a 12-bit immediate
  unsigned displacement, and an index register.
- ``S``: A memory address operand with a base address and a 20-bit immediate
  signed displacement.
- ``T``: A memory address operand with a base address, a 20-bit immediate
  signed displacement, and an index register.
- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
- ``f``: A 32, 64, or 128-bit floating-point register.

X86:

- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 64.
- ``K``: An immediate signed 8-bit integer.
- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
  0xffffffff.
- ``M``: An immediate integer between 0 and 3.
- ``N``: An immediate unsigned 8-bit integer.
- ``O``: An immediate integer between 0 and 127.
- ``e``: An immediate 32-bit signed integer.
- ``Z``: An immediate 32-bit unsigned integer.
- ``o``, ``v``: Treated the same as ``m``, at the moment.
- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
  registers, and on X86-64, it is all of the integer registers.
- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
  existed since i386, and can be accessed without the REX prefix.
- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
  32-bit mode, a 64-bit integer operand will get split into two registers). It
  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
  operand will get allocated only to RAX -- if two 32-bit operands are needed,
  you're better off splitting it yourself, before passing it to the asm
  statement.

XCore:

- ``r``: A 32-bit integer register.


.. _inline-asm-modifiers:

Asm template argument modifiers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the asm template string, modifiers can be used on the operand reference, like
"``${0:n}``".

The modifiers are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.
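
For illustration only, here is a sketch of a modifier in use. The function name,
the constant, and the template text are made up for this example. Operand ``0``
is bound to an immediate through the ``i`` constraint, and the ``n`` modifier
asks for its negation to be printed without target-specific punctuation.

.. code-block:: llvm

    define void @negated_immediate() {
      ; "${0:n}" refers to operand 0 with the "n" modifier applied. On a
      ; target that implements "n", this is expected to print -42.
      ; The template text is a placeholder, not a real instruction.
      call void asm sideeffect "# immediate: ${0:n}", "i"(i32 42)
      ret void
    }

Whether the leading ``#`` is treated as an assembler comment depends on the
target, so take this as a shape to adapt, not a portable test case.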

Target-independent:

- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``n``: Negate and print immediate integer constant unadorned, without the
  target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``l``: Print as an unadorned label, without the target-specific label
  punctuation (e.g. no ``$`` prefix).

AArch64:

- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
  instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with a ``x*`` name. (This is the default, anyhow.)
- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
  ``v*``.

AMDGPU:

- ``r``: No effect.

ARM:

- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
  register).
- ``P``: No effect.
- ``q``: No effect.
- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
  as ``d4[1]`` instead of ``s9``).
- ``B``: Bitwise invert and print an immediate integer constant without ``#``
  prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
  register operands subsequent to the specified one (!), so use carefully.
- ``Q``: Print the low-order register of a register-pair, or the low-order
  register of a two-register operand.
- ``R``: Print the high-order register of a register-pair, or the high-order
  register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
  to ``R``.)

  .. FIXME: H doesn't currently support printing the second register
     of a two-register operand.

- ``e``: Print the low doubleword register of a NEON quad register.
- ``f``: Print the high doubleword register of a NEON quad register.
- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
  adornment.

Hexagon:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.

MSP430:

No additional modifiers.

MIPS:

- ``X``: Print an immediate integer as hexadecimal.
- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
- ``d``: Print an immediate integer as decimal.
- ``m``: Subtract one and print an immediate integer as decimal.
- ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.

  .. FIXME: L seems to be missing memory operand support.

- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.

  .. FIXME: M seems to be missing memory operand support.

- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.

NVPTX:

- ``r``: No effect.

PowerPC:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, print the formatter for a two-register X-form
  instruction. (Currently always prints ``r0,OPERAND``.)
- ``U``: Print 'u' if the memory operand is an update form, and nothing
  otherwise. (NOTE: LLVM does not support update form, so this will currently
  always print nothing.)
- ``X``: Print 'x' if the memory operand is an indexed form. (NOTE: LLVM does
  not support indexed form, so this will currently always print nothing.)

RISC-V:

- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
  nothing. Used to print 'addi' vs 'add' instructions, etc.
- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
  normally.

Sparc:

- ``r``: No effect.

SystemZ:

SystemZ implements only ``n``, and does *not* support any of the other
target-independent modifiers.

X86:

- ``c``: Print an unadorned integer or symbol name. (The latter is
  target-specific behavior for this typically target-independent modifier).
- ``A``: Print a register name with a '``*``' before it.
- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
  operand.
- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
  memory operand.
- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
  operand.
- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
  operand.
- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
  available, otherwise the 32-bit register name; do nothing on a memory operand.
- ``n``: Negate and print an unadorned integer, or, for operands other than an
  immediate integer (e.g. a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier.)
- ``H``: Print a memory reference with additional offset +8.
- ``P``: Print a memory reference or operand for use as the argument of a call
  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)

XCore:

No additional modifiers.


Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

A call instruction that wraps an inline asm node may have a
"``!srcloc``" MDNode attached to it that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions and global objects in the
program that can convey extra information about the code to the optimizers and
code generator. One example application of metadata is source-level
debug information. There are two metadata primitives: strings and nodes.

Metadata does not have a type, and is not a value. If referenced from a
``call`` instruction, it uses the ``metadata`` type.

All metadata are identified in syntax by an exclamation point ('``!``').

.. _metadata-string:

Metadata Nodes and Metadata Strings
-----------------------------------

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two digit hex code. For example:
"``!"test\00"``".

Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:

.. code-block:: llvm

    !{ !"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

    !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
intrinsic is using three metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)

Metadata can be attached to an instruction. Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

Instructions may not have multiple metadata attachments with the same
identifier.

Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:

.. code-block:: llvm

    declare !dbg !22 void @f1()
    define void @f2() !dbg !22 {
      ret void
    }

    @g1 = global i32 0, !dbg !22
    @g2 = external global i32, !dbg !22

Unlike instructions, global objects (functions and global variables) may have
multiple metadata attachments with the same identifier.

A transformation is required to drop any metadata attachment that it does not
know, or that it knows it cannot preserve. Currently there is an exception for
metadata attachment to globals for ``!type`` and ``!absolute_symbol`` which
can't be unconditionally dropped unless the global is itself deleted.

Metadata attached to a module using named metadata may not be dropped, with
the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.
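
As a minimal sketch of the "any order" rule, the two lines below are alternative
spellings of the same node with its labelled fields swapped. The file name and
directory are placeholders chosen for this example.

.. code-block:: text

    ; Either spelling describes the same file node.
    !0 = !DIFile(filename: "a.c", directory: "/path/to/dir")
    !0 = !DIFile(directory: "/path/to/dir", filename: "a.c")

Only one of the two definitions would appear in a real module, since a metadata
ID can be defined only once.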

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
containing the debug info to be emitted along with the compile unit, regardless
of code optimizations (some nodes are only emitted if there are references to
them from instructions). The ``debugInfoForProfiling:`` field is a boolean
indicating whether or not line-table discriminators are updated to provide
more-accurate debug info for profiling results.

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
                        macros: !6, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope. These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).

.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: none

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
                 checksumkind: CSK_MD5,
                 checksum: "000102030405060708090a0b0c0d0e0f")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.
Valid values for the ``checksumkind:`` field are: {CSK_None, CSK_MD5, CSK_SHA1,
CSK_SHA256}.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``. ``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
are used to qualify the ``baseType:``.

Note that the ``void *`` type is expressed as a type derived from NULL.

.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.
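
The sketch below is a hypothetical structure type carrying an ODR
``identifier:``, with two members that name it as their ``scope:``. The type
name, file, identifier string, and node numbers are invented for illustration.
Because the composite type has an identifier and is not a forward declaration,
the members would be uniqued by ``name:`` and ``scope:`` alone.

.. code-block:: text

    !0 = !DIFile(filename: "point.h", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64,
                                   identifier: "_ZTS5Point", elements: !3)
    !3 = !{!4, !5}
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32)
    !5 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, offset: 32)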

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 7)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_array_type       = 1
    DW_TAG_class_type       = 2
    DW_TAG_enumeration_type = 4
    DW_TAG_structure_type   = 19
    DW_TAG_union_type       = 23

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector. The optional ``dataLocation`` is a
DIExpression that describes how to get from an object's address to the actual
raw data, if they aren't equivalent. This is only supported for array types,
particularly to describe Fortran arrays, which have an array descriptor in
addition to the array data. Alternatively it can also be a DIVariable which
has the address of the actual raw data. The Fortran language supports pointer
arrays which can be attached to actual arrays; this attachment between pointer
and pointee is called association. The optional ``associated`` is a
DIExpression that describes whether the pointer array is currently associated.
The optional ``allocated`` is a DIExpression that describes whether the
allocatable array is currently allocated. The optional ``rank`` is a
DIExpression that describes the rank (number of dimensions) of a Fortran
assumed-rank array (the rank is known at runtime).

For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.

.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`.

- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.

.. code-block:: text

    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
    !2 = !DISubrange(count: -1) ; empty array.

    ; Scopes used in rest of example
    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
    !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)

    ; Use of local variable as count value
    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
    !11 = !DISubrange(count: !10, lowerBound: 0)

    ; Use of global variable as count value
    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
    !13 = !DISubrange(count: !12, lowerBound: 0)

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 7)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: text

    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: text

    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

.. _DIGlobalVariable:

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: text

    @foo = global i32, !dbg !0
    !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
    !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
                           file: !3, line: 7, type: !4, isLocal: true,
                           isDefinition: false, declaration: !5)


DIGlobalVariableExpression
""""""""""""""""""""""""""

``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
with a :ref:`DIExpression`.

.. code-block:: text

    @lower = global i32, !dbg !0
    @upper = global i32, !dbg !1
    !0 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
             )
    !1 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
             )
    !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
                           file: !4, line: 8, type: !5, declaration: !6)

All global variable expressions should be referenced by the ``globals:`` field of
a :ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A distinct
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. A unique ``DISubprogram`` may be attached to a function declaration
used for call site debug info.
The ``retainedNodes:`` field is a list of
:ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function. If the scope is a composite
type with an ODR ``identifier:`` and that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.

.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, retainedNodes: !8,
                                thrownTypes: !9)

.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

.. _DILexicalBlockFile:

DILexicalBlockFile
""""""""""""""""""

``DILexicalBlockFile`` nodes are used to discriminate between sections of a
:ref:`lexical block <DILexicalBlock>`.
The ``file:`` field can be changed to
indicate textual inclusion, or the ``discriminator:`` field can be used to
discriminate between control flow within a single block in the source language.

.. code-block:: text

    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)

.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``retainedNodes:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                          type: !3, flags: DIFlagArtificial)
    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                          type: !3)
    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

.. _DIExpression:

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable.
Debug
intrinsics are interpreted left-to-right: start by pushing the value/address
operand of the intrinsic onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.

The currently supported opcode vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
  them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that contrary to DW_OP_bit_piece, the offset is describing the location
  within the described source variable.
- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
  (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
  expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
  that references a base type constructed from the supplied values.
- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
  optionally applied to the pointer. The memory tag is derived from the
  given tag offset in an implementation-defined manner.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
- ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` may only appear in MIR and at the
  beginning of a ``DIExpression``. In DWARF a ``DBG_VALUE``
  instruction binding a ``DIExpression(DW_OP_LLVM_entry_value, ...)`` to a
  register is lowered to a ``DW_OP_entry_value [reg]``, pushing the
  value the register had upon function entry onto the stack. The next
  ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
  block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value,
  1, DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an
  expression where the entry value of the debug value instruction's
  value/address operand is pushed to the stack, and is added
  with 123. Due to framework limitations ``N`` can currently only
  be 1.

  The operation is introduced by the ``LiveDebugValues`` pass, which
  applies it only to function parameters that are unmodified
  throughout the function. Support is limited to simple register
  location descriptions, or to indirect locations (e.g., when a struct
  is passed-by-value to a callee via a pointer to a temporary copy
  made in the caller). The entry value op is also introduced by the
  ``AsmPrinter`` pass when a call site parameter value
  (``DW_AT_call_site_parameter_value``) is represented as entry value
  of the parameter.
- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
  value, such as one that calculates the sum of two registers. This is always
  used in combination with an ordered list of values, such that
  ``DW_OP_LLVM_arg, N`` refers to the ``N``\ th element in that list. For
  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
  signed offset from the specified register. The opcode is only generated by the
  ``AsmPrinter`` pass to describe a call site parameter value that requires an
  expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array that has array descriptors.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array, whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables that are optimized out but whose
  pointed-to value is known. This operator is required because it differs from
  the DWARF operator ``DW_OP_implicit_pointer`` in representation and
  specification (number and types of operands), and the latter cannot be used
  for multiple levels of indirection.

.. code-block:: text

    IR for "*ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)

    IR for "**ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
                        DW_OP_LLVM_implicit_pointer)

DWARF specifies three kinds of simple location descriptions: Register, memory,
and implicit location descriptions. Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
descriptions describe the *concrete location* of a source variable (in the
sense that a debugger might modify its value), whereas *implicit locations*
describe merely the actual *value* of a source variable which might not exist
in registers or in memory (see ``DW_OP_stack_value``).

A ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
value (the address) of a source variable. The first operand of the intrinsic
must be an address of some kind. A DIExpression attached to the intrinsic
refines this address to produce a concrete location for the source variable.

A ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
The first operand of the intrinsic may be a direct or indirect value. A
DIExpression attached to the intrinsic refines the first operand to produce a
direct value. For example, if the first operand is an indirect value, it may be
necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
valid debug intrinsic.

.. note::

   A DIExpression is interpreted in the same way regardless of which kind of
   debug intrinsic it's attached to.

.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !1 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !2 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !4 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !5 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)

DIArgList
"""""""""

``DIArgList`` nodes hold a list of constant or SSA value references. These are
used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
within a function, it must only be used as a function argument, must always be
inlined, and cannot appear in named metadata.

.. code-block:: text

    llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
                   metadata !16,
                   metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))

DIFlags
"""""""

These flags encode various properties of DINodes.

The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether the ``DW_AT_export_symbols`` can
be used for the structure type.
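
The stack-based reading of ``DIExpression`` and ``DW_OP_LLVM_arg`` described
in the preceding sections can be illustrated with a small Python model. This is
only a sketch and not LLVM's implementation: the ``evaluate`` function, the
list encoding of opcodes, and the ``memory`` dictionary are hypothetical names
introduced here, only a handful of opcodes are modelled, and values are plain
unbounded integers. It also assumes that, when no ``DW_OP_LLVM_arg`` is
present, the single operand is pushed before evaluation starts.

```python
from typing import Dict, List, Optional, Union

Op = Union[str, int]


def evaluate(ops: List[Op], args: List[int],
             memory: Optional[Dict[int, int]] = None) -> int:
    """Evaluate a simplified DIExpression and return the top of the stack."""
    memory = memory or {}
    stack: List[int] = []
    # Assumption: without DW_OP_LLVM_arg, the intrinsic's single
    # value/address operand is pushed first.
    if "DW_OP_LLVM_arg" not in ops:
        stack.append(args[0])
    i = 0
    while i < len(ops):
        op = ops[i]
        i += 1
        if op == "DW_OP_LLVM_arg":        # push the N-th listed value
            stack.append(args[ops[i]])
            i += 1
        elif op == "DW_OP_constu":        # push an unsigned constant
            stack.append(ops[i])
            i += 1
        elif op == "DW_OP_plus_uconst":   # add a constant to the top entry
            stack.append(stack.pop() + ops[i])
            i += 1
        elif op == "DW_OP_plus":
            last = stack.pop()
            second_last = stack.pop()
            stack.append(second_last + last)
        elif op == "DW_OP_minus":         # second-last minus last
            last = stack.pop()
            second_last = stack.pop()
            stack.append(second_last - last)
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_deref":         # load from the modelled memory
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_stack_value":   # marks the result as a value
            pass
        else:
            raise ValueError(f"opcode not modelled in this sketch: {op!r}")
    return stack[-1]


# DW_OP_LLVM_arg with a two-element value list, as in a DIArgList.
difference = evaluate(
    ["DW_OP_LLVM_arg", 0, "DW_OP_LLVM_arg", 1, "DW_OP_minus",
     "DW_OP_stack_value"],
    [10, 3],
)
print(difference)
```

In this model, ``DW_OP_plus_uconst, 3`` and ``DW_OP_constu, 3, DW_OP_plus``
produce the same result, which matches the two equivalent spellings shown in
the ``DIExpression`` examples.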

DIObjCProperty
""""""""""""""

``DIObjCProperty`` nodes represent Objective-C property nodes.

.. code-block:: text

    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
                         getter: "getFoo", attributes: 7, type: !2)

DIImportedEntity
""""""""""""""""

``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit. The ``elements`` field is a list of renamed entities (such as
variables and subprograms) in the imported entity (such as a module).

.. code-block:: text

   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
                          entity: !1, line: 7, elements: !3)
   !3 = !{!4}
   !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
                          entity: !5, line: 7)

DIMacro
"""""""

``DIMacro`` nodes represent the definition or undefinition of a macro
identifier.
The ``name:`` field is the macro identifier, followed by macro parameters when
defining a function-like macro, and the ``value`` field is the token-string
used to expand the macro identifier.

.. code-block:: text

    !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
                  value: "((x) + 1)")
    !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")

DIMacroFile
"""""""""""

``DIMacroFile`` nodes represent inclusion of source files.
The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
appear in the included source file.

.. code-block:: text

    !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
                      nodes: !3)

.. _DILabel:

DILabel
"""""""

``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
a ``DILabel`` are mandatory.
The ``scope:`` field must be a
:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
The ``name:`` field is the label identifier. The ``file:`` field is the
:ref:`DIFile` the label is present in. The ``line:`` field is the source line
within the file where the label is declared.

.. code-block:: text

  !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)

'``tbaa``' Metadata
^^^^^^^^^^^^^^^^^^^

In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing type based alias analysis (TBAA). Instead, metadata is
added to the IR to describe a type system of a higher level language. This
can be used to implement C/C++ strict type aliasing rules, but it can also
be used to implement custom alias analysis behavior for other languages.

This description of LLVM's TBAA system is broken into two parts:
:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
:ref:`Representation<tbaa_node_representation>` talks about the metadata
encoding of various entities.

It is always possible to trace any TBAA node to a "root" TBAA node (details
in the :ref:`Representation<tbaa_node_representation>` section). TBAA
nodes with different roots have an unknown aliasing relationship, and LLVM
conservatively infers ``MayAlias`` between them. The rules mentioned in
this section only pertain to TBAA nodes living under the same root.

.. _tbaa_node_semantics:

Semantics
"""""""""

The TBAA metadata system, referred to as "struct path TBAA" (not to be
confused with ``tbaa.struct``), consists of the following high level
concepts: *Type Descriptors*, further subdivided into scalar type
descriptors and struct type descriptors; and *Access Tags*.

**Type descriptors** describe the type system of the higher level language
being compiled.
**Scalar type descriptors** describe types that do not
contain other types. Each scalar type has a parent type, which must also
be a scalar type or the TBAA root. Via this parent relation, scalar types
within a TBAA root form a tree. **Struct type descriptors** denote types
that contain a sequence of other type descriptors, at known offsets. These
contained type descriptors can either be struct type descriptors themselves
or scalar type descriptors.

**Access tags** are metadata nodes attached to load and store instructions.
Access tags use type descriptors to describe the *location* being accessed
in terms of the type system of the higher level language. Access tags are
tuples consisting of a base type, an access type and an offset. The base
type is a scalar type descriptor or a struct type descriptor, the access
type is a scalar type descriptor, and the offset is a constant integer.

The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
   or store) of a value of type ``AccessTy`` contained in the struct type
   ``BaseTy`` at offset ``Offset``.

 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
   ``AccessTy`` must be the same; and the access tag describes a scalar
   access with scalar type ``AccessTy``.

We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples this way:

 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
   described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
   undefined if ``Offset`` is non-zero.

 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
   to be relative within that inner type.

A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.

As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d;  // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

is (note that in C and C++, ``char`` can be used to access any arbitrary
type):

.. code-block:: text

    Root = "TBAA Root"
    CharScalarTy = ("char", Root, 0)
    FloatScalarTy = ("float", CharScalarTy, 0)
    DoubleScalarTy = ("double", CharScalarTy, 0)
    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
                     (InnerStructTy, 12)}


with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.

.. _tbaa_node_representation:

Representation
""""""""""""""

The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
with exactly one ``MDString`` operand.

Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third argument, but that must be the
constant integer zero.

Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1. The first operand is an ``MDString`` denoting
the name of the struct type. Like in scalar type descriptors the actual
value of this name operand is irrelevant to LLVM. After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
an ``MDNode``, denotes a contained field, and the 2N th operand, a
``ConstantInt``, is the offset of the said contained field. The offsets
must be in non-decreasing order.

Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
The first operand is an ``MDNode`` pointing to the node representing the
base type. The second operand is an ``MDNode`` pointing to the node
representing the access type. The third operand is a ``ConstantInt`` that
states the offset of the access. If a fourth field is present, it must be
a ``ConstantInt`` valued at 0 or 1.
If it is 1 then the access tag states
that the location being accessed is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
the access type and the base type of an access tag must be the same, and
that is the TBAA root of the access tag.

'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` is often used to implement
aggregate assignment operations in C and similar languages; however, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.

``!tbaa.struct`` metadata can describe which memory subregions in a
memcpy are padding and what the TBAA tags of the struct are.

The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its tbaa tag. e.g.:

.. code-block:: llvm

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
and has size 4 bytes and has tbaa tag !2.

Note that the fields need not be contiguous. In this example, there is a
4 byte gap between the two fields. This gap represents padding which
does not carry useful data and need not be preserved.
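
The grouping of ``!tbaa.struct`` operands into triples, and the gaps that
result between fields, can be sketched in Python. This is an illustration
rather than anything LLVM provides: ``Field``, ``decode_tbaa_struct`` and
``gaps_between_fields`` are hypothetical helpers, the operand list is a plain
Python list, TBAA tags are kept as opaque strings, and the struct is assumed
to start at offset 0.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Field(NamedTuple):
    offset: int  # offset of the field in bytes
    size: int    # size of the field in bytes
    tag: str     # TBAA tag node, kept as an opaque name in this sketch


def decode_tbaa_struct(operands: Sequence[object]) -> List[Field]:
    """Split a flat operand list into (offset, size, tag) groups of three."""
    if len(operands) % 3 != 0:
        raise ValueError("operands must come in groups of three")
    return [Field(operands[i], operands[i + 1], operands[i + 2])
            for i in range(0, len(operands), 3)]


def gaps_between_fields(fields: List[Field]) -> List[Tuple[int, int]]:
    """Return (offset, size) pairs for byte ranges not covered by any field."""
    gaps = []
    cursor = 0  # first byte not yet covered by a field
    for field in sorted(fields, key=lambda f: f.offset):
        if field.offset > cursor:
            gaps.append((cursor, field.offset - cursor))
        cursor = max(cursor, field.offset + field.size)
    return gaps


# Mirrors !{ i64 0, i64 4, !1, i64 8, i64 4, !2 } from the example above.
fields = decode_tbaa_struct([0, 4, "!1", 8, 4, "!2"])
gaps = gaps_between_fields(fields)
print(gaps)  # [(4, 4)]
```

The single reported gap starts at byte 4 and is 4 bytes long, which is the
padding between the two fields in the example.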

'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be specified not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used for example during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names.
A metadata reference to the scope's domain is the second entry. A descriptive
string may optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, float* %c, align 4, !alias.scope !5
    store float %1, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float, float* %c, align 4, !alias.scope !6
    store float %0, float* %arrayidx.i, align 4, !noalias !7

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating-point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

    If ``x`` is a real number that lies between two finite consecutive
    floating-point numbers ``a`` and ``b``, without being equal to one
    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
    distance between the two non-equal finite floating-point numbers
    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
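
To make the ULP definition concrete, the following Python sketch measures an
error in ULPs. It rests on several assumptions that are not part of this
document: Python floats are IEEE binary64 rather than LLVM's ``float``, every
Python float is exactly representable so only the "otherwise" branch of the
definition applies, and that branch is read here as the distance from ``x`` to
its nearest distinct neighbour. ``ulp`` and ``within_ulps`` are hypothetical
helper names, inputs are expected to be finite or NaN, and ``math.nextafter``
requires Python 3.9 or later.

```python
import math


def ulp(x: float) -> float:
    """Distance from a representable x to its nearest distinct neighbour."""
    if math.isnan(x):
        return math.nan
    below = x - math.nextafter(x, -math.inf)
    above = math.nextafter(x, math.inf) - x
    return min(below, above)


def within_ulps(exact: float, approx: float, max_ulps: float) -> bool:
    """Check whether approx deviates from exact by at most max_ulps ULPs."""
    return abs(approx - exact) <= max_ulps * ulp(exact)


step = 2.0 ** -52                 # spacing of binary64 values in [1, 2)
two_ulps_off = 1.5 + 2 * step
three_ulps_off = 1.5 + 3 * step

print(within_ulps(1.5, two_ulps_off, 2.5))
print(within_ulps(1.5, three_ulps_off, 2.5))
```

With a bound of 2.5 ULPs, a result that is two representable steps away from
``1.5`` is acceptable in this model, while one that is three steps away is not.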

The metadata node shall consist of a single positive float type number
representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs

.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges the loaded value or the value
returned by the called function at this call site is in. If the loaded or
returned value is not in the specified range, the behavior is undefined. The
ranges are represented with a flattened list of integers. The loaded value or
the value returned is known to be in the union of the ranges defined by each
consecutive pair. Each pair has the following properties:

-  The type must match the type loaded by the instruction.
-  The pair ``a,b`` represents the range ``[a,b)``.
-  Both ``a`` and ``b`` are constants.
-  The range is allowed to wrap.
-  The range should not represent the full or empty set. That is,
   ``a!=b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

Examples:

.. code-block:: llvm

      %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
      %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
      %d = invoke i8 @bar() to label %cont
             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

'``absolute_symbol``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``absolute_symbol`` metadata may be attached to a global variable
declaration.
It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code, and expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.

Example (assuming 64-bit pointers):

.. code-block:: llvm

      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)

    ...
    !0 = !{ i64 0, i64 256 }
    !1 = !{ i64 -1, i64 -1 }

'``callees``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``callees`` metadata may be attached to indirect call sites. If ``callees``
metadata is attached to a call site, and any callee is not among the set of
functions provided by the metadata, the behavior is undefined. The intent of
this metadata is to facilitate optimizations such as indirect-call promotion.
For example, in the code below, the call instruction may only target the
``add`` or ``sub`` functions:

.. code-block:: llvm

    %result = call i64 %binop(i64 %x, i64 %y), !callees !0

    ...
    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}

'``callback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``callback`` metadata may be attached to a function declaration, or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with the metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata.
Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
semantic restriction on the broker function itself is that it is not allowed to
inspect or modify arguments referenced in the ``callback`` metadata as
pass-through to the callback function.

The broker is not required to actually invoke the callback function at runtime.
However, the assumptions about not inspecting or modifying arguments that would
be passed to the specified callback function still hold, even if the callback
function is not dynamically invoked. The broker is allowed to invoke the
callback function more than once per invocation of the broker. The broker is
also allowed to invoke (directly or indirectly) the function passed as a
callback through another use. Finally, the broker is also allowed to relay the
callback callee invocation to a different thread.

The metadata is structured as follows: At the outer level, ``callback``
metadata is a list of ``callback`` encodings. Each encoding starts with a
constant ``i64`` which describes the argument position of the callback function
in the call to the broker. The following elements, except the last, describe
what arguments are passed to the callback function. Each element is again an
``i64`` constant identifying the argument of the broker that is passed through,
or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
they are listed has to be the same in which they are passed to the callback
callee. The last element of the encoding is a boolean which specifies how
variadic arguments of the broker are handled. If it is true, all variadic
arguments of the broker are passed through to the callback function *after* the
arguments encoded explicitly before.
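
The layout of a single ``callback`` encoding can be modelled with a short
Python sketch. It is a hypothetical illustration and not an LLVM API:
``CallbackEncoding``, ``decode_callback_encoding``, ``callback_arguments`` and
the ``UNKNOWN`` marker are names invented here, an encoding is written as a
plain list, and argument positions are assumed to be zero-based indices into
the broker's argument list.

```python
from typing import List, NamedTuple, Optional, Sequence

UNKNOWN = "<unknown>"  # stands in for an argument encoded as i64 -1


class CallbackEncoding(NamedTuple):
    callee_position: int               # broker argument that is the callback
    arg_sources: List[Optional[int]]   # broker position per callback argument
    pass_varargs: bool                 # forward the broker's variadic arguments


def decode_callback_encoding(encoding: Sequence[object]) -> CallbackEncoding:
    """Split an encoding into callee position, argument sources and flag."""
    if len(encoding) < 2:
        raise ValueError("need at least a callee position and a boolean")
    *positions, pass_varargs = encoding
    callee_position, *sources = positions
    return CallbackEncoding(
        callee_position,
        [None if s == -1 else s for s in sources],
        bool(pass_varargs),
    )


def callback_arguments(enc: CallbackEncoding, broker_args: Sequence[object],
                       num_fixed_params: int) -> List[object]:
    """List what the callback would receive for one call to the broker."""
    args = [UNKNOWN if s is None else broker_args[s] for s in enc.arg_sources]
    if enc.pass_varargs:
        # Variadic broker arguments follow the explicitly encoded ones.
        args.extend(broker_args[num_fixed_params:])
    return args


simple = decode_callback_encoding([2, 3, False])
forwarded = callback_arguments(
    simple, ["thread", "attr", "start_routine", "arg"], num_fixed_params=4)

variadic = decode_callback_encoding([2, -1, -1, True])
expanded = callback_arguments(
    variadic, ["loc", 2, "microtask", "x", "y"], num_fixed_params=3)
print(forwarded, expanded)
```

In the second call, the two leading callback arguments are unknown to the
caller of the broker, and the broker's variadic arguments ``"x"`` and ``"y"``
are appended after them because the final boolean is true.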

In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. This encoding
identifies the callback function as the argument at position 2 of the broker
(``i64 2``) and the sole argument of the callback function as the argument at
position 3 of the broker function (``i64 3``). Positions are counted from zero.

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)

    ...
    !2 = !{i64 2, i64 3, i1 false}
    !1 = !{!2}

Another example is shown below. The callback callee is the argument at
(zero-based) position 2 of the ``__kmpc_fork_call`` function (``i64 2``). The
callee is given two unknown
values (each identified by a ``i64 -1``) and afterwards all
variadic arguments that are passed to the ``__kmpc_fork_call`` call (due to the
final ``i1 true``).

.. FIXME why does the llvm-sphinx-docs builder give a highlighting
   error if the below is set to highlight as 'llvm', despite that we
   have misc.highlighting_failure set?

.. code-block:: text

    declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...)

    ...
    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
    !0 = !{!1}


'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions.
The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.

.. _md_dereferenceable:

'``dereferenceable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.

.. _md_dereferenceable_or_null:

'``dereferenceable_or_null``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values.

.. _llvm.loop:

'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. The loop metadata node is a list of
other metadata nodes, each representing a property of the loop. Usually,
the first item of the property node is a string. For example, the
``llvm.loop.unroll.count`` suggests an unroll factor to the loop
unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll.enable"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}

For legacy reasons, the first item of a loop metadata node must be a
reference to itself. Before the advent of the 'distinct' keyword, this
forced the preservation of otherwise identical metadata nodes. Since
the loop-metadata node can be attached to multiple nodes, the 'distinct'
keyword has become unnecessary.

Prior to the property nodes, one or two ``DILocation`` (debug location)
nodes can be present in the list. The first, if present, identifies the
source-code location where the loop begins. The second, if present,
identifies the source-code location where the loop ends.

Loop metadata nodes cannot be used as unique identifiers. They are
neither persistent for the same loop through transformations nor
necessarily unique to just one loop.

'``llvm.loop.disable_nonforced``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables all optional loop transformations unless
explicitly instructed using other transformation metadata such as
``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion can
make other transformations impossible. Mandatory loop canonicalizations such
as loop rotation are still applied.

It is recommended to use this metadata in addition to any ``llvm.loop.*``
transformation directive. Also, any loop should have at most one
directive applied to it (and a sequence of transformations built using
followup-attributes). Otherwise, which transformation will be applied
depends on implementation details such as the pass pipeline order.

See :ref:`transformation-metadata` for details.

'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
which contains information about loop-carried memory dependencies can be helpful
in determining the safety of these transformations.

'``llvm.loop.interleave.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an interleave count to the loop interleaver.
The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.interleave.count", i32 4}

Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
then the interleave count will be determined automatically.

'``llvm.loop.vectorize.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1 vectorization is enabled. A value of
0 disables vectorization:

.. code-block:: llvm

   !0 = !{!"llvm.loop.vectorize.enable", i1 0}
   !1 = !{!"llvm.loop.vectorize.enable", i1 1}

'``llvm.loop.vectorize.predicate.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables creating predicated instructions
for the loop, which can enable folding of the scalar epilogue loop into the
main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1 predication is enabled. A value of 0 disables
predication:

.. code-block:: llvm

   !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
   !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}

'``llvm.loop.vectorize.scalable.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables scalable vectorization for the
loop, and only has any effect if vectorization for the loop is already enabled.
The first operand is the string ``llvm.loop.vectorize.scalable.enable``
and the second operand is a bit. If the bit operand value is 1 scalable
vectorization is enabled, whereas a value of 0 reverts to the default fixed
width vectorization:

.. code-block:: llvm

   !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
   !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}

'``llvm.loop.vectorize.width``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata sets the target width of the vectorizer. The first
operand is the string ``llvm.loop.vectorize.width`` and the second
operand is an integer specifying the width. For example:

.. code-block:: llvm

   !0 = !{!"llvm.loop.vectorize.width", i32 4}

Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop.
If ``llvm.loop.vectorize.width`` is set to
0 or if the loop does not have this metadata the width will be
determined automatically.

'``llvm.loop.vectorize.followup_vectorized``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the vectorized loop will
have. See :ref:`transformation-metadata` for details.

'``llvm.loop.vectorize.followup_epilogue``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the epilogue will have. The
epilogue is not vectorized and is executed when either the vectorized
loop is not known to preserve semantics (because e.g., it processes two
arrays that are found to alias by a runtime check) or for the last
iterations that do not fill a complete set of vector lanes. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.vectorize.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes in the metadata will be added to both the vectorized and
epilogue loop.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller.
The first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time. The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

'``llvm.loop.unroll.followup``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the unrolled loop will have.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll.followup_remainder``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the remainder loop after
partial/runtime unrolling will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too.)

The metadata for unroll and jam otherwise is the same as for ``unroll``.
``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
``llvm.loop.unroll_and_jam.full`` is not supported. Again, these are only hints
and the normal safety checks will still be performed.

'``llvm.loop.unroll_and_jam.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll and jam factor to use, similarly to
``llvm.loop.unroll.count``. The first operand is the string
``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
specifying the unroll factor. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.

'``llvm.loop.unroll_and_jam.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unroll and jam. The metadata has a single
operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.disable"}

'``llvm.loop.unroll_and_jam.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
not known at compile time. The metadata has a single operand which is the
string ``llvm.loop.unroll_and_jam.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.enable"}

'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.

'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the epilogue of the outer loop
will have. This loop is usually unrolled, meaning there is no such
loop. This attribute will be ignored in this case.
See :ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the inner loop of the epilogue
will have. The outer epilogue will usually be unrolled, meaning there
can be multiple inner remainder loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.unroll_and_jam.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attributes specified in the metadata are added to all
``llvm.loop.unroll_and_jam.*`` loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that the loop should not be versioned for the purpose
of enabling loop-invariant code motion (LICM). The metadata has a single operand
which is the string ``llvm.loop.licm_versioning.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.licm_versioning.disable"}

'``llvm.loop.distribute.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Loop distribution allows splitting a loop into multiple loops. Currently,
this is only performed if the entire loop cannot be vectorized due to unsafe
memory dependencies. The transformation will attempt to isolate the unsafe
dependencies into their own loop.

This metadata can be used to selectively enable or disable distribution of the
loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:

.. code-block:: llvm

    !0 = !{!"llvm.loop.distribute.enable", i1 0}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}

This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.

'``llvm.loop.distribute.followup_coincident``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes extracted loops with no cyclic
dependencies will have (i.e. can be vectorized). See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_sequential``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata defines which attributes the isolated loops with unsafe
memory dependencies will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_fallback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If loop versioning is necessary, this metadata defines the attributes
the non-distributed fallback version will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.distribute.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The attributes in this metadata are added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.licm.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that loop-invariant code motion (LICM) should not be
performed on this loop. The metadata has a single operand which is the string
``llvm.licm.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.licm.disable"}

Note that although it operates per loop, it is not given the ``llvm.loop``
prefix because it is not affected by the ``llvm.loop.disable_nonforced``
metadata.

'``llvm.access.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.

.. code-block:: llvm

    %val = load i32, i32* %arrayidx, !llvm.access.group !0
    ...
    !0 = !{!1, !2}
    !1 = distinct !{}
    !2 = distinct !{}

It is illegal for the list node to be empty since it might be confused
with an access group.

The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty avoids the
situation that the content must be updated which, because metadata is
immutable by design, would require finding and updating all references
to the access group node.

The access group can be used to refer to a memory access instruction
without pointing to it directly (which is not possible in global
metadata). Currently, the only metadata making use of it is
``llvm.loop.parallel_accesses``.

'``llvm.loop.parallel_accesses``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``).
It denotes that
no loop-carried memory dependence exists between it and other instructions
in the loop with this metadata.

Let ``m1`` and ``m2`` be two instructions that both have the
``llvm.access.group`` metadata to the access group ``g1``, respectively
``g2`` (which might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.

If all memory-accessing instructions in a loop have
``llvm.access.group`` metadata that each refer to one of the access
groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependences and is considered to be a
parallel loop.

Note that if not all memory access instructions belong to an access
group referred to by ``llvm.loop.parallel_accesses``, then the loop must
not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be
considered sequential (if optimization passes that are unaware of the parallel
semantics insert new memory instructions into the loop body).

Example of a loop that is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata types.

.. code-block:: llvm

    for.body:
      ...
      %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
      ...
      store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
    !1 = distinct !{}

It is also possible to have nested parallel loops:

.. code-block:: llvm

    outer.for.body:
      ...
      %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4
      ...
      br label %inner.for.body

    inner.for.body:
      ...
      %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3
      ...
      store i32 %val0, i32* %arrayidx2, !llvm.access.group !3
      ...
      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

    inner.for.end:
      ...
      store i32 %val1, i32* %arrayidx4, !llvm.access.group !4
      ...
      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

    outer.for.end:                                          ; preds = %for.body
    ...
    !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
    !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
    !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
    !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop

'``llvm.loop.mustprogress``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
terminate, unwind, or interact with the environment in an observable way, e.g.
via a volatile memory access, I/O, or other synchronization. If such a loop is
not found to interact with the environment in an observable way, the loop may
be removed. This corresponds to the ``mustprogress`` function attribute.

'``irr_loop``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block.)
If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:

.. code-block:: llvm

    header0:
    ...
    br i1 %cmp, label %t1, label %t2, !irr_loop !0

    ...
    !0 = !{!"loop_header_weight", i64 100}

Irreducible loop header weights are typically based on profile data.

.. _md_invariant.group:

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The experimental ``invariant.group`` metadata may be attached to
``load``/``store`` instructions referencing a single metadata with no entries.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
can be assumed to load or store the same
value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
when two pointers are considered the same). Pointers returned by bitcast or
getelementptr with only zero indices are considered the same.

Examples:

.. code-block:: llvm

    @unknownPtr = external global i8
    ...
    %ptr = alloca i8
    store i8 42, i8* %ptr, !invariant.group !0
    call void @foo(i8* %ptr)

    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
    call void @foo(i8* %ptr)

    %newPtr = call i8* @getPointer(i8* %ptr)
    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

    %unknownValue = load i8, i8* @unknownPtr
    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

    call void @foo(i8* %ptr)
    %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
    %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr

    ...
    declare void @foo(i8*)
    declare i8* @getPointer(i8*)
    declare i8* @llvm.launder.invariant.group(i8*)

    !0 = !{}

The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is
tied to the SSA value of the pointer operand.

.. code-block:: llvm

    %v = load i8, i8* %x, !invariant.group !0
    ; if %x mustalias %y then we can replace the above instruction with
    %v = load i8, i8* %y

Note that this is an experimental feature, which means that its semantics might
change in the future.

'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

'``associated``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``associated`` metadata may be attached to a global variable definition with
a single argument that references a global object (optionally through an alias).

This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
discarding of the global variable in linker GC unless the referenced object is
also discarded. The linker support for this feature is spotty.
For best
compatibility, globals carrying this metadata should:

- Be in ``@llvm.compiler.used``.
- If the referenced global variable is in a comdat, be in the same comdat.

``!associated`` cannot express a many-to-one relationship. A global variable with
the metadata should generally not be referenced by a function: the function may
be inlined into other functions, leading to more references to the metadata.
Ideally we would want to keep metadata alive as long as any inline location is
alive, but this many-to-one relationship is not representable. Moreover, if the
metadata is retained while the function is discarded, the linker will report an
error of a relocation referencing a discarded section.

The metadata is often used with an explicit section consisting of valid C
identifiers so that the runtime can find the metadata section with
linker-defined encapsulation symbols ``__start_<section_name>`` and
``__stop_<section_name>``.

It does not have any effect on non-ELF targets.

Example:

.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{i32* @a}


'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
The first operand of the metadata node indicates the profile metadata
type. There are currently 3 types:
:ref:`branch_weights<prof_node_branch_weights>`,
:ref:`function_entry_count<prof_node_function_entry_count>`, and
:ref:`VP<prof_node_VP>`.

.. _prof_node_branch_weights:

branch_weights
""""""""""""""

Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_function_entry_count:

function_entry_count
""""""""""""""""""""

Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Used with BFI
information, it is also used to derive the basic block profile count.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_VP:

VP
""

VP (value profile) metadata can be attached to instructions that have
value profile information. Currently this is indirect calls (where it
records the hottest callees) and calls to memory intrinsics such as memcpy,
memmove, and memset (where it records the hottest byte lengths).

Each VP metadata node contains the "VP" string, then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.

Note that the value counts do not need to add up to the total count
listed in the third operand (in practice only the top hottest values
are tracked and reported).

Indirect call example:

.. code-block:: llvm

    call void %f(), !prof !1
    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}

Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times.
The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective preceding target
functions was called.

.. _md_annotation:

'``annotation``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``annotation`` metadata can be used to attach a tuple of annotation strings
to any instruction. This metadata does not impact the semantics of the program
and may only be used to provide additional insight about the program and
transformations to users.

Example:

.. code-block:: text

    %a.addr = alloca float*, align 8, !annotation !0
    !0 = !{!"auto-init"}

Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

- The first element is a *behavior* flag, which specifies the behavior
  when two (or more) modules are merged together, and it encounters two
  (or more) metadata with the same ID. The supported behaviors are
  described below.
- The second element is a metadata string that is a unique ID for the
  metadata. Each module may only have one flag entry for each unique ID (not
  including entries with the **Require** behavior).
- The third element is the value of the flag.
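
As a rough sketch of the triplet layout just described, a single flag entry
and the named metadata that lists it might look as follows. The flag ID
``example-flag`` is hypothetical and chosen only for illustration; the
behavior value 1 corresponds to **Error** in the table of supported
behaviors:

.. code-block:: llvm

    ; element 1: behavior, element 2: unique ID string, element 3: value
    !0 = !{i32 1, !"example-flag", i32 1}
    !llvm.module.flags = !{!0}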

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the *Require* behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked, or the max
           if the other module uses **Max** (in which case the resulting flag
           will be **Max**).

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to. This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

   * - 7
     - **Max**
           Takes the max of the two values, which are required to be integers.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

- Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
  if two or more ``!"foo"`` flags are seen is to emit an error if their
  values are not equal.

- Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
  behavior if two or more ``!"bar"`` flags are seen is to use the value
  '37'.

- Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
  behavior if two or more ``!"qux"`` flags are seen is to emit a
  warning if their values are not equal.

- Metadata ``!3`` has the ID ``!"qux"`` and the value:

  ::

      !{ !"foo", i32 1 }

  The behavior is to emit an error if the ``llvm.module.flags`` does not
  contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
  performed.

Synthesized Functions Module Flags Metadata
-------------------------------------------

These metadata specify the default attributes synthesized functions should have.
These metadata are currently respected by a few instrumentation passes, such as
sanitizers.
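
As a small sketch of how such flags might appear in a module, the fragment
below uses the ``frame-pointer`` and ``uwtable`` keys documented in this
section, both with the **Max** behavior (value 7 in the behavior table
above). The concrete values are chosen only for illustration:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2}   ; 2 maps to the "all" frame-pointer setting
    !1 = !{i32 7, !"uwtable", i32 1}         ; 1 requests the uwtable function attribute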

These metadata correspond to a few function attributes with significant code
generation behaviors. Function attributes with just optimization purposes
should not be listed because the performance impact of these synthesized
functions is small.

- "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
  will get the "frame-pointer" function attribute, with value being "none",
  "non-leaf", or "all", respectively.
- "uwtable": **Max**. The value can be 0 or 1. If the value is 1, a synthesized
  function will get the ``uwtable`` function attribute.

Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info". The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section to place the metadata. Valid values are
       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6. This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

- If a module with ``Objective-C Garbage Collection`` set to 0 is
  merged with a module with ``Objective-C Garbage Collection`` set to
  2, then the resulting module has the
  ``Objective-C Garbage Collection`` flag set to 0.
- A module with ``Objective-C Garbage Collection`` set to 0 cannot be
  merged with a module with ``Objective-C GC Only`` set to 6.

C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely wchar_t width and enum
width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of their values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 2 bytes, and that enums are at least as
large as an ``int``::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 1}
    !1 = !{i32 1, !"short_enum", i32 0}

LTO Post-Link Module Flags Metadata
-----------------------------------

Some optimisations can only be applied when the entire LTO unit is present in
the current module. This is represented by the ``LTOPostLink`` module flags
metadata, which will be created with a value of ``1`` when LTO linking occurs.

Automatic Linker Flags Named Metadata
=====================================

Some targets support embedding of flags to the linker inside individual object
files. Typically this is used in conjunction with language extensions which
allow source files to contain linker command line options, and have these
automatically be transmitted to the linker via object files.

These flags are encoded in the IR using named metadata with the name
``!llvm.linker.options``. Each operand is expected to be a metadata node
which should be a list of other metadata nodes, each of which should be a
list of metadata strings defining linker options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

Dependent Libs Named Metadata
=============================

Some targets support embedding of strings into object files to indicate
a set of libraries to add to the link. Typically this is used in conjunction
with language extensions which allow source files to explicitly declare the
libraries they depend on, and have these automatically be transmitted to the
linker via object files.

The list is encoded in the IR using named metadata with the name
``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
which should contain a single string operand.

For example, the following metadata section contains two library specifiers::

    !0 = !{!"a library specifier"}
    !1 = !{!"another library specifier"}
    !llvm.dependent-libraries = !{ !0, !1 }

Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.

.. _summary:

ThinLTO Summary
===============

Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
causes the building of a compact summary of the module that is emitted into
the bitcode. The summary is emitted into the LLVM assembly and identified
in syntax by a caret ('``^``').

The summary is parsed into a bitcode output, along with the Module
IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
summary entries (just as they currently ignore summary entries in a bitcode
input file).

Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where summary index is currently built from bitcode.
Specifically, tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag
(this part is not yet implemented, use llvm-as to create a bitcode object
before feeding into thin link tools for now).

There are currently 3 types of summary entries in the LLVM assembly:
:ref:`module paths<module_path_summary>`,
:ref:`global values<gv_summary>`, and
:ref:`type identifiers<typeid_summary>`.

.. _module_path_summary:

Module Path Summary Entry
-------------------------

Each module path summary entry lists a module containing global values included
in the summary. For a single IR module there will be one such entry, but
in a combined summary index produced during the thin link, there will be
one module path entry per linked module with summary.

Example:

.. code-block:: text

    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))

The ``path`` field is a string path to the bitcode file, and the ``hash``
field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
incremental builds and caching.

.. _gv_summary:

Global Value Summary Entry
--------------------------

Each global value summary entry corresponds to a global value defined or
referenced by a summarized module.

Example:

.. code-block:: text

    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831

For declarations, there will not be a summary list. For definitions, a
global value will contain a list of summaries, one per module containing
a definition. There can be multiple entries in a combined summary index
for symbols with weak linkage.

Each ``Summary`` format will depend on whether the global value is a
:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
:ref:`alias<alias_summary>`.

.. _function_summary:

Function Summary
^^^^^^^^^^^^^^^^

If the global value is a function, the ``Summary`` entry will look like:

.. code-block:: text

    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)

The ``module`` field includes the summary entry id for the module containing
this definition, and the ``flags`` field contains information such as
the linkage type, a flag indicating whether it is legal to import the
definition, whether it is globally live and whether the linker resolved it
to a local definition (the latter two are populated during the thin link).
The ``insts`` field contains the number of IR instructions in the function.
Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
:ref:`Params<params_summary>`, and :ref:`Refs<refs_summary>`.

.. _variable_summary:

Global Variable Summary
^^^^^^^^^^^^^^^^^^^^^^^

If the global value is a variable, the ``Summary`` entry will look like:

.. code-block:: text

    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)

The variable entry contains a subset of the fields in a
:ref:`function summary <function_summary>`, see the descriptions there.

.. _alias_summary:

Alias Summary
^^^^^^^^^^^^^

If the global value is an alias, the ``Summary`` entry will look like:

.. code-block:: text

    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)

The ``module`` and ``flags`` fields are as described for a
:ref:`function summary <function_summary>`. The ``aliasee`` field
contains a reference to the global value summary entry of the aliasee.

.. _funcflags_summary:

Function Flags
^^^^^^^^^^^^^^

The optional ``FuncFlags`` field looks like:

.. code-block:: text

    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)

If unspecified, flags are assumed to hold the conservative ``false`` value of
``0``.

.. _calls_summary:

Calls
^^^^^

The optional ``Calls`` field looks like:

.. code-block:: text

    calls: ((Callee)[, (Callee)]*)

where each ``Callee`` looks like:

.. code-block:: text

    callee: ^1[, hotness: None]?[, relbf: 0]?

The ``callee`` refers to the summary entry id of the callee.
At most one
of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
branch frequency relative to the entry frequency, scaled down by 2^8)
may be specified. The defaults are ``Unknown`` and ``0``, respectively.

.. _params_summary:

Params
^^^^^^

The optional ``Params`` is used by ``StackSafety`` and looks like:

.. code-block:: text

    params: ((Param)[, (Param)]*)

where each ``Param`` describes pointer parameter access inside of the
function and looks like:

.. code-block:: text

    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?

where ``param`` is the number of the parameter it describes and
``offset`` is the inclusive range of offsets from the pointer parameter to bytes
which can be accessed by the function. This range does not include accesses by
function calls from the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded into other
functions and looks like:

.. code-block:: text

    callee: ^3, param: 5, offset: [-3, 3]

The ``callee`` refers to the summary entry id of the callee, ``param`` is
the number of the callee parameter which points into the caller's parameter
with offset known to be inside of the ``offset`` range. ``calls`` will be
consumed and removed by thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe, and
access with any offset is assumed to be possible.

Example:

If we have the following function:

.. code-block:: text

    define i64 @foo(i64* %0, i32* %1, i8* %2, i8 %3) {
      store i32* %1, i32** @x
      %5 = getelementptr inbounds i8, i8* %2, i64 5
      %6 = load i8, i8* %5
      %7 = getelementptr inbounds i8, i8* %2, i8 %3
      tail call void @bar(i8 %3, i8* %7)
      %8 = load i64, i64* %0
      ret i64 %8
    }

We can expect a record like this:

.. code-block:: text

    params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))

The function may access just 8 bytes of the parameter %0. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter %1 escapes, so access is unknown.
The function itself can access just a single byte of the parameter %2. Additional
access is possible inside of the ``@bar`` or ``^3``. The function adds signed
offset to the pointer and passes the result as the argument %1 into ``^3``.
This record itself does not tell us how ``^3`` will access the parameter.
Parameter %3 is not a pointer.

.. _refs_summary:

Refs
^^^^

The optional ``Refs`` field looks like:

.. code-block:: text

    refs: ((Ref)[, (Ref)]*)

where each ``Ref`` contains a reference to the summary id of the referenced
value (e.g. ``^1``).

.. _typeidinfo_summary:

TypeIdInfo
^^^^^^^^^^

The optional ``TypeIdInfo`` field, used for
`Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
looks like:

.. code-block:: text

    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?

These optional fields have the following forms:

TypeTests
"""""""""

.. code-block:: text

    typeTests: (TypeIdRef[, TypeIdRef]*)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID``.

TypeTestAssumeVCalls
""""""""""""""""""""

.. code-block:: text

    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format:

.. code-block:: text

    vFuncId: (TypeIdRef, offset: 16)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID`` preceded by a ``guid:`` tag.

TypeCheckedLoadVCalls
"""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.

TypeTestAssumeConstVCalls
"""""""""""""""""""""""""

.. code-block:: text

    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format:

.. code-block:: text

    (VFuncId, args: (Arg[, Arg]*))

and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
and each Arg is an integer argument number.

TypeCheckedLoadConstVCalls
""""""""""""""""""""""""""

.. code-block:: text

    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format described for
``TypeTestAssumeConstVCalls``.

.. _typeid_summary:

Type ID Summary Entry
---------------------

Each type id summary entry corresponds to a type identifier resolution
which is generated during the LTO link portion of the compile when building
with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
so these are only present in a combined summary index.

Example:

.. code-block:: text

    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778

The ``typeTestRes`` gives the type test resolution ``kind`` (which may
be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
the ``size-1`` bit width. It is followed by optional flags, which default to 0,
and an optional WpdResolutions (whole program devirtualization resolution)
field that looks like:

.. code-block:: text

    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)

where each entry is a mapping from the given byte offset to the whole-program
devirtualization resolution WpdRes, which has one of the following formats:

.. code-block:: text

    wpdRes: (kind: branchFunnel)
    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
    wpdRes: (kind: indir)

Additionally, each wpdRes has an optional ``resByArg`` field, which
describes the resolutions for calls with all constant integer arguments:

.. code-block:: text

    resByArg: (ResByArg[, ResByArg]*)

where ResByArg is:

.. code-block:: text

    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])

Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
or ``VirtualConstProp``. The ``info`` field is only used if the kind
is ``UniformRetVal`` (indicates the uniform return value), or
``UniqueRetVal`` (holds the return value associated with the unique vtable
(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
not support the use of absolute symbols to store constants.

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affect code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x i8*] [
       i8* @X,
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that it cannot see (which is why they have to be named). For example, if
a variable has internal linkage and no references other than that from the
``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
removing the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This is a rare construct that should only be used in rare circumstances,
and should not be exposed to source languages.

.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.
On ELF the referenced global variable or function must be in a comdat.

.. _llvmglobaldtors:

The '``llvm.global_dtors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]

The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an associated global or function.
The functions referenced by this array will be called in descending
order of priority (i.e. highest first) when the module is unloaded. The
order of functions with the same priority is not defined.

If the third field is non-null, and points to a global variable
or function, the destructor function will only run if the associated
data from the current module is not discarded.
On ELF the referenced global variable or function must be in a comdat.

Instruction Reference
=====================

The LLVM instruction set consists of several different classifications
of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
instructions <binaryops>`, :ref:`bitwise binary
instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
:ref:`other instructions <otherops>`.

.. _terminators:

Terminator Instructions
-----------------------

As mentioned :ref:`previously <functionstructure>`, every basic block in a
program ends with a "Terminator" instruction, which indicates which
block should be executed after the current block is finished. These
terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the one exception being the
':ref:`invoke <i_invoke>`' instruction).

The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.

.. _i_ret:

'``ret``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      ret <type> <value>       ; Return a value from a non-void function
      ret void                 ; Return from void function

Overview:
"""""""""

The '``ret``' instruction is used to return control flow (and optionally
a value) from a function back to the caller.

There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.

Arguments:
""""""""""

The '``ret``' instruction optionally accepts a single argument, the
return value. The type of the return value must be a ':ref:`first
class <t_firstclass>`' type.

A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.

Semantics:
""""""""""

When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller was an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.

Example:
""""""""

.. code-block:: llvm

      ret i32 5                       ; Return an integer value of 5
      ret void                        ; Return from a void function
      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

.. _i_br:

'``br``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      br i1 <cond>, label <iftrue>, label <iffalse>
      br label <dest>          ; Unconditional branch

Overview:
"""""""""

The '``br``' instruction is used to cause control flow to transfer to a
different basic block in the current function. There are two forms of
this instruction, corresponding to a conditional branch and an
unconditional branch.

Arguments:
""""""""""

The conditional branch form of the '``br``' instruction takes a single
'``i1``' value and two '``label``' values. The unconditional form of the
'``br``' instruction takes a single '``label``' value as a target.

Semantics:
""""""""""

Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
to the '``iffalse``' ``label`` argument.
If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Example:
""""""""

.. code-block:: llvm

    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

.. _i_switch:

'``switch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]

Overview:
"""""""""

The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.

Arguments:
""""""""""

The '``switch``' instruction uses three parameters: an integer
comparison value '``value``', a default '``label``' destination, and an
array of pairs of comparison value constants and '``label``'s. The table
is not allowed to contain duplicate constant entries.

Semantics:
""""""""""

The ``switch`` instruction specifies a table of values and destinations.
When the '``switch``' instruction is executed, this table is searched
for the given value. If the value is found, control flow is transferred
to the corresponding destination; otherwise, control flow is transferred
to the default destination.
If '``value``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

Depending on properties of the target machine and the particular
``switch`` instruction, this instruction may be code generated in
different ways. For example, it could be generated as a series of
chained conditional branches or with a lookup table.

Example:
""""""""

.. code-block:: llvm

     ; Emulate a conditional br instruction
     %Val = zext i1 %value to i32
     switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

     ; Emulate an unconditional br instruction
     switch i32 0, label %dest [ ]

     ; Implement a jump table:
     switch i32 %val, label %otherwise [ i32 0, label %onzero
                                         i32 1, label %onone
                                         i32 2, label %ontwo ]

.. _i_indirectbr:

'``indirectbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]

Overview:
"""""""""

The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``".
Address must be derived from a
:ref:`blockaddress <blockaddress>` constant.

Arguments:
""""""""""

The '``address``' argument is the address of the label to jump to. The
rest of the arguments indicate the full set of possible destinations
that the address may point to. Blocks are allowed to occur multiple
times in the destination list, though this isn't particularly useful.

This destination list is required so that dataflow analysis has an
accurate understanding of the CFG.

Semantics:
""""""""""

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list, otherwise
this instruction has undefined behavior. This implies that jumps to
labels defined in other functions have undefined behavior as well.
If '``address``' is ``poison`` or ``undef``, this instruction has undefined
behavior.

Implementation:
"""""""""""""""

This is typically implemented with a jump through a register.

Example:
""""""""

.. code-block:: llvm

     indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]

.. _i_invoke:

'``invoke``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                    [operand bundles] to label <normal label> unwind label <exception label>

Overview:
"""""""""

The '``invoke``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``normal``' label or the '``exception``' label. If the callee function
returns with the "``ret``" instruction, control flow will return to the
"normal" label.
If the callee (or any indirect callees) returns via the
":ref:`resume <i_resume>`" instruction or other exception handling
mechanism, control is interrupted and continued at the dynamically
nearest "exception" label.

The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being invoked. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked.
   In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with a label, which is used by the runtime
library to unwind the stack.

This instruction is used in languages with destructors to ensure that
proper cleanup is performed in the case of either a ``longjmp`` or a
thrown exception. Additionally, this is important for implementation of
'``catch``' clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned
by the '``invoke``' instruction is deemed to occur on the edge from the
current block to the "normal" label. If the callee unwinds then no
return value is available.

Example:
""""""""

.. code-block:: llvm

      %retval = invoke i32 @Test(i32 15) to label %Continue
                  unwind label %TestCleanup              ; i32:retval set
      %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                  unwind label %TestCleanup              ; i32:retval set

.. _i_callbr:

'``callbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                    [operand bundles] to label <fallthrough label> [indirect labels]

Overview:
"""""""""

The '``callbr``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``fallthrough``' label or one of the '``indirect``' labels.

This instruction should only be used to implement the "goto" feature of gcc
style inline assembly. Any other usage is an error in the IR verifier.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature.
   This type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   other ``callbr``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``fallthrough label``': the label reached when the inline assembly's
   execution exits the bottom.
#. '``indirect labels``': the labels reached when a callee transfers control
   to a location other than the '``fallthrough label``'. The blockaddress
   constant for these should also be in the list of '``function args``'.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with additional labels to define where control
flow goes after the call.

The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).

The only use of this today is to implement the "goto" feature of gcc inline
assembly where additional labels can be provided as locations for the inline
assembly to jump to.

Example:
""""""""

.. code-block:: llvm

    ; "asm goto" without output constraints.
    callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
                to label %fallthrough [label %indirect]

    ; "asm goto" with output constraints.
    <result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
                to label %fallthrough [label %indirect]

.. _i_resume:

'``resume``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      resume <type> <value>

Overview:
"""""""""

The '``resume``' instruction is a terminator instruction that has no
successors.

Arguments:
""""""""""

The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.

Semantics:
""""""""""

The '``resume``' instruction resumes propagation of an existing
(in-flight) exception whose unwinding was interrupted with a
:ref:`landingpad <i_landingpad>` instruction.

Example:
""""""""

.. code-block:: llvm

      resume { i8*, i32 } %exn

.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>

Overview:
"""""""""

The '``catchswitch``' instruction is used by `LLVM's exception handling system
<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
that may be executed by the :ref:`EH personality routine <personalityfn>`.

Arguments:
""""""""""

The ``parent`` argument is the token of the funclet that contains the
``catchswitch`` instruction.
If the ``catchswitch`` is not inside a funclet,
this operand may be the token ``none``.

The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.

Semantics:
""""""""""

Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.

The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
it must be both the first non-phi instruction and last instruction in the basic
block. Therefore, it must be the only non-phi instruction in the block.

Example:
""""""""

.. code-block:: text

    dispatch1:
      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
    dispatch2:
      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup

.. _i_catchret:

'``catchret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      catchret from <token> to label <normal>

Overview:
"""""""""

The '``catchret``' instruction is a terminator instruction that has a
single successor.

Arguments:
""""""""""

The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.
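
As a rough sketch of how the two arguments fit together, the fragment below
pairs a '``catchret``' with the ``catchpad`` whose token it consumes. The block
names, the ``%cs`` and ``%catch`` value names, and the ``catchpad`` argument
list are illustrative assumptions (``catchpad`` arguments are specific to the
personality function), and the fragment is not a complete function.

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller
    handler:
      ; Assumed, personality-specific arguments; a real front end supplies its own.
      %catch = catchpad within %cs [i8* null, i32 64, i8* null]
      ; ... handler body ...
      catchret from %catch to label %continue
    continue:
      ; Normal control flow resumes here.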

Semantics:
""""""""""

The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.

Example:
""""""""

.. code-block:: text

      catchret from %catch to label %continue

.. _i_cleanupret:

'``cleanupret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cleanupret from <value> unwind label <continue>
      cleanupret from <value> unwind to caller

Overview:
"""""""""

The '``cleanupret``' instruction is a terminator instruction that has
an optional successor.

Arguments:
""""""""""

The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.

The '``cleanupret``' instruction also has an optional successor, ``continue``,
which must be the label of another basic block beginning with either a
``cleanuppad`` or ``catchswitch`` instruction.
This unwind destination must
be a legal target with respect to the ``parent`` links, as described in the
`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

Semantics:
""""""""""

The '``cleanupret``' instruction indicates to the
:ref:`personality function <personalityfn>` that one
:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
It transfers control to ``continue`` or unwinds out of the function.

Example:
""""""""

.. code-block:: text

      cleanupret from %cleanup unwind to caller
      cleanupret from %cleanup unwind label %continue

.. _i_unreachable:

'``unreachable``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      unreachable

Overview:
"""""""""

The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used to indicate that the code
after a no-return function cannot be reached, and other facts.

Semantics:
""""""""""

The '``unreachable``' instruction has no defined semantics.

.. _unaryops:

Unary Operations
-----------------

Unary operators require a single operand, execute an operation on
it, and produce a single value. The operand might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operand.

.. _i_fneg:

'``fneg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result

Overview:
"""""""""

The '``fneg``' instruction returns the negation of its operand.

Arguments:
""""""""""

The argument to the '``fneg``' instruction must be a
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values.

Semantics:
""""""""""

The value produced is a copy of the operand with its sign bit flipped.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fneg float %val          ; yields float:result = -%val

.. _binaryops:

Binary Operations
-----------------

Binary operators are used to do most of the computation in a program.
They require two operands of the same type, execute an operation on
them, and produce a single value. The operands might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operands.

There are several different binary operators:

.. _i_add:

'``add``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = add <ty> <op1>, <op2>          ; yields ty:result
      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``add``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``add``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer sum of the two operands.

If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = add i32 4, %var          ; yields i32:result = 4 + %var

.. _i_fadd:

'``fadd``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fadd``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``fadd``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

.. _i_sub:

'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``sub``' instruction returns the difference of its two operands.

Note that the '``sub``' instruction is used to represent the '``neg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``sub``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val

.. _i_fsub:

'``fsub``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fsub``' instruction returns the difference of its two operands.

Arguments:
""""""""""

The two arguments to the '``fsub``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val

.. _i_mul:

'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``mul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``mul``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer product of the two operands.

If the result of the multiplication has unsigned overflow, the result
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
bit width of the result.

Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

.. _i_fmul:

'``fmul``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fmul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``fmul``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

.. _i_udiv:

'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``udiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``udiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are
distinct operations; for signed integer division, use '``sdiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.

If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if %op1 is not a multiple of %op2 (as
such, "((a udiv exact b) mul b) == a").

Example:
""""""""

.. code-block:: text

      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_sdiv:

'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values.
Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.

Example:
""""""""

.. code-block:: text

      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

.. _i_urem:

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.

Example:
""""""""

.. code-block:: text

      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_srem:

'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.
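
The sign convention of '``srem``' (the result is either zero or has the same
sign as the dividend, ``op1``) can be illustrated with constant operands. This
is only a minimal sketch; the function name ``@srem_sign_demo`` and the value
names are hypothetical.

.. code-block:: llvm

    define i32 @srem_sign_demo() {
      %a = srem i32 7, 3      ; 1
      %b = srem i32 -7, 3     ; -1: the sign follows the dividend
      %c = srem i32 7, -3     ; 1: the sign of the divisor does not matter
      %d = srem i32 -7, -3    ; -1
      ret i32 %b
    }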

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets srem be implemented using instructions that return both the
result of the division and the remainder.)

Example:
""""""""

.. code-block:: text

      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.
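
As a small, hedged illustration of '``frem``' on constant operands: the result
matches what a libm '``fmod``' would return, so it takes the sign of the
dividend. The function name ``@frem_sign_demo`` and the value names are
hypothetical.

.. code-block:: llvm

    define float @frem_sign_demo() {
      %a = frem float 7.0, 3.0     ; 1.0
      %b = frem float -7.0, 3.0    ; -1.0: the sign follows the dividend
      %c = frem float 7.0, -3.0    ; 1.0
      ret float %b
    }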

Semantics:
""""""""""

The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

.. _i_shl:

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
If the arguments are vectors, each vector element of ``op1`` is shifted
by the corresponding shift amount in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a poison
value if it shifts out any non-zero bits.
If the ``nsw`` keyword is present, then the shift produces a poison
value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison: shift amount is not less than the bit width
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>

.. _i_lshr:

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift.
If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison: shift amount is not less than the bit width
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

.. _i_ashr:

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`.
If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison: shift amount is not less than the bit width
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>

.. _i_and:

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
| 0   | 0   | 0   |
+-----+-----+-----+
| 0   | 1   | 0   |
+-----+-----+-----+
| 1   | 0   | 0   |
+-----+-----+-----+
| 1   | 1   | 1   |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0

.. _i_or:

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
| 0   | 0   | 0   |
+-----+-----+-----+
| 0   | 1   | 1   |
+-----+-----+-----+
| 1   | 0   | 1   |
+-----+-----+-----+
| 1   | 1   | 1   |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12

.. _i_xor:

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
| 0   | 0   | 0   |
+-----+-----+-----+
| 0   | 1   | 1   |
+-----+-----+-----+
| 1   | 0   | 1   |
+-----+-----+-----+
| 1   | 1   | 0   |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
      <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val`` for a fixed-length vector, the result is a
:ref:`poison value <poisonvalues>`. For a scalable vector, if the value
of ``idx`` exceeds the runtime length of the vector, the result is a
:ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
      <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
if the value of ``idx`` exceeds the runtime length of the vector, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask vector constant
whose element type is ``i32``. The mask vector elements must be constant
integers or ``undef`` values. The result of the instruction is a vector
whose length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. For each element of the result vector, the
shuffle mask selects an element from one of the input vectors to copy
to the result. Non-negative elements in the mask represent an index
into the concatenated pair of input vectors.

If the shuffle mask is undefined, the result vector is undefined. If
the shuffle mask selects an undefined element from one of the input
vectors, the resulting element is undefined. An undefined element
in the mask vector specifies that the resulting element is undefined.
An undefined element in the mask vector prevents a poisoned vector
element from propagating.
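
The numbering of mask indices across both inputs can be sketched as
follows. This is an illustrative example only; the function name
``@interleave_low`` and its parameter names are hypothetical.

.. code-block:: llvm

      define <4 x i32> @interleave_low(<4 x i32> %a, <4 x i32> %b) {
        ; Mask indices 0-3 name the elements of %a and indices 4-7 name
        ; the elements of %b, so this mask selects <a0, b0, a1, b1>.
        %r = shufflevector <4 x i32> %a, <4 x i32> %b,
                           <4 x i32> <i32 0, i32 4, i32 1, i32 5>
        ret <4 x i32> %r
      }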

For scalable vectors, the only valid mask values at present are
``zeroinitializer`` and ``undef``, since we cannot write all indices as
literals for a vector with a length unknown at compile time.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences to ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.
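
The differences listed above can be seen by indexing the same nested
aggregate both ways. This is a minimal sketch; the function names
``@first_of_inner`` and ``@first_of_inner_ptr`` are hypothetical and chosen
only for illustration.

.. code-block:: llvm

      define float @first_of_inner({ i32, [2 x float] } %agg) {
        ; No leading zero index: 1 selects the [2 x float] member and
        ; 0 selects its first element. Both indices must be in bounds.
        %elt = extractvalue { i32, [2 x float] } %agg, 1, 0
        ret float %elt
      }

      define float* @first_of_inner_ptr({ i32, [2 x float] }* %p) {
        ; The equivalent getelementptr needs an extra leading index
        ; (i32 0) to step through the pointer operand itself.
        %addr = getelementptr { i32, [2 x float] }, { i32, [2 x float] }* %p, i32 0, i32 1, i32 0
        ret float* %addr
      }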

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory.
In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. If the address space is not explicitly
specified, the object is allocated in the alloca address space from the
:ref:`datalayout string<langref_datalayout>`.

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated, otherwise "NumElements" is defaulted
to be one. If a constant alignment is specified, the value result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 32``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.

'``type``' may be any sized type.

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The allocated memory is
uninitialized, and loading from uninitialized memory produces an undefined
value. The operation itself is undefined if there is insufficient stack
space for the allocation. '``alloca``'d memory is automatically released
when the function returns.
The '``alloca``' instruction is commonly used
to represent automatic variables that must have an address available. When
the function returns (either with the ``ret`` or ``resume`` instructions),
the memory is reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
which way the stack grows) is not specified.

Note that '``alloca``' outside of the alloca address space from the
:ref:`datalayout string<langref_datalayout>` is meaningful only if the
target has assigned it a semantics.

If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
the returned object is initially dead.
See :ref:`llvm.lifetime.start <int_lifestart>` and
:ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
lifetime-manipulating intrinsics.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                             ; yields i32*:ptr
      %ptr = alloca i32, i32 4                      ; yields i32*:ptr
      %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
      %ptr = alloca i32, align 1024                 ; yields i32*:ptr

.. _i_load:

'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}
      !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
      !<align_node> = !{ i64 <value_alignment> }

Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit.
``align`` must be
explicitly specified on atomic loads, and the load has undefined behavior if the
alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 32``. An alignment value higher
than the size of the loaded type implies memory up to the alignment
value bytes can be safely loaded without trapping in the default
address space. Access of the high bytes can interfere with debugging
tools, so should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<nontemp_node>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries.
If a load instruction tagged with the ``!invariant.load``
metadata is executed, the memory location referenced by the load has
to contain the same value at all points in the program where the
memory location is dereferenceable; otherwise, the behavior is
undefined.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. If the value is null at runtime, the behavior is undefined.
This is analogous to the ``nonnull`` attribute on parameters and return
values. This metadata can only be applied to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
<md_dereferenceable_or_null>`.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type. If the returned
value is not appropriately aligned at runtime, the behavior is undefined.

The optional ``!noundef`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a node with no entries. The existence of
``!noundef`` metadata on the instruction tells the optimizer that the value
loaded is known to be :ref:`well defined <welldefinedvalues>`.
If the value isn't well defined, the behavior is undefined.

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.
If the value being loaded is of aggregate type, the bytes that correspond to
padding may be accessed but are ignored, because it is impossible to observe
padding from the loaded aggregate value.
If ``<pointer>`` is not a well-defined value, the behavior is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store and an
address at which to store it. The type of the ``<pointer>`` operand must be a
pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``store`` with other
:ref:`volatile operations <volatile>`. Only values of :ref:`first class
<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
structural type <t_opaque>`) can be stored.

If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic stores, and the store has undefined behavior if
the alignment is not set to a value which is at least the size in bytes of the
pointee.
``!nontemporal`` does not have any defined semantics for atomic stores.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe. The maximum possible alignment is ``1 << 32``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be stored to without trapping in the default
address space. Storing to the higher bytes however may result in data
races if another thread can access the same address. Introducing a
data race is not allowed. Storing to the extra bytes is not allowed
even in situations where a data race is known to not exist if the
function has the ``sanitize_address`` attribute.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

The optional ``!invariant.group`` metadata must reference a
single metadata name ``<empty_node>``. See ``invariant.group`` metadata.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand.
If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.
If ``<value>`` is of aggregate type, padding is filled with
:ref:`undef <undefvalues>`.
If ``<pointer>`` is not a well-defined value, the behavior is undefined.

Example:
""""""""

.. code-block:: llvm

    %ptr = alloca i32                               ; yields i32*:ptr
    store i32 3, i32* %ptr                          ; yields void
    %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      fence [syncscope("<target-scope>")] <ordering>  ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B.
Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.

A ``fence`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Example:
""""""""

.. code-block:: text

    fence acquire                                        ; yields void
    fence syncscope("singlethread") seq_cst              ; yields void
    fence syncscope("agent") seq_cst                     ; yields void

.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '<cmp>' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '<cmp>' and '<new>' must
have the same type, and the type of '<pointer>' must be a pointer to
that type.
If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, and the failure ordering cannot be either
``release`` or ``acq_rel``.

A ``cmpxchg`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<value>' type. If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when align
isn't specified.

The pointer passed into cmpxchg must have alignment greater than or
equal to the size in memory of the operand.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
written to the location. The original value at the location is returned,
together with a flag indicating success (true) or failure (false).

If the cmpxchg operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences.
A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

Example:
""""""""

.. code-block:: llvm

    entry:
      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...

.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin
-  fadd
-  fsub

For most of these operations, the type of '<value>' must be an integer
type whose bit width is a power of two greater than or equal to eight
and less than or equal to a target-specific size limit. For xchg, this
may also be a floating point type with the same size constraints as
integers. For fadd/fsub, this must be a floating point type. The
type of the '``<pointer>``' operand must be a pointer to that type.
If
the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this
``atomicrmw`` with other :ref:`volatile operations <volatile>`.

The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<value>' type. If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when align
isn't specified.

An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)

Example:
""""""""

.. code-block:: llvm

    %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32

.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory. The instruction can also
be used to calculate a vector of such addresses.

Arguments:
""""""""""

The first argument is always a type used as the basis for the calculations.
The second argument is always a pointer or a vector of pointers, and is the
base address to start from. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
second argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value; subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant).
When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }

The LLVM code generated by Clang is:

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
      ret i32* %t5
    }

If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
following rules is violated:

*  The base pointer has an *in bounds* address of an allocated object, which
   means that it points into an allocated object, or to its end. The only
   *in bounds* address for a null pointer in the default address-space is the
   null pointer itself.
*  If the type of an index is larger than the pointer index type, the
   truncation to the pointer index type preserves the signed value.
*  The multiplication of an index by the type size does not wrap the pointer
   index type in a signed sense (``nsw``).
*  The successive addition of offsets (without adding the base address) does
   not wrap the pointer index type in a signed sense (``nsw``).
*  The successive addition of the current address, interpreted as an unsigned
   number, and an offset, interpreted as a signed number, does not wrap the
   unsigned address space and remains *in bounds* of the allocated object.
   As a corollary, if the added offset is non-negative, the addition does not
   wrap in an unsigned sense (``nuw``).
*  In cases where the base is a vector of pointers, the ``inbounds`` keyword
   applies to each of the computations element-wise.
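
The two ``nsw`` rules in the list above can be modelled in C as explicit
overflow checks. The following is only an illustrative sketch, not how LLVM
implements the check: it assumes a 64-bit pointer index type, and the helper
name ``add_scaled_index`` is hypothetical. It covers the multiplication and
offset-accumulation rules only, not the rule that the final address stays
within the allocated object.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helper: adds idx * elem_size to *offset, modelling a
       64-bit pointer index type. Returns false (leaving *offset untouched)
       where an 'inbounds' getelementptr would yield poison because of a
       signed wrap. */
    static bool add_scaled_index(int64_t *offset, int64_t idx, int64_t elem_size)
    {
        assert(elem_size >= 0); /* type sizes are never negative */

        /* Rule: idx * elem_size must not wrap in a signed sense (nsw). */
        if (elem_size != 0 &&
            (idx > INT64_MAX / elem_size || idx < INT64_MIN / elem_size))
            return false;
        int64_t scaled = idx * elem_size;

        /* Rule: accumulating offsets must not wrap in a signed sense (nsw). */
        if ((scaled > 0 && *offset > INT64_MAX - scaled) ||
            (scaled < 0 && *offset < INT64_MIN - scaled))
            return false;

        *offset += scaled;
        return true;
    }

Calling the helper once per index, with the size of the type being stepped
over, accumulates the total byte offset. A ``false`` result corresponds to a
case where the ``inbounds`` result would be poison.
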

These rules are based on the assumption that no allocated object may cross
the unsigned address space boundary, and no allocated object may be larger
than half the pointer index type space.

If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer. The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. The result value may not necessarily be used to access memory
though, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.

If the ``inrange`` keyword is present before any index, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as ``inrange``. The result of a
pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons
in the case where both operands are in the range of the element selected
by the ``inrange`` keyword, inclusive of the address one past the end of
that element. Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.

The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

    ; yields [12 x i8]*:aptr
    %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
    ; yields i8*:vptr
    %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
    ; yields i8*:eptr
    %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
    ; yields i32*:iptr
    %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0

Vector of pointers:
"""""""""""""""""""

The ``getelementptr`` returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such cases, all vector
arguments should have the same number of elements, and every scalar argument
will be effectively broadcast into a vector during address calculation.

.. code-block:: llvm

    ; All arguments are vectors:
    ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
    %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

    ; Add the same scalar offset to each pointer of a vector:
    ;   A[i] = ptrs[i] + offset*sizeof(i8)
    %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

    ; Add distinct offsets to the same pointer:
    ;   A[i] = ptr + offsets[i]*sizeof(i8)
    %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets

    ; In all cases described above the type of the result is <4 x i8*>

The two following instructions are equivalent:

.. code-block:: llvm

    getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
      <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
      <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
      <4 x i32> %ind4,
      <4 x i64> <i64 13, i64 13, i64 13, i64 13>

    getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
      i32 2, i32 1, <4 x i32> %ind4, i64 13

Let's look at the C code, where the vector version of ``getelementptr``
makes sense:

.. code-block:: c

    // Let's assume that we vectorize the following loop:
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      A[i] = B[C[i]];
    }

.. code-block:: llvm

    ; get pointers for 8 elements from array B
    %ptrs = getelementptr double, double* %B, <8 x i32> %C
    ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
         i32 8, <8 x i1> %mask, <8 x double> %passthru)

Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.

.. _i_trunc:

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

Example:
""""""""

.. code-block:: llvm

    %X = trunc i32 257 to i8                        ; yields i8:1
    %Y = trunc i32 123 to i1                        ; yields i1:true
    %Z = trunc i32 122 to i1                        ; yields i1:false
    %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

.. _i_zext:

'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from i1, the result will always be either 0 or 1.

Example:
""""""""

.. code-block:: llvm

    %X = zext i32 257 to i64              ; yields i64:257
    %Y = zext i1 true to i32              ; yields i32:1
    %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

.. _i_sext:

'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sext``' instruction sign extends ``value`` to the type ``ty2``.

Arguments:
""""""""""

The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers.
The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.

When sign extending from i1, the extension always results in -1 or 0.

Example:
""""""""

.. code-block:: llvm

    %X = sext i8  -1 to i16              ; yields i16:-1 (0xFFFF, 65535 if read as unsigned)
    %Y = sext i1 true to i32             ; yields i32:-1
    %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
The size of ``value`` must be larger than the size of ``ty2``. This
implies that ``fptrunc`` cannot be used to make a *no-op cast*.

Semantics:
""""""""""

The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.

Example:
""""""""

.. code-block:: llvm

    %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
    %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity

'``fpext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fpext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fpext``' instruction extends a floating-point ``value`` to a larger
floating-point value.

Arguments:
""""""""""

The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
to. The source type must be smaller than the destination type.

Semantics:
""""""""""

The '``fpext``' instruction extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The ``fpext`` cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.

Example:
""""""""

.. code-block:: llvm

    %X = fpext float 3.125 to double         ; yields double:3.125000e+00
    %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000

'``fptoui .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptoui <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptoui``' instruction converts a floating-point ``value`` to its
unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type.
If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptoui``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

    %X = fptoui double 123.0 to i32      ; yields i32:123
    %Y = fptoui float 1.0E+300 to i1     ; yields undefined:1
    %Z = fptoui float 1.04E+17 to i8     ; yields undefined:1

'``fptosi .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptosi <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptosi``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

    %X = fptosi double -123.0 to i32      ; yields i32:-123
    %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
    %Z = fptosi float 1.04E+17 to i8      ; yields undefined:1

'``uitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = uitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``uitofp``' instruction regards ``value`` as an unsigned integer
and converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating-point
value. If the value cannot be exactly represented, it is rounded using
the default rounding mode.


Example:
""""""""

.. code-block:: llvm

    %X = uitofp i32 257 to float         ; yields float:257.0
    %Y = uitofp i8 -1 to double          ; yields double:255.0

'``sitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sitofp``' instruction regards ``value`` as a signed integer and
converts that value to the ``ty2`` type.
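
The only difference between ``uitofp`` and ``sitofp`` is how the same bit
pattern is read before conversion. A rough C analogy for the ``i8`` to
``double`` case is sketched below. The function names ``uitofp_i8`` and
``sitofp_i8`` are hypothetical, and the sketch assumes the two's complement
reading that LLVM integers use.

.. code-block:: c

    #include <stdint.h>

    /* Models 'uitofp i8 %v to double': the 8 bits are read as unsigned. */
    static double uitofp_i8(uint8_t bits)
    {
        return (double)bits;
    }

    /* Models 'sitofp i8 %v to double': the same 8 bits are read as a
       two's complement signed value, so patterns with the top bit set
       denote bits - 256. */
    static double sitofp_i8(uint8_t bits)
    {
        return bits >= 128 ? (double)bits - 256.0 : (double)bits;
    }

For the all-ones pattern ``0xFF`` (written ``i8 -1`` in LLVM assembly), the
first helper gives 255.0 and the second gives -1.0.
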

Arguments:
""""""""""

The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.

Example:
""""""""

.. code-block:: llvm

    %X = sitofp i32 257 to float         ; yields float:257.0
    %Y = sitofp i8 -1 to double          ; yields double:-1.0

.. _i_ptrtoint:

'``ptrtoint .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``ptrtoint``' instruction converts the pointer or a vector of
pointers ``value`` to the integer (or vector of integers) type ``ty2``.

Arguments:
""""""""""

The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.

Semantics:
""""""""""

The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done.
If ``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

Example:
""""""""

.. code-block:: llvm

      %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
      %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x i32*> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture

.. _i_inttoptr:

'``inttoptr .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2

Overview:
"""""""""

The '``inttoptr``' instruction converts an integer ``value`` to a
pointer type, ``ty2``.

Arguments:
""""""""""

The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata.

Semantics:
""""""""""

The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
applying either a zero extension or a truncation depending on the size
of the integer ``value``. If ``value`` is larger than the size of a
pointer then a truncation is done.
If ``value`` is smaller than the size
of a pointer then a zero extension is done. If they are the same size,
nothing is done (*no-op cast*).

Example:
""""""""

.. code-block:: llvm

      %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
      %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
      %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x i8*>; yields truncation of vector G to four pointers

.. _i_bitcast:

'``bitcast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.

Arguments:
""""""""""

The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

There is a caveat for bitcasts involving vector types in relation to
endianness. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
of the vector in the least significant bits of the i16 for little-endian while
element zero ends up in the most significant bits for big-endian.

Example:
""""""""

.. code-block:: text

      %X = bitcast i8 255 to i8          ; yields i8 :-1
      %Y = bitcast i32* %x to i16*       ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64   ; yields i64: %V (depends on endianness)
      %W = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>

.. _i_addrspacecast:

'``addrspacecast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2

Overview:
"""""""""

The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
address space ``n`` to type ``pty2`` in address space ``m``.

Arguments:
""""""""""

The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer type to cast it to, which must have a different
address space.

Semantics:
""""""""""

The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address space
conversion is legal then both result and operand refer to the same memory
location.

Example:
""""""""

..
.. code-block:: llvm

      %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
      %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
      %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z

.. _otherops:

Other Operations
----------------

The instructions in this category are the "miscellaneous" instructions,
which defy better classification.

.. _i_icmp:

'``icmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``icmp``' instruction returns a boolean value or a vector of
boolean values based on comparison of its two integer, integer vector,
pointer, or pointer vector operands.

Arguments:
""""""""""

The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal

The remaining two arguments must be :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
must also be identical types.

Semantics:
""""""""""

The '``icmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``.
The comparison performed always yields either an
:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:

#. ``eq``: yields ``true`` if the operands are equal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ugt``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than ``op2``.
#. ``uge``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than or equal to ``op2``.
#. ``ult``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than ``op2``.
#. ``ule``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than or equal to ``op2``.
#. ``sgt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than ``op2``.
#. ``sge``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than or equal to ``op2``.
#. ``slt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than ``op2``.
#. ``sle``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than or equal to ``op2``.

If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
are compared as if they were integers.

If the operands are integer vectors, then they are compared element by
element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.

Example:
""""""""

..
.. code-block:: text

      <result> = icmp eq i32 4, 5          ; yields: result=false
      <result> = icmp ne float* %X, %X     ; yields: result=false
      <result> = icmp ult i16  4, 5        ; yields: result=true
      <result> = icmp sgt i16  4, 5        ; yields: result=false
      <result> = icmp ule i16 -4, 5        ; yields: result=false
      <result> = icmp sge i16  4, 5        ; yields: result=false

.. _i_fcmp:

'``fcmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``fcmp``' instruction returns a boolean value or vector of boolean
values based on comparison of its operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

Arguments:
""""""""""

The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no nans)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either nans)
#.
   ``true``: no comparison, always returns true

*Ordered* means that neither operand is a QNAN while *unordered* means
that either operand may be a QNAN.

Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.

Semantics:
""""""""""

The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

#. ``false``: always yields ``false``, regardless of operands.
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
   is equal to ``op2``.
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than ``op2``.
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than or equal to ``op2``.
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than ``op2``.
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than or equal to ``op2``.
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
   is not equal to ``op2``.
#. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than ``op2``.
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
#.
   ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than or equal to ``op2``.
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
   not equal to ``op2``.
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.

The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating-point optimizations.

Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.

Example:
""""""""

.. code-block:: text

      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

.. _i_phi:

'``phi``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...

Overview:
"""""""""

The '``phi``' instruction is used to implement the φ node in the SSA
graph representing the function.

Arguments:
""""""""""

The type of the incoming values is specified with the first type field.
After this, the '``phi``' instruction takes a list of pairs as
arguments, with one pair for each predecessor basic block of the current
block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.
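
The "one pair per predecessor" rule can be sketched with a two-way merge.
The function and value names below (``@abs_sketch``, ``%flipped``, and so on)
are hypothetical and chosen only for illustration.

.. code-block:: llvm

      define i32 @abs_sketch(i32 %x) {
      entry:
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %negate, label %join

      negate:
        %flipped = sub i32 0, %x
        br label %join

      join:
        ; %join has two predecessors, so the phi lists exactly two pairs.
        %result = phi i32 [ %flipped, %negate ], [ %x, %entry ]
        ret i32 %result
      }
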

There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.

For the purposes of the SSA form, the use of each incoming value is
deemed to occur on the edge from the corresponding predecessor block to
the current block (but after any definition of an '``invoke``'
instruction's return value on the same edge).

The optional ``fast-math-flags`` marker indicates that the phi has one
or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
to enable otherwise unsafe floating-point optimizations. Fast-math-flags
are only valid for phis that return a floating-point scalar or vector
type, or an array (nested to any depth) of floating-point scalar or vector
types.

Semantics:
""""""""""

At runtime, the '``phi``' instruction logically takes on the value
specified by the pair corresponding to the predecessor basic block that
executed just prior to the current block.

Example:
""""""""

.. code-block:: llvm

    Loop:       ; Infinite loop that counts from 0 on up...
      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
      %nextindvar = add i32 %indvar, 1
      br label %Loop

.. _i_select:

'``select``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty

      selty is either i1 or {<N x i1>}

Overview:
"""""""""

The '``select``' instruction is used to choose one value based on a
condition, without IR-level branching.
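
For instance, a signed maximum can be written without any extra basic blocks.
This is only a sketch; ``@smax_sketch`` is a hypothetical name.

.. code-block:: llvm

      define i32 @smax_sketch(i32 %a, i32 %b) {
        %a_gt_b = icmp sgt i32 %a, %b
        ; Pick %a when the condition is true, %b otherwise; no branch is needed.
        %max = select i1 %a_gt_b, i32 %a, i32 %b
        ret i32 %max
      }
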

Arguments:
""""""""""

The '``select``' instruction requires an 'i1' value or a vector of 'i1'
values indicating the condition, and two values of the same :ref:`first
class <t_firstclass>` type.

#. The optional ``fast-math flags`` marker indicates that the select has one or more
   :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for selects that return a floating-point scalar or vector type, or an array
   (nested to any depth) of floating-point scalar or vector types.

Semantics:
""""""""""

If the condition is an i1 and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.

If the condition is a vector of i1, then the value arguments must be
vectors of the same size, and the selection is done element by element.

If the condition is an i1 and the value arguments are vectors of the
same size, then an entire vector is selected.

Example:
""""""""

.. code-block:: llvm

      %X = select i1 true, i8 17, i8 42          ; yields i8:17


.. _i_freeze:

'``freeze``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = freeze ty <val>    ; yields ty:result

Overview:
"""""""""

The '``freeze``' instruction is used to stop propagation of
:ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.

Arguments:
""""""""""

The '``freeze``' instruction takes a single argument.

Semantics:
""""""""""

If the argument is ``undef`` or ``poison``, '``freeze``' returns an
arbitrary, but fixed, value of type '``ty``'.
Otherwise, this instruction is a no-op and returns the input argument.
All uses of a value returned by the same '``freeze``' instruction are
guaranteed to always observe the same value, while different '``freeze``'
instructions may yield different values.

While ``undef`` and ``poison`` pointers can be frozen, the result is a
non-dereferenceable pointer. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
If an aggregate value or vector is frozen, the operand is frozen element-wise.
The padding of an aggregate isn't considered, since it isn't visible
without storing it into memory and loading it with a different type.


Example:
""""""""

.. code-block:: text

      %w = i32 undef
      %x = freeze i32 %w
      %y = add i32 %w, %w         ; undef
      %z = add i32 %x, %x         ; even number because all uses of %x observe
                                  ; the same value
      %x2 = freeze i32 %w
      %cmp = icmp eq i32 %x, %x2  ; can be true or false

      ; example with vectors
      %v = <2 x i32> <i32 undef, i32 poison>
      %a = extractelement <2 x i32> %v, i32 0    ; undef
      %b = extractelement <2 x i32> %v, i32 1    ; poison
      %add = add i32 %a, %a                      ; undef

      %v.fr = freeze <2 x i32> %v                ; element-wise freeze
      %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
      %add.f = add i32 %d, %d                    ; even number

      ; branching on frozen value
      %poison = add nsw i1 %k, undef   ; poison
      %c = freeze i1 %poison
      br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar


.. _i_call:

'``call``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
                 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]

Overview:
"""""""""

The '``call``' instruction represents a simple function call.
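
In its simplest form, a call names either a function directly or a value
holding a function pointer. The sketch below shows both; ``@callee_sketch``,
``@caller_sketch``, and ``%fp`` are hypothetical names, and the typed-pointer
syntax follows the other examples in this document.

.. code-block:: llvm

      declare i32 @callee_sketch(i32)

      define i32 @caller_sketch(i32 %x, i32 (i32)* %fp) {
        %direct = call i32 @callee_sketch(i32 %x)   ; direct call to a known function
        %indirect = call i32 %fp(i32 %x)            ; indirect call through a function pointer
        %sum = add i32 %direct, %indirect
        ret i32 %sum
      }
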

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
   means that the call must be tail call optimized in order for the program to
   be correct. The ``musttail`` marker provides these guarantees:

   #. The call will not cause unbounded stack growth if it is part of a
      recursive cycle in the call graph.
   #. Arguments with the :ref:`inalloca <attr_inalloca>` or
      :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
   #. If the musttail call appears in a function with the ``"thunk"`` attribute
      and the caller and callee both have varargs, then any unprototyped
      arguments in register or memory are forwarded to the callee. Similarly,
      the return value of the callee is returned to the caller's caller, even
      if a void return type is in use.

   Both markers imply that the callee does not access allocas from the caller.
   The ``tail`` marker additionally implies that the callee does not access
   varargs from the caller. Calls marked ``musttail`` must obey the following
   additional rules:

   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
     or a pointer bitcast followed by a ret instruction.
   - The ret instruction must return the (possibly bitcasted) value
     produced by the call, undef, or void.
   - The calling conventions of the caller and callee must match.
   - The callee must be varargs iff the caller is varargs. Bitcasting a
     non-varargs function to the appropriate varargs type is legal so
     long as the non-varargs prefixes obey the other rules.
   - The return type must not undergo automatic conversion to an ``sret`` pointer.

   In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:

   - All ABI-impacting function attributes, such as sret, byval, inreg,
     returned, and inalloca, must match.
   - The caller and callee prototypes must match. Pointer types of parameters
     or return types may differ in pointee type, but not in address space.

   On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:

   - Only these ABI-impacting attributes are allowed: sret, byval,
     swiftself, and swiftasync.
   - Prototypes are not required to match.

   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
   the following conditions are met:

   - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
   - The call is in tail position (ret immediately follows call and ret
     uses value of call or is void).
   - Option ``-tailcallopt`` is enabled,
     ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
     is ``tailcc``.
   - `Platform-specific constraints are
     met. <CodeGenerator.html#tailcallopt>`_

#. The optional ``notail`` marker indicates that the optimizers should not add
   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
   call optimization from being performed on the call.

#. The optional ``fast-math flags`` marker indicates that the call has one or more
   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for calls that return a floating-point scalar or vector type, or an array
   (nested to any depth) of floating-point scalar or vector types.

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use.
   If none is
   specified, the call defaults to using C calling conventions. The
   calling convention of the call must match the calling convention of
   the target function, or else the behavior is undefined.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. The optional addrspace attribute can be used to indicate the address space
   of the called function. If it is not specified, the program address space
   from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect ``call``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values.
Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

Example:
""""""""

.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()  ; yields i32
      call void %foo(i8 97 signext)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended

LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.
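
A minimal sketch of reading one variadic argument is shown below. It assumes
a target on which ``va_list`` is a single ``i8*``, which does not hold
everywhere, and the function name ``@first_vararg_sketch`` is hypothetical.
The ``llvm.va_start`` and ``llvm.va_end`` intrinsics bracket the access.

.. code-block:: llvm

      declare void @llvm.va_start(i8*)
      declare void @llvm.va_end(i8*)

      define i32 @first_vararg_sketch(i32 %count, ...) {
        ; Assumption: va_list is one i8* on this target.
        %ap = alloca i8*
        %ap.raw = bitcast i8** %ap to i8*
        call void @llvm.va_start(i8* %ap.raw)
        ; Read the first variadic argument as an i32 and advance the va_list.
        %first = va_arg i8** %ap, i32
        call void @llvm.va_end(i8* %ap.raw)
        ret i32 %first
      }
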

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument. For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support va\_arg on many
targets. Also, it does not currently support va\_arg with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = landingpad <resultty> <clause>+
      <resultval> = landingpad <resultty> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad: a block where the exception lands, corresponding to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

The optional
``cleanup`` flag indicates that the landing pad block is a cleanup.
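
As a sketch of a cleanup-only landing pad in context, the function below
invokes a callee that may throw and resumes unwinding from the pad. The names
``@may_throw_sketch`` and ``@landing_sketch`` are hypothetical, and the
Itanium C++ personality ``@__gxx_personality_v0`` is assumed purely for
illustration.

.. code-block:: llvm

      declare void @may_throw_sketch()
      declare i32 @__gxx_personality_v0(...)

      define void @landing_sketch() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw_sketch()
                to label %ok unwind label %lpad

      lpad:
        ; No catch or filter clauses: the block only runs cleanup code.
        %res = landingpad { i8*, i32 }
                cleanup
        resume { i8*, i32 } %res

      ok:
        ret void
      }
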

A ``clause`` begins with the clause type (``catch`` or ``filter``) and
contains the global variable representing the "type" that may be caught
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.

Example:
""""""""

.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { i8*, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
               filter [1 x i8**] [i8** @_ZTId]

.. _i_catchpad:

'``catchpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchpad within <catchswitch> [<args>*]

Overview:
"""""""""

The '``catchpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
begins a catch handler --- one where a personality routine attempts to transfer
control to catch an exception.

Arguments:
""""""""""

The ``catchswitch`` operand must always be a token produced by a
:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
ensures that each ``catchpad`` has exactly one predecessor block, and it always
terminates in a ``catchswitch``.

The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown, the
exception is compared against the ``args``. If it doesn't match, control will
not reach the ``catchpad`` instruction. The representation of ``args`` is
entirely target and personality function-specific.

Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-phi of its parent basic block.

The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
instructions is described in the
`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.

When a ``catchpad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [i8** @_ZTIi]

.. _i_cleanuppad:

'``cleanuppad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""

The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction.
If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""

The instruction takes a list of arbitrary values which are interpreted
by the :ref:`personality function <personalityfn>`.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:

-  A cleanup block is a basic block which is the unwind destination of
   an exceptional instruction.
-  A cleanup block must have a '``cleanuppad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``cleanuppad``' instruction within the
   cleanup block.
-  A basic block that is not a cleanup block may not include a
   '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

      %tok = cleanuppad within %cs []

.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions.
Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc...).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, it is required if any
are added that they be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types. Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

Overloaded intrinsics will have the names of their overloaded argument
types encoded into their function names, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not.
For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.

:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
that depend on an unnamed type in one of their overloaded argument types get an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
different unnamed types as arguments. (For example:
``llvm.ssa.copy.p0s_s.2(%42*)``) The number is tracked in the LLVM module and
it ensures unique names in the module. While linking together two modules, it is
still possible to get a name clash. In that case one of the names will be
changed by getting a new number.

For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
integer or floating point types should not be relied upon for correct
code generation. In such cases, the recommended approach for target
maintainers when defining intrinsics is to create separate integer and
FP intrinsics rather than rely on overloading. For example, if different
codegen is required for ``llvm.target.foo(<4 x i32>)`` and
``llvm.target.foo(<4 x float>)`` then these should be split into
different intrinsics.

To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions operate on arguments that use a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca i8*
      %aq2 = bitcast i8** %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C. In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
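
The following is a minimal sketch of how a frontend might combine
``llvm.gcroot``, ``llvm.gcwrite`` and ``llvm.gcread`` in one function. It
assumes the built-in ``shadow-stack`` GC strategy; the ``%Object`` type and
the ``@link`` function are hypothetical names chosen for illustration only.

.. code-block:: llvm

    ; Hypothetical heap object with a single reference field.
    %Object = type { %Object* }

    declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
    declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
    declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

    define void @link(%Object* %a, %Object* %b) gc "shadow-stack" {
    entry:
      ; Register a stack slot as a GC root (no metadata attached).
      %root = alloca %Object*
      %root.i8 = bitcast %Object** %root to i8**
      call void @llvm.gcroot(i8** %root.i8, i8* null)
      store %Object* %a, %Object** %root

      ; Compute the address of the reference field of the rooted object.
      %obj = load %Object*, %Object** %root
      %field = getelementptr %Object, %Object* %obj, i32 0, i32 0
      %field.i8 = bitcast %Object** %field to i8**
      %obj.i8 = bitcast %Object* %obj to i8*
      %val.i8 = bitcast %Object* %b to i8*

      ; Store %b into the field through the write barrier.
      call void @llvm.gcwrite(i8* %val.i8, i8* %obj.i8, i8** %field.i8)

      ; Load the field back through the read barrier.
      %read = call i8* @llvm.gcread(i8* %obj.i8, i8** %field.i8)
      ret void
    }

Only ``%a`` is rooted in this sketch; a real frontend would normally root
every live reference that must survive a collection.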

LLVM provides a second experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second pointer (which must be either a constant or
a global value address) contains the meta-data to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.

.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
Obj to store to. If the runtime does not require a pointer to the
object, Obj may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.


.. _gc_statepoint:

'llvm.experimental.gc.statepoint' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token
        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
                       func_type <target>,
                       i64 <#call args>, i64 <flags>,
                       ... (call parameters),
                       i64 0, i64 0)

Overview:
"""""""""

The statepoint intrinsic represents a call which is parse-able by the
runtime.

Operands:
"""""""""

The 'id' operand is a constant integer that is reported as the ID
field in the generated stackmap. LLVM does not interpret this
parameter in any way and its meaning is up to the statepoint user to
decide. Note that LLVM is free to duplicate code containing
statepoint calls, and this may transform IR that had a unique 'id' per
lexical call to statepoint to IR that does not.

If 'num patch bytes' is non-zero then the call instruction
corresponding to the statepoint is not emitted and LLVM emits 'num
patch bytes' bytes of nops in its place. LLVM will emit code to
prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
sequence and the latter after the nop sequence. It is expected that
the user will patch over the 'num patch bytes' bytes of nops with a
calling sequence specific to their runtime before executing the
generated machine code.
There are no guarantees with respect to the
alignment of the nop sequence. Unlike :doc:`StackMaps`, statepoints do
not have a concept of shadow bytes. Note that semantically the
statepoint still represents a call or invoke to 'target', and the nop
sequence after patching is expected to represent an operation
equivalent to a call or invoke to 'target'.

The 'target' operand is the function actually being called. The
target can be specified as either a symbolic LLVM function, or as an
arbitrary Value of appropriate function type. Note that the function
type must match the signature of the callee and the types of the 'call
parameters' arguments.

The '#call args' operand is the number of arguments to the actual
call. It must exactly match the number of arguments passed in the
'call parameters' variable length section.

The 'flags' operand is used to specify extra information about the
statepoint. This is currently only used to mark certain statepoints
as GC transitions. This operand is a 64-bit integer with the following
layout, where bit 0 is the least significant bit:

  +-------+---------------------------------------------------+
  | Bit # | Usage                                             |
  +=======+===================================================+
  |     0 | Set if the statepoint is a GC transition, cleared |
  |       | otherwise.                                        |
  +-------+---------------------------------------------------+
  |  1-63 | Reserved for future use; must be cleared.         |
  +-------+---------------------------------------------------+

The 'call parameters' arguments are simply the arguments which need to
be passed to the call target. They will be lowered according to the
specified calling convention and otherwise handled like a normal call
instruction. The number of arguments must exactly match what is
specified in '#call args'.
The types must match the signature of
'target'.

The 'call parameter' attributes must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced with the corresponding operand bundles. In a future
revision, these now redundant arguments will be removed.

Semantics:
""""""""""

A statepoint is assumed to read and write all memory. As a result,
memory operations can not be reordered past a statepoint. It is
illegal to mark a statepoint as being either 'readonly' or 'readnone'.

Note that legal IR can not perform any memory operation on a 'gc
pointer' argument of the statepoint in a location statically reachable
from the statepoint. Instead, the explicitly relocated value (from a
``gc.relocate``) must be used.

'llvm.experimental.gc.result' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type*
        @llvm.experimental.gc.result(token %statepoint_token)

Overview:
"""""""""

``gc.result`` extracts the result of the original call instruction
which was replaced by the ``gc.statepoint``. The ``gc.result``
intrinsic is actually a family of three intrinsics due to an
implementation limitation. Other than the type of the return value,
the semantics are the same.

Operands:
"""""""""

The first and only argument is the ``gc.statepoint`` which starts
the safepoint sequence of which this ``gc.result`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

Semantics:
""""""""""

The ``gc.result`` represents the return value of the call target of
the ``statepoint``.
The type of the ``gc.result`` must exactly match
the type of the target. If the call target returns void, there will
be no ``gc.result``.

A ``gc.result`` is modeled as a 'readnone' pure function. It has no
side effects since it is just a projection of the return value of the
previous call represented by the ``gc.statepoint``.

'llvm.experimental.gc.relocate' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <pointer type>
        @llvm.experimental.gc.relocate(token %statepoint_token,
                                       i32 %base_offset,
                                       i32 %pointer_offset)

Overview:
"""""""""

A ``gc.relocate`` returns the potentially relocated value of a pointer
at the safepoint.

Operands:
"""""""""

The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
Despite the typing of this as a generic token, *only* the value defined
by a ``gc.statepoint`` is legal here.

The second and third arguments are both indices into operands of the
corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.

The second argument is an index which specifies the allocation for the pointer
being relocated. The associated value must be within the object with which the
pointer being relocated is associated. The optimizer is free to change *which*
interior derived pointer is reported, provided that it does not replace an
actual base pointer with another interior derived pointer. Collectors are
allowed to rely on the base pointer operand remaining an actual base pointer if
so constructed.

The third argument is an index which specifies the (potentially) derived pointer
being relocated.
It is legal for this index to be the same as the second
argument if-and-only-if a base pointer is being relocated.

Semantics:
""""""""""

The return value of ``gc.relocate`` is the potentially relocated value
of the pointer specified by its arguments. It is unspecified how the
value of the returned pointer relates to the argument to the
``gc.statepoint`` other than that a) it points to the same source
language object with the same offset, and b) the 'based-on'
relationship of the newly relocated pointers is a projection of the
unrelocated pointers. In particular, the integer value of the pointer
returned is unspecified.

A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no
side effects since it is just a way to extract information about work
done during the actual call modeled by the ``gc.statepoint``.

.. _gc.get.pointer.base:

'llvm.experimental.gc.get.pointer.base' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <pointer type>
        @llvm.experimental.gc.get.pointer.base(
          <pointer type> readnone nocapture %derived_ptr)
          nounwind readnone willreturn

Overview:
"""""""""

``gc.get.pointer.base`` for a derived pointer returns its base pointer.

Operands:
"""""""""

The only argument is a pointer which is based on some object with
an unknown offset from the base of said object.

Semantics:
""""""""""

This intrinsic is used in the abstract machine model for GC to represent
the base pointer for an arbitrary derived pointer.

This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the offset of a derived pointer from
its base pointer value.
The replacement is done as part of the lowering to the
explicit statepoint model.

The return pointer type must be the same as the type of the parameter.


'llvm.experimental.gc.get.pointer.offset' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64
        @llvm.experimental.gc.get.pointer.offset(
          <pointer type> readnone nocapture %derived_ptr)
          nounwind readnone willreturn

Overview:
"""""""""

``gc.get.pointer.offset`` for a derived pointer returns the offset from its
base pointer.

Operands:
"""""""""

The only argument is a pointer which is based on some object with
an unknown offset from the base of said object.

Semantics:
""""""""""

This intrinsic is used in the abstract machine model for GC to represent
the offset of an arbitrary derived pointer from its base pointer.

This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the offset of a derived pointer from
its base pointer value. The replacement is done as part of the lowering to the
explicit statepoint model.

In essence, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`), both cast with ``ptrtoint``.
However, performing this cast outside the :ref:`RewriteStatepointsForGC` pass
could cause the pointers to be lost during the later lowering from the abstract
model to the explicit physical one.

Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.addressofreturnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.addressofreturnaddress()

Overview:
"""""""""

The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
pointer to the place in the stack frame where the return address of the
current function is stored.

Semantics:
""""""""""

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

This intrinsic is only implemented for x86 and aarch64.

'``llvm.sponentry``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.sponentry()

Overview:
"""""""""

The '``llvm.sponentry``' intrinsic returns the stack pointer value at
the entry of the current function calling this intrinsic.

Semantics:
""""""""""

Note this intrinsic is only verified on AArch64.

'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.
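
As a rough illustration, the sketch below queries ``llvm.returnaddress`` and
``llvm.frameaddress`` for the current function (level zero). The declarations
mirror the Syntax sections of this document and may need a type suffix in
LLVM versions where these intrinsics are overloaded; the function names
``@current_return_address`` and ``@current_frame_address`` are hypothetical.

.. code-block:: llvm

    declare i8* @llvm.returnaddress(i32)
    declare i8* @llvm.frameaddress(i32)

    ; Level 0 refers to the function containing the call.
    define i8* @current_return_address() {
      %ra = call i8* @llvm.returnaddress(i32 0)
      ret i8* %ra
    }

    define i8* @current_frame_address() {
      %fp = call i8* @llvm.frameaddress(i32 0)
      ret i8* %fp
    }

Because neither call prevents inlining, these helpers may end up reporting
the frame of whichever function they are inlined into.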
12620 12621'``llvm.swift.async.context.addr``' Intrinsic 12622^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12623 12624Syntax: 12625""""""" 12626 12627:: 12628 12629 declare i8** @llvm.swift.async.context.addr() 12630 12631Overview: 12632""""""""" 12633 12634The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to 12635the part of the extended frame record containing the asynchronous 12636context of a Swift execution. 12637 12638Semantics: 12639"""""""""" 12640 12641If the caller has a ``swiftasync`` parameter, that argument will initially 12642be stored at the returned address. If not, it will be initialized to null. 12643 12644'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics 12645^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12646 12647Syntax: 12648""""""" 12649 12650:: 12651 12652 declare void @llvm.localescape(...) 12653 declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx) 12654 12655Overview: 12656""""""""" 12657 12658The '``llvm.localescape``' intrinsic escapes offsets of a collection of static 12659allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a 12660live frame pointer to recover the address of the allocation. The offset is 12661computed during frame layout of the caller of ``llvm.localescape``. 12662 12663Arguments: 12664"""""""""" 12665 12666All arguments to '``llvm.localescape``' must be pointers to static allocas or 12667casts of static allocas. Each function can only call '``llvm.localescape``' 12668once, and it can only do so from the entry block. 12669 12670The ``func`` argument to '``llvm.localrecover``' must be a constant 12671bitcasted pointer to a function defined in the current module. The code 12672generator cannot determine the frame allocation offset of functions defined in 12673other modules. 12674 12675The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a 12676call frame that is currently live. 
The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.

'``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.seh.try.begin()
      declare void @llvm.seh.try.end()

Overview:
"""""""""

The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a _try region for Windows SEH Asynchronous Exception Handling.

Semantics:
""""""""""

When a C function is compiled with the Windows SEH Asynchronous Exception
option, -feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark
the _try boundary and to prevent potential exceptions from being moved across
that boundary. Any set of operations can then be confined to the region by
reading their leaf inputs via volatile loads and writing their root outputs via
volatile stores.

'``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.seh.scope.begin()
      declare void @llvm.seh.scope.end()

Overview:
"""""""""

The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
Handling (MSVC option -EHa).

Semantics:
""""""""""

LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke`` instructions, which normally correspond only to
call sites. To support arbitrary faulting instructions, it must be possible to
recover the current EH scope for any instruction. Turning every operation in
LLVM that could fault into an ``invoke`` of a new, potentially-throwing
intrinsic would require adding a large number of intrinsics, impede
optimization of those operations, and make compilation slower by introducing
many extra basic blocks. These intrinsics can be used instead to mark the
region protected by a cleanup, such as for a local C++ object with a
non-trivial destructor. ``llvm.seh.scope.begin`` is used to mark the start of
the region; it is always called with ``invoke``, with the unwind block being
the desired unwind destination for any potentially-throwing instructions
within the region. ``llvm.seh.scope.end`` is used to mark when the scope ends
and the EH cleanup is no longer required (e.g. because the destructor is being
called).

.. _int_read_register:
.. _int_read_volatile_register:
.. 
_int_write_register: 12759 12760'``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics 12761^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12762 12763Syntax: 12764""""""" 12765 12766:: 12767 12768 declare i32 @llvm.read_register.i32(metadata) 12769 declare i64 @llvm.read_register.i64(metadata) 12770 declare i32 @llvm.read_volatile_register.i32(metadata) 12771 declare i64 @llvm.read_volatile_register.i64(metadata) 12772 declare void @llvm.write_register.i32(metadata, i32 @value) 12773 declare void @llvm.write_register.i64(metadata, i64 @value) 12774 !0 = !{!"sp\00"} 12775 12776Overview: 12777""""""""" 12778 12779The '``llvm.read_register``', '``llvm.read_volatile_register``', and 12780'``llvm.write_register``' intrinsics provide access to the named register. 12781The register must be valid on the architecture being compiled to. The type 12782needs to be compatible with the register being read. 12783 12784Semantics: 12785"""""""""" 12786 12787The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics 12788return the current value of the register, where possible. The 12789'``llvm.write_register``' intrinsic sets the current value of the register, 12790where possible. 12791 12792A call to '``llvm.read_volatile_register``' is assumed to have side-effects 12793and possibly return a different value each time (e.g. for a timer register). 12794 12795This is useful to implement named register global variables that need 12796to always be mapped to a specific register, as is common practice on 12797bare-metal programs including OS kernels. 12798 12799The compiler doesn't check for register availability or use of the used 12800register in surrounding code, including inline assembly. Because of that, 12801allocatable registers are not supported. 
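
A minimal sketch of reading a named register, assuming a target on which the
stack pointer is named ``sp`` (for example AArch64). The function name
``@get_stack_pointer`` is hypothetical; the declaration and the metadata node
follow the Syntax section above.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      ; Hypothetical helper: returns the current stack pointer as an integer.
      define i64 @get_stack_pointer() {
      entry:
        ; !0 names the register; the type suffix must suit the register width.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"sp\00"}
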
12802 12803Warning: So far it only works with the stack pointer on selected 12804architectures (ARM, AArch64, PowerPC and x86_64). Significant amount of 12805work is needed to support other registers and even more so, allocatable 12806registers. 12807 12808.. _int_stacksave: 12809 12810'``llvm.stacksave``' Intrinsic 12811^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12812 12813Syntax: 12814""""""" 12815 12816:: 12817 12818 declare i8* @llvm.stacksave() 12819 12820Overview: 12821""""""""" 12822 12823The '``llvm.stacksave``' intrinsic is used to remember the current state 12824of the function stack, for use with 12825:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for 12826implementing language features like scoped automatic variable sized 12827arrays in C99. 12828 12829Semantics: 12830"""""""""" 12831 12832This intrinsic returns an opaque pointer value that can be passed to 12833:ref:`llvm.stackrestore <int_stackrestore>`. When an 12834``llvm.stackrestore`` intrinsic is executed with a value saved from 12835``llvm.stacksave``, it effectively restores the state of the stack to 12836the state it was in when the ``llvm.stacksave`` intrinsic executed. In 12837practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that 12838were allocated after the ``llvm.stacksave`` was executed. 12839 12840.. _int_stackrestore: 12841 12842'``llvm.stackrestore``' Intrinsic 12843^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12844 12845Syntax: 12846""""""" 12847 12848:: 12849 12850 declare void @llvm.stackrestore(i8* %ptr) 12851 12852Overview: 12853""""""""" 12854 12855The '``llvm.stackrestore``' intrinsic is used to restore the state of 12856the function stack to the state it was in when the corresponding 12857:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is 12858useful for implementing language features like scoped automatic variable 12859sized arrays in C99. 12860 12861Semantics: 12862"""""""""" 12863 12864See the description for :ref:`llvm.stacksave <int_stacksave>`. 
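
The save/restore pairing described above can be sketched as follows for a
scoped variable-sized array. This is an illustration only: ``@scoped_vla`` and
the external ``@fill`` are hypothetical names, not part of LLVM.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)

      ; Hypothetical external function that uses the temporary buffer.
      declare void @fill(i32*, i32)

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state before the variable-sized allocation.
        %saved = call i8* @llvm.stacksave()
        ; Dynamic alloca: the element count is only known at run time.
        %vla = alloca i32, i32 %n
        call void @fill(i32* %vla, i32 %n)
        ; Pop %vla, and anything else allocated since the save.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }

After the restore, ``%vla`` must no longer be used, since its storage has been
released.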

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.dynamic.area.offset.i32()
      declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer gives the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer gives the address one
past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
must match the target's default address space's (address space 0) pointer type.
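
The combination with ``llvm.stacksave`` mentioned above might look like the
following sketch. It assumes a target with 64-bit pointers in address space 0
and a downward-growing stack, and it treats the value returned by
``llvm.stacksave`` as the native stack pointer. ``@locate_last_alloca`` and
``@inspect`` are hypothetical names.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      ; Hypothetical external consumer of the computed address.
      declare void @inspect(i8*)

      define void @locate_last_alloca(i64 %n) {
      entry:
        ; The most recent dynamic alloca in this function.
        %buf = alloca i8, i64 %n
        ; Assumption: the saved stack state is the native stack pointer.
        %sp = call i8* @llvm.stacksave()
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Downward-growing stack: stack pointer plus offset.
        %addr = getelementptr i8, i8* %sp, i64 %off
        call void @inspect(i8* %addr)
        ret void
      }
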
12907 12908'``llvm.prefetch``' Intrinsic 12909^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12910 12911Syntax: 12912""""""" 12913 12914:: 12915 12916 declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>) 12917 12918Overview: 12919""""""""" 12920 12921The '``llvm.prefetch``' intrinsic is a hint to the code generator to 12922insert a prefetch instruction if supported; otherwise, it is a noop. 12923Prefetches have no effect on the behavior of the program but can change 12924its performance characteristics. 12925 12926Arguments: 12927"""""""""" 12928 12929``address`` is the address to be prefetched, ``rw`` is the specifier 12930determining if the fetch should be for a read (0) or write (1), and 12931``locality`` is a temporal locality specifier ranging from (0) - no 12932locality, to (3) - extremely local keep in cache. The ``cache type`` 12933specifies whether the prefetch is performed on the data (1) or 12934instruction (0) cache. The ``rw``, ``locality`` and ``cache type`` 12935arguments must be constant integers. 12936 12937Semantics: 12938"""""""""" 12939 12940This intrinsic does not modify the behavior of the program. In 12941particular, prefetches cannot trap and do not produce a value. On 12942targets that support this intrinsic, the prefetch can provide hints to 12943the processor cache for better performance. 12944 12945'``llvm.pcmarker``' Intrinsic 12946^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12947 12948Syntax: 12949""""""" 12950 12951:: 12952 12953 declare void @llvm.pcmarker(i32 <id>) 12954 12955Overview: 12956""""""""" 12957 12958The '``llvm.pcmarker``' intrinsic is a method to export a Program 12959Counter (PC) in a region of code to simulators and other tools. The 12960method is target specific, but it is expected that the marker will use 12961exported symbols to transmit the PC of the marker. The marker makes no 12962guarantees that it will remain with any specific instruction after 12963optimizations. 
It is possible that the presence of a marker will inhibit 12964optimizations. The intended use is to be inserted after optimizations to 12965allow correlations of simulation runs. 12966 12967Arguments: 12968"""""""""" 12969 12970``id`` is a numerical id identifying the marker. 12971 12972Semantics: 12973"""""""""" 12974 12975This intrinsic does not modify the behavior of the program. Backends 12976that do not support this intrinsic may ignore it. 12977 12978'``llvm.readcyclecounter``' Intrinsic 12979^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 12980 12981Syntax: 12982""""""" 12983 12984:: 12985 12986 declare i64 @llvm.readcyclecounter() 12987 12988Overview: 12989""""""""" 12990 12991The '``llvm.readcyclecounter``' intrinsic provides access to the cycle 12992counter register (or similar low latency, high accuracy clocks) on those 12993targets that support it. On X86, it should map to RDTSC. On Alpha, it 12994should map to RPCC. As the backing counters overflow quickly (on the 12995order of 9 seconds on alpha), this should only be used for small 12996timings. 12997 12998Semantics: 12999"""""""""" 13000 13001When directly supported, reading the cycle counter should not modify any 13002memory. Implementations are allowed to either return an application 13003specific value or a system wide value. On backends without support, this 13004is lowered to a constant 0. 13005 13006Note that runtime support may be conditional on the privilege-level code is 13007running at and the host platform. 13008 13009'``llvm.clear_cache``' Intrinsic 13010^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13011 13012Syntax: 13013""""""" 13014 13015:: 13016 13017 declare void @llvm.clear_cache(i8*, i8*) 13018 13019Overview: 13020""""""""" 13021 13022The '``llvm.clear_cache``' intrinsic ensures visibility of modifications 13023in the specified range to the execution unit of the processor. On 13024targets with non-unified instruction and data cache, the implementation 13025flushes the instruction cache. 
13026 13027Semantics: 13028"""""""""" 13029 13030On platforms with coherent instruction and data caches (e.g. x86), this 13031intrinsic is a nop. On platforms with non-coherent instruction and data 13032cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate 13033instructions or a system call, if cache flushing requires special 13034privileges. 13035 13036The default behavior is to emit a call to ``__clear_cache`` from the run 13037time library. 13038 13039This intrinsic does *not* empty the instruction pipeline. Modifications 13040of the current function are outside the scope of the intrinsic. 13041 13042'``llvm.instrprof.increment``' Intrinsic 13043^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13044 13045Syntax: 13046""""""" 13047 13048:: 13049 13050 declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>, 13051 i32 <num-counters>, i32 <index>) 13052 13053Overview: 13054""""""""" 13055 13056The '``llvm.instrprof.increment``' intrinsic can be emitted by a 13057frontend for use with instrumentation based profiling. These will be 13058lowered by the ``-instrprof`` pass to generate execution counts of a 13059program at runtime. 13060 13061Arguments: 13062"""""""""" 13063 13064The first argument is a pointer to a global variable containing the 13065name of the entity being instrumented. This should generally be the 13066(mangled) function name for a set of counters. 13067 13068The second argument is a hash value that can be used by the consumer 13069of the profile data to detect changes to the instrumented source, and 13070the third is the number of counters associated with ``name``. It is an 13071error if ``hash`` or ``num-counters`` differ between two instances of 13072``instrprof.increment`` that refer to the same name. 13073 13074The last argument refers to which of the counters for ``name`` should 13075be incremented. It should be a value between 0 and ``num-counters``. 
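
As an illustration of the arguments described above, the following sketch
instruments a function with two counters. The global ``@profn_foo``, the
function ``@foo``, and the hash value ``12345`` are placeholders chosen for
this example; both calls use the same ``hash`` and ``num-counters`` because
they refer to the same name.

.. code-block:: llvm

      ; Placeholder global holding the name of the instrumented entity.
      @profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo(i1 %cond) {
      entry:
        ; Counter 0 of 2: counts entries to the function.
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @profn_foo, i32 0, i32 0),
            i64 12345, i32 2, i32 0)
        br i1 %cond, label %then, label %exit

      then:
        ; Counter 1 of 2: counts how often the branch is taken.
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @profn_foo, i32 0, i32 0),
            i64 12345, i32 2, i32 1)
        br label %exit

      exit:
        ret void
      }
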

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.

'``llvm.instrprof.increment.step``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
                                                  i32 <num-counters>,
                                                  i32 <index>, i64 <step>)

Overview:
"""""""""

The '``llvm.instrprof.increment.step``' intrinsic is an extension to
the '``llvm.instrprof.increment``' intrinsic with an additional fifth
argument to specify the step of the increment.

Arguments:
""""""""""

The first four arguments are the same as for the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.


'``llvm.instrprof.value.profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
                                                 i64 <value>, i32 <value_kind>,
                                                 i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.
13136 13137Arguments: 13138"""""""""" 13139 13140The first argument is a pointer to a global variable containing the 13141name of the entity being instrumented. ``name`` should generally be the 13142(mangled) function name for a set of counters. 13143 13144The second argument is a hash value that can be used by the consumer 13145of the profile data to detect changes to the instrumented source. It 13146is an error if ``hash`` differs between two instances of 13147``llvm.instrprof.*`` that refer to the same name. 13148 13149The third argument is the value of the expression being profiled. The profiled 13150expression's value should be representable as an unsigned 64-bit value. The 13151fourth argument represents the kind of value profiling that is being done. The 13152supported value profiling kinds are enumerated through the 13153``InstrProfValueKind`` type declared in the 13154``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the 13155index of the instrumented expression within ``name``. It should be >= 0. 13156 13157Semantics: 13158"""""""""" 13159 13160This intrinsic represents the point where a call to a runtime routine 13161should be inserted for value profiling of target expressions. ``-instrprof`` 13162pass will generate the appropriate data structures and replace the 13163``llvm.instrprof.value.profile`` intrinsic with the call to the profile 13164runtime library with proper arguments. 13165 13166'``llvm.thread.pointer``' Intrinsic 13167^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13168 13169Syntax: 13170""""""" 13171 13172:: 13173 13174 declare i8* @llvm.thread.pointer() 13175 13176Overview: 13177""""""""" 13178 13179The '``llvm.thread.pointer``' intrinsic returns the value of the thread 13180pointer. 13181 13182Semantics: 13183"""""""""" 13184 13185The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area 13186for the current thread. 
The exact semantics of this value are target
specific: it may point to the start of the TLS area, to the end, or somewhere
in the middle. Depending on the target, this intrinsic may read a register,
call a helper function, read from an alternate memory space, or perform
other operations necessary to locate the TLS area. Not all targets support
this intrinsic.

'``llvm.call.preallocated.setup``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token @llvm.call.preallocated.setup(i32 %num_args)

Overview:
"""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
be used with a call's ``"preallocated"`` operand bundle to indicate that
certain arguments are allocated and initialized before the call.

Semantics:
""""""""""

The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
``"preallocated"`` operand bundle for the corresponding call.

Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]

.. 
_int_call_preallocated_arg:

'``llvm.call.preallocated.arg``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)

Overview:
"""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
corresponding preallocated argument for the preallocated call.

Semantics:
""""""""""

The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
``%arg_index``\ th argument with the ``preallocated`` attribute for
the call associated with the ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.

A call to '``llvm.call.preallocated.arg``' must have a call site
``preallocated`` attribute. The type of the ``preallocated`` attribute must
match the type used by the ``preallocated`` attribute of the corresponding
argument at the preallocated call. The type is used in the case that an
``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
to DCE), where otherwise we cannot know how large the arguments are.

It is undefined behavior if this is called with a token from an
'``llvm.call.preallocated.setup``' if another
'``llvm.call.preallocated.setup``' has already been called or if the
preallocated call corresponding to the '``llvm.call.preallocated.setup``'
has already been called.

.. _int_call_preallocated_teardown:

'``llvm.call.preallocated.teardown``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.call.preallocated.teardown(token %setup_token)

Overview:
"""""""""

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
created by a '``llvm.call.preallocated.setup``'.
13294 13295Semantics: 13296"""""""""" 13297 13298The token argument must be a '``llvm.call.preallocated.setup``'. 13299 13300The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack 13301allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly 13302one of this or the preallocated call must be called to prevent stack leaks. 13303It is undefined behavior to call both a '``llvm.call.preallocated.teardown``' 13304and the preallocated call for a given '``llvm.call.preallocated.setup``'. 13305 13306For example, if the stack is allocated for a preallocated call by a 13307'``llvm.call.preallocated.setup``', then an initializer function called on an 13308allocated argument throws an exception, there should be a 13309'``llvm.call.preallocated.teardown``' in the exception handler to prevent 13310stack leaks. 13311 13312Following the nesting rules in '``llvm.call.preallocated.setup``', nested 13313calls to '``llvm.call.preallocated.setup``' and 13314'``llvm.call.preallocated.teardown``' are allowed but must be properly 13315nested. 13316 13317Example: 13318"""""""" 13319 13320.. code-block:: llvm 13321 13322 %cs = call token @llvm.call.preallocated.setup(i32 1) 13323 %x = call i8* @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32) 13324 %y = bitcast i8* %x to i32* 13325 invoke void @constructor(i32* %y) to label %conta unwind label %contb 13326 conta: 13327 call void @foo1(i32* preallocated(i32) %y) ["preallocated"(token %cs)] 13328 ret void 13329 contb: 13330 %s = catchswitch within none [label %catch] unwind to caller 13331 catch: 13332 %p = catchpad within %s [] 13333 call void @llvm.call.preallocated.teardown(token %cs) 13334 ret void 13335 13336Standard C/C++ Library Intrinsics 13337--------------------------------- 13338 13339LLVM provides intrinsics for a few important standard C/C++ library 13340functions. 
These intrinsics allow source-language front-ends to pass
information about the alignment of the pointer arguments to the code
generator, providing opportunity for more efficient code generation.


'``llvm.abs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.abs`` on any
integer bit width or any vector of integer elements.

::

      declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
      declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)

Overview:
"""""""""

The '``llvm.abs``' family of intrinsic functions returns the absolute value
of an argument.

Arguments:
""""""""""

The first argument is the value for which the absolute value is to be returned.
This argument may be of any integer type or a vector with integer element type.
The return type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether the
result value of the '``llvm.abs``' intrinsic is a
:ref:`poison value <poisonvalues>` if the argument is statically or dynamically
an ``INT_MIN`` value.

Semantics:
""""""""""

The '``llvm.abs``' intrinsic returns the magnitude (non-negative except for
``INT_MIN``, as described next) of the argument or of each element of a vector
argument. If the argument is ``INT_MIN``, then the result is also ``INT_MIN``
if ``is_int_min_poison == 0`` and ``poison`` otherwise.


'``llvm.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
integer bit width or any vector of integer elements.
13394 13395:: 13396 13397 declare i32 @llvm.smax.i32(i32 %a, i32 %b) 13398 declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b) 13399 13400Overview: 13401""""""""" 13402 13403Return the larger of ``%a`` and ``%b`` comparing the values as signed integers. 13404Vector intrinsics operate on a per-element basis. The larger element of ``%a`` 13405and ``%b`` at a given index is returned for that index. 13406 13407Arguments: 13408"""""""""" 13409 13410The arguments (``%a`` and ``%b``) may be of any integer type or a vector with 13411integer element type. The argument types must match each other, and the return 13412type must match the argument type. 13413 13414 13415'``llvm.smin.*``' Intrinsic 13416^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13417 13418Syntax: 13419""""""" 13420 13421This is an overloaded intrinsic. You can use ``@llvm.smin`` on any 13422integer bit width or any vector of integer elements. 13423 13424:: 13425 13426 declare i32 @llvm.smin.i32(i32 %a, i32 %b) 13427 declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b) 13428 13429Overview: 13430""""""""" 13431 13432Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers. 13433Vector intrinsics operate on a per-element basis. The smaller element of ``%a`` 13434and ``%b`` at a given index is returned for that index. 13435 13436Arguments: 13437"""""""""" 13438 13439The arguments (``%a`` and ``%b``) may be of any integer type or a vector with 13440integer element type. The argument types must match each other, and the return 13441type must match the argument type. 13442 13443 13444'``llvm.umax.*``' Intrinsic 13445^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13446 13447Syntax: 13448""""""" 13449 13450This is an overloaded intrinsic. You can use ``@llvm.umax`` on any 13451integer bit width or any vector of integer elements. 
13452 13453:: 13454 13455 declare i32 @llvm.umax.i32(i32 %a, i32 %b) 13456 declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b) 13457 13458Overview: 13459""""""""" 13460 13461Return the larger of ``%a`` and ``%b`` comparing the values as unsigned 13462integers. Vector intrinsics operate on a per-element basis. The larger element 13463of ``%a`` and ``%b`` at a given index is returned for that index. 13464 13465Arguments: 13466"""""""""" 13467 13468The arguments (``%a`` and ``%b``) may be of any integer type or a vector with 13469integer element type. The argument types must match each other, and the return 13470type must match the argument type. 13471 13472 13473'``llvm.umin.*``' Intrinsic 13474^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13475 13476Syntax: 13477""""""" 13478 13479This is an overloaded intrinsic. You can use ``@llvm.umin`` on any 13480integer bit width or any vector of integer elements. 13481 13482:: 13483 13484 declare i32 @llvm.umin.i32(i32 %a, i32 %b) 13485 declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b) 13486 13487Overview: 13488""""""""" 13489 13490Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned 13491integers. Vector intrinsics operate on a per-element basis. The smaller element 13492of ``%a`` and ``%b`` at a given index is returned for that index. 13493 13494Arguments: 13495"""""""""" 13496 13497The arguments (``%a`` and ``%b``) may be of any integer type or a vector with 13498integer element type. The argument types must match each other, and the return 13499type must match the argument type. 13500 13501 13502.. _int_memcpy: 13503 13504'``llvm.memcpy``' Intrinsic 13505^^^^^^^^^^^^^^^^^^^^^^^^^^^ 13506 13507Syntax: 13508""""""" 13509 13510This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any 13511integer bit width and for different address spaces. Not all targets 13512support all bit widths however. 

::

      declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                              i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location.

Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
location to the destination location, which must either be equal or
non-overlapping. They copy ``<len>`` bytes of memory. If an argument is known
to be aligned to some boundary, this can be specified as an attribute on the
argument.

If ``<len>`` is 0, the call is a no-op modulo the behavior of attributes
attached to the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
otherwise the behavior is undefined.

.. _int_memcpy_inline:

'``llvm.memcpy.inline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                                     i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                                     i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is a constant integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. They copy ``<len>`` bytes of memory.
If the argument is known
to be aligned to some boundary, this can be specified as an attribute on
the argument.
The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
'``llvm.memcpy.*``', but the generated code is guaranteed not to call any
external functions.

.. _int_memmove:

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.

::

      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                               i32 <len>, i1 <isvolatile>)
      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                               i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. They are similar to the
'``llvm.memcpy``' intrinsic but allow the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.
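
As a rough illustration of the argument list described above, the following
sketch shifts a 16-byte window within a single buffer, so the source and
destination ranges overlap. The function name ``@shift_window``, the offsets,
and the ``align 1`` attributes are assumptions chosen for the example and are
not required by the intrinsic.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      ; Hypothetical helper: move bytes [4, 20) of %buf to [0, 16).
      define void @shift_window(i8* %buf) {
        %src = getelementptr inbounds i8, i8* %buf, i64 4
        ; Fourth operand 'i1 false' requests a non-volatile access.
        call void @llvm.memmove.p0i8.p0i8.i64(i8* align 1 %buf, i8* align 1 %src,
                                              i64 16, i1 false)
        ret void
      }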

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. They
copy ``<len>`` bytes of memory. If an argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, the call is a no-op modulo the behavior of attributes
attached to the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
otherwise the behavior is undefined.

.. _int_memset:

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
                                         i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
                                         i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra ``isvolatile``
argument. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
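
For instance, a call that zeroes 32 bytes and conveys an 8-byte alignment of
the destination might look like the sketch below. The function name
``@zero_block`` and the particular length and alignment are assumptions made
for illustration only.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      ; Hypothetical helper: fill 32 bytes at %p with the byte value 0.
      define void @zero_block(i8* %p) {
        ; 'align 8' is a call-site parameter attribute on <dest>.
        call void @llvm.memset.p0i8.i64(i8* align 8 %p, i8 0, i64 32, i1 false)
        ret void
      }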

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill ``<len>`` bytes of memory starting
at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, the call is a no-op modulo the behavior of attributes
attached to the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined,
otherwise the behavior is undefined.

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sqrt.f32(float %Val)
      declare double    @llvm.sqrt.f64(double %Val)
      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128     @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the square root of the specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sqrt``' function but without
trapping or setting ``errno``. For types specified by IEEE-754, the result
matches a conforming libm implementation.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

Generally, the only supported type for the exponent is the one matching
with the C type ``int``.

::

      declare float     @llvm.powi.f32.i32(float %Val, i32 %power)
      declare double    @llvm.powi.f64.i16(double %Val, i16 %power)
      declare x86_fp80  @llvm.powi.f80.i32(x86_fp80 %Val, i32 %power)
      declare fp128     @llvm.powi.f128.i32(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating-point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.

'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sin.f32(float %Val)
      declare double    @llvm.sin.f64(double %Val)
      declare x86_fp80  @llvm.sin.f80(x86_fp80 %Val)
      declare fp128     @llvm.sin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sin.*``' intrinsics return the sine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sin``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.cos.f32(float %Val)
      declare double    @llvm.cos.f64(double %Val)
      declare x86_fp80  @llvm.cos.f80(x86_fp80 %Val)
      declare fp128     @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
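
The sketch below shows one way the 'afn' flag mentioned above might appear at a
call site, next to a call without it. The function names ``@cos_default`` and
``@cos_approx`` are hypothetical; whether a target actually selects a cheaper
approximation for the flagged call is not implied by this example.

.. code-block:: llvm

      declare float @llvm.cos.f32(float)

      ; No fast-math flags: the result follows the libm-equivalent semantics.
      define float @cos_default(float %x) {
        %r = call float @llvm.cos.f32(float %x)
        ret float %r
      }

      ; 'afn' permits (but does not require) a less accurate approximation.
      define float @cos_approx(float %x) {
        %r = call afn float @llvm.cos.f32(float %x)
        ret float %r
      }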

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.pow.f32(float %Val, float %Power)
      declare double    @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80  @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``pow``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.exp.f32(float %Val)
      declare double    @llvm.exp.f64(double %Val)
      declare x86_fp80  @llvm.exp.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.exp2.f32(float %Val)
      declare double    @llvm.exp2.f64(double %Val)
      declare x86_fp80  @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log.f32(float %Val)
      declare double    @llvm.log.f64(double %Val)
      declare x86_fp80  @llvm.log.f80(x86_fp80 %Val)
      declare fp128     @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log10.f32(float %Val)
      declare double    @llvm.log10.f64(double %Val)
      declare x86_fp80  @llvm.log10.f80(x86_fp80 %Val)
      declare fp128     @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
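
The declarations above list only scalar overloads. As a minimal sketch of the
vector form that the Syntax text allows, the example below uses a
``<4 x float>`` overload; the function name ``@log10_lanes`` and the choice of
vector width are assumptions for illustration, and target support for this
type is not implied.

.. code-block:: llvm

      declare <4 x float> @llvm.log10.v4f32(<4 x float>)

      ; Hypothetical helper: apply log10 to each lane of %v.
      define <4 x float> @log10_lanes(<4 x float> %v) {
        %r = call <4 x float> @llvm.log10.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }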

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.log2.f32(float %Val)
      declare double    @llvm.log2.f64(double %Val)
      declare x86_fp80  @llvm.log2.f80(x86_fp80 %Val)
      declare fp128     @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

.. _int_fma:

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.fma.f32(float %a, float %b, float %c)
      declare double    @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
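
As a small sketch, a call computing ``(%a * %b) + %c`` in one fused step could
be written as follows. The function name ``@mul_add`` is hypothetical and the
``double`` overload is chosen only for illustration.

.. code-block:: llvm

      declare double @llvm.fma.f64(double, double, double)

      ; Hypothetical helper: returns (%a * %b) + %c as a fused operation.
      define double @mul_add(double %a, double %b, double %c) {
        %r = call double @llvm.fma.f64(double %a, double %b, double %c)
        ret double %r
      }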

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``fma``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.fabs.f32(float %Val)
      declare double    @llvm.fabs.f64(double %Val)
      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128     @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.


Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, except for the handling of
signaling NaNs. This matches the behavior of libm's fmin.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. The returned NaN is always
quiet. If the operands compare equal, returns a value that compares
equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
return either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).


'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.


Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
signaling NaNs. This matches the behavior of libm's fmax.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. The returned NaN is always
quiet. If the operands compare equal, returns a value that compares
equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
return either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between
signaling and quiet NaN inputs. If a target's implementation follows
the standard and returns a quiet NaN if either input is a signaling
NaN, the intrinsic lowering is responsible for quieting the inputs to
correctly return the non-NaN input (e.g. by using the equivalent of
``llvm.canonicalize``).

'``llvm.minimum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.minimum.f32(float %Val0, float %Val1)
      declare double    @llvm.minimum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minimum.*``' intrinsics return the minimum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.


Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2018.

'``llvm.maximum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.maximum.f32(float %Val0, float %Val1)
      declare double    @llvm.maximum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maximum.*``' intrinsics return the maximum of the two
arguments, propagating NaNs and treating -0.0 as less than +0.0.


Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.
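
The two families of maximum intrinsics take the same arguments but are
documented with different NaN handling. The sketch below places a
``llvm.maxnum`` call and a ``llvm.maximum`` call side by side; the function
names ``@max_ignoring_nan`` and ``@max_propagating_nan`` are hypothetical
labels that paraphrase the documented behavior.

.. code-block:: llvm

      declare float @llvm.maxnum.f32(float, float)
      declare float @llvm.maximum.f32(float, float)

      ; maxnum: if exactly one operand is NaN, the other operand is returned.
      define float @max_ignoring_nan(float %x, float %y) {
        %r = call float @llvm.maxnum.f32(float %x, float %y)
        ret float %r
      }

      ; maximum: NaNs propagate, and -0.0 is treated as less than +0.0.
      define float @max_propagating_nan(float %x, float %y) {
        %r = call float @llvm.maximum.f32(float %x, float %y)
        ret float %r
      }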

Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2018.

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80  @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.floor.f32(float %Val)
      declare double    @llvm.floor.f64(double %Val)
      declare x86_fp80  @llvm.floor.f80(x86_fp80 %Val)
      declare fp128     @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.ceil.f32(float %Val)
      declare double    @llvm.ceil.f64(double %Val)
      declare x86_fp80  @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128     @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.trunc.f32(float %Val)
      declare double    @llvm.trunc.f64(double %Val)
      declare x86_fp80  @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128     @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.rint.f32(float %Val)
      declare double    @llvm.rint.f64(double %Val)
      declare x86_fp80  @llvm.rint.f80(x86_fp80 %Val)
      declare fp128     @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. It may raise an inexact floating-point exception if the
operand isn't an integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.
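
As a brief sketch of how these rounding intrinsics are called, the example
below uses ``llvm.trunc`` to split off a fractional part and ``llvm.rint`` to
round a value. The function names ``@frac_part`` and ``@round_to_int`` are
hypothetical, and the ``double`` overloads are chosen only for illustration.

.. code-block:: llvm

      declare double @llvm.trunc.f64(double)
      declare double @llvm.rint.f64(double)

      ; Hypothetical helper: %x minus its integer part (rounded toward zero).
      define double @frac_part(double %x) {
        %t = call double @llvm.trunc.f64(double %x)
        %f = fsub double %x, %t
        ret double %f
      }

      ; Hypothetical helper: round %x to an integral value with llvm.rint.
      define double @round_to_int(double %x) {
        %r = call double @llvm.rint.f64(double %x)
        ret double %r
      }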

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.nearbyint.f32(float %Val)
      declare double    @llvm.nearbyint.f64(double %Val)
      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.round.f32(float %Val)
      declare double    @llvm.round.f64(double %Val)
      declare x86_fp80  @llvm.round.f80(x86_fp80 %Val)
      declare fp128     @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, with halfway cases rounded away from zero.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.

'``llvm.roundeven.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.roundeven.f32(float %Val)
      declare double    @llvm.roundeven.f64(double %Val)
      declare x86_fp80  @llvm.roundeven.f80(x86_fp80 %Val)
      declare fp128     @llvm.roundeven.f128(fp128 %Val)
      declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to the
nearest value that is an even integer).

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as the C standard function ``roundeven``, except that
it does not raise floating-point exceptions.

'``llvm.lround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lround`` on any
floating-point type. Not all targets support all types however.

::

      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lround``
functions would, but without setting errno.

'``llvm.llround.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llround`` on any
floating-point type. Not all targets support all types however.

::

      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
integer with ties away from zero.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``llround``
functions would, but without setting errno.

'``llvm.lrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
floating-point type. Not all targets support all types however.

::

      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
integer.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint``
functions would, but without setting errno.

'``llvm.llrint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
floating-point type. Not all targets support all types however.

::

      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
integer.

Arguments:
""""""""""

The argument is a floating-point number and the return value is an integer
type.

Semantics:
""""""""""

This function returns the same values as the libm ``llrint``
functions would, but without setting errno.

Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.

'``llvm.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bitreverse``
on any integer type.

::

      declare i16 @llvm.bitreverse.i16(i16 <id>)
      declare i32 @llvm.bitreverse.i32(i32 <id>)
      declare i64 @llvm.bitreverse.i64(i64 <id>)
      declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)

Overview:
"""""""""

The '``llvm.bitreverse``' family of intrinsics is used to reverse the
bitpattern of an integer value or vector of integer values; for example
``0b10110110`` becomes ``0b01101101``.

Semantics:
""""""""""

The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output, with bits
numbered from 0. The vector intrinsics, such as
``llvm.bitreverse.v4i32``, operate on a per-element basis and the
element order is not affected.
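
As a hedged illustration of the per-bit mapping (the function name
``@bitreverse_demo`` is an assumption, not part of the intrinsic
definition), reversing the ``i8`` value ``0b10110110`` (182) moves bit 7 of
the input to bit 0 of the output and yields ``0b01101101`` (109):

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i8 @bitreverse_demo() {
        %r = call i8 @llvm.bitreverse.i8(i8 182) ; %r = i8: 109 (0b01101101)
        ret i8 %r
      }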

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. ``BitWidth % 16 == 0``).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)
      declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
value or vector of integer values with an even number of bytes (positive
multiple of 16 bits).

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order. The
``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
operate on a per-element basis and the element order is not affected.

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8   @llvm.ctpop.i8(i8 <src>)
      declare i16  @llvm.ctpop.i16(i16 <src>)
      declare i32  @llvm.ctpop.i32(i32 <src>)
      declare i64  @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
each element of a vector.

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.

::

      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of the vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(2) = 1``.

.. _int_overflow:

'``llvm.fshl.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted left, and the most
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate left operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)

'``llvm.fshr.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted right, and the least
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate right operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct. The first
element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.
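
A minimal sketch of these two rules on ``i8`` operands follows; the function
name ``@overflow_demo`` and the constants are assumptions chosen for
illustration. ``127 + 1`` does not fit in a signed ``i8``, so the first
element wraps to ``-128`` and the overflow bit is set, while ``100 + 27``
fits and leaves the bit clear.

.. code-block:: llvm

      declare {i8, i1} @llvm.sadd.with.overflow.i8(i8, i8)

      define i1 @overflow_demo() {
        %wrap  = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 127, i8 1)
        %w.val = extractvalue {i8, i1} %wrap, 0 ; %w.val = -128 (same as a plain add)
        %w.bit = extractvalue {i8, i1} %wrap, 1 ; %w.bit = 1
        %fits  = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 100, i8 27)
        %f.val = extractvalue {i8, i1} %fits, 0 ; %f.val = 127
        %f.bit = extractvalue {i8, i1} %fits, 1 ; %f.bit = 0
        ret i1 %w.bit
      }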
15134 15135'``llvm.sadd.with.overflow.*``' Intrinsics 15136^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15137 15138Syntax: 15139""""""" 15140 15141This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow`` 15142on any integer bit width or vectors of integers. 15143 15144:: 15145 15146 declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b) 15147 declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) 15148 declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b) 15149 declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) 15150 15151Overview: 15152""""""""" 15153 15154The '``llvm.sadd.with.overflow``' family of intrinsic functions perform 15155a signed addition of the two arguments, and indicate whether an overflow 15156occurred during the signed summation. 15157 15158Arguments: 15159"""""""""" 15160 15161The arguments (%a and %b) and the first element of the result structure 15162may be of integer types of any bit width, but they must have the same 15163bit width. The second element of the result structure must be of type 15164``i1``. ``%a`` and ``%b`` are the two values that will undergo signed 15165addition. 15166 15167Semantics: 15168"""""""""" 15169 15170The '``llvm.sadd.with.overflow``' family of intrinsic functions perform 15171a signed addition of the two variables. They return a structure --- the 15172first element of which is the signed summation, and the second element 15173of which is a bit specifying if the signed summation resulted in an 15174overflow. 15175 15176Examples: 15177""""""""" 15178 15179.. 
code-block:: llvm 15180 15181 %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) 15182 %sum = extractvalue {i32, i1} %res, 0 15183 %obit = extractvalue {i32, i1} %res, 1 15184 br i1 %obit, label %overflow, label %normal 15185 15186'``llvm.uadd.with.overflow.*``' Intrinsics 15187^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15188 15189Syntax: 15190""""""" 15191 15192This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow`` 15193on any integer bit width or vectors of integers. 15194 15195:: 15196 15197 declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b) 15198 declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b) 15199 declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b) 15200 declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) 15201 15202Overview: 15203""""""""" 15204 15205The '``llvm.uadd.with.overflow``' family of intrinsic functions perform 15206an unsigned addition of the two arguments, and indicate whether a carry 15207occurred during the unsigned summation. 15208 15209Arguments: 15210"""""""""" 15211 15212The arguments (%a and %b) and the first element of the result structure 15213may be of integer types of any bit width, but they must have the same 15214bit width. The second element of the result structure must be of type 15215``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned 15216addition. 15217 15218Semantics: 15219"""""""""" 15220 15221The '``llvm.uadd.with.overflow``' family of intrinsic functions perform 15222an unsigned addition of the two arguments. They return a structure --- the 15223first element of which is the sum, and the second element of which is a 15224bit specifying if the unsigned summation resulted in a carry. 15225 15226Examples: 15227""""""""" 15228 15229.. 
code-block:: llvm 15230 15231 %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b) 15232 %sum = extractvalue {i32, i1} %res, 0 15233 %obit = extractvalue {i32, i1} %res, 1 15234 br i1 %obit, label %carry, label %normal 15235 15236'``llvm.ssub.with.overflow.*``' Intrinsics 15237^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15238 15239Syntax: 15240""""""" 15241 15242This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow`` 15243on any integer bit width or vectors of integers. 15244 15245:: 15246 15247 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b) 15248 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b) 15249 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b) 15250 declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) 15251 15252Overview: 15253""""""""" 15254 15255The '``llvm.ssub.with.overflow``' family of intrinsic functions perform 15256a signed subtraction of the two arguments, and indicate whether an 15257overflow occurred during the signed subtraction. 15258 15259Arguments: 15260"""""""""" 15261 15262The arguments (%a and %b) and the first element of the result structure 15263may be of integer types of any bit width, but they must have the same 15264bit width. The second element of the result structure must be of type 15265``i1``. ``%a`` and ``%b`` are the two values that will undergo signed 15266subtraction. 15267 15268Semantics: 15269"""""""""" 15270 15271The '``llvm.ssub.with.overflow``' family of intrinsic functions perform 15272a signed subtraction of the two arguments. They return a structure --- the 15273first element of which is the subtraction, and the second element of 15274which is a bit specifying if the signed subtraction resulted in an 15275overflow. 15276 15277Examples: 15278""""""""" 15279 15280.. 
code-block:: llvm 15281 15282 %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b) 15283 %sum = extractvalue {i32, i1} %res, 0 15284 %obit = extractvalue {i32, i1} %res, 1 15285 br i1 %obit, label %overflow, label %normal 15286 15287'``llvm.usub.with.overflow.*``' Intrinsics 15288^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15289 15290Syntax: 15291""""""" 15292 15293This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow`` 15294on any integer bit width or vectors of integers. 15295 15296:: 15297 15298 declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b) 15299 declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b) 15300 declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b) 15301 declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) 15302 15303Overview: 15304""""""""" 15305 15306The '``llvm.usub.with.overflow``' family of intrinsic functions perform 15307an unsigned subtraction of the two arguments, and indicate whether an 15308overflow occurred during the unsigned subtraction. 15309 15310Arguments: 15311"""""""""" 15312 15313The arguments (%a and %b) and the first element of the result structure 15314may be of integer types of any bit width, but they must have the same 15315bit width. The second element of the result structure must be of type 15316``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned 15317subtraction. 15318 15319Semantics: 15320"""""""""" 15321 15322The '``llvm.usub.with.overflow``' family of intrinsic functions perform 15323an unsigned subtraction of the two arguments. They return a structure --- 15324the first element of which is the subtraction, and the second element of 15325which is a bit specifying if the unsigned subtraction resulted in an 15326overflow. 15327 15328Examples: 15329""""""""" 15330 15331.. 
code-block:: llvm 15332 15333 %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b) 15334 %sum = extractvalue {i32, i1} %res, 0 15335 %obit = extractvalue {i32, i1} %res, 1 15336 br i1 %obit, label %overflow, label %normal 15337 15338'``llvm.smul.with.overflow.*``' Intrinsics 15339^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15340 15341Syntax: 15342""""""" 15343 15344This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow`` 15345on any integer bit width or vectors of integers. 15346 15347:: 15348 15349 declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b) 15350 declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b) 15351 declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b) 15352 declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) 15353 15354Overview: 15355""""""""" 15356 15357The '``llvm.smul.with.overflow``' family of intrinsic functions perform 15358a signed multiplication of the two arguments, and indicate whether an 15359overflow occurred during the signed multiplication. 15360 15361Arguments: 15362"""""""""" 15363 15364The arguments (%a and %b) and the first element of the result structure 15365may be of integer types of any bit width, but they must have the same 15366bit width. The second element of the result structure must be of type 15367``i1``. ``%a`` and ``%b`` are the two values that will undergo signed 15368multiplication. 15369 15370Semantics: 15371"""""""""" 15372 15373The '``llvm.smul.with.overflow``' family of intrinsic functions perform 15374a signed multiplication of the two arguments. They return a structure --- 15375the first element of which is the multiplication, and the second element 15376of which is a bit specifying if the signed multiplication resulted in an 15377overflow. 15378 15379Examples: 15380""""""""" 15381 15382.. 
code-block:: llvm 15383 15384 %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b) 15385 %sum = extractvalue {i32, i1} %res, 0 15386 %obit = extractvalue {i32, i1} %res, 1 15387 br i1 %obit, label %overflow, label %normal 15388 15389'``llvm.umul.with.overflow.*``' Intrinsics 15390^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15391 15392Syntax: 15393""""""" 15394 15395This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow`` 15396on any integer bit width or vectors of integers. 15397 15398:: 15399 15400 declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b) 15401 declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) 15402 declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b) 15403 declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b) 15404 15405Overview: 15406""""""""" 15407 15408The '``llvm.umul.with.overflow``' family of intrinsic functions perform 15409a unsigned multiplication of the two arguments, and indicate whether an 15410overflow occurred during the unsigned multiplication. 15411 15412Arguments: 15413"""""""""" 15414 15415The arguments (%a and %b) and the first element of the result structure 15416may be of integer types of any bit width, but they must have the same 15417bit width. The second element of the result structure must be of type 15418``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned 15419multiplication. 15420 15421Semantics: 15422"""""""""" 15423 15424The '``llvm.umul.with.overflow``' family of intrinsic functions perform 15425an unsigned multiplication of the two arguments. They return a structure --- 15426the first element of which is the multiplication, and the second 15427element of which is a bit specifying if the unsigned multiplication 15428resulted in an overflow. 15429 15430Examples: 15431""""""""" 15432 15433.. 
code-block:: llvm 15434 15435 %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) 15436 %sum = extractvalue {i32, i1} %res, 0 15437 %obit = extractvalue {i32, i1} %res, 1 15438 br i1 %obit, label %overflow, label %normal 15439 15440Saturation Arithmetic Intrinsics 15441--------------------------------- 15442 15443Saturation arithmetic is a version of arithmetic in which operations are 15444limited to a fixed range between a minimum and maximum value. If the result of 15445an operation is greater than the maximum value, the result is set (or 15446"clamped") to this maximum. If it is below the minimum, it is clamped to this 15447minimum. 15448 15449 15450'``llvm.sadd.sat.*``' Intrinsics 15451^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15452 15453Syntax 15454""""""" 15455 15456This is an overloaded intrinsic. You can use ``llvm.sadd.sat`` 15457on any integer bit width or vectors of integers. 15458 15459:: 15460 15461 declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b) 15462 declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b) 15463 declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b) 15464 declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b) 15465 15466Overview 15467""""""""" 15468 15469The '``llvm.sadd.sat``' family of intrinsic functions perform signed 15470saturating addition on the 2 arguments. 15471 15472Arguments 15473"""""""""" 15474 15475The arguments (%a and %b) and the result may be of integer types of any bit 15476width, but they must have the same bit width. ``%a`` and ``%b`` are the two 15477values that will undergo signed addition. 15478 15479Semantics: 15480"""""""""" 15481 15482The maximum value this operation can clamp to is the largest signed value 15483representable by the bit width of the arguments. The minimum value is the 15484smallest signed value representable by this bit width. 15485 15486 15487Examples 15488""""""""" 15489 15490.. 
.. code-block:: llvm

      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8


'``llvm.uadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
saturating addition on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments. Because this is an unsigned
operation, the result will never saturate towards zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15


'``llvm.ssub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ssub.sat``' family of intrinsic functions perform signed
saturating subtraction on the two arguments.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed subtraction.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7


'``llvm.usub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
saturating subtraction on the two arguments.
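
As a rough illustration only, and not necessarily how any backend lowers the
intrinsic, an unsigned saturating subtraction on ``i32`` could be written with
ordinary instructions as sketched here. The function name ``@usub_sat_sketch``
is hypothetical and exists only for this example.

.. code-block:: llvm

      ; Hypothetical helper: intended to behave like
      ; @llvm.usub.sat.i32(i32 %a, i32 %b).
      define i32 @usub_sat_sketch(i32 %a, i32 %b) {
        %no_wrap = icmp ugt i32 %a, %b               ; true when a - b does not wrap
        %diff = sub i32 %a, %b                       ; wraps around when a < b
        %res = select i1 %no_wrap, i32 %diff, i32 0  ; clamp the wrapped case to 0
        ret i32 %res
      }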

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned subtraction.

Semantics:
""""""""""

The minimum value this operation can clamp to is 0, which is the smallest
unsigned value representable by the bit width of the unsigned arguments.
Because this is an unsigned operation, the result will never saturate towards
the largest possible value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0


'``llvm.sshl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.sshl.sat``' family of intrinsic functions perform signed
saturating left shift on the first argument.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`.
If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.


Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2


'``llvm.ushl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview
"""""""""

The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
saturating left shift on the first argument.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
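
The following is a minimal sketch, assuming an ``i32`` operand, of how an
unsigned saturating left shift might be expressed with ordinary instructions.
The function name ``@ushl_sat_sketch`` is hypothetical, and the sketch is not
meant to describe the lowering used by any particular target. An out-of-range
shift amount still yields poison here, because the plain ``shl`` does.

.. code-block:: llvm

      ; Hypothetical helper: intended to behave like
      ; @llvm.ushl.sat.i32(i32 %a, i32 %b).
      define i32 @ushl_sat_sketch(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b                     ; plain shift, may drop high bits
        %back = lshr i32 %shl, %b                 ; shift back to detect dropped bits
        %lost = icmp ne i32 %back, %a             ; true if any set bit was shifted out
        %res = select i1 %lost, i32 -1, i32 %shl  ; -1 is the all-ones (largest unsigned) value
        ret i32 %res
      }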

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15


Fixed Point Arithmetic Intrinsics
---------------------------------

A fixed point number represents a real data type for a number that has a fixed
number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. These
are useful for representing fractional values to a specific precision. The
following intrinsics perform fixed point arithmetic operations on 2 operands
of the same scale, specified as the third argument.

The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
of fixed point numbers through scaled integers. Therefore, fixed point
multiplication can be represented as

.. code-block:: llvm

      %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %mul = mul nsw i8 %a2, %b2
      %scale2 = trunc i32 %scale to i8
      %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
      %result = trunc i8 %r to i4

The ``llvm.*div.fix`` family of intrinsic functions represents a division of
fixed point numbers through scaled integers. Fixed point division can be
represented as:

..
.. code-block:: llvm

      %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %scale2 = trunc i32 %scale to i8
      %a3 = shl i8 %a2, %scale2
      %r = sdiv i8 %a3, %b2  ; this is for a target rounding towards zero
      %result = trunc i8 %r to i4

For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is not fixed
by the IR, since the preferred rounding may vary between targets; it is instead
specified through a target hook. Different pipelines should legalize or
optimize this using the rounding specified by this hook if it is provided.
Operations like constant folding, instruction combining, KnownBits, and
ValueTracking should also use this hook, if provided, and not assume the
direction of rounding. A rounded result must always be within one unit of
precision from the true result. That is, the error between the returned result
and the true result must be less than 1/2^(scale).


'``llvm.smul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix``' family of intrinsic functions perform signed
fixed point multiplication on 2 arguments of the same scale.
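
As one illustration of typical use, the sketch below multiplies two values
held in a signed 32-bit format with 16 fractional bits (often written Q16.16).
The function name ``@q16_mul`` and the choice of scale are assumptions made
for this example only.

.. code-block:: llvm

      declare i32 @llvm.smul.fix.i32(i32, i32, i32)

      ; Hypothetical helper: multiplies two Q16.16 fixed point values.
      define i32 @q16_mul(i32 %a, i32 %b) {
        ; The scale (16) is passed as a constant integer.
        %res = call i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 16)
        ret i32 %res
      }

For instance, 1.5 is encoded as 98304 and 2.0 as 131072 at this scale, and
their product 3.0 is encoded as 196608.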

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also work
with int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)


'``llvm.umul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
fixed point multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also work
with int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs unsigned fixed point multiplication on the 2 arguments
of a specified scale. The result will also be returned in the same scale
specified in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples
"""""""""

..
.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 3.5 or up to 4
      %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)


'``llvm.smul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
fixed point saturating multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is
the smallest signed value representable by this bit width.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)

      ; Saturation
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7

      ; Scale can affect the saturation result
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)


'``llvm.umul.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturating multiplication on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is
the smallest unsigned value representable by this bit width (zero).


Examples
"""""""""

..
.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 2 or up to 2.5
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)

      ; Saturation
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)

      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)


'``llvm.sdiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
fixed point division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also work
with int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division.
The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.udiv.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
fixed point division on 2 arguments of the same scale.
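
A minimal sketch of one possible expansion, assuming ``i32`` operands, a scale
of 16, and rounding towards zero, is given here. The function name
``@udiv_fix_q16_sketch`` is hypothetical, and a real target may round
differently.

.. code-block:: llvm

      ; Hypothetical helper: one possible expansion of
      ; @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 16).
      define i32 @udiv_fix_q16_sketch(i32 %a, i32 %b) {
        %a2 = zext i32 %a to i64
        %b2 = zext i32 %b to i64
        %a3 = shl i64 %a2, 16       ; pre-scale the dividend so the quotient keeps the scale
        %q = udiv i64 %a3, %b2      ; undefined behavior if %b is zero
        %res = trunc i64 %q to i32  ; only meaningful if the quotient fits in 32 bits
        ret i32 %res
      }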

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also work
with int vectors of the same length and int size. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type, or if the second argument is zero.


Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      ; i4 -8 has the same bit pattern as the unsigned value 8
      %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4)  ; %res = 2 (0.0625 / 0.5 = 0.125)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)


'``llvm.sdiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the first 2 arguments. The minimum value is
the smallest signed value representable by this bit width.

It is undefined behavior if the second argument is zero.


Examples
"""""""""

..
.. code-block:: llvm

      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 / -1 = -1.5)

      ; The result in the following could be rounded up to 1 or down to 0.5
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)


'``llvm.udiv.fix.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax
"""""""

This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview
"""""""""

The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
fixed point saturating division on 2 arguments of the same scale.

Arguments
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
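
As a sketch under stated assumptions (``i32`` operands, a scale of 16, rounding
towards zero), the saturating form could be modeled by computing the quotient
in a wider type and clamping it before truncation. The function name
``@udiv_fix_sat_q16_sketch`` is hypothetical and this is not presented as the
expansion any target actually uses.

.. code-block:: llvm

      ; Hypothetical helper: one possible expansion of
      ; @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 16).
      define i32 @udiv_fix_sat_q16_sketch(i32 %a, i32 %b) {
        %a2 = zext i32 %a to i64
        %b2 = zext i32 %b to i64
        %a3 = shl i64 %a2, 16                    ; pre-scale the dividend
        %q = udiv i64 %a3, %b2                   ; undefined behavior if %b is zero
        %too_big = icmp ugt i64 %q, 4294967295   ; above the largest unsigned i32 value
        %clamped = select i1 %too_big, i64 4294967295, i64 %q
        %res = trunc i64 %clamped to i32
        ret i32 %res
      }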

Semantics:
""""""""""

This operation performs fixed point division on the 2 arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the first 2 arguments. The minimum value is
the smallest unsigned value representable by this bit width (zero).

It is undefined behavior if the second argument is zero.

Examples
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)

      ; The result in the following could be rounded down to 0.5 or up to 1
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)

      ; Saturation
      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)


Specialised Arithmetic Intrinsics
---------------------------------

.. _i_intr_llvm_canonicalize:

'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as frexp.
The canonical encoding is
defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division by
1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
-0.0 is also sufficient provided that the rounding mode is not -Infinity.

``@llvm.canonicalize`` must preserve the equality relation.
That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.

The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating-point value are not examined.

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, a and b, and an addend c.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(%a, %b, %c)

is equivalent to the expression a \* b + c, except that it is unspecified
whether rounding will be performed between the multiplication and addition
steps. Fusion is not guaranteed, even if the target platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets errno.

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c


Hardware-Loop Intrinsics
------------------------

LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required to lower these intrinsics further to
target specific instructions, or revert the hardware-loop to a normal loop if
target specific restrictions are not met and a hardware-loop can't be generated.

These intrinsics may be modified in the future and are not intended to be used
outside the backend. Thus, front-end and mid-level optimizations should not be
generating these intrinsics.


'``llvm.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare void @llvm.set.loop.iterations.i32(i32)
      declare void @llvm.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
hardware-loop trip count. They are placed in the loop preheader basic block and
are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.


'``llvm.start.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i64 @llvm.start.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics, used to specify the
hardware-loop trip count but also produce a value identical to the input
that can be used as the input to the loop. They are placed in the loop
preheader basic block and the output is expected to be the input to the
phi for the induction variable of the loop, decremented by the
'``llvm.loop.decrement.reg.*``' intrinsic.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
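
The counting scheme described above can be modelled outside of LLVM. The
following Python sketch is only an illustration, not part of LLVM: the function
name ``run_counted_loop`` is hypothetical, the decrement of 1 per iteration is
an assumption, and the loop is assumed to execute at least once (a trip count
of zero would need the ``test`` variants of these intrinsics).

.. code-block:: python

    def run_counted_loop(trip_count, body):
        """Model a do-while style hardware loop; assumes trip_count >= 1."""
        if trip_count < 1:
            raise ValueError("this model assumes the loop runs at least once")
        # The start intrinsic performs no arithmetic: the counter simply
        # starts out as the trip count.
        remaining = trip_count
        executed = 0
        while True:
            body(executed)
            executed += 1
            # Models the per-iteration decrement (assumed step of 1), which
            # must not wrap.
            remaining -= 1
            # Models the compare-and-branch that controls the back edge.
            if remaining == 0:
                break
        return executed

Calling ``run_counted_loop(3, print)`` runs the body three times, mirroring a
trip count (not a back-edge taken count) of 3.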

'``llvm.test.set.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.test.set.loop.iterations.i32(i32)
      declare i1 @llvm.test.set.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
it to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is the conditional value of whether the given count is
not zero.


'``llvm.test.start.loop.iterations.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
      declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)

Overview:
"""""""""

The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics, used to specify the hardware-loop trip count, but also produce a
value identical to the input that can be used as the input to the loop. The
second i1 output controls entry to a while-loop.

Arguments:
""""""""""

The integer operand is the loop trip count of the hardware-loop, and thus
not e.g. the loop back-edge taken count.

Semantics:
""""""""""

The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair of the input and a conditional value of
whether the given count is not zero.


'``llvm.loop.decrement.reg.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
      declare i64 @llvm.loop.decrement.reg.i64(i64, i64)

Overview:
"""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
iteration counter and return an updated value that will be used in the next
loop test check.

Arguments:
""""""""""

Both arguments must have identical integer types. The first operand is the
loop iteration counter. The second operand is the maximum number of elements
processed in an iteration.

Semantics:
""""""""""

The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisations are allowed to treat it as a ``SUB``, and it is supported by
SCEV, so it's the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.


'``llvm.loop.decrement.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic.

::

      declare i1 @llvm.loop.decrement.i32(i32)
      declare i1 @llvm.loop.decrement.i64(i64)

Overview:
"""""""""

The HardwareLoops pass allows the loop decrement value to be specified with an
option. It defaults to a loop decrement value of 1, but it can be an unsigned
integer value provided by this option. The '``llvm.loop.decrement.*``'
intrinsics decrement the loop iteration counter with this value, and return a
false predicate if the loop should exit, and true otherwise.
This is emitted if the loop counter is not updated via a ``PHI`` node, which
can also be controlled with an option.

Arguments:
""""""""""

The integer argument is the loop decrement value used to decrement the loop
iteration counter.

Semantics:
""""""""""

The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap.
The result is a condition
that is used by the conditional branch controlling the loop.


Vector Reduction Intrinsics
---------------------------

Horizontal reductions of vectors can be expressed using the following
intrinsics. Each one takes a vector operand as an input and applies its
respective operation across all elements of the vector, returning a single
scalar result of the same element type.

.. _int_vector_reduce_add:

'``llvm.vector.reduce.add.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_fadd:

'``llvm.vector.reduce.fadd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
``ADD`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart.
Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction. That is, the reduction begins with
the start value and performs an fadd operation with consecutively increasing
vector element indices. See the following pseudocode:

::

    float sequential_fadd(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result + input_vector[i]
      return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating point addition.

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction


.. _int_vector_reduce_mul:

'``llvm.vector.reduce.mul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
      declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_fmul:

'``llvm.vector.reduce.fmul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
      declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
``MUL`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the 'reassoc' flag set, then the reduction will not
preserve the associativity of an equivalent scalarized counterpart. Otherwise
the reduction will be *sequential*, thus implying that the operation respects
the associativity of a scalarized reduction. That is, the reduction begins with
the start value and performs an fmul operation with consecutively increasing
vector element indices. See the following pseudocode:

::

    float sequential_fmul(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result * input_vector[i]
      return result


Arguments:
""""""""""
The first argument to this intrinsic is a scalar start value for the reduction.
The type of the start value matches the element-type of the vector input.
The second argument must be a vector of floating-point values.

To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating point multiplication.

Examples:
"""""""""

::

      %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
      %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction

.. _int_vector_reduce_and:

'``llvm.vector.reduce.and.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_or:

'``llvm.vector.reduce.or.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
of a vector, returning the result as a scalar. The return type matches the
element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_xor:

'``llvm.vector.reduce.xor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_smax:

'``llvm.vector.reduce.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_smin:

'``llvm.vector.reduce.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_umax:

'``llvm.vector.reduce.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
integer ``MAX`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_umin:

'``llvm.vector.reduce.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
integer ``MIN`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

.. _int_vector_reduce_fmax:

'``llvm.vector.reduce.fmax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with maximum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

.. _int_vector_reduce_fmin:

'``llvm.vector.reduce.fmin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
      declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

This instruction has the same comparison semantics as the '``llvm.minnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with minimum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.

If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating-point values.

'``llvm.experimental.vector.insert``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. You can use ``llvm.experimental.vector.insert``
to insert a fixed-width vector into a scalable vector, but not the other way
around.

::

      declare <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 %idx)
      declare <vscale x 2 x double> @llvm.experimental.vector.insert.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 %idx)

Overview:
"""""""""

The '``llvm.experimental.vector.insert.*``' intrinsics insert a vector into another vector
starting from a given index. The return type matches the type of the vector we
insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors.

Arguments:
""""""""""

The ``vec`` is the vector which ``subvec`` will be inserted into.
The ``subvec`` is the vector that will be inserted.

``idx`` represents the starting element number at which ``subvec`` will be
inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
cannot be determined statically but is false at runtime, then the result vector
is undefined.


'``llvm.experimental.vector.extract``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. You can use
``llvm.experimental.vector.extract`` to extract a fixed-width vector from a
scalable vector, but not the other way around.

::

      declare <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 %idx)
      declare <2 x double> @llvm.experimental.vector.extract.v2f64(<vscale x 2 x double> %vec, i64 %idx)

Overview:
"""""""""

The '``llvm.experimental.vector.extract.*``' intrinsics extract a vector from
within another vector starting from a given index. The return type must be
explicitly specified. Conceptually, this can be used to decompose a scalable
vector into non-scalable parts.

Arguments:
""""""""""

The ``vec`` is the vector from which we will extract a subvector.

The ``idx`` specifies the starting element number within ``vec`` from which a
subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
vector length of the result type.
If the result type is a scalable vector,
``idx`` is first scaled by the result type's runtime scaling factor. Elements
``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
indices. If this condition cannot be determined statically but is false at
runtime, then the result vector is undefined. The ``idx`` parameter must be a
vector index constant type (for most targets this will be an integer pointer
type).

'``llvm.experimental.vector.reverse``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
      declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
The intrinsic takes a single vector and returns a vector of matching type but
with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
recommended way to express reverse operations for fixed-width vectors is still
to use a shufflevector, as that may allow for more optimization opportunities.

Arguments:
""""""""""

The argument to this intrinsic must be a vector.

'``llvm.experimental.vector.splice``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
      declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)

Overview:
"""""""""

The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
concatenating elements from the first input vector with elements of the second
input vector, returning a vector of the same type as the input vectors. The
signed immediate, modulo the number of elements in the vector, is the index
into the first vector from which to extract the result value. This means
conceptually that for a positive immediate, a vector is extracted from
``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
immediate, it extracts ``-imm`` trailing elements from the first vector, and
the remaining elements from ``%vec2``.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
is marked as experimental, the recommended way to express this operation for
fixed-width vectors is still to use a shufflevector, as that may allow for more
optimization opportunities.

For example:

.. code-block:: text

 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1)  ==> <B, C, D, E> ; index
 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements


Arguments:
""""""""""

The first two operands are vectors with the same type. The third argument
``imm`` is the start index, modulo VL, where VL is the runtime vector length of
the source/result vector. The ``imm`` is a signed integer constant in the range
``-VL <= imm < VL``. For values outside of this range the result is poison.
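
The index rule above can be checked with a small reference model. The Python
sketch below is an illustration only and not LLVM code: ``vector_splice`` is a
hypothetical helper, vectors are modelled as Python lists of a known length
``VL``, and ``None`` stands in for a poison result.

.. code-block:: python

    def vector_splice(vec1, vec2, imm):
        """Model the splice result for two equally sized lists."""
        vl = len(vec1)
        if len(vec2) != vl:
            raise ValueError("both input vectors must have the same type")
        if not -vl <= imm < vl:
            return None  # stand-in for poison: imm is outside -VL <= imm < VL
        # The start index is imm modulo VL; Python's % already yields a
        # non-negative result for a negative imm.
        start = imm % vl
        return (vec1 + vec2)[start:start + vl]

    a = ["A", "B", "C", "D"]
    b = ["E", "F", "G", "H"]
    print(vector_splice(a, b, 1))   # ['B', 'C', 'D', 'E']
    print(vector_splice(a, b, -3))  # ['B', 'C', 'D', 'E']

With ``VL = 4``, an immediate of ``-3`` maps to start index ``1``, which is why
both calls produce the same result.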

'``llvm.experimental.stepvector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
to generate a vector whose lane values comprise the linear sequence
<0, 1, 2, ...>. It is primarily intended for scalable vectors.

::

      declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
      declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()

The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
of integers whose elements contain a linear sequence of values starting from 0
with a step of 1. This experimental intrinsic can only be used for vectors
with integer elements that are at least 8 bits in size. If the sequence value
exceeds the allowed limit for the element type then the result for that lane is
undefined.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
is marked as experimental, the recommended way to express this operation for
fixed-width vectors is still to generate a constant vector instead.


Arguments:
""""""""""

None.


Matrix Intrinsics
-----------------

Operations on matrixes requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrixes are passed and returned as vectors. This means that for an ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently column-major layout
is assumed. The intrinsics support both integer and floating point matrixes.
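
As a worked illustration of the ``j * R + i`` rule, the following Python sketch
flattens a small matrix into column-major order and reads an element back. It
is not LLVM code; the helper names ``to_column_major`` and ``matrix_element``
are hypothetical and exist only for this example.

.. code-block:: python

    def to_column_major(rows_list):
        """Flatten a list of equally long rows into a column-major list."""
        num_rows = len(rows_list)
        num_cols = len(rows_list[0])
        return [rows_list[i][j] for j in range(num_cols) for i in range(num_rows)]

    def matrix_element(vec, num_rows, num_cols, i, j):
        """Return element i of column j from a column-major vector."""
        if len(vec) != num_rows * num_cols:
            raise ValueError("vector length must equal rows * columns")
        if not (0 <= i < num_rows and 0 <= j < num_cols):
            raise IndexError("row or column index out of range")
        return vec[j * num_rows + i]

    # A 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6).
    flat = to_column_major([[1, 2, 3], [4, 5, 6]])
    print(flat)                           # [1, 4, 2, 5, 3, 6]
    print(matrix_element(flat, 2, 3, 1, 2))  # 6, found at index 2 * 2 + 1 = 5

Columns are stored one after another, so consecutive vector elements walk down
a column before moving to the next one.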


'``llvm.matrix.transpose.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
<Cols>`` matrix and return the transposed matrix in the result vector.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
number of rows and columns, respectively, and must be positive, constant
integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
the same float or integer element type as ``%In``.

'``llvm.matrix.multiply.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)

Overview:
"""""""""

The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.

Arguments:
""""""""""

The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
<Inner>`` elements, and the second argument ``%B`` to a matrix with
``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
Vectors ``%A``, ``%B``, and the returned vector all have the same float or
integer element type.


'``llvm.matrix.column.major.load.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare vectorty @llvm.matrix.column.major.load.*(
          ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
matrix using a stride of ``%Stride`` to compute the start address of the
different columns. The offset is computed using ``%Stride``'s bitwidth. This
allows for convenient loading of sub matrices. If ``<IsVolatile>`` is true, the
intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
matrix is returned in the result vector. If the ``%Ptr`` argument is known to
be aligned to some boundary, this can be specified as an attribute on the
argument.

Arguments:
""""""""""

The first argument ``%Ptr`` is a pointer to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value. The fourth and fifth arguments,
``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
respectively, and must be positive, constant integers. The returned vector must
have ``<Rows> * <Cols>`` elements.

The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
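
The column addressing above can be sketched with ordinary instructions. This
is only an illustration, not the lowering LLVM performs: the function name
``@load_column`` is hypothetical, the matrix is assumed to have 2 rows of
``double`` elements, and ``%Stride`` is assumed to be counted in elements
rather than bytes.

.. code-block:: llvm

      ; Load column %C of a matrix with 2 rows stored at %Ptr.
      define <2 x double> @load_column(double* %Ptr, i64 %Stride, i64 %C) {
        ; Start of column %C: %Ptr + %C * %Stride (assumed to be in elements).
        %offset = mul i64 %C, %Stride
        %col.start = getelementptr double, double* %Ptr, i64 %offset
        ; The 2 elements of one column are contiguous in memory.
        %col.ptr = bitcast double* %col.start to <2 x double>*
        %col = load <2 x double>, <2 x double>* %col.ptr, align 8
        ret <2 x double> %col
      }

Under these assumptions, a ``%Stride`` of 2 describes densely packed columns,
while a larger stride skips the remaining rows of an enclosing matrix.
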


'``llvm.matrix.column.major.store.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.matrix.column.major.store.*(
          vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)

Overview:
"""""""""

The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
<Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
columns. The offset is computed using ``%Stride``'s bitwidth. If
``<IsVolatile>`` is true, the intrinsic is considered a
:ref:`volatile memory access <volatile>`.

If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
specified as an attribute on the argument.

Arguments:
""""""""""

The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
pointer to the vector type of ``%In``, and is the start address of the matrix
in memory. The third argument ``%Stride`` is a positive, constant integer with
``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean
value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
and columns, respectively, and must be positive, constant integers.

The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.


Half Precision Floating-Point Intrinsics
----------------------------------------

For most target platforms, half precision floating-point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

Consequently, code must first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point type to half precision floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point format to half precision floating-point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format. The input half-float value is
represented by an ``i16`` value.

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

Saturating floating-point to integer conversions
------------------------------------------------

The ``fptoui`` and ``fptosi`` instructions return a
:ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
representable by the result type. These intrinsics provide an alternative
conversion, which will saturate towards the smallest and largest representable
integer values instead.

'``llvm.fptoui.sat.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
floating-point argument type and any integer result type, or vectors thereof.
Not all targets may support all types, however.
17501 17502:: 17503 17504 declare i32 @llvm.fptoui.sat.i32.f32(float %f) 17505 declare i19 @llvm.fptoui.sat.i19.f64(double %f) 17506 declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f) 17507 17508Overview: 17509""""""""" 17510 17511This intrinsic converts the argument into an unsigned integer using saturating 17512semantics. 17513 17514Arguments: 17515"""""""""" 17516 17517The argument may be any floating-point or vector of floating-point type. The 17518return value may be any integer or vector of integer type. The number of vector 17519elements in argument and return must be the same. 17520 17521Semantics: 17522"""""""""" 17523 17524The conversion to integer is performed subject to the following rules: 17525 17526- If the argument is any NaN, zero is returned. 17527- If the argument is smaller than zero (this includes negative infinity), 17528 zero is returned. 17529- If the argument is larger than the largest representable unsigned integer of 17530 the result type (this includes positive infinity), the largest representable 17531 unsigned integer is returned. 17532- Otherwise, the result of rounding the argument towards zero is returned. 17533 17534Example: 17535"""""""" 17536 17537.. code-block:: text 17538 17539 %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9) ; yields i8: 123 17540 %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7) ; yields i8: 0 17541 %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0) ; yields i8: 255 17542 %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0 17543 17544'``llvm.fptosi.sat.*``' Intrinsic 17545^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 17546 17547Syntax: 17548""""""" 17549 17550This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any 17551floating-point argument type and any integer result type, or vectors thereof. 17552Not all targets may support all types, however. 
17553 17554:: 17555 17556 declare i32 @llvm.fptosi.sat.i32.f32(float %f) 17557 declare i19 @llvm.fptosi.sat.i19.f64(double %f) 17558 declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f) 17559 17560Overview: 17561""""""""" 17562 17563This intrinsic converts the argument into a signed integer using saturating 17564semantics. 17565 17566Arguments: 17567"""""""""" 17568 17569The argument may be any floating-point or vector of floating-point type. The 17570return value may be any integer or vector of integer type. The number of vector 17571elements in argument and return must be the same. 17572 17573Semantics: 17574"""""""""" 17575 17576The conversion to integer is performed subject to the following rules: 17577 17578- If the argument is any NaN, zero is returned. 17579- If the argument is smaller than the smallest representable signed integer of 17580 the result type (this includes negative infinity), the smallest 17581 representable signed integer is returned. 17582- If the argument is larger than the largest representable signed integer of 17583 the result type (this includes positive infinity), the largest representable 17584 signed integer is returned. 17585- Otherwise, the result of rounding the argument towards zero is returned. 17586 17587Example: 17588"""""""" 17589 17590.. code-block:: text 17591 17592 %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9) ; yields i8: 23 17593 %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8) ; yields i8: -128 17594 %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0) ; yields i8: 127 17595 %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0 17596 17597.. _dbg_intrinsics: 17598 17599Debugger Intrinsics 17600------------------- 17601 17602The LLVM debugger intrinsics (which all start with ``llvm.dbg.`` 17603prefix), are described in the `LLVM Source Level 17604Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_ 17605document. 

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter - the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific -
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with target
dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.


.. _int_vp:

Vector Predication Intrinsics
-----------------------------
VP intrinsics are intended for predicated SIMD/vector code. A typical VP
operation takes a vector mask and an explicit vector length parameter as in:

::

      <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)

The vector mask parameter (``%mask``) is always a vector of ``i1`` type, for
example ``<32 x i1>``. The explicit vector length parameter (``%evl``) always
has the type ``i32`` and is an unsigned integer value in the range:

::

      0 <= %evl <= W,  where W is the number of vector elements

Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
length of the vector.

The VP intrinsic has undefined behavior if ``%evl > W``.
The explicit vector length (``%evl``) creates a mask, ``%EVLmask``, with all
elements ``0 <= i < %evl`` set to True, and all other lanes ``%evl <= i < W``
set to False. A new mask ``%M`` is calculated with an element-wise AND from
``%mask`` and ``%EVLmask``:

::

      M = %mask AND %EVLmask

A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:

::

       A <opcode> B =  {  A[i] <opcode> B[i]   if M[i] = True, and
                       {  undef                otherwise

Optimization Hint
^^^^^^^^^^^^^^^^^

Some targets, such as AVX512, do not support the ``%evl`` parameter in hardware.
The use of an effective ``%evl`` is discouraged for those targets. The function
``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
has native support for ``%evl``.

.. _int_vp_select:

'``llvm.vp.select.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
      declare <vscale x 4 x i64>  @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)

Overview:
"""""""""

The '``llvm.vp.select``' intrinsic is used to choose one value based on a
condition vector, without IR-level branching.

Arguments:
""""""""""

The first operand is a vector of ``i1`` and indicates the condition. The
second operand is the value that is selected where the condition vector is
true. The third operand is the value that is selected where the condition
vector is false. The vectors must be of the same size. The fourth operand is
the explicit vector length.

The optional ``fast-math flags`` marker indicates that the select has one or
more :ref:`fast-math flags <fastmath>`. These are optimization hints to
enable otherwise unsafe floating-point optimizations. Fast-math flags are
only valid for selects that return a floating-point scalar or vector type,
or an array (nested to any depth) of floating-point scalar or vector types.

Semantics:
""""""""""

The intrinsic selects lanes from the second and third operand depending on a
condition vector.

All result lanes at positions greater than or equal to ``%evl`` are undefined.
For all lanes below ``%evl`` where the condition vector is true the lane is
taken from the second operand. Otherwise, the lane is taken from the third
operand.

Example:
""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)

      ;;; Expansion.
      ;; Any result is legal on lanes at and above %evl.
      %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false



.. _int_vp_add:

'``llvm.vp.add.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer addition of two vectors of integers.
17849 17850 17851Arguments: 17852"""""""""" 17853 17854The first two operands and the result have the same vector of integer type. The 17855third operand is the vector mask and has the same number of elements as the 17856result vector type. The fourth operand is the explicit vector length of the 17857operation. 17858 17859Semantics: 17860"""""""""" 17861 17862The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`) 17863of the first and second vector operand on each enabled lane. The result on 17864disabled lanes is undefined. 17865 17866Examples: 17867""""""""" 17868 17869.. code-block:: llvm 17870 17871 %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) 17872 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 17873 17874 %t = add <4 x i32> %a, %b 17875 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef 17876 17877.. _int_vp_sub: 17878 17879'``llvm.vp.sub.*``' Intrinsics 17880^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 17881 17882Syntax: 17883""""""" 17884This is an overloaded intrinsic. 17885 17886:: 17887 17888 declare <16 x i32> @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 17889 declare <vscale x 4 x i32> @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 17890 declare <256 x i64> @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 17891 17892Overview: 17893""""""""" 17894 17895Predicated integer subtraction of two vectors of integers. 17896 17897 17898Arguments: 17899"""""""""" 17900 17901The first two operands and the result have the same vector of integer type. The 17902third operand is the vector mask and has the same number of elements as the 17903result vector type. The fourth operand is the explicit vector length of the 17904operation. 

Semantics:
""""""""""

The '``llvm.vp.sub``' intrinsic performs integer subtraction
(:ref:`sub <i_sub>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sub <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef



.. _int_vp_mul:

'``llvm.vp.mul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer multiplication of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""
The '``llvm.vp.mul``' intrinsic performs integer multiplication
(:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = mul <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_sdiv:

'``llvm.vp.sdiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, signed division of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
of the first and second vector operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = sdiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_udiv:

'``llvm.vp.udiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated, unsigned division of two vectors of integers.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.udiv``' intrinsic performs unsigned division
(:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = udiv <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef



.. _int_vp_srem:

'``llvm.vp.srem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated computation of the signed remainder of two integer vectors.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
(:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = srem <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef



.. _int_vp_urem:

'``llvm.vp.urem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.
18124 18125:: 18126 18127 declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 18128 declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 18129 declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 18130 18131Overview: 18132""""""""" 18133 18134Predicated computation of the unsigned remainder of two integer vectors. 18135 18136 18137Arguments: 18138"""""""""" 18139 18140The first two operands and the result have the same vector of integer type. The 18141third operand is the vector mask and has the same number of elements as the 18142result vector type. The fourth operand is the explicit vector length of the 18143operation. 18144 18145Semantics: 18146"""""""""" 18147 18148The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division 18149(:ref:`urem <i_urem>`) of the first and second vector operand on each enabled 18150lane. The result on disabled lanes is undefined. 18151 18152Examples: 18153""""""""" 18154 18155.. code-block:: llvm 18156 18157 %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) 18158 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 18159 18160 %t = urem <4 x i32> %a, %b 18161 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef 18162 18163 18164.. _int_vp_ashr: 18165 18166'``llvm.vp.ashr.*``' Intrinsics 18167^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 18168 18169Syntax: 18170""""""" 18171This is an overloaded intrinsic. 
18172 18173:: 18174 18175 declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>) 18176 declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>) 18177 declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>) 18178 18179Overview: 18180""""""""" 18181 18182Vector-predicated arithmetic right-shift. 18183 18184 18185Arguments: 18186"""""""""" 18187 18188The first two operands and the result have the same vector of integer type. The 18189third operand is the vector mask and has the same number of elements as the 18190result vector type. The fourth operand is the explicit vector length of the 18191operation. 18192 18193Semantics: 18194"""""""""" 18195 18196The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift 18197(:ref:`ashr <i_ashr>`) of the first operand by the second operand on each 18198enabled lane. The result on disabled lanes is undefined. 18199 18200Examples: 18201""""""""" 18202 18203.. code-block:: llvm 18204 18205 %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) 18206 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r 18207 18208 %t = ashr <4 x i32> %a, %b 18209 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef 18210 18211 18212.. _int_vp_lshr: 18213 18214 18215'``llvm.vp.lshr.*``' Intrinsics 18216^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 18217 18218Syntax: 18219""""""" 18220This is an overloaded intrinsic. 

::

      declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated logical right-shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.lshr``' intrinsic computes the logical right shift
(:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
enabled lane. The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = lshr <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_shl:

'``llvm.vp.shl.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated left shift.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
the first operand by the second operand on each enabled lane. The result on
disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = shl <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_or:

'``llvm.vp.or.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated or.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
first two operands on each enabled lane. The result on disabled lanes is
undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = or <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_and:

'``llvm.vp.and.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated and.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two operands on each enabled lane. The result on disabled lanes is
undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = and <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_xor:

'``llvm.vp.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated, bitwise xor.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
the first two operands on each enabled lane.
The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = xor <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


.. _int_vp_fadd:

'``llvm.vp.fadd.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point addition of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point
type. The third operand is the vector mask and has the same number of elements
as the result vector type. The fourth operand is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.fadd``' intrinsic performs floating-point addition
(:ref:`fadd <i_fadd>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined. The operation is performed in
the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fadd <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_fsub:

'``llvm.vp.fsub.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point subtraction of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point
type. The third operand is the vector mask and has the same number of elements
as the result vector type. The fourth operand is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction
(:ref:`fsub <i_fsub>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined. The operation is performed in
the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fsub <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_fmul:

'``llvm.vp.fmul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point multiplication of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point
type. The third operand is the vector mask and has the same number of elements
as the result vector type. The fourth operand is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication
(:ref:`fmul <i_fmul>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined. The operation is performed in
the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fmul <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_fdiv:

'``llvm.vp.fdiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point division of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point
type. The third operand is the vector mask and has the same number of elements
as the result vector type. The fourth operand is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.fdiv``' intrinsic performs floating-point division
(:ref:`fdiv <i_fdiv>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined. The operation is performed in
the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fdiv <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_frem:

'``llvm.vp.frem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float> @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double> @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point remainder of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point
type. The third operand is the vector mask and has the same number of elements
as the result vector type. The fourth operand is the explicit vector length of
the operation.

Semantics:
""""""""""

The '``llvm.vp.frem``' intrinsic performs floating-point remainder
(:ref:`frem <i_frem>`) of the first and second vector operand on each enabled
lane. The result on disabled lanes is undefined. The operation is performed in
the default floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = frem <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef



.. _int_vp_reduce_add:

'``llvm.vp.reduce.add.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
returning the result as a scalar.

Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
(:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
lanes are treated as containing the neutral value ``0`` (i.e. having no effect
on the reduction operation). If the vector length is zero, the result is equal
to ``start_value``.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
      %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
      %also.r = add i32 %reduction, %start


.. _int_vp_reduce_fadd:

'``llvm.vp.reduce.fadd.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
value, returning the result as a scalar.

Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second operand is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third operand is the
vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
vector operand ``val`` on each enabled lane, adding it to the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to ``start_value``.
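
As a minimal sketch of the case where no lanes are enabled, assuming a
fixed-width ``<4 x float>`` operand and an all-false constant mask (the value
names ``%start`` and ``%a`` are placeholders chosen for illustration):

.. code-block:: llvm

      ; Every mask bit is false, so no lane takes part in the reduction.
      %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> zeroinitializer, i32 4)
      ; Per the rule above, %r is equal to %start.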

To ignore the start value, the neutral value can be used.

See the unpredicated version (:ref:`llvm.vector.reduce.fadd
<int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
      %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)


.. _int_vp_reduce_mul:

'``llvm.vp.reduce.mul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.
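
As an illustrative sketch of these type constraints, the call below uses the
scalable ``nxv8i16`` overload declared in the syntax section. The value names
``%start``, ``%val``, ``%mask`` and ``%evl`` are placeholders, not part of the
intrinsic.

.. code-block:: llvm

      ; The start value and the result are both i16, matching the element type
      ; of %val. The mask has the same element count (vscale x 8) as %val, and
      ; the explicit vector length is an i32.
      %r = call i16 @llvm.vp.reduce.mul.nxv8i16(i16 %start, <vscale x 8 x i16> %val, <vscale x 8 x i1> %mask, i32 %evl)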

Semantics:
""""""""""

The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
(:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
lanes are treated as containing the neutral value ``1`` (i.e. having no effect
on the reduction operation). If the vector length is zero, the result is the
start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
      %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
      %also.r = mul i32 %reduction, %start

.. _int_vp_reduce_fmul:

'``llvm.vp.reduce.fmul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second operand is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third operand is the
vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
vector operand ``val`` on each enabled lane, multiplying it by the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to the starting value.

To ignore the start value, the neutral value can be used.

See the unpredicated version (:ref:`llvm.vector.reduce.fmul
<int_vector_reduce_fmul>`) for more detail on the semantics.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
      %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)


.. _int_vp_reduce_and:

'``llvm.vp.reduce.and.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``AND`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
(:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
``val`` on each enabled lane, performing an '``and``' of that with the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
operation). If the vector length is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
      %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
      %also.r = and i32 %reduction, %start


.. _int_vp_reduce_or:

'``llvm.vp.reduce.or.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``OR`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
(:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
``val`` on each enabled lane, performing an '``or``' of that with the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``0`` (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

To ignore the start value, the neutral value can be used.
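
As a minimal sketch of ignoring the start value, the function below passes
``0``, the neutral value of ``or``, as the first operand, so the result depends
only on the enabled lanes of the vector operand. The function name
``@or_reduce_no_start`` is hypothetical and exists only for illustration.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.or.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical wrapper, for illustration only.
      define i32 @or_reduce_no_start(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; 0 is the neutral value of 'or', so the start value has no effect.
        %r = call i32 @llvm.vp.reduce.or.v4i32(i32 0, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ret i32 %r
      }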

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
      %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
      %also.r = or i32 %reduction, %start

.. _int_vp_reduce_xor:

'``llvm.vp.reduce.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
(:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
``val`` on each enabled lane, performing an '``xor``' of that with the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``0`` (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
      %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
      %also.r = xor i32 %reduction, %start


.. _int_vp_reduce_smax:

'``llvm.vp.reduce.smax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type.
The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
If the vector length is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
      %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
      %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)


.. _int_vp_reduce_smin:

'``llvm.vp.reduce.smin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
If the vector length is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
      %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
      %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)


.. _int_vp_reduce_umax:

'``llvm.vp.reduce.umax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
neutral value ``0`` (i.e. having no effect on the reduction operation). If the
vector length is zero, the result is the start value.
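
The neutral values quoted for the integer min/max reductions (``INT_MIN``,
``INT_MAX``, ``0`` and ``UINT_MAX``) depend on the element width. The Python
sketch below models the select-then-reduce expansion used in the examples; the
helper names ``neutral`` and ``vp_reduce_minmax`` are hypothetical, and lane
values are assumed to be passed as Python integers already interpreted as
signed or unsigned as appropriate.

```python
def neutral(op, bits):
    """Neutral element of an integer min/max reduction on `bits`-wide lanes."""
    return {
        "smax": -(1 << (bits - 1)),     # INT_MIN for this width
        "smin": (1 << (bits - 1)) - 1,  # INT_MAX for this width
        "umax": 0,
        "umin": (1 << bits) - 1,        # UINT_MAX for this width
    }[op]


def vp_reduce_minmax(op, bits, start, val, mask, evl):
    """Hypothetical scalar model of llvm.vp.reduce.{smax,smin,umax,umin}."""
    combine = max if op in ("smax", "umax") else min
    fill = neutral(op, bits)
    # Mirror the `select` in the examples: disabled lanes become the neutral value.
    masked = [v if (m and i < evl) else fill
              for i, (v, m) in enumerate(zip(val, mask))]
    # Reduce the vector, then fold in the scalar start value.
    return combine(combine(masked), start)


umax_r = vp_reduce_minmax("umax", 32, 5, [9, 40, 7, 3], [True, False, True, True], 3)
smin_r = vp_reduce_minmax("smin", 8, 100, [50, -3, 20, -100], [True, False, True, True], 4)
zero_len = vp_reduce_minmax("smax", 8, -7, [1, 2, 3, 4], [True] * 4, 0)
```

In this model a vector length of zero turns every lane into the neutral value,
so the result collapses to the start value, as the semantics require.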

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
      %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
      %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)


.. _int_vp_reduce_umin:

'``llvm.vp.reduce.umin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
operation). If the vector length is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
      %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
      %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)


.. _int_vp_reduce_fmax:

'``llvm.vp.reduce.fmax.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second operand is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third operand is the
vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set, the neutral value is ``-QNAN``. If ``nnan`` and ``ninf`` are
both set, then the neutral value is the smallest floating-point value for the
result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.

This instruction has the same comparison semantics as the
:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
'``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
unless all elements of the vector and the starting value are ``NaN``. For a
vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
``-0.0`` elements, the sign of the result is unspecified.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
      %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
      %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)


.. _int_vp_reduce_fmin:

'``llvm.vp.reduce.fmin.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second operand is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third operand is the
vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set, the neutral value is ``+QNAN``. If ``nnan`` and ``ninf`` are
both set, then the neutral value is the largest floating-point value for the
result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.

This instruction has the same comparison semantics as the
:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
'``llvm.minnum.*``' intrinsic). That is, the result will always be a number
unless all elements of the vector and the starting value are ``NaN``. For a
vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
``-0.0`` elements, the sign of the result is unspecified.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
      %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
      %also.r = call float @llvm.minnum.f32(float %reduction, float %start)


.. _int_get_active_lane_mask:

'``llvm.get.active.lane.mask.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)


Overview:
"""""""""

Create a mask representing active and inactive vector lanes.


Arguments:
""""""""""

Both operands have the same scalar integer type. The result is a vector with
the i1 element type.

Semantics:
""""""""""

The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
to:

::

      %m[i] = icmp ult (%base + i), %n

where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
the unsigned less-than comparison operator. Overflow cannot occur in
``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
poison value. The above is equivalent to:

::

      %m = @llvm.get.active.lane.mask(%base, %n)

This can, for example, be emitted by the loop vectorizer in which case
``%base`` is the first element of the vector induction variable (VIV) and
``%n`` is the loop tripcount.
Thus, these intrinsics perform an element-wise
less than comparison of VIV with the loop tripcount, producing a mask of
true/false values representing active/inactive vector lanes, except if the VIV
overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what the type of the step vector needs to be that enumerates its
lanes without overflow.

This mask ``%m`` can e.g. be used in masked load/store instructions. These
intrinsics provide a hint to the backend. I.e., for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.


Examples:
"""""""""

.. code-block:: llvm

      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)


.. _int_experimental_vp_splice:

'``llvm.experimental.vp.splice``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
      declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)

Overview:
"""""""""

The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.

Arguments:
""""""""""

The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
the same type.
The third argument ``imm`` is an immediate signed integer that
indicates the offset index. The fourth argument ``mask`` is a vector mask and
has the same number of elements as the result. The last two arguments ``evl1``
and ``evl2`` are unsigned integers indicating the explicit vector lengths of
``vec1`` and ``vec2`` respectively. ``imm``, ``evl1`` and ``evl2`` should
respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
constraints are not satisfied the intrinsic has undefined behaviour.

Semantics:
""""""""""

Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
the concatenated vector. Elements in the result vector beyond ``evl2`` are
``undef``. If ``imm`` is negative the starting index is ``evl1 + imm``. The result
vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
negative ``imm``) elements from indices ``[imm..evl1 - 1]``
(``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
elements are considered and the remaining are ``undef``. The lanes in the result
vector disabled by ``mask`` are ``undef``.

Examples:
"""""""""

.. code-block:: text

      llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3)  ==> <B, E, F, undef> ; index
      llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2) ==> <B, C, undef, undef> ; trailing elements


.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.

::

      declare <16 x float>  @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double>  @llvm.masked.load.v2f64.p0v2f64  (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers to double
      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64    (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
      ;; The data is a vector of function pointers
      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load.
The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.


::

       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; The result of the two following instructions is identical aside from potential memory access exception
       %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru

.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.

::

       declare void @llvm.masked.store.v8i32.p0v8i32  (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>)
       ;; The data is a vector of pointers to double
       declare void @llvm.masked.store.v8p0f64.p0v8p0f64    (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       ;; The data is a vector of function pointers
       declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked store and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
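
As an illustration only, the load-modify-store equivalence can be modeled on a
flat Python list standing in for memory. The names ``masked_store`` and
``load_modify_store`` are hypothetical, and the model deliberately ignores
alignment, exceptions and data races, which is exactly where the real
intrinsic differs from the expanded sequence.

```python
def masked_store(memory, base, value, mask):
    """Hypothetical model: write value[i] to memory[base + i] where mask[i] is set."""
    for i, (v, m) in enumerate(zip(value, mask)):
        if m:
            memory[base + i] = v  # masked-off lanes never touch memory


def load_modify_store(memory, base, value, mask):
    """The equivalent sequence: full load, per-lane select, full store."""
    old = memory[base:base + len(value)]
    res = [v if m else o for v, m, o in zip(value, mask, old)]
    memory[base:base + len(value)] = res


mem_a = [0, 1, 2, 3, 4, 5]
mem_b = list(mem_a)
value = [10, 11, 12, 13]
mask = [True, False, False, True]

masked_store(mem_a, 1, value, mask)
load_modify_store(mem_b, 1, value, mask)
```

Both lists end up identical, but the second helper reads and rewrites every
lane, including the masked-off ones; that extra traffic is what can fault or
race on real hardware.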

::

       call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask)

       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
       %oldval = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
       store <16 x float> %res, <16 x float>* %ptr, align 4


Masked Vector Gather and Scatter Intrinsics
-------------------------------------------

LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.

.. _int_mgather:

'``llvm.masked.gather.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.

::

      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32   (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64     (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>)

Overview:
"""""""""

Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.

Semantics:
""""""""""

The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.


::

       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

       ;; The gather with all-true mask is equivalent to the following instruction sequence
       %ptr0 = extractelement <4 x double*> %ptrs, i32 0
       %ptr1 = extractelement <4 x double*> %ptrs, i32 1
       %ptr2 = extractelement <4 x double*> %ptrs, i32 2
       %ptr3 = extractelement <4 x double*> %ptrs, i32 3

       %val0 = load double, double* %ptr0, align 8
       %val1 = load double, double* %ptr1, align 8
       %val2 = load double, double* %ptr2, align 8
       %val3 = load double, double* %ptr3, align 8

       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3

.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.

::

      declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
      declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.

Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.

::

       ;; This instruction unconditionally stores the data vector to multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>)

       ;; It is equivalent to a list of scalar stores
       %val0 = extractelement <8 x i32> %value, i32 0
       %val1 = extractelement <8 x i32> %value, i32 1
       ..
       %val7 = extractelement <8 x i32> %value, i32 7
       %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
       %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
       ..
       %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
       ;; Note: the order of the following stores is important when they overlap:
       store i32 %val0, i32* %ptr0, align 4
       store i32 %val1, i32* %ptr1, align 4
       ..
       store i32 %val7, i32* %ptr7, align 4


Masked Vector Expanding Load and Compressing Store Intrinsics
-------------------------------------------------------------

LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.

.. _int_expandload:

'``llvm.masked.expandload.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.

::

      declare <16 x float> @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x i64> @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)

Overview:
"""""""""

Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.

Semantics:
""""""""""

The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies like in the following example:

.. code-block:: c

    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    for (int i = 0, j = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }


.. code-block:: llvm

    ; Load several elements from array B and expand them in a vector.
    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
    ; Store the result in A
    call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.

.. _int_compressstore:

'``llvm.masked.compressstore.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.

::

      declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, i32* <ptr>, <8 x i1> <mask>)
      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)

Overview:
"""""""""

Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored.
The number of elements to be stored is equal to the number of active bits in the mask.

Arguments:
""""""""""

The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store. It has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It collects elements from possibly non-adjacent lanes of a vector and stores them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences like in the following example:

.. code-block:: c

    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    for (int i = 0, j = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }


.. code-block:: llvm

    ; Load elements from A.
    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
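
The per-lane behavior of the two intrinsics in this section can be modeled in scalar C. The following is only an illustrative sketch of the semantics described above, not how any target lowers them; the helper names ``compress_store`` and ``expand_load`` and the use of ``bool`` arrays for masks are assumptions made for this example.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.compressstore: write the
       selected lanes of 'value' to consecutive slots starting at 'ptr',
       from the lowest lane to the highest. Returns the number of
       elements stored, i.e. the number of active mask bits. */
    size_t compress_store(const double *value, const bool *mask,
                          size_t lanes, double *ptr) {
      size_t j = 0;
      for (size_t i = 0; i < lanes; ++i)
        if (mask[i])
          ptr[j++] = value[i];
      return j;
    }

    /* Hypothetical scalar model of llvm.masked.expandload: read one
       consecutive element from 'ptr' for each active lane and take the
       masked-off lanes from 'passthru'. Returns the number of elements
       read from memory. */
    size_t expand_load(const double *ptr, const bool *mask,
                       const double *passthru, size_t lanes,
                       double *result) {
      size_t j = 0;
      for (size_t i = 0; i < lanes; ++i)
        result[i] = mask[i] ? ptr[j++] : passthru[i];
      return j;
    }

With the mask '10010001' from the expandload overview (lane 0 first), ``compress_store`` writes lanes 0, 3 and 7 to ``ptr[0]``, ``ptr[1]`` and ``ptr[2]``, and ``expand_load`` reads them back into the same lanes. The returned count plays the role of the ``llvm.ctpop`` result that the IR examples use to advance the index into ``B``.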


Memory Use Markers
------------------

This class of intrinsics provides information about the
:ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

If ``ptr`` is a stack-allocated object and it points to the first byte of
the object, the object is initially marked as dead.
``ptr`` is conservatively considered as a non-stack-allocated object if
the stack coloring algorithm that is used in the optimization pipeline cannot
conclude that ``ptr`` is a stack-allocated object.

After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is
marked as alive and has an uninitialized value.
The stack object is marked as dead when either
:ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
function returns.

After :ref:`llvm.lifetime.end <int_lifeend>` is called,
'``llvm.lifetime.start``' on the stack object can be called again.
The second '``llvm.lifetime.start``' call marks the object as alive, but it
does not change the address of the object.

If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.


.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

If ``ptr`` is a stack-allocated object and it points to the first byte of the
object, the object is dead.
``ptr`` is conservatively considered as a non-stack-allocated object if
the stack coloring algorithm that is used in the optimization pipeline cannot
conclude that ``ptr`` is a stack-allocated object.

Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.

If ``ptr`` is a non-stack-allocated object or it does not point to the first
byte of the object, it is equivalent to simply filling all bytes of the object
with ``poison``.


'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized, and the third argument is a
pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.

'``llvm.launder.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new
pointer value that carries fresh invariant group information.
It is an experimental intrinsic, which means that its semantics might change
in the future.


Arguments:
""""""""""

The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
It does not read any accessible memory and the execution can be speculated.

'``llvm.strip.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
value that does not carry the invariant information. It is an experimental
intrinsic, which means that its semantics might change in the future.


Arguments:
""""""""""

The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which has no associated
``invariant.group`` metadata.
It does not read any memory and can be speculated.



.. _constrainedfp:

Constrained Floating-Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating-point
operations when a specific rounding mode or floating-point exception behavior is
required.
By default, LLVM optimization passes assume that the rounding mode is
round-to-nearest and that floating-point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can create miscompiles if mixing of constrained and normal
operations is done. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.

Each of these intrinsics corresponds to a normal floating-point operation. The
data arguments and the return value are the same as the corresponding FP
operation.

The rounding mode argument is a metadata string specifying what
assumptions, if any, the optimizer can make when transforming constant
values. Some constrained FP intrinsics omit this argument. If required
by the intrinsic, this argument must be one of the following strings:

::

      "round.dynamic"
      "round.tonearest"
      "round.downward"
      "round.upward"
      "round.towardzero"
      "round.tonearestaway"

If this argument is "round.dynamic" optimization passes must assume that the
rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes.
If the argument is any of these values
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.

For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
'x-0' should evaluate to '-0' when rounding downward. However, this
transformation is legal for all other rounding modes.

For values other than "round.dynamic" optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed. Using a specific
non-dynamic rounding mode which does not match the actual rounding mode at
runtime results in undefined behavior.

The exception behavior argument is a metadata string describing the floating
point exception semantics that are required for the intrinsic. This argument
must be one of the following strings:

::

      "fpexcept.ignore"
      "fpexcept.maytrap"
      "fpexcept.strict"

If this argument is "fpexcept.ignore" optimization passes may assume that the
exception status flags will not be read and that floating-point exceptions will
be masked. This allows transformations to be performed that may change the
exception semantics of the original code. For example, FP operations may be
speculatively executed in this case whereas they must not be for either of the
other possible values of this argument.

If the exception behavior argument is "fpexcept.maytrap" optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code.
For example, exceptions may be potentially hidden by constant
folding.

If the exception behavior argument is "fpexcept.strict" all transformations must
strictly preserve the floating-point exception semantics of the original code.
Any FP exception that would have been raised by the original code must be raised
by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code. This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.

The number and order of floating-point exceptions is NOT guaranteed. For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.

Proper :ref:`function attributes <fnattrs>` usage is required for the
constrained intrinsics to function correctly.

All function *calls* done in a function that uses constrained floating
point intrinsics must have the ``strictfp`` attribute.

All function *definitions* that use constrained floating point intrinsics
must have the ``strictfp`` attribute.

'``llvm.experimental.constrained.fadd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fadd``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point sum of the two value operands and has
the same type as the operands.


'``llvm.experimental.constrained.fsub``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fsub``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point difference of the two value operands
and has the same type as the operands.
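
The rounding-mode example given at the start of this section ('x-0' is not
'x' when rounding downward) can be observed from C. The sketch below assumes a
hosted C99 environment whose ``<fenv.h>`` defines ``FE_DOWNWARD`` and
IEEE-754 arithmetic; the helper names are hypothetical, and ``volatile`` is
used only to keep the compiler from folding the subtraction at compile time.

.. code-block:: c

    #include <fenv.h>
    #include <math.h>

    /* Tell the compiler that the floating-point environment is accessed.
       Some compilers ignore this pragma, hence the volatile operands. */
    #pragma STDC FENV_ACCESS ON

    /* Hypothetical helper: compute x - 0.0 while the dynamic rounding
       mode is 'mode' (one of the FE_* macros), then restore the
       previously active mode. */
    double sub_zero_in_mode(double x, int mode) {
      volatile double vx = x;
      volatile double zero = 0.0;
      int saved = fegetround();
      fesetround(mode);
      volatile double diff = vx - zero;
      fesetround(saved);
      return diff;
    }

    /* Returns 1 if x - 0.0 has its sign bit set under 'mode', else 0. */
    int sub_zero_is_negative(double x, int mode) {
      return signbit(sub_zero_in_mode(x, mode)) != 0;
    }

Under these assumptions, ``sub_zero_is_negative(0.0, FE_DOWNWARD)`` reports a
negative zero while ``sub_zero_is_negative(0.0, FE_TONEAREST)`` does not, which
is why the fold is only valid when the rounding mode is known not to be
downward.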


'``llvm.experimental.constrained.fmul``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fmul``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point product of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.fdiv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fdiv``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.frem``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
from the division of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.frem``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above. The rounding mode argument has no effect, since
the result of frem is never rounded, but the argument is included for
consistency with the other constrained floating-point intrinsics.

Semantics:
""""""""""

The value produced is the floating-point remainder from the division of the two
value operands and has the same type as the operands. The remainder has the
same sign as the dividend.
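
The sign rule stated above agrees with the C library function ``fmod``, whose
result is exact and takes the sign of the dividend. The following sketch uses
``fmod`` purely as a stand-in for the arithmetic result; it does not model the
exception behavior argument, and the wrapper name ``frem_model`` is
hypothetical.

.. code-block:: c

    #include <math.h>

    /* Hypothetical stand-in for the value produced by the frem
       intrinsic: the remainder of x / y, carrying the sign of the
       dividend x. */
    double frem_model(double x, double y) {
      return fmod(x, y);
    }

For instance, ``frem_model(-7.0, 3.0)`` is ``-1.0`` and
``frem_model(7.0, -3.0)`` is ``1.0``. Both results are exactly representable,
which is consistent with the rounding mode argument having no effect.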

'``llvm.experimental.constrained.fma``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
fused-multiply-add operation on its operands.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fma``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. All arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The result produced is the product of the first two operands added to the third
operand computed with infinite precision, and then rounded to the target
precision.

'``llvm.experimental.constrained.fptoui``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptoui(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptoui``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is an unsigned integer converted from the floating
point operand. The value is truncated, so it is rounded towards zero.

'``llvm.experimental.constrained.fptosi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptosi(<type> <value>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptosi``' intrinsic converts
:ref:`floating-point <t_floating>` ``value`` to type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptosi``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a signed integer converted from the floating
point operand. The value is truncated, so it is rounded towards zero.

'``llvm.experimental.constrained.uitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.uitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.uitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer operand.

'``llvm.experimental.constrained.sitofp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.sitofp(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.sitofp``'
intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
<t_vector>` of integer values.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

An inexact floating-point exception will be raised if rounding is required.
Any result produced is a floating point value converted from the input
integer operand.

'``llvm.experimental.constrained.fptrunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fptrunc(<type> <value>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
to type ``ty2``.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fptrunc``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. This argument must be larger in size
than the result.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The result produced is a floating-point value truncated to be smaller in size
than the operand.

'``llvm.experimental.constrained.fpext``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fpext(<type> <value>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fpext``' intrinsic extends a
floating-point ``value`` to a larger floating-point value.

Arguments:
""""""""""

The first argument to the '``llvm.experimental.constrained.fpext``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. This argument must be smaller in size
than the result.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

The result produced is a floating-point value extended to be larger in size
than the operand. All restrictions that apply to the fpext instruction also
apply to this intrinsic.
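
As a minimal sketch of how the two conversions above might be combined, the
following narrows a ``double`` to ``float`` and widens it again. The function
name ``@narrow_then_widen`` is hypothetical, the ``.f32.f64`` and ``.f64.f32``
suffixes assume LLVM's usual naming for overloaded intrinsics, and the
metadata strings ``!"round.dynamic"`` and ``!"fpexcept.strict"`` are assumed
to be among the rounding mode and exception behavior values described earlier
in this document.

.. code-block:: llvm

      ; Hypothetical function name, for illustration only.
      define double @narrow_then_widen(double %x) strictfp {
      entry:
        ; Narrow double to float. A rounding-mode argument is passed
        ; because the narrower type may not represent %x exactly.
        %narrow = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                      double %x,
                      metadata !"round.dynamic",
                      metadata !"fpexcept.strict") strictfp
        ; Widen back to double. Only the exception behavior is passed.
        %wide = call double @llvm.experimental.constrained.fpext.f64.f32(
                    float %narrow,
                    metadata !"fpexcept.strict") strictfp
        ret double %wide
      }

      declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)
      declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)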

'``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <ty2>
      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
          metadata <condition code>,
          metadata <exception behavior>)
      declare <ty2>
      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
          metadata <condition code>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on comparison of their operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
comparison operation while the '``llvm.experimental.constrained.fcmps``'
intrinsic performs a signaling comparison operation.

Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fcmp``'
and '``llvm.experimental.constrained.fcmps``' intrinsics must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third argument is the condition code indicating the kind of comparison
to perform.
It must be a metadata string with one of the following values:

- "``oeq``": ordered and equal
- "``ogt``": ordered and greater than
- "``oge``": ordered and greater than or equal
- "``olt``": ordered and less than
- "``ole``": ordered and less than or equal
- "``one``": ordered and not equal
- "``ord``": ordered (no nans)
- "``ueq``": unordered or equal
- "``ugt``": unordered or greater than
- "``uge``": unordered or greater than or equal
- "``ult``": unordered or less than
- "``ule``": unordered or less than or equal
- "``une``": unordered or not equal
- "``uno``": unordered (either nans)

*Ordered* means that neither operand is a NAN while *unordered* means
that either operand may be a NAN.

The fourth argument specifies the exception behavior as described above.

Semantics:
""""""""""

``op1`` and ``op2`` are compared according to the condition code given
as the third argument. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

- "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
  is equal to ``op2``.
- "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
  is greater than ``op2``.
- "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
  is greater than or equal to ``op2``.
- "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
  is less than ``op2``.
- "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
  is less than or equal to ``op2``.
- "``one``": yields ``true`` if both operands are not a NAN and ``op1``
  is not equal to ``op2``.
- "``ord``": yields ``true`` if both operands are not a NAN.
- "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
  equal to ``op2``.
- "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
  greater than ``op2``.
- "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
  greater than or equal to ``op2``.
- "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
  less than ``op2``.
- "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
  less than or equal to ``op2``.
- "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
  not equal to ``op2``.
- "``uno``": yields ``true`` if either operand is a NAN.

The quiet comparison operation performed by
'``llvm.experimental.constrained.fcmp``' will only raise an exception
if either operand is a SNAN. The signaling comparison operation
performed by '``llvm.experimental.constrained.fcmps``' will raise an
exception if either operand is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g. the exception might only
set a flag), therefore the distinction between ordered and unordered
comparisons is also relevant for the
'``llvm.experimental.constrained.fcmps``' intrinsic.
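
The difference between the quiet and signaling forms can be sketched with two
otherwise identical comparisons. The function names are hypothetical, the
``.f64`` suffix assumes LLVM's usual naming for overloaded intrinsics, and
``!"fpexcept.strict"`` is assumed to be one of the exception behavior values
described earlier in this document. Both functions use the ordered condition
code ``olt``, so both yield ``false`` when either operand is a NAN; they
differ only in which NAN operands raise an exception.

.. code-block:: llvm

      ; Quiet comparison: per the text above, an exception is raised
      ; only if an operand is a SNAN.
      define i1 @less_than_quiet(double %a, double %b) strictfp {
      entry:
        %r = call i1 @llvm.experimental.constrained.fcmp.f64(
                 double %a, double %b,
                 metadata !"olt",
                 metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }

      ; Signaling comparison: per the text above, an exception is raised
      ; if an operand is any NAN (QNAN or SNAN).
      define i1 @less_than_signaling(double %a, double %b) strictfp {
      entry:
        %r = call i1 @llvm.experimental.constrained.fcmps.f64(
                 double %a, double %b,
                 metadata !"olt",
                 metadata !"fpexcept.strict") strictfp
        ret i1 %r
      }

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double, metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)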

'``llvm.experimental.constrained.fmuladd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
          <type> <op3>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
multiply-add expressions that can be fused if the code generator determines
that (a) the target instruction set has support for a fused operation,
and (b) that the fused operation is more efficient than the equivalent,
separate pair of mul and add instructions.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
intrinsic must be floating-point or vector of floating-point values.
All three arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.experimental.constrained.fmuladd.f32(
               float %a, float %b, float %c,
               metadata <rounding mode>,
               metadata <exception behavior>)

is equivalent to the expression:

::

      %0 = call float @llvm.experimental.constrained.fmul.f32(
               float %a, float %b,
               metadata <rounding mode>,
               metadata <exception behavior>)
      %1 = call float @llvm.experimental.constrained.fadd.f32(
               float %0, float %c,
               metadata <rounding mode>,
               metadata <exception behavior>)

except that it is unspecified whether rounding will be performed between the
multiplication and addition steps. Fusion is not guaranteed, even if the target
platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets
``errno``.

Constrained libm-equivalent Intrinsics
--------------------------------------

In addition to the basic floating-point operations for which constrained
intrinsics are described above, there are constrained versions of various
operations which provide equivalent behavior to a corresponding libm function.
These intrinsics allow the precise behavior of these operations with respect to
rounding mode and exception behavior to be controlled.

As with the basic constrained floating-point intrinsics, the rounding mode
and exception behavior arguments only control the behavior of the optimizer.
They do not change the runtime floating-point environment.


'``llvm.experimental.constrained.sqrt``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sqrt(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
of the specified value, returning the same value as the libm '``sqrt``'
functions would, but without setting ``errno``.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the nonnegative square root of the specified value.
If the value is less than negative zero, a floating-point exception occurs
and the return value is architecture specific.


'``llvm.experimental.constrained.pow``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power,
returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.


'``llvm.experimental.constrained.powi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand. The
order of evaluation of multiplications is not defined. When a vector of
floating-point type is used, the second argument remains a scalar integer value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type. The second argument is a 32-bit signed integer specifying the power to
which the first argument should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.


'``llvm.experimental.constrained.sin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sin(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the sine of the specified operand, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.cos``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.cos(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the cosine of the specified operand, returning the
same values as the libm ``cos`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.exp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
exponential of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.exp2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp2(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
exponential of the specified value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.


Semantics:
""""""""""

This function returns the same values as the libm ``log`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log10``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log10(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log10`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log2(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.rint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.rint(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.rint``' intrinsic returns the first
operand rounded to the nearest integer. It may raise an inexact floating-point
exception if the operand is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.lrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lrint(<fptype> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
operand rounded to the nearest integer. An inexact floating-point exception
will be raised if the operand is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``lrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument.
The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.lrint`` intrinsic.


'``llvm.experimental.constrained.llrint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.llrint(<fptype> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
operand rounded to the nearest integer. An inexact floating-point exception
will be raised if the operand is not an integer. An invalid exception is
raised if the result is too large to fit into a supported integer type,
and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
libm functions.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``llrint`` functions
would, and handles error conditions in the same way.

The rounding mode is described, not determined, by the rounding mode
argument. The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.llrint`` intrinsic.


'``llvm.experimental.constrained.nearbyint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.nearbyint(<type> <op1>,
          metadata <rounding mode>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
operand rounded to the nearest integer. It will not raise an inexact
floating-point exception if the operand is not an integer.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.maxnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for maxNum.


'``llvm.experimental.constrained.minnum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
of the two arguments.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows the IEEE-754 semantics for minNum.


'``llvm.experimental.constrained.maximum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2018.
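
The two maximum flavors described above share a call shape and differ only in
how they treat NaNs and signed zeros. The following is a sketch only: the
function names are hypothetical, the ``.f32`` suffix assumes LLVM's usual
naming for overloaded intrinsics, and ``!"fpexcept.strict"`` is assumed to be
one of the exception behavior values described earlier in this document.

.. code-block:: llvm

      ; maxNum semantics, as referenced in the maxnum section above.
      define float @max_num(float %a, float %b) strictfp {
      entry:
        %r = call float @llvm.experimental.constrained.maxnum.f32(
                 float %a, float %b,
                 metadata !"fpexcept.strict") strictfp
        ret float %r
      }

      ; NaN-propagating maximum that also treats -0.0 as less than +0.0.
      define float @max_propagating(float %a, float %b) strictfp {
      entry:
        %r = call float @llvm.experimental.constrained.maximum.f32(
                 float %a, float %b,
                 metadata !"fpexcept.strict") strictfp
        ret float %r
      }

      declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)
      declare float @llvm.experimental.constrained.maximum.f32(float, float, metadata)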


'``llvm.experimental.constrained.minimum``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers
of the same type.

The third argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function follows semantics specified in the draft of IEEE 754-2018.


'``llvm.experimental.constrained.ceil``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.ceil(<type> <op1>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.floor``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.floor(<type> <op1>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would and handles error conditions in the same way.


'``llvm.experimental.constrained.round``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.round(<type> <op1>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.round``' intrinsic returns the first
operand rounded to the nearest integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``round`` functions
would and handles error conditions in the same way.
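
Because ``ceil``, ``floor`` and ``round`` each fix their own rounding
direction, they take only an exception behavior argument. The sketch below
shows the three calls side by side. The function names are hypothetical, the
``.f64`` suffix assumes LLVM's usual naming for overloaded intrinsics, and
``!"fpexcept.strict"`` is assumed to be one of the exception behavior values
described earlier in this document. The values in the comments are those of
the corresponding libm functions for an input of ``-2.5``.

.. code-block:: llvm

      ; libm ceil(-2.5) is -2.0
      define double @round_up(double %x) strictfp {
      entry:
        %r = call double @llvm.experimental.constrained.ceil.f64(
                 double %x, metadata !"fpexcept.strict") strictfp
        ret double %r
      }

      ; libm floor(-2.5) is -3.0
      define double @round_down(double %x) strictfp {
      entry:
        %r = call double @llvm.experimental.constrained.floor.f64(
                 double %x, metadata !"fpexcept.strict") strictfp
        ret double %r
      }

      ; libm round(-2.5) is -3.0 (halfway cases round away from zero)
      define double @round_nearest(double %x) strictfp {
      entry:
        %r = call double @llvm.experimental.constrained.round.f64(
                 double %x, metadata !"fpexcept.strict") strictfp
        ret double %r
      }

      declare double @llvm.experimental.constrained.ceil.f64(double, metadata)
      declare double @llvm.experimental.constrained.floor.f64(double, metadata)
      declare double @llvm.experimental.constrained.round.f64(double, metadata)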


'``llvm.experimental.constrained.roundeven``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.roundeven(<type> <op1>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
operand rounded to the nearest integer in floating-point format, rounding
halfway cases to even (that is, to the nearest value that is an even integer),
regardless of the current rounding direction.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as C standard function ``roundeven`` and can signal
the invalid operation exception for a SNAN operand.


'``llvm.experimental.constrained.lround``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <inttype>
      @llvm.experimental.constrained.lround(<fptype> <op1>,
          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.lround``' intrinsic returns the first
operand rounded to the nearest integer with ties away from zero. It will
raise an inexact floating-point exception if the operand is not an integer.
An invalid exception is raised if the result is too large to fit into a
supported integer type, and in this case the result is undefined.

Arguments:
""""""""""

The first argument is a floating-point number.
The return value is an 21709integer type. Not all types are supported on all targets. The supported 21710types are the same as the ``llvm.lround`` intrinsic and the ``lround`` 21711libm functions. 21712 21713The second argument specifies the exception behavior as described above. 21714 21715Semantics: 21716"""""""""" 21717 21718This function returns the same values as the libm ``lround`` functions 21719would and handles error conditions in the same way. 21720 21721 21722'``llvm.experimental.constrained.llround``' Intrinsic 21723^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 21724 21725Syntax: 21726""""""" 21727 21728:: 21729 21730 declare <inttype> 21731 @llvm.experimental.constrained.llround(<fptype> <op1>, 21732 metadata <exception behavior>) 21733 21734Overview: 21735""""""""" 21736 21737The '``llvm.experimental.constrained.llround``' intrinsic returns the first 21738operand rounded to the nearest integer with ties away from zero. It will 21739raise an inexact floating-point exception if the operand is not an integer. 21740An invalid exception is raised if the result is too large to fit into a 21741supported integer type, and in this case the result is undefined. 21742 21743Arguments: 21744"""""""""" 21745 21746The first argument is a floating-point number. The return value is an 21747integer type. Not all types are supported on all targets. The supported 21748types are the same as the ``llvm.llround`` intrinsic and the ``llround`` 21749libm functions. 21750 21751The second argument specifies the exception behavior as described above. 21752 21753Semantics: 21754"""""""""" 21755 21756This function returns the same values as the libm ``llround`` functions 21757would and handles error conditions in the same way. 
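
The difference between rounding ties to even and rounding ties away from
zero can be sketched as follows. This is an illustrative example only: it
assumes the ``f64`` input overloads, an ``i64`` result for ``lround``, and
strict exception behavior; the function names ``@ties_to_even`` and
``@ties_away`` are hypothetical.

.. code-block:: llvm

   declare double @llvm.experimental.constrained.roundeven.f64(double, metadata)
   declare i64 @llvm.experimental.constrained.lround.i64.f64(double, metadata)

   ; For the halfway case %x = 2.5, roundeven yields 2.0 (the even neighbor).
   define double @ties_to_even(double %x) strictfp {
     %r = call double @llvm.experimental.constrained.roundeven.f64(
                         double %x, metadata !"fpexcept.strict") #0
     ret double %r
   }

   ; For the same input, lround yields 3 (the neighbor farther from zero)
   ; and, since 2.5 is not an integer, may raise the inexact exception.
   define i64 @ties_away(double %x) strictfp {
     %r = call i64 @llvm.experimental.constrained.lround.i64.f64(
                      double %x, metadata !"fpexcept.strict") #0
     ret i64 %r
   }

   attributes #0 = { strictfp }
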


'``llvm.experimental.constrained.trunc``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.trunc(<type> <op1>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
operand rounded to the nearest integer not larger in magnitude than the
operand.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second argument specifies the exception behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would and handles error conditions in the same way.

.. _int_experimental_noalias_scope_decl:

'``llvm.experimental.noalias.scope.decl``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""


::

      declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)

Overview:
"""""""""

The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.


Arguments:
""""""""""

The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
metadata references. The format is identical to that required for ``noalias``
metadata. This list must have exactly one element.

Semantics:
""""""""""

The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
noalias scope is declared.
When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
the scope might need to be duplicated as well.

For example, when the intrinsic is used inside a loop body, and that loop is
unrolled, the associated noalias scope must also be duplicated. Otherwise, the
noalias property it signifies would spill across loop iterations, whereas it
was only valid within a single iteration.

.. code-block:: llvm

  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
  ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
  ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
  declare i1 @cond()

  define void @decl_in_loop(i8* %a.base, i8* %b.base) {
  entry:
    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
    br label %loop

  loop:
    %a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]
    %b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]
    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
    %val = load i8, i8* %a, !alias.scope !2
    store i8 %val, i8* %b, !noalias !2
    %a.inc = getelementptr inbounds i8, i8* %a, i64 1
    %b.inc = getelementptr inbounds i8, i8* %b, i64 1
    %cond = call i1 @cond()
    br i1 %cond, label %loop, label %exit

  exit:
    ret void
  }

  !0 = !{!0} ; domain
  !1 = !{!1, !0} ; scope
  !2 = !{!1} ; scope list

Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
are possible, but one should never dominate another. Violations are pointed out
by the verifier as they indicate a problem in either a transformation pass or
the input.



Floating Point Environment Manipulation intrinsics
--------------------------------------------------

These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.

'``llvm.flt.rounds``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.flt.rounds()

Overview:
"""""""""

The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.

Semantics:
""""""""""

The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of
``FLT_ROUNDS``, specified by the C standard:

::

    0  - toward zero
    1  - to nearest, ties to even
    2  - toward positive infinity
    3  - toward negative infinity
    4  - to nearest, ties away from zero

Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.


'``llvm.set.rounding``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.set.rounding(i32 <val>)

Overview:
"""""""""

The '``llvm.set.rounding``' intrinsic sets the current rounding mode.

Arguments:
""""""""""

The argument is the required rounding mode. The encoding of the rounding mode
is the same as that used by '``llvm.flt.rounds``'.

Semantics:
""""""""""

The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does
not return any value and uses a platform-independent representation of IEEE
rounding modes.

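
Because both intrinsics share one encoding, a rounding mode read with
``llvm.flt.rounds`` can be passed back to ``llvm.set.rounding`` unchanged.
The following sketch saves the current mode, switches to rounding toward
positive infinity, and then restores the saved mode. The function names
``@compute`` and ``@compute_rounding_up`` are hypothetical, and marking the
function ``strictfp`` is an assumption of this example; real code must also
follow the rules in :ref:`Floating Point Environment <floatenv>`.

.. code-block:: llvm

   declare i32 @llvm.flt.rounds()
   declare void @llvm.set.rounding(i32)

   ; Hypothetical helper that performs some floating-point work.
   declare void @compute()

   define void @compute_rounding_up() strictfp {
     ; Remember the mode that is active on entry.
     %saved = call i32 @llvm.flt.rounds()
     ; 2 encodes "toward positive infinity" in the table above.
     call void @llvm.set.rounding(i32 2)
     call void @compute()
     ; Restore the original mode before returning.
     call void @llvm.set.rounding(i32 %saved)
     ret void
   }
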


General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE* you must specify an address space for
the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.codeview.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This annotation emits a label at its program point and an associated
``S_ANNOTATION`` codeview record with some additional string metadata. This is
used to implement MSVC's ``__annotation`` intrinsic. It is marked
``noduplicate``, so calls to this intrinsic prevent inlining and should be
considered expensive.

::

      declare void @llvm.codeview.annotation(metadata)

Arguments:
""""""""""

The argument should be an MDTuple containing any number of MDStrings.


'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() cold noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

'``llvm.ubsantrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind

Overview:
"""""""""

The '``llvm.ubsantrap``' intrinsic.

Arguments:
""""""""""

An integer describing the kind of failure detected.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an execution trap,
embedding the argument into the encoding of that trap where possible, so that
different kinds of crashes can be told apart.

Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.

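
As a rough sketch of how a trapping check might look in IR, the function
below guards an array load with a bounds check and calls
``@llvm.ubsantrap`` on failure. The function name ``@checked_load`` and the
failure-kind value ``18`` are hypothetical choices made for this example;
the actual meaning of the argument is decided by whoever emits the call.

.. code-block:: llvm

   declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind

   define i32 @checked_load(i32* %p, i64 %idx, i64 %len) {
   entry:
     %in.bounds = icmp ult i64 %idx, %len
     br i1 %in.bounds, label %ok, label %fail

   ok:
     %addr = getelementptr inbounds i32, i32* %p, i64 %idx
     %v = load i32, i32* %addr
     ret i32 %v

   fail:
     ; The argument must be an immediate constant (immarg).
     call void @llvm.ubsantrap(i8 18)
     unreachable
   }

Replacing the call in ``fail`` with ``call void @llvm.trap()`` gives the
same control flow without a failure kind encoded in the trap.
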

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently some platforms have IR-level customized stack guard loading (e.g.
X86 Linux) that is not handled by ``llvm.stackguard()``, although it should be
in the future.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to the
optimizer to determine whether a) an operation (like memcpy) will overflow a
buffer that corresponds to an object, or b) that a runtime check for overflow
isn't necessary. An object in this context means an allocation of a specific
class, structure, array, or other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
pointer to or into the ``object``. The second argument determines whether
``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
in address space 0 is used as its pointer argument. If it's ``false``,
``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
the ``null`` is in a non-zero address space or if ``true`` is given for the
third argument of ``llvm.objectsize``, we assume its size is unknown.
The fourth
argument to ``llvm.objectsize`` determines if the value should be evaluated at
runtime.

The second, third, and fourth arguments only accept constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
the object concerned. If the size cannot be determined, ``llvm.objectsize``
returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (the
most probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is an expected value.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

'``llvm.expect.with.probability``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
You can use ``llvm.expect.with.probability`` on any integer bit width.

::

      declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
      declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
      declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)

Overview:
"""""""""

The ``llvm.expect.with.probability`` intrinsic provides information about the
expected value of ``val`` with probability (or confidence) ``prob``, which can
be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
argument is a value. The second argument is an expected value. The third
argument is a probability.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

More complex assumptions can be encoded as
:ref:`assume operand bundles <assume_opbundles>`.

Arguments:
""""""""""

The argument of the call is the condition which the optimizer may assume is
always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.
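
A small sketch of the idea, using a hypothetical function
``@first_or_zero``: the caller promises that ``%n`` is positive, so the
optimizer may (but is not required to) fold the later ``%n == 0`` test to
``false`` and drop the ``ret.zero`` block.

.. code-block:: llvm

   declare void @llvm.assume(i1)

   define i32 @first_or_zero(i32* %p, i32 %n) {
   entry:
     ; Behavior is undefined if this function is ever reached with %n <= 0.
     %positive = icmp sgt i32 %n, 0
     call void @llvm.assume(i1 %positive)
     %empty = icmp eq i32 %n, 0
     br i1 %empty, label %ret.zero, label %ret.first

   ret.zero:
     ret i32 0

   ret.first:
     %v = load i32, i32* %p
     ret i32 %v
   }
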

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.

.. _int_ssa_copy:

'``llvm.ssa.copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.ssa.copy(type %operand) returned(1) readnone

Arguments:
""""""""""

The first argument is an operand which is used as the returned value.

Overview:
""""""""""

The ``llvm.ssa.copy`` intrinsic can be used to attach information to
operations by copying them and giving them new names. For example,
the PredicateInfo utility uses it to build Extended SSA form, and
attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.

.. _type.checked.load:

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer. The
third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.


'``llvm.arithmetic.fence``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.arithmetic.fence(<type> <op>)

Overview:
"""""""""

The purpose of the ``llvm.arithmetic.fence`` intrinsic
is to prevent the optimizer from performing fast-math optimizations,
particularly reassociation,
between the argument and the expression that contains the argument.
It can be used to preserve the parentheses in the source language.

Arguments:
""""""""""

The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
The argument and the return value are floating-point numbers,
or vector floating-point numbers, of the same type.

Semantics:
""""""""""

This intrinsic returns the value of its operand. The optimizer can optimize
the argument, but the optimizer cannot hoist any component of the operand
to the containing context, and the optimizer cannot move the calculation of
any expression in the containing context into the operand.

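
To make the effect of the fence concrete, the sketch below computes
``(a + b) + c`` with the ``reassoc`` fast-math flag on both additions.
Without the fence, reassociation could regroup the sum as ``a + (b + c)``;
with it, the parenthesized sum is treated as a unit. The ``f32`` overload
and the function name ``@fenced_sum`` are assumptions of this example.

.. code-block:: llvm

   declare float @llvm.arithmetic.fence.f32(float)

   define float @fenced_sum(float %a, float %b, float %c) {
     ; The operand of the fence: the parenthesized part of the expression.
     %ab = fadd reassoc float %a, %b
     %ab.fenced = call float @llvm.arithmetic.fence.f32(float %ab)
     ; The containing context: %c may not be moved into the fenced sum.
     %sum = fadd reassoc float %ab.fenced, %c
     ret float %sum
   }
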


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (besides ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined).
The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` is defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.

In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``. Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); and this allows for "check widening" type optimizations.

``@llvm.experimental.guard`` cannot be invoked.

After ``@llvm.experimental.guard`` was first added, a more general
formulation was found in ``@llvm.experimental.widenable.condition``.
Support for ``@llvm.experimental.guard`` is slowly being rephrased in
terms of this alternative.

'``llvm.experimental.widenable.condition``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.experimental.widenable.condition()

Overview:
"""""""""

This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is `true` or `false`, the program is correct and
well-defined.

Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
``@llvm.experimental.widenable.condition`` allows frontends to
express guards or checks on optimistic assumptions made during
compilation and represent them as branch instructions on special
conditions.

While this may appear similar in semantics to `undef`, it is very
different in that an invocation produces a particular, singular
value. It is also intended to be lowered late, and remain available
for specific optimizations and transforms that can benefit from its
special properties.

Arguments:
""""""""""

None.

Semantics:
""""""""""

The intrinsic ``@llvm.experimental.widenable.condition()``
returns either `true` or `false`. For each evaluation of a call
to this intrinsic, the program must be valid and correct both if
it returns `true` and if it returns `false`. This allows
transformation passes to replace evaluations of this intrinsic
with either value whenever one is beneficial.

When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as
in the example below:

.. code-block:: text

    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.

Whether the result of the intrinsic's call is `true` or `false`,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
`i1` expressions.

This is how it can be used to represent guards as widenable branches:

.. code-block:: text

  block:
    ; Unguarded instructions
    call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
    ; Guarded instructions

This can be expressed in an alternative, equivalent form as an explicit branch
using ``@llvm.experimental.widenable.condition``:

.. code-block:: text

  block:
    ; Unguarded instructions
    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
    %guard_condition = and i1 %cond, %widenable_condition
    br i1 %guard_condition, label %guarded, label %deopt

  guarded:
    ; Guarded instructions

  deopt:
    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]

So the block `guarded` is only reachable when `%cond` is `true`,
and it should be valid to go to the block `deopt` whenever `%cond`
is `true` or `false`.

``@llvm.experimental.widenable.condition`` will never throw, thus
it cannot be invoked.

Guard widening:
"""""""""""""""

When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.

Guard widening looks like replacement of

.. code-block:: text

  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
  %guard_cond = and i1 %cond, %widenable_cond
  br i1 %guard_cond, label %guarded, label %deopt

with

.. code-block:: text

  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
  %new_cond = and i1 %any_other_cond, %widenable_cond
  %new_guard_cond = and i1 %cond, %new_cond
  br i1 %new_guard_cond, label %guarded, label %deopt

for this branch. Here `%any_other_cond` is an arbitrarily chosen
well-defined `i1` value. By performing guard widening, we may
impose stricter conditions on the `guarded` block and bail to the
deopt when the new condition is not met.

Lowering:
"""""""""

The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant `true`. However, it is always correct to replace
it with any other `i1` value. Any pass can
freely do so if it can benefit from non-default lowering.


'``llvm.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly

Overview:
"""""""""

This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
adds ``%ptr`` to that value and returns it. The constant folder specifically
recognizes the form of this intrinsic and the constant initializers it may
load from; if a loaded constant initializer is known to have the form
``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function.
However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

.. _llvm_sideeffect:

'``llvm.sideeffect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.sideeffect() inaccessiblememonly nounwind

Overview:
"""""""""

The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
treat it as having side effects, so it can be inserted into a loop to
indicate that the loop shouldn't be assumed to terminate (which could
potentially lead to the loop being optimized away entirely), even if it's
an infinite loop with no other side effects.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic actually does nothing, but optimizers must assume that it
has externally observable side effects.

'``llvm.is.constant.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.

::

      declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
      declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
      declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone

Overview:
"""""""""

The '``llvm.is.constant``' intrinsic will return true if the argument
is known to be a manifest compile-time constant. It is guaranteed to
fold to either true or false before generating machine code.

Semantics:
""""""""""

This intrinsic generates no code.
If its argument is known to be a
manifest compile-time constant value, then the intrinsic will be
converted to a constant true value. Otherwise, it will be converted to
a constant false value.

In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.

The result also intentionally depends on the result of optimization
passes -- e.g., the result can change depending on whether a
function gets inlined or not. A function's parameters are
obviously not constant. However, a call like
``llvm.is.constant.i32(i32 %param)`` *can* return true after the
function is inlined, if the value passed to the function parameter was
a constant.

On the other hand, if constant folding is not run, it will never
evaluate to true, even in simple cases.

.. _int_ptrmask:

'``llvm.ptrmask``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable

Arguments:
""""""""""

The first argument is a pointer. The second argument is an integer.

Overview:
"""""""""

The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
This allows stripping data from tagged pointers without converting them to an
integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
to facilitate alias analysis and underlying-object detection.

Semantics:
""""""""""

The result of ``ptrmask(ptr, mask)`` is equivalent to
``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``.
Both the returned
pointer and the first argument are based on the same underlying object (for more
information on the *based on* terminology see
:ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
mask argument does not match the pointer size of the target, the mask is
zero-extended or truncated accordingly.

.. _int_vscale:

'``llvm.vscale``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()

Overview:
"""""""""

The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
vectors such as ``<vscale x 16 x i8>``.

Semantics:
""""""""""

``vscale`` is a positive value that is constant throughout program
execution, but is unknown at compile time.
If the result value does not fit in the result type, then the result is
a :ref:`poison value <poisonvalues>`.


Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.

Element Wise Atomic Memory Intrinsics
-------------------------------------

These intrinsics are similar to the standard library memory intrinsics except
that they perform memory transfer as a sequence of atomic memory accesses.

.. _int_memcpy_element_unordered_atomic:

'``llvm.memcpy.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                       i8* <src>,
                                                                       i32 <len>,
                                                                       i32 <element_size>)
      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                       i8* <src>,
                                                                       i64 <len>,
                                                                       i32 <element_size>)

Overview:
"""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
as arrays with elements that are exactly ``element_size`` bytes, and the copy between
buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap.
The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

'``llvm.memmove.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths, however.

::

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                        i8* <src>,
                                                                        i32 <len>,
                                                                        i32 <element_size>)
      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                        i8* <src>,
                                                                        i64 <len>,
                                                                        i32 <element_size>)

Overview:
"""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
``src`` are treated as arrays with elements that are exactly ``element_size``
bytes, and the copy between buffers uses a sequence of
:ref:`unordered atomic <ordering>` load/store operations that are a positive
integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. Caller
guarantees that both the source and destination pointers are aligned to that
boundary.

Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap.
The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to
'``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
                                                                  i8 <value>,
                                                                  i32 <len>,
                                                                  i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
                                                                  i8 <value>,
                                                                  i64 <len>,
                                                                  i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
memory starting at the destination location to the given ``value``. The memory is
set with a sequence of store operations where each access is guaranteed to be a
multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.

The order of the assignment is unspecified.
Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.

Lowering:
"""""""""

In the most general case, a call to '``llvm.memset.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
is replaced with the actual element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.

Objective-C ARC Runtime Intrinsics
----------------------------------

LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
LLVM is aware of the semantics of these functions, and optimizes based on that
knowledge. You can read more about the details of Objective-C ARC `here
<https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.

'``llvm.objc.autorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.autorelease(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.

'``llvm.objc.autoreleasePoolPop``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.autoreleasePoolPop(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.

'``llvm.objc.autoreleasePoolPush``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.autoreleasePoolPush()

Lowering:
"""""""""

Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.

'``llvm.objc.autoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.autoreleaseReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.

'``llvm.objc.copyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.copyWeak(i8**, i8**)

Lowering:
"""""""""

Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.

'``llvm.objc.destroyWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.destroyWeak(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.

'``llvm.objc.initWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.initWeak(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
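
As a minimal sketch of how the weak-reference intrinsics documented so far
might be combined, consider the following. The function name ``@weak_example``
and all value names are hypothetical, and the sequence is illustrative rather
than what any particular frontend emits:

.. code-block:: llvm

  declare i8* @llvm.objc.initWeak(i8**, i8*)
  declare void @llvm.objc.copyWeak(i8**, i8**)
  declare void @llvm.objc.destroyWeak(i8**)

  ; Hypothetical function: %obj is assumed to be a valid object pointer.
  define void @weak_example(i8* %obj) {
  entry:
    %w1 = alloca i8*
    %w2 = alloca i8*
    ; Initialize the weak slot %w1 so that it refers to %obj.
    %init = call i8* @llvm.objc.initWeak(i8** %w1, i8* %obj)
    ; Copy the weak reference from %w1 (source) into %w2 (destination).
    call void @llvm.objc.copyWeak(i8** %w2, i8** %w1)
    ; Destroy both weak slots before their storage goes out of scope.
    call void @llvm.objc.destroyWeak(i8** %w2)
    call void @llvm.objc.destroyWeak(i8** %w1)
    ret void
  }

The argument order for ``copyWeak`` (destination first, then source) follows
the ``objc_copyWeak`` runtime entry point linked above.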

'``llvm.objc.loadWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.loadWeak(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.

'``llvm.objc.loadWeakRetained``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.loadWeakRetained(i8**)

Lowering:
"""""""""

Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.

'``llvm.objc.moveWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.moveWeak(i8**, i8**)

Lowering:
"""""""""

Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.

'``llvm.objc.release``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.release(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.

'``llvm.objc.retain``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retain(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
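
A balanced retain/release pair can be sketched as follows. This is only an
illustration: ``@retain_release_example`` and the external function ``@use``
are hypothetical names that do not appear elsewhere in this document.

.. code-block:: llvm

  declare i8* @llvm.objc.retain(i8*)
  declare void @llvm.objc.release(i8*)

  ; Hypothetical external function that consumes the object.
  declare void @use(i8*)

  define void @retain_release_example(i8* %obj) {
  entry:
    ; Take a +1 reference on %obj; the intrinsic returns a pointer as well.
    %retained = call i8* @llvm.objc.retain(i8* %obj)
    call void @use(i8* %retained)
    ; Balance the earlier retain.
    call void @llvm.objc.release(i8* %retained)
    ret void
  }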

'``llvm.objc.retainAutorelease``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainAutorelease(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.

'``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.

'``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.

'``llvm.objc.retainBlock``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.retainBlock(i8*)

Lowering:
"""""""""

Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.

'``llvm.objc.storeStrong``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare void @llvm.objc.storeStrong(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.

'``llvm.objc.storeWeak``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare i8* @llvm.objc.storeWeak(i8**, i8*)

Lowering:
"""""""""

Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.

Preserving Debug Information Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These intrinsics are used to carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. Such information is lost in the IR GetElementPtr instruction,
since the IR types are different from debuginfo types and unions
are converted to structs in IR.

'``llvm.preserve.array.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <ret_type>
      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
                                                                           i32 dim,
                                                                           i32 index)

Overview:
"""""""""

The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
based on array base ``base``, array dimension ``dim`` and the last access index ``index``
into the array. The return type ``ret_type`` is a pointer type to the array element.
The array ``dim`` and ``index`` are preserved, which is more robust than a
getelementptr instruction, which may be subject to compiler transformation.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the array or pointer debuginfo type.
The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
debuginfo version of ``type``.

Arguments:
""""""""""

The ``base`` is the array base address. The ``dim`` is the array dimension.
The ``base`` is a pointer if ``dim`` equals 0.
The ``index`` is the last access index into the array or pointer.

The ``base`` argument must be annotated with an :ref:`elementtype
<attr_elementtype>` attribute at the call-site. This attribute specifies the
getelementptr element type.

Semantics:
""""""""""

The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``.

'``llvm.preserve.union.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <type>
      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
                                                                        i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
``di_index`` and returns the ``base`` address.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the union debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``type`` is the same as the ``base`` type.

Arguments:
""""""""""

The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.

Semantics:
""""""""""

The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.

'``llvm.preserve.struct.access.index``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
::

      declare <ret_type>
      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
                                                                 i32 gep_index,
                                                                 i32 di_index)

Overview:
"""""""""

The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
based on struct base ``base`` and IR struct member index ``gep_index``.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide struct debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``ret_type`` is a pointer type to the structure member.

Arguments:
""""""""""

The ``base`` is the structure base address. The ``gep_index`` is the struct member index
based on IR structures. The ``di_index`` is the struct member index based on debuginfo.

The ``base`` argument must be annotated with an :ref:`elementtype
<attr_elementtype>` attribute at the call-site. This attribute specifies the
getelementptr element type.

Semantics:
""""""""""

The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
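
As an illustrative sketch of this equivalence, assume a hypothetical two-field
struct ``%struct.s`` and typed pointers as used in the declarations above. The
call in ``@second_field`` is then expected to compute the same address as the
plain ``getelementptr`` in ``@second_field_gep``. The function names are
hypothetical, the mangled intrinsic suffix is an assumption based on the usual
overload naming, and the ``!llvm.preserve.access.index`` metadata attachment is
omitted for brevity:

.. code-block:: llvm

  %struct.s = type { i32, i32 }

  ; Assumed mangling: returns i32*, takes %struct.s*.
  declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(%struct.s*, i32, i32)

  ; Address of the second member via the intrinsic
  ; (gep_index = 1, di_index = 1).
  define i32* @second_field(%struct.s* %base) {
  entry:
    %f = call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(
             %struct.s* elementtype(%struct.s) %base, i32 1, i32 1)
    ret i32* %f
  }

  ; The same address via getelementptr with access operands {0, gep_index}.
  define i32* @second_field_gep(%struct.s* %base) {
  entry:
    %f = getelementptr %struct.s, %struct.s* %base, i32 0, i32 1
    ret i32* %f
  }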